Full-Text Articles in Computer Sciences
Predicting Occurrence Of The Term Sarcopenia With Semi-Supervised Machine Learning, Kevin Flasch
Theses and Dissertations
Sarcopenia is a medical condition that involves loss of muscle mass. It has been difficult to define and was only recently assigned an official medical code, so many medical records lack a coded diagnosis even though the clinical note text may discuss the condition or its symptoms. This thesis applies machine learning and natural language processing to clinical note text to see how well the term 'sarcopenia' can be predicted in notes from records concerning the condition.
A variety of machine learning models combined with different features and text processing are tested against training data that mentions …
Analysis Of Music Genre Clustering Algorithms, Samuel Walter Stern
Theses and Dissertations
Classification and clustering of music genres have become an increasingly prevalent focus in recent years, prompting a push for research into relevant algorithms. The most successful approaches have typically applied Naive Bayes or k-Nearest Neighbors, or used neural networks to perform classification. This thesis investigates the use of unsupervised clustering algorithms such as K-Means and hierarchical clustering, and seeks to establish their usefulness in comparison to, or in conjunction with, established methods.
Unsupervised Biomedical Named Entity Recognition, Omid Ghiasvand
Theses and Dissertations
Named entity recognition (NER) from text is an important task for several applications, including in the biomedical domain. Supervised machine-learning systems have been the most successful on the NER task; however, they require large quantities of correct annotations for training. Annotating text manually is very labor intensive and also requires domain expertise. The purpose of this research is to reduce the human effort and cost of annotation needed to build NER systems in the biomedical domain. The method developed in this work is based on leveraging the availability of resources like UMLS (Unified Medical Language System), that contain …
Data Mining Revision Controlled Document History Metadata For Automatic Classification, Dustin Maass
Theses and Dissertations
Version-controlled documents provide a complete history of the changes to the document, including everything from what was changed to who made the change and much more. Through the use of cluster analysis and several sets of manipulated data, this research examines the revision history of Wikipedia in an attempt to find language-independent patterns that could assist automatic page classification software. Using two sample data sets and the aforementioned cluster analysis, we found no conclusive evidence that such patterns exist. Our work on the software, however, does provide a foundation for more possible types of …
Extraction And Classification Of Drug-Drug Interaction From Biomedical Text Using A Two-Stage Classifier, Majid Rastegar-Mojarad
Theses and Dissertations
One of the critical causes of medical errors is drug-drug interaction (DDI), which occurs when one drug increases or decreases the effect of another drug. We propose a machine learning system to extract and classify drug-drug interactions from the biomedical literature, using the annotated corpus from the DDIExtraction-2013 shared task. Our approach applies a two-stage classifier to handle the highly unbalanced class distribution in the corpus. The first stage performs binary classification of drug pairs as interacting or non-interacting, and the second stage further classifies interacting pairs into one of four interaction types: advise, effect, …
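The two-stage idea described in this abstract can be illustrated with a minimal sketch. It is not the author's system: the function names, the "none" label, and the keyword-based stand-ins for trained models are all assumptions made for illustration, and only the two interaction types that appear in the truncated abstract ("advise" and "effect") are used.

```python
from typing import Callable

# Hypothetical label for pairs rejected by the first stage (not from the source).
NO_INTERACTION = "none"


def two_stage_predict(
    sentence: str,
    detect: Callable[[str], bool],
    classify_type: Callable[[str], str],
) -> str:
    """Run stage 1 (interacting or not); run stage 2 only on flagged pairs."""
    if not detect(sentence):
        return NO_INTERACTION
    return classify_type(sentence)


# Toy keyword rules standing in for trained classifiers (illustration only).
def toy_detect(sentence: str) -> bool:
    cues = ("should not", "avoid", "increases", "decreases")
    return any(cue in sentence.lower() for cue in cues)


def toy_classify_type(sentence: str) -> str:
    text = sentence.lower()
    if "should not" in text or "avoid" in text:
        return "advise"
    return "effect"


if __name__ == "__main__":
    for s in (
        "DrugA increases the effect of DrugB.",
        "Avoid taking DrugA with DrugB.",
        "DrugA and DrugB were both measured.",
    ):
        print(s, "->", two_stage_predict(s, toy_detect, toy_classify_type))
```

In a cascade of this shape, the second stage only ever sees pairs that the first stage flags as interacting, so the type classifier does not have to compete with the much larger non-interacting class.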