Open Access. Powered by Scholars. Published by Universities.®

Digital Commons Network

Articles 1 - 12 of 12

Full-Text Articles in Entire DC Network

Extracting Patterns Of Semantic Roles From Accident Narratives, Soundarya Jayakumar May 2023

Theses and Dissertations

Accident databases are filled with rich information about accidents. Analyzing these datasets can reveal useful information that can be used to prevent similar accidents in the future. Policy makers and safety management organizations can design appropriate preventive measures based on this analysis. Besides structured data, crash reports include natural language narratives, which contain valuable accident-related information that is not otherwise present in the structured data. Using natural language processing (NLP) techniques, one can analyze these narratives and mine hidden patterns of accidents from them. The thesis focuses on developing an algorithm to extract common patterns of semantic …


Emotion Classification And Intensity Prediction On Tweets, Sharath Chander Pugazhenthi May 2023

Theses and Dissertations

The task of finding the emotion associated with text written by individuals on a social media platform has become crucial, as it reflects the current state of mind of a particular individual in real life. It also helps one to understand social behavior at a given point in time. Microblogging platforms like Twitter serve as a powerful tool for expressing one’s thoughts. Several works have classified the emotion associated with such text. The thesis comprises a system that first classifies the tweet into one of four emotions (anger, joy, sadness, and fear) with good …


Predicting Occurrence Of The Term Sarcopenia With Semi-Supervised Machine Learning, Kevin Flasch Dec 2021

Theses and Dissertations

Sarcopenia is a medical condition that involves loss of muscle mass. It has been difficult to define and was only recently assigned an official medical code, so many medical records lack a coded diagnosis even though the clinical note text may discuss the condition or its symptoms. This thesis investigates the application of machine learning and natural language processing to clinical note text to see how well the term ‘sarcopenia’ can be predicted in clinical note text from records concerning the condition.

A variety of machine learning models combined with different features and text processing are tested against training data that mentions …


An Application Of Clustering And Cluster Update Methods To Boiler Sensor Prediction And Case-Based-Reasoning To Boiler Repair, Timothy Edward Rooney Dec 2019

Theses and Dissertations

Driven by demand from both consumers and manufacturers alike, Internet of Things (IoT) capabilities are being built into more products. Consumers want more control and access to their devices, while manufacturers can find data gathered from IoT-capable products invaluable. In this thesis, we use data from a growing fleet of IoT-connected boilers in the residential, light-commercial, and medium-commercial ranges to demonstrate a framework for cluster initialization and updating. We compare two methods of dynamically updating clusters: a sequential method inspired by sequential K-means clustering and a cohesion-based method called DYNC. A predictive artificial neural network system demonstrates the effectiveness of …
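
As a rough illustration of the sequential K-means idea that the abstract above names as inspiration, the sketch below assigns one new reading to its nearest centroid and moves that centroid to the running mean of its points. It is not the thesis's method or the DYNC algorithm; the function names, the two-dimensional points, and the starting centroids are assumptions made up for the example.

```python
import math

def nearest(centroids, x):
    """Return the index of the centroid closest to point x (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(centroids[i], x))

def sequential_update(centroids, counts, x):
    """Assign x to its nearest centroid and shift that centroid toward x.

    After the update, the centroid equals the running mean of every point
    assigned to it so far (counts[i] tracks how many points that is).
    """
    i = nearest(centroids, x)
    counts[i] += 1
    centroids[i] = [c + (xj - c) / counts[i] for c, xj in zip(centroids[i], x)]
    return i

# Hypothetical starting state: two clusters, each seeded with one point.
centroids = [[0.0, 0.0], [10.0, 10.0]]
counts = [1, 1]

# A new sensor reading arrives and updates only its nearest cluster.
idx = sequential_update(centroids, counts, [2.0, 0.0])
```

Because each update touches a single centroid and needs only the per-cluster counts, clusters can be maintained as readings stream in without storing or re-clustering the full history.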


Use Of Text Data In Identifying And Prioritizing Potential Drug Repositioning Candidates, Majid Rastegar-Mojarad May 2019

Theses and Dissertations

New drug development costs between 500 million and 2 billion dollars and takes 10-15 years, with a success rate of less than 10%. Drug repurposing (defined as discovering new indications for existing drugs) could play a significant role in drug development, especially considering the declining success rates of developing novel drugs. In the period 2007-2009, drug repurposing led to the launching of 30-40% of new drugs. Typically, new indications for existing medications are identified by accident. However, new technologies and a large number of available resources enable the development of systematic approaches to identify and validate drug-repurposing candidates with significantly …


Unsupervised Biomedical Named Entity Recognition, Omid Ghiasvand Aug 2017

Theses and Dissertations

Named entity recognition (NER) from text is an important task for several applications, including in the biomedical domain. Supervised machine learning-based systems have been the most successful on the NER task; however, they require large quantities of correct annotations for training. Annotating text manually is very labor intensive and also requires domain expertise. The purpose of this research is to reduce human annotation effort and to decrease the cost of annotation for building NER systems in the biomedical domain. The method developed in this work is based on leveraging the availability of resources like UMLS (Unified Medical Language System), that contain …


Bayesian Methods And Machine Learning For Processing Text And Image Data, Yingying Gu Aug 2017

Theses and Dissertations

Classification/clustering is an important class of unstructured data processing problems. Classification (supervised, semi-supervised and unsupervised) aims to discover clusters and group similar data into categories for information organization and knowledge discovery. My work focuses on using Bayesian methods and machine learning techniques to classify free-text and image data, and addresses how to overcome the limitations of traditional methods. The Bayesian approach allows the use of more variations (numerical or categorical) and estimates probabilities instead of relying on explicit rules, which helps in ambiguous cases. The MAP (maximum a posteriori) estimation is used to …


Three Essays On Enhancing Clinical Trial Subject Recruitment Using Natural Language Processing And Text Mining, Euisung Jung Aug 2015

Theses and Dissertations

Patient recruitment and enrollment are critical factors for a successful clinical trial; however, recruitment tends to be the most common problem in clinical trials. The success of a clinical trial depends on efficiently recruiting suitable patients to conduct the trial. Every clinical trial has a protocol, which describes what will be done in the study and how it will be conducted. The protocol also ensures the safety of the trial subjects and the integrity of the data collected. The eligibility criteria section of clinical trial protocols is important because it specifies the necessary conditions that participants have to …


Three Essays On Opinion Mining Of Social Media Texts, Shuyuan Deng Dec 2014

Theses and Dissertations

This dissertation research is a collection of three essays on opinion mining of social media texts. I explore different theoretical and methodological perspectives in this inquiry. The first essay focuses on improving lexicon-based sentiment classification. I propose a method to automatically generate a sentiment lexicon that incorporates knowledge from both the language domain and the content domain. This method learns word associations from a large unannotated corpus. These associations are used to identify new sentiment words. Using a Twitter data set containing 743,069 tweets related to the stock market, I show that the sentiment lexicons generated using the proposed method …


Adverse Drug Event Detection, Causality Inference, Patient Communication And Translational Research, Balaji Polepalli Ramesh May 2014

Theses and Dissertations

Adverse drug events (ADEs) are injuries resulting from a medical intervention related to a drug. ADEs are responsible for nearly 20% of all the adverse events that occur in hospitalized patients. ADEs have been shown to increase the cost of health care and the length of hospital stays. Therefore, detecting and preventing ADEs for pharmacovigilance is an important task that can improve the quality of health care and reduce costs in a hospital setting. In this dissertation, we focus on the development of ADEtector, a system that identifies ADEs and medication information from electronic medical records and the …


Disease Name Extraction From Clinical Text Using Conditional Random Fields, Omid Ghiasvand May 2014

Theses and Dissertations

The aim of the research in this thesis was to extract disease and disorder names from clinical texts. We utilized Conditional Random Fields (CRF) as the main method to label diseases and disorders in clinical sentences. We used other tools, such as MetaMap and the Stanford Core NLP tool, to extract some crucial features. The MetaMap tool was used to identify names of diseases/disorders that are already in the UMLS Metathesaurus. Other important features, such as lemmatized versions of words and POS tags, were extracted using the Stanford Core NLP tool. Some more features were extracted directly from the UMLS Metathesaurus, …


Extraction And Classification Of Drug-Drug Interaction From Biomedical Text Using A Two-Stage Classifier, Majid Rastegar-Mojarad Dec 2013

Theses and Dissertations

One of the critical causes of medical errors is Drug-Drug interaction (DDI), which occurs when one drug increases or decreases the effect of another drug. We propose a machine learning system to extract and classify drug-drug interactions from the biomedical literature, using the annotated corpus from the DDIExtraction-2013 shared task challenge. Our approach applies a two-stage classifier to handle the highly unbalanced class distribution in the corpus. The first stage is designed for binary classification of drug pairs as interacting or non-interacting, and the second stage for further classification of interacting pairs into one of four interacting types: advise, effect, …
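
The two-stage arrangement described in the abstract above can be sketched, under heavy simplification, as a binary filter followed by a type classifier that only sees pairs passing the filter. The keyword rules below are toy stand-ins invented for illustration, not the thesis's machine learning models or features, and only the two interaction types visible in the truncated abstract ("advise" and "effect") are used.

```python
def two_stage_classify(sentence, is_interacting, interaction_type):
    """Run stage 1 (interacting or not), then stage 2 only on positives."""
    if not is_interacting(sentence):
        return "none"
    return interaction_type(sentence)

def toy_is_interacting(sentence):
    """Hypothetical stage 1: flag sentences containing an interaction cue word."""
    s = sentence.lower()
    return any(cue in s for cue in ("increase", "decrease", "avoid", "should"))

def toy_interaction_type(sentence):
    """Hypothetical stage 2: recommendation-like wording -> 'advise', else 'effect'."""
    s = sentence.lower()
    return "advise" if ("avoid" in s or "should" in s) else "effect"

examples = [
    "DrugA may increase the effect of DrugB.",
    "Avoid combining DrugA with DrugB.",
    "DrugA and DrugB were both measured.",
]
labels = [
    two_stage_classify(s, toy_is_interacting, toy_interaction_type)
    for s in examples
]
```

Splitting the decision this way means the second stage is trained and evaluated only on interacting pairs, so the large non-interacting majority is handled once, in the binary stage.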