Engineering | Open Access Articles | Digital Commons Network™

Bayesian Methods And Machine Learning For Processing Text And Image Data, Yingying Gu Aug 2017

Bayesian Methods And Machine Learning For Processing Text And Image Data, Yingying Gu

Theses and Dissertations

Classification/clustering is an important class of unstructured data processing problems. The classification (supervised, semi-supervised and unsupervised) aims to discover the clusters and group the similar data into categories for information organization and knowledge discovery. My work focuses on using the Bayesian methods and machine learning techniques to classify the free-text and image data, and address how to overcome the limitations of the traditional methods. The Bayesian approach provides a way to allow using more variations(numerical or categorical), and estimate the probabilities instead of explicit rules, which will benefit in the ambiguous cases. The MAP(maximum a posterior) estimation is used to …

Go to article

Semi-Supervised Approach To Monitoring Clinical Depressive Symptoms In Social Media, Amir Hossein Yazdavar, Hussein S. Al-Olimat, Monireh Ebrahimi, Goonmeet Bajaj, Tanvi Banerjee, Krishnaprasad Thirunarayan, Jyotishman Pathak, Amit Sheth Jan 2017

Semi-Supervised Approach To Monitoring Clinical Depressive Symptoms In Social Media, Amir Hossein Yazdavar, Hussein S. Al-Olimat, Monireh Ebrahimi, Goonmeet Bajaj, Tanvi Banerjee, Krishnaprasad Thirunarayan, Jyotishman Pathak, Amit Sheth

Computer Science and Engineering Faculty Publications

With the rise of social media, millions of people are routinely expressing their moods, feelings, and daily struggles with mental health issues on social media platforms like Twitter. Unlike traditional observational cohort studies conducted through questionnaires and self-reported surveys, we explore the reliable detection of clinical depression from tweets obtained unobtrusively. Based on the analysis of tweets crawled from users with self-reported depressive symptoms in their Twitter profiles, we demonstrate the potential for detecting clinical depression symptoms which emulate the PHQ-9 questionnaire clinicians use today. Our study uses a semi-supervised statistical model to evaluate how the duration of these symptoms …

Go to article

Parsing Metamap Files In Hadoop, Amy Olex, Alberto Cano, Bridget T. Mcinnes Jan 2017

Parsing Metamap Files In Hadoop, Amy Olex, Alberto Cano, Bridget T. Mcinnes

Computer Science Publications

The UMLS::Association CUICollector module identifies UMLS Concept Unique Identifier bigrams and their frequencies in a biomedical text corpus. CUICollector was re-implemented in Hadoop MapReduce to improve algorithm speed, flexibility, and scalability. Evaluation of the Hadoop implementation compared to the serial module produced equivalent results and achieved a 28x speedup on a single-node Hadoop system.

Go to article

Multi-Class Classification Of Textual Data: Detection And Mitigation Of Cheating In Massively Multiplayer Online Role Playing Games, Naga Sai Nikhil Maguluri Jan 2017

Multi-Class Classification Of Textual Data: Detection And Mitigation Of Cheating In Massively Multiplayer Online Role Playing Games, Naga Sai Nikhil Maguluri

Browse all Theses and Dissertations

The success of any multiplayer game depends on the player’s experience. Cheating/Hacking undermines the player’s experience and thus the success of that game. Cheaters, who use hacks, bots or trainers are ruining the gaming experience of a player and are making him leave the game. As the video game industry is a constantly increasing multibillion dollar economy, it is crucial to assure and maintain a state of security. Players reflect their gaming experience in one of the following places: multiplayer chat, game reviews, and social media. This thesis is an exploratory study where our goal is to experiment and propose …

Go to article

Semantics-Based Summarization Of Entities In Knowledge Graphs, Kalpa Gunaratna Jan 2017

Semantics-Based Summarization Of Entities In Knowledge Graphs, Kalpa Gunaratna

Browse all Theses and Dissertations

The processing of structured and semi-structured content on the Web has been gaining attention with the rapid progress in the Linking Open Data project and the development of commercial knowledge graphs. Knowledge graphs capture domain-specific or encyclopedic knowledge in the form of a data layer and add rich and explicit semantics on top of the data layer to infer additional knowledge. The data layer of a knowledge graph represents entities and their descriptions. The semantic layer on top of the data layer is called the schema (ontology), where relationships of the entity descriptions, their classes, and the hierarchy of the …

Go to article

Engineering Commons^™

Full-Text Articles in Engineering

Bayesian Methods And Machine Learning For Processing Text And Image Data, Yingying Gu

Theses and Dissertations

Semi-Supervised Approach To Monitoring Clinical Depressive Symptoms In Social Media, Amir Hossein Yazdavar, Hussein S. Al-Olimat, Monireh Ebrahimi, Goonmeet Bajaj, Tanvi Banerjee, Krishnaprasad Thirunarayan, Jyotishman Pathak, Amit Sheth

Computer Science and Engineering Faculty Publications

Parsing Metamap Files In Hadoop, Amy Olex, Alberto Cano, Bridget T. Mcinnes

Computer Science Publications

Multi-Class Classification Of Textual Data: Detection And Mitigation Of Cheating In Massively Multiplayer Online Role Playing Games, Naga Sai Nikhil Maguluri

Browse all Theses and Dissertations

Semantics-Based Summarization Of Entities In Knowledge Graphs, Kalpa Gunaratna

Browse all Theses and Dissertations