Open Access. Powered by Scholars. Published by Universities.®
Library and Information Science Commons™
Articles 1 - 3 of 3
Full-Text Articles in Library and Information Science
Collecting Legacy Corpora From Social Science Research For Text Mining Evaluation, Bei Yu, Min-Chun Ku
School of Information Studies - Faculty Scholarship
In this poster we describe a pilot study of searching social science literature for legacy corpora to evaluate text mining algorithms. The emerging field of computational social science demands large amounts of social science data to train and evaluate computational models. We argue that the legacy corpora that were annotated by social science researchers through traditional Qualitative Data Analysis (QDA) are ideal data sets to evaluate text mining methods, such as text categorization and clustering. As a pilot study, we searched for articles that involve content analysis and discourse analysis in leading communication journals, and then contacted the authors regarding …
Exploring The Characteristics Of Opinion Expressions For Political Opinion Classification, Bei Yu, Stefan Kaufmann, Daniel Diermeier
School of Information Studies - Faculty Scholarship
Recently there has been increasing interest in constructing general-purpose political opinion classifiers for applications in e-Rulemaking. This problem is generally modeled as a sentiment classification task in a new domain. However, the classification accuracy is not as good as that in other domains such as customer reviews. In this paper, we report the results of a series of experiments designed to explore the characteristics of political opinion expression which might affect the sentiment classification performance. We found that the average sentiment level of Congressional debate is higher than that of neutral news articles, but lower than that of movie reviews. …
Improved Document Representation For Classification Tasks For The Intelligence Community, Elizabeth D. Liddy, Ozgur Yilmazel, Svetlana Symonenko, Niranjan Balasubramanian
School of Information Studies - Faculty Scholarship
This research addresses the question of whether the AI technologies of Natural Language Processing (NLP) and Machine Learning (ML) can be used to improve security within the Intelligence Community (IC).