Open Access. Powered by Scholars. Published by Universities.®
Social and Behavioral Sciences Commons™
Articles 1 - 2 of 2
Full-Text Articles in Social and Behavioral Sciences
Assessing Topical Homogeneity With Word Embedding And Distance Matrices, Jeffrey M. Stanton, Yisi Sang
School of Information Studies - Faculty Scholarship
Researchers from many fields have used statistical tools to make sense of large bodies of text. Many tools support quantitative analysis of documents within a corpus, but relatively few studies have examined statistical characteristics of whole corpora. Statistical summaries of whole corpora and comparisons between corpora have potential application in the analysis of topically organized applications such as social media platforms. In this study, we created matrix representations of several corpora and examined several statistical tests to make comparisons between pairs of corpora with respect to the topical homogeneity of documents within each corpus. Results of three experiments suggested that a …
Finding Datasets In Publications: The Syracuse University Approach, Tong Zeng, Daniel E. Acuna
School of Information Studies - Faculty Scholarship
Datasets are critical for scientific research, playing a role in replication, reproducibility, and efficiency. Researchers have recently shown that datasets are becoming more important for science to function properly, even serving as artifacts of study themselves. However, citing datasets is not a common or standard practice in spite of recent efforts by data repositories and funding agencies. This greatly affects our ability to track their usage and importance. A potential solution to this problem is to automatically extract dataset mentions from scientific articles. In this work, we propose to achieve such extraction by using a neural network based on a …