Open Access. Powered by Scholars. Published by Universities.®

Social and Behavioral Sciences Commons

Library and Information Science

Syracuse University

Series

2020

Data analysis

Full-Text Articles in Social and Behavioral Sciences

Assessing Topical Homogeneity With Word Embedding And Distance Matrices, Jeffrey M. Stanton, Yisi Sang Oct 2020

School of Information Studies - Faculty Scholarship

Researchers from many fields have used statistical tools to make sense of large bodies of text. Many tools support quantitative analysis of documents within a corpus, but relatively few studies have examined statistical characteristics of whole corpora. Statistical summaries of whole corpora and comparisons between corpora have potential application in the analysis of topically organized applications such as social media platforms. In this study, we created matrix representations of several corpora and examined several statistical tests to make comparisons between pairs of corpora with respect to the topical homogeneity of documents within each corpus. Results of three experiments suggested that a …