Open Access. Powered by Scholars. Published by Universities.®

Social and Behavioral Sciences Commons

Full-Text Articles in Social and Behavioral Sciences

Forecasting The Onset And Course Of Mental Illness With Twitter Data, Andrew G. Reece, Andrew J. Reagan, Katharina L.M. Lix, Peter Sheridan Dodds, Christopher M. Danforth, Ellen J. Langer Dec 2017

College of Engineering and Mathematical Sciences Faculty Publications

We developed computational models to predict the emergence of depression and Post-Traumatic Stress Disorder in Twitter users. Twitter data and details of depression history were collected from 204 individuals (105 depressed, 99 healthy). We extracted predictive features measuring affect, linguistic style, and context from participant tweets (N = 279,951) and built models using these features with supervised learning algorithms. Resulting models successfully discriminated between depressed and healthy content, and compared favorably to general practitioners' average success rates in diagnosing depression, albeit in a separate population. Results held even when the analysis was restricted to content posted before first depression diagnosis. …


Erratum To: Instagram Photos Reveal Predictive Markers Of Depression (EPJ Data Science, (2017), 6, 1, (15), 10.1140/epjds/s13688-017-0110-z), Andrew G. Reece, Christopher M. Danforth Dec 2017

College of Engineering and Mathematical Sciences Faculty Publications

Upon publication of the original article [1], it was noticed that Figure 2 contained an error. The horizontal bars for the ‘likes’ row were incorrectly shown as blue; they should be orange. This has now been acknowledged and corrected in this erratum. The correct Figure 2 is shown below. In the section Method, subsection Improving data quality, the sentence ‘We also excluded participants with CES-D scores of 22 or higher.’ should read ‘We also excluded participants with CES-D scores of 21 or lower.’ This has now been acknowledged and corrected in this erratum. …


Instagram Photos Reveal Predictive Markers Of Depression, Andrew G. Reece, Christopher M. Danforth Dec 2017

College of Engineering and Mathematical Sciences Faculty Publications

Using Instagram data from 166 individuals, we applied machine learning tools to successfully identify markers of depression. Statistical features were computationally extracted from 43,950 participant Instagram photos, using color analysis, metadata components, and algorithmic face detection. Resulting models outperformed general practitioners’ average unassisted diagnostic success rate for depression. These results held even when the analysis was restricted to posts made before depressed individuals were first diagnosed. Human ratings of photo attributes (happy, sad, etc.) were weaker predictors of depression, and were uncorrelated with computationally generated features. These results suggest new avenues for early screening and detection of mental illness.


Sentiment Analysis Methods For Understanding Large-Scale Texts: A Case For Using Continuum-Scored Words And Word Shift Graphs, Andrew J. Reagan, Christopher M. Danforth, Brian Tivnan, Jake Ryland Williams, Peter Sheridan Dodds Dec 2017

College of Engineering and Mathematical Sciences Faculty Publications

The emergence and global adoption of social media have rendered possible the real-time estimation of population-scale sentiment, an extraordinary capacity which has profound implications for our understanding of human behavior. Given the growing assortment of sentiment-measuring instruments, it is imperative to understand which aspects of sentiment dictionaries contribute to both their classification accuracy and their ability to provide richer understanding of texts. Here, we perform detailed, quantitative tests and qualitative assessments of 6 dictionary-based methods applied to 4 different corpora, and briefly examine a further 20 methods. We show that while inappropriate for sentences, dictionary-based methods are generally robust in …


Simon's Fundamental Rich-Get-Richer Model Entails A Dominant First-Mover Advantage, Peter Sheridan Dodds, David Rushing Dewhurst, Fletcher F. Hazlehurst, Colin M. Van Oort, Lewis Mitchell, Andrew J. Reagan, Jake Ryland Williams, Christopher M. Danforth May 2017

College of Engineering and Mathematical Sciences Faculty Publications

Herbert Simon's classic rich-get-richer model is one of the simplest empirically supported mechanisms capable of generating heavy-tail size distributions for complex systems. Simon argued analytically that a population of flavored elements growing by either adding a novel element or randomly replicating an existing one would afford a distribution of group sizes with a power-law tail. Here, we show that, in fact, Simon's model does not produce a simple power-law size distribution as the initial element has a dominant first-mover advantage, and will be overrepresented by a factor proportional to the inverse of the innovation probability. The first group's size discrepancy …
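The growth rule summarized in this abstract can be illustrated with a short simulation. What follows is a minimal sketch under stated assumptions, not the authors' code: the function name `simulate_simon`, the parameter names, the single seed element, and the chosen innovation probability are all hypothetical. At each step a new group is created with the innovation probability; otherwise an existing element chosen uniformly at random is replicated, so a group is picked in proportion to its size.

```python
import random
from collections import Counter


def simulate_simon(num_steps, innovation_prob, seed=0):
    """Simulate a rich-get-richer process; return a Counter of group sizes.

    Assumption: the system starts with one element belonging to group 0.
    """
    rng = random.Random(seed)
    elements = [0]   # one entry per element, holding that element's group label
    next_label = 1   # label that the next newly created group will receive
    for _ in range(num_steps):
        if rng.random() < innovation_prob:
            # Innovation: add a novel element that founds a new group.
            elements.append(next_label)
            next_label += 1
        else:
            # Replication: copy a uniformly random existing element, which
            # selects a group with probability proportional to its size.
            elements.append(rng.choice(elements))
    return Counter(elements)


# Illustrative run with a hypothetical innovation probability of 0.01.
sizes = simulate_simon(num_steps=100_000, innovation_prob=0.01)
for label, size in sizes.most_common(3):
    print(f"group {label}: {size} elements")
```

Comparing the size of group 0 with the next-largest groups over several seeds and innovation probabilities is one informal way to explore the first-mover effect the abstract describes.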


Connecting Every Bit Of Knowledge: The Structure Of Wikipedia's First Link Network, Mark Ibrahim, Christopher M. Danforth, Peter Sheridan Dodds Mar 2017

College of Engineering and Mathematical Sciences Faculty Publications

Apples, porcupines, and the most obscure Bob Dylan song—is every topic a few clicks from Philosophy? Within Wikipedia, the surprising answer is yes: nearly all paths lead to Philosophy. Wikipedia is the largest, most meticulously indexed collection of human knowledge ever amassed. More than information about a topic, Wikipedia is a web of naturally emerging relationships. By following the first link in each article, we algorithmically construct a directed network of all 4.7 million articles: Wikipedia's First Link Network. Here, we study the English edition of Wikipedia's First Link Network for insight into how the many articles on inventions, places, …
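The first-link traversal described in this abstract can be sketched on a toy graph. This is an illustration under stated assumptions rather than the authors' pipeline: the function name `follow_first_links` is hypothetical, and the `toy_first_link` mapping is invented for the example and does not reflect the actual first links of any Wikipedia article.

```python
def follow_first_links(first_link, start, target="Philosophy"):
    """Follow first links from `start` until `target`, a dead end, or a cycle.

    `first_link` maps an article title to the title of its first link.
    Returns (path, reached), where `reached` is True if `target` was hit.
    """
    path = [start]
    seen = {start}
    current = start
    while current != target:
        nxt = first_link.get(current)
        if nxt is None or nxt in seen:
            # Dead end (no recorded first link) or a cycle that misses target.
            return path, False
        path.append(nxt)
        seen.add(nxt)
        current = nxt
    return path, True


# Invented toy data; NOT real Wikipedia first links.
toy_first_link = {
    "Apple": "Fruit",
    "Fruit": "Botany",
    "Botany": "Science",
    "Science": "Philosophy",
    "Loop A": "Loop B",
    "Loop B": "Loop A",
}

path, reached = follow_first_links(toy_first_link, "Apple")
print(" -> ".join(path), "| reached Philosophy:", reached)
```

Because each article has exactly one first link, every node in such a network has out-degree at most one, so a traversal either reaches the target, stops at a dead end, or falls into a cycle.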


The Lexicocalorimeter: Gauging Public Health Through Caloric Input And Output On Social Media, Sharon E. Alajajian, Jake Ryland Williams, Andrew J. Reagan, Stephen C. Alajajian, Morgan R. Frank, Lewis Mitchell, Jacob Lahne, Christopher M. Danforth, Peter Sheridan Dodds Feb 2017

College of Engineering and Mathematical Sciences Faculty Publications

We propose and develop a Lexicocalorimeter: an online, interactive instrument for measuring the "caloric content" of social media and other large-scale texts. We do so by constructing extensive yet improvable tables of food- and activity-related phrases, and assigning them sourced estimates of caloric intake and expenditure, respectively. We show that for Twitter, our naive measures of "caloric input", "caloric output", and the ratio of these measures all correlate strongly with health and well-being measures for the contiguous United States. Our caloric balance measure in many cases outperforms both its constituent quantities; is tunable to specific health and …