Open Access. Powered by Scholars. Published by Universities.®

Medicine and Health Commons

Articles 1 - 30 of 50

Full-Text Articles in Medicine and Health

The Shocklet Transform: A Decomposition Method For The Identification Of Local, Mechanism-Driven Dynamics In Sociotechnical Time Series, David Rushing Dewhurst, Thayer Alshaabi, Dilan Kiley, Michael V. Arnold, Joshua R. Minot, Christopher M. Danforth, Peter Sheridan Dodds Dec 2020

College of Engineering and Mathematical Sciences Faculty Publications

We introduce a qualitative, shape-based, timescale-independent time-domain transform used to extract local dynamics from sociotechnical time series—termed the Discrete Shocklet Transform (DST)—and an associated similarity search routine, the Shocklet Transform And Ranking (STAR) algorithm, that indicates time windows during which panels of time series display qualitatively-similar anomalous behavior. After distinguishing our algorithms from other methods used in anomaly detection and time series similarity search, such as the matrix profile, seasonal-hybrid ESD, and discrete wavelet transform-based procedures, we demonstrate the DST’s ability to identify mechanism-driven dynamics at a wide range of timescales and its relative insensitivity to functional parameterization. As an …


Chimera States And Seizures In A Mouse Neuronal Model, Henry M. Mitchell, Peter Sheridan Dodds, J. Matthew Mahoney, Christopher M. Danforth Oct 2020

College of Engineering and Mathematical Sciences Faculty Publications

Chimera states - the coexistence of synchrony and asynchrony in a nonlocally-coupled network of identical oscillators - are often used as a model framework for epileptic seizures. Here, we explore the dynamics of chimera states in a network of modified Hindmarsh-Rose neurons, configured to reflect the graph of the mesoscale mouse connectome. Our model produces superficially epileptiform activity converging on persistent chimera states in a large region of a two-parameter space governing connections (a) between subcortices within a cortex and (b) between cortices. Our findings contribute to a growing body of literature suggesting mathematical models can qualitatively reproduce epileptic seizure …


Novel Evolutionary Algorithm Identifies Interactions Driving Infestation Of Triatoma Dimidiata, A Chagas Disease Vector, John P. Hanley, Donna M. Rizzo, Lori Stevens, Sara Helms Cahan, Patricia L. Dorn, Leslie A. Morrissey, Antonieta Guadalupe Rodas, Lucia C. Orantes, Carlota Monroy Aug 2020

College of Engineering and Mathematical Sciences Faculty Publications

Chagas disease is a lethal, neglected tropical disease. Unfortunately, aggressive insecticide-spraying campaigns have not been able to eliminate domestic infestation of Triatoma dimidiata, the native vector in Guatemala. To target interventions toward houses most at risk of infestation, comprehensive socioeconomic and entomologic surveys were conducted in two towns in Jutiapa, Guatemala. Given the exhaustively large search space associated with combinations of risk factors, traditional statistics are limited in their ability to discover risk factor interactions. Two recently developed statistical evolutionary algorithms, specifically designed to accommodate risk factor interactions and heterogeneity, were applied to this large combinatorial search space and used …


Hahahahaha, Duuuuude, Yeeessss!: A Two-Parameter Characterization Of Stretchable Words And The Dynamics Of Mistypings And Misspellings, Tyler J. Gray, Christopher M. Danforth, Peter Sheridan Dodds May 2020

College of Engineering and Mathematical Sciences Faculty Publications

Stretched words like 'heellllp' or 'heyyyyy' are a regular feature of spoken language, often used to emphasize or exaggerate the underlying meaning of the root word. While stretched words are rarely found in formal written language and dictionaries, they are prevalent within social media. In this paper, we examine the frequency distributions of 'stretchable words' found in roughly 100 billion tweets authored over an 8 year period. We introduce two central parameters, 'balance' and 'stretch', that capture their main characteristics, and explore their dynamics by creating visual tools we call 'balance plots' and 'spelling trees'. We discuss how the tools …


A Tandem Evolutionary Algorithm For Identifying Causal Rules From Complex Data, John P. Hanley, Donna M. Rizzo, Jeffrey S. Buzas, Margaret J. Eppstein Jan 2020

College of Engineering and Mathematical Sciences Faculty Publications

We propose a new evolutionary approach for discovering causal rules in complex classification problems from batch data. Key aspects include (a) the use of a hypergeometric probability mass function as a principled statistic for assessing fitness that quantifies the probability that the observed association between a given clause and target class is due to chance, taking into account the size of the dataset, the amount of missing data, and the distribution of outcome categories, (b) tandem age-layered evolutionary algorithms for evolving parsimonious archives of conjunctive clauses, and disjunctions of these conjunctions, each of which have probabilistically significant associations with outcome …
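
As a rough illustration of the hypergeometric statistic mentioned in this abstract, the sketch below computes the standard hypergeometric probability mass function. The function name, argument names, and the toy numbers are assumptions made for this example; the abstract does not say exactly how the authors fold missing data or outcome distributions into their fitness measure.

```python
from math import comb

def hypergeom_pmf(N, K, n, k):
    """Probability of exactly k target-class cases among n matched cases.

    N: total number of observations (hypothetical naming)
    K: observations in the target class
    n: observations matched by a clause
    k: matched observations that are also in the target class
    """
    # Outside the feasible range the probability is zero.
    if k < max(0, n - (N - K)) or k > min(K, n):
        return 0.0
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

# Toy numbers: 10 observations, 4 in the target class, a clause matching 3.
p_two_hits = hypergeom_pmf(10, 4, 3, 2)
total = sum(hypergeom_pmf(10, 4, 3, k) for k in range(0, 4))
```

A smaller probability means the observed overlap between clause and class is less likely to have arisen by chance under random sampling without replacement.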


A Crowdsourcing Approach To Understand Weight And Weight Loss In Men, Tiffany Rounds, Josh Bongard, Paul Hines, Jean Harvey Mar 2019

College of Engineering and Mathematical Sciences Faculty Publications

No abstract provided.


Social Media Usage Patterns During Natural Hazards, Meredith T. Niles, Benjamin F. Emery, Andrew J. Reagan, Peter Sheridan Dodds, Christopher M. Danforth Feb 2019

College of Engineering and Mathematical Sciences Faculty Publications

Natural hazards are becoming increasingly expensive as climate change and development are exposing communities to greater risks. Preparation and recovery are critical for climate change resilience, and social media are being used more and more to communicate before, during, and after disasters. While there is a growing body of research aimed at understanding how people use social media surrounding disaster events, most existing work has focused on a single disaster case study. In the present study, we analyze five of the costliest disasters in the last decade in the United States (Hurricanes Irene and Sandy, two sets of tornado outbreaks, …


English Verb Regularization In Books And Tweets, Tyler J. Gray, Andrew J. Reagan, Peter Sheridan Dodds, Christopher M. Danforth Dec 2018

College of Engineering and Mathematical Sciences Faculty Publications

The English language has evolved dramatically throughout its lifespan, to the extent that a modern speaker of Old English would be incomprehensible without translation. One concrete indicator of this process is the movement from irregular to regular (-ed) forms for the past tense of verbs. In this study we quantify the extent of verb regularization using two vastly disparate datasets: (1) Six years of published books scanned by Google (2003-2008), and (2) A decade of social media messages posted to Twitter (2008-2017). We find that the extent of verb regularization is greater on Twitter, taken as a whole, than in …


Uncovering Vector, Parasite, Blood Meal And Microbiome Patterns From Mixed-DNA Specimens Of The Chagas Disease Vector Triatoma Dimidiata, Lucia C. Orantes, Carlota Monroy, Patricia L. Dorn, Lori Stevens, Donna M. Rizzo, Leslie Morrissey, John P. Hanley, Antonieta Guadalupe Rodas, Bethany Richards, Kimberly F. Wallin, Sara Helms Cahan Oct 2018

College of Engineering and Mathematical Sciences Faculty Publications

Chagas disease, considered a neglected disease by the World Health Organization, is caused by the protozoan parasite Trypanosoma cruzi, and transmitted by >140 triatomine species across the Americas. In Central America, the main vector is Triatoma dimidiata, an opportunistic blood meal feeder inhabiting both domestic and sylvatic ecotopes. Given the diversity of interacting biological agents involved in the epidemiology of Chagas disease, having simultaneous information on the dynamics of the parasite, vector, the gut microbiome of the vector, and the blood meal source would facilitate identifying key biotic factors associated with the risk of T. cruzi transmission. In this study, …


Continuum Rich-Get-Richer Processes: Mean Field Analysis With An Application To Firm Size, David Rushing Dewhurst, Christopher M. Danforth, Peter Sheridan Dodds Jun 2018

College of Engineering and Mathematical Sciences Faculty Publications

Classical rich-get-richer models have found much success in being able to broadly reproduce the statistics and dynamics of diverse real complex systems. These rich-get-richer models are based on classical urn models and unfold step by step in discrete time. Here, we consider a natural variation acting on a temporal continuum in the form of a partial differential equation (PDE). We first show that the continuum version of Simon's canonical preferential attachment model exhibits an identical size distribution. In relaxing Simon's assumption of a linear growth mechanism, we consider the case of an arbitrary growth kernel and find the general solution …


Divergent Discourse Between Protests And Counter-Protests: #BlackLivesMatter And #AllLivesMatter, Ryan J. Gallagher, Andrew J. Reagan, Christopher M. Danforth, Peter Sheridan Dodds Apr 2018

College of Engineering and Mathematical Sciences Faculty Publications

Since the shooting of Black teenager Michael Brown by White police officer Darren Wilson in Ferguson, Missouri, the protest hashtag #BlackLivesMatter has amplified critiques of extrajudicial killings of Black Americans. In response to #BlackLivesMatter, other Twitter users have adopted #AllLivesMatter, a counter-protest hashtag whose content argues that equal attention should be given to all lives regardless of race. Through a multi-level analysis of over 860,000 tweets, we study how these protests and counter-protests diverge by quantifying aspects of their discourse. We find that #AllLivesMatter facilitates opposition between #BlackLivesMatter and hashtags such as #PoliceLivesMatter and #BlueLivesMatter in such a way that …


Forecasting The Onset And Course Of Mental Illness With Twitter Data, Andrew G. Reece, Andrew J. Reagan, Katharina L.M. Lix, Peter Sheridan Dodds, Christopher M. Danforth, Ellen J. Langer Dec 2017

College of Engineering and Mathematical Sciences Faculty Publications

We developed computational models to predict the emergence of depression and Post-Traumatic Stress Disorder in Twitter users. Twitter data and details of depression history were collected from 204 individuals (105 depressed, 99 healthy). We extracted predictive features measuring affect, linguistic style, and context from participant tweets (N = 279,951) and built models using these features with supervised learning algorithms. Resulting models successfully discriminated between depressed and healthy content, and compared favorably to general practitioners' average success rates in diagnosing depression, albeit in a separate population. Results held even when the analysis was restricted to content posted before first depression diagnosis. …


Erratum To: Instagram Photos Reveal Predictive Markers Of Depression (EPJ Data Science, (2017), 6, 1, (15), 10.1140/epjds/s13688-017-0110-z), Andrew G. Reece, Christopher M. Danforth Dec 2017

College of Engineering and Mathematical Sciences Faculty Publications

Upon publication of the original article [1], it was noticed that Figure 2 contained an error. The horizontal bars for the ‘likes’ row were incorrectly shown as blue. The horizontal bars for the ‘likes’ row should be orange. This has now been acknowledged and corrected in this erratum. The correct Figure 2 is shown below. In the section Method, subsection Improving data quality, the sentence ‘We also excluded participants with CES-D scores of 22 or higher.’ should read as ‘We also excluded participants with CES-D scores of 21 or lower.’ This has now been acknowledged and corrected in this erratum. …


Instagram Photos Reveal Predictive Markers Of Depression, Andrew G. Reece, Christopher M. Danforth Dec 2017

College of Engineering and Mathematical Sciences Faculty Publications

Using Instagram data from 166 individuals, we applied machine learning tools to successfully identify markers of depression. Statistical features were computationally extracted from 43,950 participant Instagram photos, using color analysis, metadata components, and algorithmic face detection. Resulting models outperformed general practitioners’ average unassisted diagnostic success rate for depression. These results held even when the analysis was restricted to posts made before depressed individuals were first diagnosed. Human ratings of photo attributes (happy, sad, etc.) were weaker predictors of depression, and were uncorrelated with computationally-generated features. These results suggest new avenues for early screening and detection of mental illness.


Sentiment Analysis Methods For Understanding Large-Scale Texts: A Case For Using Continuum-Scored Words And Word Shift Graphs, Andrew J. Reagan, Christopher M. Danforth, Brian Tivnan, Jake Ryland Williams, Peter Sheridan Dodds Dec 2017

College of Engineering and Mathematical Sciences Faculty Publications

The emergence and global adoption of social media has rendered possible the real-time estimation of population-scale sentiment, an extraordinary capacity which has profound implications for our understanding of human behavior. Given the growing assortment of sentiment-measuring instruments, it is imperative to understand which aspects of sentiment dictionaries contribute to both their classification accuracy and their ability to provide richer understanding of texts. Here, we perform detailed, quantitative tests and qualitative assessments of 6 dictionary-based methods applied to 4 different corpora, and briefly examine a further 20 methods. We show that while inappropriate for sentences, dictionary-based methods are generally robust in …


Simon's Fundamental Rich-Get-Richer Model Entails A Dominant First-Mover Advantage, Peter Sheridan Dodds, David Rushing Dewhurst, Fletcher F. Hazlehurst, Colin M. Van Oort, Lewis Mitchell, Andrew J. Reagan, Jake Ryland Williams, Christopher M. Danforth May 2017

College of Engineering and Mathematical Sciences Faculty Publications

Herbert Simon's classic rich-get-richer model is one of the simplest empirically supported mechanisms capable of generating heavy-tail size distributions for complex systems. Simon argued analytically that a population of flavored elements growing by either adding a novel element or randomly replicating an existing one would afford a distribution of group sizes with a power-law tail. Here, we show that, in fact, Simon's model does not produce a simple power-law size distribution as the initial element has a dominant first-mover advantage, and will be overrepresented by a factor proportional to the inverse of the innovation probability. The first group's size discrepancy …
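
The growth rule described in this abstract (add a novel element, or replicate a randomly chosen existing one) can be sketched as a short simulation. This is a minimal illustration only: the function name, the parameter name `rho` for the innovation probability, and the chosen values are assumptions, not the authors' code.

```python
import random

def simon_model(steps, rho, seed=0):
    """Simulate a basic rich-get-richer process and return group sizes.

    steps: number of elements added after the initial one
    rho:   probability that a new element starts a new group (assumed name)
    """
    rng = random.Random(seed)
    elements = [0]  # group label of every element; group 0 is the first mover
    n_groups = 1
    for _ in range(steps):
        if rng.random() < rho:
            # Innovation: the new element founds a new group.
            elements.append(n_groups)
            n_groups += 1
        else:
            # Replication: copy a uniformly chosen existing element, so a
            # group is picked with probability proportional to its size.
            elements.append(rng.choice(elements))
    sizes = [0] * n_groups
    for label in elements:
        sizes[label] += 1
    return sizes

sizes = simon_model(steps=20000, rho=0.05)
```

Comparing `sizes[0]` with the sizes of later groups across several seeds is one simple way to look for the first-mover effect the abstract describes.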


Connecting Every Bit Of Knowledge: The Structure Of Wikipedia's First Link Network, Mark Ibrahim, Christopher M. Danforth, Peter Sheridan Dodds Mar 2017

College of Engineering and Mathematical Sciences Faculty Publications

Apples, porcupines, and the most obscure Bob Dylan song—is every topic a few clicks from Philosophy? Within Wikipedia, the surprising answer is yes: nearly all paths lead to Philosophy. Wikipedia is the largest, most meticulously indexed collection of human knowledge ever amassed. More than information about a topic, Wikipedia is a web of naturally emerging relationships. By following the first link in each article, we algorithmically construct a directed network of all 4.7 million articles: Wikipedia's First Link Network. Here, we study the English edition of Wikipedia's First Link Network for insight into how the many articles on inventions, places, …


The Lexicocalorimeter: Gauging Public Health Through Caloric Input And Output On Social Media, Sharon E. Alajajian, Jake Ryland Williams, Andrew J. Reagan, Stephen C. Alajajian, Morgan R. Frank, Lewis Mitchell, Jacob Lahne, Christopher M. Danforth, Peter Sheridan Dodds Feb 2017

College of Engineering and Mathematical Sciences Faculty Publications

We propose and develop a Lexicocalorimeter: an online, interactive instrument for measuring the "caloric content" of social media and other large-scale texts. We do so by constructing extensive yet improvable tables of food and activity related phrases, and respectively assigning them with sourced estimates of caloric intake and expenditure. We show that for Twitter, our naive measures of "caloric input", "caloric output", and the ratio of these measures are all strong correlates with health and well-being measures for the contiguous United States. Our caloric balance measure in many cases outperforms both its constituent quantities; is tunable to specific health and …


The Emotional Arcs Of Stories Are Dominated By Six Basic Shapes, Andrew J. Reagan, Lewis Mitchell, Dilan Kiley, Christopher M. Danforth, Peter Sheridan Dodds Dec 2016

College of Engineering and Mathematical Sciences Faculty Publications

Advances in computing power, natural language processing, and digitization of text now make it possible to study a culture’s evolution through its texts using a ‘big data’ lens. Our ability to communicate relies in part upon a shared emotional experience, with stories often following distinct emotional trajectories and forming patterns that are meaningful to us. Here, by classifying the emotional arcs for a filtered subset of 1,327 stories from Project Gutenberg’s fiction collection, we find a set of six core emotional arcs which form the essential building blocks of complex emotional trajectories. We strengthen our findings by separately applying matrix …


Vaporous Marketing: Uncovering Pervasive Electronic Cigarette Advertisements On Twitter, Eric M. Clark, Chris A. Jones, Jake Ryland Williams, Allison N. Kurti, Mitchell Craig Norotsky, Christopher M. Danforth, Peter Sheridan Dodds Jul 2016

College of Engineering and Mathematical Sciences Faculty Publications

Background: Twitter has become the "wild-west" of marketing and promotional strategies for advertisement agencies. Electronic cigarettes have been heavily marketed across Twitter feeds, offering discounts, "kid-friendly" flavors, algorithmically generated false testimonials, and free samples. Methods: All electronic cigarette keyword related tweets from a 10% sample of Twitter spanning January 2012 through December 2014 (approximately 850,000 total tweets) were identified and categorized as Automated or Organic by combining a keyword classification and a machine trained Human Detection algorithm. A sentiment analysis using Hedonometrics was performed on Organic tweets to quantify the change in consumer sentiments over time. Commercialized tweets were topically …


Game Story Space Of Professional Sports: Australian Rules Football, Dilan Patrick Kiley, Andrew J. Reagan, Lewis Mitchell, Christopher M. Danforth, Peter Sheridan Dodds May 2016

College of Engineering and Mathematical Sciences Faculty Publications

Sports are spontaneous generators of stories. Through skill and chance, the script of each game is dynamically written in real time by players acting out possible trajectories allowed by a sport's rules. By properly characterizing a given sport's ecology of "game stories," we are able to capture the sport's capacity for unfolding interesting narratives, in part by contrasting them with random walks. Here we explore the game story space afforded by a data set of 1310 Australian Football League (AFL) score lines. We find that AFL games exhibit a continuous spectrum of stories rather than distinct clusters. We show how …


A Regional Model For Malaria Vector Developmental Habitats Evaluated Using Explicit, Pond-Resolving Surface Hydrology Simulations, Ernest Ohene Asare, Adrian Mark Tompkins, Arne Bomblies Mar 2016

College of Engineering and Mathematical Sciences Faculty Publications

This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Dynamical malaria models can relate precipitation to the availability of vector breeding sites using simple models of surface hydrology. Here, a revised scheme is developed for the VECTRI malaria model, which is evaluated alongside the default scheme using a two year simulation by HYDREMATS, a 10 metre resolution, village-scale model that explicitly simulates individual ponds. Despite the simplicity of the two VECTRI surface hydrology parametrization schemes, …


Identifying Missing Dictionary Entries With Frequency-Conserving Context Models, Jake Ryland Williams, Eric M. Clark, James P. Bagrow, Christopher M. Danforth, Peter Sheridan Dodds Oct 2015

College of Engineering and Mathematical Sciences Faculty Publications

In an effort to better understand meaning from natural language texts, we explore methods aimed at organizing lexical objects into contexts. A number of these methods for organization fall into a family defined by word ordering. Unlike demographic or spatial partitions of data, these collocation models are of special importance for their universal applicability. While we are interested here in text and have framed our treatment appropriately, our work is potentially applicable to other areas of research (e.g., speech, genomics, and mobility patterns) where one has ordered categorical data (e.g., sounds, genes, and locations). Our approach focuses on the phrase …


Characterizing The Google Books Corpus: Strong Limits To Inferences Of Socio-Cultural And Linguistic Evolution, Eitan Adam Pechenick, Christopher M. Danforth, Peter Sheridan Dodds Oct 2015

College of Engineering and Mathematical Sciences Faculty Publications

It is tempting to treat frequency trends from the Google Books data sets as indicators of the "true" popularity of various words and phrases. Doing so allows us to draw quantitatively strong conclusions about the evolution of cultural perception of a given topic, such as time or gender. However, the Google Books corpus suffers from a number of limitations which make it an obscure mask of cultural popularity. A primary issue is that the corpus is in effect a library, containing one of each book. A single, prolific author is thereby able to noticeably insert new phrases into the Google …


If You've Seen One Worm, Have You Seen Them All? Spatial, Community, And Genetic Variability Of Tubificid Communities In Montana, Nilanjan Lodh, Donna M. Rizzo, Billie L. Kerans, Stephanie McGinnis, Nikolaos Fytilis, Lori Stevens Sep 2015

College of Engineering and Mathematical Sciences Faculty Publications

Genetic studies are recognized increasingly as important for understanding naturally occurring disease dynamics and are used to predict host genetic diversity and coevolutionary processes and to identify species composition in ecological communities. Tubifex tubifex, the definitive host of the whirling disease parasite Myxobolus cerebralis, comprises 6 known lineages that vary widely in parasite susceptibility. We used 16S ribosomal DNA (16S rDNA) to identify relationships among genetic variability of 3 oligochaete genera (T. tubifex, Rhyacodrilus spp., and Ilyodrilus spp.; Oligochaeta:Tubificidae), oligochaete assemblage composition, and the presence of whirling disease in 9 locations across 4 watersheds in Montana, USA. We assessed genetic …


Climate Change Sentiment On Twitter: An Unsolicited Public Opinion Poll, Emily M. Cody, Andrew J. Reagan, Lewis Mitchell, Peter Sheridan Dodds, Christopher M. Danforth Aug 2015

College of Engineering and Mathematical Sciences Faculty Publications

The consequences of anthropogenic climate change are extensively debated through scientific papers, newspaper articles, and blogs. Newspaper articles may lack accuracy, while the severity of findings in scientific papers may be too opaque for the public to understand. Social media, however, is a forum where individuals of diverse backgrounds can share their thoughts and opinions. As consumption shifts from old media to new, Twitter has become a valuable resource for analyzing current events and headline news. In this research, we analyze tweets containing the word "climate" collected between September 2008 and July 2014. Through use of a previously developed sentiment …


Zipf's Law Holds For Phrases, Not Words, Jake Ryland Williams, Paul R. Lessard, Suma Desu, Eric M. Clark, James P. Bagrow, Christopher M. Danforth, Peter Sheridan Dodds Aug 2015

College of Engineering and Mathematical Sciences Faculty Publications

With Zipf's law being originally and most famously observed for word frequency, it is surprisingly limited in its applicability to human language, holding over no more than three to four orders of magnitude before hitting a clear break in scaling. Here, building on the simple observation that phrases of one or more words comprise the most coherent units of meaning in language, we show empirically that Zipf's law for phrases extends over as many as nine orders of rank magnitude. In doing so, we develop a principled and scalable statistical mechanical method of random text partitioning, which opens up a …


Reply To Garcia Et Al.: Common Mistakes In Measuring Frequency-Dependent Word Characteristics, Peter Sheridan Dodds, Eric M. Clark, Suma Desu, Morgan R. Frank, Andrew J. Reagan, Jake Ryland Williams, Lewis Mitchell, Kameron Decker Harris, Isabel M. Kloumann, James P. Bagrow, Karine Megerdoomian, Matthew T. McMahon, Brian F. Tivnan, Christopher M. Danforth Jun 2015

College of Engineering and Mathematical Sciences Faculty Publications

No abstract provided.


Text Mixing Shapes The Anatomy Of Rank-Frequency Distributions, Jake Ryland Williams, James P. Bagrow, Christopher M. Danforth, Peter Sheridan Dodds May 2015

College of Engineering and Mathematical Sciences Faculty Publications

Natural languages are full of rules and exceptions. One of the most famous quantitative rules is Zipf's law, which states that the frequency of occurrence of a word is approximately inversely proportional to its rank. Though this "law" of ranks has been found to hold across disparate texts and forms of data, analyses of increasingly large corpora since the late 1990s have revealed the existence of two scaling regimes. These regimes have thus far been explained by a hypothesis suggesting a separability of languages into core and noncore lexica. Here we present and defend an alternative hypothesis that the two …
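
Zipf's law as stated in this abstract (frequency roughly inversely proportional to rank) can be checked on a token list with a few lines of code. The sketch below is an illustration under stated assumptions: the helper names are hypothetical, the corpus is synthetic, and a plain least-squares fit in log-log space is used for simplicity rather than any method from the paper.

```python
import math
from collections import Counter

def rank_frequency(tokens):
    """Return word counts ordered from rank 1 (most frequent) downward."""
    return sorted(Counter(tokens).values(), reverse=True)

def zipf_exponent(freqs):
    """Negated least-squares slope of log(frequency) against log(rank)."""
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx

# Synthetic corpus: the word of rank r appears about 1200 / r times.
tokens = [f"w{r}" for r in range(1, 51) for _ in range(1200 // r)]
freqs = rank_frequency(tokens)
alpha = zipf_exponent(freqs)
```

An exponent near 1 corresponds to the inverse proportionality in the abstract; on real corpora, a single fitted slope would hide the two scaling regimes the abstract goes on to discuss.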


Participation And Contribution In Crowdsourced Surveys, Robert Swain, Alex Berger, Josh Bongard, Paul Hines Apr 2015

College of Engineering and Mathematical Sciences Faculty Publications

This paper identifies trends within and relationships between the amount of participation and the quality of contributions in three crowdsourced surveys. Participants were asked to perform a collective problem solving task that lacked any explicit incentive: they were instructed not only to respond to survey questions but also to pose new questions that they thought might, if responded to by others, predict an outcome variable of interest to them. While the three surveys had very different outcome variables, target audiences, methods of advertisement, and lengths of deployment, we found very similar patterns of collective behavior. In particular, we found that: the rate …