Open Access. Powered by Scholars. Published by Universities.®
- Keyword
- Language (3)
- Resilient Communities (3)
- Climate Solutions (2)
- Happiness (2)
- Machine learning (2)
- Malaria (2)
- Social media (2)
- Sustainable Agriculture (2)
- 16S (1)
- Africa (1)
- Agent-based modeling (1)
- Anopheles (1)
- Anopheles gambiae (1)
- Blogs (1)
- Chimera state (1)
- Community assembly (1)
- Complex systems (1)
- Computational social science (1)
- Data visualization (1)
- Depression (1)
- Emotion (1)
- Epilepsy (1)
- Epistasis (1)
- Evolutionary algorithm (1)
- Graph algorithms (1)
- Haplotype diversity (1)
- Hedonometer (1)
- Heterogeneity (1)
- Immunity (1)
Articles 1 - 30 of 50
Full-Text Articles in Human Ecology
The Shocklet Transform: A Decomposition Method For The Identification Of Local, Mechanism-Driven Dynamics In Sociotechnical Time Series, David Rushing Dewhurst, Thayer Alshaabi, Dilan Kiley, Michael V. Arnold, Joshua R. Minot, Christopher M. Danforth, Peter Sheridan Dodds
College of Engineering and Mathematical Sciences Faculty Publications
We introduce a qualitative, shape-based, timescale-independent time-domain transform used to extract local dynamics from sociotechnical time series—termed the Discrete Shocklet Transform (DST)—and an associated similarity search routine, the Shocklet Transform And Ranking (STAR) algorithm, that indicates time windows during which panels of time series display qualitatively-similar anomalous behavior. After distinguishing our algorithms from other methods used in anomaly detection and time series similarity search, such as the matrix profile, seasonal-hybrid ESD, and discrete wavelet transform-based procedures, we demonstrate the DST’s ability to identify mechanism-driven dynamics at a wide range of timescales and its relative insensitivity to functional parameterization. As an …
Chimera States And Seizures In A Mouse Neuronal Model, Henry M. Mitchell, Peter Sheridan Dodds, J. Matthew Mahoney, Christopher M. Danforth
College of Engineering and Mathematical Sciences Faculty Publications
Chimera states - the coexistence of synchrony and asynchrony in a nonlocally-coupled network of identical oscillators - are often used as a model framework for epileptic seizures. Here, we explore the dynamics of chimera states in a network of modified Hindmarsh-Rose neurons, configured to reflect the graph of the mesoscale mouse connectome. Our model produces superficially epileptiform activity converging on persistent chimera states in a large region of a two-parameter space governing connections (a) between subcortices within a cortex and (b) between cortices. Our findings contribute to a growing body of literature suggesting mathematical models can qualitatively reproduce epileptic seizure …
Novel Evolutionary Algorithm Identifies Interactions Driving Infestation Of Triatoma Dimidiata, A Chagas Disease Vector, John P. Hanley, Donna M. Rizzo, Lori Stevens, Sara Helms Cahan, Patricia L. Dorn, Leslie A. Morrissey, Antonieta Guadalupe Rodas, Lucia C. Orantes, Carlota Monroy
College of Engineering and Mathematical Sciences Faculty Publications
Chagas disease is a lethal, neglected tropical disease. Unfortunately, aggressive insecticide-spraying campaigns have not been able to eliminate domestic infestation of Triatoma dimidiata, the native vector in Guatemala. To target interventions toward houses most at risk of infestation, comprehensive socioeconomic and entomologic surveys were conducted in two towns in Jutiapa, Guatemala. Given the exhaustively large search space associated with combinations of risk factors, traditional statistics are limited in their ability to discover risk factor interactions. Two recently developed statistical evolutionary algorithms, specifically designed to accommodate risk factor interactions and heterogeneity, were applied to this large combinatorial search space and used …
Hahahahaha, Duuuuude, Yeeessss!: A Two-Parameter Characterization Of Stretchable Words And The Dynamics Of Mistypings And Misspellings, Tyler J. Gray, Christopher M. Danforth, Peter Sheridan Dodds
College of Engineering and Mathematical Sciences Faculty Publications
Stretched words like 'heellllp' or 'heyyyyy' are a regular feature of spoken language, often used to emphasize or exaggerate the underlying meaning of the root word. While stretched words are rarely found in formal written language and dictionaries, they are prevalent within social media. In this paper, we examine the frequency distributions of 'stretchable words' found in roughly 100 billion tweets authored over an 8 year period. We introduce two central parameters, 'balance' and 'stretch', that capture their main characteristics, and explore their dynamics by creating visual tools we call 'balance plots' and 'spelling trees'. We discuss how the tools …
A Tandem Evolutionary Algorithm For Identifying Causal Rules From Complex Data, John P. Hanley, Donna M. Rizzo, Jeffrey S. Buzas, Margaret J. Eppstein
College of Engineering and Mathematical Sciences Faculty Publications
We propose a new evolutionary approach for discovering causal rules in complex classification problems from batch data. Key aspects include (a) the use of a hypergeometric probability mass function as a principled statistic for assessing fitness that quantifies the probability that the observed association between a given clause and target class is due to chance, taking into account the size of the dataset, the amount of missing data, and the distribution of outcome categories, (b) tandem age-layered evolutionary algorithms for evolving parsimonious archives of conjunctive clauses, and disjunctions of these conjunctions, each of which have probabilistically significant associations with outcome …
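As a rough illustration of the statistic named in (a), the sketch below evaluates a hypergeometric probability mass function for the association between a clause and a target class. The function name, the toy counts, and the choice to report a single-point probability are assumptions made for illustration; the paper's treatment of missing data and its exact fitness definition are not reproduced here.

```python
from math import comb

def hypergeom_pmf(k: int, N: int, K: int, n: int) -> float:
    """P(X = k) for a hypergeometric draw.

    N: total number of samples in the dataset
    K: number of samples in the target class
    n: number of samples matched by the clause
    k: number of matched samples that are also in the target class
    """
    # Outside the feasible range the probability is zero.
    if k < max(0, n - (N - K)) or k > min(n, K):
        return 0.0
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

# Hypothetical toy counts: 100 samples, 30 in the target class;
# a clause matches 10 samples, 8 of which are in the target class.
p = hypergeom_pmf(k=8, N=100, K=30, n=10)
print(f"chance probability of exactly this overlap: {p:.3e}")
```

A smaller probability means the observed overlap is less likely to arise by chance, which is the sense in which such a value could serve as a fitness score.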
A Crowdsourcing Approach To Understand Weight And Weight Loss In Men, Tiffany Rounds, Josh Bongard, Paul Hines, Jean Harvey
College of Engineering and Mathematical Sciences Faculty Publications
No abstract provided.
Social Media Usage Patterns During Natural Hazards, Meredith T. Niles, Benjamin F. Emery, Andrew J. Reagan, Peter Sheridan Dodds, Christopher M. Danforth
College of Engineering and Mathematical Sciences Faculty Publications
Natural hazards are becoming increasingly expensive as climate change and development are exposing communities to greater risks. Preparation and recovery are critical for climate change resilience, and social media are being used more and more to communicate before, during, and after disasters. While there is a growing body of research aimed at understanding how people use social media surrounding disaster events, most existing work has focused on a single disaster case study. In the present study, we analyze five of the costliest disasters in the last decade in the United States (Hurricanes Irene and Sandy, two sets of tornado outbreaks, …
English Verb Regularization In Books And Tweets, Tyler J. Gray, Andrew J. Reagan, Peter Sheridan Dodds, Christopher M. Danforth
College of Engineering and Mathematical Sciences Faculty Publications
The English language has evolved dramatically throughout its lifespan, to the extent that a modern speaker of Old English would be incomprehensible without translation. One concrete indicator of this process is the movement from irregular to regular (-ed) forms for the past tense of verbs. In this study we quantify the extent of verb regularization using two vastly disparate datasets: (1) Six years of published books scanned by Google (2003-2008), and (2) A decade of social media messages posted to Twitter (2008-2017). We find that the extent of verb regularization is greater on Twitter, taken as a whole, than in …
Uncovering Vector, Parasite, Blood Meal And Microbiome Patterns From Mixed-DNA Specimens Of The Chagas Disease Vector Triatoma Dimidiata, Lucia C. Orantes, Carlota Monroy, Patricia L. Dorn, Lori Stevens, Donna M. Rizzo, Leslie Morrissey, John P. Hanley, Antonieta Guadalupe Rodas, Bethany Richards, Kimberly F. Wallin, Sara Helms Cahan
College of Engineering and Mathematical Sciences Faculty Publications
Chagas disease, considered a neglected disease by the World Health Organization, is caused by the protozoan parasite Trypanosoma cruzi, and transmitted by >140 triatomine species across the Americas. In Central America, the main vector is Triatoma dimidiata, an opportunistic blood meal feeder inhabiting both domestic and sylvatic ecotopes. Given the diversity of interacting biological agents involved in the epidemiology of Chagas disease, having simultaneous information on the dynamics of the parasite, vector, the gut microbiome of the vector, and the blood meal source would facilitate identifying key biotic factors associated with the risk of T. cruzi transmission. In this study, …
Continuum Rich-Get-Richer Processes: Mean Field Analysis With An Application To Firm Size, David Rushing Dewhurst, Christopher M. Danforth, Peter Sheridan Dodds
College of Engineering and Mathematical Sciences Faculty Publications
Classical rich-get-richer models have found much success in being able to broadly reproduce the statistics and dynamics of diverse real complex systems. These rich-get-richer models are based on classical urn models and unfold step by step in discrete time. Here, we consider a natural variation acting on a temporal continuum in the form of a partial differential equation (PDE). We first show that the continuum version of Simon's canonical preferential attachment model exhibits an identical size distribution. In relaxing Simon's assumption of a linear growth mechanism, we consider the case of an arbitrary growth kernel and find the general solution …
Divergent Discourse Between Protests And Counter-Protests: #Blacklivesmatter And #Alllivesmatter, Ryan J. Gallagher, Andrew J. Reagan, Christopher M. Danforth, Peter Sheridan Dodds
College of Engineering and Mathematical Sciences Faculty Publications
Since the shooting of Black teenager Michael Brown by White police officer Darren Wilson in Ferguson, Missouri, the protest hashtag #BlackLivesMatter has amplified critiques of extrajudicial killings of Black Americans. In response to #BlackLivesMatter, other Twitter users have adopted #AllLivesMatter, a counter-protest hashtag whose content argues that equal attention should be given to all lives regardless of race. Through a multi-level analysis of over 860,000 tweets, we study how these protests and counter-protests diverge by quantifying aspects of their discourse. We find that #AllLivesMatter facilitates opposition between #BlackLivesMatter and hashtags such as #PoliceLivesMatter and #BlueLivesMatter in such a way that …
Forecasting The Onset And Course Of Mental Illness With Twitter Data, Andrew G. Reece, Andrew J. Reagan, Katharina L.M. Lix, Peter Sheridan Dodds, Christopher M. Danforth, Ellen J. Langer
College of Engineering and Mathematical Sciences Faculty Publications
We developed computational models to predict the emergence of depression and Post-Traumatic Stress Disorder in Twitter users. Twitter data and details of depression history were collected from 204 individuals (105 depressed, 99 healthy). We extracted predictive features measuring affect, linguistic style, and context from participant tweets (N = 279,951) and built models using these features with supervised learning algorithms. Resulting models successfully discriminated between depressed and healthy content, and compared favorably to general practitioners' average success rates in diagnosing depression, albeit in a separate population. Results held even when the analysis was restricted to content posted before first depression diagnosis. …
Erratum To: Instagram Photos Reveal Predictive Markers Of Depression (EPJ Data Science, (2017), 6, 1, (15), 10.1140/epjds/s13688-017-0110-z), Andrew G. Reece, Christopher M. Danforth
College of Engineering and Mathematical Sciences Faculty Publications
Upon publication of the original article [1], it was noticed that Figure 2 contained an error. The horizontal bars for the ‘likes’ row were incorrectly shown as blue. The horizontal bars for the ‘likes’ row should be orange. This has now been acknowledged and corrected in this erratum. The correct Figure 2 is shown below. In the section Method, subsection Improving data quality, the sentence ‘We also excluded participants with CES-D scores of 22 or higher.’ should read as ‘We also excluded participants with CES-D scores of 21 or lower.’ This has now been acknowledged and corrected in this erratum. …
Instagram Photos Reveal Predictive Markers Of Depression, Andrew G. Reece, Christopher M. Danforth
College of Engineering and Mathematical Sciences Faculty Publications
Using Instagram data from 166 individuals, we applied machine learning tools to successfully identify markers of depression. Statistical features were computationally extracted from 43,950 participant Instagram photos, using color analysis, metadata components, and algorithmic face detection. Resulting models outperformed general practitioners’ average unassisted diagnostic success rate for depression. These results held even when the analysis was restricted to posts made before depressed individuals were first diagnosed. Human ratings of photo attributes (happy, sad, etc.) were weaker predictors of depression, and were uncorrelated with computationally-generated features. These results suggest new avenues for early screening and detection of mental illness.
Sentiment Analysis Methods For Understanding Large-Scale Texts: A Case For Using Continuum-Scored Words And Word Shift Graphs, Andrew J. Reagan, Christopher M. Danforth, Brian Tivnan, Jake Ryland Williams, Peter Sheridan Dodds
College of Engineering and Mathematical Sciences Faculty Publications
The emergence and global adoption of social media has rendered possible the real-time estimation of population-scale sentiment, an extraordinary capacity which has profound implications for our understanding of human behavior. Given the growing assortment of sentiment-measuring instruments, it is imperative to understand which aspects of sentiment dictionaries contribute to both their classification accuracy and their ability to provide richer understanding of texts. Here, we perform detailed, quantitative tests and qualitative assessments of 6 dictionary-based methods applied to 4 different corpora, and briefly examine a further 20 methods. We show that while inappropriate for sentences, dictionary-based methods are generally robust in …
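To make the idea of a dictionary-based method with continuum-scored words concrete, the following minimal sketch averages per-word scores over a text, weighting by word frequency. The function name and the two-word score table are hypothetical and do not come from any of the dictionaries tested in the paper; words missing from the table are simply ignored, which is one common convention and an assumption here.

```python
from collections import Counter

def average_score(tokens, word_scores):
    """Frequency-weighted mean score of the tokens found in word_scores.

    Returns None when no token has a score.
    """
    counts = Counter(t for t in tokens if t in word_scores)
    total = sum(counts.values())
    if total == 0:
        return None
    return sum(word_scores[w] * c for w, c in counts.items()) / total

# Hypothetical continuum scores, for illustration only.
toy_scores = {"happy": 8.0, "sad": 2.0}
print(average_score("happy happy sad the".split(), toy_scores))
```

Because each scored word contributes in proportion to its count, comparing two texts reduces to asking which words shifted in frequency, which is the kind of question a word shift graph is meant to display.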
Simon's Fundamental Rich-Get-Richer Model Entails A Dominant First-Mover Advantage, Peter Sheridan Dodds, David Rushing Dewhurst, Fletcher F. Hazlehurst, Colin M. Van Oort, Lewis Mitchell, Andrew J. Reagan, Jake Ryland Williams, Christopher M. Danforth
College of Engineering and Mathematical Sciences Faculty Publications
Herbert Simon's classic rich-get-richer model is one of the simplest empirically supported mechanisms capable of generating heavy-tail size distributions for complex systems. Simon argued analytically that a population of flavored elements growing by either adding a novel element or randomly replicating an existing one would afford a distribution of group sizes with a power-law tail. Here, we show that, in fact, Simon's model does not produce a simple power-law size distribution as the initial element has a dominant first-mover advantage, and will be overrepresented by a factor proportional to the inverse of the innovation probability. The first group's size discrepancy …
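The growth rule described above (add a novel element, or replicate a randomly chosen existing one) can be simulated in a few lines. This is a minimal sketch under stated assumptions: the function name, the parameter name `rho` for the innovation probability, and the single-element initial condition are illustrative choices, not taken from the paper.

```python
import random
from collections import Counter

def simulate_simon(steps: int, rho: float, seed: int = 0) -> list[int]:
    """Return group sizes after `steps` arrivals, ordered by group creation time."""
    rng = random.Random(seed)
    elements = [0]   # group label of every element; start with one element in group 0
    n_groups = 1
    for _ in range(steps):
        if rng.random() < rho:
            # Innovation: the new element founds a new group.
            elements.append(n_groups)
            n_groups += 1
        else:
            # Replication: copy a uniformly random existing element, so a group
            # is chosen with probability proportional to its current size.
            elements.append(rng.choice(elements))
    counts = Counter(elements)
    return [counts[g] for g in range(n_groups)]

sizes = simulate_simon(steps=100_000, rho=0.01, seed=1)
print("first group:", sizes[0], "second group:", sizes[1] if len(sizes) > 1 else None)
```

Comparing `sizes[0]` with later groups across several seeds and values of `rho` is one way to explore the first-mover effect the abstract describes.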
Connecting Every Bit Of Knowledge: The Structure Of Wikipedia's First Link Network, Mark Ibrahim, Christopher M. Danforth, Peter Sheridan Dodds
College of Engineering and Mathematical Sciences Faculty Publications
Apples, porcupines, and the most obscure Bob Dylan song—is every topic a few clicks from Philosophy? Within Wikipedia, the surprising answer is yes: nearly all paths lead to Philosophy. Wikipedia is the largest, most meticulously indexed collection of human knowledge ever amassed. More than information about a topic, Wikipedia is a web of naturally emerging relationships. By following the first link in each article, we algorithmically construct a directed network of all 4.7 million articles: Wikipedia's First Link Network. Here, we study the English edition of Wikipedia's First Link Network for insight into how the many articles on inventions, places, …
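The construction described above, following the first link in each article, amounts to a directed graph in which every node has at most one outgoing edge. The sketch below walks such a graph from a starting article until it reaches a target or revisits a node. The function name and the toy link table are hypothetical; the links are invented for illustration and are not actual Wikipedia first links.

```python
def follow_first_links(first_link: dict[str, str], start: str,
                       target: str = "Philosophy") -> list[str]:
    """Follow first links from `start`; stop at `target`, a dead end, or a cycle."""
    path = [start]
    seen = {start}
    current = start
    while current != target:
        nxt = first_link.get(current)
        if nxt is None or nxt in seen:   # dead end, or we are about to loop
            break
        path.append(nxt)
        seen.add(nxt)
        current = nxt
    return path

# Invented toy network, not real Wikipedia data.
toy_links = {
    "Apple": "Fruit", "Fruit": "Botany", "Botany": "Science",
    "Science": "Knowledge", "Knowledge": "Philosophy",
    "A": "B", "B": "A",   # a two-article cycle that never reaches the target
}
print(follow_first_links(toy_links, "Apple"))
print(follow_first_links(toy_links, "A"))
```

Since each article has a single first link, every walk either ends at the target, hits a dead end, or falls into a cycle, so the `seen` set is enough to guarantee termination.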
The Lexicocalorimeter: Gauging Public Health Through Caloric Input And Output On Social Media, Sharon E. Alajajian, Jake Ryland Williams, Andrew J. Reagan, Stephen C. Alajajian, Morgan R. Frank, Lewis Mitchell, Jacob Lahne, Christopher M. Danforth, Peter Sheridan Dodds
College of Engineering and Mathematical Sciences Faculty Publications
We propose and develop a Lexicocalorimeter: an online, interactive instrument for measuring the "caloric content" of social media and other large-scale texts. We do so by constructing extensive yet improvable tables of food and activity related phrases, and respectively assigning them with sourced estimates of caloric intake and expenditure. We show that for Twitter, our naive measures of "caloric input", "caloric output", and the ratio of these measures are all strong correlates with health and well-being measures for the contiguous United States. Our caloric balance measure in many cases outperforms both its constituent quantities; is tunable to specific health and …
The Emotional Arcs Of Stories Are Dominated By Six Basic Shapes, Andrew J. Reagan, Lewis Mitchell, Dilan Kiley, Christopher M. Danforth, Peter Sheridan Dodds
College of Engineering and Mathematical Sciences Faculty Publications
Advances in computing power, natural language processing, and digitization of text now make it possible to study a culture’s evolution through its texts using a ‘big data’ lens. Our ability to communicate relies in part upon a shared emotional experience, with stories often following distinct emotional trajectories and forming patterns that are meaningful to us. Here, by classifying the emotional arcs for a filtered subset of 1,327 stories from Project Gutenberg’s fiction collection, we find a set of six core emotional arcs which form the essential building blocks of complex emotional trajectories. We strengthen our findings by separately applying matrix …
Vaporous Marketing: Uncovering Pervasive Electronic Cigarette Advertisements On Twitter, Eric M. Clark, Chris A. Jones, Jake Ryland Williams, Allison N. Kurti, Mitchell Craig Norotsky, Christopher M. Danforth, Peter Sheridan Dodds
College of Engineering and Mathematical Sciences Faculty Publications
Background: Twitter has become the "wild-west" of marketing and promotional strategies for advertisement agencies. Electronic cigarettes have been heavily marketed across Twitter feeds, offering discounts, "kid-friendly" flavors, algorithmically generated false testimonials, and free samples. Methods: All electronic cigarette keyword related tweets from a 10% sample of Twitter spanning January 2012 through December 2014 (approximately 850,000 total tweets) were identified and categorized as Automated or Organic by combining a keyword classification and a machine trained Human Detection algorithm. A sentiment analysis using Hedonometrics was performed on Organic tweets to quantify the change in consumer sentiments over time. Commercialized tweets were topically …
Game Story Space Of Professional Sports: Australian Rules Football, Dilan Patrick Kiley, Andrew J. Reagan, Lewis Mitchell, Christopher M. Danforth, Peter Sheridan Dodds
College of Engineering and Mathematical Sciences Faculty Publications
Sports are spontaneous generators of stories. Through skill and chance, the script of each game is dynamically written in real time by players acting out possible trajectories allowed by a sport's rules. By properly characterizing a given sport's ecology of "game stories," we are able to capture the sport's capacity for unfolding interesting narratives, in part by contrasting them with random walks. Here we explore the game story space afforded by a data set of 1310 Australian Football League (AFL) score lines. We find that AFL games exhibit a continuous spectrum of stories rather than distinct clusters. We show how …
A Regional Model For Malaria Vector Developmental Habitats Evaluated Using Explicit, Pond-Resolving Surface Hydrology Simulations, Ernest Ohene Asare, Adrian Mark Tompkins, Arne Bomblies
College of Engineering and Mathematical Sciences Faculty Publications
This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Dynamical malaria models can relate precipitation to the availability of vector breeding sites using simple models of surface hydrology. Here, a revised scheme is developed for the VECTRI malaria model, which is evaluated alongside the default scheme using a two year simulation by HYDREMATS, a 10 metre resolution, village-scale model that explicitly simulates individual ponds. Despite the simplicity of the two VECTRI surface hydrology parametrization schemes, …
Identifying Missing Dictionary Entries With Frequency-Conserving Context Models, Jake Ryland Williams, Eric M. Clark, James P. Bagrow, Christopher M. Danforth, Peter Sheridan Dodds
College of Engineering and Mathematical Sciences Faculty Publications
In an effort to better understand meaning from natural language texts, we explore methods aimed at organizing lexical objects into contexts. A number of these methods for organization fall into a family defined by word ordering. Unlike demographic or spatial partitions of data, these collocation models are of special importance for their universal applicability. While we are interested here in text and have framed our treatment appropriately, our work is potentially applicable to other areas of research (e.g., speech, genomics, and mobility patterns) where one has ordered categorical data (e.g., sounds, genes, and locations). Our approach focuses on the phrase …
Characterizing The Google Books Corpus: Strong Limits To Inferences Of Socio-Cultural And Linguistic Evolution, Eitan Adam Pechenick, Christopher M. Danforth, Peter Sheridan Dodds
College of Engineering and Mathematical Sciences Faculty Publications
It is tempting to treat frequency trends from the Google Books data sets as indicators of the "true" popularity of various words and phrases. Doing so allows us to draw quantitatively strong conclusions about the evolution of cultural perception of a given topic, such as time or gender. However, the Google Books corpus suffers from a number of limitations which make it an obscure mask of cultural popularity. A primary issue is that the corpus is in effect a library, containing one of each book. A single, prolific author is thereby able to noticeably insert new phrases into the Google …
If You've Seen One Worm, Have You Seen Them All? Spatial, Community, And Genetic Variability Of Tubificid Communities In Montana, Nilanjan Lodh, Donna M. Rizzo, Billie L. Kerans, Stephanie McGinnis, Nikolaos Fytilis, Lori Stevens
College of Engineering and Mathematical Sciences Faculty Publications
Genetic studies are recognized increasingly as important for understanding naturally occurring disease dynamics and are used to predict host genetic diversity and coevolutionary processes and to identify species composition in ecological communities. Tubifex tubifex, the definitive host of the whirling disease parasite Myxobolus cerebralis, comprises 6 known lineages that vary widely in parasite susceptibility. We used 16S ribosomal DNA (16S rDNA) to identify relationships among genetic variability of 3 oligochaete genera (T. tubifex, Rhyacodrilus spp., and Ilyodrilus spp.; Oligochaeta:Tubificidae), oligochaete assemblage composition, and the presence of whirling disease in 9 locations across 4 watersheds in Montana, USA. We assessed genetic …
Climate Change Sentiment On Twitter: An Unsolicited Public Opinion Poll, Emily M. Cody, Andrew J. Reagan, Lewis Mitchell, Peter Sheridan Dodds, Christopher M. Danforth
College of Engineering and Mathematical Sciences Faculty Publications
The consequences of anthropogenic climate change are extensively debated through scientific papers, newspaper articles, and blogs. Newspaper articles may lack accuracy, while the severity of findings in scientific papers may be too opaque for the public to understand. Social media, however, is a forum where individuals of diverse backgrounds can share their thoughts and opinions. As consumption shifts from old media to new, Twitter has become a valuable resource for analyzing current events and headline news. In this research, we analyze tweets containing the word "climate" collected between September 2008 and July 2014. Through use of a previously developed sentiment …
Zipf's Law Holds For Phrases, Not Words, Jake Ryland Williams, Paul R. Lessard, Suma Desu, Eric M. Clark, James P. Bagrow, Christopher M. Danforth, Peter Sheridan Dodds
College of Engineering and Mathematical Sciences Faculty Publications
With Zipf's law being originally and most famously observed for word frequency, it is surprisingly limited in its applicability to human language, holding over no more than three to four orders of magnitude before hitting a clear break in scaling. Here, building on the simple observation that phrases of one or more words comprise the most coherent units of meaning in language, we show empirically that Zipf's law for phrases extends over as many as nine orders of rank magnitude. In doing so, we develop a principled and scalable statistical mechanical method of random text partitioning, which opens up a …
Reply To Garcia Et Al.: Common Mistakes In Measuring Frequency-Dependent Word Characteristics, Peter Sheridan Dodds, Eric M. Clark, Suma Desu, Morgan R. Frank, Andrew J. Reagan, Jake Ryland Williams, Lewis Mitchell, Kameron Decker Harris, Isabel M. Kloumann, James P. Bagrow, Karine Megerdoomian, Matthew T. McMahon, Brian F. Tivnan, Christopher M. Danforth
College of Engineering and Mathematical Sciences Faculty Publications
No abstract provided.
Text Mixing Shapes The Anatomy Of Rank-Frequency Distributions, Jake Ryland Williams, James P. Bagrow, Christopher M. Danforth, Peter Sheridan Dodds
College of Engineering and Mathematical Sciences Faculty Publications
Natural languages are full of rules and exceptions. One of the most famous quantitative rules is Zipf's law, which states that the frequency of occurrence of a word is approximately inversely proportional to its rank. Though this "law" of ranks has been found to hold across disparate texts and forms of data, analyses of increasingly large corpora since the late 1990s have revealed the existence of two scaling regimes. These regimes have thus far been explained by a hypothesis suggesting a separability of languages into core and noncore lexica. Here we present and defend an alternative hypothesis that the two …
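The statement of Zipf's law above (frequency approximately inversely proportional to rank) can be checked on any tokenized text by building a rank-frequency table. The sketch below is illustrative only: the function name and the toy sentence are assumptions, and a text this small cannot show real scaling behavior.

```python
from collections import Counter

def rank_frequency(tokens):
    """Return (rank, word, count) tuples, most frequent word first (rank 1)."""
    counts = Counter(tokens)
    return [(rank, word, count)
            for rank, (word, count) in enumerate(counts.most_common(), start=1)]

# Hypothetical toy text; a real test needs a large corpus.
tokens = "the cat and the dog and the bird".split()
for rank, word, count in rank_frequency(tokens):
    # Under an exact inverse-rank law, rank * count would be constant.
    print(rank, word, count, rank * count)
```

On a large corpus, plotting count against rank on log-log axes is the usual way to look for the scaling regimes the abstract mentions.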
Participation And Contribution In Crowdsourced Surveys, Robert Swain, Alex Berger, Josh Bongard, Paul Hines
College of Engineering and Mathematical Sciences Faculty Publications
This paper identifies trends within and relationships between the amount of participation and the quality of contributions in three crowdsourced surveys. Participants were asked to perform a collective problem solving task that lacked any explicit incentive: they were instructed not only to respond to survey questions but also to pose new questions that they thought might, if responded to by others, predict an outcome variable of interest to them. While the three surveys had very different outcome variables, target audiences, methods of advertisement, and lengths of deployment, we found very similar patterns of collective behavior. In particular, we found that: the rate …