Open Access. Powered by Scholars. Published by Universities.®

Computational Linguistics Commons

2017

Articles 1 - 11 of 11

Full-Text Articles in Computational Linguistics

Cloud‐Based Text Analytics Harvesting, Cleaning And Analyzing Corporate Earnings Conference Calls, Michael Chuancai Zhang, Vikram Gazula, Dan Stone, Hong Xie Oct 2017

Commonwealth Computational Summit

No abstract provided.


Cloud-Based Text Analytics: Harvesting, Cleaning And Analyzing Corporate Earnings Conference Calls, Michael Chuancai Zhang, Vikram Gazula, Dan Stone, Hong Xie Oct 2017

Commonwealth Computational Summit

Does management language cohesion in earnings conference calls matter to the capital market? As part of the research on this question, and taking advantage of modern information technology, this project:

  • harvested 115,882 earnings conference call transcripts from SeekingAlpha.com
  • parsed and structured 89,988 transcripts using regular expressions in Stata
  • analyzed 179,976 text files using Amazon Elastic Compute Cloud (Amazon EC2), which saved almost 2 years (675 days) of project time
As this project is related to big data, text analytics, and big computing, it may be a good case to show how we can benefit from modern …


A Sentiment Analysis Of Language & Gender Using Word Embedding Models, Ellyn Rolleston Keith Sep 2017

Dissertations, Theses, and Capstone Projects

Since Robin Lakoff started the conversation around language and gender with her 1975 essay “Language and Woman’s Place,” extensive work has been done on analyzing sociolinguistics associated with gender. While much work has been done on the differences between how men and women use language, there is less research to be found on language about women as opposed to language about men. In this work, I build a word embedding model from a corpus of Wikipedia film summaries and use this model to create lists of words associated with men and words associated with women. I then use sentiment analysis …


Back To The Future: Logic And Machine Learning, Simon Dobnik, John D. Kelleher Jun 2017

Conference papers

In this paper we argue that since the beginning of natural language processing, or computational linguistics, there has been a strong connection between logic and machine learning. First of all, there is something logical about language or linguistic about logic. Secondly, we argue that rather than distinguishing between logic and machine learning, a more useful distinction is between top-down approaches and data-driven approaches. Examining some recent approaches in deep learning, we argue that they incorporate both properties and that this is the reason for their very successful adoption to solve several problems within language technology.


Quantitative Criticism Of Literary Relationships, Joseph P. Dexter, Theodore Katz, Nilesh Tripuraneni, Tathagata Dasgupta, Ajay Kannan, James Brofos, Jorge A. Bonilla Lopez, Lea Schroeder Apr 2017

Dartmouth Scholarship

Authors often convey meaning by referring to or imitating prior works of literature, a process that creates complex networks of literary relationships (“intertextuality”) and contributes to cultural evolution. In this paper, we use techniques from stylometry and machine learning to address subjective literary critical questions about Latin literature, a corpus marked by an extraordinary concentration of intertextuality. Our work, which we term “quantitative criticism,” focuses on case studies involving two influential Roman authors, the playwright Seneca and the historian Livy. We find that four plays related to but distinct from Seneca’s main writings are differentiated from the rest of the …


Es-Esa: An Information Retrieval Prototype Using Explicit Semantic Analysis And Elasticsearch, Brian D. Sloan Feb 2017

Dissertations, Theses, and Capstone Projects

Many modern information retrieval systems work by using keyword search to locate documents in an inverted index by matching those documents based on terms in a user’s query. While highly effective for many use-cases, one notable drawback to simple keyword-based searching is that the contextual knowledge surrounding the user’s underlying information need may be lost, particularly if the user’s query terms are ambiguous or have multiple meanings. Research in the field of semantic search aims to make progress towards resolving this. One methodology in particular, explicit semantic analysis, works by modeling a document not only as a set of …


Robot Perception Errors And Human Resolution Strategies In Situated Human-Robot Dialogue, Niels Schütte, Brian Mac Namee, John D. Kelleher Jan 2017

Articles

Errors in visual perception may cause problems in situated dialogues. We investigated this problem through an experiment in which human participants interacted through a natural language dialogue interface with a simulated robot. We introduced errors into the robot’s perception, and observed the resulting problems in the dialogues and their resolutions. We then introduced different methods for the user to request information about the robot’s understanding of the environment. We quantify the impact of perception errors on the dialogues, and investigate resolution attempts by users at a structural level and at the level of referring expressions.


Vanilla Sequence-To-Sequence Neural Nets Cannot Model Reduplication, Brandon Prickett Jan 2017

OWP Linguistics

This paper presents results from a series of simulations that attempted to teach a vanilla sequence-to-sequence neural network a reduplication process. These attempts did not succeed, suggesting that added machinery is necessary for connectionist models to perform such a task.


Acoustic Classification Of Focus: On The Web And In The Lab, Jonathan Howell, Mats Rooth, Michael Wagner Jan 2017

Department of Linguistics Faculty Scholarship and Creative Works

We present a new methodological approach which combines both naturally-occurring speech harvested on the web and speech data elicited in the laboratory. This proof-of-concept study examines the phenomenon of focus sensitivity in English, in which the interpretation of particular grammatical constructions (e.g., the comparative) is sensitive to the location of prosodic prominence. Machine learning algorithms (support vector machines and linear discriminant analysis) and human perception experiments are used to cross-validate the web-harvested and lab-elicited speech. Results confirm the theoretical predictions for location of prominence in comparative clauses and the advantages of using both web-harvested and lab-elicited speech. The most robust …


Generating Amharic Present Tense Verbs: A Network Morphology & Datr Account, T. Michael W. Halcomb Jan 2017

Theses and Dissertations--Linguistics

In this thesis I attempt to model, that is, computationally reproduce, the natural transmission (i.e. inflectional regularities) of twenty present tense Amharic verbs (i.e. triradicals beginning with consonants) as used by the language’s speakers. I root my approach in the linguistic theory of network morphology (NM) and model it using the DATR evaluator. In Chapter 1, I provide an overview of Amharic and discuss the fidel as an abugida, the verb system’s root-and-pattern morphology, and how radicals of each lexeme interact with prefixes and suffixes. I offer an overview of NM in Chapter 2 and DATR in Chapter 3. In …


Acoustic Classification Of Focus: On The Web And In The Lab, Jonathan Howell, Mats Rooth, Michael Wagner Dec 2016

Jonathan Howell

We present a new methodological approach which combines both naturally-occurring speech harvested on the web and speech data elicited in the laboratory. This proof-of-concept study examines the phenomenon of focus sensitivity in English, in which the interpretation of particular grammatical constructions (e.g., the comparative) is sensitive to the location of prosodic prominence. Machine learning algorithms (support vector machines and linear discriminant analysis) and human perception experiments are used to cross-validate the web-harvested and lab-elicited speech. Results confirm the theoretical predictions for location of prominence in comparative clauses and the advantages of using both web-harvested and lab-elicited speech. The most robust …