Open Access. Powered by Scholars. Published by Universities.®

Computational Linguistics Commons

Articles 1 - 6 of 6

Full-Text Articles in Computational Linguistics

Using Textual Features To Predict Popular Content On Digg, Paul H. Miller May 2011

Paul H Miller

Over the past few years, collaborative rating sites such as Netflix, Digg and Stumble have become increasingly prevalent places for users to find trending content. I used various data mining techniques to study Digg, a social news site, and examine the influence of content on popularity. What influence does content have on popularity, and what influence does content have on users’ decisions? Prior studies have consistently shown that predicting popularity from content is difficult and may even be inherently impossible. The same submission can have multiple outcomes, and content determines neither popularity nor individual user decisions. My results show …


Using Textual Features To Predict Popular Content On Digg, Paul H. Miller Apr 2011

Department of English: Dissertations, Theses, and Student Research

Over the past few years, collaborative rating sites such as Netflix, Digg and Stumble have become increasingly prevalent places for users to find trending content. I used various data mining techniques to study Digg, a social news site, and examine the influence of content on popularity. What influence does content have on popularity, and what influence does content have on users’ decisions? Prior studies have consistently shown that predicting popularity from content is difficult and may even be inherently impossible. The same submission can have multiple outcomes, and content determines neither popularity nor individual user decisions. My results show …


Prosodylab-Aligner: A Tool For Forced Alignment Of Laboratory Speech, Kyle Gorman, Jonathan Howell, Michael Wagner Jan 2011

Department of Linguistics Faculty Scholarship and Creative Works

The Penn Forced Aligner automates the alignment process using the Hidden Markov Model Toolkit (HTK). The core of Prosodylab-Aligner is align.py, a script that performs acoustic model training and alignment. This script automates calls to HTK and SoX, an open-source command-line tool that can resample audio. The included README file provides instructions for installing HTK and SoX on Linux and Mac OS X; the aligner can also be run on Windows. During training, the model is initialized with flat-start monophones, which are then submitted to a single round of model estimation. Then, a tied-state 'small pause' model is inserted …
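
As a rough illustration of the kind of SoX call a script like this might automate, the sketch below builds a resampling command in Python. It is not taken from align.py: the function name, the file names, and the 16000 Hz default are assumptions made for the example, and only the standard `sox INPUT -r RATE OUTPUT` form is relied on.

```python
import subprocess


def build_sox_resample_command(src, dst, sample_rate_hz=16000):
    """Return the argument list for resampling `src` into `dst` with SoX.

    Hypothetical helper for illustration; the default rate is an
    assumption, not a setting documented in the abstract above.
    """
    if sample_rate_hz <= 0:
        raise ValueError("sample rate must be positive")
    # An output option such as -r goes directly before the output file name.
    return ["sox", str(src), "-r", str(sample_rate_hz), str(dst)]


def resample(src, dst, sample_rate_hz=16000):
    """Run SoX to resample one file; requires the sox binary on PATH."""
    command = build_sox_resample_command(src, dst, sample_rate_hz)
    # check=True raises CalledProcessError if SoX exits with an error.
    subprocess.run(command, check=True)
```

Keeping the command construction separate from the call to `subprocess.run` makes it possible to inspect or log the command without SoX being installed.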


The Low Entropy Conjecture: The Challenges Of Modern Irish Nominal Declension, Robert Malouf, Farrell Ackerman Jan 2011

Robert Malouf

No abstract provided.


Computational Style Processing, Foaad Khosmood Dec 2010

Foaad Khosmood

Our main thesis is that computational processing of natural language styles can be accomplished using corpus analysis methods and language transformation rules. We demonstrate this first by statistically modeling natural language styles, second by developing tools that carry out style processing, and finally by running experiments using the tools and evaluating the results. Specifically, we present a model for style in natural languages and demonstrate style processing in three ways: our system analyzes styles in quantifiable terms according to our model (analysis), associates documents based on stylistic similarity to known corpora (classification) and manipulates texts to match a desired …


Prosodylab-Aligner: A Tool For Forced Alignment Of Laboratory Speech, Kyle Gorman, Jonathan Howell, Michael Wagner Dec 2010

Jonathan Howell

The Penn Forced Aligner automates the alignment process using the Hidden Markov Model Toolkit (HTK). The core of Prosodylab-Aligner is align.py, a script that performs acoustic model training and alignment. This script automates calls to HTK and SoX, an open-source command-line tool that can resample audio. The included README file provides instructions for installing HTK and SoX on Linux and Mac OS X; the aligner can also be run on Windows. During training, the model is initialized with flat-start monophones, which are then submitted to a single round of model estimation. Then, a tied-state 'small pause' model is inserted …