Full-Text Articles in Computational Linguistics
Predicting Music Genre Preferences Based On Online Comments, Andrew J. Sinclair
Master's Theses
Communication Accommodation Theory (CAT) states that individuals adapt to each other’s communicative behaviors. This adaptation is called “convergence.” In this work we explore the convergence of writing styles of users of the online music distribution platform SoundCloud.com. In order to evaluate our system we created a corpus of over 38,000 comments retrieved from SoundCloud in April 2014. The corpus represents comments from 8 distinct musical genres: Classical, Electronic, Hip Hop, Jazz, Country, Metal, Folk, and World. Our corpus contains short comments, frequent misspellings, little sentence structure, hashtags, emoticons, and URLs. We adapt techniques used by researchers analyzing other …
Misheard Me Oronyminator: Using Oronyms To Validate The Correctness Of Frequency Dictionaries, Jennifer G. Hughes
Master's Theses
In the field of speech recognition, an algorithm must learn to tell the difference between "a nice rock" and "a gneiss rock". These identical-sounding phrases are called oronyms. Word frequency dictionaries are often used by speech recognition systems to help resolve phonetic sequences with more than one possible orthographic phrase interpretation, by looking up which oronym of the root phonetic sequence contains the most-common words.
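The frequency lookup described above can be sketched roughly as follows. This is a minimal illustration, not the thesis's method: the word counts are invented for the example (they are not UNISYN values), the names `WORD_FREQ`, `phrase_score`, and `pick_oronym` are hypothetical, and scoring a phrase by the sum of the log counts of its words is an assumption about what "contains the most-common words" might mean in practice.

```python
from math import log

# Hypothetical per-word occurrence counts; illustrative only, not UNISYN data.
WORD_FREQ = {"a": 1_000_000, "nice": 50_000, "rock": 20_000, "gneiss": 30}


def phrase_score(phrase, freq=WORD_FREQ):
    """Score a phrase as the sum of log word counts (an assumed scoring rule).

    Words missing from the dictionary are treated as having count 1,
    so they contribute log(1) = 0 to the score.
    """
    return sum(log(freq.get(word, 1)) for word in phrase.lower().split())


def pick_oronym(candidates, freq=WORD_FREQ):
    """Return the candidate phrase whose words are, overall, most common."""
    return max(candidates, key=lambda phrase: phrase_score(phrase, freq))


best = pick_oronym(["a nice rock", "a gneiss rock"])
```

With these made-up counts, "nice" is far more frequent than "gneiss", so the first phrase is selected. A dictionary with inaccurate counts could flip such a choice.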
Our paper demonstrates a technique used to validate word frequency dictionary values. We chose to use frequency values from the UNISYN dictionary, which tallies each word on a per-occurrence basis, using a proprietary text corpus, …
CSC Senior Project: NLPStats, Michael Mease
Computer Science and Software Engineering
Natural Language Processing has recently increased in popularity. The field of authorship analysis, specifically, uses various characteristics of text quantified by markers. NLPStats serves as a tool designed to streamline marker extraction based on user needs. A flexible query system allows for custom marker requests, adjustment of result formatting, and preprocessing options. Furthermore, an efficiently designed structure ensures that users retrieve information quickly. As a whole, NLPStats enables anyone, regardless of NLP experience, to extract important information about the text of a document.