Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Computer Sciences

Brigham Young University

Theses/Dissertations

2015

Crowdsourcing

Full-Text Articles in Physical Sciences and Mathematics

Facilitating Corpus Annotation By Improving Annotation Aggregation, Paul L. Felt Dec 2015

Theses and Dissertations

Annotated text corpora facilitate the linguistic investigation of language as well as the automation of natural language processing (NLP) tasks. NLP tasks include problems such as spam email detection, grammatical analysis, and identifying mentions of people, places, and events in text. However, constructing high-quality annotated corpora can be expensive. Cost can be reduced by employing low-cost internet workers in a practice known as crowdsourcing, but the resulting annotations are often inaccurate, which decreases the usefulness of a corpus. This inaccuracy is typically mitigated by collecting multiple redundant judgments and aggregating them (e.g., via majority vote) to produce high-quality consensus …
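As an illustration of the majority-vote baseline mentioned in the abstract, the following is a minimal sketch and not the aggregation method developed in the thesis. The function names, the item identifiers, and the spam/ham labels are hypothetical, and the tie-breaking rule (the label seen first wins) is an assumption made for this example.

```python
from collections import Counter


def majority_vote(judgments):
    """Return the most frequent label among redundant judgments for one item.

    Assumption for this sketch: ties go to the label that appears first,
    which is how Counter.most_common orders equal counts.
    """
    if not judgments:
        raise ValueError("need at least one judgment")
    return Counter(judgments).most_common(1)[0][0]


def aggregate(annotations):
    """Map each item id to its consensus label by majority vote."""
    return {item: majority_vote(labels) for item, labels in annotations.items()}


# Hypothetical crowdsourced judgments: three workers label each email.
annotations = {
    "email-1": ["spam", "spam", "ham"],
    "email-2": ["ham", "ham", "spam"],
    "email-3": ["ham", "spam", "spam"],
}
consensus = aggregate(annotations)
```

Majority vote treats every worker as equally reliable, so a single careless worker can flip the consensus on items with few judgments.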