Open Access. Powered by Scholars. Published by Universities.®

Business Commons

California State University, San Bernardino

2011

DVD

Full-Text Articles in Business

Speech Corpus Generation From DVDs of Movies and TV Series, Veton Z. Kepuska, Pattarapong Rojanasthien, Jan 2011

Journal of International Technology and Information Management

A speech corpus is a database of audio files containing spoken words or sentences, together with their text transcriptions. In this work we present a data collection system for creating speech corpora from DVDs of movies and TV series. Generating a corpus from these DVDs is a significantly lower-cost solution compared with the conventional way of obtaining a speech corpus. It also takes less time to collect the data and process it into a corpus. To perform this operation, the Data Collection Toolkit is introduced. This toolkit is an application developed using C# .NET Framework 3.5 in Visual …