Spoken Language Recognition On Open-Source Datasets, Brady Arendale, Samira Zarandioon, Ryan Goodwin, Douglas Reynolds
SMU Data Science Review
Speaker and language recognition is an active area of research, but much of the work relies on private or expensive datasets, making the field less accessible than many other areas of machine learning. In addition, many papers make performance claims without comparing their models to other recent research. With the recent development of public multilingual speech corpora such as Mozilla's Common Voice, as well as several single-language corpora, we now have the resources to attempt to address both of these problems. We construct an eight-language dataset from Common Voice and a Google Bengali corpus as well …