Open Access. Powered by Scholars. Published by Universities.®
Physical Sciences and Mathematics Commons™
Full-Text Articles in Physical Sciences and Mathematics
A Latent Dirichlet Allocation/N-Gram Composite Language Model, Raymond Daniel Kulhanek
Browse all Theses and Dissertations
I present a composite language model in which an n-gram language model is integrated with the Latent Dirichlet Allocation topic clustering model. I also describe a parallel architecture that allows this model to be trained over large corpora and present experimental results that show how the composite model compares to a standard n-gram model over corpora of varying size.
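The abstract does not say how the n-gram and topic components are combined. As a rough illustration only, one common way to build such a composite is linear interpolation: the n-gram probability of a word given its history is mixed with the probability of that word under the document's topic mixture. The sketch below assumes this interpolation scheme, a bigram model with add-alpha smoothing, and hand-written topic distributions standing in for a trained Latent Dirichlet Allocation model. All function names, the weight `lam`, and the toy data are hypothetical and are not taken from the thesis.

```python
from collections import Counter, defaultdict


def train_bigram(tokens, vocab, alpha=1.0):
    """Return a function giving add-alpha smoothed P(word | prev)."""
    pair_counts = defaultdict(Counter)
    for prev, word in zip(tokens, tokens[1:]):
        pair_counts[prev][word] += 1

    def prob(word, prev):
        total = sum(pair_counts[prev].values())
        return (pair_counts[prev][word] + alpha) / (total + alpha * len(vocab))

    return prob


def topic_word_prob(word, doc_topics, topic_words):
    """P(word | doc) = sum over topics k of P(topic k | doc) * P(word | topic k)."""
    return sum(
        theta_k * topic_words[k].get(word, 0.0)
        for k, theta_k in enumerate(doc_topics)
    )


def composite_prob(word, prev, bigram, doc_topics, topic_words, lam=0.7):
    """Assumed combination rule: linear interpolation of the two models."""
    return lam * bigram(word, prev) + (1.0 - lam) * topic_word_prob(
        word, doc_topics, topic_words
    )


# Toy data (hypothetical). In practice the topic distributions would come
# from a trained LDA model rather than being written by hand.
tokens = "the cat sat on the mat".split()
vocab = sorted(set(tokens))
bigram = train_bigram(tokens, vocab)

topic_words = [
    {"cat": 0.5, "mat": 0.3, "the": 0.2},  # P(word | topic 0)
    {"sat": 0.4, "on": 0.4, "the": 0.2},   # P(word | topic 1)
]
doc_topics = [0.8, 0.2]  # P(topic | document)

p_cat = composite_prob("cat", "the", bigram, doc_topics, topic_words)
p_on = composite_prob("on", "the", bigram, doc_topics, topic_words)
total = sum(
    composite_prob(w, "the", bigram, doc_topics, topic_words) for w in vocab
)
```

Because both components are proper distributions over the same vocabulary, the interpolated values also sum to one for any fixed history. In this toy document, which leans toward topic 0, "cat" receives more probability after "the" than "on" does.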