Full-Text Articles in Computer Engineering
A Large Scale Distributed Syntactic, Semantic And Lexical Language Model For Machine Translation, Ming Tan
Browse all Theses and Dissertations
The n-gram model is the most widely used language model (LM) in statistical machine translation systems, owing to its simplicity and scalability. However, it encodes only the local lexical relations between adjacent words and ignores the rich syntactic and semantic structure of natural languages. Attempting to increase the order of an n-gram to describe longer-range dependencies in natural language immediately runs into the curse of dimensionality. Although previous research has tried increasing the n-gram order on a large corpus, it found no obvious improvement beyond 6-grams. Meanwhile, other LMs, such as syntactic language models and …
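To make the dimensionality argument in the abstract concrete, here is a minimal Python sketch. It is not taken from the thesis: the toy corpus and the helper names `ngram_counts` and `possible_ngrams` are assumptions made purely for illustration. It compares how many distinct n-grams a small text actually contains with how many are possible over the same vocabulary as the order n grows.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Count every contiguous window of n tokens (hypothetical helper)."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def possible_ngrams(vocab_size, n):
    """Number of distinct n-grams that could exist over a vocabulary."""
    return vocab_size ** n


# Toy corpus, invented for illustration only.
tokens = "the cat sat on the mat and the cat slept".split()
vocab_size = len(set(tokens))

# Columns: order n, distinct n-grams observed, n-grams possible.
for n in range(1, 7):
    observed = len(ngram_counts(tokens, n))
    print(n, observed, possible_ngrams(vocab_size, n))
```

The observed column is bounded by the length of the text, while the possible column grows exponentially in n. Under these assumptions, that gap is one way to read why higher-order models become sparse: most of the parameter space is never seen in training data.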