Open Access. Powered by Scholars. Published by Universities.®

Engineering Commons

Physical Sciences and Mathematics

Wright State University

2013

Computer Science

Full-Text Articles in Engineering

A Large Scale Distributed Syntactic, Semantic And Lexical Language Model For Machine Translation, Ming Tan Jan 2013

Browse all Theses and Dissertations

The n-gram model is the most widely used language model (LM) in statistical machine translation systems because of its simplicity and scalability. However, it encodes only the local lexical relations between adjacent words and ignores the rich syntactic and semantic structure of natural language. Attempting to increase the order of an n-gram to capture longer-range dependencies immediately runs into the curse of dimensionality. Although previous studies have tried increasing the n-gram order on large corpora, they did not see obvious improvement beyond 6-grams. Meanwhile, other LMs, such as syntactic language models and …
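
As a rough illustration of the n-gram points in the abstract, the following Python sketch counts n-grams in a toy corpus, estimates a bigram probability by maximum likelihood, and compares the number of n-grams actually observed with the number that are possible for a given vocabulary. It is not taken from the thesis: the corpus, the function names (`ngram_counts`, `bigram_prob`, `possible_ngrams`), and the unsmoothed estimate are all assumptions made for illustration only.

```python
from collections import Counter

def ngram_counts(tokens, n):
    """Count every contiguous run of n tokens (hypothetical helper)."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bigram_prob(tokens, prev, word):
    """Unsmoothed maximum-likelihood estimate of P(word | prev).

    The estimate looks only at the immediately preceding word, which is
    the "local lexical relation" limitation the abstract describes.
    """
    bigrams = ngram_counts(tokens, 2)
    context_total = sum(c for (p, _), c in bigrams.items() if p == prev)
    return bigrams[(prev, word)] / context_total if context_total else 0.0

def possible_ngrams(vocab_size, n):
    """Number of distinct n-grams over a vocabulary: vocab_size ** n."""
    return vocab_size ** n

# Toy corpus, invented for this sketch.
toy_corpus = "the cat sat on the mat the cat ran".split()
vocab_size = len(set(toy_corpus))

for n in (2, 3, 6):
    observed = len(ngram_counts(toy_corpus, n))
    print(n, observed, possible_ngrams(vocab_size, n))

print(bigram_prob(toy_corpus, "the", "cat"))
```

The gap between observed and possible n-grams widens exponentially with the order, which is one way to read the curse of dimensionality mentioned above: most high-order n-grams never appear in training data, so their probabilities cannot be estimated directly from counts.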