Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Full-Text Articles in Physical Sciences and Mathematics

Ranking Comments: An Entropy-Based Method With Word Embedding Clustering, Yuyang Zhang Aug 2020

Electronic Thesis and Dissertation Repository

Automatically ranking comments by relevance plays an important role in text mining and text summarization. In this thesis, we first introduce a new method for representing text numerically: the bag of word clusters model. Unlike the traditional bag of words model, which treats each word as an independent item, we group semantically related words into clusters using pre-trained word2vec embeddings and represent each comment as a distribution over word clusters. This method extracts both semantic and statistical information from text. Next, we propose an unsupervised ranking algorithm that identifies relevant comments by their distance to the “ideal” comment. The …
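
The bag of word clusters representation described in this abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the word-to-cluster map below is hard-coded and hypothetical, standing in for clusters that the thesis derives from pre-trained word2vec embeddings, and the names `WORD_TO_CLUSTER`, `NUM_CLUSTERS`, and `bag_of_word_clusters` are assumptions made for illustration. The ranking step is omitted because the abstract is truncated before it defines the "ideal" comment.

```python
from collections import Counter

# Hypothetical word -> cluster id map (an assumption for illustration).
# In the thesis, the grouping comes from clustering pre-trained word2vec
# embeddings; here it is hard-coded so the sketch has no dependencies.
WORD_TO_CLUSTER = {
    "good": 0, "great": 0, "excellent": 0,
    "bad": 1, "poor": 1, "terrible": 1,
    "price": 2, "cost": 2, "cheap": 2,
}
NUM_CLUSTERS = 3


def bag_of_word_clusters(comment):
    """Return the comment as a probability distribution over word clusters."""
    tokens = comment.lower().split()
    # Count cluster occurrences, skipping words that belong to no cluster.
    counts = Counter(WORD_TO_CLUSTER[t] for t in tokens if t in WORD_TO_CLUSTER)
    total = sum(counts.values())
    if total == 0:
        # No known words: return an all-zero vector rather than dividing by zero.
        return [0.0] * NUM_CLUSTERS
    return [counts.get(c, 0) / total for c in range(NUM_CLUSTERS)]


if __name__ == "__main__":
    print(bag_of_word_clusters("Great price and good cost"))
```

Because "good" and "great" fall in the same cluster, two comments that differ only in which of those words they use receive the same representation, which is the property that separates this model from a plain bag of words.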


Cross Language Information Transfer Between Modern Standard Arabic And Its Dialects – A Framework For Automatic Speech Recognition System Language Model, Tiba Zaki Abdulhameed Apr 2020

Dissertations

Significant advances have been made in Modern Standard Arabic (MSA) Automatic Speech Recognition (ASR) applications. Yet ASR for dialectal conversation still trails behind because of limited language resources. As is the case in most cultures, formal MSA is not used in daily life. Instead, a variety of regional dialects are spoken, which creates a pressing need for dialect ASR systems. Processing MSA poses considerable challenges that carry over to the processing of its derived dialects. In dialects, many words have gradually drifted from their MSA pronunciations and often have different usages. Also, …