Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Artificial Intelligence and Robotics

San Jose State University

Natural Language Processing

Articles 1 - 6 of 6

Full-Text Articles in Physical Sciences and Mathematics

Wikipedia Web Table Interpretation, Keyword-Based Search, And Ranking, Kartikee Dabir Jan 2023

Master's Projects

Information retrieval and data interpretation on the web, for the purpose of gaining useful insights, has been a widely researched topic since the onset of the World Wide Web, or what is today popularly known as the internet. Web tables are structured tabular data present amidst unstructured, heterogeneous data on the web. This makes web tables a rich source of information for a variety of tasks, such as data analysis, data interpretation, and information retrieval, that involve extracting knowledge from information present on the web. Wikipedia tables, which are a subset of web tables, hold a huge amount of useful data, …


Comparative Analysis Of Transformer-Based Models For Text-To-Speech Normalization, Pankti Dholakia Jan 2023

Master's Projects

Text-to-Speech (TTS) normalization is an essential component of natural language processing (NLP) that plays a crucial role in the production of natural-sounding synthesized speech. However, there are limitations to the TTS normalization procedure: lengthy input sequences and variations in spoken language can present difficulties. The motivation behind this research is to address the challenges associated with TTS normalization by evaluating and comparing the performance of various models. The aim is to determine their effectiveness in handling language variations. The models include LSTM-GRU, Transformer, GCN-Transformer, GCNN-Transformer, Reformer, and a pre-trained BERT language model. The research evaluates the performance …


Multi-Label Text Classification With Transfer Learning, Likhitha Yelamanchili Jan 2023

Master's Projects

Multi-label text categorization is a crucial task in Natural Language Processing, where each text instance can be assigned multiple labels simultaneously. This project's goal is to assess how well several deep learning models perform on a real-world dataset for multi-label text classification. We employed data augmentation techniques like Synonym Substitution and Random Word Substitution to address the problem of data imbalance. We conducted experiments on a toxic comment classification dataset to evaluate the effectiveness of several deep learning models including Bi-LSTM, GRU, and Bi-GRU, as well as fine-tuned pre-trained BERT models. Many metrics, including log loss, recall@k, and …


Caption And Image Based Next-Word Auto-Completion, Meet Patel Jan 2022

Master's Projects

With the increasing number of choices among entities such as products, movies, and songs now available to users, users try to save time by looking for an application or system that provides automatic recommendations. Recommender systems are automated computing processes that leverage concepts of Machine Learning, Data Mining and Artificial Intelligence towards generating product recommendations based on a user’s preferences. These systems have given a significant boost to businesses across multiple segments as a result of reduced human intervention. One similar aspect of this is content writing. It would save users a lot of time …


Improved Chinese Language Processing For An Open Source Search Engine, Xianghong Sun May 2020

Master's Projects

Natural Language Processing (NLP) is the analysis of human languages by computers. NLP has many areas, including speech recognition, natural language understanding, and natural language generation.

Information retrieval and natural language processing for Asian languages have their own unique set of challenges not present for Indo-European languages. Some of these are text segmentation, named entity recognition in unsegmented text, and part-of-speech tagging. In this report, we describe our implementation of and experiments with improving the Chinese language processing sub-component of an open source search engine, Yioop. In particular, we rewrote …


Question Type Recognition Using Natural Language Input, Aishwarya Soni Jun 2017

Master's Projects

Recently, numerous specialists have been concentrating on the use of Natural Language Processing (NLP) systems in various domains, for example, data extraction and content mining. One of the difficulties with these technologies is building an accurate Question Answering (QA) system. Question type recognition is the most significant task in a QA system such as a chatbot. The National Institute of Standards and Technology (NIST) hosts a conference series called the Text REtrieval Conference (TREC), which holds a competition every year to encourage and improve techniques for information retrieval from a large corpus of text. When a user …