Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Full-Text Articles in Physical Sciences and Mathematics

Movie Reviews Sentiment Analysis Using BERT, Gibson Nkhata Dec 2022

Graduate Theses and Dissertations

Sentiment analysis (SA), or opinion mining, is the analysis of emotions and opinions expressed in text. It is an active research area in Natural Language Processing (NLP). Various approaches have been proposed in the literature to address the problem. These techniques devise complex and sophisticated frameworks to attain optimal accuracy, with a focus on polarity or binary classification. In this paper, we aim to fine-tune BERT in a simple but robust approach for movie review sentiment analysis to provide better accuracy than state-of-the-art (SOTA) methods. We start by conducting sentiment classification for every review, followed by computing …


Pipeline For Calculating Calories For Print Recipes With Minimal User Intervention, Karl W. Holten Aug 2022

Theses and Dissertations

The thesis will provide a pipeline to estimate calorie counts from print recipes. The pipeline takes scanned recipes from cookbooks and uses Optical Character Recognition (OCR) to convert the scanned images of recipes to text. Several OCR tools were tested for their accuracy on fractions using a sample of the data, and the most accurate tool is used on the data. Next, a specially trained named entity recognition model is used to identify ingredients, quantities and units. These ingredients are used to search a database of values from the FDA to compute a calorie count for the recipe. The thesis …
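
The last stage of the pipeline described in this abstract, turning recognized ingredients, quantities, and units into a calorie total, can be illustrated with a minimal sketch. It is not the thesis's implementation: the table `CALORIES_PER_UNIT`, the functions `parse_quantity` and `recipe_calories`, and the shape of the entity triples are all hypothetical, and the calorie numbers are placeholders rather than real FDA values. Quantities are parsed as exact fractions, since the abstract singles out fractions as a source of OCR error.

```python
from fractions import Fraction

# Hypothetical lookup table: calories per one unit of an ingredient.
# Placeholder numbers for illustration only, not real FDA values.
CALORIES_PER_UNIT = {
    ("flour", "cup"): 400,
    ("butter", "tbsp"): 100,
}


def parse_quantity(text):
    """Parse '2', '1/2', or a mixed number like '1 1/2' into a Fraction."""
    return sum(Fraction(part) for part in text.split())


def recipe_calories(entities):
    """Sum calories over (ingredient, quantity, unit) triples.

    Triples whose ingredient/unit pair is missing from the table are skipped.
    """
    total = Fraction(0)
    for ingredient, quantity, unit in entities:
        per_unit = CALORIES_PER_UNIT.get((ingredient, unit))
        if per_unit is None:
            continue  # not in the table; a real pipeline might flag this
        total += parse_quantity(quantity) * per_unit
    return float(total)


# Triples in the shape a named entity recognition step might emit (assumed).
entities = [("flour", "1 1/2", "cup"), ("butter", "1/2", "tbsp")]
print(recipe_calories(entities))
```

Skipping unknown ingredients keeps the sketch short; a pipeline aiming for minimal user intervention would more plausibly report them so that only those entries need manual review.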


Using A BERT-Based Ensemble Network For Abusive Language Detection, Noah Ballinger May 2022

Computer Science and Computer Engineering Undergraduate Honors Theses

Over the past two decades, online discussion has skyrocketed in scope and scale. However, so has the volume of toxic and offensive posts on social media and other discussion sites. Despite this rise in prevalence, the ability to automatically moderate online discussion platforms has seen minimal development. Recently, though, as the capabilities of artificial intelligence (AI) continue to improve, AI-based detection of harmful internet content has become a real possibility. In the past couple of years, there has been a surge in performance on tasks in the field of natural language processing, mainly due to the development of …


Contextualized Vector Embeddings For Malware Detection, Vinay Pandya Jan 2022

Master's Projects

Malware classification, the task of distinguishing different types of malware, forms an integral part of system security. The aim of this project is to use context-dependent word embeddings to classify malware. Transformers are a novel architecture that uses self-attention to handle long-range dependencies. They are particularly effective in many complex natural language processing tasks such as Masked Language Modelling (MLM) and Next Sentence Prediction (NSP). Different transformer architectures such as BERT, DistilBERT, ALBERT, and RoBERTa are used to generate context-dependent word embeddings. These embeddings would help in classifying different malware samples based on their similarity …


Analysis Of Public Sentiment Of Covid-19 Pandemic, Vaccines, And Lockdowns, Devinesh Singh Jan 2022

Master's Projects

The CoV-2 pandemic prompted lockdown measures worldwide; these directives were implemented nationwide to stunt the spread of the infection. Throughout the lockdowns, millions of individuals turned to social media to find entertainment, communicate with friends and family, and express their opinions about the pandemic. Simultaneously, social media aided in the dissemination of misinformation, which has proven to be a threat to global health. Sentiment analysis, a technique used to analyze textual data, can be used to gain an overview of public opinion about CoV-2 from Twitter and TikTok. The primary focus of the project is to build a deep …


Temporal Disambiguation Of Relative Temporal Expressions In Clinical Texts Using Temporally Fine-Tuned Contextual Word Embeddings., Amy L. Olex Jan 2022

Theses and Dissertations

Temporal reasoning is the ability to extract and assimilate temporal information to reconstruct a series of events such that they can be reasoned over to answer questions involving time. Temporal reasoning in the clinical domain is challenging due to specialized medical terms and nomenclature, shorthand notation, fragmented text, a variety of writing styles used by different medical units, redundancy of information that has to be reconciled, and an increased number of temporal references compared to general-domain texts. Work in the area of clinical temporal reasoning has progressed, but the current state of the art still has a way to go before …