Full-Text Articles in Artificial Intelligence and Robotics

Predicting High-Cap Tech Stock Polarity: A Combined Approach Using Support Vector Machines And Bidirectional Encoders From Transformers, Ian L. Grisham May 2023

Electronic Theses and Dissertations

The abundance, accessibility, and scale of data have engendered an era where machine learning can quickly and accurately solve complex problems, identify complicated patterns, and uncover intricate trends. One research area where many have applied these techniques is the stock market. Yet, financial domains are influenced by many factors and are notoriously difficult to predict due to their volatile and multivariate behavior. However, the literature indicates that public sentiment data may exhibit significant predictive qualities and improve a model’s ability to predict intricate trends. In this study, momentum SVM classification accuracy was compared between datasets that did and did not …


Citation Polarity Identification From Scientific Articles Using Deep Learning Methods, Souvik Kundu Mar 2023

Electronic Thesis and Dissertation Repository

The way in which research articles are cited reflects how previous work is utilized by other researchers or stakeholders and can indicate the impact of that work on subsequent experiments. Based on human intuition, citations can be perceived as positive, negative, or neutral. While current citation indexing systems provide information on the author and publication name of the cited article, as well as the citation count, they do not indicate the polarity of the citation. This study aims to identify the polarity of citations in scientific research articles using pre-trained language models like BERT, ELECTRA, RoBERTa, Bio-RoBERTa, SPECTER, ERNIE, LongFormer, …


Automated Short-Answer Grading And Misconception Detection Using Large Language Models, Nazmul H. Kazi Jan 2023

UNF Graduate Theses and Dissertations

As education technology continues to evolve, the domains of Automatic Short-Answer Grading (ASAG) and Automated Misconception Detection (AMD) stand at the forefront of innovative approaches to educational assessment. We explore the transformative potential of Large Language Models (LLMs) in revolutionizing these critical areas. Leveraging the remarkable capabilities of LLMs in semantic inference, contextual understanding, and transfer learning, we embark on a comprehensive journey to enhance both ASAG and AMD. On ASAG, we illuminate the efficacy of transfer learning by fine-tuning RoBERTa Large, a state-of-the-art LLM, on task-related corpora, e.g. the Multi-Genre Natural Language Inference (MNLI) corpus. The model's adaptability across …


Combating Fake News: A Gravity Well Simulation To Model Echo Chamber Formation In Social Media, Jeremy E. Thompson Jan 2023

Dartmouth College Ph.D Dissertations

Fake news has become a serious concern as distributing misinformation has become easier and more impactful. A solution is critically required. One solution is to ban fake news, but that approach could create more problems than it solves, and would also be problematic from the beginning, as fake news must first be identified before it can be banned. We initially propose a method to automatically recognize suspected fake news, and to provide news consumers with more information as to its veracity. We suggest that fake news is composed of two components: premises and misleading content. Fake news can be condensed down to a …


Malware Classification Using API Call Information And Word Embeddings, Sahil Aggarwal Jan 2023

Master's Projects

Malware classification is the process of classifying malware into recognizable categories and is an integral part of implementing computer security. In recent times, machine learning has emerged as one of the most suitable techniques to perform this task. Models can be trained on various malware features, such as opcodes and API calls among many others, to deduce information that would be helpful in the classification.

Word embeddings are a key part of natural language processing and can be seen as a representation of text wherein similar words will have closer representations. These embeddings can be used to discover a quantifiable …
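The idea that similar tokens have closer representations can be illustrated with a minimal sketch. This is not the project's method: the three-dimensional vectors, the API call names used as tokens, and the helper `most_similar` are all hypothetical, chosen only to show how cosine similarity quantifies closeness between embeddings.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical embeddings: the values are invented for illustration,
# not learned from any malware data.
embeddings = {
    "CreateFile": [0.9, 0.1, 0.0],
    "WriteFile": [0.8, 0.2, 0.1],
    "RegSetValue": [0.1, 0.9, 0.3],
}


def most_similar(token, table):
    """Return the other token whose vector is closest to `token`'s vector."""
    return max(
        (other for other in table if other != token),
        key=lambda other: cosine_similarity(table[token], table[other]),
    )


print(most_similar("CreateFile", embeddings))
```

In this toy table the two file-related calls point in nearly the same direction, so they score a higher cosine similarity with each other than either does with the registry call.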


Spam Comments Detection In YouTube Videos, Priyusha Kotta Jan 2023

Master's Projects

This paper suggests an innovative way of distinguishing spam from ham comments on the video-sharing website YouTube. Comments that are contextually irrelevant to a particular video or have a commercial motive constitute spam. In the past few years, the spread of advertising to new arenas such as social media has created a lucrative platform for many. Today, it is widely used by everyone. But this innovation comes with its own impediments. We can see how malicious users have taken over these platforms with the aid of automated bots that can deploy a well-coordinated spam …


Movie Reviews Sentiment Analysis Using BERT, Gibson Nkhata Dec 2022

Graduate Theses and Dissertations

Sentiment analysis (SA), or opinion mining, is the analysis of emotions and opinions from texts. It is one of the active research areas in Natural Language Processing (NLP). Various approaches have been deployed in the literature to address the problem. These techniques devise complex and sophisticated frameworks in order to attain optimal accuracy, with their focus on polarity classification or binary classification. In this paper, we aim to fine-tune BERT in a simple but robust approach for movie review sentiment analysis to provide better accuracy than state-of-the-art (SOTA) methods. We start by conducting sentiment classification for every review, followed by computing …


Using A BERT-Based Ensemble Network For Abusive Language Detection, Noah Ballinger May 2022

Computer Science and Computer Engineering Undergraduate Honors Theses

Over the past two decades, online discussion has skyrocketed in scope and scale. However, so has the amount of toxicity and offensive posts on social media and other discussion sites. Despite this rise in prevalence, the ability to automatically moderate online discussion platforms has seen minimal development. Recently, though, as the capabilities of artificial intelligence (AI) continue to improve, AI-based detection of harmful internet content has become a real possibility. In the past couple of years, there has been a surge in performance on tasks in the field of natural language processing, mainly due to the development of …


Contextualized Vector Embeddings For Malware Detection, Vinay Pandya Jan 2022

Master's Projects

Malware classification is a technique for classifying different types of malware and forms an integral part of system security. The aim of this project is to use context-dependent word embeddings to classify malware. Transformers are a novel architecture that utilizes self-attention to handle long-range dependencies. They are particularly effective in many complex natural language processing tasks such as Masked Language Modelling (MLM) and Next Sentence Prediction (NSP). Different transformer architectures such as BERT, DistilBERT, ALBERT, and RoBERTa are used to generate context-dependent word embeddings. These embeddings would help in classifying different malware samples based on their similarity …


Analysis Of Public Sentiment Of COVID-19 Pandemic, Vaccines, And Lockdowns, Devinesh Singh Jan 2022

Master's Projects

The CoV-2 pandemic prompted lockdown measures to be implemented worldwide; these directives were implemented nationwide to stunt the spread of the infection. Throughout the lockdowns, millions of individuals resorted to social media for entertainment, to communicate with friends and family, and to express their opinions about the pandemic. Simultaneously, social media aided in the dissemination of misinformation, which has proven to be a threat to global health. Sentiment analysis, a technique used to analyze textual data, can be used to gain an overview of public opinion about CoV-2 from Twitter and TikTok. The primary focus of the project is to build a deep …


Temporal Disambiguation Of Relative Temporal Expressions In Clinical Texts Using Temporally Fine-Tuned Contextual Word Embeddings, Amy L. Olex Jan 2022

Theses and Dissertations

Temporal reasoning is the ability to extract and assimilate temporal information to reconstruct a series of events such that they can be reasoned over to answer questions involving time. Temporal reasoning in the clinical domain is challenging due to specialized medical terms and nomenclature, shorthand notation, fragmented text, a variety of writing styles used by different medical units, redundancy of information that has to be reconciled, and an increased number of temporal references as compared to general-domain texts. Work in the area of clinical temporal reasoning has progressed, but the current state of the art still has a way to go before …


BERT Efficacy On Scientific And Medical Datasets: A Systematic Literature Review, Clayton Cohn Nov 2020

College of Computing and Digital Media Dissertations

Bidirectional Encoder Representations from Transformers (BERT) [Devlin et al., 2018] has been shown to be effective at modeling a multitude of datasets across a wide variety of Natural Language Processing (NLP) tasks; however, little research has been done regarding BERT’s effectiveness at modeling domain-specific datasets. Specifically, scientific and medical datasets present a particularly difficult challenge in NLP, as these types of corpora are often rife with technical jargon that is largely absent from the canonical corpora that BERT and other transfer learning models were originally trained on. This thesis is a Systematic Literature Review (SLR) of twenty-seven studies that were …


Health-Aware Food Planner: A Personalized Recipe Generation Approach Based On GPT-2, Bushra Aljbawi Jan 2020

Theses and Dissertations (Comprehensive)

"What to eat today?" With the flourishing of the Internet, more and more people nowadays are inclined to look online for an answer to this most problematic question. The recent explosion of food networks, however, produces large volumes of recipes, making it even harder to make an informed decision. This yields the need for advanced decision-making algorithms and efficient recommendation systems. Conventional recommender systems are not feasible anymore as food is a complicated feature that presents unique challenges and is less studied. For example, it can be one of the main reasons for obesity and many other chronic diseases. Food recommender system …