Open Access. Powered by Scholars. Published by Universities.®

Digital Commons Network

Articles 1 - 14 of 14

Full-Text Articles in Entire DC Network

Robust And Uncertainty-Aware Image Classification Using Bayesian Vision Transformer Model, Fazlur Rahman Bin Karim Dec 2023

Theses and Dissertations

Transformer Neural Networks have emerged as the predominant architecture for addressing a wide range of Natural Language Processing (NLP) applications such as machine translation, speech recognition, sentiment analysis, and text anomaly detection. This noteworthy achievement of Transformer Neural Networks in the NLP field has sparked a growing interest in integrating and utilizing Transformer models in computer vision tasks. The Vision Transformer (ViT) model efficiently captures long-range dependencies by employing a self-attention mechanism to transform different image data into meaningful representations. Recently, ViT models have exhibited strong performance in solving image classification problems, thereby …


Evidence Assisted Learning For Clinical Decision Support Systems, Bhanu Pratap Singh Rawat Aug 2023

Doctoral Dissertations

Clinical decision support systems (CDSS) provide intelligently filtered knowledge and patient-specific and population information to clinicians, nursing staff, and healthcare professionals. CDSS can significantly improve the quality, safety, efficiency, and effectiveness of health care. Over the last decade, American hospitals have widely adopted electronic health records (EHRs), resulting in a massive collection of clinical notes such as admission notes, physician notes, nursing notes, and discharge summaries. For the past couple of decades, most of the work in CDSS has focused on developing knowledge-based systems using structured data such as medications and ICD codes. In contrast, the EHR notes …


Counterfactual Replacement Analysis For Interpretation Of Blackbox Sexism Classification Models, Anders Knospe Jun 2023

Computer Science Senior Theses

This paper describes the AKD team’s system designed for SemEval-2023 Task 10: Explainable Detection of Online Sexism (EDOS). We implement a simple fine-tuned GPT-3 model, ranking 26th on the leaderboard for task A. We also discuss different approaches to interpretability in the context of critiquing the EDOS task’s sub-category-oriented approach. Finally, we propose counterfactual replacement analysis, a novel prototype technique for approaching explainability.


Extracting Patterns Of Semantic Roles From Accident Narratives, Soundarya Jayakumar May 2023

Theses and Dissertations

Accident databases are filled with rich information about accidents. Analyzing these datasets can reveal useful information that can be used to prevent similar accidents in the future. Policy makers and safety management organizations can design appropriate preventive measures based on this analysis. Besides structured data, crash reports include natural language narratives, which contain valuable accident-related information that is not present in the structured data. Using natural language processing (NLP) techniques, one can analyze these narratives and mine hidden patterns of accidents from them. The thesis focuses on developing an algorithm to extract common patterns of semantic …


Emotion Classification And Intensity Prediction On Tweets, Sharath Chander Pugazhenthi May 2023

Theses and Dissertations

The task of finding the emotion associated with text posted by individuals on a social media platform has become crucial, as it influences the current state of mind of a particular individual in real life. It also helps one to understand social behavior at a given point in time. Microblogging platforms like Twitter serve as a powerful tool for expressing one’s thoughts. Several studies have classified the emotion associated with such posts. The thesis comprises a system that first classifies the tweet into one of four emotions (anger, joy, sadness, and fear) with good …


Quantification Of Various Types Of Biases In Large Language Models, Sudhashree Sayenju Apr 2023

Doctor of Data Science and Analytics Dissertations

Natural Language Processing (NLP) systems are found everywhere on the internet, from search engines and language translation to more advanced systems like voice assistants and customer service. Since humans are always on the receiving end of NLP technologies, it is very important to analyze whether the Large Language Models (LLMs) in use are biased and therefore unfair. The majority of the research on NLP bias has focused on societal stereotype biases embedded in LLMs. However, our research focuses on all types of biases present in LLMs, namely model class level bias, stereotype bias, and domain bias. Model class …


Learning Analytics Through Machine Learning And Natural Language Processing, Bokai Yang Apr 2023

Theses and Dissertations

The increase in computing power and the ability to log students’ data with the help of computer-assisted learning systems have led to an increased interest in developing and applying computer science techniques for analyzing learning data. To understand and investigate how learning-generated data can be used to improve student success, data mining techniques have been applied to several educational tasks. This dissertation investigates three important tasks in various domains of educational data mining: learners’ behavior analysis, essay structure analysis and feedback provision, and learners’ dropout prediction. The first project applied latent semantic analysis and machine learning approaches to investigate …


A Hybrid Continual Machine Learning Model For Efficient Hierarchical Classification Of Domain-Specific Text In The Presence Of Class Overlap (Case Study: IT Support Tickets), Yasmen M. Wahba Mar 2023

Electronic Thesis and Dissertation Repository

In today’s world, support ticketing systems are employed by a wide range of businesses. The ticketing system facilitates the interaction between customers and the support teams when the customer faces an issue with a product or a service. For large-scale IT companies with a large number of clients and a great volume of communications, the task of automating the classification of incoming tickets is key to guaranteeing long-term clients and ensuring business growth.

Although the problem of text classification has been widely studied in the literature, the majority of the proposed approaches revolve around state-of-the-art deep learning models. This thesis …


Increasing Code Completion Accuracy In Pythia Models For Non-Standard Python Libraries, David Buksbaum Jan 2023

CCE Theses and Dissertations

Contemporary software development with modern programming languages leverages Integrated Development Environments, smart text editors, and similar tooling with code completion capabilities to increase the efficiency of software developers. Recent code completion research has shown that the combination of natural language processing with recurrent neural networks configured with long short-term memory can improve the accuracy of code completion predictions over prior models. It is well known that the accuracy of predictive systems based on training data is correlated with the quality and quantity of the training data. This dissertation demonstrates that by expanding the training data set to include more …


Practical AI Value Alignment Using Stories, Md Sultan Al Nahian Jan 2023

Theses and Dissertations--Computer Science

As more machine learning agents interact with humans, it is increasingly likely that an agent trained to perform a task optimally, using only a measure of task performance as feedback, can violate societal norms for acceptable behavior or cause harm. Consequently, it becomes necessary to prioritize task performance and ensure that AI actions do not have detrimental effects. Value alignment is a property of intelligent agents, wherein they solely pursue goals and activities that are non-harmful and beneficial to humans. Current approaches to value alignment largely depend on imitation learning or learning from demonstration methods. However, the dynamic nature …


Effective Systems For Insider Threat Detection, Muhanned Qasim Jabbar Alslaiman Jan 2023

Browse all Theses and Dissertations

Insider threats to information security have become a burden for organizations. Understanding insider activities leads to an effective improvement in identifying insider attacks and limits their threats. This dissertation presents three systems to detect insider threats effectively. The aim is to reduce the false negative rate (FNR), provide better dataset use, and reduce dimensionality and zero padding effects. The systems developed utilize deep learning techniques and are evaluated using the CERT 4.2 dataset. The dataset is analyzed and reformed so that each row represents a variable length sample of user activities. Two data representations are implemented to model extracted features …


Wikipedia Web Table Interpretation, Keyword-Based Search, And Ranking, Kartikee Dabir Jan 2023

Master's Projects

Information retrieval and data interpretation on the web, for the purpose of gaining knowledgeable insights, has been a widely researched topic from the onset of the World Wide Web, or what is today popularly known as the internet. Web tables are structured tabular data present amidst unstructured, heterogeneous data on the web. This makes web tables a rich source of information for a variety of tasks like data analysis, data interpretation, and information retrieval pertaining to extracting knowledge from information present on the web. Wikipedia tables, which are a subset of web tables, hold a huge amount of useful data, …


Multi-Label Text Classification With Transfer Learning, Likhitha Yelamanchili Jan 2023

Master's Projects

Multi-label text categorization is a crucial task in Natural Language Processing, where each text instance can be simultaneously assigned to numerous labels. This project's goal is to assess how well several deep learning models perform on a real-world dataset for multi-label text classification. We employed data augmentation techniques like Synonym Substitution and Random Word Substitution to address the problem of data imbalance. We conducted experiments on a toxic comment classification dataset to evaluate the effectiveness of several deep learning models including Bi-LSTM, GRU, and Bi-GRU, as well as fine-tuned pre-trained BERT models. Many metrics, including log loss, recall@k, and …


Comparative Analysis Of Transformer-Based Models For Text-To-Speech Normalization, Pankti Dholakia Jan 2023

Master's Projects

Text-to-Speech (TTS) normalization is an essential component of natural language processing (NLP) that plays a crucial role in the production of natural-sounding synthesized speech. However, there are limitations to the TTS normalization procedure. Lengthy input sequences and variations in spoken language can present difficulties. The motivation behind this research is to address the challenges associated with TTS normalization by evaluating and comparing the performance of various models. The aim is to determine their effectiveness in handling language variations. The models include LSTM-GRU, Transformer, GCN-Transformer, GCNN-Transformer, Reformer, and a pre-trained BERT language model. The research evaluates the performance …