Open Access. Powered by Scholars. Published by Universities.®

Engineering Commons

Open Access. Powered by Scholars. Published by Universities.®

Articles 1 - 6 of 6

Full-Text Articles in Engineering

Defining And Detecting Toxicity On Social Media: Context And Knowledge Are Key, Amit Sheth, Valerie Shalin, Ugur Kursuncu Dec 2021

Defining And Detecting Toxicity On Social Media: Context And Knowledge Are Key, Amit Sheth, Valerie Shalin, Ugur Kursuncu

Publications

As the role of online platforms has become increasingly prominent for communication, toxic behaviors, such as cyberbullying and harassment, have been rampant in the last decade. On the other hand, online toxicity is multi-dimensional and sensitive in nature, which makes its detection challenging. As the impact of exposure to online toxicity can lead to serious implications for individuals and communities, reliable models and algorithms are required for detecting and understanding such communications. In this paper We define toxicity to provide a foundation drawing social theories. Then, we provide an approach that identifies multiple dimensions of toxicity and incorporates explicit knowledge …


Semantically Meaningful Sentence Embeddings, Rojina Deuja Dec 2021

Semantically Meaningful Sentence Embeddings, Rojina Deuja

Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

Text embedding is an approach used in Natural Language Processing (NLP) to represent words, phrases, sentences, and documents. It is the process of obtaining numeric representations of text to feed into machine learning models as vectors (arrays of numbers). One of the biggest challenges in text embedding is representing longer text segments like sentences. These representations should capture the meaning of the segment and the semantic relationship between its constituents. Such representations are known as semantically meaningful embeddings. In this thesis, we seek to improve upon the quality of sentence embeddings that capture semantic information.

The current state-of-the-art models are …


Cybert: Cybersecurity Claim Classification By Fine-Tuning The Bert Language Model, Kimia Ameri, Michael Hempel, Hamid Sharif, Juan Lopez Jr., Kalyan Perumalla Nov 2021

Cybert: Cybersecurity Claim Classification By Fine-Tuning The Bert Language Model, Kimia Ameri, Michael Hempel, Hamid Sharif, Juan Lopez Jr., Kalyan Perumalla

Department of Electrical and Computer Engineering: Faculty Publications

We introduce CyBERT, a cybersecurity feature claims classifier based on bidirectional encoder representations from transformers and a key component in our semi-automated cybersecurity vetting for industrial control systems (ICS). To train CyBERT, we created a corpus of labeled sequences from ICS device documentation collected across a wide range of vendors and devices. This corpus provides the foundation for fine-tuning BERT’s language model, including a prediction-guided relabeling process. We propose an approach to obtain optimal hyperparameters, including the learning rate, the number of dense layers, and their configuration, to increase the accuracy of our classifier. Fine-tuning all hyperparameters of the resulting …


Exploiting Bert And Roberta To Improve Performance For Aspect Based Sentiment Analysis, Gagan Reddy Narayanaswamy Jan 2021

Exploiting Bert And Roberta To Improve Performance For Aspect Based Sentiment Analysis, Gagan Reddy Narayanaswamy

Dissertations

Sentiment Analysis also known as opinion mining is a type of text research that analyses people’s opinions expressed in written language. Sentiment analysis brings together various research areas such as Natural Language Processing (NLP), Data Mining, and Text Mining, and is fast becoming of major importance to companies and organizations as it is started to incorporate online commerce data for analysis. Often the data on which sentiment analysis is performed will be reviews. The data can range from reviews of a small product to a big multinational corporation. The goal of performing sentiment analysis is to extract information from those …


Evaluating The Performance Of Transformer Architecture Over Attention Architecture On Image Captioning, Deepti Balasubramaniam Jan 2021

Evaluating The Performance Of Transformer Architecture Over Attention Architecture On Image Captioning, Deepti Balasubramaniam

Dissertations

Over the last few decades computer vision and Natural Language processing has shown tremendous improvement in different tasks such as image captioning, video captioning, machine translation etc using deep learning models. However, there were not much researches related to image captioning based on transformers and how it outperforms other models that were implemented for image captioning. In this study will be designing a simple encoder-decoder model, attention model and transformer model for image captioning using Flickr8K dataset where will be discussing about the hyperparameters of the model, type of pre-trained model used and how long the model has been trained. …


Nlp Is Not Enough - Contextualization Of User Input In Chatbots, Nathan Dolbir, Triyasha Dastidar, Kaushik Roy Jan 2021

Nlp Is Not Enough - Contextualization Of User Input In Chatbots, Nathan Dolbir, Triyasha Dastidar, Kaushik Roy

Publications

AI chatbots have made vast strides in technology improvement in recent years and are already operational in many industries. Advanced Natural Language Processing techniques, based on deep networks, efficiently process user requests to carry out their functions. As chatbots gain traction, their applicability in healthcare is an attractive proposition due to the reduced economic and people costs of an overburdened system. However, healthcare bots require safe and medically accurate information capture, which deep networks aren’t yet capable of due to user text and speech variations. Knowledge in symbolic structures is more suited for accurate reasoning but cannot handle natural language …