Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 14 of 14

Full-Text Articles in Physical Sciences and Mathematics

Code Execution Capability As A Metric For Machine Learning–Assisted Software Vulnerability Detection Models, Daniel Grahn, Lingwei Chen, Junjie Zhang Jan 2023

Computer Science and Engineering Faculty Publications

In this paper, we consider how the ability to learn Code Execution Tasks affects a model’s accuracy on software vulnerability detection (SVD) benchmark datasets. We initially find that models can achieve near state-of-the-art accuracy on SVD benchmarks regardless of their ability to learn Code Execution Tasks. However, these models fail to generalize well across SVD benchmarks. The results indicate a bias in the datasets that allows models to predict non-SVD signals. Under the theory that different collection methods will reduce biases, we investigate combining the SVD datasets. When trained on combined datasets, SVD accuracy is reduced but correlation with …


Novel Natural Language Processing Models For Medical Terms And Symptoms Detection In Twitter, Farahnaz Golrooy Motlagh Jan 2022

Browse all Theses and Dissertations

This dissertation develops novel NLP models to disambiguate language on Twitter about drug use, types of drug consumption, and drug legalization, using ontology-enhanced approaches and data-driven prediction analysis. Three technical aims comprise this work: (a) leveraging pattern recognition techniques to improve the quality and quantity of crawled Twitter posts related to drug abuse; (b) using an expert-curated, domain-specific DsOn ontology model that improves knowledge extraction in the form of drug-to-symptom and drug-to-side-effect relations; and (c) modeling the prediction of public perception of drug legalization and the sentiment analysis of drug consumption on Twitter. We collected …


Deep Understanding Of Technical Documents: Automated Generation Of Pseudocode From Digital Diagrams & Analysis/Synthesis Of Mathematical Formulas, Nikolaos Gkorgkolis Jan 2022

Browse all Theses and Dissertations

The technical document is an entity that consists of several essential and interconnected parts, often referred to as modalities. Although certain parts, such as the textual information, have already received extensive attention, several aspects remain severely under-researched. Two such modalities are the utility of diagram images and the deep automated understanding of mathematical formulas. Inspired by existing holistic approaches to the deep understanding of technical documents, we develop a novel formal scheme for the modelling of digital diagram images. This extends to a generative framework that allows for the creation of artificial images and their …


Knowledge Graph Reasoning Over Unseen RDF Data, Bhargavacharan Reddy Kaithi Jan 2019

Browse all Theses and Dissertations

In recent years, the research in deep learning and knowledge engineering has made a wide impact on the data and knowledge representations. The research in knowledge engineering has frequently focused on modeling the high-level human cognitive abilities, such as reasoning, making inferences, and validation. Semantic Web Technologies and Deep Learning have an interest in creating intelligent artifacts. Deep learning is a set of machine learning algorithms that attempt to model data representations through many layers of non-linear transformations. Deep learning is increasingly employed to analyze various knowledge representations mentioned in Semantic Web and provides better results for Semantic …


A Framework To Understand Emoji Meaning: Similarity And Sense Disambiguation Of Emoji Using EmojiNet, Sanjaya Wijeratne Jan 2018

Browse all Theses and Dissertations

Pictographs, commonly referred to as ‘emoji’, have become a popular way to enhance electronic communications. They are an important component of the language used in social media. Since their introduction in the late 1990s, emoji have been widely used to enhance the sentiment, emotion, and sarcasm expressed in social media messages. They are equally popular across many social media sites including Facebook, Instagram, and Twitter. In 2015, Instagram reported that nearly half of the photo comments posted on Instagram contain emoji, and in the same year, Twitter reported that the ‘face with tears of joy’ emoji has been tweeted 6.6 …


Use Of Adaptive Mobile Applications To Improve Mindfulness, Wiehan Boshoff Jan 2018

Browse all Theses and Dissertations

Mindfulness is the state of retaining awareness of what is happening at the current point in time. It has been used in multiple forms to reduce stress, anxiety, and even depression. Mindfulness can be promoted in various ways, but current research shows a trend towards preferring breathing exercises over other methods of reaching a mindful state. Studies have shown that breathing can be used as a tool to promote brain control, specifically in the auditory cortex region. Research pertaining to disorders such as tinnitus, the phantom awareness of sound, could potentially benefit from using these brain control …


What Are People Tweeting About Zika? An Exploratory Study Concerning Its Symptoms, Treatment, Transmission, And Prevention, Michele Miller, Tanvi Banerjee, Roopteja Muppalla, William L. Romine, Amit Sheth Apr 2017

Kno.e.sis Publications

Background: In order to harness what people are tweeting about Zika, there needs to be a computational framework that leverages machine learning techniques to recognize relevant Zika tweets and, further, categorize these into disease-specific categories to address specific societal concerns related to the prevention, transmission, symptoms, and treatment of Zika virus.

Objective: The purpose of this study was to determine the relevancy of the tweets and what people were tweeting about the 4 disease characteristics of Zika: symptoms, transmission, prevention, and treatment.

Methods: A combination of natural language processing and machine learning techniques was used to determine what people were …


Deep Learning Approach For Intrusion Detection System (IDS) In The Internet Of Things (IoT) Network Using Gated Recurrent Neural Networks (GRU), Manoj Kumar Putchala Jan 2017

Browse all Theses and Dissertations

The Internet of Things (IoT) is a complex paradigm where billions of devices are connected to a network. These connected devices form an intelligent system of systems that share data without human-to-computer or human-to-human interaction. These systems extract meaningful data that can transform human lives, businesses, and the world in significant ways. In practice, however, IoT is prone to countless cyber-attacks in an extremely hostile environment like the internet. The recent hacks of a 2014 Jeep Cherokee, an iStan pacemaker, and a German steel plant are a few notable security breaches. To secure an IoT system, the traditional high-end security …


Multi-Class Classification Of Textual Data: Detection And Mitigation Of Cheating In Massively Multiplayer Online Role Playing Games, Naga Sai Nikhil Maguluri Jan 2017

Browse all Theses and Dissertations

The success of any multiplayer game depends on the player’s experience. Cheating or hacking undermines the player’s experience and thus the success of that game. Cheaters who use hacks, bots, or trainers ruin the gaming experience of other players and drive them to leave the game. As the video game industry is a constantly growing multibillion-dollar economy, it is crucial to assure and maintain a state of security. Players reflect their gaming experience in one of the following places: multiplayer chat, game reviews, and social media. This thesis is an exploratory study where our goal is to experiment and propose …


Features For Ranking Tweets Based On Credibility And Newsworthiness, Jacob W. Ross Jan 2015

Browse all Theses and Dissertations

We create a robust and general feature set for learning-to-rank algorithms that rank tweets based on credibility and newsworthiness. Previous work has demonstrated that when the training and testing data are from two distinct time periods, the ranker performs poorly. We improve upon previous work by creating a feature set that does not overfit a particular year or set of topics. This is critical because the way people use social media changes as time progresses, and the topics discussed vary. In addition, we are constantly gaining new tweet data. Thus, it is important to be …


Contrast Pattern Aided Regression And Classification, Vahid Taslimitehrani Jan 2015

Browse all Theses and Dissertations

Regression and classification techniques play an essential role in many data mining tasks and have broad applications. However, most state-of-the-art regression and classification techniques are unable to adequately model the interactions among predictor variables in highly heterogeneous datasets. New techniques that can effectively model such complex and heterogeneous structures are needed to significantly improve prediction accuracy. In this dissertation, we propose a novel type of accurate and interpretable regression and classification models, named Pattern Aided Regression (PXR) and Pattern Aided Classification (PXC), respectively. Both PXR and PXC rely on identifying regions in the data space where …


Data Analytics For Power Utility Storm Planning, Lan Lin, Aldo Dagnino, Derek Doran, Swapna S. Gokhale Oct 2014

Kno.e.sis Publications

As the world population grows, recent climatic changes seem to bring powerful storms to populated areas. The impact of these storms on utility services is devastating. Hurricane Sandy is a recent example of the enormous damages that storms can inflict on infrastructure, society, and the economy. Quick response to these emergencies represents a big challenge to electric power utilities. Traditionally utilities develop preparedness plans for storm emergency situations based on the experience of utility experts and with limited use of historical data. With the advent of the Smart Grid, utilities are incorporating automation and sensing technologies in their grids and …


An Evolutionary Approximation To Contrastive Divergence In Convolutional Restricted Boltzmann Machines, Ryan R. McCoppin Jan 2014

Browse all Theses and Dissertations

Deep learning is an emerging area in machine learning that exploits multi-layered neural networks to extract invariant relationships from large data sets. Deep learning uses layers of non-linear transformations to represent data in abstract and discrete forms. Several different architectures have been developed over the past few years specifically to process images, including the Convolutional Restricted Boltzmann Machine. The Boltzmann Machine is trained using contrastive divergence, a depth-first gradient-based training algorithm. Gradient-based training methods have no guarantee of reaching an optimal solution and tend to search a limited region of the solution space. In this thesis, we present …


Pattern Recognition Via Machine Learning With Genetic Decision-Programming, Carl C. Hoff Jan 2005

Browse all Theses and Dissertations

In the intersection of pattern recognition, machine learning, and evolutionary computation is a new search technique by which computers might program themselves. That technique is called genetic decision-programming. A computer can gain the ability to distinguish among the things that it needs to recognize by using genetic decision-programming for pattern discovery and concept learning. Those patterns and concepts can be easily encoded in the spines of a decision program (tree or diagram). A spine consists of two parts: (1) the test-outcome pairs along a path from the program's root to any of its leaves and (2) the conclusion in that …
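As a rough illustration of the spine definition in the abstract above, the following Python sketch enumerates the spines of a small decision tree. It is not taken from the thesis: the `Leaf` and `Node` classes, the `spines` function, and the example tree are all hypothetical, and the sketch assumes that the conclusion of a spine is stored at the leaf that ends its path.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union


@dataclass
class Leaf:
    # Assumption: the conclusion of a spine lives at the leaf ending the path.
    conclusion: str


@dataclass
class Node:
    test: str                     # name of the test applied at this node
    branches: Dict[str, "Tree"]   # maps each test outcome to a subtree


Tree = Union[Leaf, Node]
Spine = Tuple[List[Tuple[str, str]], str]


def spines(tree: Tree, path: Tuple[Tuple[str, str], ...] = ()) -> List[Spine]:
    """Return one (test-outcome pairs, conclusion) pair per root-to-leaf path."""
    if isinstance(tree, Leaf):
        return [(list(path), tree.conclusion)]
    found: List[Spine] = []
    for outcome, subtree in tree.branches.items():
        # Extend the path with the (test, outcome) pair taken at this node.
        found.extend(spines(subtree, path + ((tree.test, outcome),)))
    return found


# Hypothetical toy tree, used only to exercise the function.
example = Node("has_wings", {
    "yes": Node("can_fly", {"yes": Leaf("bird"), "no": Leaf("penguin")}),
    "no": Leaf("mammal"),
})

for pairs, conclusion in spines(example):
    print(pairs, "->", conclusion)
```

Each returned pair corresponds to one path through the tree, so a tree with three leaves yields three spines.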