Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Computer Sciences

PDF

Theses/Dissertations

2019

Machine Learning

Articles 1 - 30 of 47

Full-Text Articles in Physical Sciences and Mathematics

Ordinal Hyperplane Loss, Bob Vanderheyden Dec 2019

Doctor of Data Science and Analytics Dissertations

This research presents the development of a new framework for analyzing ordered class data, commonly called “ordinal class” data. The focus of the work is the development of classifiers (predictive models) that predict classes from available data. Rating scales, medical classification scales, socio-economic scales, meaningful groupings of continuous data, facial emotional intensity and facial age estimation are examples of ordinal data for which data scientists may be asked to develop predictive classifiers. It is possible to treat ordinal classification like any other classification problem that has more than two classes. Specifying a model with this strategy does not fully utilize …


Information Extraction From Biomedical Text Using Machine Learning, Deepti Garg Dec 2019

Master's Projects

Inadequate drug experimental data and the use of unlicensed drugs may cause adverse drug reactions, especially in pediatric populations. Every year the U.S. Food and Drug Administration approves human prescription drugs for marketing. The labels associated with these drugs include information about clinical trials and drug response in pediatric populations. In order for doctors to make an informed decision about the safety and effectiveness of these drugs for children, there is a need to analyze complex and often unstructured drug labels. In this work, first, an exploratory analysis of drug labels using a Natural Language Processing pipeline is performed. Second, …


Finding A Viable Neural Network Architecture For Use With Upper Limb Prosthetics, Maxwell Lavin Dec 2019

Master of Science in Computer Science Theses

This paper asks whether it is possible to produce a simple, quick, and accurate neural network for use in upper-limb prosthetics. Through the implementation of convolutional and artificial neural networks and feature extraction on electromyographic data, different possible architectures are examined with regard to processing time, complexity, and accuracy. It is found that the most accurate architecture is a multi-entry categorical cross entropy convolutional neural network with 100% accuracy. The issue is that it is also the slowest method, requiring 9 minutes to run. The next best method found was a single-entry binary cross entropy …


Automatic Inference Of Causal Reasoning Chains From Student Essays, Simon Mark Hughes Oct 2019

College of Computing and Digital Media Dissertations

While there has been an increasing focus on higher-level thinking skills arising from the Common Core Standards, many high-school and middle-school students struggle to combine and integrate information from multiple sources when writing essays. Writing is an important learning skill, and there is increasing evidence that writing about a topic develops a deeper understanding in the student. However, grading essays is time consuming for teachers, resulting in an increasing focus on shallower forms of assessment that are easier to automate, such as multiple-choice tests. Existing essay grading software has attempted to ease this burden but relies on shallow lexico-syntactic features …


Prediction Of Hierarchical Classification Of Transposable Elements Using Machine Learning Techniques, Manisha Panta Aug 2019

University of New Orleans Theses and Dissertations

Transposable Elements (TEs) or jumping genes are the DNA sequences that have an intrinsic capability to move within a host genome from one genomic location to another. Studies show that the presence of a TE within or adjacent to a functional gene may alter its expression. TEs can also cause an increase in the rate of mutation and can even promote gross genetic arrangements. Thus, the proper classification of the identified jumping genes is important to understand their genetic and evolutionary effects. While computational methods have been developed that perform either binary classification or multi-label classification of TEs, few studies …


Feature Selection And Analysis For Standard Machine Learning Classification Of Audio Beehive Samples, Chelsi Gupta Aug 2019

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

Beekeepers need to inspect their hives regularly in order to protect them from various stressors. Manual inspection of hives requires a lot of time and effort. Hence, many researchers have started using electronic beehive monitoring (EBM) systems to collect critical information from beehives, so as to alert beekeepers to possible threats to the hive. EBM collects information by placing multiple sensors in the hive. The sensors collect information in the form of video, audio or temperature data from the hives.

This thesis involves the automatic classification of audio samples from a beehive into bee buzzing, cricket chirping and …


Static Malware Detection Using Deep Neural Networks On Portable Executables, Piyush Aniruddha Puranik Aug 2019

UNLV Theses, Dissertations, Professional Papers, and Capstones

There are two main components of malware analysis. One is static malware analysis and the other is dynamic malware analysis. Static malware analysis involves examining the basic structure of the malware executable without executing it, while dynamic malware analysis relies on examining malware behavior after executing it in a controlled environment. Static malware analysis is typically done by modern anti-malware software by using signature-based analysis or heuristic-based analysis.

This thesis proposes the use of deep neural networks to learn features from a malware’s portable executable (PE) to minimize the occurrences of false positives when recognizing new malware. We use the …


Enhancing Scalability In Genetic Programming With Adaptable Constraints, Type Constraints And Automatically Defined Functions, George Gerules Jul 2019

Dissertations

Genetic Programming is a type of biologically inspired machine learning. It is composed of a population of stochastic individuals. Those individuals can exchange portions of themselves with others in the population through the crossover operation that draws its inspiration from biology. Other biologically inspired operations include mutation and reproduction. The form an individual takes can be many things. It, however, is represented most of the time as a computer program. Constructing correct, efficient programs can be notoriously difficult. Various grammar, typing, function constraint, or counting mechanisms can guide creation and evolution of those individuals. These mechanisms can reduce search space …


Supervised Machine Learning Models For Fake News Detection, Andrea Lopez, Adelo Vieira, Zafar Ahsan, Farooq Sabib, Shirley Marinho Jun 2019

ICT

Fake news or the distribution of disinformation has become one of the most challenging issues in society. News and information are churned out across online websites and platforms in real-time, with little or no way for the viewing public to determine what is real or manufactured. But an awareness of what we are consuming online is becoming apparent and efforts are underway to explore how we separate fake content from genuine and truthful information. The most challenging part of fake news is determining how to spot it. In technology, there are ways to help us do this. Supervised machine learning …


Identifying Hourly Traffic Patterns With Python Deep Learning, Christopher L. Leavitt Jun 2019

Computer Engineering

This project was designed to explore and analyze the potential abilities and usefulness of applying machine learning models to data collected by parking sensors at a major metro shopping mall. By examining patterns in the rates at which customers enter and exit parking garages on the campus of the Bellevue Collection shopping mall in Bellevue, Washington, a recurrent neural network that uses data points from the previous hours will be trained to forecast future trends.
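As a rough illustration of forecasting from the previous hours, the sketch below turns an hourly series into (past window, next value) training pairs, which is the usual input shape for a recurrent forecaster. It is not taken from the project: the function name `make_windows`, the window length, and the hourly counts are all made-up assumptions.

```python
import numpy as np

def make_windows(series, window):
    """Split a 1-D series into (inputs, targets) pairs for next-step forecasting.

    Each input row holds `window` consecutive observations; the matching
    target is the observation that immediately follows that window.
    """
    series = np.asarray(series, dtype=float)
    inputs = np.stack([series[i:i + window] for i in range(len(series) - window)])
    targets = series[window:]
    return inputs, targets

# Hypothetical hourly garage-entry counts (illustrative values only).
hourly_counts = [12, 30, 45, 40, 22, 10]
X, y = make_windows(hourly_counts, window=3)
print(X)
print(y)
```

Each row of `X` would be fed to the network as one input sequence, with the corresponding entry of `y` as the value to predict.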


Classifying Challenging Behaviors In Autism Spectrum Disorder With Neural Document Embeddings, Abigail Atchison May 2019

Computational and Data Sciences (MS) Theses

The understanding and treatment of challenging behaviors in individuals with Autism Spectrum Disorder is paramount to enabling the success of behavioral therapy; an essential step in this process being the labeling of challenging behaviors demonstrated in therapy sessions. These manifestations differ across individuals and within individuals over time and thus, the appropriate classification of a challenging behavior when considering purely qualitative factors can be unclear. In this thesis we seek to add quantitative depth to this otherwise qualitative task of challenging behavior classification. We do so through the application of natural language processing techniques to behavioral descriptions extracted from the …


Classifying Classic Ciphers Using Machine Learning, Nivedhitha Ramarathnam Krishna May 2019

Master's Projects

We consider the problem of identifying the classic cipher that was used to generate a given ciphertext message. We assume that the plaintext is English and we restrict our attention to ciphertext consisting only of alphabetic characters. Among the classic ciphers considered are the simple substitution cipher, Vigenère cipher, Playfair cipher, and column transposition cipher. The problem of classification is approached in two ways. The first method uses support vector machines (SVM) trained directly on ciphertext to classify the ciphers. In the second approach, we train hidden Markov models (HMM) on each ciphertext message, then use these trained HMMs as features …
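For readers unfamiliar with the ciphers named above, here is a minimal sketch of one of them, the Vigenère cipher, producing ciphertext restricted to alphabetic characters. It only illustrates the kind of input a classifier would see; the function name `vigenere_encrypt` is an assumption and nothing here reproduces the project's code or features.

```python
def vigenere_encrypt(plaintext, key):
    """Encrypt with the Vigenère cipher, keeping only the letters A-Z.

    Each plaintext letter is shifted by the alphabet position of the
    corresponding key letter, with the key repeated as needed.
    """
    letters = [c for c in plaintext.upper() if "A" <= c <= "Z"]
    shifts = [ord(k) - ord("A") for k in key.upper() if "A" <= k <= "Z"]
    if not shifts:
        raise ValueError("key must contain at least one letter")
    out = []
    for i, c in enumerate(letters):
        shifted = (ord(c) - ord("A") + shifts[i % len(shifts)]) % 26
        out.append(chr(shifted + ord("A")))
    return "".join(out)

ciphertext = vigenere_encrypt("ATTACK AT DAWN", "LEMON")
print(ciphertext)  # LXFOPVEFRNHR
```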


Emulation Vs Instrumentation For Android Malware Detection, Anukriti Sinha May 2019

Master's Projects

In resource constrained devices, malware detection is typically based on offline analysis using emulation. In previous work it has been claimed that such emulation fails for a significant percentage of Android malware because well-designed malware detects that the code is being emulated. An alternative to emulation is malware analysis based on code that is executing on an actual Android device. In this research, we collect features from a corpus of Android malware using both emulation and on-phone instrumentation. We train machine learning models based on emulated features and also train models based on features collected via instrumentation, and we compare …


Differential Estimation Of Audiograms Using Gaussian Process Active Model Selection, Trevor Larsen May 2019

McKelvey School of Engineering Theses & Dissertations

Classical methods for psychometric function estimation either require excessive resources to perform, as in the method of constants, or produce only a low resolution approximation of the target psychometric function, as in adaptive staircase or up-down procedures. This thesis makes two primary contributions to the estimation of the audiogram, a clinically relevant psychometric function estimated by querying a patient for the audibility of a collection of tones. First, it covers the implementation of a Gaussian process model for learning an audiogram using another audiogram as a prior belief to speed up the learning procedure. Second, it implements a use case of …


Commonsense Knowledge In Sentiment Analysis Of Ordinance Reactions For Smart Governance, Manish Puri May 2019

Theses, Dissertations and Culminating Projects

Smart Governance is an emerging research area which has attracted scientific as well as policy interests, and aims to improve collaboration between government and citizens, as well as other stakeholders. Our project aims to enable lawmakers to incorporate data driven decision making in enacting ordinances. Our first objective is to create a mechanism for mapping ordinances (local laws) and tweets to Smart City Characteristics (SCC). The use of SCC has allowed us to create a mapping between a huge number of ordinances and tweets, and the use of Commonsense Knowledge (CSK) has allowed us to utilize human judgment in mapping. …


Supervised Machine Learning Models For Fake News Detection, Gofaas Group, Andrea Lopez, Adelo Vieira, Zafar Ahsan, Farooq Saqib, Shirley Marinho May 2019

ICT

Fake news or the distribution of disinformation has become one of the most challenging issues in society. News and information are churned out across online websites and platforms in real-time, with little or no way for the viewing public to determine what is real or manufactured. But an awareness of what we are consuming online is becoming apparent and efforts are underway to explore how we separate fake content from genuine and truthful information.

The most challenging part of fake news is determining how to spot it. In technology, there are ways to help us do this. Supervised machine learning …


Deep Embedding Kernel, Linh Le Apr 2019

Doctor of Data Science and Analytics Dissertations

Kernel methods and deep learning are two major branches of machine learning that have achieved numerous successes in both analytics and artificial intelligence. While having their own unique characteristics, both branches work through mapping data to a feature space that is supposedly more favorable towards the given task. This dissertation addresses the strengths and weaknesses of each mapping method through combining them and forming a family of novel deep architectures that center around the Deep Embedding Kernel (DEK). In short, DEK is a realization of a kernel function through a new deep architecture. The mapping in DEK is both implicit …


Machine Learning Methods For Personalized Health Monitoring Using Wearable Sensors, Annamalai Natarajan Mar 2019

Doctoral Dissertations

Mobile health is an emerging field that allows for real-time monitoring of individuals between routine clinical visits. Among other things, it makes it possible to remotely gather health signals, track disease progression and provide just-in-time interventions. Consumer grade wearable sensors can remotely gather health signals and other time series data. While wearable sensors can be readily deployed on individuals, there are significant challenges in converting raw sensor data into actionable insights. In this dissertation, we develop machine learning methods and models for personalized health monitoring using wearables. Specifically, we address three challenges that arise in these settings. First, data gathered from …


Neural Machine Translation, Quinn M. Lanners, Thomas Laurent Mar 2019

Honors Thesis

Neural Machine Translation is the primary algorithm used in industry to perform machine translation. This state-of-the-art algorithm is an application of deep learning in which massive datasets of translated sentences are used to train a model capable of translating between any two languages. The architecture behind neural machine translation is composed of two recurrent neural networks used together in tandem to create an Encoder Decoder structure. Attention mechanisms have recently been developed to further increase the accuracy of these models. In this senior thesis, the various parts of Neural Machine Translation are explored towards the eventual creation of a tutorial …


A Study Of Face Embedding In Face Recognition, Khanh Duc Le Mar 2019

Master's Theses

Face Recognition has been a long-standing topic in the computer vision and pattern recognition fields because of its wide and important applications in our daily lives, such as surveillance systems, access control, and so on. The current modern face recognition model, which keeps only a couple of images per person in the database, can now recognize a face with high accuracy. Moreover, the model does not need to be retrained every time a new person is added to the database.

By using the face dataset from Digital Democracy, the thesis will explore the capability of this model by comparing it with …


Dish: Democracy In State Houses, Nicholas A. Russo Feb 2019

Master's Theses

In our current political climate, state level legislators have become increasingly important. Due to cuts in funding and growing focus at the national level, public oversight for these legislators has drastically decreased. This makes it difficult for citizens and activists to understand the relationships and commonalities between legislators. This thesis provides three contributions to address this issue. First, we created a data set containing over 1200 features focused on a legislator’s activity on bills. Second, we created embeddings that represented a legislator’s level of activity and engagement for a given bill using a custom model called Democracy2Vec. Third, we …


Opioid Misuse Detection In Hospitalized Patients Using Convolutional Neural Networks, Brihat Sharma Jan 2019

Master's Theses

Opioid misuse is a major public health problem in the world. In 2016, 11.3 million people were reported to misuse opioids in the US alone. Opioid-related inpatient and emergency department visits have increased by 64 percent and the rate of opioid-related visits has nearly doubled between 2009 and 2014. It is thus critical for healthcare systems to detect opioid misuse cases. Patients hospitalized for consequences of their opioid misuse present an opportunity for intervention but better screening and surveillance methods are needed to guide providers. The current screening methods with self-report questionnaire data are time-consuming and difficult to perform in …


Data Driven Approach To Characterize And Forecast The Impact Of Freeway Work Zones On Mobility Using Probe Vehicle Data, Mohsen Kamyab Jan 2019

Wayne State University Dissertations

The presence of work zones on freeways causes traffic congestion and creates hazardous conditions for commuters and construction workers. Traffic congestion resulting from work zones causes negative impacts on traffic mobility (delay), the environment (vehicle emissions), and safety when stopped or slowed vehicles become vulnerable to rear-end collisions. Addressing these concerns, a data-driven approach was utilized to develop methodologies to measure, predict, and characterize the impact work zones have on Michigan interstates. This study used probe vehicle data, collected from GPS devices in vehicles, as the primary source for mobility data. This data was used to fulfill three objectives: develop …


Computer-Aided Classification Of Impulse Oscillometric Measures Of Respiratory Small Airways Function In Children, Nancy Selene Avila Jan 2019

Open Access Theses & Dissertations

Computer-aided classification of respiratory small airways dysfunction is not an easy task. There is a need to develop more robust classifiers, specifically for children as the classification studies performed to date have the following limitations: 1) they include features derived from tests that are not suitable for children and 2) they cannot distinguish between mild and severe small airway dysfunction.

This Dissertation describes the classification algorithms with high discriminative capacity to distinguish different levels of respiratory small airways function in children (Asthma, Small Airways Impairment, Possible Small Airways Impairment, and Normal lung function). This ability came from innovative feature selection, …


A Machine Learning Recommender Model For Ride Sharing Based On Rider Characteristics And User Threshold Time, Govind Pramod Yatnalkar Jan 2019

Theses, Dissertations and Capstones

In the present age, human life is prospering incredibly due to the 4th Industrial Revolution or The Age of Digitization and Computing. The ubiquitous availability of the Internet and advanced computing systems have resulted in the rapid development of smart cities. From connected devices to live vehicle tracking, technology is taking the field of transportation to a new level. An essential part of the transportation domain in smart cities is Ride Sharing. It is an excellent solution to issues like pollution, traffic, and the rapid consumption of fuel. Even though Ride Sharing has several benefits, the current usage is …


Explainable Neural Networks Based Anomaly Detection For Cyber-Physical Systems, Kasun Amarasinghe Jan 2019

Theses and Dissertations

Cyber-Physical Systems (CPSs) are the core of modern critical infrastructure (e.g. power-grids) and securing them is of paramount importance. Anomaly detection in data is crucial for CPS security. While Artificial Neural Networks (ANNs) are strong candidates for the task, they are seldom deployed in safety-critical domains due to the perception that ANNs are black-boxes. Therefore, to leverage ANNs in CPSs, cracking open the black box through explanation is essential.

The main objective of this dissertation is developing explainable ANN-based Anomaly Detection Systems for Cyber-Physical Systems (CP-ADS). The main objective was broken down into three sub-objectives: 1) Identifying key-requirements that an …


Dedicated Hardware For Machine/Deep Learning: Domain Specific Architectures, Angel Izael Solis Jan 2019

Open Access Theses & Dissertations

Artificial intelligence has come a very long way from being a mere spectacle on the silver screen in the 1920s [Hml18]. As artificial intelligence continues to evolve, and we begin to develop more sophisticated Artificial Neural Networks, the need for specialized and more efficient machines (less computational strain while maintaining the same performance results) becomes increasingly evident. Though these “new” techniques, such as Multilayer Perceptrons, Convolutional Neural Networks and Recurrent Neural Networks, may seem as if they are on the cutting edge of technology, many of these ideas are over 60 years old! However, many of these earlier models, at …


A Novel Set Of Weight Initialization Techniques For Deep Learning Architectures, Diego Aguirre Jan 2019

Open Access Theses & Dissertations

The importance of weight initialization when building a deep learning model is often underappreciated. Even though it is usually seen as a minor detail in the model creation cycle, this process has been shown to have a strong impact on the training time of a network and the quality of the resulting model. In fact, the implications of choosing a poor initialization scheme range from leading to the creation of a poorly performing model to preventing optimization techniques (like stochastic gradient descent) from converging.

In this work, we introduce and evaluate a set of novel weight initialization techniques for deep learning …


Randomized Algorithms For Preconditioner Selection With Applications To Kernel Regression, Conner Dipaolo Jan 2019

HMC Senior Theses

The task of choosing a preconditioner $M$ to use when solving a linear system $Ax = b$ with iterative methods is often tedious and most methods remain ad-hoc. This thesis presents a randomized algorithm to make this chore less painful through use of randomized algorithms for estimating traces. In particular, we show that the preconditioner stability $\| I - M^{-1}A \|_F$, known to forecast preconditioner quality, can be computed in the time it takes to run a constant number of iterations of conjugate gradients through use of sketching methods. This is in spite of folklore which …
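To make the idea of estimating this quantity with random probes concrete, the sketch below uses a generic Hutchinson-style estimator: for a random sign vector z, the expected value of the squared length of (I - M^{-1}A)z equals the squared Frobenius norm of I - M^{-1}A. This is a textbook estimator shown under stated assumptions, not the thesis's algorithm; the function name `estimate_stability`, the probe count, the Jacobi (diagonal) preconditioner, and the toy matrix are all illustrative choices.

```python
import numpy as np

def estimate_stability(A, apply_M_inv, num_probes=200, seed=0):
    """Estimate the Frobenius norm of (I - M^{-1} A) with random sign probes.

    `apply_M_inv(v)` must return M^{-1} v; the matrix M^{-1} A is never formed.
    """
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    total = 0.0
    for _ in range(num_probes):
        z = rng.choice([-1.0, 1.0], size=n)   # Rademacher probe vector
        r = z - apply_M_inv(A @ z)            # (I - M^{-1} A) z
        total += r @ r                        # squared length of the result
    return np.sqrt(total / num_probes)

# Toy symmetric positive definite system with a Jacobi preconditioner.
rng = np.random.default_rng(1)
n = 50
G = rng.standard_normal((n, n))
A = G @ G.T + n * np.eye(n)
d = np.diag(A)

estimate = estimate_stability(A, lambda v: v / d)
exact = np.linalg.norm(np.eye(n) - A / d[:, None], "fro")  # dense reference value
print(estimate, exact)
```

Each probe costs one product with A and one application of the inverse preconditioner, so the estimate can be compared across candidate preconditioners without forming any dense matrix; the dense `exact` value is computed here only as a check on the toy problem.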


The Structural Information Filtered Features Potential For Machine Learning Calculations Of Energies And Forces Of Atomic Systems., Jorge Arturo Hernandez Zeledon Jan 2019

Graduate Theses, Dissertations, and Problem Reports

In the last ten years, machine learning potentials have been successfully applied to the study of crystals and molecules. However, more complex materials like clusters, macro-molecules, and glasses are out of reach of current methods. The input of any machine learning system is a tensor of features (the most universal type are rank 1 tensors, or vectors of features), and the quality of any machine learning system is directly related to how well the feature space describes the original physical system. So far, the feature engineering process for machine learning potentials cannot describe complex materials. The current methods are highly inefficient …