Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Theses/Dissertations

2019

Machine Learning

Articles 1 - 30 of 60

Full-Text Articles in Physical Sciences and Mathematics

Ordinal Hyperplane Loss, Bob Vanderheyden Dec 2019

Doctor of Data Science and Analytics Dissertations

This research presents the development of a new framework for analyzing ordered class data, commonly called “ordinal class” data. The focus of the work is the development of classifiers (predictive models) that predict classes from available data. Rating scales, medical classification scales, socio-economic scales, meaningful groupings of continuous data, facial emotional intensity and facial age estimation are examples of ordinal data for which data scientists may be asked to develop predictive classifiers. It is possible to treat ordinal classification like any other classification problem that has more than two classes. Specifying a model with this strategy does not fully utilize …


Information Extraction From Biomedical Text Using Machine Learning, Deepti Garg Dec 2019

Master's Projects

Inadequate drug experimental data and the use of unlicensed drugs may cause adverse drug reactions, especially in pediatric populations. Every year the U.S. Food and Drug Administration approves human prescription drugs for marketing. The labels associated with these drugs include information about clinical trials and drug response in pediatric populations. In order for doctors to make an informed decision about the safety and effectiveness of these drugs for children, there is a need to analyze complex and often unstructured drug labels. In this work, first, an exploratory analysis of drug labels using a Natural Language Processing pipeline is performed. Second, …


Finding A Viable Neural Network Architecture For Use With Upper Limb Prosthetics, Maxwell Lavin Dec 2019

Master of Science in Computer Science Theses

This paper asks whether it is possible to produce a simple, quick, and accurate neural network for use in upper-limb prosthetics. Through the implementation of convolutional and artificial neural networks and feature extraction on electromyographic data, different possible architectures are examined with regard to processing time, complexity, and accuracy. It is found that the most accurate architecture is a multi-entry categorical cross entropy convolutional neural network with 100% accuracy. The issue is that it is also the slowest method, requiring 9 minutes to run. The next best method found was a single-entry binary cross entropy …


Fractional Random Weighted Bootstrapping For Classification On Imbalanced Data With Ensemble Decision Tree Methods, Sean Charles Carter Nov 2019

USF Tampa Graduate Theses and Dissertations

Ensemble methods are commonly used for building predictive models for classification. Models that are unstable to perturbations in the training set, such as the decision tree, often see considerable reductions in error when grouped, using bootstrapped resamples of the training data to train many models. The non-parametric bootstrap, however, has limited efficacy when used on severely imbalanced data, especially when the number of observations of one or more classes is exceptionally small. We explore the fractional random weighted bootstrap, which randomly assigns fractional weights to observations, as an alternative resampling procedure in training machine learning ensembles, particularly decision tree …
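The contrast drawn in this abstract between integer resampling and fractional weighting can be sketched in a few lines. This is only an illustration under stated assumptions, not the thesis's implementation: the weights are assumed to follow a uniform Dirichlet distribution scaled to sum to n (obtained by normalizing Exponential(1) draws), and the function names and the 18-versus-2 toy sample are hypothetical.

```python
import random


def fractional_weights(n, rng):
    """Draw n fractional weights that sum to n.

    Assumption: uniform Dirichlet weights scaled by n, built by
    normalizing independent Exponential(1) draws.
    """
    draws = [rng.expovariate(1.0) for _ in range(n)]
    total = sum(draws)
    return [n * d / total for d in draws]


def multinomial_counts(n, rng):
    """Classic non-parametric bootstrap expressed as integer counts."""
    counts = [0] * n
    for _ in range(n):
        counts[rng.randrange(n)] += 1
    return counts


rng = random.Random(0)
labels = [0] * 18 + [1] * 2  # hypothetical imbalanced sample, 2 minority cases

weights = fractional_weights(len(labels), rng)
counts = multinomial_counts(len(labels), rng)

# Fractional weights are positive with probability one, so no observation
# is dropped; an integer count can be zero for any given observation.
minority_weight = sum(w for w, y in zip(weights, labels) if y == 1)
minority_count = sum(c for c, y in zip(counts, labels) if y == 1)
print(f"minority weight: {minority_weight:.3f}, minority count: {minority_count}")
```

In an ensemble, one such weight vector would be drawn per tree and passed to a learner that accepts per-observation weights, in place of a resampled training set.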


Automatic Inference Of Causal Reasoning Chains From Student Essays, Simon Mark Hughes Oct 2019

College of Computing and Digital Media Dissertations

While there has been an increasing focus on higher-level thinking skills arising from the Common Core Standards, many high-school and middle-school students struggle to combine and integrate information from multiple sources when writing essays. Writing is an important learning skill, and there is increasing evidence that writing about a topic develops a deeper understanding in the student. However, grading essays is time-consuming for teachers, resulting in an increasing focus on shallower forms of assessment that are easier to automate, such as multiple-choice tests. Existing essay grading software has attempted to ease this burden but relies on shallow lexico-syntactic features …


Predicting Absenteeism Of Female Students In Alabama, Funmilola Okelana Aug 2019

Dissertations and Theses

Students are chronically absent when they miss at least 15 days of the school year. Past researchers have identified income and environment as factors that affect school absenteeism. Alabama is a poor state with a high crime rate. The hypothesis for this research is that the absenteeism of female students in Alabama is high. Do we reject or fail to reject this hypothesis? If we fail to reject this hypothesis, then what other factors can affect absenteeism in schools? How can we best predict the absenteeism of female students in Alabama? What is the effect of bad data on …


Prediction Of Hierarchical Classification Of Transposable Elements Using Machine Learning Techniques, Manisha Panta Aug 2019

University of New Orleans Theses and Dissertations

Transposable Elements (TEs) or jumping genes are the DNA sequences that have an intrinsic capability to move within a host genome from one genomic location to another. Studies show that the presence of a TE within or adjacent to a functional gene may alter its expression. TEs can also cause an increase in the rate of mutation and can even promote gross genetic arrangements. Thus, the proper classification of the identified jumping genes is important to understand their genetic and evolutionary effects. While computational methods have been developed that perform either binary classification or multi-label classification of TEs, few studies …


Feature Selection And Analysis For Standard Machine Learning Classification Of Audio Beehive Samples, Chelsi Gupta Aug 2019

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

Beekeepers need to inspect their hives regularly in order to protect them from various stressors. Manual inspection of hives requires a lot of time and effort. Hence, many researchers have started using electronic beehive monitoring (EBM) systems to collect critical information from beehives, so as to alert beekeepers to possible threats to the hive. EBM collects information by placing multiple sensors in the hive. The sensors collect information in the form of video, audio or temperature data from the hives.

This thesis involves the automatic classification of audio samples from a beehive into bee buzzing, cricket chirping and …


Machine Learning Techniques As Applied To Discrete And Combinatorial Structures, Samuel David Schwartz Aug 2019

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

Machine Learning Techniques have been used on a wide array of input types: images, sound waves, text, and so forth. In articulating these input types to the almighty machine, there have been all sorts of amazing problems that have been solved for many practical purposes.

Nevertheless, there are some input types which don’t lend themselves nicely to the standard set of machine learning tools we have. Moreover, there are some provably difficult problems which are abysmally hard to solve within a reasonable time frame.

This thesis addresses several of these difficult problems. It frames these problems such that we can …


Static Malware Detection Using Deep Neural Networks On Portable Executables, Piyush Aniruddha Puranik Aug 2019

UNLV Theses, Dissertations, Professional Papers, and Capstones

There are two main components of malware analysis. One is static malware analysis and the other is dynamic malware analysis. Static malware analysis involves examining the basic structure of the malware executable without executing it, while dynamic malware analysis relies on examining malware behavior after executing it in a controlled environment. Static malware analysis is typically done by modern anti-malware software by using signature-based analysis or heuristic-based analysis.

This thesis proposes the use of deep neural networks to learn features from a malware’s portable executable (PE) to minimize the occurrences of false positives when recognizing new malware. We use the …


Enhancing Scalability In Genetic Programming With Adaptable Constraints, Type Constraints And Automatically Defined Functions, George Gerules Jul 2019

Dissertations

Genetic Programming is a type of biologically inspired machine learning. It is composed of a population of stochastic individuals. Those individuals can exchange portions of themselves with others in the population through the crossover operation that draws its inspiration from biology. Other biologically inspired operations include mutation and reproduction. The form an individual takes can be many things. It, however, is represented most of the time as a computer program. Constructing correct, efficient programs can be notoriously difficult. Various grammar, typing, function constraint, or counting mechanisms can guide creation and evolution of those individuals. These mechanisms can reduce search space …


Supervised Machine Learning Models For Fake News Detection, Andrea Lopez, Adelo Vieira, Zafar Ahsan, Farooq Sabib, Shirley Marinho Jun 2019

ICT

Fake news or the distribution of disinformation has become one of the most challenging issues in society. News and information are churned out across online websites and platforms in real-time, with little or no way for the viewing public to determine what is real or manufactured. But an awareness of what we are consuming online is becoming apparent and efforts are underway to explore how we separate fake content from genuine and truthful information. The most challenging part of fake news is determining how to spot it. In technology, there are ways to help us do this. Supervised machine learning …


Identifying Hourly Traffic Patterns With Python Deep Learning, Christopher L. Leavitt Jun 2019

Computer Engineering

This project was designed to explore and analyze the potential usefulness of applying machine learning models to data collected by parking sensors at a major metro shopping mall. The project examines patterns in the rates at which customers enter and exit parking garages on the campus of the Bellevue Collection shopping mall in Bellevue, Washington: a recurrent neural network is trained on data points from previous hours to forecast future trends.


Classifying Challenging Behaviors In Autism Spectrum Disorder With Neural Document Embeddings, Abigail Atchison May 2019

Computational and Data Sciences (MS) Theses

The understanding and treatment of challenging behaviors in individuals with Autism Spectrum Disorder is paramount to enabling the success of behavioral therapy; an essential step in this process being the labeling of challenging behaviors demonstrated in therapy sessions. These manifestations differ across individuals and within individuals over time and thus, the appropriate classification of a challenging behavior when considering purely qualitative factors can be unclear. In this thesis we seek to add quantitative depth to this otherwise qualitative task of challenging behavior classification. We do so through the application of natural language processing techniques to behavioral descriptions extracted from the …


Using Satellite-Based Hydro-Climate Variables And Machine Learning For Streamflow Modeling At Various Scales In The Upper Mississippi River Basin, Dongjae Kwon May 2019

Theses and Dissertations

Streamflow data are essential to study the hydrologic cycle and to attain appropriate water resource management policies. However, the availability of gauge data is limited for various reasons, such as economic and political constraints, instrument malfunction, and poor spatial distribution. Although streamflow can be simulated by process-based and machine learning approaches, their applicability is limited by intensive modeling effort and black-box nature, respectively. Here, we introduce a machine learning (Boosted Regression Tree (BRT)) approach based on remote sensing data to simulate monthly streamflow for three watersheds of varying sizes in the Upper Mississippi River Basin (UMRB). By integrating spatial land …


Classifying Classic Ciphers Using Machine Learning, Nivedhitha Ramarathnam Krishna May 2019

Master's Projects

We consider the problem of identifying the classic cipher that was used to generate a given ciphertext message. We assume that the plaintext is English and we restrict our attention to ciphertext consisting only of alphabetic characters. Among the classic ciphers considered are the simple substitution cipher, Vigenère cipher, Playfair cipher, and column transposition cipher. The problem of classification is approached in two ways. The first method uses support vector machines (SVM) trained directly on ciphertext to classify the ciphers. In the second approach, we train hidden Markov models (HMM) on each ciphertext message, then use these trained HMMs as features …


Emulation Vs Instrumentation For Android Malware Detection, Anukriti Sinha May 2019

Master's Projects

In resource-constrained devices, malware detection is typically based on offline analysis using emulation. In previous work it has been claimed that such emulation fails for a significant percentage of Android malware because well-designed malware detects that the code is being emulated. An alternative to emulation is malware analysis based on code that is executing on an actual Android device. In this research, we collect features from a corpus of Android malware using both emulation and on-phone instrumentation. We train machine learning models based on emulated features and also train models based on features collected via instrumentation, and we compare …


Differential Estimation Of Audiograms Using Gaussian Process Active Model Selection, Trevor Larsen May 2019

McKelvey School of Engineering Theses & Dissertations

Classical methods for psychometric function estimation either require excessive resources to perform, as in the method of constants, or produce only a low-resolution approximation of the target psychometric function, as in adaptive staircase or up-down procedures. This thesis makes two primary contributions to the estimation of the audiogram, a clinically relevant psychometric function estimated by querying a patient for audibility of a collection of tones. First, it covers the implementation of a Gaussian process model for learning an audiogram using another audiogram as a prior belief to speed up the learning procedure. Second, it implements a use case of …


Bias Reduction In Machine Learning Classifiers For Spatiotemporal Analysis Of Coral Reefs Using Remote Sensing Images, Justin J. Gapper May 2019

Computational and Data Sciences (PhD) Dissertations

This dissertation is an evaluation of the generalization characteristics of machine learning classifiers as applied to the detection of coral reefs using remote sensing images. Three scientific studies have been conducted as part of this research: 1) Evaluation of Spatial Generalization Characteristics of a Robust Classifier as Applied to Coral Reef Habitats in Remote Islands of the Pacific Ocean 2) Coral Reef Change Detection in Remote Pacific Islands using Support Vector Machine Classifiers 3) A Generalized Machine Learning Classifier for Spatiotemporal Analysis of Coral Reefs in the Red Sea. The aim of this dissertation is to propose and evaluate a …


Teaching Computers To Teach Themselves: Synthesizing Training Data Based On Human-Perceived Elements, James Little May 2019

Honors Projects

Isolation-Based Scene Generation (IBSG) is a process for creating synthetic datasets made to train machine learning detectors and classifiers. In this project, we formalize the IBSG process and describe the scenarios—object detection and object classification given audio or image input—in which it can be useful. We then look at the Stanford Street View House Number (SVHN) dataset and build several different IBSG training datasets based on existing SVHN data. We try to improve the compositing algorithm used to build the IBSG dataset so that models trained with synthetic data perform as well as models trained with the original SVHN training …


Model-Independent Estimation Of Optimal Hedging Strategies With Deep Neural Networks, Tobias Michael Furtwaengler May 2019

Theses and Dissertations

Inspired by the recent paper Buehler et al. (2018), this thesis aims to investigate the optimal hedging and pricing of financial derivatives with neural networks. We utilize the concept of convex risk measures to define optimal hedging strategies without strong assumptions on the underlying market dynamics. Furthermore, the setting allows the incorporation of market frictions and thus the determination of optimal hedging strategies and prices even in incomplete markets. We then use the approximation capabilities of neural networks to find close-to-optimal estimates for these strategies.

We will elaborate on the theoretical foundations of this approach and carry out implementations …
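For orientation, the abstract relies on convex risk measures without stating their definition. The standard textbook definition is sketched below; the symbols (rho, X, c, alpha, lambda) are notation chosen here rather than taken from the thesis, and the entropic risk measure is shown only as a common example, not necessarily the one the author uses.

```latex
% A map \rho from random terminal positions X to the reals is a
% convex risk measure if it satisfies the three properties below.
\begin{align*}
&\text{(monotonicity)}   && X_1 \ge X_2 \implies \rho(X_1) \le \rho(X_2), \\
&\text{(convexity)}      && \rho\bigl(\alpha X_1 + (1-\alpha) X_2\bigr)
                            \le \alpha\,\rho(X_1) + (1-\alpha)\,\rho(X_2),
                            \quad \alpha \in [0,1], \\
&\text{(cash invariance)} && \rho(X + c) = \rho(X) - c, \quad c \in \mathbb{R}.
\end{align*}
% Common example: the entropic risk measure with risk aversion \lambda > 0.
\[
\rho(X) = \frac{1}{\lambda} \log \mathbb{E}\!\left[ e^{-\lambda X} \right]
\]
```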


Commonsense Knowledge In Sentiment Analysis Of Ordinance Reactions For Smart Governance, Manish Puri May 2019

Theses, Dissertations and Culminating Projects

Smart Governance is an emerging research area which has attracted scientific as well as policy interests, and aims to improve collaboration between government and citizens, as well as other stakeholders. Our project aims to enable lawmakers to incorporate data-driven decision making in enacting ordinances. Our first objective is to create a mechanism for mapping ordinances (local laws) and tweets to Smart City Characteristics (SCC). The use of SCC has allowed us to create a mapping between a huge number of ordinances and tweets, and the use of Commonsense Knowledge (CSK) has allowed us to utilize human judgment in mapping. …


Supervised Machine Learning Models For Fake News Detection, Gofaas Group, Andrea Lopez, Adelo Vieira, Zafar Ahsan, Farooq Saqib, Shirley Marinho May 2019

ICT

Fake news or the distribution of disinformation has become one of the most challenging issues in society. News and information are churned out across online websites and platforms in real-time, with little or no way for the viewing public to determine what is real or manufactured. But an awareness of what we are consuming online is becoming apparent and efforts are underway to explore how we separate fake content from genuine and truthful information.

The most challenging part of fake news is determining how to spot it. In technology, there are ways to help us do this. Supervised machine learning …


Deep Embedding Kernel, Linh Le Apr 2019

Doctor of Data Science and Analytics Dissertations

Kernel methods and deep learning are two major branches of machine learning that have achieved numerous successes in both analytics and artificial intelligence. While having their own unique characteristics, both branches work through mapping data to a feature space that is supposedly more favorable towards the given task. This dissertation addresses the strengths and weaknesses of each mapping method through combining them and forming a family of novel deep architectures that center around the Deep Embedding Kernel (DEK). In short, DEK is a realization of a kernel function through a new deep architecture. The mapping in DEK is both implicit …


Machine Learning Methods For Personalized Health Monitoring Using Wearable Sensors, Annamalai Natarajan Mar 2019

Doctoral Dissertations

Mobile health is an emerging field that allows for real-time monitoring of individuals between routine clinical visits. Among other things, it makes it possible to remotely gather health signals, track disease progression and provide just-in-time interventions. Consumer-grade wearable sensors can remotely gather health signals and other time series data. While wearable sensors can be readily deployed on individuals, there are significant challenges in converting raw sensor data into actionable insights. In this dissertation, we develop machine learning methods and models for personalized health monitoring using wearables. Specifically, we address three challenges that arise in these settings. First, data gathered from …


Neural Machine Translation, Quinn M. Lanners, Thomas Laurent Mar 2019

Honors Thesis

Neural Machine Translation is the primary algorithm used in industry to perform machine translation. This state-of-the-art algorithm is an application of deep learning in which massive datasets of translated sentences are used to train a model capable of translating between any two languages. The architecture behind neural machine translation is composed of two recurrent neural networks used in tandem to create an encoder-decoder structure. Attention mechanisms have recently been developed to further increase the accuracy of these models. In this senior thesis, the various parts of Neural Machine Translation are explored towards the eventual creation of a tutorial …


A Study Of Face Embedding In Face Recognition, Khanh Duc Le Mar 2019

Master's Theses

Face Recognition has been a long-standing topic in the computer vision and pattern recognition fields because of its wide and important applications in our daily lives, such as surveillance systems, access control, and so on. The current modern face recognition model, which keeps only a couple of images per person in the database, can now recognize a face with high accuracy. Moreover, the model does not need to be retrained every time a new person is added to the database.

By using the face dataset from Digital Democracy, the thesis will explore the capability of this model by comparing it with …


Dish: Democracy In State Houses, Nicholas A. Russo Feb 2019

Master's Theses

In our current political climate, state level legislators have become increasingly important. Due to cuts in funding and growing focus at the national level, public oversight for these legislators has drastically decreased. This makes it difficult for citizens and activists to understand the relationships and commonalities between legislators. This thesis provides three contributions to address this issue. First, we created a data set containing over 1200 features focused on a legislator’s activity on bills. Second, we created embeddings that represented a legislator’s level of activity and engagement for a given bill using a custom model called Democracy2Vec. Third, we …


Opioid Misuse Detection In Hospitalized Patients Using Convolutional Neural Networks, Brihat Sharma Jan 2019

Master's Theses

Opioid misuse is a major public health problem in the world. In 2016, 11.3 million people were reported to misuse opioids in the US alone. Opioid-related inpatient and emergency department visits have increased by 64 percent and the rate of opioid-related visits has nearly doubled between 2009 and 2014. It is thus critical for healthcare systems to detect opioid misuse cases. Patients hospitalized for consequences of their opioid misuse present an opportunity for intervention but better screening and surveillance methods are needed to guide providers. The current screening methods with self-report questionnaire data are time-consuming and difficult to perform in …