Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

2015

Machine Learning

Articles 1 - 20 of 20

Full-Text Articles in Physical Sciences and Mathematics

Email Similarity Matching And Automatic Reply Generation Using Statistical Topic Modeling And Machine Learning, Zachery L. Schiller Dec 2015

Electronic Theses and Dissertations

Responding to email is a time-consuming task that is a requirement for most professions. Many people find themselves answering the same questions over and over, repeatedly replying with answers they have written previously either in whole or in part. In this thesis, the Automatic Mail Reply (AMR) system is implemented to help with repeated email response creation. The system uses past email interactions and, through unsupervised statistical learning, attempts to recover relevant information to give to the user to assist in writing their reply.

Three statistical learning models, term frequency-inverse document frequency (tf-idf), Latent Semantic Analysis (LSA), and Latent Dirichlet …


Optical Spectroscopy And Chemometrics For Discrimination Of Dyed Textile Fibers And Magnetic Audio Tapes, Nathan C. Fuenffinger Dec 2015

Theses and Dissertations

This dissertation focuses on the application of both novel and standard chemometric approaches toward societal problems of interest in the areas of forensic science and cultural heritage preservation. Microspectrophotometry (MSP), a technique enabling measurements of absorption of electromagnetic radiation by microscopic materials in the ultraviolet-visible (UV-Vis) region, is widely used by forensic examiners for comparisons of metameric textile fibers. These comparisons are often hindered, however, by the raw or normalized spectra showing little detail or having few points of comparison. Derivative preprocessing can enhance structure in some instances. We have demonstrated through the use of multivariate statistics that derivatives are …


Neuron Clustering For Mitigating Catastrophic Forgetting In Supervised And Reinforcement Learning, Benjamin Frederick Goodrich Dec 2015

Doctoral Dissertations

Neural networks have had many great successes in recent years, particularly with the advent of deep learning and many novel training techniques. One issue that has affected neural networks and prevented them from performing well in more realistic online environments is that of catastrophic forgetting. Catastrophic forgetting affects supervised learning systems when input samples are temporally correlated or are non-stationary. However, most real-world problems are non-stationary in nature, resulting in prolonged periods of time separating inputs drawn from different regions of the input space.

Reinforcement learning represents a worst-case scenario when it comes to precipitating catastrophic forgetting in neural networks. …


Predicting Intraday Financial Market Dynamics Using Takens' Vectors; Incorporating Causality Testing And Machine Learning Techniques, Abubakar-Sadiq Bouda Abdulai Dec 2015

Electronic Theses and Dissertations

Traditional approaches to predicting financial market dynamics tend to be linear and stationary, whereas financial time series data is increasingly nonlinear and non-stationary. Lately, advances in dynamical systems theory have enabled the extraction of complex dynamics from time series data. These developments include the theory of time delay embedding and phase space reconstruction of dynamical systems from a scalar time series. In this thesis, a time delay embedding approach for predicting intraday stock or stock index movement is developed. The approach combines methods of nonlinear time series analysis with those of causality testing, dynamical systems theory, and machine learning (artificial …
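As a rough illustration of the time delay embedding mentioned in this abstract, the sketch below builds delay vectors from a scalar series. It is not the thesis's implementation: the function name `delay_embed`, the parameters `dim` and `tau`, and the sample values are all assumptions chosen for illustration.

```python
def delay_embed(series, dim, tau):
    """Build delay vectors [x[t], x[t + tau], ..., x[t + (dim - 1) * tau]].

    `dim` (embedding dimension) and `tau` (delay in samples) are
    hypothetical parameter names; the abstract does not say how the
    thesis chooses them.
    """
    if dim < 1 or tau < 1:
        raise ValueError("dim and tau must be positive integers")
    span = (dim - 1) * tau            # samples covered by one delay vector
    n_vectors = len(series) - span    # number of complete vectors available
    if n_vectors <= 0:
        raise ValueError("series too short for this dim and tau")
    return [
        [series[t + k * tau] for k in range(dim)]
        for t in range(n_vectors)
    ]


# Made-up values standing in for an intraday price series.
prices = [10.0, 10.2, 10.1, 10.4, 10.3, 10.6]
vectors = delay_embed(prices, dim=3, tau=2)
```

Each row of `vectors` is one reconstructed phase-space point; such rows could then serve as input features for a learning model.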


Ensemble Learning Method On Machine Maintenance Data, Xiaochuang Zhao Nov 2015

USF Tampa Graduate Theses and Dissertations

Many companies in industry are facing an explosion of big data. With so much information stored, companies want to make sense of the data and use it to support better decision making, especially for future prediction. A lot of money can be saved and substantial revenue can be generated with the power of big data. When building statistical learning models for prediction, companies aim for models that are both efficient and highly accurate. After the learning models have been deployed to production, new data will be generated. With the updated data, the …


Sudden Cardiac Arrest Prediction Through Heart Rate Variability Analysis, Luke Joseph Plewa Jun 2015

Master's Theses

The increasing popularity of wearable technologies (e.g., Apple Watch and Microsoft Band) has opened the door for an Internet of Things solution to healthcare. One of the most prevalent healthcare problems today is the poor survival rate of out-of-hospital sudden cardiac arrests (9.5% of 360,000 cases in the USA in 2013). It has been proven that heart-rate-derived features can give an early indicator of sudden cardiac arrest, and that providing an early warning has the potential to save many lives. Many of these new wearable devices are capable of providing this warning through their heart rate …


Application Of Machine Learning To Mapping And Simulating Gene Regulatory Networks, Hien-Haw Liow May 2015

Arts & Sciences Electronic Theses and Dissertations

This dissertation explores, proposes, and examines methods of applying modern machine learning and Bayesian statistics in the quantitative and qualitative modeling of gene regulatory networks using high-throughput gene expression data. A semi-parametric Bayesian model based on random forest is developed to infer quantitative aspects of gene regulation relations; a parametric model is developed to predict gene expression levels solely from genotype information. Simulation of network behavior is shown to complement regression analysis greatly in capturing the dynamics of gene regulatory networks. Finally, as an application and extension of novel approaches in gene expression analysis, new methods of discovering topological structure of gene …


Modeling Visual Features To Recognize Biological Motion: A Developmental Approach, Giulio Sandini, Nicoletta Noceti, Alessia Vignolo, Alessandra Sciutti, Francesco Rea, Alessandro Verri, Francesca Odone May 2015

MODVIS Workshop

In this work we deal with the problem of designing and developing computational vision models, comparable to the early stages of human development, using coarse low-level information.

More specifically, we consider a binary classification setting to characterize biological movements with respect to non-biological dynamic events. To this end, our model builds on optical flow estimation and abstracts the representation to simulate the limited amount of visual information available at birth. We take inspiration from known biological motion regularities explained by the Two-Thirds Power Law, and design a motion representation that includes different low-level features, …


Hybrid Agent Based Simulation With Adaptive Learning Of Travel Mode Choices For University Commuters (Wip), Nagesh Shukla, Albert Munoz, Jun Ma, Nam Huynh Apr 2015

Nagesh Shukla

This paper presents a methodology for developing a hybrid agent-based micro-simulation model to capture the impacts of commuter travel mode choices on a university campus transport network. The proposed methodology involves: (i) developing a realistic population of commuter agents (students and staff); (ii) assigning activity lists and travel mode choices to agents using a machine learning method; and (iii) traffic micro-simulation of the study area transport network. This furthers the understanding of current transport modal distributions, factors affecting travel mode choice decisions, and network performance through a number of hypothetical travel scenarios.


Geological Object Recognition In Extraterrestrial Environments, Gregory M. Elfers Apr 2015

Electronic Thesis and Dissertation Repository

On July 4, 1997, the landing of NASA’s Pathfinder probe and its rover Sojourner marked the beginning of a new era in space exploration; robots with the ability to move have made up the vanguard of human extraterrestrial exploration ever since. With Sojourner’s landing, for the first time, a ground-traversing robot was at a distance too far from Earth to make direct human control practical. This has given rise to the development of autonomous systems to improve the efficiency of these robots, in both their ability to move and their ability to make decisions regarding their environment. Computer Vision comprises a …


Evaluating Defect Prediction Using A Massive Set Of Metrics, Xiao Xuan, David Lo, Xin Xia, Yuan Tian Apr 2015

Research Collection School Of Computing and Information Systems

To evaluate the performance of a within-project defect prediction approach, people normally use precision, recall, and F-measure scores. However, the machine learning literature offers a large number of metrics for evaluating the performance of an algorithm (e.g., Matthews Correlation Coefficient and G-means), and these metrics evaluate an approach from different aspects. In this paper, we investigate the performance of within-project defect prediction approaches on a large number of evaluation metrics. We choose 6 state-of-the-art approaches, including naive Bayes, decision tree, logistic regression, kNN, random forest, and Bayesian network, which are widely used in defect prediction literature. And we …
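For readers unfamiliar with the metrics named in this abstract, the following sketch computes several of them from a binary confusion matrix. It is illustrative only and not taken from the paper: the function name `defect_metrics` and the counts are hypothetical, and G-mean is assumed here to be the geometric mean of recall and specificity, a definition the abstract does not state.

```python
import math


def defect_metrics(tp, fp, tn, fn):
    """Compute common binary-classification metrics from confusion counts.

    tp/fp/tn/fn are true/false positives/negatives, with "defective"
    treated as the positive class (an assumption for this sketch).
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    f_measure = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    # Assumed definition: geometric mean of recall and specificity.
    g_mean = math.sqrt(recall * specificity)
    # Matthews Correlation Coefficient; defined as 0 when any margin is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "g_mean": g_mean,
        "mcc": mcc,
    }


# Made-up counts for a hypothetical defect predictor.
scores = defect_metrics(tp=30, fp=10, tn=50, fn=10)
```

Because each metric weights the four counts differently, two predictors can rank differently under precision and recall than under MCC or G-mean.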


Machine Learning For Predicting Soil Classes In Three Semi-Arid Landscapes, Colby W. Brungard, Janis L. Boettinger, Michael C. Duniway, Skye A. Wills, Thomas C. Edwards Jr. Feb 2015

Plants, Soils, and Climate Faculty Publications

Mapping the spatial distribution of soil taxonomic classes is important for informing soil use and management decisions. Digital soil mapping (DSM) can quantitatively predict the spatial distribution of soil taxonomic classes. Key components of DSM are the method and the set of environmental covariates used to predict soil classes. Machine learning is a general term for a broad set of statistical modeling techniques. Many different machine learning models have been applied in the literature and there are different approaches for selecting covariates for DSM. However, there is little guidance as to which, if any, machine learning model and covariate set …


Effective Auto Encoder For Unsupervised Sparse Representation, Faria Mahnaz Jan 2015

Wayne State University Theses

High dimensionality and the sheer size of unlabeled data available today demand new development in unsupervised learning of sparse representation. Despite recent advances in representation learning, most of the current methods are limited when dealing with large-scale unlabeled data. In this study, we propose a new unsupervised method that is able to learn sparse representation from unlabeled data efficiently. We derive a closed-form solution based on the sequential minimal optimization (SMO) for training an auto encoder-decoder module, which efficiently extracts sparse and compact features from data sets of various sizes. The inference process in the proposed learning …


Unsupervised Learning And Image Classification In High Performance Computing Cluster, Itauma Itauma Jan 2015

Wayne State University Theses

Feature learning and object classification in machine learning have become very active research areas in recent decades. Identifying good features benefits object classification by reducing the computational cost and increasing the classification accuracy. In addition, many research studies have focused on the use of Graphics Processing Units (GPUs) to improve the training time for machine learning algorithms. In this study, the use of an alternative platform, called High Performance Computing Cluster (HPCC), is proposed to handle unsupervised feature learning and image and speech classification and to reduce the computational cost.

HPCC is a Big Data processing and …


Lexical Mechanics: Partitions, Mixtures, And Context, Jake Ryland Williams Jan 2015

Graduate College Dissertations and Theses

Highly structured for efficient communication, natural languages are complex systems. Unlike in their computational cousins, functions and meanings in natural languages are relative, frequently prescribed to symbols through unexpected social processes. Despite grammar and definition, the presence of metaphor can leave unwitting language users "in the dark," so to speak. This is not problematic, but rather an important operational feature of languages, since the lifting of meaning onto higher-order structures allows individuals to compress descriptions of regularly-conveyed information. This compressed terminology, often only appropriate when taken locally (in context), is beneficial in an enormous world of novel experience. However, what …


Graph-Based Regularization In Machine Learning: Discovering Driver Modules In Biological Networks, Xi Gao Jan 2015

Theses and Dissertations

Human curiosity drives us to explore the origins of what makes each of us different. From ancient legends and mythology, through Mendel's laws and the Punnett square, to modern genetic research, we carry on this old but eternal question. Thanks to the technological revolution, today's scientists try to answer this question using easily measurable gene expression and other profiling data. However, the exploration can easily get lost in data of growing volume, dimension, noise, and complexity. This dissertation is aimed at developing new machine learning methods that take data from different classes as input, augment them with knowledge of feature relationships, …


Geographic Relevance For Travel Search: The 2014-2015 Harvey Mudd College Clinic Project For Expedia, Inc., Hannah Long Jan 2015

Scripps Senior Theses

The purpose of this Clinic project is to help Expedia, Inc. expand the search capabilities it offers to its users. In particular, the goal is to help the company respond to unconstrained search queries by generating a method to associate hotels and regions around the world with the higher-level attributes that describe them, such as “family-friendly” or “culturally-rich.” Our team utilized machine-learning algorithms to extract metadata from textual data about hotels and cities. We focused on two machine-learning models: decision trees and Latent Dirichlet Allocation (LDA). The first appeared to be a promising approach, but would require more resources …


Novel Classification Of Slow Movement Objects In Urban Traffic Environments Using Wideband Pulse Doppler Radar, Berta Rodriguez Hervas Jan 2015

Open Access Theses & Dissertations

Every year thousands of people are involved in traffic accidents, some of which are fatal. An important percentage of these fatalities are caused by human error, which could be prevented by increasing the awareness of drivers and the autonomy of vehicles. Since driver assistance systems have the potential to positively impact tens of millions of people, the purpose of this research is to study the micro-Doppler characteristics of vulnerable urban traffic components, i.e. pedestrians and bicyclists, based on information obtained from radar backscatter, and to develop a classification technique that allows automatic target recognition with a vehicle integrated system. For …


Features For Ranking Tweets Based On Credibility And Newsworthiness, Jacob W. Ross Jan 2015

Browse all Theses and Dissertations

We create a robust and general feature set for learning-to-rank algorithms that rank tweets based on credibility and newsworthiness. Previous work has demonstrated that when the training and testing data are from two distinct time periods, the ranker performs poorly. We improve upon previous work by creating a feature set that does not overfit a particular year or set of topics. This is critical because the way people use social media changes over time, and the topics discussed vary. In addition, we are constantly gaining new tweet data. Thus, it is important to be …


Contrast Pattern Aided Regression And Classification, Vahid Taslimitehrani Jan 2015

Browse all Theses and Dissertations

Regression and classification techniques play an essential role in many data mining tasks and have broad applications. However, most state-of-the-art regression and classification techniques are unable to adequately model the interactions among predictor variables in highly heterogeneous datasets. New techniques that can effectively model such complex and heterogeneous structures are needed to significantly improve prediction accuracy. In this dissertation, we propose a novel type of accurate and interpretable regression and classification models, named Pattern Aided Regression (PXR) and Pattern Aided Classification (PXC), respectively. Both PXR and PXC rely on identifying regions in the data space where …