Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Computer Sciences

2017

Machine learning

Articles 1 - 30 of 67

Full-Text Articles in Physical Sciences and Mathematics

Looping Predictive Method To Improve Accuracy Of A Machine Learning Model, Subramanyam Reddy Pogili Dec 2017

Theses

The topic of this project is an analysis of drug-related tweets. The goal is to build a Machine Learning Model that can distinguish between tweets that indicate drug abuse and other tweets that also contain the name of a drug but do not describe abuse. Drugs can be illegal, such as heroin, or legal drugs with a potential of abuse, such as painkillers. However, building a good Machine Learning Model requires a large amount of training data. For each training tweet, a human expert has determined whether it indicates drug abuse or not. This is difficult work for humans. …


Visual Odometry Using Convolutional Neural Networks, Alec Graves, Steffen Lim, Thomas Fagan, Kevin Mcfall Phd. Dec 2017

The Kennesaw Journal of Undergraduate Research

Visual odometry is the process of tracking an agent's motion over time using a visual sensor. The visual odometry problem has only been recently solved using traditional, non-machine learning techniques. Despite the success of neural networks at many related problems such as object recognition, feature detection, and optical flow, visual odometry still has not been solved with a deep learning technique. This paper attempts to implement several Convolutional Neural Networks to solve the visual odometry problem and compare slight variations in data preprocessing. The work presented is a step toward reaching a legitimate neural network solution.


Uncovering New Links Through Interaction Duration, Laxmi Amulya Gundala Dec 2017

Boise State University Theses and Dissertations

Link Prediction is the problem of inferring new relationships among nodes in a network that can occur in the near future. Classical approaches mainly consider neighborhood structure similarity when linking nodes. However, we may also want to take into account whether the two nodes we are going to link will benefit from that by having an active interaction over time. For instance, it is better to link two nodes u and v if we know that these two nodes will interact in the social network in the future, rather than suggesting a node w, who may never interact with u. Thus, the …


A Test Driven Approach To Develop Web-Based Machine Learning Applications, Armin Esmaeilzadeh Dec 2017

UNLV Theses, Dissertations, Professional Papers, and Capstones

The purpose of this thesis is to propose the design and architecture of a testable, scalable, and efficient web-based application that models and implements machine learning applications in cancer prediction. There are various components that form the architecture of our web-based application including server, database, programming language, web framework, and front-end design. There are also other factors associated with our application such as testability, scalability, performance, and design pattern. Our main focus in this thesis is on the testability of the system while considering the importance of other factors as well.

The data set for our application is a …


Ethics And Bias In Machine Learning: A Technical Study Of What Makes Us “Good”, Ashley Nicole Shadowen Dec 2017

Student Theses

The topic of machine ethics is growing in recognition and energy, but bias in machine learning algorithms outpaces it to date. Bias is a complicated term with good and bad connotations in the field of algorithmic prediction making. Especially in circumstances with legal and ethical consequences, we must study the results of these machines to ensure fairness. This paper attempts to address ethics at the algorithmic level of autonomous machines. There is no one solution to machine bias; it depends on the context of the given system and the most reasonable way to avoid biased decisions while maintaining the …


Deep-Learned Generative Representations Of 3d Shape Families, Haibin Huang Nov 2017

Doctoral Dissertations

Digital representations of 3D shapes are becoming increasingly useful in several emerging applications, such as 3D printing, virtual reality and augmented reality. However, traditional modeling software requires users to have extensive modeling experience, artistic skills and training to handle its complex interfaces and perform the necessary low-level geometric manipulation commands. Thus, there is an emerging need for computer algorithms that help novice and casual users to quickly and easily generate 3D content. In this work, I will present deep learning algorithms that are capable of automatically inferring parametric representations of shape families, which can be used to generate new 3D …


Deep Energy-Based Models For Structured Prediction, David Belanger Nov 2017

Doctoral Dissertations

We introduce structured prediction energy networks (SPENs), a flexible framework for structured prediction. A deep architecture is used to define an energy function over candidate outputs and predictions are produced by gradient-based energy minimization. This deep energy captures dependencies between labels that would lead to intractable graphical models, and allows us to automatically discover discriminative features of the structured output. Furthermore, practitioners can explore a wide variety of energy function architectures without having to hand-design prediction and learning methods for each model. This is because all of our prediction and learning methods interact with the energy …
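
As a rough illustration of "predictions are produced by gradient-based energy minimization", the sketch below minimizes a small hand-written energy over relaxed labels in [0, 1] by projected gradient descent. It is not the SPEN architecture: the dissertation uses a deep network as the energy, whereas the quadratic energy here, the function names (`energy`, `energy_grad`, `predict`), and the parameters (`scores`, `coupling`, `lr`, `steps`) are all assumptions made for the example.

```python
def energy(y, scores, coupling):
    """Toy energy (an assumption, not a learned network).

    Local terms reward turning on high-scoring labels; the pairwise
    term penalizes turning many labels on at the same time.
    """
    local = -sum(s * yi for s, yi in zip(scores, y))
    total = sum(y)
    # coupling * sum over pairs i < j of y_i * y_j
    pairwise = 0.5 * coupling * (total * total - sum(yi * yi for yi in y))
    return local + pairwise


def energy_grad(y, scores, coupling):
    """Analytic gradient of the toy energy with respect to each y_i."""
    total = sum(y)
    return [-s + coupling * (total - yi) for s, yi in zip(scores, y)]


def predict(scores, coupling=1.0, steps=200, lr=0.1):
    """Projected gradient descent on the relaxed labels y in [0, 1]^n."""
    y = [0.5] * len(scores)
    for _ in range(steps):
        grad = energy_grad(y, scores, coupling)
        # Take a gradient step, then clip back into the unit box.
        y = [min(1.0, max(0.0, yi - lr * gi)) for yi, gi in zip(y, grad)]
    return y


if __name__ == "__main__":
    labels = predict([2.0, 0.1, 1.5])
    print(labels, energy(labels, [2.0, 0.1, 1.5], 1.0))
```

The only point carried over from the abstract is that prediction touches the energy solely through its value and gradient, so a different energy function can be swapped in without changing `predict`.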


An Integrated Framework For Modeling And Predicting Spatiotemporal Phenomena In Urban Environments, Tuc Viet Le Nov 2017

Dissertations and Theses Collection (Open Access)

This thesis proposes a general solution framework that integrates methods in machine learning in creative ways to solve a diverse set of problems arising in urban environments. It particularly focuses on modeling spatiotemporal data for the purpose of predicting urban phenomena. Concretely, the framework is applied to solve three specific real-world problems: human mobility prediction, traffic speed prediction and incident prediction. For human mobility prediction, I use visitor trajectories collected at a large theme park in Singapore as a simplified microcosm of an urban area. A trajectory is an ordered sequence of attraction visits and corresponding timestamps produced by a visitor. …


Anomaly Detection For A Water Treatment System Using Unsupervised Machine Learning, Jun Inoue, Yoriyuki Yamagata, Yuqi Chen, Christopher M. Poskitt, Jun Sun Nov 2017

Research Collection School Of Computing and Information Systems

In this paper, we propose and evaluate the application of unsupervised machine learning to anomaly detection for a Cyber-Physical System (CPS). We compare two methods: Deep Neural Networks (DNN) adapted to time series data generated by a CPS, and one-class Support Vector Machines (SVM). These methods are evaluated against data from the Secure Water Treatment (SWaT) testbed, a scaled-down but fully operational raw water purification plant. For both methods, we first train detectors using a log generated by SWaT operating under normal conditions. Then, we evaluate the performance of both methods using a log generated by SWaT operating under 36 …


Scalable Online Kernel Learning, Jing Lu Nov 2017

Dissertations and Theses Collection (Open Access)

One critical deficiency of traditional online kernel learning methods is their increasing and unbounded number of support vectors (SV’s), making them inefficient and non-scalable for large-scale applications. Recent studies on budget online learning have attempted to overcome this shortcoming by bounding the number of SV’s. Despite being extensively studied, budget algorithms usually suffer from several drawbacks.
First of all, although existing algorithms attempt to bound the number of SV’s at each iteration, most of them fail to bound the number of SV’s for the final averaged classifier, which is commonly used for online-to-batch conversion. To solve this problem, we propose …


Evolution Of Bias In Human And Machine Learning Algorithm Interaction, Wenlong Sun, Olfa Nasraoui, Patrick Shafto Oct 2017

Commonwealth Computational Summit

Human algorithm interaction:

  • people are now affected by the output of all types of machine learning algorithms.
  • social media, blogs, social networks, and other services and applications.

Motivation:

  • ML algorithms used to rely on reliable labels from experts to build predictions.
  • However, ML algorithms have started to receive data from the more general population.
  • The interaction leads to biased results, caused by ingesting unchecked information from the general population, such as biased samples and biased labels.


A Comparative Study On Machine Learning Algorithms For Network Defense, Abdinur Ali, Yen-Hung Hu, Chung-Chu (George) Hsieh, Mushtaq Khan Oct 2017

Virginia Journal of Science

Network security specialists use machine learning algorithms to detect computer network attacks and prevent unauthorized access to their networks. Traditionally, signature and anomaly detection techniques have been used for network defense. However, detection techniques must adapt to keep pace with continuously changing security attacks. Because machine learning algorithms learn from experience, they are appropriate tools for this adaptation. In this paper, ten machine learning algorithms were trained with the labeled KDD99 dataset and then tested with a different dataset without labels. The researchers investigate the speed and the efficiency of these machine learning algorithms in terms of several …


Feature Space Augmentation: Improving Prediction Accuracy Of Classical Problems In Cognitive Science And Computer Vison, Piyush Saxena Oct 2017

Dissertations (1934 -)

The prediction accuracy in many classical problems across multiple domains has seen a rise since computational tools such as multi-layer neural nets and complex machine learning algorithms have become widely accessible to the research community. In this research, we take a step back and examine the feature space in two problems from very different domains. We show that novel augmentation to the feature space yields higher performance. Emotion Recognition in Adults from a Control Group: The objective is to quantify the emotional state of an individual at any time using data collected by wearable sensors. We define emotional state as …


Vungle Inc. Improves Monetization Using Big-Data Analytics, Bert De Reyck, Ioannis Fragkos, Yael Gruksha-Cockayne, Casey Lichtendahl, Hammond Guerin, Andre Kritzer Oct 2017

Research Collection Lee Kong Chian School Of Business

The advent of big data has created opportunities for firms to customize their products and services to unprecedented levels of granularity. Using big data to personalize an offering in real time, however, remains a major challenge. In the mobile advertising industry, once a customer enters the network, an ad-serving decision must be made in a matter of milliseconds. In this work, we describe the design and implementation of an ad-serving algorithm that incorporates machine-learning methods to make personalized ad-serving decisions within milliseconds. We developed this algorithm for Vungle Inc., one of the largest global mobile ad networks. Our approach also …


Exploring The Internal Statistics: Single Image Super-Resolution, Completion And Captioning, Yang Xian Sep 2017

Dissertations, Theses, and Capstone Projects

Image enhancement has drawn increasing attention in improving image quality or interpretability. It aims to modify images to achieve a better perception for the human visual system or a more suitable representation for further analysis in a variety of applications such as medical imaging, remote sensing, and video surveillance. Based on different attributes of the given input images, enhancement tasks vary, e.g., noise removal, deblurring, resolution enhancement, prediction of missing pixels, etc. The latter two are usually referred to as image super-resolution and image inpainting (or completion).

Image super-resolution and completion are numerically ill-posed problems. Multi-frame-based approaches make use of the …


Inferring Spread Of Readers’ Emotion Affected By Online News, Agus Sulistya, Ferdian Thung, David Lo Sep 2017

Research Collection School Of Computing and Information Systems

Depending on the reader, a news article may be viewed from many different perspectives, thus triggering different (and possibly contradicting) emotions. In this paper, we formulate a problem of predicting readers’ emotion distribution affected by a news article. Our approach analyzes affective annotations provided by readers of news articles taken from a non-English online news site. We create a new corpus from the annotated articles, and build a domain-specific emotion lexicon and word embedding features. We finally construct a multi-target regression model from a set of features extracted from online news articles. Our experiments show that by combining lexicon and …


Sugarmate: Non-Intrusive Blood Glucose Monitoring With Smartphones, Weixi Gu, Yuxun Zhou, Zimu Zhou, Xi Liu, Han Zou, Pei Zhang, Costas J. Spanos, Lin Zhang Sep 2017

Research Collection School Of Computing and Information Systems

Inferring abnormal glucose events such as hyperglycemia and hypoglycemia is crucial for the health of both diabetic patients and non-diabetic people. However, regular blood glucose monitoring can be invasive and inconvenient in everyday life. We present SugarMate, a first smartphone-based blood glucose inference system as a temporary alternative to continuous blood glucose monitors (CGM) when they are uncomfortable or inconvenient to wear. In addition to the records of food, drug and insulin intake, it leverages smartphone sensors to measure physical activities and sleep quality automatically. Provided with the imbalanced and often limited measurements, a challenge of SugarMate is the inference …


Improving Pure-Tone Audiometry Using Probabilistic Machine Learning Classification, Xinyu Song Aug 2017

McKelvey School of Engineering Theses & Dissertations

Hearing loss is a critical public health concern, affecting hundreds of millions of people worldwide and dramatically impacting quality of life for affected individuals. While treatment techniques have evolved in recent years, methods for assessing hearing ability have remained relatively unchanged for decades. The standard clinical procedure is the modified Hughson-Westlake procedure, an adaptive pure-tone detection task that is typically performed manually by audiologists, costing millions of collective hours annually among healthcare professionals. In addition to the high burden of labor, the technique provides limited detail about an individual’s hearing ability, estimating only detection thresholds at a handful of pre-defined pure-tone …


Effect Of Label Noise On The Machine-Learned Classification Of Earthquake Damage, Jared Frank, Umaa Rebbapragada, James Bialas, Thomas Oommen, Timothy C. Havens Aug 2017

Michigan Tech Publications

Automated classification of earthquake damage in remotely-sensed imagery using machine learning techniques depends on training data, or data examples that are labeled correctly by a human expert as containing damage or not. Mislabeled training data are a major source of classifier error due to the use of imprecise digital labeling tools and crowdsourced volunteers who are not adequately trained on or invested in the task. The spatial nature of remote sensing classification leads to the consistent mislabeling of classes that occur in close proximity to rubble, which is a major byproduct of earthquake damage in urban areas. In this study, …


Applying Machine Learning To Computational Chemistry: Can We Predict Molecular Properties Faster Without Compromising Accuracy?, Hanjing Xu, Pradeep Gurunathan, Lyudmila Slipchenko Aug 2017

The Summer Undergraduate Research Fellowship (SURF) Symposium

Non-covalent interactions are crucial in analyzing protein folding and structure, function of DNA and RNA, structures of molecular crystals and aggregates, and many other processes in the fields of biology and chemistry. However, it is time and resource consuming to calculate such interactions using quantum-mechanical formulations. Our group has proposed previously that the effective fragment potential (EFP) method could serve as an efficient alternative to solve this problem. However, one of the computational bottlenecks of the EFP method is obtaining parameters for each molecule/fragment in the system, before the actual EFP simulations can be carried out. Here we present a …


Machine Learning In Xenon1t Analysis, Dillon A. Davis, Rafael F. Lang, Darryl P. Masson Aug 2017

The Summer Undergraduate Research Fellowship (SURF) Symposium

In the process of analyzing large amounts of quantitative data, it can be quite time consuming and challenging to uncover populations of interest contained amongst the background data. Therefore, the ability to partially automate the process while gaining additional insight into the interdependencies of key parameters via machine learning seems quite appealing. As of now, the primary means of reviewing the data is by manually plotting data in different parameter spaces to recognize key features, which is slow and error prone. In this experiment, many well-known machine learning algorithms were applied to a dataset to attempt to semi-automatically identify known populations, …


Predicting Locations Of Pollution Sources Using Convolutional Neural Networks, Yiheng Chi, Nickolas D. Winovich, Guang Lin Aug 2017

The Summer Undergraduate Research Fellowship (SURF) Symposium

Pollution is a severe problem today, and the main challenge in controlling and eliminating water and air pollution is detecting and locating pollution sources. This research project aims to predict the locations of pollution sources given diffusion information of pollution in the form of array or image data. These predictions are done using machine learning. The relations between time, location, and pollution concentration are first formulated as pollution diffusion equations, which are partial differential equations (PDEs), and then deep convolutional neural networks are built and trained to solve these PDEs. The convolutional neural networks consist of convolutional layers, ReLU layers …


Asymptotically Unbiased Estimation Of A Nonsymmetric Dependence Measure Applied To Sensor Data Analytics And Financial Time Series, Angel Cațaron, Razvan Andonie, Yvonne Chueh Aug 2017

All Faculty Scholarship for the College of the Sciences

A fundamental concept frequently applied to statistical machine learning is the detection of dependencies between unknown random variables found from data samples. In previous work, we have introduced a nonparametric unilateral dependence measure based on Onicescu’s information energy and a kNN method for estimating this measure from an available sample set of discrete or continuous variables. This paper provides the formal proofs which show that the estimator is asymptotically unbiased and has asymptotic zero variance when the sample size increases. It implies that the estimator has good statistical qualities. We investigate the performance of the estimator for data analysis applications …
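
For readers unfamiliar with Onicescu's information energy, the sketch below computes its standard discrete form, the sum of squared probabilities, using empirical frequencies from a sample. This is only the basic quantity that the dependence measure builds on; it is not the paper's kNN estimator or its unilateral dependence measure, and the function name `information_energy` is an assumption made for the example.

```python
from collections import Counter


def information_energy(samples):
    """Plug-in estimate of Onicescu's information energy, sum_k p_k ** 2.

    Probabilities p_k are estimated by the relative frequency of each
    distinct value in `samples` (a simple illustrative estimator).
    """
    n = len(samples)
    if n == 0:
        raise ValueError("samples must not be empty")
    counts = Counter(samples)
    return sum((c / n) ** 2 for c in counts.values())


if __name__ == "__main__":
    print(information_energy(["a", "a", "b", "b"]))  # two equally likely values
    print(information_energy(["x"] * 5))             # a single certain value
```

The quantity is largest (equal to 1) when one value is certain and decreases as the distribution spreads out, which is the opposite direction to Shannon entropy.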


Analyzing The Relationship Between Human Behavior And Indoor Air Quality, Beiyu Lin, Yibo Huangfu, Nathan Lima, Bertram Jobson, Max Kirk, Patrick O’Keeffe, Shelley N. Pressley, Von Walden, Brian Lamb, Diane J. Cook Aug 2017

Computer Science Faculty Publications and Presentations

In the coming decades, as we experience global population growth and global aging issues, there will be corresponding concerns about the quality of the air we experience inside and outside buildings. Because we can anticipate that there will be behavioral changes that accompany population growth and aging, we examine the relationship between home occupant behavior and indoor air quality. To do this, we collect both sensor-based behavior data and chemical indoor air quality measurements in smart home environments. We introduce a novel machine learning-based approach to quantify the correlation between smart home features and chemical measurements of air quality, and …


Dynamic Adversarial Mining - Effectively Applying Machine Learning In Adversarial Non-Stationary Environments., Tegjyot Singh Sethi Aug 2017

Electronic Theses and Dissertations

While understanding of machine learning and data mining is still in its budding stages, their engineering applications have found immense acceptance and success. Cybersecurity applications such as intrusion detection systems, spam filtering, and CAPTCHA authentication have all begun adopting machine learning as a viable technique to deal with large scale adversarial activity. However, the naive usage of machine learning in an adversarial setting is prone to reverse engineering and evasion attacks, as most of these techniques were designed primarily for a static setting. The security domain is a dynamic landscape, with an ongoing, never-ending arms race …


Accurate And Justifiable : New Algorithms For Explainable Recommendations., Behnoush Abdollahi Aug 2017

Electronic Theses and Dissertations

Websites and online services thrive with large amounts of online information, products, and choices that are available but exceedingly difficult to find and discover. This has prompted two major paradigms to help sift through information: information retrieval and recommender systems. The broad family of information retrieval techniques has given rise to the modern search engines which return relevant results, following a user's explicit query. The broad family of recommender systems, on the other hand, works in a more subtle manner, and does not require an explicit query to provide relevant results. Collaborative Filtering (CF) recommender systems are based on algorithms …


Constructing Interactive Visual Classification, Clustering And Dimension Reduction Models For N-D Data, Boris Kovalerchuk, Dmytro Dovhalets Jul 2017

Computer Science Faculty Scholarship

The exploration of multidimensional datasets of all possible sizes and dimensions is a long-standing challenge in knowledge discovery, machine learning, and visualization. While multiple efficient visualization methods for n-D data analysis exist, the loss of information, occlusion, and clutter continue to be a challenge. This paper proposes and explores a new interactive method for visual discovery of n-D relations for supervised learning. The method includes automatic, interactive, and combined algorithms for discovering linear relations, dimension reduction, and generalization for non-linear relations. This method is a special category of reversible General Line Coordinates (GLC). It produces graphs in 2-D that represent …


Classification With Large Sparse Datasets: Convergence Analysis And Scalable Algorithms, Xiang Li Jul 2017

Electronic Thesis and Dissertation Repository

Large and sparse datasets, such as user ratings over a large collection of items, are common in the big data era. Many applications need to classify the users or items based on the high-dimensional and sparse data vectors, e.g., to predict the profitability of a product or the age group of a user, etc. Linear classifiers are popular choices for classifying such datasets because of their efficiency. In order to classify the large sparse data more effectively, the following important questions need to be answered.

1. Sparse data and convergence behavior. How different properties of a dataset, such as …


Identifying Twitter Spam By Utilizing Random Forests, Humza S. Haider Jul 2017

Scholarly Horizons: University of Minnesota, Morris Undergraduate Journal

The use of Twitter has rapidly grown since the first tweet in 2006. The number of spammers on Twitter shows a similar increase. Classifying users into spammers and non-spammers has been heavily researched, and new methods for spam detection are developing rapidly. One of these classification techniques is known as random forests. We examine three studies that employ random forests using user based features, geo-tagged features, and time dependent features. Each study showed high accuracy rates and F-measures with the exception of one model that had a test set with a more realistic proportion of spam relative to typical testing …


Problems In Graph-Structured Modeling And Learning, James Atwood Jul 2017

Doctoral Dissertations

This thesis investigates three problems in graph-structured modeling and learning. We first present a method for efficiently generating large instances from nonlinear preferential attachment models of network structure. This is followed by a description of diffusion-convolutional neural networks, a new model for graph-structured data which is able to outperform probabilistic relational models and kernel-on-graph methods at node classification tasks. We conclude with an optimal privacy-protection method for users of online services that remains effective when users have poor knowledge of an adversary's behavior.