Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Open Access. Powered by Scholars. Published by Universities.®

Articles 1 - 30 of 33

Full-Text Articles in Physical Sciences and Mathematics

Looping Predictive Method To Improve Accuracy Of A Machine Learning Model, Subramanyam Reddy Pogili Dec 2017

Looping Predictive Method To Improve Accuracy Of A Machine Learning Model, Subramanyam Reddy Pogili

Theses

The topic of this project is an analysis of drug-related tweets. The goal is to build a Machine Learning Model that can distinguish between tweets that indicate drug abuse and other tweets that also contain the name of a drug but do not describe abuse. Drugs can be illegal, such as heroin, or legal drugs with a potential of abuse, such as painkillers. However, building a good Machine Learning Model requires a large amount of training data. For each training tweet, a human expert has determined whether it indicates drug abuse or not. This is difficult work for humans. …


Ethics And Bias In Machine Learning: A Technical Study Of What Makes Us “Good”, Ashley Nicole Shadowen Dec 2017

Ethics And Bias In Machine Learning: A Technical Study Of What Makes Us “Good”, Ashley Nicole Shadowen

Student Theses

The topic of machine ethics is growing in recognition and energy, but bias in machine learning algorithms outpaces it to date. Bias is a complicated term with good and bad connotations in the field of algorithmic prediction making. Especially in circumstances with legal and ethical consequences, we must study the results of these machines to ensure fairness. This paper attempts to address ethics at the algorithmic level of autonomous machines. There is no one solution to solving machine bias, it depends on the context of the given system and the most reasonable way to avoid biased decisions while maintaining the …


Uncovering New Links Through Interaction Duration, Laxmi Amulya Gundala Dec 2017

Uncovering New Links Through Interaction Duration, Laxmi Amulya Gundala

Boise State University Theses and Dissertations

Link Prediction is the problem of inferring new relationships among nodes in a network that can occur in the near future. Classical approaches mainly consider neighborhood structure similarity when linking nodes. However, we may also want to take into account whether the two nodes we are going to link will benefit from that by having an active interaction over time. For instance, it is better to link two nodes � and � if we know that these two nodes will interact in the social network in the future, rather than suggesting �, who may never interact with �. Thus, the …


A Test Driven Approach To Develop Web-Based Machine Learning Applications, Armin Esmaeilzadeh Dec 2017

A Test Driven Approach To Develop Web-Based Machine Learning Applications, Armin Esmaeilzadeh

UNLV Theses, Dissertations, Professional Papers, and Capstones

The purpose of this thesis is to propose the design and architecture of a testable, scalable, and ef-cient web-based application that models and implements machine learning applications in cancer prediction. There are various components that form the architecture of our web-based application including server, database, programming language, web framework, and front-end design. There are also other factors associated with our application such as testability, scalability, performance, and design pattern. Our main focus in this thesis is on the testability of the system while consid- ering the importance of other factors as well.

The data set for our application is a …


Deep-Learned Generative Representations Of 3d Shape Families, Haibin Huang Nov 2017

Deep-Learned Generative Representations Of 3d Shape Families, Haibin Huang

Doctoral Dissertations

Digital representations of 3D shapes are becoming increasingly useful in several emerging applications, such as 3D printing, virtual reality and augmented reality. However, traditional modeling softwares require users to have extensive modeling experience, artistic skills and training to handle their complex interfaces and perform the necessary low-level geometric manipulation commands. Thus, there is an emerging need for computer algorithms that help novice and casual users to quickly and easily generate 3D content. In this work, I will present deep learning algorithms that are capable of automatically inferring parametric representations of shape families, which can be used to generate new 3D …


Deep Energy-Based Models For Structured Prediction, David Belanger Nov 2017

Deep Energy-Based Models For Structured Prediction, David Belanger

Doctoral Dissertations

We introduce structured prediction energy networks (SPENs), a flexible frame- work for structured prediction. A deep architecture is used to define an energy func- tion over candidate outputs and predictions are produced by gradient-based energy minimization. This deep energy captures dependencies between labels that would lead to intractable graphical models, and allows us to automatically discover discrim- inative features of the structured output. Furthermore, practitioners can explore a wide variety of energy function architectures without having to hand-design predic- tion and learning methods for each model. This is because all of our prediction and learning methods interact with the energy …


An Integrated Framework For Modeling And Predicting Spatiotemporal Phenomena In Urban Environments, Tuc Viet Le Nov 2017

An Integrated Framework For Modeling And Predicting Spatiotemporal Phenomena In Urban Environments, Tuc Viet Le

Dissertations and Theses Collection (Open Access)

This thesis proposes a general solution framework that integrates methods in machine learning in creative ways to solve a diverse set of problems arising in urban environments. It particularly focuses on modeling spatiotemporal data for the purpose of predicting urban phenomena. Concretely, the framework is applied to solve three specific real-world problems: human mobility prediction, trac speed prediction and incident prediction. For human mobility prediction, I use visitor trajectories collected a large theme park in Singapore as a simplified microcosm of an urban area. A trajectory is an ordered sequence of attraction visits and corresponding timestamps produced by a visitor. …


Scalable Online Kernel Learning, Jing Lu Nov 2017

Scalable Online Kernel Learning, Jing Lu

Dissertations and Theses Collection (Open Access)

One critical deficiency of traditional online kernel learning methods is their increasing and unbounded number of support vectors (SV’s), making them inefficient and non-scalable for large-scale applications. Recent studies on budget online learning have attempted to overcome this shortcoming by bounding the number of SV’s. Despite being extensively studied, budget algorithms usually suffer from several drawbacks.
First of all, although existing algorithms attempt to bound the number of SV’s at each iteration, most of them fail to bound the number of SV’s for the final averaged classifier, which is commonly used for online-to-batch conversion. To solve this problem, we propose …


Feature Space Augmentation: Improving Prediction Accuracy Of Classical Problems In Cognitive Science And Computer Vison, Piyush Saxena Oct 2017

Feature Space Augmentation: Improving Prediction Accuracy Of Classical Problems In Cognitive Science And Computer Vison, Piyush Saxena

Dissertations (1934 -)

The prediction accuracy in many classical problems across multiple domains has seen a rise since computational tools such as multi-layer neural nets and complex machine learning algorithms have become widely accessible to the research community. In this research, we take a step back and examine the feature space in two problems from very different domains. We show that novel augmentation to the feature space yields higher performance. Emotion Recognition in Adults from a Control Group: The objective is to quantify the emotional state of an individual at any time using data collected by wearable sensors. We define emotional state as …


Exploring The Internal Statistics: Single Image Super-Resolution, Completion And Captioning, Yang Xian Sep 2017

Exploring The Internal Statistics: Single Image Super-Resolution, Completion And Captioning, Yang Xian

Dissertations, Theses, and Capstone Projects

Image enhancement has drawn increasingly attention in improving image quality or interpretability. It aims to modify images to achieve a better perception for human visual system or a more suitable representation for further analysis in a variety of applications such as medical imaging, remote sensing, and video surveillance. Based on different attributes of the given input images, enhancement tasks vary, e.g., noise removal, deblurring, resolution enhancement, prediction of missing pixels, etc. The latter two are usually referred to as image super-resolution and image inpainting (or completion).

Image super-resolution and completion are numerically ill-posed problems. Multi-frame-based approaches make use of the …


Improving Pure-Tone Audiometry Using Probabilistic Machine Learning Classification, Xinyu Song Aug 2017

Improving Pure-Tone Audiometry Using Probabilistic Machine Learning Classification, Xinyu Song

McKelvey School of Engineering Theses & Dissertations

Hearing loss is a critical public health concern, affecting hundreds millions of people worldwide and dramatically impacting quality of life for affected individuals. While treatment techniques have evolved in recent years, methods for assessing hearing ability have remained relatively unchanged for decades. The standard clinical procedure is the modified Hughson-Westlake procedure, an adaptive pure-tone detection task that is typically performed manually by audiologists, costing millions of collective hours annually among healthcare professionals. In addition to the high burden of labor, the technique provides limited detail about an individual’s hearing ability, estimating only detection thresholds at a handful of pre-defined pure-tone …


Accurate And Justifiable : New Algorithms For Explainable Recommendations., Behnoush Abdollahi Aug 2017

Accurate And Justifiable : New Algorithms For Explainable Recommendations., Behnoush Abdollahi

Electronic Theses and Dissertations

Websites and online services thrive with large amounts of online information, products, and choices, that are available but exceedingly difficult to find and discover. This has prompted two major paradigms to help sift through information: information retrieval and recommender systems. The broad family of information retrieval techniques has given rise to the modern search engines which return relevant results, following a user's explicit query. The broad family of recommender systems, on the other hand, works in a more subtle manner, and do not require an explicit query to provide relevant results. Collaborative Filtering (CF) recommender systems are based on algorithms …


Dynamic Adversarial Mining - Effectively Applying Machine Learning In Adversarial Non-Stationary Environments., Tegjyot Singh Sethi Aug 2017

Dynamic Adversarial Mining - Effectively Applying Machine Learning In Adversarial Non-Stationary Environments., Tegjyot Singh Sethi

Electronic Theses and Dissertations

While understanding of machine learning and data mining is still in its budding stages, the engineering applications of the same has found immense acceptance and success. Cybersecurity applications such as intrusion detection systems, spam filtering, and CAPTCHA authentication, have all begun adopting machine learning as a viable technique to deal with large scale adversarial activity. However, the naive usage of machine learning in an adversarial setting is prone to reverse engineering and evasion attacks, as most of these techniques were designed primarily for a static setting. The security domain is a dynamic landscape, with an ongoing never ending arms race …


Classification With Large Sparse Datasets: Convergence Analysis And Scalable Algorithms, Xiang Li Jul 2017

Classification With Large Sparse Datasets: Convergence Analysis And Scalable Algorithms, Xiang Li

Electronic Thesis and Dissertation Repository

Large and sparse datasets, such as user ratings over a large collection of items, are common in the big data era. Many applications need to classify the users or items based on the high-dimensional and sparse data vectors, e.g., to predict the profitability of a product or the age group of a user, etc. Linear classifiers are popular choices for classifying such datasets because of their efficiency. In order to classify the large sparse data more effectively, the following important questions need to be answered.

1. Sparse data and convergence behavior. How different properties of a dataset, such as …


Problems In Graph-Structured Modeling And Learning, James Atwood Jul 2017

Problems In Graph-Structured Modeling And Learning, James Atwood

Doctoral Dissertations

This thesis investigates three problems in graph-structured modeling and learning. We first present a method for efficiently generating large instances from nonlinear preferential attachment models of network structure. This is followed by a description of diffusion-convolutional neural networks, a new model for graph-structured data which is able to outperform probabilistic relational models and kernel-on-graph methods at node classification tasks. We conclude with an optimal privacy-protection method for users of online services that remains effective when users have poor knowledge of an adversary's behavior.


Speech Based Machine Learning Models For Emotional State Recognition And Ptsd Detection, Debrup Banerjee Jul 2017

Speech Based Machine Learning Models For Emotional State Recognition And Ptsd Detection, Debrup Banerjee

Electrical & Computer Engineering Theses & Dissertations

Recognition of emotional state and diagnosis of trauma related illnesses such as posttraumatic stress disorder (PTSD) using speech signals have been active research topics over the past decade. A typical emotion recognition system consists of three components: speech segmentation, feature extraction and emotion identification. Various speech features have been developed for emotional state recognition which can be divided into three categories, namely, excitation, vocal tract and prosodic. However, the capabilities of different feature categories and advanced machine learning techniques have not been fully explored for emotion recognition and PTSD diagnosis. For PTSD assessment, clinical diagnosis through structured interviews is a …


Solving Algorithmic Problems In Finitely Presented Groups Via Machine Learning, Jonathan Gryak Jun 2017

Solving Algorithmic Problems In Finitely Presented Groups Via Machine Learning, Jonathan Gryak

Dissertations, Theses, and Capstone Projects

Machine learning and pattern recognition techniques have been successfully applied to algorithmic problems in free groups. In this dissertation, we seek to extend these techniques to finitely presented non-free groups, in particular to polycyclic and metabelian groups that are of interest to non-commutative cryptography.

As a prototypical example, we utilize supervised learning methods to construct classifiers that can solve the conjugacy decision problem, i.e., determine whether or not a pair of elements from a specified group are conjugate. The accuracies of classifiers created using decision trees, random forests, and N-tuple neural network models are evaluated for several non-free groups. …


The Ogcleaner: Detecting False-Positive Sequence Homology, Masaki Stanley Fujimoto Jun 2017

The Ogcleaner: Detecting False-Positive Sequence Homology, Masaki Stanley Fujimoto

Theses and Dissertations

Within bioinformatics, phylogenetics is the study of the evolutionary relationships between different species and organisms. The genetic revolution has caused an explosion in the amount of raw genomic information that is available to scientists for study. While there has been an explosion in available data, analysis methods have lagged behind. A key task in phylogenetics is identifying homology clusters. Current methods rely on using heuristics based on pairwise sequence comparison to identify homology clusters. We propose the Orthology Group Cleaner (the OGCleaner) as a method to evaluate cluster level verification of putative homology clusters in order to create higher quality …


Mining Frequency Of Drug Side Effects Over A Large Twitter Dataset Using Apache Spark, Dennis Hsu May 2017

Mining Frequency Of Drug Side Effects Over A Large Twitter Dataset Using Apache Spark, Dennis Hsu

Master's Projects

Despite clinical trials by pharmaceutical companies as well as current FDA reporting systems, there are still drug side effects that have not been caught. To find a larger sample of reports, a possible way is to mine online social media. With its current widespread use, social media such as Twitter has given rise to massive amounts of data, which can be used as reports for drug side effects. To process these large datasets, Apache Spark has become popular for fast, distributed batch processing. In this work, we have improved on previous pipelines in sentimental analysis-based mining, processing, and extracting tweets …


Image Spam Detection, Aneri Chavda May 2017

Image Spam Detection, Aneri Chavda

Master's Projects

Email is one of the most common forms of digital communication. Spam can be de ned as unsolicited bulk email, while image spam includes spam text embedded inside images. Image spam is used by spammers so as to evade text-based spam lters and hence it poses a threat to email based communication. In this research, we analyze image spam detection methods based on various combinations of image processing and machine learning techniques.


Aspect Discovery From Product Reviews, Ying Ding May 2017

Aspect Discovery From Product Reviews, Ying Ding

Dissertations and Theses Collection

With the rapid development of online shopping sites and social media, product reviews are accumulating. These reviews contain information that is valuable to both businesses and customers. To businesses, companies can easily get a large number of feedback of their products, which is difficult to achieve by doing customer survey in the traditional way. To customers, they can know the products they are interested in better by reading reviews, which may be uneasy without online reviews. However, the accumulation has caused consuming all reviews impossible. It is necessary to develop automated techniques to efficiently process them. One of the most …


Detecting Malicious Campaigns In Crowdsourcing Platforms, Hongkyu Choi May 2017

Detecting Malicious Campaigns In Crowdsourcing Platforms, Hongkyu Choi

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

Crowdsourcing sites such as Mechanical Turk and Crowdflower provide a marketplace where requesters create tasks and recruit workers, who may perform certain tasks in order to get financial compensation. Anyone in the world can be a requester and/or a worker as long as he/she has the Internet connection. Crowdsourcing creates a new way to solve various tasks by using “human computation power”. However, crowdsourcing has been misused by malicious requesters and unethical workers for account generation, search engine optimization, content and link generation, ad posting and spam mailing, and social network linking. It creates new threats to the Web system. …


Using A Multi Variate Pattern Analysis (Mvpa) Approach To Decode Fmri Responses To Fear And Anxiety., Sajjad Torabian Esfahani May 2017

Using A Multi Variate Pattern Analysis (Mvpa) Approach To Decode Fmri Responses To Fear And Anxiety., Sajjad Torabian Esfahani

Electronic Theses and Dissertations

This study analyzed fMRI responses to fear and anxiety using a Multi Variate Pattern Analysis (MVPA) approach. Compared to conventional univariate methods which only represent regions of activation, MVPA provides us with more detailed patterns of voxels. We successfully found different patterns for fear and anxiety through separate classification attempts in each subject’s representational space. Further, we transformed all the individual models into a standard space to do group analysis. Results showed that subjects share a more common fear response. Also, the amygdala and hippocampus areas are more important for differentiating fear than anxiety.


Improving Long Term Stock Market Prediction With Text Analysis, Tanner A. Bohn Apr 2017

Improving Long Term Stock Market Prediction With Text Analysis, Tanner A. Bohn

Electronic Thesis and Dissertation Repository

The task of forecasting stock performance is well studied with clear monetary motivations for those wishing to invest. A large amount of research in the area of stock performance prediction has already been done, and multiple existing results have shown that data derived from textual sources related to the stock market can be successfully used towards forecasting. These existing approaches have mostly focused on short term forecasting, used relatively simple sentiment analysis techniques, or had little data available. In this thesis, we prepare over ten years worth of stock data and propose a solution which combines features from textual yearly …


Investigating Citation Linkage Between Research Articles, Kokou Hospice Houngbo Apr 2017

Investigating Citation Linkage Between Research Articles, Kokou Hospice Houngbo

Electronic Thesis and Dissertation Repository

In recent years, there has been a dramatic increase in scientific publications across the globe. To help navigate this overabundance of information, methods have been devised to find papers with related content, but they are lacking in the ability to provide specific information that a researcher may need without having to read hundreds of linked papers. The search and browsing capabilities of online domain specific scientific repositories are limited to finding a paper citing other papers, but do not point to the specific text that is being cited. Providing this capability to the research community will be beneficial in terms …


Development And Evaluation Of Machine Learning Algorithms For Biomedical Applications, Turki Talal Turki Apr 2017

Development And Evaluation Of Machine Learning Algorithms For Biomedical Applications, Turki Talal Turki

Dissertations

Gene network inference and drug response prediction are two important problems in computational biomedicine. The former helps scientists better understand the functional elements and regulatory circuits of cells. The latter helps a physician gain full understanding of the effective treatment on patients. Both problems have been widely studied, though current solutions are far from perfect. More research is needed to improve the accuracy of existing approaches.

This dissertation develops machine learning and data mining algorithms, and applies these algorithms to solve the two important biomedical problems. Specifically, to tackle the gene network inference problem, the dissertation proposes (i) new techniques …


Viewability Prediction For Display Advertising, Chong Wang Apr 2017

Viewability Prediction For Display Advertising, Chong Wang

Dissertations

As a massive industry, display advertising delivers advertisers’ marketing messages to attract customers through graphic banners on webpages. Display advertising is also the most essential revenue source of online publishers. Currently, advertisers are charged by user response or ad serving. However, recent studies show that users barely click or convert display ads. Moreover, about half of the ads are actually never seen by users. In this case, advertisers cannot enhance their brand awareness and increase return on investment. Publishers also lose much revenue. Therefore, the ad pricing standards are shifting to a new model: ad impressions are paid if they …


Using Machine Learning To Predict Chemotherapy Response In Cell Lines And Patients Based On Genetic Expression, Dimo Angelov Mar 2017

Using Machine Learning To Predict Chemotherapy Response In Cell Lines And Patients Based On Genetic Expression, Dimo Angelov

Electronic Thesis and Dissertation Repository

The goal of this thesis was to examine different machine learning techniques for predicting chemotherapy response in cell lines and patients based on genetic expression. After trying regression, multi-class classification techniques and binary classification it was concluded that binary classification was the best method for training models due to the limited size of available cell line data. We found support vector machine classifiers trained on cell line data were easier to use and produced better results compared to neural networks. Sequential backward feature selection was able to select genes for the models that produced good results, however the greedy algorithm …


Malware Detection Using The Index Of Coincidence, Bhavna Gurnani Jan 2017

Malware Detection Using The Index Of Coincidence, Bhavna Gurnani

Master's Projects

In this research, we apply the Index of Coincidence (IC) to problems in malware analysis. The IC, which is often used in cryptanalysis of classic ciphers, is a technique for measuring the repeat rate in a string of symbols. A score based on the IC is applied to a variety of challenging malware families. We nd that this relatively simple IC score performs surprisingly well, with superior results in comparison to various machine learning based scores, at least in some cases.


Towards A Relative-Pitch Neural Network System For Chorale Composition And Harmonization, Samuel P. Goree Jan 2017

Towards A Relative-Pitch Neural Network System For Chorale Composition And Harmonization, Samuel P. Goree

Honors Papers

Computational creativity researchers interested in applying machine learning to computer composition often use the music of J.S. Bach to train their systems. Working with Bach, though, requires grappling with the conventions of tonal music, which can be difficult for computer systems to learn. In this paper, we propose and implement an alternate approach to composition and harmonization of chorales based on pitch-relative note encodings to avoid tonality altogether. We then evaluate our approach using a survey and expert analysis, and find that pitch-relative encodings do not significantly affect human-comparability, likability or creativity. However, an extension of this model that better …