Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Machine Learning

2013

Articles 1 - 17 of 17

Full-Text Articles in Physical Sciences and Mathematics

A Topics Analysis Model For Health Insurance Claims, Jared Anthony Webb Oct 2013

Theses and Dissertations

Mathematical probability has a rich theory and powerful applications. Of particular note is the Markov chain Monte Carlo (MCMC) method for sampling from high-dimensional distributions that may not admit a naive analysis. We develop the theory of the MCMC method from first principles and prove its relevance. We also define a Bayesian hierarchical model for generating data. By understanding how data are generated we may infer hidden structure about these models. We use a specific MCMC method called a Gibbs sampler to discover topic distributions in a hierarchical Bayesian model called Topics Over Time. We propose an innovative use …
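The Gibbs sampler named in this abstract can be illustrated with a minimal sketch. The abstract gives no details of the Topics Over Time model, so this example does not implement it. It samples from a toy target chosen here as an assumption: a standard bivariate normal with correlation `rho`, where each coordinate is redrawn in turn from its conditional distribution given the other. The function name and all parameter values are hypothetical.

```python
import math
import random


def gibbs_bivariate_normal(rho, n_samples, burn_in, rng):
    """Gibbs sampling for a standard bivariate normal with correlation rho.

    Toy target (an assumption for illustration): for this distribution,
    x | y ~ N(rho * y, 1 - rho^2) and y | x ~ N(rho * x, 1 - rho^2).
    """
    x, y = 0.0, 0.0
    conditional_sd = math.sqrt(1.0 - rho * rho)
    samples = []
    for step in range(burn_in + n_samples):
        # Redraw each coordinate from its conditional given the other.
        x = rng.gauss(rho * y, conditional_sd)
        y = rng.gauss(rho * x, conditional_sd)
        if step >= burn_in:  # discard the early, not-yet-mixed draws
            samples.append((x, y))
    return samples


def pearson_correlation(pairs):
    """Sample Pearson correlation of a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(0)
samples = gibbs_bivariate_normal(rho=0.8, n_samples=20000, burn_in=1000, rng=rng)
mean_x = sum(x for x, _ in samples) / len(samples)
estimated_rho = pearson_correlation(samples)
```

With enough draws, `estimated_rho` should land close to the chosen `rho`, which gives a simple check that the chain is sampling the intended joint distribution.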


Will We Connect Again? Machine Learning For Link Prediction In Mobile Social Networks, Ole J. Mengshoel, Raj Desai, Andrew Chen, Brian Tran Jul 2013

Ole J Mengshoel

In this paper we examine link prediction for two types of data sets with mobility data, namely call data records (from the MIT Reality Mining project) and location-based social networking data (from the companies Gowalla and Brightkite). These data sets contain location information, which we incorporate in the features used for prediction. We also examine different strategies for data cleaning, in particular thresholding based on the amount of social interaction. We investigate the machine learning algorithms Decision Tree, Naïve Bayes, Support Vector Machine, and Logistic Regression. Generally, we find that our feature selection and filtering of the data sets have …


Optimizing Parallel Belief Propagation In Junction Trees Using Regression, Lu Zheng, Ole J. Mengshoel Jul 2013

Ole J Mengshoel

The junction tree approach, with applications in artificial intelligence, computer vision, machine learning, and statistics, is often used for computing posterior distributions in probabilistic graphical models. One of the key challenges associated with junction trees is computational, and several parallel computing technologies, including many-core processors, have been investigated to meet this challenge. Many-core processors (including GPUs) are now programmable; unfortunately, their complexity makes it hard to manually tune their parameters in order to optimize software performance. In this paper, we investigate a machine learning approach to minimize the execution time of parallel junction tree algorithms implemented on a …


Assessment And Prediction Of Cardiovascular Status During Cardiac Arrest Through Machine Learning And Dynamical Time-Series Analysis, Sharad Shandilya Jul 2013

Theses and Dissertations

In this work, new methods of feature extraction, feature selection, stochastic data characterization/modeling, variance reduction and measures for parametric discrimination are proposed. These methods have implications for data mining, machine learning, and information theory. A novel decision-support system is developed in order to guide intervention during cardiac arrest. The models are built upon knowledge extracted with signal-processing, non-linear dynamic and machine-learning methods. The proposed ECG characterization, combined with information extracted from PetCO2 signals, shows viability for decision-support in clinical settings. The approach, which focuses on integration of multiple features through machine learning techniques, is well suited to the inclusion of multiple physiologic …


Latent Topic Analysis For Predicting Group Purchasing Behavior On The Social Web, Feng-Tso Sun, Martin Griss, Ole J. Mengshoel, Yi-Ting Yeh Jun 2013

Ole J Mengshoel

Group-deal websites, where customers purchase products or services in groups, are an interesting phenomenon on the Web. Each purchase is kicked off by a group initiator, and other customers can join in. Customers form communities with people with similar interests and preferences (as in a social network), and this drives bulk purchasing (similar to online stores, but in larger quantities per order, thus customers get a better deal). In this work, we aim to better understand what factors influence customers' purchasing behavior for such social group-deal websites. We propose two probabilistic graphical models, i.e., a product-centric inference model (PCIM) …


Mobile Computing: Challenges And Opportunities For Autonomy And Feedback, Ole J. Mengshoel, Bob Iannucci, Abe Ishihara May 2013

Ole J Mengshoel

Mobile devices have evolved to become computing platforms more similar to desktops and workstations than the cell phones and handsets of yesteryear. Unfortunately, today’s mobile infrastructures are mirrors of the wired past. Devices, apps, and networks impact one another, but a systematic approach for allowing them to cooperate is currently missing. We propose an approach that seeks to open key interfaces and to apply feedback and autonomic computing to improve both user experience and mobile system dynamics.


Subsemble: An Ensemble Method For Combining Subset-Specific Algorithm Fits, Stephanie Sapp, Mark J. Van Der Laan, John Canny May 2013

U.C. Berkeley Division of Biostatistics Working Paper Series

Ensemble methods using the same underlying algorithm trained on different subsets of observations have recently received increased attention as practical prediction tools for massive datasets. We propose Subsemble: a general subset ensemble prediction method, which can be used for small, moderate, or large datasets. Subsemble partitions the full dataset into subsets of observations, fits a specified underlying algorithm on each subset, and uses a clever form of V-fold cross-validation to output a prediction function that combines the subset-specific fits. We give an oracle result that provides a theoretical performance guarantee for Subsemble. Through simulations, we demonstrate that Subsemble can be …
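The partition-and-fit steps described in this abstract can be sketched as follows. This is a simplified illustration, not the authors' method: the abstract does not spell out the cross-validated combination step, so the sketch substitutes a plain equal-weight average of the subset-specific fits. The underlying algorithm (a one-variable least-squares line), the function names, and the synthetic data are all assumptions made for the example.

```python
import random


def fit_line(points):
    """Hypothetical underlying algorithm: least-squares line through (x, y) points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_subsets(points, n_subsets, rng):
    """Partition the data into disjoint subsets and fit the same algorithm on each."""
    shuffled = points[:]
    rng.shuffle(shuffled)
    subsets = [shuffled[i::n_subsets] for i in range(n_subsets)]
    return [fit_line(subset) for subset in subsets]


def predict(fits, x):
    """Combine the subset-specific fits.

    Assumption: an equal-weight average stands in for the cross-validated
    combination that the abstract mentions but does not detail.
    """
    return sum(slope * x + intercept for slope, intercept in fits) / len(fits)


rng = random.Random(0)
# Synthetic data from y = 2x + 1 plus small Gaussian noise.
data = [(i / 100, 2.0 * (i / 100) + 1.0 + rng.gauss(0.0, 0.1)) for i in range(300)]
fits = fit_subsets(data, n_subsets=3, rng=rng)
prediction_at_one = predict(fits, 1.0)
```

Because each subset is fitted independently, the per-subset fits could run in parallel, which is what makes this family of methods attractive for large datasets.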


Exploiting Domain Structure In Multiagent Decision-Theoretic Planning And Reasoning, Akshat Kumar May 2013

Open Access Dissertations

This thesis focuses on decision-theoretic reasoning and planning problems that arise when a group of collaborative agents are tasked to achieve a goal that requires collective effort. The main contribution of this thesis is the development of effective, scalable and quality-bounded computational approaches for multiagent planning and coordination under uncertainty. This is achieved by a synthesis of techniques from multiple areas of artificial intelligence, machine learning and operations research. Empirically, each algorithmic contribution has been tested rigorously on common benchmark problems and, in many cases, real-world applications from machine learning and operations research literature.

The first part of the thesis …


Automating Large-Scale Simulation Calibration To Real-World Sensor Data, Richard Everett Edwards May 2013

Doctoral Dissertations

Many key decisions and design policies are made using sophisticated computer simulations. However, these sophisticated computer simulations have several major problems. The two main issues are 1) gaps between the simulation model and the actual structure, and 2) limitations of the modeling engine's capabilities. This dissertation's goal is to address these simulation deficiencies by presenting a general automated process for tuning simulation inputs such that simulation output matches real world measured data. The automated process involves the following key components -- 1) Identify a model that accurately estimates the real world simulation calibration target from measured sensor data; 2) Identify …


Enhancement Of Random Forests Using Trees With Oblique Splits, Andrejus Parfionovas May 2013

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

Statistical classification is widely used in many areas where there is a need to make a data-driven decision, or to classify complicated cases or objects. For instance: disease diagnostics (is a patient sick or healthy, based on the blood test results?); weather forecasting (will there be a storm tomorrow, based on today's atmospheric pressure, air temperature, and wind velocity?); speech recognition (what was said over the phone, based on the caller's voice level and articulation); spam detection (can the unsolicited commercial e-mails be identified by their content?); and so on.

Classification trees …


Knowledge Extraction In Video Through The Interaction Analysis Of Activities, Omar Ulises Florez May 2013

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

A video is a growing stream of unstructured data that significantly increases the amount of information transmitted and stored on the Internet. For example, every minute YouTube users upload 72 GB of information. Some of the best applications for video analysis include the monitoring of activities in defense and security scenarios such as the autonomous planes that collect video and images at reduced risk and the surveillance cameras in public places like traffic lights, airports, and schools.

Some of the challenges in video analysis involve implementing complex operations such as searching for activities, understanding scenes, and …


An Automatic Framework For Embryonic Localization Using Edges In A Scale Space, Zachary Bessinger May 2013

Masters Theses & Specialist Projects

Localization of Drosophila embryos in images is a fundamental step in an automatic computational system for the exploration of gene-gene interaction on Drosophila. Contour extraction of embryonic images is challenging due to many variations in embryonic images. In the thesis work, we develop a localization framework based on the analysis of connected components of edge pixels in a scale space. We propose criteria to select optimal scales for embryonic localization. Furthermore, we propose a scale mapping strategy to compress the range of a scale space in order to improve the efficiency of the localization framework. The effectiveness of the proposed …


A Hierarchical Multi-Output Nearest Neighbor Model For Multi-Output Dependence Learning, Richard Glenn Morris Mar 2013

Theses and Dissertations

Multi-Output Dependence (MOD) learning is a generalization of standard classification problems that allows for multiple outputs that are dependent on each other. A primary issue that arises in the context of MOD learning is that for any given input pattern there can be multiple correct output patterns. This changes the learning task from function approximation to relation approximation. Previous algorithms do not consider this problem, and thus cannot be readily applied to MOD problems. To perform MOD learning, we introduce the Hierarchical Multi-Output Nearest Neighbor model (HMONN) that employs a basic learning model for each output and a modified nearest …


Spoons: Netflix Outage Detection Using Microtext Classification, Eriq A. Augustine Mar 2013

Master's Theses

Every week there are over a billion new posts to Twitter services and many of those messages contain feedback to companies about their services. One company that recognizes this unused source of information is Netflix. That is why Netflix initiated the development of a system that lets them respond to the millions of Twitter and Netflix users that are acting as sensors and reporting all types of user visible outages. This system enhances the feedback loop between Netflix and its customers by increasing the amount of customer feedback that Netflix receives and reducing the time it takes for Netflix to …


Learning With An Insufficient Supply Of Data Via Knowledge Transfer And Sharing, Samir Al-Stouhi Jan 2013

Wayne State University Dissertations

As machine learning methods extend to a more complex and diverse set of problems, situations arise where the complexity and limited availability of data mean that the information source is not "adequate" to generate a representative hypothesis. Learning from multiple sources of data is a promising research direction as researchers leverage ever more diverse sources of information. Since data is not readily available, knowledge has to be transferred from other sources and new methods (both supervised and unsupervised) have to be developed to selectively share and transfer knowledge. In this dissertation, we present both supervised and unsupervised techniques to tackle …


Hybrid Agent Based Simulation With Adaptive Learning Of Travel Mode Choices For University Commuters (Wip), Nagesh Shukla, Albert Munoz, Jun Ma, Nam Huynh Jan 2013

SMART Infrastructure Facility - Papers

This paper presents a methodology for developing a hybrid agent-based micro-simulation model to capture the impacts of commuter travel mode choices on a university campus transport network. The proposed methodology involves: (i) developing a realistic population of commuter agents (students and staff); (ii) assigning activity lists and travel mode choices to agents using a machine learning method; and (iii) traffic micro-simulation of the study area transport network. This furthers the understanding of current transport modal distributions, factors affecting travel mode choice decisions, and network performance through a number of hypothetical travel scenarios.


Energy Efficient Context-Aware Framework In Mobile Sensing, Ozgur Yurur Jan 2013

USF Tampa Graduate Theses and Dissertations

The ever-increasing technological advances in embedded systems engineering, together with the proliferation of small-size sensor design and deployment, have enabled mobile devices (e.g., smartphones) to recognize daily occurring human based actions, activities and interactions. Therefore, inferring a vast variety of mobile device user based activities from a very diverse context obtained by a series of sensory observations has drawn much interest in the research area of ubiquitous sensing. The existence and awareness of the context provides the capability of being conscious of physical environments or situations around mobile device users, and this allows network services to respond proactively and intelligently …