Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Machine learning

2015

Articles 1 - 30 of 30

Full-Text Articles in Physical Sciences and Mathematics

A Data Science Course For Undergraduates: Thinking With Data, Benjamin Baumer Dec 2015

Mathematics and Statistics: Faculty Publications

Data science is an emerging interdisciplinary field that combines elements of mathematics, statistics, computer science, and knowledge in a particular application domain for the purpose of extracting meaningful information from the increasingly sophisticated array of data available in many settings. These data tend to be nontraditional, in the sense that they are often live, large, complex, and/or messy. A first course in statistics at the undergraduate level typically introduces students to a variety of techniques to analyze small, neat, and clean datasets. However, whether they pursue more formal training in statistics or not, many of these students will end up …


The Performance Of Random Prototypes In Hierarchical Models Of Vision, Kendall Lee Stewart Dec 2015

Dissertations and Theses

I investigate properties of HMAX, a computational model of hierarchical processing in the primate visual cortex. High-level cortical neurons have been shown to respond strongly to particular natural shapes, such as faces. HMAX models this property with a dictionary of natural shapes, called prototypes, that respond to the presence of those shapes. The resulting set of similarity measurements is an effective descriptor for classifying images. Curiously, prior work has shown that replacing the dictionary of natural shapes with entirely random prototypes has little impact on classification performance. This work explores that phenomenon by studying the performance of random prototypes on …


Dynamic Data Management In A Data Grid Environment, Björn Barrefors Dec 2015

Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

A data grid is a geographically distributed set of resources providing a facility for computationally intensive analysis of large datasets to a large number of geographically distributed users. In the scientific community, data grids have become increasingly popular as scientific research is driven by large datasets. Until recently, developments in data management for data grids have focused on management of data at lower layers in the data grid architecture. With dataset sizes expected to approach exabyte scale in coming years, data management in data grids is facing a new set of challenges. In particular, the problem of automatically placing and …


Energy Forecasting For Event Venues: Big Data And Prediction Accuracy, Katarina Grolinger, Alexandra L'Heureux, Miriam Am Capretz, Luke Seewald Dec 2015

Electrical and Computer Engineering Publications

Advances in sensor technologies and the proliferation of smart meters have resulted in an explosion of energy-related data sets. These Big Data have created opportunities for development of new energy services and a promise of better energy management and conservation. Sensor-based energy forecasting has been researched in the context of office buildings, schools, and residential buildings. This paper investigates sensor-based forecasting in the context of event-organizing venues, which present an especially difficult scenario due to large variations in consumption caused by the hosted events. Moreover, the significance of the data set size, specifically the impact of temporal granularity, on energy …


Evaluating The Intrinsic Similarity Between Neural Networks, Stephen Charles Ashmore Dec 2015

Graduate Theses and Dissertations

We present Forward Bipartite Alignment (FBA), a method that aligns the topological structures of two neural networks. Neural networks are considered to be a black box because they contain a complex model surface determined by their weights that combine attributes non-linearly. Two networks that make similar predictions on training data may still generalize differently. FBA enables a diversity of applications, including visualization and canonicalization of neural networks, ensembles, and cross-over between unrelated neural networks in evolutionary optimization. We describe the FBA algorithm and implementations for three applications: genetic algorithms, visualization, and ensembles. We demonstrate FBA's usefulness by comparing a …


Distributed Approach For Peptide Identification, Naga V K Abhinav Vedanbhatla Oct 2015

Masters Theses & Specialist Projects

A crucial step in protein identification is peptide identification. The Peptide Spectrum Match (PSM) information set is enormous. Hence, it is a time-consuming procedure to work on a single machine. PSMs are situated by a cross connection, a factual score, or a probability that the match between the trial and speculative is right and original. This procedure takes quite a while to execute. So, there is demand for enhancement of the performance to handle extensive peptide information sets. Development of appropriate distributed frameworks is expected to lessen the processing time.

The designed framework uses a peptide handling algorithm named C-Ranker, …


Learning Relative Similarity From Data Streams: Active Online Learning Approaches, Shuji Hao, Peilin Zhao, Steven C. H. Hoi, Chunyan Miao Oct 2015

Research Collection School Of Computing and Information Systems

Relative similarity learning, as an important learning scheme for information retrieval, aims to learn a bi-linear similarity function from a collection of labeled instance-pairs, and the learned function would assign a high similarity value for a similar instance-pair and a low value for a dissimilar pair. Existing algorithms usually assume the labels of all the pairs in data streams are always made available for learning. However, this is not always realistic in practice since the number of possible pairs is quadratic to the number of instances in the database, and manually labeling the pairs could be very costly and time …
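
The bi-linear similarity function mentioned in this abstract can be illustrated with a minimal sketch. The abstract only states that the function is bi-linear and that similar pairs should score higher than dissimilar ones; the form `S_W(x, y) = x^T W y`, the hinge loss over an (anchor, similar, dissimilar) triplet, the margin of 1.0, and all function names below are assumptions commonly seen in the relative-similarity literature, not the authors' implementation.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def bilinear_similarity(x: Vector, w: Matrix, y: Vector) -> float:
    """Return S_W(x, y) = x^T W y for a (hypothetical) parameter matrix W."""
    return sum(
        x[i] * w[i][j] * y[j]
        for i in range(len(x))
        for j in range(len(y))
    )


def triplet_hinge_loss(
    anchor: Vector,
    similar: Vector,
    dissimilar: Vector,
    w: Matrix,
    margin: float = 1.0,  # assumed margin, not taken from the abstract
) -> float:
    """Penalize W when the similar pair does not beat the dissimilar pair by `margin`."""
    gap = bilinear_similarity(anchor, w, similar) - bilinear_similarity(
        anchor, w, dissimilar
    )
    return max(0.0, margin - gap)


# Toy data: with W set to the identity, S_W reduces to a plain dot product.
identity = [[1.0, 0.0], [0.0, 1.0]]
anchor = [1.0, 0.0]
similar = [2.0, 0.0]
dissimilar = [0.0, 1.0]

loss_ok = triplet_hinge_loss(anchor, similar, dissimilar, identity)
loss_bad = triplet_hinge_loss(anchor, dissimilar, similar, identity)
```

A learner would adjust `W` only when the loss is positive; an active online variant, as the title suggests, would additionally decide which pairs are worth sending to a human for labeling.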


Detecting, Modeling, And Predicting User Temporal Intention, Hany M. Salaheldeen Jul 2015

Computer Science Theses & Dissertations

The content of social media has grown exponentially in recent years and its role has evolved from narrating life events to actually shaping them. Unfortunately, content posted and shared in social networks is vulnerable and prone to loss or change, rendering the context associated with it (a tweet, post, status, or others) meaningless. There is an inherent value in maintaining the consistency of such social records, as in some cases they take over the task of being the first draft of history: collections of these social posts narrate the pulse of the street during historic events, protest, riots, …


Reliable Patch Trackers: Robust Visual Tracking By Exploiting Reliable Patches, Yang Li, Jianke Zhu, Steven C. H. Hoi Jun 2015

Research Collection School Of Computing and Information Systems

Most modern trackers typically employ a bounding box given in the first frame to track visual objects, where their tracking results are often sensitive to the initialization. In this paper, we propose a new tracking method, Reliable Patch Trackers (RPT), which attempts to identify and exploit the reliable patches that can be tracked effectively through the whole tracking process. Specifically, we present a tracking reliability metric to measure how reliably a patch can be tracked, where a probability model is proposed to estimate the distribution of reliable patches under a sequential Monte Carlo framework. As the reliable patches distributed over …


Information Filtering By Multiple Examples, Mingzhu Zhu May 2015

Dissertations

A key to successfully satisfying an information need lies in how users express it using keywords as queries. However, for many users, expressing their information needs using keywords is difficult, especially when the information need is complex. Search By Multiple Examples (SBME), a promising method for overcoming this problem, allows users to specify their information needs as a set of relevant documents rather than as a set of keywords.

Most of the studies on SBME adopt Positive Unlabeled learning (PU learning) techniques, treating the user-provided examples (denoted as query examples) as the positive set and the entire data …


Neuroscience-Inspired Dynamic Architectures, Catherine Dorothy Schuman May 2015

Doctoral Dissertations

Biological brains are some of the most powerful computational devices on Earth. Computer scientists have long drawn inspiration from neuroscience to produce computational tools. This work introduces neuroscience-inspired dynamic architectures (NIDA), spiking neural networks embedded in a geometric space that exhibit dynamic behavior. A neuromorphic hardware implementation based on NIDA networks, Dynamic Adaptive Neural Network Array (DANNA), is discussed. Neuromorphic implementations are one alternative/complement to traditional von Neumann computation. A method for designing/training NIDA networks, based on evolutionary optimization, is introduced. We demonstrate the utility of NIDA networks on classification tasks, a control task, and an anomaly detection task. There …


Feature Identification And Reduction For Improved Generalization Accuracy In Secondary-Structure Prediction Using Temporal Context Inputs In Machine-Learning Models, Matthew Benjamin Seeley May 2015

Theses and Dissertations

A protein's properties are influenced by both its amino-acid sequence and its three-dimensional conformation. Ascertaining a protein's sequence is relatively easy using modern techniques, but determining its conformation requires much more expensive and time-consuming techniques. Consequently, it would be useful to identify a method that can accurately predict a protein's secondary-structure conformation using only the protein's sequence data. This problem is not trivial, however, because identical amino-acid subsequences in different contexts sometimes have disparate secondary structures, while highly dissimilar amino-acid subsequences sometimes have identical secondary structures. We propose (1) to develop a set of metrics that facilitates better comparisons between …


Pattern Recognition And Matching In Ice Core Data, Nathan Dunn Apr 2015

Honors College

The purpose of this research is to investigate the potential of applying concepts from machine learning, such as pattern recognition and matching, to detect climatic signals in ice core data. The main components of this project are the development of a pattern language for expressing relationships between chemical signals over time, a method of tokenizing ice core chemistry data into an easily manageable form, a method of matching specific instances of climatic signals to a specific pattern string, and a method to recognize and evaluate patterns within ice core chemistry data. While there are weaknesses in each of these …


Using Instance-Level Meta-Information To Facilitate A More Principled Approach To Machine Learning, Michael Reed Smith Apr 2015

Theses and Dissertations

As the capability for capturing and storing data increases and becomes more ubiquitous, an increasing number of organizations are looking to use machine learning techniques as a means of understanding and leveraging their data. However, the success of applying machine learning techniques depends on which learning algorithm is selected, the hyperparameters that are provided to the selected learning algorithm, and the data that is supplied to the learning algorithm. Even among machine learning experts, selecting an appropriate learning algorithm, setting its associated hyperparameters, and preprocessing the data can be a challenging task and is generally left to the expertise of …


Epistemological Databases For Probabilistic Knowledge Base Construction, Michael Louis Wick Mar 2015

Doctoral Dissertations

Knowledge bases (KB) facilitate real world decision making by providing access to structured relational information that enables pattern discovery and semantic queries. Although there is a large amount of data available for populating a KB, the data must first be gathered and assembled. Traditionally, this integration is performed automatically by storing the output of an information extraction pipeline directly into a database as if this prediction were the "truth." However, the resulting KB is often not reliable because (a) errors accumulate in the integration pipeline, and (b) they persist in the KB even after new information arrives that could rectify …


Learning With Joint Inference And Latent Linguistic Structure In Graphical Models, Jason Narad Mar 2015

Doctoral Dissertations

Constructing end-to-end NLP systems requires the processing of many types of linguistic information prior to solving the desired end task. A common approach to this problem is to construct a pipeline, one component for each task, with each system's output becoming input for the next. This approach poses two problems. First, errors propagate, and, much like the childhood game of "telephone", combining systems in this manner can lead to unintelligible outcomes. Second, each component task requires annotated training data to act as supervision for training the model. These annotations are often expensive and time-consuming to produce, may differ from each …


Leveraging Contextual Relationships Between Objects For Localization, Clinton Leif Olson Mar 2015

Dissertations and Theses

Object localization is currently an active area of research in computer vision. The object localization task is to identify all locations of an object class within an image by drawing a bounding box around objects that are instances of that class. Object locations are typically found by computing a classification score over a small window at multiple locations in the image, based on some chosen criteria, and choosing the highest scoring windows as the object bounding-boxes. Localization methods vary widely, but there is a growing trend towards methods that are able to make localization more accurate and efficient through the …
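
The window-scoring procedure described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the "image" is a small 2D list of numbers, the scoring function `window_sum` is a hypothetical stand-in for a real classifier, and only the single highest-scoring window is returned. None of these names come from the thesis.

```python
from typing import Callable, List, Tuple

Image = List[List[float]]
Box = Tuple[int, int, int, int]  # (top, left, bottom, right); bottom/right exclusive


def window_sum(image: Image, top: int, left: int, win_h: int, win_w: int) -> float:
    """Hypothetical classification score: the sum of the values inside the window."""
    return sum(
        image[r][c]
        for r in range(top, top + win_h)
        for c in range(left, left + win_w)
    )


def best_window(
    image: Image,
    win_h: int,
    win_w: int,
    stride: int = 1,
    score: Callable[[Image, int, int, int, int], float] = window_sum,
) -> Tuple[float, Box]:
    """Slide a win_h x win_w window over the image and return the best score and box."""
    if not image or not image[0]:
        raise ValueError("image must be non-empty")
    rows, cols = len(image), len(image[0])
    best_score = None
    best_box = None
    for top in range(0, rows - win_h + 1, stride):
        for left in range(0, cols - win_w + 1, stride):
            s = score(image, top, left, win_h, win_w)
            # Strict comparison keeps the first window seen when scores tie.
            if best_score is None or s > best_score:
                best_score = s
                best_box = (top, left, top + win_h, left + win_w)
    if best_box is None:
        raise ValueError("window is larger than the image")
    return best_score, best_box


# Toy image with a bright 2x2 "object" in the centre.
toy_image = [
    [0, 0, 0, 0],
    [0, 5, 5, 0],
    [0, 5, 5, 0],
    [0, 0, 0, 0],
]
result = best_window(toy_image, 2, 2)
```

An exhaustive scan like this evaluates every window position, which is the cost that more efficient localization methods try to reduce.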


Use Of A High-Value Social Audience Index For Target Audience Identification On Twitter, Siaw Ling Lo, David Cornforth, Raymond Chiong Feb 2015

Research Collection School Of Computing and Information Systems

With the large and growing user base of social media, it is not an easy feat to identify potential customers for business. This is mainly due to the challenge of extracting commercially viable content from the vast amount of free-form conversations. In this paper, we analyse the Twitter content of an account owner and its list of followers through various text mining methods and segment the list of followers via an index. We term this index the High-Value Social Audience (HVSA) index. This HVSA index enables a company or organisation to devise their marketing and engagement plan according …


Cancer Risk Prediction With Next Generation Sequencing Data Using Machine Learning, Nihir Patel Jan 2015

Theses

The use of computational biology for next generation sequencing (NGS) analysis is rapidly increasing in genomics research. However, the effectiveness of NGS data to predict disease abundance is still unclear. This research investigates the problem in the whole exome NGS data of chronic lymphocytic leukemia (CLL) available at dbGaP. Initially, raw reads from samples are aligned to the human reference genome using the Burrows-Wheeler Aligner. From the samples, structural variants, namely Single Nucleotide Polymorphisms (SNPs) and Insertions/Deletions (INDELs), are identified and filtered using SAMtools as well as the Genome Analysis Toolkit (GATK). Subsequently, the variants are …


Reverse Engineering The Human Brain: An Evolutionary Computation Approach To The Analysis Of fMRI, Nicholas Allgaier Jan 2015

Graduate College Dissertations and Theses

The field of neuroimaging has truly become data rich, and as such, novel analytical methods capable of gleaning meaningful information from large stores of imaging data are in high demand. Those methods that might also be applicable on the level of individual subjects, and thus potentially useful clinically, are of special interest. In this dissertation we introduce just such a method, called nonlinear functional mapping (NFM), and demonstrate its application in the analysis of resting state fMRI (functional Magnetic Resonance Imaging) from a 242-subject subset of the IMAGEN project, a European study of risk-taking behavior in adolescents that includes longitudinal …


Energy Cost Forecasting For Event Venues, Katarina Grolinger, Andrea Zagar, Miriam Am Capretz, Luke Seewald Jan 2015

Electrical and Computer Engineering Publications

Electricity price, consumption, and demand forecasting has been a topic of research interest for a long time. The proliferation of smart meters has created new opportunities in energy prediction. This paper investigates energy cost forecasting in the context of entertainment event-organizing venues, which poses significant difficulty due to fluctuations in energy demand and wholesale electricity prices. The objective is to predict the overall cost of energy consumed during an entertainment event. Predictions are carried out separately for each event category and feature selection is used to select the most effective combination of event attributes for each category. Three machine learning …


Singular Value Computation And Subspace Clustering, Qiao Liang Jan 2015

Theses and Dissertations--Mathematics

In this dissertation we discuss two problems. In the first part, we consider the problem of computing a few extreme eigenvalues of a symmetric definite generalized eigenvalue problem or a few extreme singular values of a large and sparse matrix. The standard method of choice for computing a few extreme eigenvalues of a large symmetric matrix is the Lanczos or the implicitly restarted Lanczos method. These methods usually employ a shift-and-invert transformation to accelerate the speed of convergence, which is not practical for truly large problems. With this in mind, Golub and Ye propose an inverse-free preconditioned Krylov subspace method, …


Characterization Of Prose By Rhetorical Structure For Machine Learning Classification, James Java Jan 2015

CCE Theses and Dissertations

Measures of classical rhetorical structure in text can improve accuracy in certain types of stylistic classification tasks such as authorship attribution. This research augments the relatively scarce work in the automated identification of rhetorical figures and uses the resulting statistics to characterize an author's rhetorical style. These characterizations of style can then become part of the feature set of various classification models.

Our Rhetorica software identifies 14 classical rhetorical figures in free English text, with generally good precision and recall, and provides summary measures to use in descriptive or classification tasks. Classification models trained on Rhetorica's rhetorical measures paired with …


A Comparative Study Of Two Prediction Models For Brain Tumor Progression, Deqi Zhou, Loc Tran, Jihong Wang, Jiang Li, Karen O. Egiazarian (Ed.), Sos S. Agaian (Ed.), Atanas P. Gotchev (Ed.) Jan 2015

Electrical & Computer Engineering Faculty Publications

MR diffusion tensor imaging (DTI) technique together with traditional T1 or T2 weighted MRI scans supplies rich information sources for brain cancer diagnoses. These images form large-scale, high-dimensional data sets. Due to the fact that significant correlations exist among these images, we assume low-dimensional geometry data structures (manifolds) are embedded in the high-dimensional space. Those manifolds might be hidden from radiologists because it is challenging for human experts to interpret high-dimensional data. Identification of the manifold is a critical step for successfully analyzing multimodal MR images.

We have developed various manifold learning algorithms (Tran et al. 2011; Tran et al. …


A Middleware Framework For Application-Aware And User-Specific Energy Optimization In Smart Mobile Devices, Sudeep Pasricha, Brad K. Donohoo, Chris Ohlsen Jan 2015

U.S. Air Force Research

munication, and social interaction. In addition to the demand for an acceptable level of performance and a comprehensive set of features, users often desire extended battery lifetime. In fact, limited battery lifetime is one of the biggest obstacles facing the current utility and future growth of increasingly sophisticated "smart" mobile devices. This paper proposes a novel application-aware and user-interaction-aware energy optimization middleware framework (AURA) for pervasive mobile devices. AURA optimizes CPU and screen backlight energy consumption while maintaining a minimum acceptable level of performance. The proposed framework employs a novel Bayesian application classifier and management strategies based on Markov …


Modeling User Transportation Patterns Using Mobile Devices, Erfan Davami Jan 2015

Electronic Theses and Dissertations

Participatory sensing frameworks use humans and their computing devices as a large mobile sensing network. Dramatic accessibility and affordability have turned mobile devices (smartphone and tablet computers) into the most popular computational machines in the world, exceeding laptops. By the end of 2013, more than 1.5 billion people on earth will have a smartphone. Increased coverage and higher speeds of cellular networks have given these devices the power to constantly stream large amounts of data. Most mobile devices are equipped with advanced sensors such as GPS, cameras, and microphones. This expansion of smartphone numbers and power has created a sensing …


Designing Medical Interactive Systems Via Assessment Of Human Mental Workload, Luca Longo Jan 2015

Conference papers

In clinical settings, human-computer systems need to be designed in a way that reduces medical errors and enhances patient care. Inspection methods are usually employed in HCI to assess usability of interactive systems. However, they do not consider the state of the operator while executing a task, the surrounding environment and the task demands. It is argued that assessing performance of operators is fundamental for designing optimal systems with which healthcare can be effectively delivered. The aim of our solution is to assess performance of operators employing the notion of Mental Workload (MWL), this being a construct believed …


Seavipers - Computer Vision And Inertial Position Reference Sensor System (CVIPRSS), Justin Lee Erdman Jan 2015

LSU Doctoral Dissertations

This work describes the design and development of an optical, Computer Vision (CV) based sensor for use as a Position Reference System (PRS) in Dynamic Positioning (DP). Using a combination of robotics and CV techniques, the sensor provides range and heading information to a selected reference object. The proposed optical system is superior to existing ones because it does not depend upon special reflectors nor does it require a lengthy set-up time. This system, the Computer Vision and Inertial Position Reference Sensor System (CVIPRSS, pronounced "Seavipers"), combines a laser rangefinder, infrared camera, and a pan-tilt unit with the robust TLD …


Visual Saliency Estimation : A Pre-Attentive Cognitive And Context-Aware Approach, Amanda Shannon Danko Jan 2015

Legacy Theses & Dissertations (2009 - 2024)

At each glance, biological vision systems organize a tremendous amount of input and


Rice Blast Disease Forecasting For Northern Philippines, Proceso L. Fernandez Jr, Alvin R. Malicdem Jan 2015

Department of Information Systems & Computer Science Faculty Publications

Rice blast disease has become an enigmatic problem in several rice growing ecosystems of both tropical and temperate regions of the world. In this study, we develop models for predicting the occurrence and severity of rice blast disease, with the aim of helping to prevent or at least mitigate the spread of such disease. Data from 2 government agencies in selected provinces from northern Philippines were gathered, cleaned and synchronized for the purpose of building the predictive models. After the data synchronization, dimensionality reduction of the feature space was done, using Principal Component Analysis (PCA), to determine the most important …