Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Machine learning

2014

Articles 1 - 30 of 36

Full-Text Articles in Physical Sciences and Mathematics

Causal Discovery For Relational Domains: Representation, Reasoning, And Learning, Marc Maier Nov 2014

Doctoral Dissertations

Many domains are currently experiencing the growing trend to record and analyze massive, observational data sets with increasing complexity. A commonly made claim is that these data sets hold potential to transform their corresponding domains by providing previously unknown or unexpected explanations and enabling informed decision-making. However, only knowledge of the underlying causal generative process, as opposed to knowledge of associational patterns, can support such tasks. Most methods for traditional causal discovery—the development of algorithms that learn causal structure from observational data—are restricted to representations that require limiting assumptions on the form of the data. Causal discovery has almost exclusively …


Adaptive Step-Sizes For Reinforcement Learning, William C. Dabney Nov 2014

Doctoral Dissertations

The central theme motivating this dissertation is the desire to develop reinforcement learning algorithms that “just work” regardless of the domain in which they are applied. The largest impediment to this goal is the sensitivity of reinforcement learning algorithms to the step-size parameter used to rescale incremental updates. Adaptive step-size algorithms attempt to reduce this sensitivity or eliminate the step-size parameter entirely by automatically adjusting the step size throughout the learning process. Such algorithms provide an alternative to the standard “guess-and-check” methods used to find parameters, known as parameter tuning. However, the problems with parameter tuning are currently masked by …
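The step-size sensitivity described in this abstract can be illustrated with a minimal sketch. It is not the dissertation's algorithm: it assumes a generic incremental update of the form v <- v + step_size * (target - v) with a fixed target, and the function name `run_updates` is hypothetical.

```python
def run_updates(step_size, n_steps, target=1.0, start=0.0):
    """Apply the incremental update v <- v + step_size * (target - v) n_steps times.

    Illustrative only: a fixed target stands in for the noisy targets
    that a real reinforcement learning algorithm would see.
    """
    v = start
    for _ in range(n_steps):
        v += step_size * (target - v)
    return v


if __name__ == "__main__":
    # A small step size learns slowly, a moderate one converges quickly,
    # and a step size above 2 makes this particular update diverge.
    for alpha in (0.01, 0.5, 2.5):
        print(alpha, run_updates(alpha, 20))
```

Under these assumptions the error after n updates is (1 - step_size)^n times the initial error, which is why the same number of updates gives very different results for different step sizes.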


A Parallel Genetic Algorithm For Tuning Neural Networks, Nathan Chadderdon, Ben Harsha, Steven Bogaerts Nov 2014

Annual Student Research Poster Session

One challenge in using artificial neural networks is how to determine appropriate parameters for network structure and learning. Often parameters such as learning rate or number of hidden units are set arbitrarily or with a general "intuition" as to what would be most effective. The goal of this project is to use a genetic algorithm to tune a population of neural networks to determine the best structure and parameters. This paper considers a genetic algorithm to tune the number of hidden units, learning rate, momentum, and number of examples viewed per weight update. Experiments and results are discussed for two …


Intelligent Indexing: A Semi-Automated, Trainable System For Field Labeling, Robert T. Clawson Sep 2014

Theses and Dissertations

We present Intelligent Indexing: a general, scalable, collaborative approach to indexing and transcription of non-machine-readable documents that exploits visual consensus and group labeling while harnessing human recognition and domain expertise. In our system, indexers work directly on the page, and with minimal context switching can navigate the page, enter labels, and interact with the recognition engine. Interaction with the recognition engine occurs through preview windows that allow the indexer to quickly verify and correct recommendations. This interaction is far superior to conventional, tedious, inefficient post-correction and editing. Intelligent Indexing is a trainable system that improves over time and can provide …


Scaling MCMC Inference And Belief Propagation To Large, Dense Graphical Models, Sameer Singh Aug 2014

Doctoral Dissertations

With the physical constraints of semiconductor-based electronics becoming increasingly limiting in the past decade, single-core CPUs have given way to multi-core and distributed computing platforms. At the same time, access to large data collections is progressively becoming commonplace due to the lowering cost of storage and bandwidth. Traditional machine learning paradigms that have been designed to operate sequentially on single processor architectures seem destined to become obsolete in this world of multi-core, multi-node systems and massive data sets. Inference for graphical models is one such example for which most existing algorithms are sequential in nature and are difficult to scale …


Incorporating Boltzmann Machine Priors For Semantic Labeling In Images And Videos, Andrew Kae Aug 2014

Doctoral Dissertations

Semantic labeling is the task of assigning category labels to regions in an image. For example, a scene may consist of regions corresponding to categories such as sky, water, and ground, or parts of a face such as eyes, nose, and mouth. Semantic labeling is an important mid-level vision task for grouping and organizing image regions into coherent parts. Labeling these regions allows us to better understand the scene itself as well as properties of the objects in the scene, such as their parts, location, and interaction within the scene. Typical approaches for this task include the conditional random field …


Automated Image Interpretation For Science Autonomy In Robotic Planetary Exploration, Raymond Francis Aug 2014

Electronic Thesis and Dissertation Repository

Advances in the capabilities of robotic planetary exploration missions have increased the wealth of scientific data they produce, presenting challenges for mission science and operations imposed by the limits of interplanetary radio communications. These data budget pressures can be relieved by increased robotic autonomy, both for onboard operations tasks and for decision-making in response to science data.

This thesis presents new techniques in automated image interpretation for natural scenes of relevance to planetary science and exploration, and elaborates autonomy scenarios under which they could be used to extend the reach and performance of exploration missions on planetary surfaces.

Two …


3D Robotic Sensing Of People: Human Perception, Representation And Activity Recognition, Hao Zhang Aug 2014

Doctoral Dissertations

The robots are coming. Their presence will eventually bridge the digital-physical divide and dramatically impact human life by taking over tasks where our current society has shortcomings (e.g., search and rescue, elderly care, and child education). Human-centered robotics (HCR) is a vision to address how robots can coexist with humans and help people live safer, simpler and more independent lives.

As humans, we have a remarkable ability to perceive the world around us, perceive people, and interpret their behaviors. Endowing robots with these critical capabilities in highly dynamic human social environments is a significant but very challenging problem in practical …


Prediction Of Hydrological Models’ Uncertainty By A Committee Of Machine Learning-Models, Nagendra Kayastha, Dimitri P. Solomatine, Durga Lal Shrestha Aug 2014

International Conference on Hydroinformatics

This study presents an approach to combine uncertainties of the hydrological model outputs predicted from a number of machine learning models. The machine learning based uncertainty prediction approach is very useful for estimating hydrological models' uncertainty in particular hydro-meteorological situations in real-time applications [1]. In this approach the hydrological model realizations from Monte Carlo simulations are used to build different machine learning uncertainty models to predict the uncertainty (quantiles of the pdf) of a deterministic output from the hydrological model. Uncertainty models are trained using antecedent precipitation and streamflows as inputs. The trained models are then employed to predict the …


ADAM: Automated Detection And Attribution Of Malicious Webpages, Ahmed E. Kosba, Aziz Mohaisen, Andrew G. West, Trevor Tonn, Huy Kang Kim Aug 2014

Andrew G. West

Malicious webpages are a prevalent and severe threat in the Internet security landscape. This fact has motivated numerous static and dynamic techniques to alleviate such threats. Building on this existing literature, this work introduces the design and evaluation of ADAM, a system that uses machine-learning over network metadata derived from the sandboxed execution of webpage content. ADAM aims to detect malicious webpages and identify the nature of those vulnerabilities using a simple set of features. Machine-trained models are not novel in this problem space. Instead, it is the dynamic network artifacts (and their subsequent feature representations) collected during rendering that …


Convergence Of A Reinforcement Learning Algorithm In Continuous Domains, Stephen Carden Aug 2014

All Dissertations

In the field of reinforcement learning, Markov decision processes with a finite number of states and actions have been well studied, and there exist algorithms capable of producing a sequence of policies which converge to an optimal policy with probability one. Convergence guarantees for problems with continuous states also exist. Until recently, no online algorithm for continuous states and continuous actions has been proven to produce optimal policies. This dissertation contains the results of research into reinforcement learning algorithms for problems in which both the state and action spaces are continuous. The problems to be solved are introduced formally as …


Collaborative Online Multitask Learning, Guangxia Li, Steven C. H. Hoi, Kuiyu Chang, Wenting Liu, Ramesh Jain Aug 2014

Research Collection School Of Computing and Information Systems

We study the problem of online multitask learning for solving multiple related classification tasks in parallel, aiming at classifying every sequence of data received by each task accurately and efficiently. One practical example of online multitask learning is the micro-blog sentiment detection on a group of users, which classifies micro-blog posts generated by each user into emotional or non-emotional categories. This particular online learning task is challenging for a number of reasons. First of all, to meet the critical requirements of online applications, a highly efficient and scalable classification solution that can make immediate predictions with low learning cost is …


Improving Structural Features Prediction In Protein Structure Modeling, Ashraf Yaseen Jul 2014

Computer Science Theses & Dissertations

Proteins play a vital role in the biological activities of all living species. In nature, a protein folds into a specific and energetically favorable three-dimensional structure which is critical to its biological function. Hence, there has been a great effort by researchers in both experimentally determining and computationally predicting the structures of proteins.

The current experimental methods of protein structure determination are complicated, time-consuming, and expensive. On the other hand, the sequencing of proteins is fast, simple, and relatively less expensive. Thus, the gap between the number of known sequences and the determined structures is growing, and is expected to …


Bioinformatic Solutions To Complex Problems In Mass Spectrometry Based Analysis Of Biomolecules, Ryan M. Taylor Jul 2014

Theses and Dissertations

Biological research has benefitted greatly from the advent of omic methods. For many biomolecules, mass spectrometry (MS) methods are most widely employed due to the sensitivity which allows low quantities of sample and the speed which allows analysis of complex samples. Improvements in instrument and sample preparation techniques create opportunities for large scale experimentation. The complexity and volume of data produced by modern MS-omic instrumentation challenges biological interpretation, while the complexity of the instrumentation, sample noise, and complexity of data analysis present difficulties in maintaining and ensuring data quality, validity, and relevance. We present a corpus of tools which improves …


Integrating Cross-Scale Analysis In The Spatial And Temporal Domains For Classification Of Behavioral Movement, Ali Soleymani, Jonathan Cachat, Kyle Robinson, Somayeh Dodge, Allan Kalueff, Robert Weibel Jun 2014

Journal of Spatial Information Science

Since various behavioral movement patterns are likely to be valid within different unique ranges of spatial and temporal scales (e.g., instantaneous, diurnal, or seasonal) with the corresponding spatial extents, a cross-scale approach is needed for accurate classification of behaviors expressed in movement. Here we introduce a methodology for the characterization and classification of behavioral movement data that relies on computing and analyzing movement features jointly in both the spatial and temporal domains. The proposed methodology consists of three stages. In the first stage, focusing on the spatial domain, the underlying movement space is partitioned into several zonings that correspond to …


Musical Motif Discovery In Non-Musical Media, Daniel S. Johnson Jun 2014

Theses and Dissertations

Many music composition algorithms attempt to compose music in a particular style. The resulting music is often impressive and indistinguishable from the style of the training data, but it tends to lack significant innovation. In an effort to increase innovation in the selection of pitches and rhythms, we present a system that discovers musical motifs by coupling machine learning techniques with an inspirational component. The inspirational component allows for the discovery of musical motifs that are unlikely to be produced by a generative model, while the machine learning component harnesses innovation. Candidate motifs are extracted from non-musical media such as …


Towards An Automated Weight Lifting Coach: Introducing Lift, Michael Andrew Lady Jun 2014

Master's Theses

The fitness device market is young and rapidly growing. More people than ever before keep count of how many steps they walk, how many calories they burn, their heart rate over time, and even their quality of sleep. New, as-yet-unreleased fitness devices have promised the next evolution of functionality with exercise technique analysis. This next generation of fitness devices has wrist and armband style form factors, which may not be optimal for barbell exercises such as back squat, bench press, and overhead press where a sensor on one arm may not provide the most relevant data …


A Probabilistic Model Of Hierarchical Music Analysis, Phillip Benjamin Kirlin Apr 2014

Doctoral Dissertations

Schenkerian music theory supposes that Western tonal compositions can be viewed as hierarchies of musical objects. The process of Schenkerian analysis reveals this hierarchy by identifying connections between notes or chords of a composition that illustrate both the small- and large-scale construction of the music. We present a new probabilistic model of this variety of music analysis, details of how the parameters of the model can be learned from a corpus, an algorithm for deriving the most probable analysis for a given piece of music, and both quantitative and human-based evaluations of the algorithm's performance. In addition, we describe the …


Ensemble Methods For Historical Machine-Printed Document Recognition, William B. Lund Apr 2014

Theses and Dissertations

The usefulness of digitized documents is directly related to the quality of the extracted text. Optical Character Recognition (OCR) has reached a point where well-formatted and clean machine-printed documents are easily recognizable by current commercial OCR products; however, older or degraded machine-printed documents present problems to OCR engines resulting in word error rates (WER) that severely limit either automated or manual use of the extracted text. Major archives of historical machine-printed documents are being assembled around the globe, requiring an accurate transcription of the text for the automated creation of descriptive metadata, full-text searching, and information extraction. Given document …


Moving Object Detection For Interception By A Humanoid Robot, Saltanat B. Tazhibayeva Apr 2014

Open Access Theses

Interception of a moving object with an autonomous robot is an important problem in robotics. It has various application areas, such as in an industrial setting where products on a conveyor would be picked up by a robotic arm, in the military to halt intruders, in robotic soccer (where the robots try to get to the moving ball and try to block an opponent's attempt to pass the ball), and in other challenging situations. Interception, in and of itself, is a complex task that demands a system with target recognition capability, proper navigation and actuation toward the moving target. There …


Document Classification In Support Of Automated Metadata Extraction Form Heterogeneous Collections, Paul K. Flynn Apr 2014

Computer Science Theses & Dissertations

A number of federal agencies, universities, laboratories, and companies are placing their documents online and making them searchable via metadata fields such as author, title, and publishing organization. To enable this, every document in the collection must be catalogued using the metadata fields. Though time consuming, the task of identifying metadata fields by inspecting the document is easy for a human. The visual cues in the formatting of the document along with accumulated knowledge and intelligence make it easy for a human to identify various metadata fields. Even with the best possible automated procedures, numerous sources of error exist, including …


Machine Learning In Wireless Sensor Networks: Algorithms, Strategies, And Applications, Mohammad Abu Alsheikh, Shaowei Lin, Dusit Niyato, Hwee-Pink Tan Apr 2014

Research Collection School Of Computing and Information Systems

Wireless sensor networks (WSNs) monitor dynamic environments that change rapidly over time. This dynamic behavior is either caused by external factors or initiated by the system designers themselves. To adapt to such conditions, sensor networks often adopt machine learning techniques to eliminate the need for unnecessary redesign. Machine learning also inspires many practical solutions that maximize resource utilization and prolong the lifespan of the network. In this paper, we present an extensive literature review over the period 2002-2013 of machine learning methods that were used to address common issues in WSNs. The advantages and disadvantages of each proposed algorithm are …


Stfu Noob!: Predicting Crowdsourced Decisions On Toxic Behavior In Online Games, Jeremy Blackburn, Haewoon Kwak Apr 2014

Research Collection School Of Computing and Information Systems

One problem facing players of competitive games is negative, or toxic, behavior. League of Legends, the largest eSport game, uses a crowdsourcing platform called the Tribunal to judge whether a reported toxic player should be punished or not. The Tribunal is a two-stage system requiring reports from those players that directly observe toxic behavior, and human experts that review aggregated reports. While this system has successfully dealt with the vague nature of toxic behavior by majority rule based on many votes, it naturally requires tremendous cost, time, and human effort. In this paper, we propose a supervised learning approach …


On Predicting User Affiliations Using Social Features In Online Social Networks, Minh Thap Nguyen Mar 2014

Dissertations and Theses Collection (Open Access)

User profiling, such as user affiliation prediction in online social networks, is a challenging task with many important applications in targeted marketing and personalized recommendation. The research task here is to predict user affiliation attributes that suggest user participation in different social groups.


Retrieval-Based Face Annotation By Weak Label Regularized Local Coordinate Coding, Dayong Wang, Steven C. H. Hoi, Ying He, Jianke Zhu, Mei Tao, Jiebo Luo Mar 2014

Research Collection School Of Computing and Information Systems

Auto face annotation, which aims to detect human faces from a facial image and assign them proper human names, is a fundamental research problem and beneficial to many real-world applications. In this work, we address this problem by investigating a retrieval-based annotation scheme of mining massive web facial images that are freely available over the Internet. In particular, given a facial image, we first retrieve the top n similar instances from a large-scale web facial image database using content-based image retrieval techniques, and then use their labels for auto annotation. Such a scheme has two major challenges: 1) how to …


The Role Of Prototype Learning In Hierarchical Models Of Vision, Michael David Thomure Feb 2014

Dissertations and Theses

I conduct a study of learning in HMAX-like models, which are hierarchical models of visual processing in biological vision systems. Such models compute a new representation for an image based on the similarity of image sub-parts to a number of specific patterns, called prototypes. Despite being a central piece of the overall model, the issue of choosing the best prototypes for a given task is still an open problem. I study this problem, and consider the best way to increase task performance while decreasing the computational costs of the model. This work broadens our understanding of HMAX and related hierarchical …


Svmaud: Using Textual Information To Predict The Audience Level Of Written Works Using Support Vector Machines, Todd Will Jan 2014

Dissertations

Information retrieval systems should seek to match resources with the reading ability of the individual user; similarly, an author must choose vocabulary and sentence structures appropriate for his or her audience. Traditional readability formulas, including the popular Flesch-Kincaid Reading Age and the Dale-Chall Reading Ease Score, rely on numerical representations of text characteristics, including syllable counts and sentence lengths, to suggest audience level of resources. However, the author’s chosen vocabulary, sentence structure, and even the page formatting can alter the predicted audience level by several levels, especially in the case of digital library resources. For these reasons, the performance of …
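As a hedged illustration of how such readability formulas turn text statistics into a score, the sketch below computes the two standard Flesch formulas (Flesch Reading Ease and Flesch-Kincaid Grade Level) from raw counts. The abstract's formula names differ slightly, so which variants the dissertation actually uses is an assumption here, and the function names are hypothetical. Counts are passed in directly because syllable counting is itself a heuristic.

```python
def flesch_reading_ease(words, sentences, syllables):
    """Flesch Reading Ease: higher scores indicate easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words, sentences, syllables):
    """Flesch-Kincaid Grade Level: approximate U.S. school grade."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


if __name__ == "__main__":
    # Example counts (made up for illustration): 100 words, 5 sentences, 150 syllables.
    print(flesch_reading_ease(100, 5, 150))
    print(flesch_kincaid_grade(100, 5, 150))
```

Both scores depend only on average sentence length and average syllables per word, which is consistent with the abstract's point that changes in vocabulary or sentence structure alone can shift the predicted audience level.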


Complementary Layered Learning, Sean Mondesire Jan 2014

Electronic Theses and Dissertations

Layered learning is a machine learning paradigm used to develop autonomous robotic-based agents by decomposing a complex task into simpler subtasks and learning each sequentially. Although the paradigm continues to have success in multiple domains, performance can be unexpectedly unsatisfactory. Using Boolean-logic problems and autonomous agent navigation, we show poor performance is due to the learner forgetting how to perform earlier learned subtasks too quickly (favoring plasticity) or having difficulty learning new things (favoring stability). We demonstrate that this imbalance can hinder learning so that task performance is no better than that of a suboptimal learning technique, monolithic learning, which …


Sketchart: A Pen-Based Tool For Chart Generation And Interaction., Andres Vargas Gonzalez Jan 2014

Electronic Theses and Dissertations

It has been shown that representing data with the right visualization increases the understanding of qualitative and quantitative information encoded in documents. However, current tools for generating such visualizations involve the use of traditional WIMP techniques, which perhaps makes free interaction and direct manipulation of the content harder. In this thesis, we present a pen-based prototype for data visualization using 10 different types of bar based charts. The prototype lets users sketch a chart and interact with the information once the drawing is identified. The prototype's user interface consists of an area to sketch and touch based elements that will …


Remote Sensing With Computational Intelligence Modelling For Monitoring The Ecosystem State And Hydraulic Pattern In A Constructed Wetland, Golam Mohiuddin Jan 2014

Electronic Theses and Dissertations

Monitoring a heterogeneous aquatic environment such as the Stormwater Treatment Areas (STAs) located at the northeast of the Everglades is extremely important in understanding the land processes of the constructed wetland in its capacity to remove nutrients. Direct monitoring and measurement of ecosystem evolution and changing velocities at every single part of the STA are not always feasible. An integrated remote sensing, monitoring, and modeling technique can be a state-of-the-art tool to estimate the spatial and temporal distributions of flow velocity regimes and ecological functioning in such dynamic aquatic environments. In this presentation, comparison between four computational intelligence models including Extreme …