Open Access. Powered by Scholars. Published by Universities.®

Digital Commons Network

Physical Sciences and Mathematics

2017

Machine Learning

Articles 1 - 30 of 50

Full-Text Articles in Entire DC Network

Knowledge Driven Approaches And Machine Learning Improve The Identification Of Clinically Relevant Somatic Mutations In Cancer Genomics, Benjamin John Ainscough Dec 2017

Arts & Sciences Electronic Theses and Dissertations

For cancer genomics to fully expand its utility from research discovery to clinical adoption, somatic variant detection pipelines must be optimized and standardized to ensure identification of clinically relevant mutations and to reduce laborious and error-prone post-processing steps. To address the need for improved catalogues of clinically and biologically important somatic mutations, we developed DoCM, a Database of Curated Mutations in Cancer (http://docm.info), as described in Chapter 2. DoCM is an open source, openly licensed resource to enable the cancer research community to aggregate, store and track biologically and clinically important cancer variants. DoCM is currently comprised of 1,364 variants …


Threshold Free Detection Of Elliptical Landmarks Using Machine Learning, Lifan Zhang Dec 2017

Theses and Dissertations

Elliptical shape detection is widely used in practical applications. Nearly all classical ellipse detection algorithms require some form of threshold, which can be a major cause of detection failure, especially in the challenging case of Moire Phase Tracking (MPT) target images. To meet the challenge, a threshold free detection algorithm for elliptical landmarks is proposed in this thesis. The proposed Aligned Gradient and Unaligned Gradient (AGUG) algorithm is a Support Vector Machine (SVM)-based classification algorithm; original features are extracted from the gradient information corresponding to the sampled pixels. With proper selection of features, the proposed algorithm has a high accuracy …


Developing Leading And Lagging Indicators To Enhance Equipment Reliability In A Lean System, Dhanush Agara Mallesh Dec 2017

Masters Theses

With increasing complexity in equipment, failure rates are becoming a critical metric due to unplanned maintenance in production environments. Unplanned maintenance in manufacturing processes causes downtime and decreases the reliability of equipment. Equipment failures have resulted in lost revenue for organizations, encouraging maintenance practitioners to look for ways to convert unplanned maintenance into planned maintenance. Efficient failure prediction models are being developed to learn about failures in advance. With this information, predicted failures can be addressed early, reducing downtime in the system and improving throughput.

The goal of this thesis is to predict …


Graph-Based Latent Embedding, Annotation And Representation Learning In Neural Networks For Semi-Supervised And Unsupervised Settings, Ismail Ozsel Kilinc Nov 2017

USF Tampa Graduate Theses and Dissertations

Machine learning has been immensely successful in supervised learning with outstanding examples in major industrial applications such as voice and image recognition. Following these developments, the most recent research has now begun to focus primarily on algorithms which can exploit very large sets of unlabeled examples to reduce the amount of manually labeled data required for existing models to perform well. In this dissertation, we propose graph-based latent embedding/annotation/representation learning techniques in neural networks tailored for semi-supervised and unsupervised learning problems. Specifically, we propose a novel regularization technique called Graph-based Activity Regularization (GAR) and a novel output layer modification called …


Modular Mechanistic Networks: On Bridging Mechanistic And Phenomenological Models With Deep Neural Networks In Natural Language Processing, Simon Dobnik, John D. Kelleher Nov 2017

Books/Book chapters

Natural language processing (NLP) can be done using either top-down (theory-driven) or bottom-up (data-driven) approaches, which we call mechanistic and phenomenological respectively. The approaches are frequently considered to stand in opposition to each other. Examining some recent approaches in deep learning, we argue that deep neural networks incorporate both perspectives and, furthermore, that leveraging this aspect of deep learning may help in solving complex problems within language technology, such as modelling language and perception in the domain of spatial cognition.


AdaFT: A Resource-Efficient Framework For Adaptive Fault-Tolerance In Cyber-Physical Systems, Ye Xu Nov 2017

Doctoral Dissertations

Cyber-physical systems frequently have to use massive redundancy to meet application requirements for high reliability. While such redundancy is required, it can be activated adaptively, based on the current state of the controlled plant. Most of the time the physical plant is in a state that allows for a lower level of fault-tolerance. Avoiding the continuous deployment of massive fault-tolerance will greatly reduce the workload of CPSs. In this dissertation, we demonstrate a software simulation framework (AdaFT) that can automatically generate the sub-spaces within which our adaptive fault-tolerance can be applied. We also show the theoretical benefits of AdaFT, and …


Predicting Mental Conditions Based On "History Of Present Illness" In Psychiatric Notes With Deep Neural Networks, Tung Tran, Ramakanth Kavuluru Nov 2017

Computer Science Faculty Publications

Background—Applications of natural language processing to mental health notes are not common given the sensitive nature of the associated narratives. The CEGS N-GRID 2016 Shared Task in Clinical Natural Language Processing (NLP) changed this scenario by providing the first set of neuropsychiatric notes to participants. This study summarizes our efforts and results in proposing a novel data use case for this dataset as part of the third track in this shared task.

Objective—We explore the feasibility and effectiveness of predicting a set of common mental conditions a patient has based on the short textual description of the patient’s history …


Tell Me Why? Tell Me More! Explaining Predictions, Iterated Learning Bias, And Counter-Polarization In Big Data Discovery Models, Olfa Nasraoui Oct 2017

Commonwealth Computational Summit

Outline:

What can go Wrong in Machine Learning?

  • Unfair Machine Learning
  • Iterated Bias & Polarization
  • Black Box models

Tell me more: Counter-Polarization

Tell me why: Explanation Generation


Automatic Music Transcription With Convolutional Neural Networks Using Intuitive Filter Shapes, Jonathan Sleep Oct 2017

Master's Theses

This thesis explores the challenge of automatic music transcription with a combination of digital signal processing and machine learning methods. Automatic music transcription is important for musicians who can't do it themselves or find it tedious. We start with an existing model, designed by Sigtia, Benetos and Dixon, and develop it in a number of original ways. We find that by using convolutional neural networks with filter shapes more tailored for spectrogram data, we see better and faster transcription results when evaluating the new model on a dataset of classical piano music. We also find that employing better practices shows …


Lung CT Radiomics: An Overview Of Using Images As Data, Samuel Hunt Hawkins Sep 2017

USF Tampa Graduate Theses and Dissertations

Lung cancer is the leading cause of cancer-related death in the United States and worldwide. Early detection of lung cancer can help improve patient outcomes, and survival prediction can inform plans of treatment. By extracting quantitative features from computed tomography scans of lung cancer, predictive models can be built that can achieve both early detection and survival prediction. To build these predictive models, first a detected lung nodule is segmented, then image features are extracted, and finally a model can be built utilizing image features to make predictions. These predictions can help radiologists improve cancer care.

Building predictive models based …


Rating By Ranking: An Improved Scale For Judgement-Based Labels, Jack O'Neill, Sarah Jane Delany, Brian Mac Namee Aug 2017

Conference papers

Labels representing value judgements are commonly elicited using an interval scale of absolute values. Data collected in such a manner is not always reliable. Psychologists have long recognized a number of biases to which many human raters are prone, and which result in disagreement among raters as to the true gold standard rating of any particular object. We hypothesize that the issues arising from rater bias may be mitigated by treating the data received as an ordered set of preferences rather than a collection of absolute values. We experiment on real-world and artificially generated data, finding that treating label ratings …
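
As a rough illustration of treating ratings as an ordered set of preferences rather than absolute values, the sketch below converts each rater's scores into within-rater ranks. The function name `to_ranks` and the toy scores are hypothetical and do not come from the paper; averaging the ranks of tied items is an assumption made here for simplicity.

```python
def to_ranks(scores):
    """Return 1-based ranks (1 = lowest score); tied scores share the average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next item has the same score (a tie).
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        avg_rank = (pos + end) / 2 + 1  # mean of the 1-based positions pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg_rank
        pos = end + 1
    return ranks


# Hypothetical raters scoring the same four items on a 1-10 scale.
lenient = [7, 9, 8, 10]  # tends to score high
harsh = [2, 5, 3, 6]     # tends to score low

print(to_ranks(lenient))
print(to_ranks(harsh))
```

The two raters disagree on every absolute score, yet their rank vectors are identical, which is the kind of agreement an ordered-preference view can recover.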


Information Theoretic Study Of Gaussian Graphical Models And Their Applications, Ali Moharrer Aug 2017

LSU Doctoral Dissertations

In many problems, we deal with characterizing the behavior of a complex stochastic system or its response to a set of particular inputs. Such problems span several topics, such as machine learning, complex networks (e.g., social or communication networks), and biology. Probabilistic graphical models (PGMs) are powerful tools that offer a compact modeling of complex systems. They are designed to capture the random behavior, i.e., the joint distribution of the system to the best possible accuracy. Our goal is to study certain algebraic and topological properties of a special class of graphical models, known as Gaussian graphs. First, …


Machine Learning Based Protein Sequence To (Un)Structure Mapping And Interaction Prediction, Sumaiya Iqbal Aug 2017

University of New Orleans Theses and Dissertations

Proteins are the fundamental macromolecules within a cell that carry out most biological functions. The computational study of protein structure and function, using machine learning and data analytics, is essential to advancing life-science research, given the fast-growing volume of biological data and the complexity involved in analyzing it to discover meaningful insights. Mapping of a protein’s primary sequence is not limited to its structure; we extend it to its disordered component, known as Intrinsically Disordered Proteins or Regions in proteins (IDPs/IDRs), and hence the involved dynamics, which help us explain complex interaction within a cell that …


Predictive Power And Validity Of Connectome Predictive Modeling: A Replication And Extension, Michael Wang, Joaquin Goni, Enrico Amico Aug 2017

The Summer Undergraduate Research Fellowship (SURF) Symposium

Neuroimaging, particularly functional magnetic resonance imaging (fMRI), is a rapidly growing research area and has applications ranging from disease classification to understanding neural development. With new advancements in imaging technology, researchers must employ new techniques to accommodate the influx of high resolution data sets. Here, we replicate a new technique: connectome-based predictive modeling (CPM), which constructs a linear predictive model of brain connectivity and behavior. CPM’s advantages over classic machine learning techniques include its relative ease of implementation and transparency compared to “black box” opaqueness and complexity. Is this method efficient, powerful, and reliable in the prediction of behavioral measures …
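
The abstract describes CPM only as a linear predictive model of brain connectivity and behavior. The sketch below is one simplified reading of that idea, not the authors' pipeline: it keeps the connectivity edges whose Pearson correlation with behavior exceeds a threshold, sums them into one score per subject, and fits a one-variable least-squares line. The function names, the threshold value, and the toy data are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input has zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def fit_cpm(edges, behavior, r_threshold=0.5):
    """edges: one list of edge strengths per subject; behavior: one score per subject."""
    n_edges = len(edges[0])
    # Keep edges that correlate positively with behavior (threshold is an assumption).
    selected = [j for j in range(n_edges)
                if pearson([subj[j] for subj in edges], behavior) > r_threshold]
    if not selected:
        raise ValueError("no edge passed the correlation threshold")
    # One summary score per subject: the sum of the selected edges.
    summary = [sum(subj[j] for j in selected) for subj in edges]
    n = len(summary)
    ms, mb = sum(summary) / n, sum(behavior) / n
    sxx = sum((s - ms) ** 2 for s in summary)
    if sxx == 0:
        raise ValueError("summary scores are constant; cannot fit a line")
    slope = sum((s - ms) * (b - mb) for s, b in zip(summary, behavior)) / sxx
    intercept = mb - slope * ms
    return selected, slope, intercept


def predict(edge_row, model):
    selected, slope, intercept = model
    return intercept + slope * sum(edge_row[j] for j in selected)


# Toy data: 4 subjects, 3 edges. Edges 0 and 2 track behavior; edge 1 does not.
toy_edges = [[0.1, 0.5, 0.2], [0.2, 0.1, 0.4], [0.3, 0.4, 0.6], [0.4, 0.2, 0.8]]
toy_behavior = [1.0, 2.0, 3.0, 4.0]
model = fit_cpm(toy_edges, toy_behavior)
print(model[0], predict([0.5, 0.3, 1.0], model))
```

Because the whole model reduces to a list of selected edges plus a slope and an intercept, it can be inspected directly, which is the transparency the abstract contrasts with "black box" methods.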


Operating System Identification By IPv6 Communication Using Machine Learning Ensembles, Adrian Ordorica Aug 2017

Graduate Theses and Dissertations

Operating system (OS) identification tools, sometimes called fingerprinting tools, are essential for the reconnaissance phase of penetration testing. While OS identification is traditionally performed by passive or active tools that use fingerprint databases, very little work has focused on using machine learning techniques. Moreover, significantly more work has focused on IPv4 than IPv6. We introduce a collaborative neural network ensemble that uses a unique voting system and a random forest ensemble to deliver accurate predictions. This approach uses IPv6 features as well as packet metadata features for OS identification. Our experiment shows that our approach is valid and we achieve …


Semantic Visualization For Short Texts With Word Embeddings, Van Minh Tuan Le, Hady W. Lauw Aug 2017

Research Collection School Of Computing and Information Systems

Semantic visualization integrates topic modeling and visualization, such that every document is associated with a topic distribution as well as visualization coordinates on a low-dimensional Euclidean space. We address the problem of semantic visualization for short texts. Such documents are increasingly common, including tweets, search snippets, news headlines, or status updates. Due to their short lengths, it is difficult to model semantics as the word co-occurrences in such a corpus are very sparse. Our approach is to incorporate auxiliary information, such as word embeddings from a larger corpus, to supplement the lack of co-occurrences. This requires the development of a …


Data Analysis Methods Using Persistence Diagrams, Andrew Marchese Aug 2017

Doctoral Dissertations

In recent years, persistent homology techniques have been used to study data and dynamical systems. Using these techniques, information about the shape and geometry of the data and systems leads to important information regarding the periodicity, bistability, and chaos of the underlying systems. In this thesis, we study all aspects of the application of persistent homology to data analysis. In particular, we introduce a new distance on the space of persistence diagrams, and show that it is useful in detecting changes in geometry and topology, which is essential for the supervised learning problem. Moreover, we introduce a clustering framework directly …


Recommendation Vs Sentiment Analysis: A Text-Driven Latent Factor Model For Rating Prediction With Cold-Start Awareness, Kaisong Song, Wei Gao, Shi Feng Feng, Daling Wang, Kam-Fai Wong, Chengqi Zhang Aug 2017

Research Collection School Of Computing and Information Systems

Review rating prediction is an important research topic. The problem has been approached from either the perspective of recommender systems (RS) or that of sentiment analysis (SA). Recent SA research using deep neural networks (DNNs) has realized the importance of user and product interaction for better interpreting the sentiment of reviews. However, the complexity of DNN models in terms of the scale of parameters is very high, and the performance is not always satisfactory, especially when user-product interaction is sparse. In this paper, we propose a simple, extensible RS-based model, called Text-driven Latent Factor Model (TLFM), to capture the semantics of …


Bayesian Methods And Machine Learning For Processing Text And Image Data, Yingying Gu Aug 2017

Theses and Dissertations

Classification/clustering is an important class of unstructured data processing problems. Classification (supervised, semi-supervised and unsupervised) aims to discover clusters and group similar data into categories for information organization and knowledge discovery. My work focuses on using Bayesian methods and machine learning techniques to classify free-text and image data, and addresses how to overcome the limitations of traditional methods. The Bayesian approach makes it possible to use more variables (numerical or categorical) and to estimate probabilities instead of relying on explicit rules, which is beneficial in ambiguous cases. The MAP (maximum a posteriori) estimation is used to …


Basket-Sensitive Personalized Item Recommendation, Duc Trong Le, Hady W. Lauw, Yuan Fang Aug 2017

Research Collection School Of Computing and Information Systems

Personalized item recommendation is useful in narrowing down the list of options provided to a user. In this paper, we address the problem scenario where the user is currently holding a basket of items, and the task is to recommend an item to be added to the basket. Here, we assume that items currently in a basket share some association based on an underlying latent need, e.g., ingredients to prepare some dish, spare parts of some device. Thus, it is important that a recommended item is relevant not only to the user, but also to the existing items in the …


Encoding And Recall Of Spatio-Temporal Episodic Memory In Real Time, Poo-Hee Chang, Ah-Hwee Tan Aug 2017

Research Collection School Of Computing and Information Systems

Episodic memory enables a cognitive system to improve its performance by reflecting upon past events. In this paper, we propose a computational model called STEM for encoding and recall of episodic events together with the associated contextual information in real time. Based on a class of self-organizing neural networks, STEM is designed to learn memory chunks or cognitive nodes, each encoding a set of co-occurring multi-modal activity patterns across multiple pattern channels. We present algorithms for recall of events based on partial and inexact input patterns. Our empirical results based on a public domain data set show that STEM displays …


Object Detection Meets Knowledge Graphs, Yuan Fang, Kingsley Kuan, Jie Lin, Cheston Tan, Vijay Chandrasekhar Aug 2017

Research Collection School Of Computing and Information Systems

Object detection in images is a crucial task in computer vision, with important applications ranging from security surveillance to autonomous vehicles. Existing state-of-the-art algorithms, including deep neural networks, only focus on utilizing features within an image itself, largely neglecting the vast amount of background knowledge about the real world. In this paper, we propose a novel framework of knowledge-aware object detection, which enables the integration of external knowledge such as knowledge graphs into any object detection algorithm. The framework employs the notion of semantic consistency to quantify and generalize knowledge, which improves object detection through a re-optimization process to achieve …


Method For Enabling Causal Inference In Relational Domains, David Arbour Jul 2017

Doctoral Dissertations

The analysis of data from complex systems is quickly becoming a fundamental aspect of modern business, government, and science. The field of causal learning is concerned with developing a set of statistical methods that allow practitioners to make inferences about unseen interventions. This field has seen significant advances in recent years. However, the vast majority of this work assumes that data instances are independent, whereas many systems are best described in terms of interconnected instances, i.e. relational systems. This discrepancy prevents causal inference techniques from being reliably applied in many real-world settings.

In this thesis, I will present three contributions to …


Signet: A Neural Network Architecture For Predicting Protein-Protein Interactions, Muhammad S. Ahmed Jul 2017

Electronic Thesis and Dissertation Repository

The study of protein-protein interactions (PPI) is critically important within the field of Molecular Biology, as proteins facilitate key organismal functions including the maintenance of both cellular structure and function. Current experimental methods for elucidating PPIs are greatly hindered by large operating costs, lengthy wait times, as well as low accuracy. The recent development of computational PPI predicting techniques has worked to address many of these issues. Despite this, many of these methods utilize over-engineered features and naive learning algorithms. With the recent advances in Machine Learning and Artificial Intelligence, we attempt to view this problem through a novel, deep …


How Artificial Intelligence Is Impacting Manufacturing Industry, Deepak Srinivasan, Maitreyi Ramesh Swaroop, Balaji Rajaram, Sri Krishan Iyer Jul 2017

Research Collection School Of Computing and Information Systems

In this survey, we study the impact of Artificial Intelligence (AI) on the manufacturing sector. AI methods can be utilized to make new thoughts in several ways: by delivering novel mixes of well-known thoughts; by investigating the capability of theoretical spaces; and by making changes that empower the era of unexplored thoughts. AI will have less trouble in displaying the era of new thoughts than in automating their assessment. We describe the advances that have been made in AI in the manufacturing industry. We close with how to overcome the issues in this area.


Image-Based Identification Of Cell Cultures By Machine Learning, Oluleye Hezekiah Babatunde Jun 2017

Oluleye Babatunde

Biomedical laboratories often use different cell types in the same assay or the same cell type in different assays. One cell type can become contaminated by another, or cells can be mis-identified, giving poor results. Addressing these issues by DNA analyses can be time-consuming, labor intensive or costly to implement. Here we uniquely employ Legendre moments (LM), Zernike moments (ZM), circularity and a genetic algorithm (GA) to advance a computer-based vision system, and we task it to identify four cell types used in virology: HeLa, Vero, BHK and PC3. By employing a k-nearest neighbor (kNN), multilayer perceptron (MLP), Convolutional Neural …


Real-Time Classification Of Biomedical Signals, Parkinson’s Analytical Model, Abolfazl Saghafi Jun 2017

USF Tampa Graduate Theses and Dissertations

The reach of technological innovation continues to grow, changing all industries as it evolves. In healthcare, technology is increasingly playing a role in almost all processes, from patient registration to data monitoring, from lab tests to self-care tools. The increase in the amount and diversity of generated clinical data requires development of new technologies and procedures capable of integrating and analyzing the large volume of generated information as well as providing support in its interpretation.

To that end, this dissertation focuses on the analysis and processing of biomedical signals, specifically brain and heart signals, using advanced machine learning techniques. That is, the …


Travel Mode Identification With Smartphone Sensors, Xing Su Jun 2017

Dissertations, Theses, and Capstone Projects

Personal trips in a modern urban society typically involve multiple travel modes. Recognizing a traveller's transportation mode is not only critical to personal context-awareness in related applications, but also essential to urban traffic operations, transportation planning, and facility design. While the state of the art in travel mode recognition mainly relies on large-scale infrastructure-based fixed sensors or on individuals' GPS devices, the emergence of the smartphone provides a promising alternative with its ever-growing computing, networking, and sensing powers. In this thesis, we propose new algorithms for travel mode identification using smartphone sensors. The prototype system is built upon the latest …


Understanding Music Track Popularity In A Social Network, Jing Ren, Robert J. Kauffman Jun 2017

Research Collection School Of Computing and Information Systems

Thousands of music tracks are uploaded to the Internet every day through websites and social networks that focus on music. While some content has been popular for decades, some tracks that have just been released have been ignored. What makes a music track popular? Can the duration of a music track’s popularity be explained and predicted? By analysing data on the performance of a music track on the ranking charts, coupled with the creation of machine-generated music semantics constructs and a variety of other track, artist and market descriptors, this research tests a model to assess how track popularity and …


Credit Scoring Using Logistic Regression, Ansen Mathew May 2017

Master's Projects

This report presents an approach to predict the credit scores of customers using the Logistic Regression machine learning algorithm. The research objective of this project is to perform a comparative study between feature selection and feature extraction, against the same dataset using the Logistic Regression machine learning algorithm. For feature selection, we have used Stepwise Logistic Regression. For feature extraction, we have used Singular Value Decomposition (SVD) and Weighted Singular Value Decomposition (SVD). In order to test the accuracy obtained using feature selection and feature extraction, we used a public credit dataset having 11 features and 150,000 records. After performing …
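
For readers unfamiliar with the underlying classifier, the following is a minimal sketch of logistic regression trained by batch gradient descent. It does not reproduce the report's stepwise feature selection or SVD-based feature extraction, and the six-row toy dataset, feature meanings, learning rate, and epoch count are all hypothetical stand-ins for the public credit dataset mentioned above.

```python
from math import exp


def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))


def train_logreg(X, y, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the mean log-loss."""
    n, n_features = len(X), len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            # Prediction error for this record drives the gradient.
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(x, w, b):
    """Estimated probability that record x belongs to class 1."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Hypothetical records: [debt ratio, number of late payments]; label 1 = default.
X_toy = [[0.1, 0], [0.2, 0], [0.3, 1], [0.8, 2], [0.9, 3], [0.7, 2]]
y_toy = [0, 0, 0, 1, 1, 1]

w, b = train_logreg(X_toy, y_toy)
print(predict_proba([0.9, 3], w, b), predict_proba([0.1, 0], w, b))
```

In a comparison like the one the report describes, the same training routine would be run once on a selected subset of the original columns and once on extracted components, with accuracy measured on held-out records in both cases.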