Open Access. Powered by Scholars. Published by Universities.®
Physical Sciences and Mathematics Commons™
Articles 1 - 23 of 23
Full-Text Articles in Physical Sciences and Mathematics
The Role Of Machine Learning And Network Analyses In Understanding Microbial Composition In An Experimental Prairie, Ali Eastman Oku
Graduate Research Theses & Dissertations
Machine learning and network analyses are powerful modern tools that can process and map out connections among large amounts of ecological data from complex environmental communities. Random forests, an ensemble machine learning algorithm, are particularly powerful as they can capture complex patterns in data while remaining easily interpretable. These tools are especially useful in experimental settings where different types of data are collected. The aim of this study was to demonstrate the utility of machine learning models and network analyses for analyzing diverse ecological data from dynamic plant-soil microbial communities in a prairie ecosystem. Our experimental system is an experimental prairie …
Development Of The Assessment Of Clinical Prediction Model Transportability (APT) Checklist, Sean Chonghwan Yu
McKelvey School of Engineering Theses & Dissertations
Clinical Prediction Models (CPMs) have long been used for Clinical Decision Support (CDS), initially based on simple clinical scoring systems and increasingly based on complex machine learning models relying on large-scale Electronic Health Record (EHR) data. External implementation, or the application of CPMs at sites where they were not originally developed, is valuable as it reduces the need for redundant de novo CPM development, enables CPM usage by low resource organizations, facilitates external validation studies, and encourages collaborative development of CPMs. Further, adoption of externally developed CPMs has been facilitated by ongoing interoperability efforts in standards, policy, and …
Better Understanding Genomic Architecture With The Use Of Applied Statistics And Explainable Artificial Intelligence, Jonathon C. Romero
Doctoral Dissertations
With the continuous improvements in biological data collection, new techniques are needed to better understand the complex relationships in genomic and other biological data sets. Explainable Artificial Intelligence (X-AI) techniques like Iterative Random Forest (iRF) excel at finding interactions within data, such as genomic epistasis. Here, the introduction of new methods to mine for these complex interactions is shown in a variety of scenarios. The application of iRF as a method for Genomic Wide Epistasis Studies shows that the method is robust in finding interacting sets of features in synthetic data, without requiring the exponentially increasing computation time of many …
Machine Learning Analysis Of Single Nucleotide Polymorphism (SNP) Data To Predict Bone Mineral Density In African American Women, Erick Githua Wakayu
UNLV Theses, Dissertations, Professional Papers, and Capstones
Osteoporosis is a debilitating disease in which an individual’s bones weaken, making them fragile and more susceptible to fracture. While previous studies have found it commonly among postmenopausal Caucasian and Asian women, those of African descent (African American/Black) have largely been ignored in osteoporosis studies, especially Genome Wide Association Studies (GWAS). From GWA studies, we gain access to single nucleotide polymorphisms (SNPs) that may contribute to certain illnesses, such as osteoporosis. With low Bone Mineral Density (BMD) being one of the primary markers of potential osteoporosis, it is prudent that proper research is …
Machine Learning Models For Deciphering Regulatory Mechanisms And Morphological Variations In Cancer, Saman Farahmand
Graduate Doctoral Dissertations
The exponential growth of multi-omics biological datasets is driving a paradigm shift in fundamental biological research. In recent years, imaging and transcriptomics datasets have been increasingly incorporated into biological studies, pushing biology further into the domain of data-intensive sciences. New approaches and tools from statistics, computer science, and data engineering are profoundly influencing biological research. Harnessing this ever-growing deluge of multi-omics biological data requires the development of novel and creative computational approaches. In parallel, fundamental research in data sciences and Artificial Intelligence (AI) has advanced tremendously, allowing the scientific community to generate a massive amount of knowledge from data. Advances …
Mapping Transcription Factor Networks And Elucidating Their Biological Determinants, Yiming Kang
McKelvey School of Engineering Theses & Dissertations
A central goal in systems biology is to accurately map the transcription factor (TF) network of a cell. Such a network map is a key component for many downstream applications, from developmental biology to transcriptome engineering, and from disease modeling to drug discovery. Building a reliable network map requires a wide range of data sources including TF binding locations and gene expression data after direct TF perturbations. However, we are facing two roadblocks. First, rich resources are available only for a few well-studied systems and cannot be easily replicated for new organisms or cell types. Second, when TF binding and …
Neural Network Supervised And Reinforcement Learning For Neurological, Diagnostic, And Modeling Problems, Donald Wunsch III
Masters Theses
As the medical world becomes increasingly intertwined with the tech sphere, machine learning on medical datasets and mathematical models becomes an attractive application. This research looks at the predictive capabilities of neural networks and other machine learning algorithms, and assesses the validity of several feature selection strategies to reduce the negative effects of high dataset dimensionality. Our results indicate that several feature selection methods can maintain high validation and test accuracy on classification tasks, with neural networks performing best, for both single class and multi-class classification applications. This research also evaluates a proof-of-concept application of a deep-Q-learning network (DQN) to …
Machine Learning Applications For Drug Repurposing, Hansaim Lim
Dissertations, Theses, and Capstone Projects
The cost of bringing a drug to market is astounding and the failure rate is intimidating. Drug discovery has had limited success under the conventional reductionist one-drug-one-gene-one-disease paradigm, where a single disease-associated gene is identified and a molecular binder to the specific target is subsequently designed. Under this simplistic paradigm of drug discovery, a drug molecule is assumed to interact only with the intended on-target. However, small molecular drugs often interact with multiple targets, and those off-target interactions are not considered under the conventional paradigm. As a result, drug-induced side effects and adverse reactions are often neglected …
Machine Learning With Digital Signal Processing For Rapid And Accurate Alignment-Free Genome Analysis: From Methodological Design To A COVID-19 Case Study, Gurjit Singh Randhawa
Electronic Thesis and Dissertation Repository
In the field of bioinformatics, taxonomic classification is the scientific practice of identifying, naming, and grouping of organisms based on their similarities and differences. The problem of taxonomic classification is of immense importance considering that nearly 86% of existing species on Earth and 91% of marine species remain unclassified. Due to the magnitude of the datasets, the need exists for an approach and software tool that is scalable enough to handle large datasets and can be used for rapid sequence comparison and analysis. We propose ML-DSP, a stand-alone alignment-free software tool that uses Machine Learning and Digital Signal Processing to …
Deriving Statistical Inference From The Application Of Artificial Neural Networks To Clinical Metabolomics Data, Kevin M. Mendez
Theses: Doctorates and Masters
Metabolomics data are complex with a high degree of multicollinearity. As such, multivariate linear projection methods, such as partial least squares discriminant analysis (PLS-DA), have become standard. Non-linear projection methods, typified by Artificial Neural Networks (ANNs), may be more appropriate to model potential nonlinear latent covariance; however, they are not widely used due to difficulty in deriving statistical inference, and thus biological interpretation. Herein, we illustrate the utility of ANNs for clinical metabolomics using publicly available data sets and develop an open framework for deriving and visualising statistical inference from ANNs equivalent to standard PLS-DA methods.
Enhancing Timeliness Of Drug Overdose Mortality Surveillance: A Machine Learning Approach, Patrick J. Ward, Peter J. Rock, Svetla Slavova, April M. Young, Terry L. Bunn, Ramakanth Kavuluru
Kentucky Injury Prevention and Research Center Faculty Publications
BACKGROUND: Timely data is key to effective public health responses to epidemics. Drug overdose deaths are identified in surveillance systems through ICD-10 codes present on death certificates. ICD-10 coding takes time, but free-text information is available on death certificates prior to ICD-10 coding. The objective of this study was to develop a machine learning method to classify free-text death certificates as drug overdoses to provide faster drug overdose mortality surveillance.
METHODS: Using 2017–2018 Kentucky death certificate data, free-text fields were tokenized and features were created from these tokens using natural language processing (NLP). Word, bigram, and trigram features were created …
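A minimal sketch of the feature step described in the methods above, assuming a simple regular-expression tokenizer. The function names `tokenize` and `ngram_features` and the example text are hypothetical illustrations, not taken from the study, whose actual tokenization and feature pipeline is not given in this listing.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens (assumed tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def ngram_features(text, max_n=3):
    """Count word (n=1), bigram (n=2), and trigram (n=3) features in one free-text field."""
    tokens = tokenize(text)
    counts = Counter()
    for n in range(1, max_n + 1):
        # Slide a window of n tokens across the token list.
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


# Hypothetical cause-of-death text, invented for illustration only.
example = "Acute fentanyl intoxication"
features = ngram_features(example)
```

Count vectors of this kind can then be passed to any standard text classifier; which classifier the study used is not stated in the truncated abstract.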
Relation Prediction Over Biomedical Knowledge Bases For Drug Repositioning, Mehmet Bakal
Theses and Dissertations--Computer Science
Identifying new potential treatment options for medical conditions that cause human disease burden is a central task of biomedical research. Since not all candidate drugs can be tested with animal and clinical trials, in vitro approaches are first attempted to identify promising candidates. Likewise, identifying other essential relations (e.g., causation, prevention) between biomedical entities is also critical to understand biomedical processes. Hence, it is crucial to develop automated relation prediction systems that can yield plausible biomedical relations to expedite the discovery process. In this dissertation, we demonstrate three approaches to predict treatment relations between biomedical entities for the drug repositioning task …
Computational Modelling Of Human Transcriptional Regulation By An Information Theory-Based Approach, Ruipeng Lu
Electronic Thesis and Dissertation Repository
ChIP-seq experiments can identify the genome-wide binding site motifs of a transcription factor (TF) and determine its sequence specificity. Multiple algorithms were developed to derive TF binding site (TFBS) motifs from ChIP-seq data, including the entropy minimization-based Bipad that can derive both contiguous and bipartite motifs. Prior studies applying these algorithms to ChIP-seq data only analyzed a small number of top peaks with the highest signal strengths, biasing their resultant position weight matrices (PWMs) towards consensus-like, strong binding sites; nor did they derive bipartite motifs, disabling the accurate modelling of binding behavior of dimeric TFs.
This thesis presents a novel …
Scalable Feature Selection And Extraction With Applications In Kinase Polypharmacology, Derek Jones
Theses and Dissertations--Computer Science
In order to reduce the time associated with and the costs of drug discovery, machine learning is being used to automate much of the work in this process. However, the size and complex nature of molecular data make the application of machine learning especially challenging. Much work must go into the process of engineering features that are then used to train machine learning models, costing considerable amounts of time and requiring the knowledge of domain experts to be most effective. The purpose of this work is to demonstrate data driven approaches to perform the feature selection and extraction steps in …
Machine Learning Techniques Implementation In Power Optimization, Data Processing, And Bio-Medical Applications, Khalid Khairullah Mezied Al-Jabery
Doctoral Dissertations
"The rapid progress and development in machine-learning algorithms becomes a key factor in determining the future of humanity. These algorithms and techniques were utilized to solve a wide spectrum of problems extended from data mining and knowledge discovery to unsupervised learning and optimization. This dissertation consists of two study areas. The first area investigates the use of reinforcement learning and adaptive critic design algorithms in the field of power grid control. The second area in this dissertation, consisting of three papers, focuses on developing and applying clustering algorithms on biomedical data. The first paper presents a novel modelling approach for …
Knowledge Driven Approaches And Machine Learning Improve The Identification Of Clinically Relevant Somatic Mutations In Cancer Genomics, Benjamin John Ainscough
Arts & Sciences Electronic Theses and Dissertations
For cancer genomics to fully expand its utility from research discovery to clinical adoption, somatic variant detection pipelines must be optimized and standardized to ensure identification of clinically relevant mutations and to reduce laborious and error-prone post-processing steps. To address the need for improved catalogues of clinically and biologically important somatic mutations, we developed DoCM, a Database of Curated Mutations in Cancer (http://docm.info), as described in Chapter 2. DoCM is an open source, openly licensed resource to enable the cancer research community to aggregate, store and track biologically and clinically important cancer variants. DoCM is currently comprised of 1,364 variants …
Predicting Mental Conditions Based On "History Of Present Illness" In Psychiatric Notes With Deep Neural Networks, Tung Tran, Ramakanth Kavuluru
Computer Science Faculty Publications
Background: Applications of natural language processing to mental health notes are not common given the sensitive nature of the associated narratives. The CEGS N-GRID 2016 Shared Task in Clinical Natural Language Processing (NLP) changed this scenario by providing the first set of neuropsychiatric notes to participants. This study summarizes our efforts and results in proposing a novel data use case for this dataset as part of the third track in this shared task.
Objective: We explore the feasibility and effectiveness of predicting a set of common mental conditions a patient has based on the short textual description of the patient’s history …
Machine Learning Based Protein Sequence To (Un)Structure Mapping And Interaction Prediction, Sumaiya Iqbal
University of New Orleans Theses and Dissertations
Proteins are the fundamental macromolecules within a cell that carry out most of the biological functions. The computational study of protein structure and its functions, using machine learning and data analytics, is elemental in advancing life-science research due to the fast-growing biological data and the extensive complexities involved in their analyses towards discovering meaningful insights. Mapping of a protein’s primary sequence is not limited to its structure; we extend that to its disordered component, known as Intrinsically Disordered Proteins or Regions in proteins (IDPs/IDRs), and hence the involved dynamics, which help us explain complex interaction within a cell that …
Signet: A Neural Network Architecture For Predicting Protein-Protein Interactions, Muhammad S. Ahmed
Electronic Thesis and Dissertation Repository
The study of protein-protein interactions (PPI) is critically important within the field of Molecular Biology, as proteins facilitate key organismal functions including the maintenance of both cellular structure and function. Current experimental methods for elucidating PPIs are greatly hindered by large operating costs, lengthy wait times, and low accuracy. The recent development of computational PPI prediction techniques has worked to address many of these issues. Despite this, many of these methods utilize over-engineered features and naive learning algorithms. With the recent advances in Machine Learning and Artificial Intelligence, we attempt to view this problem through a novel, deep …
What Are People Tweeting About Zika? An Exploratory Study Concerning Its Symptoms, Treatment, Transmission, And Prevention, Michele Miller, Tanvi Banerjee, Roopteja Muppalla, William L. Romine, Amit Sheth
Kno.e.sis Publications
Background: In order to harness what people are tweeting about Zika, there needs to be a computational framework that leverages machine learning techniques to recognize relevant Zika tweets and, further, categorize these into disease-specific categories to address specific societal concerns related to the prevention, transmission, symptoms, and treatment of Zika virus.
Objective: The purpose of this study was to determine the relevancy of the tweets and what people were tweeting about the 4 disease characteristics of Zika: symptoms, transmission, prevention, and treatment.
Methods: A combination of natural language processing and machine learning techniques was used to determine what people were …
Stage-Specific Predictive Models For Cancer Survivability, Elham Sagheb Hossein Pour
Theses and Dissertations
Survivability of cancer strongly depends on the stage of cancer. In most previous works, machine learning survivability prediction models for a particular cancer were trained and evaluated together on all stages of the cancer. In this work, we trained and evaluated survivability prediction models for five major cancers, together on all stages and separately for every stage. We named these models joint and stage-specific models, respectively. The obtained results for the cancers which we investigated reveal that the best model to predict the survivability of the cancer for one specific stage is the model which is specifically built for that …
Graph-Based Regularization In Machine Learning: Discovering Driver Modules In Biological Networks, Xi Gao
Theses and Dissertations
Curiosity of human nature drives us to explore the origins of what makes each of us different. From ancient legends and mythology, through Mendel's laws and the Punnett square, to modern genetic research, we carry on this old but eternal question. Thanks to the technological revolution, today's scientists try to answer this question using easily measurable gene expression and other profiling data. However, the exploration can easily get lost in data of growing volume, dimension, noise and complexity. This dissertation is aimed at developing new machine learning methods that take data from different classes as input, augment them with knowledge of feature relationships, …
Data Analytics For Power Utility Storm Planning, Lan Lin, Aldo Dagnino, Derek Doran, Swapna S. Gokhale
Kno.e.sis Publications
As the world population grows, recent climatic changes seem to bring powerful storms to populated areas. The impact of these storms on utility services is devastating. Hurricane Sandy is a recent example of the enormous damage that storms can inflict on infrastructure, society, and the economy. Quick response to these emergencies represents a big challenge to electric power utilities. Traditionally, utilities develop preparedness plans for storm emergency situations based on the experience of utility experts and with limited use of historical data. With the advent of the Smart Grid, utilities are incorporating automation and sensing technologies in their grids and …