Open Access. Powered by Scholars. Published by Universities.®

Digital Commons Network

Full-Text Articles in Entire DC Network

Image Segmentation By Convolutional Neural Networks In Coral Resilience Research, Jennifer Benbow Jan 2024

Master's Projects

As ocean temperatures rise, coral bleaching is becoming more frequent and severe. Selective breeding experiments show promise for enhancing coral resilience, but scaling these projects is hindered by the labor-intensive nature of taking numerous time series measurements as corals grow. Automating this process with computer vision is one solution to this bottleneck, and to our knowledge, no such tool exists at present. To fill this gap, we have trained a set of machine learning models, based on the Mask R-CNN framework, for segmenting juvenile corals in lab-based coral resilience research. This work shows that retraining the Mask R-CNN architecture through …


Model-Based Deep Autoencoders For Clustering Single-Cell RNA Sequencing Data With Side Information, Xiang Lin Dec 2023

Dissertations

Clustering analysis has been conducted extensively in single-cell RNA sequencing (scRNA-seq) studies. scRNA-seq can profile the activity of tens of thousands of genes within a single cell, and thousands or tens of thousands of cells can be captured simultaneously in a typical scRNA-seq experiment. Biologists would like to cluster these cells to explore and elucidate cell types or subtypes. Numerous methods have been designed for clustering scRNA-seq data. Yet single-cell technologies have developed so fast in the past few years that existing methods have not kept up with these rapid changes and fail to fulfil their potential. For instance, besides profiling transcription …


DeepHTLV: A Deep Learning Framework For Detecting Human T-Lymphotrophic Virus 1 Integration Sites, Johnathan Jia May 2023

Dissertations & Theses (Open Access)

In the 1980s, researchers discovered the first human oncogenic retrovirus, human T-lymphotrophic virus type 1 (HTLV-1). Since then, HTLV-1 has been identified as the causative agent behind several diseases such as adult T-cell leukemia/lymphoma (ATL) and HTLV-1-associated myelopathy/tropical spastic paraparesis (HAM/TSP). As part of its normal replication cycle, the viral genome is converted into DNA and integrated into the host genome. With several hundreds to thousands of unique viral integration sites (VISs) distributed with indeterminate preference throughout the genome, detection of HTLV-1 VISs is a challenging task. Experimental studies typically use molecular biology …


Wearable Sensor Gait Analysis For Fall Detection Using Deep Learning Methods, Haben Girmay Yhdego May 2023

Electrical & Computer Engineering Theses & Dissertations

World Health Organization (WHO) data show that around 684,000 people die from falls each year, making falls the second leading cause of unintentional injury death after road traffic accidents [1]. Early detection of falls, followed by pneumatic protection, is one of the most effective means of ensuring the safety of the elderly. In light of the recent widespread adoption of wearable sensors, it has become increasingly important to develop fall detection models that can effectively process large, sequential sensor signal data. Several researchers have recently developed fall detection algorithms based on wearable sensor data. However, real-time fall detection remains challenging because of the wide …


Multimodal Neuron Classification Based On Morphology And Electrophysiology, Aqib Ahmad Jan 2023

Graduate Theses, Dissertations, and Problem Reports

Categorizing neurons into different types to understand neural circuits and ultimately brain function is a major challenge in neuroscience. While electrical properties are critical in defining a neuron, its morphology is equally important. Advancements in single-cell analysis methods have allowed neuroscientists to simultaneously capture multiple data modalities from a neuron. We propose a method to classify neurons using both morphological structure and electrophysiology. Current approaches are based on a limited analysis of morphological features. We propose to use a new graph neural network to learn representations that more comprehensively account for the complexity of the shape of neuronal structures. In …


Deep Active Learning For Classifying Cancer Pathology Reports, Kevin De Angeli, Shang Gao, Mohammed Alawad, Hong‑Jun Yoon, Noah Schaeferkoetter, Xiao‑Cheng Wu, Eric B. Durbin, Jennifer Doherty, Antoinette Stroup, Linda Coyle, Lynne Penberthy, Georgia Tourassi Mar 2021

Kentucky Cancer Registry Faculty Publications

Background: Automated text classification has many important applications in the clinical setting; however, obtaining labelled data for training machine learning and deep learning models is often difficult and expensive. Active learning techniques may mitigate this challenge by reducing the amount of labelled data required to effectively train a model. In this study, we analyze the effectiveness of 11 active learning algorithms on classifying subsite and histology from cancer pathology reports using a Convolutional Neural Network as the text classification model.

Results: We compare the performance of each active learning strategy using two differently sized datasets and two different classification tasks. …
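
The abstract does not name the 11 active learning algorithms it evaluates. As an illustration of the general idea only, the following is a minimal sketch of least-confidence sampling, one widely used strategy that may or may not be among those studied. The function name, the example probabilities, and the batch size are all hypothetical.

```python
def least_confident(probabilities, batch_size):
    """Pick the unlabelled documents the classifier is least sure about.

    probabilities: one list of class probabilities per unlabelled document
    (hypothetical model output, not data from the study).
    Returns the indices of the batch_size documents whose highest class
    probability is lowest.
    """
    confidence = [max(p) for p in probabilities]
    ranked = sorted(range(len(probabilities)), key=lambda i: confidence[i])
    return ranked[:batch_size]


# Hypothetical predictions for four unlabelled pathology reports.
pool_probs = [
    [0.90, 0.05, 0.05],
    [0.40, 0.35, 0.25],
    [0.55, 0.30, 0.15],
    [0.34, 0.33, 0.33],
]
to_label = least_confident(pool_probs, batch_size=2)
```

The selected reports would be sent to an annotator, added to the training set, and the model retrained, which is how active learning aims to reduce the amount of labelled data required.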


Deep Learning For Multi-Tissue Cancer Classification Of Gene Expressions, Tarek Khorshed Jan 2021

Theses and Dissertations

We aim to help save the lives of cancer patients through early detection and diagnosis, since one of the major challenges in cancer treatment is that patients are diagnosed at very late stages, when appropriate medical interventions become less effective and full curative treatment is no longer achievable. Cancer classification using gene expressions is extremely challenging given the complexity and high dimensionality of the data. Current classification methods typically rely on samples collected from a single tissue type and require gene feature selection as a prerequisite step to avoid processing the full set of genes. These methods fall short in taking advantage …


A Multi-Resolution Graph Convolution Network For Contiguous Epitope Prediction, Lisa Oh Jan 2021

Dartmouth College Master’s Theses

Computational methods for predicting binding interfaces between antigens and antibodies (epitopes and paratopes) are faster and cheaper than traditional experimental structure determination methods. A sufficiently reliable computational predictor that could scale to large sets of available antibody sequence data could thus inform and expedite many biomedical pursuits, such as better understanding immune responses to vaccination and natural infection and developing better drugs and vaccines. However, current state-of-the-art predictors produce discontiguous predictions, e.g., predicting the epitope in many different spots on an antigen, even though in reality an epitope typically comprises a single localized region. We seek to produce contiguous predicted epitopes, …


New Methods For Deep Learning Based Real-Valued Inter-Residue Distance Prediction, Jacob Barger Nov 2020

Theses

Background: Much of the recent success in protein structure prediction has been a result of accurate protein contact prediction--a binary classification problem. Dozens of methods, built from various types of machine learning and deep learning algorithms, have been published over the last two decades for predicting contacts. Recently, many groups, including Google DeepMind, have demonstrated that reformulating the problem as a multi-class classification problem is a more promising direction to pursue. As an alternative approach, we recently proposed real-valued distance predictions, formulating the problem as a regression problem. The nuances of protein 3D structures make this formulation appropriate, allowing predictions …
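
The three formulations named in this abstract (binary contact classification, multi-class classification, and real-valued regression) differ mainly in how the same inter-residue distance is encoded as a target. The following is a minimal sketch of that difference; the 8 Å cutoff and the bin edges are assumptions chosen for illustration and are not stated in the abstract, and the function names are hypothetical.

```python
from bisect import bisect_right

CONTACT_CUTOFF = 8.0                 # assumed cutoff in angstroms
BIN_EDGES = [4.0, 8.0, 12.0, 16.0]   # assumed bin edges for the multi-class view


def to_contact(distance):
    """Binary classification target: 1 if the residue pair is in contact."""
    return 1 if distance < CONTACT_CUTOFF else 0


def to_distance_bin(distance):
    """Multi-class target: index of the distance bin the pair falls into."""
    return bisect_right(BIN_EDGES, distance)


def mean_absolute_error(predicted, actual):
    """Regression view: compare real-valued distances directly."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical true distances (in angstroms) for four residue pairs.
true_distances = [3.8, 7.5, 9.1, 20.0]
contacts = [to_contact(d) for d in true_distances]
bins = [to_distance_bin(d) for d in true_distances]
```

In the binary and binned encodings, pairs at 9.1 Å and 20.0 Å are either indistinguishable or separated only by bin index, whereas the regression target keeps the full distance.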


DeepFrag-K: A Fragment-Based Deep Learning Approach For Protein Fold Recognition, Wessam Elhefnawy, Min Li, Jianxin Wang, Yaohang Li Nov 2020

Computer Science Faculty Publications

Background: One of the most essential problems in structural bioinformatics is protein fold recognition. In this paper, we design a novel deep learning architecture, so-called DeepFrag-k, which identifies fold discriminative features at fragment level to improve the accuracy of protein fold recognition. DeepFrag-k is composed of two stages: the first stage employs a multi-modal Deep Belief Network (DBN) to predict the potential structural fragments given a sequence, represented as a fragment vector, and then the second stage uses a deep convolutional neural network (CNN) to classify the fragment vector into the corresponding fold.

Results: Our results show that DeepFrag-k yields …


A Review Of Integrative Imputation For Multi-Omics Datasets, Meng Song, Jonathan Greenbaum, Joseph Luttrell, Weihua Zhou, Chong Wu, Hui Shen, Ping Gong, Chaoyang Zhang, Hong Wen Deng Oct 2020

Michigan Tech Publications

Multi-omics studies, which explore the interactions between multiple types of biological factors, have significant advantages over single-omics analysis for their ability to provide a more holistic view of biological processes, uncover the causal and functional mechanisms for complex diseases, and facilitate new discoveries in precision medicine. However, omics datasets often contain missing values, and in multi-omics study designs it is common for individuals to be represented for some omics layers but not all. Since most statistical analyses cannot be applied directly to the incomplete datasets, imputation is typically performed to infer the missing values. Integrative imputation techniques which make use …


Computational Methods For Predicting Protein-Protein Interactions And Binding Sites, Yiwei Li Aug 2020

Electronic Thesis and Dissertation Repository

Proteins are essential to organisms and participate in virtually every process within cells. Quite often, they keep the cells functioning by interacting with other proteins. This process is called protein-protein interaction (PPI). The amino acid residues that bind during protein-protein interactions are called PPI binding sites. Identifying PPIs and PPI binding sites are fundamental problems in systems biology.

Experimental methods for solving these two problems are slow and expensive. Therefore, great efforts are being made towards increasing the performance of computational methods.

We present DELPHI, a deep learning based program for PPI site prediction, and SPRINT, an algorithmic …


Table-To-Text: Generating Descriptive Text For Scientific Tables From Randomized Controlled Trials, Qiang Wei May 2020

Dissertations & Theses (Open Access)

Unprecedented amounts of data have been generated in the biomedical domain, and the bottleneck for biomedical research has shifted from data generation to data management, interpretation, and communication. Therefore, it is highly desirable to develop systems to assist in text generation from biomedical data, which will greatly improve the dissemination of scientific findings. However, very few studies have investigated issues of data-to-text generation in the biomedical domain. Here I present a systematic study for generating descriptive text from tables in randomized clinical trials (RCT) articles, which includes: (1) an information model for representing RCT tables; (2) annotated corpora containing pairs …


MHCherryPan, A Novel Model To Predict The Binding Affinity Of Pan-Specific Class I HLA-Peptide, Xuezhi Xie Apr 2020

Electronic Thesis and Dissertation Repository

The human leukocyte antigen (HLA) system or complex plays an essential role in regulating the immune system in humans. Accurate prediction of peptide binding to HLA can help identify neoantigens, which could make a big difference in immune drug development. HLA is one of the most polymorphic genetic systems in humans, and thousands of HLA allelic versions exist. Due to the high polymorphism of the HLA complex, accurately predicting binding affinity remains difficult. In this thesis, we present a new algorithm that combines a convolutional neural network and long short-term memory to solve this …


Toward Automated Region Detection & Parcellation Of Rat Brain Tissue Images, Alexandro Arnal Jan 2020

Open Access Theses & Dissertations

People who analyze images of biological tissue often rely on segmentation of structures as a preliminary step. In particular, laboratories studying the rat brain manually delineate brain regions to position scientific findings on a brain atlas to propose hypotheses about the rat brain, and ultimately, the human brain. Our work intersects with the preliminary step of delineating regions in images of brain tissue via computational methods.

We investigate pixel-wise classification or segmentation of brain regions using ten histological images of brain tissue sections stained for Nissl substance, and two deep learning models: U-Net and Tile2Vec. Our goal is to assess …


VaxInsight: An Artificial Intelligence System To Access Large-Scale Public Perceptions Of Vaccination From Social Media, Jingcheng Du Dec 2019

Dissertations & Theses (Open Access)

Vaccination is considered one of the greatest public health achievements of the 20th century. A high vaccination rate is required to reduce the prevalence and incidence of vaccine-preventable diseases. However, in the last two decades, there has been a significant and increasing number of people who refuse or delay getting vaccinated and who prohibit their children from receiving vaccinations. Importantly, under-vaccination is associated with infectious disease outbreaks. A good understanding of public perceptions regarding vaccinations is important if we are to develop effective vaccination promotion strategies. Traditional methods of research, such as surveys, suffer limitations that impede our understanding of …


Utilizing Temporal Information In The EHR For Developing A Novel Continuous Prediction Model, Kang Lin Hsieh Aug 2019

Dissertations & Theses (Open Access)

Type 2 diabetes mellitus (T2DM) is a chronic condition that is prevalent nationwide and incurs both direct and indirect healthcare costs. According to previous clinical research, however, T2DM is preventable. Many prediction models have been based on the risk factors identified by clinical trials. One of the major tasks of T2DM prediction models is to estimate risk so that patients can be selected for further testing by HbA1c or fasting plasma glucose, which determines whether the patient has T2DM, because nationwide screening is not cost-effective.

Those models had substantial limitations on data quality, such as missing values. In this dissertation, I …


Model-Based Deep Autoencoders For Characterizing Discrete Data With Application To Genomic Data Analysis, Tian Tian May 2019

Dissertations

Deep learning techniques have achieved tremendous successes in a wide range of real applications in recent years. For dimension reduction, deep neural networks (DNNs) provide a natural choice to parameterize a non-linear transforming function that maps the original high dimensional data to a lower dimensional latent space. An autoencoder is a kind of DNN used to learn efficient feature representations in an unsupervised manner. Deep autoencoders have been widely explored and applied to the analysis of continuous data, while they are understudied for characterizing discrete data. This dissertation focuses on developing model-based deep autoencoders for modeling discrete data. A motivating example of …


Highly Accurate Fragment Library For Protein Fold Recognition, Wessam Elhefnawy Apr 2019

Computer Science Theses & Dissertations

Proteins play a crucial role in living organisms as they perform many vital tasks in every living cell. Knowledge of protein folding has a deep impact on understanding the heterogeneity and molecular functions of proteins. Such information leads to crucial advances in drug design and disease understanding. Fold recognition is a key step in the protein structure discovery process, especially when traditional computational methods fail to yield convincing structural homologies. In this work, we present a new protein fold recognition approach using machine learning and data mining methodologies.

First, we identify a protein structural fragment library (Frag-K) composed of a …


Applying Machine Learning Algorithms For The Analysis Of Biological Sequences And Medical Records, Shaopeng Gu Jan 2019

Electronic Theses and Dissertations

Modern sequencing technology has revolutionized genomic research and triggered explosive growth in DNA, RNA, and protein sequence data. Inferring structure and function from biological sequences is a fundamentally important task in genomics and proteomics. With the development of statistical and machine learning methods, an integrated and user-friendly tool containing state-of-the-art data mining methods is needed. Here, we propose SeqFea-Learn, a comprehensive Python pipeline that integrates multiple steps (feature extraction, dimensionality reduction, feature selection, and construction of predictive models based on machine learning and deep learning approaches) to analyze sequences. We used enhancers, RNA N6-methyladenosine sites and …


Prediction Of 1p/19q Codeletion Status In Diffuse Glioma Patients Using Preoperative Multiparametric Magnetic Resonance Imaging, Donnie Kim Aug 2018

Dissertations & Theses (Open Access)

A complete codeletion of chromosome 1p/19q is strongly correlated with better overall survival of diffuse glioma patients, hence determining the codeletion status early in the course of a patient’s disease would be valuable in that patient’s care. The current practice requires a surgical biopsy in order to assess the codeletion status, which exposes patients to risks and is limited in its accuracy by sampling variations. To overcome such limitations, we utilized four conventional magnetic resonance imaging sequences to predict the 1p/19q status. We extracted three sets of image-derived features, namely texture-based, topology-based, and convolutional neural network (CNN)-based, and analyzed each …


Deep Learning For Segmentation Of 3D Cryo-EM Images, Devin Reid Haslam Jul 2018

Computer Science Theses & Dissertations

Cryo-electron microscopy (cryo-EM) is an emerging biophysical technique for structural determination of protein complexes. However, accurate detection of secondary structures is still challenging when cryo-EM density maps are at medium resolutions (5-10 Å). Most existing methods are image processing methods that do not fully utilize available images in the cryo-EM database. In this paper, we present a deep learning approach to segment secondary structure elements as helices and β-sheets from medium-resolution density maps. The proposed 3D convolutional neural network is shown to detect secondary structure locations with an F1 score between 0.79 and 0.88 for six simulated test cases. …
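
For readers unfamiliar with the metric reported here, the F1 score is the harmonic mean of precision and recall. The following is a minimal sketch of how it can be computed from binary labels. It assumes one label per voxel (1 for a secondary structure element, 0 for background), which the abstract does not specify, and the function name is hypothetical.

```python
def f1_score(predicted, actual):
    """F1 for binary labels (1 = structure, 0 = background)."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p == 1 and a == 1)  # true positives
    fp = sum(1 for p, a in pairs if p == 1 and a == 0)  # false positives
    fn = sum(1 for p, a in pairs if p == 0 and a == 1)  # false negatives
    if tp == 0:
        return 0.0  # no correct detections; avoids division by zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels for five voxels.
score = f1_score([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
```

Here two of the three predicted voxels are correct and two of the three true voxels are found, so precision and recall are both 2/3 and the F1 score is also 2/3.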


Deep Learning Architectures For Multi-Label Classification Of Intelligent Health Risk Prediction, Andrew Maxwell, Runzhi Li, Bei Yang, Heng Weng, Aihua Ou, Huixiao Hong, Zhaoxian Zhou, Ping Gong, Chaoyang Zhang Dec 2017

Faculty Publications

No abstract provided.


Protein Residue-Residue Contact Prediction Using Stacked Denoising Autoencoders, Joseph Bailey Luttrell IV Aug 2016

Honors Theses

Protein residue-residue contact prediction is one of many areas of bioinformatics research that aims to assist researchers in the discovery of structural features of proteins. Predicting the existence of such structural features can provide a starting point for studying the tertiary structures of proteins. This has the potential to be useful in applications such as drug design where tertiary structure predictions may play an important role in approximating the interactions between drugs and their targets without expending the monetary resources necessary for preliminary experimentation. Here, four different methods involving deep learning, support vector machines (SVMs), and direct coupling analysis were …


Machine Learning Methods For Brain Image Analysis, Ahmed Fakhry Jul 2016

Computer Science Theses & Dissertations

Understanding how the brain functions and quantifying compound interactions between complex synaptic networks inside the brain remain some of the most challenging problems in neuroscience. Lack or abundance of data, shortage of manpower, and heterogeneity of data from various species all add complexity to an already perplexing problem. Vast amounts of brain data need to be processed automatically, yet with accuracy close to manual human-level performance. These automated methods need to generalize well in order to accommodate data from different species. Also, novel approaches and techniques are becoming …


A Computational Framework For Learning From Complex Data: Formulations, Algorithms, And Applications, Wenlu Zhang Jul 2016

Computer Science Theses & Dissertations

Many real-world processes are dynamically changing over time. As a consequence, the observed complex data generated by these processes also evolve smoothly. For example, in computational biology, the expression data matrices are evolving, since gene expression controls are deployed sequentially during development in many biological processes. Investigations into the spatial and temporal gene expression dynamics are essential for understanding the regulatory biology governing development. In this dissertation, I mainly focus on two types of complex data: genome-wide spatial gene expression patterns in the model organism fruit fly and Allen Brain Atlas mouse brain data. I provide a framework to explore …