Open Access. Powered by Scholars. Published by Universities.®

Computational Biology Commons

Articles 1 - 15 of 15

Full-Text Articles in Computational Biology

An Approach To Developing Benchmark Datasets For Protein Secondary Structure Segmentation From Cryo-EM Density Maps, Thu Nguyen, Yongcheng Mu, Jiangwen Sun, Jing He Jan 2023

Computer Science Faculty Publications

A growing number of deep learning approaches have been proposed to segment secondary structures from cryo-electron density maps in the medium resolution range (5–10 Å). Although the deep learning approaches show great potential, only a few small experimental data sets have been used to test them. There is limited understanding of the factors in the data that affect segmentation performance. We propose an approach to generate data sets with desired specifications for three potential factors: protein sequence identity, structural content, and data quality. The approach was implemented and has generated a test set and various training sets to study …


DFHiC: A Dilated Full Convolution Model To Enhance The Resolution Of Hi-C Data, Bin Wang, Kun Liu, Yaohang Li, Jianxin Wang Jan 2023

Computer Science Faculty Publications

Motivation: Hi-C is the most widely used chromosome conformation capture (3C) experiment. It measures the frequency of all paired interactions in the entire genome, which makes it a powerful tool for studying the 3D structure of the genome. The fineness of the constructed genome structure depends on the resolution of the Hi-C data. However, because high-resolution Hi-C data require deep sequencing and thus high experimental cost, most available Hi-C data are low-resolution. Hence, it is essential to enhance the quality of Hi-C data by developing effective computational methods.

Results: In this work, we propose …


Intergenic Transcription In In Vivo Developed Bovine Oocytes And Pre-Implantation Embryos, Saurav Ranjitkar, Mohammad Shiri, Jiangwen Sun, Xiuchun Tian Jan 2023

Computer Science Faculty Publications

Background

Intergenic transcription, either failure to terminate at the transcription end site (TES), or transcription initiation at other intergenic regions, is present in cultured cells and enhanced in the presence of stressors such as viral infection. Transcription termination failure has not been characterized in natural biological samples such as pre-implantation embryos which express more than 10,000 genes and undergo drastic changes in DNA methylation.

Results

Using Automatic Readthrough Transcription Detection (ARTDeco) and data from in vivo developed bovine oocytes and embryos, we found abundant intergenic transcripts that we termed read-outs (transcribed from 5 to 15 kb after TES) and …


CellBRF: A Feature Selection Method For Single-Cell Clustering Using Cell Balance And Random Forest, Yunpei Xu, Hong-Dong Li, Cui-Xiang Lin, Ruiqing Zheng, Yaohang Li, Jinhui Xu, Jianxin Wang Jan 2023

Computer Science Faculty Publications

Motivation

Single-cell RNA sequencing (scRNA-seq) offers a powerful tool to dissect the complexity of biological tissues through cell sub-population identification in combination with clustering approaches. Feature selection is a critical step for improving the accuracy and interpretability of single-cell clustering. Existing feature selection methods underutilize the discriminatory potential of genes across distinct cell types. We hypothesize that incorporating such information could further boost the performance of single-cell clustering.

Results

We develop CellBRF, a feature selection method that considers genes’ relevance to cell types for single-cell clustering. The key idea is to identify genes that are most important for discriminating …


Completing Single-Cell DNA Methylome Profiles Via Transfer Learning Together With KL-Divergence, Sanjeeva Dodlapati, Zongliang Jiang, Jiangwen Sun Jan 2022

Computer Science Faculty Publications

The high level of sparsity in methylome profiles obtained by whole-genome bisulfite sequencing from small amounts of biological material limits their value in the study of systems in which large samples are difficult to assemble, such as mammalian preimplantation embryonic development. Recently developed computational methods that address the sparsity by imputing missing values reach their limits when the required minimum data coverage or profiles of the same tissue in other modalities are not available. In this study, we explored the use of transfer learning together with Kullback-Leibler (KL) divergence to train predictive models for completing methylome profiles with …


fMRI Feature Extraction Model For ADHD Classification Using Convolutional Neural Network, Senuri De Silva, Sanuwani Udara Dayarathna, Gangani Ariyarathne, Dulani Meedeniya, Sampath Jayarathna Jan 2021

Computer Science Faculty Publications

Biomedical intelligence provides a predictive mechanism for the automatic diagnosis of diseases and disorders. With the advancements of computational biology, neuroimaging techniques have been used extensively in clinical data analysis. Attention deficit hyperactivity disorder (ADHD) is a psychiatric disorder, with the symptomology of inattention, impulsivity, and hyperactivity, in which early diagnosis is crucial to prevent unwelcome outcomes. This study addresses ADHD identification using functional magnetic resonance imaging (fMRI) data for the resting state brain by evaluating multiple feature extraction methods. The features of seed-based correlation (SBC), fractional amplitude of low-frequency fluctuation (fALFF), and regional homogeneity (ReHo) are comparatively applied to …


Analysis Of Subtelomeric REXTAL Assemblies Using QUAST, Tunazzina Islam, Desh Ranjan, Mohammad Zubair, Eleanor Young, Ming Xiao, Harold Riethman Jan 2021

Computer Science Faculty Publications

Genomic regions of high segmental duplication content and/or structural variation have led to gaps and misassemblies in the human reference sequence, and are refractory to assembly from whole-genome short-read datasets. Human subtelomere regions are highly enriched in both segmental duplication content and structural variations, and as a consequence are both impossible to assemble accurately and highly variable from individual to individual. Recently, we developed a pipeline for improved region-specific assembly called Regional Extension of Assemblies Using Linked-Reads (REXTAL). In this study, we evaluate REXTAL and genome-wide assembly (Supernova) approaches on 10X Genomics linked-reads data sets partitioned and barcoded using the …


Overlap Matrix Completion For Predicting Drug-Associated Indications, Mengyun Yang, Huimin Luo, Yaohang Li, Fang-Xiang Wu, Jianxin Wang Dec 2019

Computer Science Faculty Publications

Identification of potential drug-associated indications is critical for either approved or novel drugs in drug repositioning. Current computational methods based on drug similarity and disease similarity have been developed to predict drug-disease associations. When more reliable drug- or disease-related information becomes available and is integrated, the prediction precision can be continuously improved. However, it is a challenging problem to effectively incorporate multiple types of prior information, representing different characteristics of drugs and diseases, to identify promising drug-disease associations. In this study, we propose an overlap matrix completion (OMC) for bilayer networks (OMC2) and tri-layer networks (OMC3) to predict potential drug-associated …


Prediction Of lncRNA-Disease Associations Based On Inductive Matrix Completion, Chengqian Lu, Mengyun Yang, Feng Luo, Fang-Xiang Wu, Min Li, Yi Pan, Yaohang Li, Jianxin Wang Apr 2018

Computer Science Faculty Publications

Motivation: Accumulating evidence indicates that long non-coding RNAs (lncRNAs) play pivotal roles in various biological processes. Mutations and dysregulations of lncRNAs are implicated in miscellaneous human diseases. Predicting lncRNA–disease associations is beneficial to disease diagnosis as well as treatment. Although many computational methods have been developed, precisely identifying lncRNA–disease associations, especially for novel lncRNAs, remains challenging.

Results: In this study, we propose a method (named SIMCLDA) for predicting potential lncRNA–disease associations based on inductive matrix completion. We compute the Gaussian interaction profile kernel of lncRNAs from known lncRNA–disease interactions and the functional similarity of diseases based on disease–gene and gene–gene ontology …
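
The Gaussian interaction profile kernel named in this abstract can be illustrated with a minimal sketch. It assumes the commonly used formulation, in which each lncRNA is represented by its binary row of known disease associations and the kernel bandwidth is normalized by the mean squared norm of those rows. The function name, the `gamma_prime` parameter, and the toy matrix are hypothetical and are not taken from SIMCLDA itself.

```python
import math


def gaussian_interaction_profile_kernel(association, gamma_prime=1.0):
    """Return a kernel matrix K with K[i][j] = exp(-gamma * ||row_i - row_j||^2).

    `association` is a list of equal-length rows; row i is the interaction
    profile of lncRNA i (1 = known association with that disease, 0 = none).
    Assumption: gamma = gamma_prime / mean squared norm of the profiles.
    """
    n = len(association)
    squared_norms = [sum(v * v for v in row) for row in association]
    mean_squared_norm = sum(squared_norms) / n
    if mean_squared_norm == 0:
        raise ValueError("at least one known association is required")
    gamma = gamma_prime / mean_squared_norm

    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            squared_distance = sum(
                (a - b) ** 2 for a, b in zip(association[i], association[j])
            )
            # The kernel is symmetric, so fill both halves at once.
            kernel[i][j] = kernel[j][i] = math.exp(-gamma * squared_distance)
    return kernel


# Hypothetical toy data: 3 lncRNAs (rows) x 3 diseases (columns).
toy_associations = [
    [1, 0, 1],
    [1, 0, 1],
    [0, 1, 0],
]
toy_kernel = gaussian_interaction_profile_kernel(toy_associations)
```

In this toy case the first two lncRNAs share an identical profile, so their kernel value is 1, while the third shares no disease with them and receives a smaller similarity.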


Comparing An Atomic Model Or Structure To A Corresponding Cryo-Electron Microscopy Image At The Central Axis Of A Helix, Stephanie Zeil, Julio Kovacs, Willy Wriggers, Jing He Jan 2017

Computer Science Faculty Publications

Three-dimensional density maps of biological specimens from cryo-electron microscopy (cryo-EM) can be interpreted in the form of atomic models that are modeled into the density, or they can be compared to known atomic structures. When the central axis of a helix is detectable in a cryo-EM density map, it is possible to quantify the agreement between this central axis and a central axis calculated from the atomic model or structure. We propose a novel arc-length association method to compare the two axes reliably. This method was applied to 79 helices in simulated density maps and six case studies using cryo-EM …


Deep Models For Brain EM Image Segmentation: Novel Insights And Improved Performance, Ahmed Fakhry, Hanchuan Peng, Shuiwang Ji Jan 2016

Computer Science Faculty Publications

Motivation: Accurate segmentation of brain electron microscopy (EM) images is a critical step in dense circuit reconstruction. Although deep neural networks (DNNs) have been widely used in a number of computer vision applications, most of the models that proved effective on image classification tasks cannot be applied directly to EM image segmentation because the two tasks have different objectives. As a result, it is desirable to develop an optimized architecture that uses the full power of DNNs and is tailored specifically for EM image segmentation.

Results: In this work, we proposed a novel design of DNNs for …


ISQuest: Finding Insertion Sequences In Prokaryotic Sequence Fragment Data, Abhishek Biswas, David T. Gauthier, Desh Ranjan, Mohammad Zubair Jun 2015

Computer Science Faculty Publications

Motivation: Insertion sequences (ISs) are transposable elements present in most bacterial and archaeal genomes that play an important role in genomic evolution. The increasing availability of sequenced prokaryotic genomes offers the opportunity to study ISs comprehensively, but development of efficient and accurate tools is required for discovery and annotation. Additionally, prokaryotic genomes are frequently deposited at an incomplete or draft stage because of the substantial cost and effort required to finish genome assembly projects. Development of methods to identify ISs directly from raw sequence reads or draft genomes is therefore desirable. Software tools such as Optimized Annotation System for Insertion Sequences …


De Novo Protein Structure Modeling And Energy Function Design, Lin Chen Jan 2015

Computer Science Theses & Dissertations

The two major challenges in protein structure prediction problems are (1) the lack of an accurate energy function and (2) the lack of an efficient search algorithm. A protein energy function accurately describing the interaction between residues is able to supervise the optimization of a protein conformation, as well as select native or native-like structures from numerous possible conformations. An efficient search algorithm must be able to reduce a conformational space to a reasonable size without missing the native conformation. My PhD research studies focused on these two directions.

A protein energy function—the distance and orientation dependent energy function of …


Computational Analysis Of Gene Expression And Connectivity Patterns In The Convoluted Structures Of Mouse Cerebellum, Tao Zeng Jun 2014

Computer Science Theses & Dissertations

One significant difference between evolved mammalian brains and those of other species is that mammalian brains exhibit increasingly convoluted structures in the cerebral cortex. Ridge- and groove-shaped structures named gyri and sulci expand the surface area of the cerebral cortex, making more functions possible. Prior studies using neuroimaging techniques such as dMRI and DTI have revealed that neural fibers are more heavily connected to gyri than to sulci. Such macro-scale experiments indicate that gyri are involved in large-scale information processing while sulci process information locally. However, molecular- and cellular-level evidence, namely, gene expression pattern and its resulting neuronal connectivity …


Computational Network Analysis Of The Anatomical And Genetic Organizations In The Mouse Brain, Shuiwang Ji Jan 2011

Computer Science Faculty Publications

Motivation: The mammalian central nervous system (CNS) generates high-level behavior and cognitive functions. Elucidating the anatomical and genetic organizations in the CNS is a key step toward understanding the functional brain circuitry. The CNS contains an enormous number of cell types, each with unique gene expression patterns. Therefore, it is of central importance to capture the spatial expression patterns in the brain. A genome-wide atlas of spatial expression patterns in the mouse brain has now been made available, and the data are in the form of aligned 3D data arrays. The sheer volume and complexity of these data pose significant challenges …