Open Access. Powered by Scholars. Published by Universities.®

Molecular Biology Commons

Articles 1 - 2 of 2

Full-Text Articles in Molecular Biology

DeepEP: A Deep Learning Framework For Identifying Essential Proteins, Min Zeng, Min Li, Fang-Xiang Wu, Yaohang Li, Yi Pan, Dec 2019

Computer Science Faculty Publications

Background: Essential proteins are crucial for cellular life, so identifying them is an important and challenging problem for researchers. Recently, many computational approaches have been proposed to address this problem. However, traditional centrality methods cannot fully represent the topological features of biological networks. In addition, identifying essential proteins is an imbalanced learning problem, but few current shallow machine learning-based methods are designed to handle the imbalanced characteristics. Results: We develop DeepEP, a deep learning framework that uses the node2vec technique, multi-scale convolutional neural networks, and a sampling technique to identify essential proteins. …


ISQuest: Finding Insertion Sequences In Prokaryotic Sequence Fragment Data, Abhishek Biswas, David T. Gauthier, Desh Ranjan, Mohammad Zubair, Jun 2015

Computer Science Faculty Publications

Motivation: Insertion sequences (ISs) are transposable elements present in most bacterial and archaeal genomes that play an important role in genomic evolution. The increasing availability of sequenced prokaryotic genomes offers the opportunity to study ISs comprehensively, but efficient and accurate tools are required for discovery and annotation. Additionally, prokaryotic genomes are frequently deposited at an incomplete or draft stage because of the substantial cost and effort required to finish genome assembly projects. Methods that identify ISs directly from raw sequence reads or draft genomes are therefore desirable. Software tools such as Optimized Annotation System for Insertion Sequences …