Open Access. Powered by Scholars. Published by Universities.®

Life Sciences Commons

Articles 1 - 18 of 18

Full-Text Articles in Life Sciences

Bayesmotif: De Novo Protein Sorting Motif Discovery From Impure Datasets, Jianjun Hu, F. Zhang Jun 2015

Jianjun Hu

Background

Protein sorting is the process by which newly synthesized proteins are transported to their target locations within or outside the cell. This process is precisely regulated by protein sorting signals of various forms. A major category of sorting signals consists of amino acid subsequences usually located at the N-termini or C-termini of protein sequences. Genome-wide experimental identification of protein sorting signals is extremely time-consuming and costly. Effective computational algorithms for de novo discovery of protein sorting signals are needed to improve the understanding of protein sorting mechanisms.

Methods

We formulated the protein sorting motif discovery problem as a classification problem …


Hemo: A Sustainable Multi-Objective Evolutionary Optimization Framework, Jianjun Hu, K. Seo, Z. Fan, R. Rosenberg, E. Goodman Jun 2015

Jianjun Hu

No abstract provided.


Improving Protein Localization Prediction Using Amino Acid Group Based Physichemical Encoding, Jianjun Hu, F. Zhang Jun 2015

Jianjun Hu

No abstract provided.


Hemebind: A Novel Method For Heme Binding Residue Prediction By Combining Structural And Sequence Information, R. Liu, Jianjun Hu Jun 2015

Jianjun Hu

Background

Accurate prediction of binding residues involved in the interactions between proteins and small ligands is one of the major challenges in structural bioinformatics. Heme is an essential and commonly used ligand that plays critical roles in electron transfer, catalysis, signal transduction and gene expression. Although much effort has been devoted to the development of various generic algorithms for ligand binding site prediction over the last decade, no algorithm has been specifically designed to complement experimental techniques for identification of heme binding residues. Consequently, there is an urgent need to develop a computational method for recognizing these important residues.

Results

Here …


Proteomic Characterization Of Her-2/Neu-Overexpressing Breast Cancer Cells, Hexin Chen, G. Pimienta, Y. Gu, X. Sun, Jianjun Hu, M.-S. Kim, R. Chaerkady, M. Gucek, R. Cole, S. Sukumar, A. Pandey Jun 2015

Jianjun Hu

No abstract provided.


Dnabind: A Hybrid Algorithm For Structure-Based Prediction Of Dna-Binding Residues By Combining Machine Learning- And Template-Based Approaches, R. Liu, Jianjun Hu Jun 2015

Jianjun Hu

No abstract provided.


Computational Identification Of Post-Translational Modification-Based Nuclear Import Regulations By Characterizing Nuclear Localization Signal-Import Receptor Interaction, J.-R. Lin, Z. Liu, Jianjun Hu Jun 2015

Jianjun Hu

No abstract provided.


Subcellular Localization Of Marine Bacterial Alkaline Phosphatases, H. Luo, Ronald Benner, R. Long, Jianjun Hu Jun 2015

Jianjun Hu

Bacterial alkaline phosphatases (APases) are important enzymes in organophosphate utilization in the ocean. The subcellular localization of APases has significant ecological implications for marine biota but is largely unknown. The extensive metagenomic sequence databases from the Global Ocean Sampling Expedition provide an opportunity to address this question. A bioinformatics pipeline was developed to identify marine bacterial APases from the metagenomic databases, and a consensus classification algorithm was designed to predict their subcellular localizations. We identified 3,733 bacterial APase sequences (including PhoA, PhoD, and PhoX) and found that cytoplasmic (41%) and extracellular (30%) APases exceed their periplasmic (17%), outer membrane (12%), …


Integrative Disease Classification Based On Cross-Platform Microarray Data, C.-C. Liu, Jianjun Hu, M. Kalakrishnan, H. Huang, X. Zhou Jun 2015

Jianjun Hu

Background

Disease classification has been an important application of microarray technology. However, most microarray-based classifiers can only handle data generated within the same study, since microarray data generated by different laboratories or with different platforms cannot be compared directly due to systematic variations. This issue has severely limited the practical use of microarray-based disease classification.

Results

In this study, we tested the feasibility of disease classification by integrating the large number of heterogeneous microarray datasets from the public microarray repositories. Cross-platform data compatibility is created by deriving expression log-rank ratios within datasets. One may then compare vectors of log-rank …


Scored Protein-Protein Interaction To Predict Subcellular Localizations For Yeast Using Diffusion Kernel, A. Mondal, Jianjun Hu Jun 2015

Jianjun Hu

No abstract provided.


Minimalist Ensemble Algorithms For Genome-Wide Protein Localization Prediction, J.-R. Lin, A. Mondal, R. Liu, Jianjun Hu Jun 2015

Jianjun Hu

Background

Computational prediction of protein subcellular localization can greatly help to elucidate protein functions. Despite the existence of dozens of protein localization prediction algorithms, the prediction accuracy and coverage are still low. Several ensemble algorithms have been proposed to improve the prediction performance, which usually include as many as 10 or more individual localization algorithms. However, their performance is still limited by the running complexity and redundancy among individual prediction algorithms.

Results

This paper proposed a novel method for rational design of minimalist ensemble algorithms for practical genome-wide protein subcellular localization prediction. The algorithm is based on combining a feature …


Limitations And Potentials Of Current Motif Discovery Algorithms, Jianjun Hu, Bin Li, D. Kihara Jun 2015

Jianjun Hu

Computational methods for de novo identification of gene regulation elements, such as transcription factor binding sites, have proved to be useful for deciphering genetic regulatory networks. However, despite the availability of a large number of algorithms, their strengths and weaknesses are not sufficiently understood. Here, we designed a comprehensive set of performance measures and benchmarked five modern sequence-based motif discovery algorithms using large datasets generated from Escherichia coli RegulonDB. Factors that affect the prediction accuracy, scalability and reliability are characterized. It is revealed that the nucleotide and the binding site level accuracy are very low, while the motif level accuracy …


Seqnls: Nuclear Localization Signal Prediction Based On Frequent Pattern Mining And Linear Motif Scoring, J.-R. Lin, Jianjun Hu Jun 2015

Jianjun Hu

Nuclear localization signals (NLSs) are stretches of residues in proteins that mediate their import into the nucleus. NLSs are known to have diverse patterns, of which only a limited number are covered by currently known NLS motifs. Here we propose a sequential pattern mining algorithm, SeqNLS, to effectively identify potential NLS patterns without being constrained by the limitation of current knowledge of NLSs. The extracted frequent sequential patterns are used to predict NLS candidates, which are then filtered by a linear motif-scoring scheme based on predicted sequence disorder and by the relatively local conservation (IRLC) based masking. The experimental results on …
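
The idea of mining frequent patterns by support can be illustrated with a minimal sketch. This is not the SeqNLS algorithm: it counts only contiguous k-mers, whereas the abstract refers to sequential patterns, and the function name `frequent_kmers`, the toy sequences, and the parameter values are all assumptions made for illustration.

```python
from collections import Counter

def frequent_kmers(sequences, k, min_support):
    """Return contiguous k-mers found in at least `min_support` sequences.

    Support is counted once per sequence, so a k-mer repeated inside one
    sequence does not inflate its count.
    """
    support = Counter()
    for seq in sequences:
        # Collect the distinct k-mers of this sequence before counting.
        seen = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        support.update(seen)
    return {kmer: n for kmer, n in support.items() if n >= min_support}

# Hypothetical toy sequences; the first two share a basic stretch.
toy_sequences = ["MAPKKKRKVED", "GSPKKKRKLAA", "MTEYKLVVVGA"]
patterns = frequent_kmers(toy_sequences, k=4, min_support=2)
print(sorted(patterns))
```

On this toy input, only the k-mers shared by the first two sequences reach the support threshold.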


Integrative Missing Value Estimation For Microarray Data, Jianjun Hu, H. Li, M. Waterman, X. Zhou Jun 2015

Jianjun Hu

Background

Missing value estimation is an important preprocessing step in microarray analysis. Although several methods have been developed to solve this problem, their performance is unsatisfactory for datasets with high rates of missing data, high measurement noise, or limited numbers of samples. In fact, more than 80% of the time-series datasets in the Stanford Microarray Database contain fewer than eight samples.

Results

We present the integrative Missing Value Estimation method (iMISS) by incorporating information from multiple reference microarray datasets to improve missing value estimation. For each gene with missing data, we derive a consistent neighbor-gene list by taking reference data sets …


Robust And Efficient Genetic Algorithms With Hierarchical Niching And A Sustainable Evolutionary Computation Model, Jianjun Hu, E. Goodman Jun 2015

Jianjun Hu

No abstract provided.


Emd: An Ensemble Algorithm For Discovering Regulatory Motifs In Dna Sequences, Jianjun Hu, Y. Yang, D. Kihara Jun 2015

Jianjun Hu

Background

Understanding gene regulatory networks has become one of the central research problems in bioinformatics. More than thirty algorithms have been proposed to identify DNA regulatory sites during the past thirty years. However, the prediction accuracy of these algorithms is still quite low. Ensemble algorithms have emerged as an effective strategy in bioinformatics for improving the prediction accuracy by exploiting the synergetic prediction capability of multiple algorithms.

Results

We proposed a novel clustering-based ensemble algorithm named EMD for de novo motif discovery by combining multiple predictions from multiple runs of one or more base component algorithms. The ensemble approach is …


Computational Prediction Of Heme-Binding Residues By Exploiting Residue Interaction Network, R. Liu, Jianjun Hu Jun 2015

Jianjun Hu

Computational identification of heme-binding residues is beneficial for predicting and designing novel heme proteins. Here we proposed a novel method for heme-binding residue prediction by exploiting topological properties of these residues in the residue interaction networks derived from three-dimensional structures. Comprehensive analysis showed that key residues located in heme-binding regions are generally associated with the nodes with higher degree, closeness and betweenness, but lower clustering coefficient in the network. HemeNet, a support vector machine (SVM) based predictor, was developed to identify heme-binding residues by combining topological features with existing sequence and structural features. The results showed that incorporation of network-based …
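
Two of the topological properties named above, degree and clustering coefficient, can be computed on a small contact graph as follows. This is a sketch only, not HemeNet: the residue labels and contacts are hypothetical, the helper names are assumptions, and closeness and betweenness are omitted for brevity.

```python
from itertools import combinations

def build_graph(edges):
    """Build an undirected adjacency map {node: set(neighbors)}."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def degree(adj, node):
    """Number of residues in contact with `node`."""
    return len(adj[node])

def clustering_coefficient(adj, node):
    """Fraction of neighbor pairs of `node` that are themselves in contact."""
    neighbors = adj[node]
    k = len(neighbors)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(neighbors, 2) if b in adj[a])
    return 2.0 * links / (k * (k - 1))

# Hypothetical residue contacts (labels are illustrative only).
toy_edges = [
    ("H93", "L89"), ("H93", "F43"), ("H93", "V68"),
    ("L89", "F43"), ("V68", "A71"),
]
toy_graph = build_graph(toy_edges)

for residue in sorted(toy_graph):
    print(residue, degree(toy_graph, residue),
          round(clustering_coefficient(toy_graph, residue), 3))
```

In this toy graph the hub residue has the highest degree but a lower clustering coefficient than its tightly connected neighbors, which is the kind of pattern the abstract describes for key heme-binding residues.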


Prediction Of Discontinuous B-Cell Epitopes Using Logistic Regression And Structural Information, R. Liu, Jianjun Hu Jun 2015

Jianjun Hu

Computational prediction of discontinuous B-cell epitopes remains challenging, but it is an important task in vaccine design. In this study, we developed a novel computational method to predict discontinuous epitope residues by combining the logistic regression model with two important structural features, B-factor and relative accessible surface area (RASA). We conducted five-fold cross-validation on a representative dataset composed of antigen structures bound with antibodies, and independent testing on the Epitome database. Experimental results indicate that besides the well-known RASA feature, B-factor can also be used to identify discontinuous epitopes. Furthermore, these two features are complementary and their combination can remarkably …
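
A logistic regression over two per-residue features can be sketched in a few lines. The example below is an illustration under stated assumptions, not the authors' model: the feature values are invented, both features are assumed to be scaled to [0, 1], and the fitting routine is plain batch gradient descent with hypothetical names and hyperparameters.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(w, b, x):
    """Probability that a residue with feature vector x is an epitope residue."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

# Invented toy data: (scaled B-factor, RASA) per residue; 1 = epitope residue.
X = [(0.9, 0.8), (0.8, 0.9), (0.7, 0.7), (0.2, 0.1), (0.1, 0.3), (0.3, 0.2)]
y = [1, 1, 1, 0, 0, 0]
w, b = fit_logistic(X, y)

print(predict_proba(w, b, (0.85, 0.85)) > 0.5)  # flexible, exposed residue
print(predict_proba(w, b, (0.15, 0.15)) > 0.5)  # rigid, buried residue
```

Because each feature gets its own weight, the fitted model can use the two features jointly, which is the sense in which complementary features can improve on either one alone.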