Open Access. Powered by Scholars. Published by Universities.®
Articles 1 - 30 of 40
Full-Text Articles in Entire DC Network
Structure–Activity Relationship-Based Chemical Classification Of Highly Imbalanced Tox21 Datasets, Gabriel Idakwo, Sundar Thangapandian, Joseph Luttrell, Yan Li, Nan Wang, Zhaoxian Zhou, Huixiao Hong, Bei Yang, Chaoyang Zhang, Ping Gong
Faculty Publications
The specificity of toxicant-target biomolecule interactions contributes to the highly imbalanced nature of many toxicity datasets, causing poor performance in Structure–Activity Relationship (SAR)-based chemical classification. Undersampling and oversampling are representative techniques for handling such an imbalance challenge. However, removing inactive chemical compound instances from the majority class using an undersampling technique can result in information loss, whereas increasing active toxicant instances in the minority class by interpolation tends to introduce artificial minority instances that often cross into the majority class space, giving rise to class overlapping and a higher false prediction rate. In this study, in order to improve the …
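The two resampling ideas named in this abstract can be illustrated with a minimal sketch. It is not the authors' method: the toy 2-D points, the function names, and the choice of random minority pairs for interpolation are all assumptions made for illustration (SMOTE-style methods typically interpolate between nearest neighbours instead).

```python
import random

def random_undersample(majority, n_keep, rng):
    """Keep a random subset of the majority class; the discarded items are the 'information loss'."""
    return rng.sample(majority, n_keep)

def interpolate_oversample(minority, n_new, rng):
    """Create synthetic minority points on the segment between two randomly chosen minority points."""
    synthetic = []
    for _ in range(n_new):
        a, b = rng.sample(minority, 2)
        t = rng.random()  # position along the segment from a to b
        synthetic.append(tuple(x + t * (y - x) for x, y in zip(a, b)))
    return synthetic

# Hypothetical toy data: 200 inactive (majority) and 10 active (minority) compounds in 2-D.
rng = random.Random(0)
inactive = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(200)]
active = [(rng.gauss(3.0, 1.0), rng.gauss(3.0, 1.0)) for _ in range(10)]

kept_inactive = random_undersample(inactive, 50, rng)
synthetic_active = interpolate_oversample(active, 40, rng)
balanced_active = active + synthetic_active
```

Because each synthetic point lies on a line between two real minority points, it can land in a region dominated by the majority class when the minority class is scattered, which is the class-overlap problem the abstract describes.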
ENTRNA: A Framework To Predict RNA Foldability, Congzhe Su, Jeffery D. Weir, Fei Zhang, Hao Yan, Teresa Wu
Faculty Publications
RNA molecules play many crucial roles in living systems. The spatial complexity that exists in RNA structures determines their cellular functions. Therefore, understanding RNA folding conformations, in particular, RNA secondary structures, is critical for elucidating biological functions. Existing literature has focused on RNA design as either an RNA structure prediction problem or an RNA inverse folding problem where free energy has played a key role.
In Silico Identification Of Genetic Mutations Conferring Resistance To Acetohydroxyacid Synthase Inhibitors: A Case Study Of Kochia Scoparia, Yan Li, Michael D. Netherland, Chaoyang Zhang, Huixiao Hong, Ping Gong
Faculty Publications
Mutations that confer herbicide resistance are a primary concern for herbicide-based chemical control of invasive plants and are often under-characterized structurally and functionally. As the outcome of selection pressure, resistance mutations usually result from repeated long-term applications of herbicides with the same mode of action and are discovered through extensive field trials. Here we used acetohydroxyacid synthase (AHAS) of Kochia scoparia (KsAHAS) as an example to demonstrate that, given the sequence of a target protein, the impact of genetic mutations on ligand binding could be evaluated and resistance mutations could be identified using a biophysics-based computational approach. Briefly, …
Predicting Protein Residue-Residue Contacts Using Random Forests And Deep Networks, Joseph Luttrell IV, Tong Liu, Chaoyang Zhang, Zheng Wang
Faculty Publications
Background: The ability to predict which pairs of amino acid residues in a protein are in contact with each other offers many advantages for various areas of research that focus on proteins. For example, contact prediction can be used to reduce the computational complexity of predicting the structure of proteins and even to help identify functionally important regions of proteins. These predictions are becoming especially important given the relatively low number of experimentally determined protein structures compared to the amount of available protein sequence data.
Results: Here we have developed and benchmarked a set of machine learning methods …
Similarities And Differences Between Variants Called With Human Reference Genome HG19 Or HG38, Bohu Pan, Rebecca Kusko, Wenming Xiao, Yuantin Zheng, Zhichao Liu, Chunlin Xiao, Sugunadevi Sakkiah, Wenjing Guo, Ping Gong, Chaoyang Zhang, Weigong Ge, Leming Shi, Weida Tong, Huixiao Hong
Faculty Publications
Background: Reference genome selection is a prerequisite for successful analysis of next generation sequencing (NGS) data. Current practice employs one of the two most recent human reference genome versions: HG19 or HG38. To date, the impact of genome version on SNV identification has not been rigorously assessed.
Results: We conducted analysis comparing the SNVs identified based on HG19 vs HG38, leveraging whole genome sequencing (WGS) data from the genome-in-a-bottle (GIAB) project. First, SNVs were called using 26 different bioinformatics pipelines with either HG19 or HG38. Next, two tools were used to convert the called SNVs between HG19 and …
Deep Learning Architectures For Multi-Label Classification Of Intelligent Health Risk Prediction, Andrew Maxwell, Runzhi Li, Bei Yang, Heng Weng, Aihua Ou, Huixiao Hong, Zhaoxian Zhou, Ping Gong, Chaoyang Zhang
Faculty Publications
No abstract provided.
Proceedings Of The 2014 MidSouth Computational Biology And Bioinformatics Society (MCBIOS) Conference, Jonathan D. Wren, Mikhail G. Dozmorov, Dennis Burian, Andy Perkins, Chaoyang Zhang, Peter Hoyt, Rakesh Kaundal
Faculty Publications
No abstract provided.
SMOQ: A Tool For Predicting The Absolute Residue-Specific Quality Of A Single Protein Model With Support Vector Machine, Renzhi Cao, Zheng Wang, Yiheng Wang, Jianlin Cheng
Faculty Publications
Background: It is important to predict the quality of a protein structural model before its native structure is known. The method that can predict the absolute local quality of individual residues in a single protein model is rare, yet particularly needed for using, ranking and refining protein models.
Results: We developed a machine learning tool (SMOQ) that can predict the distance deviation of each residue in a single protein model. SMOQ uses support vector machines (SVM) with protein sequence and structural features (i.e. basic feature set), including amino acid sequence, secondary structures, solvent accessibilities, and residue-residue contacts to …
A Course-Based Research Experience: How Benefits Change With Increased Investment In Instructional Time, Christopher D. Shaffer, Consuelo J. Alvarez, April E. Bednarski, David Dunbar, Anya L. Goodman, Catherine Reinke, Anne G. Rosenwald, Michael J. Wolyniak, Cheryl Bailey, Daron Barnard, Christopher Bazinet, Dale L. Beach, James E.J. Bedard, Satish Bhalla, John Braverman, Martin Burg, Vidya Chandrasekaran, Hui-Min Chung, Kari Clase, Randall J. Dejong, Justin R. Diangelo, Chunguang Du, Todd T. Eckdahl, Heather Eisler, Julia A. Emerson, Amy Frary, Donald Frohlich, Yuying Gosser, Shubha Govind, Adam Haberman, Amy T. Hark, Charles Hauser, Arlene Hoogewerf, Laura L.M. Hoopes, Carina E. Howell, Diana Johnson, Christopher J. Jones, Lisa Kadlec, Marian Kaehler, S. Catherine Silver Key, Adam Kleinschmit, Nighat P. Kokan, Olga Kopp, Gary Kuleck, Judith Leatherman, Jane Lopilato, Christy Mackinnon, Juan Carlos Martinez-Cruzado, Gerard Mcneil, Stephanie Mel, Hemlata Mistry, Alexis Nagengast, Paul Overvoorde, Don W. Paetkau, Susan Parrish, Celeste N. Peterson, Mary Preuss, Laura K. Reed, Dennis Revie, Srebrenka Robic, Jennifer Roecklein-Canfield, Michael R. Rubin, Kenneth Saville, Stephanie Schroeder, Karim Sharif, Mary Shaw, Gary Skuse, Christopher D. Smith, Mary A. Smith, Sheryl T. Smith, Eric Spana, Mary Spratt, Aparna Sreenivasan, Joyce Stamm, Paul Szauter, Jeffrey S. Thompson, Matthew Wawersik, James Youngblom, Leming Zhou, Elaine R. Mardis, Jeremy Buhler, Wilson Leung, David Lopatto, Sarah C.R. Elgin
Faculty Publications
There is widespread agreement that science, technology, engineering, and mathematics programs should provide undergraduates with research experience. Practical issues and limited resources, however, make this a challenge. We have developed a bioinformatics project that provides a course-based research experience for students at a diverse group of schools and offers the opportunity to tailor this experience to local curriculum and institution-specific student needs. We assessed both attitude and knowledge gains, looking for insights into how students respond given this wide range of curricular and institutional variables. While different approaches all appear to result in learning gains, we find that a significant …
Differential Reconstructed Gene Interaction Networks For Deriving Toxicity Threshold In Chemical Risk Assessment, Yi Yang, Andrew Maxwell, Xiaowei Zhang, Nan Wang, Edward J. Perkins, Chaoyang Zhang, Ping Gong
Faculty Publications
Background: Pathway alterations reflected as changes in gene expression regulation and gene interaction can result from cellular exposure to toxicants. Such information is often used to elucidate toxicological modes of action. From a risk assessment perspective, alterations in biological pathways are a rich resource for setting toxicant thresholds, which may be more sensitive and mechanism-informed than traditional toxicity endpoints. Here we developed a novel differential networks (DNs) approach to connect pathway perturbation with toxicity threshold setting.
Methods: Our DNs approach consists of 6 steps: time-series gene expression data collection, identification of altered genes, gene interaction network reconstruction, differential …
SeqNLS: Nuclear Localization Signal Prediction Based On Frequent Pattern Mining And Linear Motif Scoring, J.-R. Lin, Jianjun Hu
Faculty Publications
Nuclear localization signals (NLSs) are stretches of residues in proteins that mediate their import into the nucleus. NLSs are known to have diverse patterns, of which only a limited number are covered by currently known NLS motifs. Here we propose a sequential pattern mining algorithm, SeqNLS, to effectively identify potential NLS patterns without being constrained by the limitations of current knowledge of NLSs. The extracted frequent sequential patterns are used to predict NLS candidates, which are then filtered by a linear motif-scoring scheme based on predicted sequence disorder and by the relatively local conservation (IRLC) based masking.
The experiment results on …
MTBindingSim: Simulate Protein Binding To Microtubules, Julia T. Philip, Charles H. Pence, Holly V. Goodson
Faculty Publications
Summary: Many protein–protein interactions are more complex than can be accounted for by 1:1 binding models. However, biochemists have few tools available to help them recognize and predict the behaviors of these more complicated systems, making it difficult to design experiments that distinguish between possible binding models. MTBindingSim provides researchers with an environment in which they can rapidly compare different models of binding for a given scenario. It is written specifically with microtubule polymers in mind, but many of its models apply equally well to any polymer or any protein–protein interaction. MTBindingSim can thus both help in training intuition about …
Minimalist Ensemble Algorithms For Genome-Wide Protein Localization Prediction, J.-R. Lin, A. M. Mondal, R. Liu, Jianjun Hu
Faculty Publications
Background
Computational prediction of protein subcellular localization can greatly help to elucidate its functions. Despite the existence of dozens of protein localization prediction algorithms, the prediction accuracy and coverage are still low. Several ensemble algorithms have been proposed to improve the prediction performance, which usually include as many as 10 or more individual localization algorithms. However, their performance is still limited by the running complexity and redundancy among individual prediction algorithms.
Results
This paper proposed a novel method for rational design of minimalist ensemble algorithms for practical genome-wide protein subcellular localization prediction. The algorithm is based on combining a feature …
Transcriptomic Profiles Of Peripheral White Blood Cells In Type II Diabetes And Racial Differences In Expression Profiles, Jinghe Mao, Junmei Ai, Xinchun Zhou, Ming Shenwu, Manuel Ong Jr., Marketta Blue, Jasmine T. Washington, Xiaonan Wang, Youping Deng
Faculty Publications
Background: Along with obesity, physical inactivity, and family history of metabolic disorders, African American ethnicity is a risk factor for type 2 diabetes (T2D) in the United States. However, little is known about the differences in gene expression and transcriptomic profiles of blood in T2D between African Americans (AA) and Caucasians (CAU), and microarray analysis of peripheral white blood cells (WBCs) from these two ethnic groups will facilitate our understanding of the underlying molecular mechanism in T2D and identify genetic biomarkers responsible for the disparities.
Results: A whole human genome oligomicroarray of peripheral WBCs was performed on 144 …
RefNetBuilder: A Platform For Construction Of Integrated Reference Gene Regulatory Networks From Expressed Sequence Tags, Ying Li, Ping Gong, Edward J. Perkins, Chaoyang Zhang, Nan Wang
Faculty Publications
Background: Gene Regulatory Networks (GRNs) provide integrated views of gene interactions that control biological processes. Many public databases contain biological interactions extracted from experimentally validated literature reports, but most furnish only information for a few genetic model organisms. In order to provide a bioinformatic tool for researchers who work with non-model organisms, we developed RefNetBuilder, a new platform that allows construction of putative reference pathways or GRNs from expressed sequence tags (ESTs).
Results: RefNetBuilder was designed to have the flexibility to extract and archive pathway or GRN information from public databases such as the Kyoto Encyclopedia of Genes …
The Proteogenomic Mapping Tool, William S. Sanders, Nan Wang, Susan M. Bridges, Brandon M. Malone, Yoginder S. Dandass, Fiona M. Mccarthy, Bindu Nanduri, Mark L. Lawrence, Shane C. Burgess
Faculty Publications
Background: High-throughput mass spectrometry (MS) proteomics data is increasingly being used to complement traditional structural genome annotation methods. To keep pace with the high speed of experimental data generation and to aid in structural genome annotation, experimentally observed peptides need to be mapped back to their source genome location quickly and exactly. Previously, the tools to do this have been limited to custom scripts designed by individual research groups to analyze their own data, are generally not widely available, and do not scale well with large eukaryotic genomes.
Results: The Proteogenomic Mapping Tool includes a Java implementation of …
Computational Prediction Of Heme-Binding Residues By Exploiting Residue Interaction Network, R. Liu, Jianjun Hu
Faculty Publications
Computational identification of heme-binding residues is beneficial for predicting and designing novel heme proteins. Here we proposed a novel method for heme-binding residue prediction by exploiting topological properties of these residues in the residue interaction networks derived from three-dimensional structures. Comprehensive analysis showed that key residues located in heme-binding regions are generally associated with the nodes with higher degree, closeness and betweenness, but lower clustering coefficient in the network. HemeNet, a support vector machine (SVM) based predictor, was developed to identify heme-binding residues by combining topological features with existing sequence and structural features. The results showed that incorporation of network-based …
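As a rough illustration of the topological properties mentioned in this abstract, the sketch below computes degree, closeness, and clustering coefficient for each node of a small undirected graph. The four-node contact map and all function names are hypothetical, betweenness is omitted for brevity, and this is not the HemeNet feature pipeline.

```python
from collections import deque

def degree(adj, node):
    """Number of direct neighbours of the node."""
    return len(adj[node])

def clustering_coefficient(adj, node):
    """Fraction of neighbour pairs that are themselves connected."""
    nbrs = list(adj[node])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if nbrs[j] in adj[nbrs[i]]
    )
    return 2.0 * links / (k * (k - 1))

def closeness(adj, node):
    """(reachable nodes) / (sum of shortest-path lengths), found by breadth-first search."""
    dist = {node: 0}
    queue = deque([node])
    while queue:
        cur = queue.popleft()
        for nxt in adj[cur]:
            if nxt not in dist:
                dist[nxt] = dist[cur] + 1
                queue.append(nxt)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total > 0 else 0.0

# Hypothetical toy residue interaction network: A contacts B, C, D; B also contacts C.
contacts = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A"},
}

features = {
    n: (degree(contacts, n), closeness(contacts, n), clustering_coefficient(contacts, n))
    for n in contacts
}
```

In this toy graph, node A shows the pattern the abstract associates with binding-region residues: the highest degree and closeness, combined with a lower clustering coefficient than nodes B and C.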
Prediction Of Discontinuous B-Cell Epitopes Using Logistic Regression And Structural Information, R. Liu, Jianjun Hu
Faculty Publications
Computational prediction of discontinuous B-cell epitopes remains challenging, but it is an important task in vaccine design. In this study, we developed a novel computational method to predict discontinuous epitope residues by combining the logistic regression model with two important structural features, B-factor and relative accessible surface area (RASA). We conducted five-fold cross-validation on a representative dataset composed of antigen structures bound with antibodies and independent testing on the Epitome database. Experimental results indicate that besides the well-known RASA feature, B-factor can also be used to identify discontinuous epitopes. Furthermore, these two features are complementary and their combination can remarkably …
HemeBind: A Novel Method For Heme Binding Residue Prediction By Combining Structural And Sequence Information, R. Liu, Jianjun Hu
Faculty Publications
Background
Accurate prediction of binding residues involved in the interactions between proteins and small ligands is one of the major challenges in structural bioinformatics. Heme is an essential and commonly used ligand that plays critical roles in electron transfer, catalysis, signal transduction and gene expression. Although much effort has been devoted to the development of various generic algorithms for ligand binding site prediction over the last decade, no algorithm has been specifically designed to complement experimental techniques for identification of heme binding residues. Consequently, an urgent need is to develop a computational method for recognizing these important residues.
Results
Here …
Quail Genomics: A Knowledgebase For Northern Bobwhite, Arun Rawat, Kurt A. Gust, Mohamed O. Elasri, Edward J. Perkins
Faculty Publications
Background
The Quail Genomics knowledgebase (http://www.quailgenomics.info) has been initiated to share and develop functional genomic data for Northern bobwhite (Colinus virginianus). This web-based platform has been designed to allow researchers to perform analysis and curate genomic information for this non-model species that has little supporting information in GenBank.
Description
A multi-tissue, normalized cDNA library generated for Northern bobwhite was sequenced using 454 Life Sciences next generation sequencing. The Quail Genomics knowledgebase represents the 478,142 raw ESTs generated from the sequencing effort in addition to assembled nucleotide and protein sequences including 21,980 unigenes annotated with meta-data. A …
Time Lagged Information Theoretic Approaches To The Reverse Engineering Of Gene Regulatory Networks, Vijender Chaitankar, Preetam Ghosh, Edward J. Perkins, Ping Gong, Youping Deng, Chaoyang Zhang
Faculty Publications
Background: A number of models and algorithms have been proposed in the past for gene regulatory network (GRN) inference; however, none of them address the effects of the size of time-series microarray expression data in terms of the number of time-points. In this paper, we study this problem by analyzing the behaviour of three algorithms based on information theory and dynamic Bayesian network (DBN) models. These algorithms were implemented on different sizes of data generated by synthetic networks. Experiments show that the inference accuracy of these algorithms reaches a saturation point after a specific data size brought about by …
Dynamics Of Protofibril Elongation And Association Involved In Aβ42 Peptide Aggregation In Alzheimer's Disease, Preetam Ghosh, Amit Kumar, Bhaswati Datta, Vijayaraghavan Rangachari
Faculty Publications
Background: Aggregates of a protein called ‘Aβ’ found in the brains of Alzheimer’s patients are strongly believed to be the cause of neuronal death and cognitive decline. Among the different forms of Aβ aggregates, smaller aggregates called ‘soluble oligomers’ are increasingly believed to be the primary neurotoxic species responsible for early synaptic dysfunction. Since it is well known that Aβ aggregation is a nucleation-dependent process, it is widely believed that the toxic oligomers are intermediates to fibril formation, or what we call the ‘on-pathway’ products. Modeling of Aβ aggregation has been under intense investigation during the last …
Incorporating Genomics And Bioinformatics Across The Life Sciences Curriculum, Jayna L. Ditty, Christopher A. Kvaal, Brad Goodner, Sharyn K. Freyermuth, Cheryl Bailey, Robert A. Britton, Stuart G. Gordon, Sabine Heinhorst, Kelyenne Reed, Zhaohui Xu, Erin R. Sanders-Lorenz, Seth Axen, Edwin Kim, Mitrick Johns, Kathleen Scott, Cheryl A. Kerfeld
Faculty Publications
No abstract provided.
BayesMotif: De Novo Protein Sorting Motif Discovery From Impure Datasets, Jianjun Hu, F. Zhang
Faculty Publications
Background
Protein sorting is the process by which newly synthesized proteins are transported to their target locations within or outside of the cell. This process is precisely regulated by protein sorting signals in different forms. A major category of sorting signals is amino acid sub-sequences usually located at the N-termini or C-termini of protein sequences. Genome-wide experimental identification of protein sorting signals is extremely time-consuming and costly. Effective computational algorithms for de novo discovery of protein sorting signals are needed to improve the understanding of protein sorting mechanisms.
Methods
We formulated the protein sorting motif discovery problem as a classification problem …
Feature Selection And Classification Of MAQC-II Breast Cancer And Multiple Myeloma Microarray Gene Expression Data, Qingzhong Liu, Andrew H. Sung, Zhongxue Chen, Jianzhong Liu, Xudong Huang, Youping Deng
Faculty Publications
Microarray data has a high dimension of variables but available datasets usually have only a small number of samples, thereby making the study of such datasets interesting and challenging. In the task of analyzing microarray data for the purpose of, e.g., predicting gene-disease association, feature selection is very important because it provides a way to handle the high dimensionality by exploiting information redundancy induced by associations among genetic markers. Judicious feature selection in microarray data analysis can result in significant reduction of cost while maintaining or improving the classification or prediction accuracy of learning machines that are employed to sort …
Subcellular Localization Of Marine Bacterial Alkaline Phosphatases, H. Luo, Ronald Benner, R. A. Long, Jianjun Hu
Faculty Publications
Bacterial alkaline phosphatases (APases) are important enzymes in organophosphate utilization in the ocean. The subcellular localization of APases has significant ecological implications for marine biota but is largely unknown. The extensive metagenomic sequence databases from the Global Ocean Sampling Expedition provide an opportunity to address this question. A bioinformatics pipeline was developed to identify marine bacterial APases from the metagenomic databases, and a consensus classification algorithm was designed to predict their subcellular localizations. We identified 3,733 bacterial APase sequences (including PhoA, PhoD, and PhoX) and found that cytoplasmic (41%) and extracellular (30%) APases exceed their periplasmic (17%), outer membrane (12%), …
Integrative Disease Classification Based On Cross-Platform Microarray Data, C.-C. Liu, Jianjun Hu, M. Kalakrishnan, H. Huang, X. J. Zhou
Faculty Publications
Background
Disease classification has been an important application of microarray technology. However, most microarray-based classifiers can only handle data generated within the same study, since microarray data generated by different laboratories or with different platforms cannot be compared directly due to systematic variations. This issue has severely limited the practical use of microarray-based disease classification.
Results
In this study, we tested the feasibility of disease classification by integrating the large amount of heterogeneous microarray datasets from the public microarray repositories. Cross-platform data compatibility is created by deriving expression log-rank ratios within datasets. One may then compare vectors of log-rank …
Novel Implementation Of Conditional Co-Regulation By Graph Theory To Derive Co-Expressed Genes From Microarray Data, Arun Rawat, Georg J. Seifert, Youping Deng
Faculty Publications
Background
Most existing transcriptional databases, like the Comprehensive Systems-Biology Database (CSB.DB) and the Arabidopsis Microarray Database and Analysis Toolbox (GENEVESTIGATOR), help to seek a shared biological role (similar pathways and biosynthetic cycles) based on correlation. These utilize conventional methods like Pearson correlation and Spearman rank correlation to calculate correlation among genes. However, not all genes are expressed in all conditions, and this leads to their exclusion from these transcriptional databases, which consist of experiments performed under varied conditions. This leads to incomplete studies of co-regulation among groups of genes that might be linked to the same or related biosynthetic pathway.
Results …
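For readers unfamiliar with the two conventional measures named in the background of this abstract, the sketch below computes Pearson and Spearman rank correlation for two expression profiles. The profiles `gene_a` and `gene_b` are made-up values, the function names are illustrative, and the sketch does not represent the conditional co-regulation method the paper proposes.

```python
import math

def pearson(x, y):
    """Pearson correlation: covariance divided by the product of standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(values):
    """1-based ranks; tied values share the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed values."""
    return pearson(ranks(x), ranks(y))

# Hypothetical expression profiles across five conditions (gene_b rises nonlinearly with gene_a).
gene_a = [1.0, 2.0, 3.0, 4.0, 5.0]
gene_b = [1.0, 4.0, 9.0, 16.0, 25.0]

r_pearson = pearson(gene_a, gene_b)
r_spearman = spearman(gene_a, gene_b)
```

Both measures need a value for every condition, so a gene that is not expressed under some conditions has no complete profile to correlate, which is the limitation the abstract points out.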
Cloning, Analysis And Functional Annotation Of Expressed Sequence Tags From The Earthworm Eisenia Fetida, Mehdi Pirooznia, Ping Gong, Xin Guan, Laura S. Inouye, Kuan Yang, Edward J. Perkins, Youping Deng
Faculty Publications
Background
Eisenia fetida, commonly known as red wiggler or compost worm, belongs to the Lumbricidae family of the Annelida phylum. Little is known about its genome sequence although it has been extensively used as a test organism in terrestrial ecotoxicology. In order to understand its gene expression response to environmental contaminants, we cloned 4032 cDNAs or expressed sequence tags (ESTs) from two E. fetida libraries enriched with genes responsive to ten ordnance related compounds using suppressive subtractive hybridization-PCR.
Results
A total of 3144 good quality ESTs (GenBank dbEST accession number EH669363–EH672369 and EL515444–EL515580) were obtained from the raw clone …
Comparison Of Probabilistic Boolean Network And Dynamic Bayesian Network Approaches For Inferring Gene Regulatory Networks, Peng Li, Chaoyang Zhang, Edward J. Perkins, Ping Gong, Youping Deng
Faculty Publications
Background: The regulation of gene expression is achieved through gene regulatory networks (GRNs) in which collections of genes interact with one another and other substances in a cell. In order to understand the underlying function of organisms, it is necessary to study the behavior of genes in a gene regulatory network context. Several computational approaches are available for modeling gene regulatory networks with different datasets. In order to optimize modeling of GRN, these approaches must be compared and evaluated in terms of accuracy and efficiency.
Results: In this paper, two important computational approaches for modeling gene regulatory networks, …