Open Access. Powered by Scholars. Published by Universities.®

Digital Commons Network

2013

Bioinformatics

Articles 1 - 30 of 36

Full-Text Articles in Entire DC Network

Automated Annotation Of Functional Imaging Experiments Via Multi-Label Classification, Matthew Turner, Chayan Chakrabarti, Thomas B. Jones, Jiawei F. Xu, Peter T. Fox, George F. Luger, Angela R. Laird, Jessica A. Turner Dec 2013

Neuroscience Institute Faculty Publications

Identifying the experimental methods in human neuroimaging papers is important for grouping meaningfully similar experiments for meta-analyses. Currently, this can only be done by human readers. We present the performance of common machine learning (text mining) methods applied to the problem of automatically classifying or labeling this literature. Labeling terms are from the Cognitive Paradigm Ontology (CogPO), the text corpora are abstracts of published functional neuroimaging papers, and the methods use the performance of a human expert as training data. We aim to replicate the expert’s annotation of multiple labels per abstract identifying the experimental stimuli, cognitive paradigms, response types, …


Gene Regulatory Network Analysis And Web-Based Application Development, Yi Yang Dec 2013

Dissertations

Microarray data is a valuable source for gene regulatory network analysis. Using earthworm microarray data analysis as an example, this dissertation demonstrates that a bioinformatics-guided reverse engineering approach can be applied to analyze time-series data to uncover the underlying molecular mechanism. My network reconstruction results reinforce previous findings that certain neurotransmitter pathways are the target of two chemicals - carbaryl and RDX. This study also concludes that perturbations to these pathways by sublethal concentrations of these two chemicals were temporary, and earthworms were capable of fully recovering. Moreover, differential networks (DNs) analysis indicates that many pathways other than those related …


Towards The Prediction Of Mutations In Genomic Sequences, Juan Carlos Martinez Nov 2013

FIU Electronic Theses and Dissertations

Bio-systems are inherently complex information processing systems. Furthermore, physiological complexities of biological systems limit the formation of hypotheses about behavior and the ability to test those hypotheses. More importantly, the identification and classification of mutations in patients are central topics in today’s cancer research.

Next generation sequencing (NGS) technologies can provide genome-wide coverage at a single nucleotide resolution and at reasonable speed and cost. The unprecedented molecular characterization provided by NGS offers the potential for an individualized approach to treatment. These advances in cancer genomics have enabled scientists to interrogate cancer-specific genomic variants and compare them with the …


Unique Microbial Communities Persist In Individual Cystic Fibrosis Patients Throughout A Clinical Exacerbation, Katherine E. Price, Thomas H. Hampton, Alex H. Gifford, Emily L. Dolben, Deborah A. Hogan, Hilary G. Morrison, Mitchell L. Sogin, George A. O’Toole Nov 2013

Dartmouth Scholarship

Cystic fibrosis (CF) is caused by inherited mutations in the cystic fibrosis transmembrane conductance regulator gene and results in a lung environment that is highly conducive to polymicrobial infection. Over a lifetime, decreasing bacterial diversity and the presence of Pseudomonas aeruginosa in the lung are correlated with worsening lung disease. However, to date, no change in community diversity, overall microbial load or individual microbes has been shown to correlate with the onset of an acute exacerbation in CF patients. We followed 17 adult CF patients throughout the course of clinical exacerbation, treatment and recovery, using deep sequencing and quantitative PCR …


Identifying Chromosome Rearrangements In The Allopolyploid Brassica Napus Using Pyrosequencing, Alexandra R. Barbella Oct 2013

Master's Theses

Allopolyploids form through the hybridization of two or more diploid genomes. A challenge to reproduction in allopolyploids is that pairing can occur between homologous chromosomes or homeologous chromosomes (i.e., chromosomes from different subgenomes). Crossover between homeologous chromosomes can result in chromosome rearrangements that lower fertility and overall fitness. Rearrangements can alter the dosage of either entire chromosomes or just parts of chromosomes. Understanding the frequency and extent of rearrangements will help to explain the evolution and genome stabilization of agriculturally important allopolyploid species. Pyrosequencing is a useful tool in the study of dosage changes in allopolyploids because it allows quantification of the relative contribution …


Development And Integration Of Informatic Tools For Qualitative And Quantitative Characterization Of Proteomic Datasets Generated By Tandem Mass Spectrometry, Rachel Michelle Adams Aug 2013

Doctoral Dissertations

Shotgun proteomic experiments provide qualitative and quantitative analytical information from biological samples ranging in complexity from simple bacterial isolates to higher eukaryotes such as plants and humans and even to communities of microbial organisms. Improvements to instrument performance, sample preparation, and informatic tools are increasing the scope and volume of data that can be analyzed by mass spectrometry (MS). To accommodate these advances, it is becoming increasingly essential to choose and/or create tools that not only scale well but also make more informed decisions using additional features within the data. Incorporating novel and existing tools into …


Novel Methods Based On Regression Techniques To Analyze Multistate Models And High-Dimensional Omics Data., Sutirtha Chakraborty Aug 2013

Electronic Theses and Dissertations

The dissertation is based on four distinct research projects that are loosely interconnected by the common link of a regression framework. Chapter 1 provides an introductory outline of the problems addressed in the projects along with a detailed review of the previous works that have been done on them and a brief discussion on our newly developed methodologies. Chapter 2 describes the first project that is concerned with the identification of hidden subject-specific sources of heterogeneity in gene expression profiling analyses and adjusting for them by a technique based on Partial Least Squares (PLS) regression, in order to ensure a …


Identification Of Yeast Cell Cycle Regulated Genes Based On Genomic Features, Chao Cheng, Yao Fu, Linsheng Shen, Mark Gerstein Jul 2013

Dartmouth Scholarship

Background: Time-course microarray experiments have been widely used to identify cell cycle regulated genes. However, the method is not effective for lowly expressed genes and is sensitive to experimental conditions. To complement microarray experiments, we propose a computational method to predict cell cycle regulated genes based on their genomic features – transcription factor binding and motif profiles.

Results: Through integrating gene-expression data with ChIP-chip binding and putative binding sites of transcription factors, our method shows high accuracy in discriminating yeast cell cycle regulated genes from non-cell cycle regulated ones. We predict 211 novel cell cycle regulated genes. Our model rediscovers …


RCytoscape: Tools For Exploratory Network Analysis, Paul Shannon, Mark L. Grimes, Burak Kutlu, Jan J. Bot, David J. Galas Jul 2013

Biological Sciences Faculty Publications

Background: Biomolecular pathways and networks are dynamic and complex, and the perturbations to them which cause disease are often multiple, heterogeneous and contingent. Pathway and network visualizations, rendered on a computer or published on paper, however, tend to be static, lacking in detail, and ill-equipped to explore the variety and quantities of data available today, and the complex causes we seek to understand.

Results: RCytoscape integrates R (an open-ended programming environment rich in statistical power and data-handling facilities) and Cytoscape (powerful network visualization and analysis software). RCytoscape extends Cytoscape's functionality beyond what is possible with the Cytoscape graphical user interface. …


A Polyglot Approach To Bioinformatics Data Integration: Phylogenetic Analysis Of HIV-1, Steven Reisman, Catherine Putonti, George K. Thiruvathukal, Konstantin Läufer Jul 2013

George K. Thiruvathukal

RNA-interference has potential therapeutic use against HIV-1 by targeting highly-functional mRNA sequences that contribute to the virulence of the virus. Empirical work has shown that within cell lines, all of the HIV-1 genes are affected by RNAi-induced gene silencing. While promising, inherent in this treatment is the fact that RNAi sequences must be highly specific. HIV, however, mutates rapidly, leading to the evolution of viral escape mutants. In fact, such strains are under strong selection to include mutations within the targeted region, evading the RNAi therapy and thus increasing the virus’ fitness in the host. Taking a phylogenetic approach, we …


Energy Awareness And Scheduling In Mobile Devices And High End Computing, Sachin S. Pawaskar Jul 2013

Student Work

In the big picture, as energy demands rise due to growing economies and growing populations, there will be greater emphasis on sustainable supply, conservation, and efficient usage of this vital resource. Even at a smaller level, the need for minimizing energy consumption continues to be compelling in embedded, mobile, and server systems such as handheld devices, robots, spaceships, laptops, cluster servers, sensors, etc. This is due to the direct impact of constrained energy sources such as battery size and weight, as well as cooling expenses in cluster-based systems to reduce heat dissipation. Energy management therefore plays a …


Algorithms For Library-Based Microbial Source Tracking, Aldrin Montana Jun 2013

Master's Theses

Pyroprinting is a novel, library-based microbial source tracking method developed by the Biology department at Cal Poly, San Luis Obispo. This method consists of two parts: (1) a collection of bacterial fingerprints, called pyroprints, from known host species, and (2) a method for pyroprint comparison. Currently, Cal Poly Library of Pyroprints (CPLOP), a web-based database application, provides storage and analysis of over 10,000 pyroprints. This number is quickly growing as students and researchers continue to use pyroprinting for research. Biologists conducting research using pyroprinting rely on methods for partitioning collected bacterial isolates into bacterial strains. Clustering algorithms are commonly used …


Machine Learning And Genome Annotation: A Match Meant To Be?, Kevin Y. Yip, Chao Cheng, Mark Gerstein May 2013

Dartmouth Scholarship

By its very nature, genomics produces large, high-dimensional datasets that are well suited to analysis by machine learning approaches. Here, we explain some key aspects of machine learning that make it useful for genome annotation, with illustrative examples from ENCODE.


What Google Maps Can Do For Biomedical Data Dissemination: Examples And A Design Study, Radu Jianu, David H. Laidlaw May 2013

School of Computing and Information Sciences

Background: Biologists often need to assess whether unfamiliar datasets warrant the time investment required for more detailed exploration. Basing such assessments on brief descriptions provided by data publishers is unwieldy for large datasets that contain insights dependent on specific scientific questions. Alternatively, using complex software systems for a preliminary analysis may itself be deemed too time-consuming, especially for unfamiliar data types and formats. This may lead to wasted analysis time and discarding of potentially useful data.

Results: We present an exploration of design opportunities that the Google Maps interface offers to biomedical data visualization. In particular, we …


Genome-Wide Profiling Unveils Critical Functions Of p53 In Human Embryonic Stem Cells, Kadir C. Akdemir May 2013

Dissertations & Theses (Open Access)

Embryonic stem cells (ESCs) possess two unique characteristics: infinite self-renewal and the potential to differentiate into almost every cell type (pluripotency). Recently, global expression analyses of metastatic breast and lung cancers revealed an ESC-like expression program or signature, specifically for cancers that are mutant for p53 function. Surprisingly, although p53 is widely recognized as the guardian of the genome, due to its roles in cell cycle checkpoints, programmed cell death or senescence, relatively little is known about p53 functions in normal cells, especially in ESCs. My hypothesis is that p53 has specific transcription regulatory functions in human ESCs (hESCs) that …


A Polyglot Approach To Bioinformatics Data Integration: Phylogenetic Analysis Of HIV-1, Steven Reisman, Catherine Putonti, George K. Thiruvathukal, Konstantin Läufer Apr 2013

Computer Science: Faculty Publications and Other Works

RNA-interference has potential therapeutic use against HIV-1 by targeting highly-functional mRNA sequences that contribute to the virulence of the virus. Empirical work has shown that within cell lines, all of the HIV-1 genes are affected by RNAi-induced gene silencing. While promising, inherent in this treatment is the fact that RNAi sequences must be highly specific. HIV, however, mutates rapidly, leading to the evolution of viral escape mutants. In fact, such strains are under strong selection to include mutations within the targeted region, evading the RNAi therapy and thus increasing the virus’ fitness in the host. Taking a phylogenetic approach, we …


Detecting And Correcting Batch Effects In High-Throughput Genomic Experiments, Sarah Reese Apr 2013

Theses and Dissertations

Batch effects are due to probe-specific systematic variation between groups of samples (batches) resulting from experimental features that are not of biological interest. Principal components analysis (PCA) is commonly used as a visual tool to determine whether batch effects exist after applying a global normalization method. However, PCA yields linear combinations of the variables that contribute maximum variance and thus will not necessarily detect batch effects if they are not the largest source of variability in the data. We present an extension of principal components analysis to quantify the existence of batch effects, called guided PCA (gPCA). We describe a …


A Semantic-Based Method For Extracting Concept Definitions From Scientific Publications: Evaluation In The Autism Phenotype Domain, Saeed Hassanpour, Martin J. O’Connor, Amar K. Das Apr 2013

Dartmouth Scholarship

Background: A variety of informatics approaches have been developed that use information retrieval, NLP and text-mining techniques to identify biomedical concepts and relations within scientific publications or their sentences. These approaches have not typically addressed the challenge of extracting more complex knowledge such as biomedical definitions. In our efforts to facilitate knowledge acquisition of rule-based definitions of autism phenotypes, we have developed a novel semantic-based text-mining approach that can automatically identify such definitions within text.

Results: Using an existing knowledge base of 156 autism phenotype definitions and an annotated corpus of 26 source articles containing such definitions, we evaluated and …


Using Bayesian Networks To Discover Relations Between Genes, Environment, And Disease, Chengwei Su, Angeline Andrew, Margaret R. Karagas, Mark E. Borsuk Mar 2013

Dartmouth Scholarship

We review the applicability of Bayesian networks (BNs) for discovering relations between genes, environment, and disease. By translating probabilistic dependencies among variables into graphical models and vice versa, BNs provide a comprehensible and modular framework for representing complex systems. We first describe the Bayesian network approach and its applicability to understanding the genetic and environmental basis of disease. We then describe a variety of algorithms for learning the structure of a network from observational data. Because of their relevance to real-world applications, the topics of missing data and causal interpretation are emphasized. The BN approach is then exemplified through application …


Multifactor Dimensionality Reduction Reveals A Three-Locus Epistatic Interaction Associated With Susceptibility To Pulmonary Tuberculosis, Ryan L. Collins, Ting Hu, Christian Wejse, Giorgio Sirugo, Scott M. Williams, Jason H. Moore Feb 2013

Dartmouth Scholarship

Background:

Identifying high-order genetic associations with non-additive (i.e., epistatic) effects in population-based studies of common human diseases is a computational challenge. Multifactor dimensionality reduction (MDR) is a machine learning method that was designed specifically for this problem. The goal of the present study was to apply MDR to mining high-order epistatic interactions in a population-based genetic study of tuberculosis (TB).

Results:

The study used a previously published data set consisting of 19 candidate single-nucleotide polymorphisms (SNPs) in 321 pulmonary TB cases and 347 healthy controls from Guinea-Bissau in Africa. The ReliefF algorithm was applied first to generate a smaller set …


Identification Of SNPs Associated With Variola Virus Virulence, Anne Gatewood Hoen, Shea N. Gardner, Jason H. Moore Feb 2013

Dartmouth Scholarship

Background: Decades after the eradication of smallpox, its etiological agent, variola virus (VARV), remains a threat as a potential bioweapon. Outbreaks of smallpox around the time of the global eradication effort exhibited variable case fatality rates (CFRs), likely attributable in part to complex viral genetic determinants of smallpox virulence. We aimed to identify genome-wide single nucleotide polymorphisms associated with CFR. We evaluated unadjusted and outbreak geographic location-adjusted models of single SNPs and two- and three-way interactions between SNPs. Findings: Using the data mining approach multifactor dimensionality reduction (MDR), we identified five VARV SNPs in models significantly associated with CFR. The …


A Novel Algorithm For Validating Peptide Identification From A Shotgun Proteomics Search Engine, Ling Jian, Xinnan Niu, Zhonghang Xia, Parimal Samir, Chiranthani Sumanasekera, Zheng Mu, Jennifer L. Jennings, Kristen L. Hoek, Tara Allos, Leigh M. Howard, Kathryn M. Edwards, P. Anthony Weil, Andrew J. Link Feb 2013

Chemistry Faculty Research

Liquid chromatography coupled with tandem mass spectrometry (LC–MS/MS) has revolutionized the proteomics analysis of complexes, cells, and tissues. In a typical proteomic analysis, the tandem mass spectra from an LC–MS/MS experiment are assigned to a peptide by a search engine that compares the experimental MS/MS peptide data to theoretical peptide sequences in a protein database. The peptide-spectrum matches are then used to infer a list of identified proteins in the original sample. However, the search engines often fail to distinguish between correct and incorrect peptide assignments. In this study, we designed and implemented a novel algorithm called De-Noise to …


Utilizing NMR Spectroscopy And Molecular Docking As Tools For The Structural Determination And Functional Annotation Of Proteins, Jaime Stark Feb 2013

Department of Chemistry: Dissertations, Theses, and Student Research

With the completion of the Human Genome Project in 2001 and the subsequent explosion of organisms with sequenced genomes, we are now aware of nearly 28 million proteins. Determining the role of each of these proteins is essential to our understanding of biology and the development of medical advances. Unfortunately, the experimental approaches to determine protein function are too slow to investigate every protein. Bioinformatics approaches, such as sequence and structure homology, have helped to annotate the functions of many similar proteins. However, despite these computational approaches, approximately 40% of proteins still have no known function. Alleviating this deficit will …


Patterns Of Cytosine Methylation In The Genome Of Caenorhabditis Elegans, Kazufusa Okamoto Jan 2013

Doctoral Dissertations

Recent large-scale comparative analyses of cytosine DNA methylation across diverse eukaryotes suggest that early features of DNA methylation present in the last common ancestor of all eukaryotes some 1.6 to 1.8 billion years ago included the methylation of gene bodies and transposable elements (Zemach, McDaniel et al. 2010; Parfrey, Lahr et al. 2011). These potentially ancient patterns may reflect a primitive role of methylation in transcriptional fidelity and as a mechanism to protect the germ line from transposon, or repeat, mediated mutation. Because spurious transcription and mutation are hypothesized to be among the critical limiting factors to genome size, an …


Disulfide By Design 2.0: A Web-Based Tool For Disulfide Engineering In Proteins, Douglas B. Craig, Alan A. Dombkowski Jan 2013

Wayne State University Associated BioMed Central Scholarship

Abstract

Background

Disulfide engineering is an important biotechnological tool that has advanced a wide range of research. The introduction of novel disulfide bonds into proteins has been used extensively to improve protein stability, modify functional characteristics, and to assist in the study of protein dynamics. Successful use of this technology is greatly enhanced by software that can predict pairs of residues that will likely form a disulfide bond if mutated to cysteines.

Results

We had previously developed and distributed software for this purpose: Disulfide by Design (DbD). The original DbD program has been widely used; however, it has a number …


Computational Approaches To Anti-Toxin Therapies And Biomarker Identification, Rebecca Jane Swett Jan 2013

Wayne State University Dissertations

This work describes the fundamental study of two bacterial toxins with computational methods, the rational design of a potent inhibitor using molecular dynamics, as well as the development of two bioinformatic methods for mining genomic data.

Clostridium difficile is an opportunistic bacillus which produces two large glucosylating toxins. These toxins, TcdA and TcdB, cause severe intestinal damage. As Clostridium difficile harbors considerable antibiotic resistance, one treatment strategy is to prevent the tissue damage that the toxins cause. The catalytic glucosyltransferase domain of TcdA and TcdB was studied using molecular dynamics in the presence of both a protein-protein binding partner and …


Bioinformatic Analysis And In Vitro Expression Of Malaria Parasite Translocon And Ribonuclease Binding-Like Rhoptry Genes, Moses Z. Timta Jan 2013

ETD Archive

Malaria, caused by the parasite Plasmodium, remains a significant public health problem worldwide due to the lack of a vaccine and emerging drug and insecticide resistance among malaria parasites and mosquito vectors, respectively. Rhoptry proteins of Plasmodium enable merozoite invasion of host erythrocytes. However, only a few of these proteins have been characterized. Thirty-six P. yoelii merozoite rhoptry proteins were identified as putative rhoptry proteins by proteome analysis. Some of these proteins have been characterized, while others remain an area of active research. Molecular characterization and understanding of these novel proteins may assist in vaccine development, design of …


SeqNLS: Nuclear Localization Signal Prediction Based On Frequent Pattern Mining And Linear Motif Scoring, J.-R. Lin, Jianjun Hu Jan 2013

Faculty Publications

Nuclear localization signals (NLSs) are stretches of residues in proteins that mediate their import into the nucleus. NLSs are known to have diverse patterns, of which only a limited number are covered by currently known NLS motifs. Here we propose a sequential pattern mining algorithm, SeqNLS, to effectively identify potential NLS patterns without being constrained by the limitations of current knowledge of NLSs. The extracted frequent sequential patterns are used to predict NLS candidates which are then filtered by a linear motif-scoring scheme based on predicted sequence disorder and by the relatively local conservation (IRLC) based masking.

The experiment results on …


Computational Methods For Comparative Non-Coding RNA Analysis: From Structural Motif Identification To Genome-Wide Functional Classification, Cuncong Zhong Jan 2013

Electronic Theses and Dissertations

Recent advances in biological research point out that many ribonucleic acids (RNAs) are transcribed from the genome to perform a variety of cellular functions, rather than merely acting as information carriers for protein synthesis. These RNAs are usually referred to as the non-coding RNAs (ncRNAs). The versatile regulation mechanisms and functionalities of the ncRNAs contribute to the amazing complexity of the biological system. The ncRNAs perform their biological functions by folding into specific structures. In this case, the comparative study of the ncRNA structures is key to the inference of their molecular and cellular functions. We are especially interested in …