Open Access. Powered by Scholars. Published by Universities.®

Bioinformatics Commons

Series

2010

Articles 1 - 30 of 95

Full-Text Articles in Bioinformatics

Minimum Description Length Measures Of Evidence For Enrichment, Zhenyu Yang, David R. Bickel Dec 2010

COBRA Preprint Series

In order to functionally interpret differentially expressed genes or other discovered features, researchers seek to detect enrichment in the form of overrepresentation of discovered features associated with a biological process. Most enrichment methods treat the p-value as the measure of evidence using a statistical test such as the binomial test, Fisher's exact test or the hypergeometric test. However, the p-value is not interpretable as a measure of evidence apart from adjustments in light of the sample size. As a measure of evidence supporting one hypothesis over the other, the Bayes factor (BF) overcomes this drawback of the p-value but lacks …
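For reference, the p-value baseline this abstract mentions can be sketched as a one-sided hypergeometric enrichment test. This is not the minimum description length method of the preprint; the function name and the parameters N, K, n and k are illustrative assumptions.

```python
from math import comb

def hypergeom_enrichment_pvalue(N, K, n, k):
    """Upper-tail hypergeometric p-value P(X >= k).

    Hypothetical parameter names (not from the preprint):
      N: total number of features (e.g. genes) in the background
      K: background features annotated to the biological process
      n: number of discovered (e.g. differentially expressed) features
      k: discovered features annotated to the process
    """
    upper = min(K, n)          # largest possible overlap
    total = comb(N, n)         # number of ways to draw n features from N
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so impossible overlaps contribute nothing to the sum.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total

# Example: 5 of 5 discovered features fall in a category of 5 out of 10.
p = hypergeom_enrichment_pvalue(N=10, K=5, n=5, k=5)
print(f"p = {p:.5f}")
```

A small p-value here only signals overrepresentation under the null of random sampling; as the abstract notes, it is not by itself a sample-size-independent measure of evidence.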


Reconstructability Analysis Of Epistasis, Martin Zwick Dec 2010

Systems Science Faculty Publications and Presentations

The literature on epistasis describes various methods to detect epistatic interactions and to classify different types of epistasis. Reconstructability analysis (RA) has recently been used to detect epistasis in genomic data. This paper shows that RA offers a classification of types of epistasis at three levels of resolution (variable-based models without loops, variable-based models with loops, state-based models). These types can be defined by the simplest RA structures that model the data without information loss; a more detailed classification can be defined by the information content of multiple candidate structures. The RA classification can be augmented with structures from related …


Campylobacter Ureolyticus: An Emerging Gastrointestinal Pathogen?, Susan Bullman, Daniel Corcoran, James O'Leary, Brigid Lucey, Deirdre Byrne, Roy D. Sleator Dec 2010

Department of Biological Sciences Publications

A total of 7194 faecal samples collected over a 1-year period from patients presenting with diarrhoea were screened for Campylobacter spp. using EntericBios, a multiplex-PCR system. Of 349 Campylobacter-positive samples, 23.8% were shown to be Campylobacter ureolyticus, using a combination of 16S rRNA gene analysis and highly specific primers targeting the HSP60 gene of this organism. This is, to the best of our knowledge, the first report of C. ureolyticus in the faeces of patients presenting with gastroenteritis and may suggest a role for this organism as an emerging enteric pathogen.


Spatial Semantics For Better Interoperability And Analysis: Challenges And Experiences In Building Semantically Rich Applications In Web 3.0, Amit P. Sheth Dec 2010

Kno.e.sis Publications

No abstract provided.


Lateral Blood Flow Velocity Estimation Based On Ultrasound Speckle Size Change With Scan Velocity, Tiantian Xu, Gregory R. Bashford Dec 2010

Biomedical Imaging and Biosignal Analysis Laboratory

Conventional (Doppler-based) blood flow velocity measurement methods using ultrasound are capable of resolving the axial component (i.e., that aligned with the ultrasound propagation direction) of the blood flow velocity vector. However, these methods are incapable of detecting blood flow in the direction normal to the ultrasound beam. In addition, these methods require repeated pulse-echo interrogation at the same spatial location. A new method has been introduced which estimates the lateral component of blood flow within a single image frame using the observation that the speckle pattern corresponding to blood reflectors (typically red blood cells) stretches (i.e., is smeared) if the …


Terminology Enhanced EHR: Integration Of Archetypes And Terminology, An Implementation Experience, Sheng Yu, Damon Berry Nov 2010

Reports

The integration of terminology and EHR information models is an important step in the journey towards semantic interoperability. Archetypes and two-level models for EHRs provide a mechanism that not only applies constraints on clinical content but also ensures effective terminology binding. However, the lack of a standardised mechanism to bind terminology to the EHR and the difficulty of systematically coding clinical content have led to a number of possible implementation choices. This study presents a review of the problems that may occur when working with modern terminology systems and discusses some related state-of-the-art technologies. The paper aims to …


Flexible Bootstrapping-Based Ontology Alignment, Prateek Jain, Pascal Hitzler, Amit P. Sheth Nov 2010

Kno.e.sis Publications

BLOOMS (Jain et al., ISWC2010) is an ontology alignment system which, at its core, utilizes the Wikipedia category hierarchy for establishing alignments. In this paper, we present a Plug-and-Play extension to BLOOMS, which allows the use of Wikipedia to be flexibly replaced or complemented by other online or offline resources, including domain-specific ontologies or taxonomies. By making use of automated translation services and of Wikipedia in languages other than English, it makes it possible to apply BLOOMS to alignment tasks where the input ontologies are written in different languages.


Ontology Alignment For Linked Open Data, Prateek Jain, Pascal Hitzler, Amit P. Sheth, Kunal Verma, Peter Z. Yeh Nov 2010

Kno.e.sis Publications

The Web of Data currently coming into existence through the Linked Open Data (LOD) effort is a major milestone in realizing the Semantic Web vision. However, the development of applications based on LOD faces difficulties because the different LOD datasets are rather loosely connected pieces of information. In particular, links between LOD datasets are almost exclusively on the level of instances, and schema-level information is being ignored. In this paper, we therefore present a system for finding schema-level links between LOD datasets in the sense of ontology alignment. Our system, called BLOOMS, is based on the …


(1E,3E)-1,4-Bis(4-Methoxyphenyl)Buta-1,3-Diene, Gopinathan Narayan, Nigam Rath, Suresh Das Oct 2010

Chemistry & Biochemistry Faculty Works

The title compound, C18H18O2, which exhibits blue emission in the solid state, is an intermediate in the preparation of liquid crystals and polymers. The molecule is located on an inversion centre. In the crystal, molecules are arranged in a herringbone motif.


Use Of OIDs And IIs In EN13606, Damon Berry, Jostein Ven, Gerard Freriks, David Moner Oct 2010

Reports

No abstract provided.


Discovering Gene Functional Relationships Using FAUN (Feature Annotation Using Nonnegative Matrix Factorization), Elina Tjioe, Michael W. Berry, Ramin Homayouni Oct 2010

Faculty Publications and Other Works -- EECS

Background

Searching the enormous amount of information available in biomedical literature to extract novel functional relationships among genes remains a challenge in the field of bioinformatics. While numerous (software) tools have been developed to extract and identify gene relationships from biological databases, few effectively deal with extracting new (or implied) gene relationships, a process which is useful in interpretation of discovery-oriented genome-wide experiments.

Results

In this study, we develop a Web-based bioinformatics software environment called FAUN or Feature Annotation Using Nonnegative matrix factorization (NMF) to facilitate both the discovery and classification of functional relationships among genes. Both the computational complexity …


Quail Genomics: A Knowledgebase For Northern Bobwhite, Arun Rawat, Kurt A. Gust, Mohamed O. Elasri, Edward J. Perkins Oct 2010

Faculty Publications

Background

The Quail Genomics knowledgebase (http://www.quailgenomics.info) has been initiated to share and develop functional genomic data for Northern bobwhite (Colinus virginianus). This web-based platform has been designed to allow researchers to perform analysis and curate genomic information for this non-model species that has little supporting information in GenBank.

Description

A multi-tissue, normalized cDNA library generated for Northern bobwhite was sequenced using 454 Life Sciences next generation sequencing. The Quail Genomics knowledgebase represents the 478,142 raw ESTs generated from the sequencing effort in addition to assembled nucleotide and protein sequences including 21,980 unigenes annotated with meta-data. A …


Time Lagged Information Theoretic Approaches To The Reverse Engineering Of Gene Regulatory Networks, Vijender Chaitankar, Preetam Ghosh, Edward J. Perkins, Ping Gong, Youping Deng, Chaoyang Zhang Oct 2010

Faculty Publications

Background: A number of models and algorithms have been proposed in the past for gene regulatory network (GRN) inference; however, none of them address the effects of the size of time-series microarray expression data in terms of the number of time-points. In this paper, we study this problem by analyzing the behaviour of three algorithms based on information theory and dynamic Bayesian network (DBN) models. These algorithms were implemented on different sizes of data generated by synthetic networks. Experiments show that the inference accuracy of these algorithms reaches a saturation point after a specific data size brought about by …
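To make the idea of a time-lagged information-theoretic score concrete, here is a minimal sketch that estimates the mutual information between one discretized expression series and a lagged copy of another. It is a generic illustration, not one of the three algorithms studied in the paper; the function name, the discretized input and the plug-in estimator are all assumptions.

```python
from collections import Counter
from math import log2

def time_lagged_mi(x, y, lag):
    """Plug-in mutual information (bits) between x[t] and y[t + lag].

    x, y: equal-length sequences of discrete expression levels
          (discretization is assumed to have been done beforehand).
    lag:  non-negative number of time points by which y trails x.
    """
    # Pair each x[t] with y[t + lag]; zip drops the unmatched tail.
    pairs = list(zip(x[:len(x) - lag], y[lag:]))
    n = len(pairs)
    joint = Counter(pairs)
    count_x = Counter(a for a, _ in pairs)
    count_y = Counter(b for _, b in pairs)
    # sum over observed (a, b) of p(a, b) * log2(p(a, b) / (p(a) p(b)))
    return sum(
        (c / n) * log2(c * n / (count_x[a] * count_y[b]))
        for (a, b), c in joint.items()
    )

# Toy series in which y copies x with a delay of one time point.
x = [0, 0, 1, 1, 0, 1, 0, 0, 1, 1]
y = [0] + x[:-1]
for lag in (0, 1):
    print(lag, round(time_lagged_mi(x, y, lag), 3))
```

With few time points the plug-in estimate is noisy, which is one reason the number of time points matters for inference accuracy.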


Dynamics Of Protofibril Elongation And Association Involved In Aβ42 Peptide Aggregation In Alzheimer's Disease, Preetam Ghosh, Amit Kumar, Bhaswati Datta, Vijayaraghavan Rangachari Oct 2010

Faculty Publications

Background: The aggregates of a protein called ‘Aβ’, found in the brains of Alzheimer’s patients, are strongly believed to be the cause of neuronal death and cognitive decline. Among the different forms of Aβ aggregates, smaller aggregates called ‘soluble oligomers’ are increasingly believed to be the primary neurotoxic species responsible for early synaptic dysfunction. Since it is well known that Aβ aggregation is a nucleation-dependent process, it is widely believed that the toxic oligomers are intermediates to fibril formation, or what we call the ‘on-pathway’ products. Modeling of Aβ aggregation has been under intense investigation during the last …


Technology In Genomics And Bioinformatics, Timothy Hall Oct 2010

Technology Essay Contest Winners

The advantages that new technological advancements in genomics and bioinformatics provide are numerous and varied. The advent of new technologies provides faster sequencing throughput, making it possible to sequence an entire genome in twenty-four hours. The input of sequencing data and information into large databases distributes it across the world and provides the ability for comparisons between genes, gene products, mutations, and species. The fact that these databases can be accessed instantly will help further catalyze developments not only in genomics but also in the medical field.


Using The R Package crlmm For Genotyping And Copy Number Estimation, Robert B. Scharpf, Rafael Irizarry, Walter Ritchie, Benilton Carvalho, Ingo Ruczinski Sep 2010

Johns Hopkins University, Dept. of Biostatistics Working Papers

Genotyping platforms such as Affymetrix can be used to assess genotype-phenotype as well as copy number-phenotype associations at millions of markers. While genotyping algorithms are largely concordant when assessed on HapMap samples, tools to assess copy number changes are more variable and often discordant. One explanation for the discordance is that copy number estimates are susceptible to systematic differences between groups of samples that were processed at different times or by different labs. Analysis algorithms that do not adjust for batch effects are prone to spurious measures of association. The R package crlmm implements a multilevel model that adjusts for …


2,2′,5,5′-Tetrachlorobenzidine, Onome Ugono, Marcel Douglas, Nigam Rath, Alicia Beatty Sep 2010

Chemistry & Biochemistry Faculty Works

In the crystal structure of the title compound, C12H8Cl4N2, molecules lie on crystallographic twofold axes at the centre of the C-C bonds linking the benzene rings, such that the asymmetric unit consists of a half-molecule. The individual molecules participate in intermolecular N-H...N, N-H...Cl, C-H...Cl and Cl...Cl [3.4503 (3) Å] interactions.


Bioinformatics Across The Sciences, Nigel Yarlett Sep 2010

Cornerstone 3 Reports : Interdisciplinary Informatics

No abstract provided.


A Taxonomy-Based Model For Expertise Extrapolation, Delroy H. Cameron, Boanerges Aleman-Meza, Ismailcem Budak Arpinar, Sheron L. Decker, Amit P. Sheth Sep 2010

Kno.e.sis Publications

While many ExpertFinder applications succeed in finding experts, their techniques are not always designed to capture the various levels at which expertise can be expressed. Indeed, expertise can be inferred from relationships between topics and subtopics in a taxonomy. The conventional wisdom is that expertise in subtopics is also indicative of expertise in higher-level topics. The enrichment of Expertise Profiles for finding experts can therefore be facilitated by taking domain hierarchies into account. We present a novel semantics-based model for finding experts, expertise levels and collaboration levels in a peer review context, such as composing a Program …


Ranking Documents Semantically Using Ontological Relationships, Boanerges Aleman-Meza, I. Budak Arpinar, Mustafa V. Nural, Amit P. Sheth Sep 2010

Kno.e.sis Publications

Despite the arguable success of today’s keyword-based search engines in certain information retrieval tasks, ranking search results in a meaningful way remains an open problem. In this work, the goal is to use semantic relationships for ranking documents without relying on the existence of any specific structure in a document or links between documents. Instead, real-world entities are identified and the relevance of documents is determined using relationships that are known to exist between the entities in a populated ontology. We introduce a measure of relevance that is based on traversal and the semantics of relationships that link entities …


G-Lattices For An Unrooted Perfect Phylogeny, Monica Grigg Aug 2010

Mathematical Sciences Technical Reports (MSTR)

We look at the Pure Parsimony problem and the Perfect Phylogeny Haplotyping problem. From the Pure Parsimony problem we consider structures of genotypes called g-lattices. These structures either provide solutions or give bounds to the pure parsimony problem. In particular, we investigate which of these structures supports an unrooted perfect phylogeny, a condition that adds biological interpretation. By understanding which g-lattices support an unrooted perfect phylogeny, we connect two of the standard biological inference rules used to recreate how genetic diversity propagates across generations.


A Perturbation Method For Inference On Regularized Regression Estimates, Jessica Minnier, Lu Tian, Tianxi Cai Aug 2010

Harvard University Biostatistics Working Paper Series

No abstract provided.


Incorporating Genomics And Bioinformatics Across The Life Sciences Curriculum, Jayna L. Ditty, Christopher A. Kvaal, Brad Goodner, Sharyn K. Freyermuth, Cheryl Bailey, Robert A. Britton, Stuart G. Gordon, Sabine Heinhorst, Kelyenne Reed, Zhaohui Xu, Erin R. Sanders-Lorenz, Seth Axen, Edwin Kim, Mitrick Johns, Kathleen Scott, Cheryl A. Kerfeld Aug 2010

Faculty Publications

No abstract provided.


A Spectral Approach To Protein Structure Alignment, Yosi Shibberu, Allen Holder Aug 2010

Mathematical Sciences Technical Reports (MSTR)

We present two algorithms that use spectral methods to align protein folds. One of the algorithms is suitable for database searches, the other for difficult alignments. We present computational results for 780 pairwise alignments used to classify 40 proteins as well as results for a separate set of 36 protein alignments used for comparison to four other alignment algorithms. We also provide a mathematically rigorous development of the intrinsic geometry underlying our spectral approach.


Bilinear Programming And Protein Structure Alignment, J. Cain, D. Kamenetsky, N. Lavine Aug 2010

Mathematical Sciences Technical Reports (MSTR)

Proteins are a primary functional component of organic life, and understanding their function is integral to many areas of research in biochemistry. The three-dimensional structure of a protein largely determines this function. Protein structure alignment compares the structure of a protein with known function to that of a protein with unknown function. A protein’s three-dimensional structure can be transformed through a smooth piecewise-linear sigmoid function to a real symmetric contact matrix that represents the functional significance of certain parts of the protein. We address the protein alignment problem as a minimization of the 2-norm difference of two proteins’ contact matrices. …
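The contact-matrix construction described in this abstract can be illustrated with a rough sketch. It maps pairwise residue distances through a piecewise-linear ramp and compares two matrices under a fixed residue correspondence. The thresholds `lo` and `hi`, the ramp shape, the use of the Frobenius norm as the "2-norm", and the identity alignment are all assumptions for illustration; the report itself minimizes over alignments, which this sketch does not attempt.

```python
from math import dist, sqrt

def contact_matrix(coords, lo=4.0, hi=8.0):
    """Symmetric contact matrix from 3-D residue coordinates.

    Distances <= lo map to 1, distances >= hi map to 0, and values in
    between fall off linearly (hypothetical thresholds, in angstroms).
    """
    def ramp(d):
        if d <= lo:
            return 1.0
        if d >= hi:
            return 0.0
        return (hi - d) / (hi - lo)

    return [[ramp(dist(p, q)) for q in coords] for p in coords]

def frobenius_difference(a, b):
    """Frobenius norm of a - b for two equally sized matrices."""
    return sqrt(sum((x - y) ** 2
                    for row_a, row_b in zip(a, b)
                    for x, y in zip(row_a, row_b)))

# Two toy "proteins" with two residues each, compared residue-by-residue.
m1 = contact_matrix([(0.0, 0.0, 0.0), (6.0, 0.0, 0.0)])
m2 = contact_matrix([(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)])
print(frobenius_difference(m1, m2))
```

A full alignment method would search over residue correspondences to minimize this difference rather than fixing the correspondence in advance.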


Cross-Market Model Adaptation With Pairwise Preference Data For Web Search Ranking, Jing Bai, Fernando Diaz, Yi Chang, Zhaohui Zheng, Keke Chen Aug 2010

Kno.e.sis Publications

Machine-learned ranking techniques automatically learn a complex document ranking function given training data. These techniques have demonstrated the effectiveness and flexibility required of a commercial web search. However, manually labeled training data (with multiple absolute grades) has become the bottleneck for training a quality ranking function, particularly for a new domain. In this paper, we explore the adaptation of machine-learned ranking models across a set of geographically diverse markets with the market-specific pairwise preference data, which can be easily obtained from clickthrough logs. We propose a novel adaptation algorithm, Pairwise-Trada, which is able to adapt ranking models that are trained …


Pattern Space Maintenance For Data Updates And Interactive Mining, Mengling Feng, Guozhu Dong, Jinyan Li, Yap-Peng Tan, Limsoon Wong Aug 2010

Kno.e.sis Publications

This article addresses the incremental and decremental maintenance of the frequent pattern space. We conduct an in-depth investigation on how the frequent pattern space evolves under both incremental and decremental updates. Based on the evolution analysis, a new data structure, Generator-Enumeration Tree (GE-tree), is developed to facilitate the maintenance of the frequent pattern space. With the concept of GE-tree, we propose two novel algorithms, Pattern Space Maintainer+ (PSM+) and Pattern Space Maintainer− (PSM−), for the incremental and decremental maintenance of frequent patterns. Experimental results demonstrate that the proposed algorithms, on average, outperform the representative state-of-the-art …


Comparative Functional Genomic Study Of Substrate Specificity Evolution Of The SABATH Family Of Methyltransferases In Plants, Nan Zhao, Jean-Luc Ferrer, Xiaofeng Zhuang, Feng Chen Jul 2010

Plant Sciences Publications and Other Works

Background

The plant SABATH protein family is composed of a group of related small molecule methyltransferases (MTs) that catalyze the S-adenosyl-L-methionine dependent methylation of a variety of plant small molecular weight metabolites encompassing widely divergent structures. Some of these substrates are important plant hormones and signaling molecules, such as indole-3-acetic acid (IAA), jasmonic acid (JA) and salicylic acid (SA). Methylating these compounds may have important impacts on plant growth and development. In the previous paper, we presented Indole-3-acetic acid (IAA) methyltransferase (IAMT) as an evolutionarily ancient member of the SABATH family in higher plants. Whether the IAMT exists in less …


Graph Algorithms For Machine Learning: A Case-Control Study Based On Prostate Cancer Populations And High Throughput Transcriptomic Data, Gary L. Rogers, Pablo Moscato, Michael A. Langston Jul 2010

Faculty Publications and Other Works -- EECS

Background

The continuing proliferation of high-throughput biological data promises to revolutionize personalized medicine. Confirming the presence or absence of disease is an important goal. In this study, we seek to identify genes, gene products and biological pathways that are crucial to human health, with prostate cancer chosen as the target disease.

Materials and methods

Using case-control transcriptomic data, we devise a graph theoretical toolkit for this task. It employs both innovative algorithms and novel two-way correlations to pinpoint putative biomarkers that classify unknown samples as cancerous or normal.

Results and conclusion

Observed accuracy on real data suggests that we are …


Serendipitous Discoveries In Microarray Analysis, Sally R. Ellingson, Charles A. Phillips, Randy Glenn, Douglas Swanson, Thomas Ha, Daniel Goldowitz, Michael A. Langston Jul 2010

Faculty Publications and Other Works -- EECS

Background

Scientists are capable of performing very large scale gene expression experiments with current microarray technologies. In order to find significance in the expression data, it is common to use clustering algorithms to group genes with similar expression patterns. Clusters will often contain related genes, such as co-regulated genes or genes in the same biological pathway. It is too expensive and time consuming to test all of the relationships found in large scale microarray experiments. There are many bioinformatics tools that can be used to infer the significance of microarray experiments and cluster analysis.

Materials and methods

In this project …