Open Access. Powered by Scholars. Published by Universities.®

Bioinformatics Commons

2007

Articles 1 - 30 of 114

Full-Text Articles in Bioinformatics

Analyzing In Situ Gene Expression In The Mouse Brain With Image Registration, Feature Extraction And Block Clustering, Manjunatha Jagalur, Chris Pal, Erik G. Learned-Miller, R. Thomas Zoeller, David Kulp Dec 2007

Erik G Learned-Miller

Background: Many important high throughput projects use in situ hybridization and may require the analysis of images of spatial cross sections of organisms taken with cellular level resolution. Projects creating gene expression atlases at unprecedented scales for the embryonic fruit fly as well as the embryonic and adult mouse already involve the analysis of hundreds of thousands of high resolution experimental images mapping mRNA expression patterns. Challenges include accurate registration of highly deformed tissues, associating cells with known anatomical regions, and identifying groups of genes whose expression is coordinately regulated with respect to both concentration and spatial location. Solutions to …


Network-Constrained Regularization And Variable Selection For Analysis Of Genomic Data, Caiyan Li, Hongzhe Li Dec 2007

UPenn Biostatistics Working Papers

Graphs or networks are common ways of depicting information. In biology in particular, many different biological processes are represented by graphs, such as regulatory networks or metabolic pathways. This kind of a priori information gathered over many years of biomedical research is a useful supplement to the standard numerical genomic data such as microarray gene expression data. How to incorporate information encoded by the known biological networks or graphs into analysis of numerical data raises interesting statistical challenges. In this paper, we introduce a network-constrained regularization procedure for linear regression analysis in order to incorporate the information from these …


Vertex Clustering In Random Graphs Via Reversible Jump Markov Chain Monte Carlo, Stefano Monni, Hongzhe Li Dec 2007

UPenn Biostatistics Working Papers

Networks are a natural and effective tool to study relational data, in which observations are collected on pairs of units. The units are represented by nodes and their relations by edges. In biology, for example, proteins and their interactions, and, in social science, people and inter-personal relations may be the nodes and the edges of the network. In this paper we address the question of clustering vertices in networks, as a way to uncover homogeneity patterns in data that enjoy a network representation. We use a mixture model for random graphs and propose a reversible jump Markov chain Monte Carlo …


Semantic Web For Health Care And Biomedical Informatics, Amit P. Sheth Dec 2007

Kno.e.sis Publications

No abstract provided.


Towards Tractable Local Closed World Reasoning For The Semantic Web, Matthias Knorr, Jose Julio Alferes, Pascal Hitzler Dec 2007

Computer Science and Engineering Faculty Publications

Recently, the logic of minimal knowledge and negation as failure (MKNF) [12] was used to introduce hybrid MKNF knowledge bases [14], a powerful formalism for combining open and closed world reasoning for the Semantic Web. We present an extension based on a new three-valued framework including an alternating fixpoint, the well-founded MKNF model. This approach, the well-founded MKNF semantics, derives its name from the very close relation to the corresponding semantics known from logic programming. We show that the well-founded MKNF model is the least model among all (three-valued) MKNF models, thus soundly approximating also the two-valued MKNF models from …


Video On The Semantic Sensor Web, Cory Andrew Henson, Amit P. Sheth, Prateek Jain, Josh Pschorr, Terry Rapoch Dec 2007

Kno.e.sis Publications

Millions of sensors around the globe currently collect avalanches of data about our world. The rapid development and deployment of sensor technology is intensifying the existing problem of too much data and not enough knowledge. With a view to alleviating this glut, we propose that sensor data, especially video sensor data, can be annotated with semantic metadata to provide contextual information about videos on the Web. In particular, we present an approach to annotating video sensor data with spatial, temporal, and thematic semantic metadata. This technique builds on current standardization efforts within the W3C and Open Geospatial Consortium (OGC) and …


A General Boosting Method And Its Application To Learning Ranking Functions For Web Search, Zhaohui Zheng, Hongyuan Zha, Tong Zhang, Olivier Chapelle, Keke Chen, Gordon Sun Dec 2007

Kno.e.sis Publications

We present a general boosting method extending functional gradient boosting to optimize complex loss functions that are encountered in many machine learning problems. Our approach is based on optimization of quadratic upper bounds of the loss functions which allows us to present a rigorous convergence analysis of the algorithm. More importantly, this general framework enables us to use a standard regression base learner such as decision trees for fitting any loss function. We illustrate an application of the proposed method in learning ranking functions for Web search by combining both preference data and labeled data for training. We present experimental …


NFU-Enabled FASTA: Moving Bioinformatics Applications Onto Wide Area Networks, Erich J. Baker, Guan N. Lin, Huadong Liu, Ravi Kosuri Nov 2007

Faculty Publications and Other Works -- General Biology

Abstract

Background

Advances in Internet technologies have allowed life science researchers to reach beyond the lab-centric research paradigm to create distributed collaborations. Of the existing technologies that support distributed collaborations, there are currently none that simultaneously support data storage and computation as a shared network resource, enabling computational burden to be wholly removed from participating clients. Software using computation-enabled logistical networking components of the Internet Backplane Protocol provides a suitable means to accomplish these tasks. Here, we demonstrate software that enables this approach by distributing both the FASTA algorithm and appropriate data sets within the framework of a wide area …


A Bayesian Model For Cross-Study Differential Gene Expression, Robert B. Scharpf, Hakon Tjelemeland, Giovanni Parmigiani, Andrew B. Nobel Nov 2007

Johns Hopkins University, Dept. of Biostatistics Working Papers

In this paper we define a hierarchical Bayesian model for microarray expression data collected from several studies and use it to identify genes that show differential expression between two conditions. Key features include shrinkage across both genes and studies, and flexible modeling that allows for interactions between platforms and the estimated effect, and for both concordant and discordant differential expression across studies. We evaluated the performance of our model in a comprehensive fashion, using both artificial data and a "split-sample" validation approach that provides an agnostic assessment of the model's behavior not only under the null hypothesis but also under a …


Leveraging Semantic Web Techniques To Gain Situational Awareness, Amit P. Sheth Nov 2007

Kno.e.sis Publications

No abstract provided.


Assessing Population Level Genetic Instability Via Moving Average, Samuel McDaniel, Rebecca Betensky, Tianxi Cai Nov 2007

Harvard University Biostatistics Working Paper Series

No abstract provided.


Statistical Tools For Transgene Copy Number Estimation Based On Real-Time PCR, Joshua S. Yuan, Jason N Burris, Nathan R. Stewart, Ayalew Mentewab, C. Neal Stewart Nov 2007

Faculty Publications and Other Works -- General Biology

Background

Compared with traditional transgene copy number detection technologies such as Southern blot analysis, real-time PCR provides a fast, inexpensive and high-throughput alternative. However, real-time PCR-based transgene copy number estimation tends to be ambiguous and subjective, stemming from the lack of proper statistical analysis and data quality control to render a reliable estimation of copy number with a prediction value. Despite recent progress in the statistical analysis of real-time PCR, few publications have integrated these advancements into real-time PCR-based transgene copy number determination.

Results

Three experimental designs and four data quality control integrated statistical models are …


Cloning, Analysis And Functional Annotation Of Expressed Sequence Tags From The Earthworm Eisenia Fetida, Mehdi Pirooznia, Ping Gong, Xin Guan, Laura S. Inouye, Kuan Yang, Edward J. Perkins, Youping Deng Nov 2007

Faculty Publications

Background

Eisenia fetida, commonly known as red wiggler or compost worm, belongs to the Lumbricidae family of the Annelida phylum. Little is known about its genome sequence although it has been extensively used as a test organism in terrestrial ecotoxicology. In order to understand its gene expression response to environmental contaminants, we cloned 4032 cDNAs or expressed sequence tags (ESTs) from two E. fetida libraries enriched with genes responsive to ten ordnance related compounds using suppressive subtractive hybridization-PCR.

Results

A total of 3144 good quality ESTs (GenBank dbEST accession number EH669363–EH672369 and EL515444–EL515580) were obtained from the raw clone …


Can Semantic Web Techniques Empower Comprehension And Projection In Cyber Situational Awareness?, Amit P. Sheth Nov 2007

Kno.e.sis Publications

No abstract provided.


Semantic Convergence Of Wikipedia Articles, Christopher J. Thomas, Amit P. Sheth Nov 2007

Kno.e.sis Publications

Social networking, distributed problem solving and human computation have gained high visibility. Wikipedia is a well established service that incorporates aspects of these three fields of research. For this reason it is a good object of study for determining quality of solutions in a social setting that is open, completely distributed, bottom up and not peer reviewed by certified experts. In particular, this paper aims at identifying semantic convergence of Wikipedia articles: the notion that the content of an article stays stable regardless of continuing edits. This could lead to an automatic recommendation of good article tags but also add …


Supporting Complex Thematic, Spatial And Temporal Queries Over Semantic Web Data, Matthew Perry, Amit P. Sheth, Farshad Hakimpour, Prateek Jain Nov 2007

Kno.e.sis Publications

Spatial and temporal data are critical components in many applications. This is especially true in analytical domains such as national security and criminal investigation. Often, the analytical process requires uncovering and analyzing complex thematic relationships between disparate people, places and events. Fundamentally new query operators based on the graph structure of Semantic Web data models, such as semantic associations, are proving useful for this purpose. However, these analysis mechanisms are primarily intended for thematic relationships. In this paper, we describe a framework built around the RDF metadata model for analysis of thematic, spatial and temporal relationships between named entities. We …


Conjunctive Queries For A Tractable Fragment Of OWL 1.1, Markus Krotzsch, Sebastian Rudolph, Pascal Hitzler Nov 2007

Computer Science and Engineering Faculty Publications

Despite the success of the Web Ontology Language OWL, the development of expressive means for querying OWL knowledge bases is still an open issue. In this paper, we investigate how a very natural and desirable form of queries, namely conjunctive ones, can be used in conjunction with OWL such that one of the major design criteria of the latter, namely decidability, can be retained. More precisely, we show that querying the tractable fragment EL++ of OWL 1.1 is decidable. We also provide a complexity analysis and show that querying unrestricted EL++ is undecidable.


Comparison Of Probabilistic Boolean Network And Dynamic Bayesian Network Approaches For Inferring Gene Regulatory Networks, Peng Li, Chaoyang Zhang, Edward J. Perkins, Ping Gong, Youping Deng Nov 2007

Faculty Publications

Background: The regulation of gene expression is achieved through gene regulatory networks (GRNs) in which collections of genes interact with one another and other substances in a cell. In order to understand the underlying function of organisms, it is necessary to study the behavior of genes in a gene regulatory network context. Several computational approaches are available for modeling gene regulatory networks with different datasets. In order to optimize modeling of GRN, these approaches must be compared and evaluated in terms of accuracy and efficiency.

Results: In this paper, two important computational approaches for modeling gene regulatory networks, …


Modeling SAGE Tag Formation And Its Effects On Data Interpretation Within A Bayesian Framework, Michael A. Gilchrist, Hong Qin, Russell Zaretzki Oct 2007

Faculty Publications and Other Works -- General Biology

Abstract

Background

Serial Analysis of Gene Expression (SAGE) is a high-throughput method for inferring mRNA expression levels from the experimentally generated sequence based tags. Standard analyses of SAGE data, however, ignore the fact that the probability of generating an observable tag varies across genes and between experiments. As a consequence, these analyses result in biased estimators and posterior probability intervals for gene expression levels in the transcriptome.

Results

Using the yeast Saccharomyces cerevisiae as an example, we introduce a new Bayesian method of data analysis which is based on a model of SAGE tag formation. Our approach incorporates the variation …


Statistical Methods For The Analysis Of Cancer Genome Sequencing Data, Giovanni Parmigiani, J. Lin, Simina Boca, T. Sjoblom, K.W. Kinzler, V.E. Velculescu, B. Vogelstein Oct 2007

Johns Hopkins University, Dept. of Biostatistics Working Papers

The purpose of cancer genome sequencing studies is to determine the nature and types of alterations present in a typical cancer and to discover genes mutated at high frequencies. In this article we discuss statistical methods for the analysis of data generated in these studies. We place special emphasis on a two-stage study design introduced by Sjoblom et al. [1]. In this context, we describe statistical methods for constructing scores that can be used to prioritize candidate genes for further investigation and to assess the statistical significance of the candidates thus identified.


A Proposed Statistical Protocol For The Analysis Of Metabolic Toxicological Data Derived From NMR Spectroscopy, Benjamin J. Kelly, Paul E. Anderson, Nicholas V. Reo, Nicholas J. Delraso, Travis E. Doom, Michael L. Raymer Oct 2007

Kno.e.sis Publications

Nuclear magnetic resonance (NMR) spectroscopy is a non-invasive method of acquiring a metabolic profile from biofluids. This metabolic information may provide keys to the early detection of exposure to a toxin. A typical NMR toxicology data set has low sample size and high dimensionality. Thus, traditional pattern recognition techniques are not always feasible. In this paper, we evaluate several common alternatives for isolating these biomarkers. The fold test, unpaired t-test, and paired t-test were performed on an NMR-derived toxicological data set and results were compared. The paired t-test method was preferred, due to its ability to attribute statistical significance, to …
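As a rough illustration of the three tests named in this abstract, the sketch below computes a fold change, an unpaired t statistic, and a paired t statistic for one spectral bin. It is not the authors' protocol: the intensity values are made up, the function names are hypothetical, and the unpaired variant shown is assumed to be Welch's form, which the abstract does not specify.

```python
import math
import statistics


def fold_change(control, treated):
    """Ratio of the mean treated intensity to the mean control intensity."""
    return statistics.mean(treated) / statistics.mean(control)


def unpaired_t(control, treated):
    """Welch's unpaired t statistic (assumed variant; unequal variances allowed)."""
    n1, n2 = len(control), len(treated)
    v1, v2 = statistics.variance(control), statistics.variance(treated)
    diff = statistics.mean(treated) - statistics.mean(control)
    return diff / math.sqrt(v1 / n1 + v2 / n2)


def paired_t(control, treated):
    """Paired t statistic computed on per-subject differences."""
    diffs = [t - c for c, t in zip(control, treated)]
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(len(diffs)))


# Hypothetical intensities for one spectral bin, measured on the same
# five subjects before (pre) and after (post) exposure.
pre = [10.0, 14.0, 18.0, 22.0, 26.0]
post = [11.0, 15.5, 19.0, 23.5, 27.0]

print("fold change:", fold_change(pre, post))
print("unpaired t: ", unpaired_t(pre, post))
print("paired t:   ", paired_t(pre, post))
```

With these made-up numbers the between-subject spread is large relative to the shift, so the unpaired statistic is small while the paired statistic, which removes per-subject baselines, is much larger. This is one plausible reason a paired design helps at low sample sizes.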


A Multi-Objective Genetic Algorithm That Employs A Hybrid Approach For Isolating Codon Usage Bias Indicative Of Translational Efficiency, Douglas W. Raiford, Dan E. Krane, Travis E. Doom, Michael L. Raymer Oct 2007

Kno.e.sis Publications

Isolation of translational efficiency bias can have important applications in gene expression prediction and heterologous protein production. In some genomes the presence of a high GC(AT)-content bias can confound the isolation of translational efficiency bias. In other organisms translational efficiency bias is weak, making it difficult to isolate. Described here is a multi-objective genetic algorithm that improves the isolation of translational efficiency bias in Streptomyces coelicolor A3(2) and Pseudomonas aeruginosa PAO1, two organisms shown to have high GC-content and weak translational efficiency bias.


Swashup: Situational Web Applications Mashups, E. Michael Maximilien, Ajith Harshana Ranabahu, Stefan Tai Oct 2007

Kno.e.sis Publications

Distributed programming has shifted from private networks to the Internet using heterogeneous Web APIs. This enables the creation of situational applications of composed services exposing user interfaces, i.e., mashups. However, this programmable Web lacks unified models that can facilitate mashup creation, reuse, and deployments. This poster demonstrates a platform to facilitate Web 2.0 mashups.


Realizing The Relationship Web: Morphing Information Access On The Web From Today's Document- And Entity-Centric Paradigm To A Relationship-Centric Paradigm, Amit P. Sheth Sep 2007

Kno.e.sis Publications

No abstract provided.


Comparing Disjunctive Well-Founded Semantics, Matthias Knorr, Pascal Hitzler Sep 2007

Computer Science and Engineering Faculty Publications

While the stable model semantics, in the form of Answer Set Programming, has become a successful semantics for disjunctive logic programs, a corresponding satisfactory extension of the well-founded semantics to disjunctive programs remains to be found. The many current proposals for such an extension are so diverse that even a systematic comparison between them is a challenging task. In order to aid the quest for suitable disjunctive well-founded semantics, we present a systematic approach to a comparison based on level mappings, a recently introduced framework for characterizing logic programming semantics, which was quite successfully used for comparing the major semantics …


Description Logic Programs: Normal Forms, Pascal Hitzler, Andreas Eberhart Sep 2007

Computer Science and Engineering Faculty Publications

The relationship and possible interplay between different knowledge representation and reasoning paradigms is a fundamental topic in artificial intelligence. For expressive knowledge representation for the Semantic Web, two different paradigms - namely Description Logics (DLs) and Logic Programming - are the two most successful approaches. A study of their exact relationships is thus paramount. An intersection of OWL with (function-free non-disjunctive) Datalog, called DLP (for Description Logic Programs), has been described in [1,2]. We provide normal forms for DLP in Description Logic syntax and in Datalog syntax, thus providing a bridge for the researcher and user who is familiar with …


SA-REST And (S)Mashups: Adding Semantics To RESTful Services, Jonathan Lathem, Karthik Gomadam, Amit P. Sheth Sep 2007

Kno.e.sis Publications

The evolution of the Web 2.0 phenomenon has led to the increased adoption of the RESTful services paradigm. RESTful services often take the form of RSS/Atom feeds and AJAX-based lightweight services. The XML-based messaging paradigm of RESTful services has made it possible to compose various services together. Such compositions of RESTful services are widely referred to as mashups. In this paper, we outline the limitations in current approaches to creating mashups. We address these limitations by proposing a framework called SA-REST. SA-REST adds semantics to RESTful services. Our proposed framework builds upon the original ideas in …


The Programmable Web: Agile, Social, And Grassroots Computing, E. Michael Maximilien, Ajith Harshana Ranabahu Sep 2007

Kno.e.sis Publications

Web services, the semantic Web, and Web 2.0 are three somewhat separate movements trying to make the Web a programmable substrate. While each has achieved some level of success in its own right, it is becoming apparent that the grassroots approach of Web 2.0 is gaining greater success than the other two. In this paper we analyze each movement, briefly describing its main traits and outlining its primary assumptions. We then frame the common problem of achieving a programmable Web within the context of distributed computing and software engineering and then attempt to show why Web 2.0 is closest …


Any-World Access To OWL From Prolog, Tobias Matzner, Pascal Hitzler Sep 2007

Computer Science and Engineering Faculty Publications

The W3C standard OWL provides a decidable language for representing ontologies. While its use is rapidly spreading, efforts are being made by researchers worldwide to augment OWL with additional expressive features or to interlace it with other forms of knowledge representation, in order to make it applicable for even further purposes. In this paper, we integrate OWL with one of the most successful and most widely used forms of knowledge representation, namely Prolog, and present a hybrid approach which layers Prolog on top of OWL in such a way that the open-world semantics of OWL becomes directly accessible within the …


Sensor Data Management, Cory Andrew Henson Aug 2007

Kno.e.sis Publications

No abstract provided.