Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Bioinformatics

Full-Text Articles in Physical Sciences and Mathematics

Minimum Description Length Measures Of Evidence For Enrichment, Zhenyu Yang, David R. Bickel Dec 2010

COBRA Preprint Series

In order to functionally interpret differentially expressed genes or other discovered features, researchers seek to detect enrichment in the form of overrepresentation of discovered features associated with a biological process. Most enrichment methods treat the p-value as the measure of evidence using a statistical test such as the binomial test, Fisher's exact test or the hypergeometric test. However, the p-value is not interpretable as a measure of evidence apart from adjustments in light of the sample size. As a measure of evidence supporting one hypothesis over the other, the Bayes factor (BF) overcomes this drawback of the p-value but lacks …
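For readers unfamiliar with the tests named in this abstract, here is a minimal sketch of a one-sided hypergeometric enrichment p-value. It illustrates the conventional p-value approach that the abstract contrasts against, not the authors' minimum description length method; the function name, argument names, and toy counts are assumptions made for illustration.

```python
from math import comb


def hypergeom_enrichment_pvalue(population, annotated, drawn, overlap):
    """Return P(X >= overlap) under sampling without replacement.

    population: total number of features (e.g. genes) considered
    annotated:  features in the population associated with the process
    drawn:      number of discovered features (e.g. differentially expressed)
    overlap:    discovered features that are also annotated
    """
    upper = min(annotated, drawn)
    total = comb(population, drawn)
    # Sum the upper tail of the hypergeometric distribution.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, drawn - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Toy counts (hypothetical): all 5 discovered features fall in a
# 5-member category drawn from a population of 10.
p = hypergeom_enrichment_pvalue(population=10, annotated=5, drawn=5, overlap=5)
print(p)  # 1/252, roughly 0.004
```

A small p-value indicates that so large an overlap would be unlikely under random sampling; as the abstract notes, that value alone is not interpretable as a measure of evidence without accounting for sample size.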


Reconstructability Analysis Of Epistasis, Martin Zwick Dec 2010

Systems Science Faculty Publications and Presentations

The literature on epistasis describes various methods to detect epistatic interactions and to classify different types of epistasis. Reconstructability analysis (RA) has recently been used to detect epistasis in genomic data. This paper shows that RA offers a classification of types of epistasis at three levels of resolution (variable-based models without loops, variable-based models with loops, state-based models). These types can be defined by the simplest RA structures that model the data without information loss; a more detailed classification can be defined by the information content of multiple candidate structures. The RA classification can be augmented with structures from related …


Spatial Semantics For Better Interoperability And Analysis: Challenges And Experiences In Building Semantically Rich Applications In Web 3.0, Amit P. Sheth Dec 2010

Kno.e.sis Publications

No abstract provided.


Flexible Bootstrapping-Based Ontology Alignment, Prateek Jain, Pascal Hitzler, Amit P. Sheth Nov 2010

Kno.e.sis Publications

BLOOMS (Jain et al., ISWC 2010) is an ontology alignment system that, at its core, uses the Wikipedia category hierarchy to establish alignments. In this paper, we present a plug-and-play extension to BLOOMS that allows the use of Wikipedia to be flexibly replaced or complemented by other online or offline resources, including domain-specific ontologies or taxonomies. By making use of automated translation services and of Wikipedia in languages other than English, the extension makes it possible to apply BLOOMS to alignment tasks where the input ontologies are written in different languages.


Ontology Alignment For Linked Open Data, Prateek Jain, Pascal Hitzler, Amit P. Sheth, Kunal Verma, Peter Z. Yeh Nov 2010

Kno.e.sis Publications

The Web of Data currently coming into existence through the Linked Open Data (LOD) effort is a major milestone in realizing the Semantic Web vision. However, the development of applications based on LOD faces difficulties due to the fact that the different LOD datasets are rather loosely connected pieces of information. In particular, links between LOD datasets are almost exclusively on the level of instances, and schema-level information is being ignored. In this paper, we therefore present a system for finding schema-level links between LOD datasets in the sense of ontology alignment. Our system, called BLOOMS, is based on the …


(1E,3E)-1,4-Bis(4-Methoxyphenyl)Buta-1,3-Diene, Gopinathan Narayan, Nigam Rath, Suresh Das Oct 2010

Chemistry & Biochemistry Faculty Works

The title compound, C18H18O2, which exhibits blue emission in the solid state, is an intermediate in the preparation of liquid crystals and polymers. The molecule is located on an inversion centre. In the crystal, molecules are arranged in a herringbone motif.


2,2′,5,5′-Tetrachlorobenzidine, Onome Ugono, Marcel Douglas, Nigam Rath, Alicia Beatty Sep 2010

Chemistry & Biochemistry Faculty Works

In the crystal structure of the title compound, C12H8Cl4N2, molecules lie on crystallographic twofold axes at the centre of the C-C bonds linking the benzene rings, such that the asymmetric unit consists of a half-molecule. The individual molecules participate in intermolecular N-H...N, N-H...Cl, C-H...Cl and Cl...Cl [3.4503 (3) Å] interactions.


Bioinformatics Across The Sciences, Nigel Yarlett Sep 2010

Cornerstone 3 Reports : Interdisciplinary Informatics

No abstract provided.


A Taxonomy-Based Model For Expertise Extrapolation, Delroy H. Cameron, Boanerges Aleman-Meza, Ismailcem Budak Arpinar, Sheron L. Decker, Amit P. Sheth Sep 2010

Kno.e.sis Publications

While many ExpertFinder applications succeed in finding experts, their techniques are not always designed to capture the various levels at which expertise can be expressed. Indeed, expertise can be inferred from relationships between topics and subtopics in a taxonomy. The conventional wisdom is that expertise in subtopics is indicative of expertise in higher-level topics as well. The enrichment of Expertise Profiles for finding experts can therefore be facilitated by taking domain hierarchies into account. We present a novel semantics-based model for finding experts, expertise levels and collaboration levels in a peer review context, such as composing a Program …


Ranking Documents Semantically Using Ontological Relationships, Boanerges Aleman-Meza, I. Budak Arpinar, Mustafa V. Nural, Amit P. Sheth Sep 2010

Kno.e.sis Publications

Despite the arguable success of today’s keyword-based search engines in certain information retrieval tasks, ranking search results in a meaningful way remains an open problem. In this work, the goal is to use semantic relationships for ranking documents without relying on the existence of any specific structure in a document or links between documents. Instead, real-world entities are identified and the relevance of documents is determined using relationships that are known to exist between the entities in a populated ontology. We introduce a measure of relevance that is based on traversal and the semantics of relationships that link entities …


G-Lattices For An Unrooted Perfect Phylogeny, Monica Grigg Aug 2010

Mathematical Sciences Technical Reports (MSTR)

We look at the Pure Parsimony problem and the Perfect Phylogeny Haplotyping problem. From the Pure Parsimony problem we consider structures of genotypes called g-lattices. These structures either provide solutions or give bounds to the pure parsimony problem. In particular, we investigate which of these structures supports an unrooted perfect phylogeny, a condition that adds biological interpretation. By understanding which g-lattices support an unrooted perfect phylogeny, we connect two of the standard biological inference rules used to recreate how genetic diversity propagates across generations.


A Perturbation Method For Inference On Regularized Regression Estimates, Jessica Minnier, Lu Tian, Tianxi Cai Aug 2010

Harvard University Biostatistics Working Paper Series

No abstract provided.


A Spectral Approach To Protein Structure Alignment, Yosi Shibberu, Allen Holder Aug 2010

Mathematical Sciences Technical Reports (MSTR)

We present two algorithms that use spectral methods to align protein folds. One of the algorithms is suitable for database searches, the other for difficult alignments. We present computational results for 780 pairwise alignments used to classify 40 proteins as well as results for a separate set of 36 protein alignments used for comparison to four other alignment algorithms. We also provide a mathematically rigorous development of the intrinsic geometry underlying our spectral approach.


Bilinear Programming And Protein Structure Alignment, J. Cain, D. Kamenetsky, N. Lavine Aug 2010

Mathematical Sciences Technical Reports (MSTR)

Proteins are a primary functional component of organic life, and understanding their function is integral to many areas of research in biochemistry. The three-dimensional structure of a protein largely determines this function. Protein structure alignment compares the structure of a protein with known function to that of a protein with unknown function. A protein’s three-dimensional structure can be transformed through a smooth piecewise-linear sigmoid function to a real symmetric contact matrix that represents the functional significance of certain parts of the protein. We address the protein alignment problem as a minimization of the 2-norm difference of two proteins’ contact matrices. …


Cross-Market Model Adaptation With Pairwise Preference Data For Web Search Ranking, Jing Bai, Fernando Diaz, Yi Chang, Zhaohui Zheng, Keke Chen Aug 2010

Kno.e.sis Publications

Machine-learned ranking techniques automatically learn a complex document ranking function given training data. These techniques have demonstrated the effectiveness and flexibility required of a commercial web search. However, manually labeled training data (with multiple absolute grades) has become the bottleneck for training a quality ranking function, particularly for a new domain. In this paper, we explore the adaptation of machine-learned ranking models across a set of geographically diverse markets with the market-specific pairwise preference data, which can be easily obtained from clickthrough logs. We propose a novel adaptation algorithm, Pairwise-Trada, which is able to adapt ranking models that are trained …


Pattern Space Maintenance For Data Updates And Interactive Mining, Mengling Feng, Guozhu Dong, Jinyan Li, Yap-Peng Tan, Limsoon Wong Aug 2010

Kno.e.sis Publications

This article addresses the incremental and decremental maintenance of the frequent pattern space. We conduct an in-depth investigation on how the frequent pattern space evolves under both incremental and decremental updates. Based on the evolution analysis, a new data structure, Generator-Enumeration Tree (GE-tree), is developed to facilitate the maintenance of the frequent pattern space. With the concept of GE-tree, we propose two novel algorithms, Pattern Space Maintainer+ (PSM+) and Pattern Space Maintainer− (PSM−), for the incremental and decremental maintenance of frequent patterns. Experimental results demonstrate that the proposed algorithms, on average, outperform the representative state-of-the-art …


10302 Summary - Learning Paradigms In Dynamic Environments, Barbara Hammer, Pascal Hitzler Jul 2010

Computer Science and Engineering Faculty Publications

The seminar centered around problems which arise in the context of machine learning in dynamic environments. Particular emphasis was put on a couple of specific questions in this context: how to represent and abstract knowledge appropriately to shape the problem of learning in a partially unknown and complex environment and how to combine statistical inference and abstract symbolic representations; how to infer from few data and how to deal with non i.i.d. data, model revision and life-long learning; how to come up with efficient strategies to control realistic environments for which exploration is costly, the dimensionality is high and data …


Biomedical Ontologies For Parasite Research, Vinh Nguyen, Satya S. Sahoo, Priti Parikh, Todd Minning, Brent Weatherly, Flora Logan, Amit P. Sheth, Rick Tarleton Jul 2010

Kno.e.sis Publications

Trypanosoma cruzi is a protozoan parasite that causes Chagas disease or American trypanosomiasis, which is the leading cause of death in Latin America. The primary objective of this study is to create an ontology-driven information infrastructure to support parasite researchers in identifying gene knockout, vaccination, or drug targets for T. cruzi. This involves querying across multiple datasets from diverse sources, such as proteome, pathway, internal lab data, etc. that are often represented in heterogeneous formats. To address this, a multi-ontology parasite knowledge repository (PKR) is being created with an intuitive graphical query interface called Cuebee. The PKR is underpinned by …


Cloud Based Scientific Workflow For NMR Data Analysis, Ashwin Manjunatha, Paul E. Anderson, Satya S. Sahoo, Ajith Harshana Ranabahu, Michael L. Raymer, Amit P. Sheth Jul 2010

Kno.e.sis Publications

This work presents a service-oriented scientific workflow approach to NMR-based metabolomics data analysis. We demonstrate the effectiveness of this approach by implementing several common spectral processing techniques in the cloud using Hadoop, a parallel map-reduce framework.


Trust Model For Semantic Sensor And Social Networks: A Preliminary Report, Pramod Anantharam, Cory Andrew Henson, Krishnaprasad Thirunarayan, Amit P. Sheth Jul 2010

Kno.e.sis Publications

Trust is an amorphous concept that is becoming increasingly important in many domains, such as P2P networks, E-commerce, social networks, and sensor networks. While we all have an intuitive notion of trust, the literature is scattered with a wide assortment of differing definitions and descriptions; often these descriptions are highly dependent on a single domain or application of interest. In addition, they often discuss orthogonal aspects of trust while continuing to use the general term “trust”. In order to make sense of the situation, we have developed an ontology of trust that integrates and relates its various aspects into a …


Sit-To-Stand Detection Using Fuzzy Clustering Techniques, Tanvi Banerjee, James M. Keller, Marjorie Skubic, Carmen Abbott Jul 2010

Kno.e.sis Publications

The ability to rise from a chair is an important parameter to assess the balance deficits of a person. In particular, this can be an indication of risk for falling in elderly persons. Our goal is automated assessment of fall risk using video data. Towards this goal, we present a simple yet effective method of detecting transition, i.e. sit-to-stand and stand-to-sit, from image frames using fuzzy clustering methods on image moments. The technique described in this paper is shown to be robust even in the presence of noise and has been tested on several data sequences using different subjects yielding …


How To Make Linked Data More Than Data, Prateek Jain, Amit P. Sheth, Kunal Verma, Pascal Hitzler, Peter Z. Yeh Jun 2010

Kno.e.sis Publications

The LOD cloud has a potential for applicability in many AI-related tasks, such as open domain question answering, knowledge discovery, and the Semantic Web. An important prerequisite before the LOD cloud can enable these goals is allowing its users (and applications) to effectively pose queries to and retrieve answers from it. However, this prerequisite is still an open problem for the LOD cloud and has restricted it to 'merely more data.' To transform the LOD cloud from 'merely more data' to 'semantically linked data' there are plenty of open issues which should be addressed. We believe this transformation of the …


Semantically Annotated RESTful Services For Large-Scale Metabolomics Data Analysis, Ashwin Manjunatha, Paul E. Anderson, Satya S. Sahoo, Ajith H. Ranabahu, Michael L. Raymer, Amit P. Sheth Jun 2010

Kno.e.sis Publications

No abstract provided.


The Strength Of Statistical Evidence For Composite Hypotheses: Inference To The Best Explanation, David R. Bickel Jun 2010

COBRA Preprint Series

A general function to quantify the weight of evidence in a sample of data for one hypothesis over another is derived from the law of likelihood and from a statistical formalization of inference to the best explanation. For a fixed parameter of interest, the resulting weight of evidence that favors one composite hypothesis over another is the likelihood ratio using the parameter value consistent with each hypothesis that maximizes the likelihood function over the parameter of interest. Since the weight of evidence is generally only known up to a nuisance parameter, it is approximated by replacing the likelihood function with …
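The first part of this abstract, the likelihood ratio maximized over each composite hypothesis, can be sketched for a simple case. The example below assumes binomial data and hypotheses stated as closed intervals for the success probability; the model, the function names, and the numbers are illustrative assumptions, and the sketch does not attempt the nuisance-parameter approximation the abstract goes on to describe.

```python
from math import comb


def binom_likelihood(p, k, n):
    """Likelihood of success probability p given k successes in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def max_likelihood_on_interval(k, n, lo, hi):
    """Maximize the binomial likelihood over p in [lo, hi].

    The likelihood is unimodal with its peak at k/n, so the constrained
    maximizer is k/n clamped to the interval.
    """
    p_hat = min(max(k / n, lo), hi)
    return binom_likelihood(p_hat, k, n)


def weight_of_evidence(k, n, h1, h2):
    """Ratio of maximized likelihoods for interval hypothesis h1 over h2.

    Assumes the maximized likelihood under h2 is nonzero.
    """
    return max_likelihood_on_interval(k, n, *h1) / max_likelihood_on_interval(k, n, *h2)


# Hypothetical data: 7 successes in 10 trials; H1: p >= 0.5, H2: p <= 0.5.
w = weight_of_evidence(7, 10, h1=(0.5, 1.0), h2=(0.0, 0.5))
print(w)  # a value above 1 favors H1 over H2
```

Values above 1 favor the first hypothesis and values below 1 favor the second, in line with the law of likelihood that the abstract cites.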


Janus: From Workflows To Semantic Provenance And Linked Open Data, Paolo Missier, Satya S. Sahoo, Jun Zhao, Carole Goble, Amit P. Sheth Jun 2010

Kno.e.sis Publications

Data provenance graphs are a form of metadata that can be used to establish a variety of properties of data products that undergo sequences of transformations, typically specified as workflows. Their usefulness for answering user provenance queries is limited, however, unless the graphs are enhanced with domain-specific annotations. In this paper we propose a model and architecture for semantic, domain-aware provenance, and demonstrate its usefulness in answering typical user queries. Furthermore, we discuss the additional benefits and the technical implications of publishing provenance graphs as a form of Linked Data. A prototype implementation of the model is available for data produced …


Provenance Management In Parasite Research, Vinh Nguyen, Priti Parikh, Satya S. Sahoo, Amit P. Sheth Jun 2010

Kno.e.sis Publications

The objective of this research is to create a semantic problem solving environment (PSE) for the human parasite Trypanosoma cruzi. As a part of the PSE, we are trying to manage the provenance of the experiment data as it is generated. This requires capturing the provenance, which is often collected through web forms used by biologists to input information about the experiments they conduct. We have created the Parasite Experiment Ontology (PEO), which represents provenance information used in the project. We have modified the back end which processes the data gathered from biologists, generates RDF triples and serializes them into the triple …


Powerful SNP Set Analysis For Case-Control Genome Wide Association Studies, Michael C. Wu, Peter Kraft, Michael P. Epstein, Deanne M. Taylor, Stephen J. Chanock, David J. Hunter, Xihong Lin May 2010

Harvard University Biostatistics Working Paper Series

No abstract provided.


Distance-Based Measures Of Inconsistency And Incoherency For Description Logics, Yue Ma, Pascal Hitzler May 2010

Computer Science and Engineering Faculty Publications

Inconsistency and incoherency are two sorts of erroneous information in a DL ontology which have been widely discussed in ontology-based applications. For example, they have been used to detect modeling errors during ontology construction. To provide more informative metrics which can tell the differences between inconsistent ontologies and between incoherent terminologies, there has been some work on measuring inconsistency of an ontology and on measuring incoherency of a terminology. However, most of these approaches focus on measuring either inconsistency or incoherency, with no clear idea of how to extend them to allow for the other. In this paper, …


Some Trust Issues In Social Networks And Sensor Networks, Krishnaprasad Thirunarayan, Pramod Anantharam, Cory Andrew Henson, Amit P. Sheth May 2010

Kno.e.sis Publications

Trust and reputation are becoming increasingly important in diverse areas such as search, e-commerce, social media, and semantic sensor networks. We review past work and explore future research issues relevant to trust in social/sensor networks and interactions. We advocate a balanced, iterative approach to trust that marries theory and practice. On the theoretical side, we investigate models of trust to analyze and specify the nature of trust and trust computation. On the practical side, we propose to uncover aspects that provide a basis for trust formation and techniques to extract trust information from concrete social/sensor networks and interactions. We …


Linked Sensor Data, Harshal Kamlesh Patni, Cory Andrew Henson, Amit P. Sheth May 2010

Kno.e.sis Publications

A number of government, corporate, and academic organizations are collecting enormous amounts of data provided by environmental sensors. However, this data is too often locked within organizations and underutilized by the greater community. In this paper, we present a framework to make this sensor data openly accessible by publishing it on the Linked Open Data (LOD) Cloud. This is accomplished by converting raw sensor observations to RDF and linking with other datasets on LOD. With such a framework, organizations can make large amounts of sensor data openly accessible, thus allowing greater opportunity for utilization and analysis.