Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Bioinformatics

Articles 631 - 660 of 882

Full-Text Articles in Physical Sciences and Mathematics

Querying Formal Contexts With Answer Set Programs, Pascal Hitzler, Markus Krotzsch Jul 2006

Computer Science and Engineering Faculty Publications

Recent studies showed how a seamless integration of formal concept analysis (FCA), logic of domains, and answer set programming (ASP) can be achieved. Based on these results for combining hierarchical knowledge with classical rule-based formalisms, we introduce an expressive common-sense query language for formal contexts. Although this approach is conceptually based on order-theoretic paradigms, we show how it can be implemented on top of standard ASP systems. Advanced features, such as default negation and disjunctive rules, thus become practically available for processing contextual data.


Geospatial Ontology Development And Semantic Analytics, I. Budak Arpinar, Cartic Ramakrishnan, Molly Azami, Amit P. Sheth, E. Lynn Usery, Mei-Po Kwan Jul 2006

Kno.e.sis Publications

Geospatial ontology development and semantic knowledge discovery addresses the need for modeling, analyzing and visualizing multimodal information, and is unique in offering integrated analytics that encompasses spatial, temporal and thematic dimensions of information and knowledge. The comprehensive ability to provide integrated analysis from multiple forms of information and use of explicit knowledge make this approach unique. This also involves specification of spatiotemporal thematic ontologies and populating such ontologies with high quality knowledge. Such ontologies form the basis for defining the meaning of important relations terms, such as near or surrounded by, and enable computation of spatiotemporal thematic proximity measures we …


Optimal Adaptation Of Web Processes With Inter-Service Dependencies, Kunal Verma, Prashant Doshi, Karthik Gomadam, John A. Miller, Amit P. Sheth Jul 2006

Kno.e.sis Publications

We present methods for optimally adapting Web processes to exogenous events while preserving inter-service dependencies. For example, in a supply chain process, orders placed by the manufacturer may get delayed in arriving. In response to this event, the manufacturer has the choice of either waiting out the delay or changing the supplier. Additionally, there may be compatibility constraints between the different orders, thereby introducing the problem of coordination between them if the manufacturer chooses to change the suppliers. We present our methods within the framework of autonomic Web processes. This framework seeks to add properties of self-configuration, adaptation, and self-optimization …


Bi-Level Clustering Of Mixed Categorical And Numerical Biomedical Data, Bill Andreopoulos, Aijun An, Xiaogang Wang Jun 2006

Faculty Publications, Computer Science

Biomedical data sets often have mixed categorical and numerical types, where the former represent semantic information on the objects and the latter represent experimental results. We present the BILCOM algorithm for "Bi-Level Clustering of Mixed categorical and numerical data types". BILCOM performs a pseudo-Bayesian process, where the prior is categorical clustering. BILCOM partitions biomedical data sets of mixed types, such as hepatitis, thyroid disease and yeast gene expression data with Gene Ontology annotations, more accurately than if using one type alone.


A Metamodel And Uml Profile For Rule-Extended Owl Dl Ontologies, Saartje Brockmans, Peter Haase, Pascal Hitzler, Rudi Studer Jun 2006

Computer Science and Engineering Faculty Publications

In this paper we present a MOF compliant metamodel and UML profile for the Semantic Web Rule Language (SWRL) that integrates with our previous work on a metamodel and UML profile for OWL DL. Based on this metamodel and profile, UML tools can be used for visual modeling of rule-extended ontologies.


Masquerader Detection Using Oclep: One-Class Classification Using Length Statistics Of Emerging Patterns, Lijun Chen, Guozhu Dong Jun 2006

Kno.e.sis Publications

We introduce a new method for masquerader detection that only uses a user’s own data for training, called One-class Classification using Length statistics of Emerging Patterns (OCLEP). Emerging patterns (EPs) are patterns whose support increases from one dataset/class to another with a big ratio, and have been very useful in earlier studies. OCLEP classifies a case T as self or masquerader by using the average length of EPs obtained by contrasting T against sets of samples of a user’s normal data. It is based on the observation that one needs long EPs to differentiate instances from a common class, but …


Semantic Empowerment Of Health Care And Life Science Applications, Amit P. Sheth May 2006

Kno.e.sis Publications

No abstract provided.


Bounded Search For De Novo Identification Of Degenerate Cis-Regulatory Elements, Jonathan M. Carlson, Arijit Chakravarty, Radhika S. Khetani, Robert H. Gross May 2006

Dartmouth Scholarship

The identification of statistically overrepresented sequences in the upstream regions of coregulated genes should theoretically permit the identification of potential cis-regulatory elements. However, in practice many cis-regulatory elements are highly degenerate, precluding the use of an exhaustive word-counting strategy for their identification. While numerous methods exist for inferring base distributions using a position weight matrix, recent studies suggest that the independence assumptions inherent in the model, as well as the inability to reach a global optimum, limit this approach.


Semantic Analytics Visualization, Leonidas Deligiannidis, Amit P. Sheth, Boanerges Aleman-Meza May 2006

Kno.e.sis Publications

In this paper we present a new tool for semantic analytics through 3D visualization called “Semantic Analytics Visualization” (SAV). It has the capability for visualizing ontologies and meta-data including annotated web-documents, images, and digital media such as audio and video clips in a synthetic three-dimensional semi-immersive environment. More importantly, SAV supports visual semantic analytics, whereby an analyst can interactively investigate complex relationships between heterogeneous information. The tool is built using Virtual Reality technology which makes SAV a highly interactive system. The backend of SAV consists of a Semantic Analytics system that supports query processing and semantic association discovery. Using a …


Detecting The Change Of Clustering Structure In Categorical Data Streams, Keke Chen, Ling Liu Apr 2006

Kno.e.sis Publications

Analyzing clustering structures in data streams can provide critical information for making decisions in real time. In this paper, we present a framework for detecting the change of critical clustering structure in categorical data streams. The framework consists of the Hierarchical Entropy Tree structure (HE-Tree) and the extended ACE clustering algorithm. HE-Tree can efficiently capture the entropy property of the categorical data streams and allow us to draw precise clustering information from the data stream for high-quality BkPlots with the extended ACE algorithm.


Ivibrate: Interactive Visualization Based Framework For Clustering Large Datasets, Keke Chen, Ling Liu Apr 2006

Kno.e.sis Publications

With continued advances in communication network technology and sensing technology, there is astounding growth in the amount of data produced and made available through cyberspace. Efficient and high-quality clustering of large datasets continues to be one of the most important problems in large-scale data analysis. A commonly used methodology for cluster analysis on large datasets is the three-phase framework of sampling/summarization, iterative cluster analysis, and disk-labeling. There are three known problems with this framework which demand effective solutions. The first problem is how to effectively define and validate irregularly shaped clusters, especially in large datasets. Automated algorithms and statistical methods …


Semantic Web Applications In Financial Industry, Government, Health Care And Life Sciences, Amit P. Sheth Mar 2006

Kno.e.sis Publications

No abstract provided.


Genome Scanning Methods For Comparing Sequences Between Groups, With Application To Hiv Vaccine Trials, Peter B. Gilbert, Chunyuan Wu, David V. Jobes Mar 2006

UW Biostatistics Working Paper Series

Consider a placebo-controlled preventive HIV vaccine efficacy trial. An HIV amino acid sequence is measured from each volunteer who acquires HIV, and these sequences are aligned together with the reference HIV sequence represented in the vaccine. We develop genome scanning methods to identify HIV positions at which the amino acids in sequences from infected vaccine recipients tend to be more divergent from the corresponding reference amino acid than the amino acids in sequences from infected placebo recipients. We consider five two-sample test statistics, based on Euclidean, Mahalanobis, and Kullback-Leibler divergence measures. Weights are incorporated to reflect biological information contained in …
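
To make the idea of a position-wise divergence scan concrete, the following is a minimal sketch and not the authors' test statistics. It assumes aligned, equal-length sequences, compares the two groups' empirical amino acid distributions at each position with a Kullback-Leibler divergence, and uses a hypothetical pseudocount to avoid zero frequencies. The function names and the pseudocount value are illustrative assumptions.

```python
import math
from collections import Counter


def position_frequencies(sequences, pos, alphabet, pseudocount=0.5):
    """Smoothed residue frequencies at one aligned position.

    The pseudocount is an assumption of this sketch; it keeps every
    frequency strictly positive so the divergence below is finite.
    """
    counts = Counter(seq[pos] for seq in sequences if seq[pos] in alphabet)
    total = sum(counts.values()) + pseudocount * len(alphabet)
    return {a: (counts.get(a, 0) + pseudocount) / total for a in alphabet}


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) for two distributions on the same keys."""
    return sum(p[a] * math.log(p[a] / q[a]) for a in p)


def scan_positions(vaccine, placebo, alphabet):
    """Return one divergence score per aligned position (higher = more different)."""
    length = len(vaccine[0])
    return [
        kl_divergence(
            position_frequencies(vaccine, i, alphabet),
            position_frequencies(placebo, i, alphabet),
        )
        for i in range(length)
    ]


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    vaccine_group = ["AK", "AK", "GK"]
    placebo_group = ["AK", "AK", "AK"]
    print(scan_positions(vaccine_group, placebo_group, alphabet="AGK"))
```

A real analysis would also need the reference sequence, the weighting described in the abstract, and a multiplicity-adjusted significance test, none of which this sketch attempts.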


2^K Factorials In Blocks Of Size 2, With Application To Two-Color Microarray Experiments, Kathleen F. Kerr Mar 2006

UW Biostatistics Working Paper Series

When a two-level design must be run in blocks of size two, there is a unique blocking scheme that enables estimation of all the main effects. Unfortunately this design does not enable estimation of any two-factor interactions. When the experimental goal is to estimate all main effects and two-factor interactions, it is necessary to combine replicates of the experiment that use different blocking schemes. In this paper we identify such designs for up to eight factors that enable estimation of all main effects and two-factor interactions with the fewest number of replications. In addition, we give a construction for general …


Wsdl-S: Specification, Tools, Use Cases And Applications, Amit P. Sheth, Kunal Verma, Karthik Gomadam Mar 2006

Kno.e.sis Publications

No abstract provided.


Gpnn: Power Studies And Applications Of A Neural Network Method For Detecting Gene-Gene Interactions In Studies Of Human Disease, Alison A. Motsinger, Stephen L. Lee, George Mellick, Marylyn D. Ritchie Jan 2006

Dartmouth Scholarship

The identification and characterization of genes that influence the risk of common, complex multifactorial disease primarily through interactions with other genes and environmental factors remains a statistical and computational challenge in genetic epidemiology. We have previously introduced a genetic programming optimized neural network (GPNN) as a method for optimizing the architecture of a neural network to improve the identification of gene combinations associated with disease risk. The goal of this study was to evaluate the power of GPNN for identifying high-order gene-gene interactions. We were also interested in applying GPNN to a real data analysis in Parkinson's disease.


Why Is The Number Of Dna Bases 4?, Bo Deng Jan 2006

Department of Mathematics: Faculty Publications

In this paper we construct a mathematical model for DNA replication based on Shannon’s mathematical theory for communication. We treat DNA replication as a communication channel. We show that the mean replication rate is maximal with four nucleotide bases under the primary assumption that the pairing time of the G–C bases is between 1.65 and 3 times the pairing time of the A–T bases.


An Online Discriminative Approach To Background Subtraction, Li Cheng, Shaojun Wang, Terry Caelli Jan 2006

Kno.e.sis Publications

We present a simple, principled approach to detecting foreground objects in video sequences in real-time. Our method is based on an on-line discriminative learning technique that is able to cope with illumination changes due to discontinuous switching, or illumination drifts caused by slower processes such as varying time of the day. Starting from a discriminative learning principle, we derive a training algorithm that, for each pixel, computes a weighted linear combination of selected past observations with time-decay. We present experimental results that show the proposed approach outperforms existing methods on both synthetic sequences and real video data.


Visual Ontology Modeling For Electronic Markets, Saartje Brockmans, Andreas Geyer-Schulz, Pascal Hitzler, Rudi Studer Jan 2006

Computer Science and Engineering Faculty Publications

The research program, Information Management and Market Engineering, focuses on the analysis and the design of electronic markets. Taking a holistic view of the conceptualization and realization of solutions, the research integrates the disciplines of business administration, economics, computer science, and law. Topics of interest range from the implementation, quality assurance, and further development of electronic markets to their integration into business processes, innovative business models, and legal frameworks.


A Semantic Future For Ai, Rudi Studer, Anupriya Ankolekar, Pascal Hitzler Jan 2006

Computer Science and Engineering Faculty Publications

In our modern information society, people need to manage ever-increasing numbers of personal devices and conduct more of their work and activities online, often making use of heterogeneous services. The amount of information to be processed by each individual is constantly growing, making it increasingly difficult to control, channel, share and make constructive use of it. To mitigate this, computing needs to become much more human-centered, e.g. by presenting personalised information to users and by respecting personal preferences in controlling multiple devices or invoking various services. Appropriate representation of the semantics of the information and functionality of devices and services …


A Novel Approach To Phylogenetic Tree Construction Using Stochastic Optimization And Clustering, Ling Qin, Yixin Chen, Yi Pan, Ling Chen Jan 2006

Computer Science Faculty Publications

Background: The problem of inferring the evolutionary history and constructing the phylogenetic tree with high performance has become one of the major problems in computational biology.

Results: A new phylogenetic tree construction method from a given set of objects (proteins, species, etc.) is presented. As an extension of ant colony optimization, this method proposes an adaptive phylogenetic clustering algorithm based on a digraph to find a tree structure that defines the ancestral relationships among the given objects.

Conclusion: Our phylogenetic tree construction method is tested to compare its results with those of the genetic algorithm (GA). Experimental results show that …


Show Me What You Mean! Exploiting Domain Semantics In Ontology Visualization, Ravi Pavagada, Christopher Thomas, Amit P. Sheth, William S. York Jan 2006

Kno.e.sis Publications

Ontologies build the backbone for many life-sciences applications. These ontologies, however, are represented in XML based languages that are meant for machine-consumption and hence are difficult for humans to comprehend. For a meaningful visualization of these ontologies, it is important that the display of entities and relationships captures the cognitive representation of the domain as perceived by the domain experts. In this paper we present OntoVista, an ontology visualization tool that is adaptable to the needs of different domains, especially in the life sciences. While keeping the graph structures as the predominant model, we provide a semantically enhanced graph display …


Taxaminer: Improving Taxonomy Label Quality Using Latent Semantic Indexing, Cartic Ramakrishnan, Christopher Thomas, Vipul Kashyap, Amit P. Sheth Jan 2006

Kno.e.sis Publications

The development of taxonomies/ontologies is a human-intensive process requiring prohibitively large resource commitments in terms of time and cost. In our previous work we have identified an experimentation framework for semi-automatic taxonomy/hierarchy generation from unstructured text. In the preliminary results presented, the taxonomy/hierarchy quality was lower than we had anticipated. In this paper, we present two variations of our experimentation framework, viz. Latent Semantic Indexing (LSI) for document indexing and the use of term vectors to prune labels assigned to nodes in the final taxonomy/hierarchy. Using our previous results of taxonomy/hierarchy quality as the baseline we present results that …


Data Processing In Space, Time, And Semantics Dimensions, Farshad Hakimpour, Boanerges Aleman-Meza, Matthew Perry, Amit P. Sheth Jan 2006

Kno.e.sis Publications

This work presents an experimental system for data processing in space, time and semantics dimensions using current Semantic Web technologies. The paper describes how we obtain geographic and event data from Internet sources and also how we integrate them into an RDF store. We briefly introduce a set of functionalities in space, time and semantics dimensions. These functionalities are implemented based on our existing technology for main-memory based RDF data processing developed in the LSDIS Lab. A number of these functionalities are exposed as REST Web services. We present two sample client side applications that are developed using a combination …


Using Query-Specific Variance Estimates To Combine Bayesian Classifiers, Chi-Hoon Lee, Russell Greiner, Shaojun Wang Jan 2006

Kno.e.sis Publications

Many of today's best classification results are obtained by combining the responses of a set of base classifiers to produce an answer for the query. This paper explores a novel "query specific" combination rule: After learning a set of simple belief network classifiers, we produce an answer to each query by combining their individual responses, using weights based inversely on their respective variances around their responses. These variances are based on the uncertainty of the network parameters, which in turn depend on the training data sample. In essence, this variance quantifies the base classifier's confidence in its response to this query. …
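
The combination rule described here can be illustrated with a generic inverse-variance weighting sketch. This is an assumption-laden illustration and not the paper's estimator: it takes each base classifier's response and variance for a single query as given inputs, and the function name and the small `eps` guard are hypothetical.

```python
def combine_inverse_variance(responses, variances, eps=1e-12):
    """Weighted average of classifier responses with weights proportional to 1/variance.

    responses: per-classifier probability estimates for one query.
    variances: per-classifier variance of that estimate for the same query.
    eps: hypothetical guard against division by zero (an assumption of this sketch).
    """
    if not responses or len(responses) != len(variances):
        raise ValueError("responses and variances must be non-empty and equal in length")
    weights = [1.0 / (v + eps) for v in variances]
    total = sum(weights)
    return sum(w * r for w, r in zip(weights, responses)) / total


if __name__ == "__main__":
    # The confident classifier (low variance) dominates the combined answer.
    print(combine_inverse_variance([0.9, 0.4], [0.001, 0.1]))
```

Because the variances are computed per query, the same set of classifiers can receive different weights on different queries, which is what distinguishes this from a fixed-weight ensemble.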


Semi-Supervised Conditional Random Fields For Improved Sequence Segmentation And Labeling, Feng Jiao, Shaojun Wang, Chi-Hoon Lee, Russell Greiner, Dale Schuurmans Jan 2006

Kno.e.sis Publications

We present a new semi-supervised training procedure for conditional random fields (CRFs) that can be used to train sequence segmentors and labelers from a combination of labeled and unlabeled training data. Our approach is based on extending the minimum entropy regularization framework to the structured prediction case, yielding a training objective that combines unlabeled conditional entropy with labeled conditional likelihood. Although the training objective is no longer concave, it can still be used to improve an initial model (e.g. obtained from supervised training) by iterative ascent. We apply our new training algorithm to the problem of identifying gene and protein …
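
One way to write down an objective of the kind the abstract describes is sketched below. The notation is an assumption made for illustration: $N$ labeled pairs $(x^{(i)}, y^{(i)})$, unlabeled inputs $x^{(N+1)}, \dots, x^{(M)}$, model parameters $\theta$, and a trade-off weight $\gamma \ge 0$. Any additional regularization of $\theta$ is omitted.

```latex
\max_{\theta} \;
\underbrace{\sum_{i=1}^{N} \log p_{\theta}\!\left(y^{(i)} \mid x^{(i)}\right)}_{\text{labeled conditional likelihood}}
\;-\;
\gamma \sum_{i=N+1}^{M}
\underbrace{\left( - \sum_{y} p_{\theta}\!\left(y \mid x^{(i)}\right)
\log p_{\theta}\!\left(y \mid x^{(i)}\right) \right)}_{\text{conditional entropy on unlabeled } x^{(i)}}
```

The first term is concave in $\theta$ for a CRF, while the negative-entropy term is not, which is consistent with the abstract's remark that the combined objective is no longer concave and is therefore improved by iterative ascent from an initial model.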


An Investigation Of Codon Usage Bias Including Visualization And Quantification In Organisms Exhibiting Multiple Biases, Douglas W. Raiford, Travis E. Doom, Dan E. Krane, Michael L. Raymer Jan 2006

Kno.e.sis Publications

Prokaryotic genomic sequence data provides a rich resource for bioinformatic analytic algorithms. Information can be extracted in many ways from the sequence data. One often overlooked process involves investigating an organism’s codon usage. Degeneracy in the genetic code leads to multiple codons coding for the same amino acids. Organisms often preferentially utilize specific codons when coding for an amino acid. This biased codon usage can be a useful trait when predicting a gene’s expressivity or whether the gene originated from horizontal transfer. There can be multiple biases at play in a genome causing errors in the predictive process. For this …


Clustering Similarity Comparison Using Density Profiles, Eric Bae, James Bailey, Guozhu Dong Jan 2006

Kno.e.sis Publications

The unsupervised nature of cluster analysis means that objects can be clustered in many ways, allowing different clustering algorithms to generate vastly different results. To address this, clustering comparison methods have traditionally been used to quantify the degree of similarity between alternative clusterings. However, existing techniques utilize only the point memberships to calculate the similarity, which can lead to unintuitive results. They also cannot be applied to analyze clusterings which only partially share points, which can be the case in stream clustering. In this paper we introduce a new measure named ADCO, which takes into account density profiles for each …


Predicting Domain Specific Entities With Limited Background Knowledge, Christopher Thomas, Amit P. Sheth Jan 2006

Kno.e.sis Publications

This paper proposes a framework for automatic recognition of domain-specific entities from text, given limited background knowledge, e.g. in the form of an ontology. The algorithm exploits several lightweight natural language processing techniques, such as tokenization and stemming, as well as statistical techniques, such as singular value decomposition (SVD), to suggest domain relatedness of unknown entities.


Driving Deep Semantics In Middleware And Networks: What, Why And How?, Amit P. Sheth Jan 2006

Kno.e.sis Publications

No abstract provided.