Open Access. Powered by Scholars. Published by Universities.®

Science and Technology Studies Commons

Articles 1 - 28 of 28

Full-Text Articles in Science and Technology Studies

Implicit Online Learning With Kernels, Li Cheng, S. V. N. Vishwanathan, Dale Schuurmans, Shaojun Wang, Terry Caelli Dec 2006

Kno.e.sis Publications

We present two new algorithms for online learning in reproducing kernel Hilbert spaces. Our first algorithm, ILK (implicit online learning with kernels), employs a new, implicit update technique that can be applied to a wide variety of convex loss functions. We then introduce a bounded memory version, SILK (sparse ILK), that maintains a compact representation of the predictor without compromising solution quality, even in non-stationary environments. We prove loss bounds and analyze the convergence rate of both. Experimental evidence shows that our proposed algorithms outperform current methods on synthetic and real data.


Regression Cubes With Lossless Compression And Aggregation, Yixin Chen, Guozhu Dong, Jiawei Han, Jian Pei, Benjamin W. Wah, Jianyong Wang Dec 2006

Kno.e.sis Publications

As OLAP engines are widely used to support multidimensional data analysis, it is desirable to support in data cubes advanced statistical measures, such as regression and filtering, in addition to the traditional simple measures such as count and average. Such new measures will allow users to model, smooth, and predict the trends and patterns of data. Existing algorithms for simple distributive and algebraic measures are inadequate for efficient computation of statistical measures in a multidimensional space. In this paper, we propose a fundamentally new class of measures, compressible measures, in order to support efficient computation of the statistical models. For ...


Active Semantic Electronic Medical Record, Amit P. Sheth, S. Agrawal, Jonathan Lathem, Nicole Oldham, H. Wingate, K. Gallagher Nov 2006

Kno.e.sis Publications

The healthcare industry is rapidly advancing towards the widespread use of electronic medical records systems to manage the increasingly large amount of patient data and reduce medical errors. In addition to patient data there is a large amount of data describing procedures, treatments, diagnoses, drugs, insurance plans, coverage, formularies and the relationships between these data sets. While practices have benefited from the use of EMRs, infusing these essential programs with rich domain knowledge and rules can greatly enhance their performance and ability to support clinical decisions. The Active Semantic Electronic Medical Record (ASEMR) application discussed here uses Semantic Web technologies to ...


{Ontology: Resource} X {Matching : Mapping} X {Schema : Instance} :: Components Of The Same Challenge, Amit P. Sheth Nov 2006

Kno.e.sis Publications

Ontologies enable us to elevate syntactic and structural processing in an information system/Web to an information system/Web powered with semantic processing. Experience has shown that monolithic and tightly coupled approaches seldom succeed, and the majority of information systems and applications will need to deal with a plurality of ontologies in a loosely coupled environment (i.e., independently evolving ontologies and inter-ontology relationships, existence of different contexts for different users/applications, etc.). Development of such loosely coupled multi-ontology environments entails development of techniques for ontology mapping/alignment, multi-ontology query processing, and much more.


A Framework For Schema-Driven Relationship Discovery From Unstructured Text, Cartic Ramakrishnan, Krzysztof Kochut, Amit P. Sheth Nov 2006

Kno.e.sis Publications

We address the issue of extracting implicit and explicit relationships between entities in biomedical text. We argue that entities seldom occur in text in their simple form and that relationships in text relate the modified, complex forms of entities with each other. We present a rule-based method for (1) extracting such complex entities, (2) extracting the relationships between them, and (3) converting such relationships into RDF. Furthermore, we present results that clearly demonstrate the utility of the generated RDF in discovering knowledge from text corpora by means of locating paths composed of the extracted relationships.


Semantic Interoperability Of Web Services - Challenges And Experiences, Meenakshi Nagarajan, Kunal Verma, Amit P. Sheth, John A. Miller, Jonathan Lathem Sep 2006

Kno.e.sis Publications

With the rising popularity of Web services, both academia and industry have invested considerably in Web service description standards, discovery, and composition techniques. The standards-based approach utilized by Web services has supported interoperability at the syntax level. However, issues of structural and semantic heterogeneity between messages exchanged by Web services are far more complex and crucial to interoperability. It is for these reasons that we recognize the value that schema/data mappings bring to Web service descriptions. In this paper, we examine challenges to interoperability; classify the types of heterogeneities that can occur between interacting services and present a ...


Flexible Querying Of XML Documents, Krishnaprasad Thirunarayan, Trivikram Immaneni Sep 2006

Kno.e.sis Publications

Text search engines are inadequate for indexing and searching XML documents because they ignore metadata and aggregation structure implicit in the XML documents. On the other hand, the query languages supported by specialized XML search engines are very complex. In this paper, we present a simple yet flexible query language, and develop its semantics to enable intuitively appealing extraction of relevant fragments of information while simultaneously falling back on retrieval through plain text search if necessary. We also present a simple yet robust relevance ranking for heterogeneous document-centric XML.


Optimal Adaptation In Web Processes With Coordination Constraints, Kunal Verma, Prashant Doshi, Karthik Gomadam, John A. Miller, Amit P. Sheth Sep 2006

Kno.e.sis Publications

We present methods for optimally adapting Web processes to exogenous events while preserving inter-service constraints that necessitate coordination. For example, in a supply chain process, orders placed by a manufacturer may get delayed in arriving. In response to this event, the manufacturer has the choice of either waiting out the delay or changing the supplier. Additionally, there may be compatibility constraints between the different orders, thereby introducing the problem of coordination between them if the manufacturer chooses to change the suppliers. We focus on formulating the decision making models of the managers, who must adapt to external events while satisfying ...


Optimal Adaptation Of Web Processes With Inter-Service Dependencies, Kunal Verma, Prashant Doshi, Karthik Gomadam, John A. Miller, Amit P. Sheth Jul 2006

Kno.e.sis Publications

We present methods for optimally adapting Web processes to exogenous events while preserving inter-service dependencies. For example, in a supply chain process, orders placed by the manufacturer may get delayed in arriving. In response to this event, the manufacturer has the choice of either waiting out the delay or changing the supplier. Additionally, there may be compatibility constraints between the different orders, thereby introducing the problem of coordination between them if the manufacturer chooses to change the suppliers. We present our methods within the framework of autonomic Web processes. This framework seeks to add properties of self-configuration, adaptation, and self-optimization ...


Geospatial Ontology Development And Semantic Analytics, I. Budak Arpinar, Cartic Ramakrishnan, Molly Azami, Amit P. Sheth, E. Lynn Usery, Mei-Po Kwan Jul 2006

Kno.e.sis Publications

Geospatial ontology development and semantic knowledge discovery address the need for modeling, analyzing and visualizing multimodal information, and are unique in offering integrated analytics that encompass spatial, temporal and thematic dimensions of information and knowledge. The comprehensive ability to provide integrated analysis from multiple forms of information and use of explicit knowledge make this approach unique. This also involves specification of spatiotemporal thematic ontologies and populating such ontologies with high quality knowledge. Such ontologies form the basis for defining the meaning of important relation terms, such as near or surrounded by, and enable computation of spatiotemporal thematic proximity measures we ...


Masquerader Detection Using OCLEP: One-Class Classification Using Length Statistics Of Emerging Patterns, Lijun Chen, Guozhu Dong Jun 2006

Kno.e.sis Publications

We introduce a new method for masquerader detection that only uses a user’s own data for training, called One-class Classification using Length statistics of Emerging Patterns (OCLEP). Emerging patterns (EPs) are patterns whose support increases from one dataset/class to another by a large ratio, and have been very useful in earlier studies. OCLEP classifies a case T as self or masquerader by using the average length of EPs obtained by contrasting T against sets of samples of a user’s normal data. It is based on the observation that one needs long EPs to differentiate instances from a ...


Semantic Empowerment Of Health Care And Life Science Applications, Amit P. Sheth May 2006

Kno.e.sis Publications

No abstract provided.


Semantic Analytics Visualization, Leonidas Deligiannidis, Amit P. Sheth, Boanerges Aleman-Meza May 2006

Kno.e.sis Publications

In this paper we present a new tool for semantic analytics through 3D visualization called “Semantic Analytics Visualization” (SAV). It has the capability for visualizing ontologies and meta-data including annotated web-documents, images, and digital media such as audio and video clips in a synthetic three-dimensional semi-immersive environment. More importantly, SAV supports visual semantic analytics, whereby an analyst can interactively investigate complex relationships between heterogeneous information. The tool is built using Virtual Reality technology which makes SAV a highly interactive system. The backend of SAV consists of a Semantic Analytics system that supports query processing and semantic association discovery. Using a ...


Detecting The Change Of Clustering Structure In Categorical Data Streams, Keke Chen, Ling Liu Apr 2006

Kno.e.sis Publications

Analyzing clustering structures in data streams can provide critical information for making decisions in real time. In this paper, we present a framework for detecting the change of critical clustering structure in categorical data streams. The framework consists of the Hierarchical Entropy Tree structure (HE-Tree) and the extended ACE clustering algorithm. HE-Tree can efficiently capture the entropy property of the categorical data streams and allow us to draw precise clustering information from the data stream for high-quality BkPlots with the extended ACE algorithm.
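
As a rough illustration of what an entropy property of categorical data can look like, the sketch below computes the Shannon entropy of each attribute in a small window of categorical records and sums the results. It is not the HE-Tree or the ACE algorithm; the function names `column_entropy` and `dataset_entropy` and the choice of summing per-attribute entropies are assumptions made for this example only.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def column_entropy(values: Sequence[Hashable]) -> float:
    """Shannon entropy (in bits) of one categorical attribute."""
    n = len(values)
    if n == 0:
        return 0.0
    counts = Counter(values)
    # Sum p * log2(1/p) over the observed categories.
    return sum((c / n) * math.log2(n / c) for c in counts.values())


def dataset_entropy(rows: Sequence[Sequence[Hashable]]) -> float:
    """Sum of per-attribute entropies over a window of records (illustrative choice)."""
    if not rows:
        return 0.0
    # zip(*rows) transposes the records into attribute columns.
    return sum(column_entropy(col) for col in zip(*rows))


window = [("a", "x"), ("a", "y"), ("b", "x"), ("b", "y")]
print(dataset_entropy(window))
```

A window whose attributes are all constant yields an entropy of zero under this sketch, while a window with evenly mixed categories yields a larger value.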


iVIBRATE: Interactive Visualization Based Framework For Clustering Large Datasets, Keke Chen, Ling Liu Apr 2006

Kno.e.sis Publications

With continued advances in communication network technology and sensing technology, there is astounding growth in the amount of data produced and made available through cyberspace. Efficient and high-quality clustering of large datasets continues to be one of the most important problems in large-scale data analysis. A commonly used methodology for cluster analysis on large datasets is the three-phase framework of sampling/summarization, iterative cluster analysis, and disk-labeling. There are three known problems with this framework which demand effective solutions. The first problem is how to effectively define and validate irregularly shaped clusters, especially in large datasets. Automated algorithms and statistical ...


Semantic Web Applications In Financial Industry, Government, Health Care And Life Sciences, Amit P. Sheth Mar 2006

Kno.e.sis Publications

No abstract provided.


WSDL-S: Specification, Tools, Use Cases And Applications, Amit P. Sheth, Kunal Verma, Karthik Gomadam Mar 2006

Kno.e.sis Publications

No abstract provided.


Show Me What You Mean! Exploiting Domain Semantics In Ontology Visualization, Ravi Pavagada, Christopher Thomas, Amit P. Sheth, William S. York Jan 2006

Kno.e.sis Publications

Ontologies form the backbone of many life-sciences applications. These ontologies, however, are represented in XML-based languages that are meant for machine consumption and hence are difficult for humans to comprehend. For a meaningful visualization of these ontologies, it is important that the display of entities and relationships captures the cognitive representation of the domain as perceived by the domain experts. In this paper we present OntoVista, an ontology visualization tool that is adaptable to the needs of different domains, especially in the life sciences. While keeping the graph structures as the predominant model, we provide a semantically enhanced graph display ...


TaxaMiner: Improving Taxonomy Label Quality Using Latent Semantic Indexing, Cartic Ramakrishnan, Christopher Thomas, Vipul Kashyap, Amit P. Sheth Jan 2006

Kno.e.sis Publications

The development of taxonomies/ontologies is a human-intensive process requiring prohibitively large resource commitments in terms of time and cost. In our previous work we have identified an experimentation framework for semi-automatic taxonomy/hierarchy generation from unstructured text. In the preliminary results presented, the taxonomy/hierarchy quality was lower than we had anticipated. In this paper, we present two variations of our experimentation framework, viz. Latent Semantic Indexing (LSI) for document indexing and the use of term vectors to prune labels assigned to nodes in the final taxonomy/hierarchy. Using our previous results of taxonomy/hierarchy quality as the ...


Driving Deep Semantics In Middleware And Networks: What, Why And How?, Amit P. Sheth Jan 2006

Kno.e.sis Publications

No abstract provided.


Using Query-Specific Variance Estimates To Combine Bayesian Classifiers, Chi-Hoon Lee, Russell Greiner, Shaojun Wang Jan 2006

Kno.e.sis Publications

Many of today's best classification results are obtained by combining the responses of a set of base classifiers to produce an answer for the query. This paper explores a novel "query specific" combination rule: After learning a set of simple belief network classifiers, we produce an answer to each query by combining their individual responses, using weights based inversely on their respective variances around their responses. These variances are based on the uncertainty of the network parameters, which in turn depend on the training data sample. In essence, this variance quantifies the base classifier's confidence in its response to ...
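
The idea of weighting responses inversely to their variances can be sketched as a plain inverse-variance weighted average. This is a generic illustration, not the paper's exact combination rule; the function name `combine_inverse_variance`, the `eps` guard against zero variance, and the sample numbers are assumptions made for the example.

```python
from typing import Sequence


def combine_inverse_variance(
    responses: Sequence[float],
    variances: Sequence[float],
    eps: float = 1e-12,
) -> float:
    """Average the responses with weights proportional to 1 / variance.

    `eps` is a hypothetical guard that avoids division by zero when a
    classifier reports zero variance.
    """
    if not responses or len(responses) != len(variances):
        raise ValueError("responses and variances must be non-empty and equal in length")
    weights = [1.0 / (v + eps) for v in variances]
    total = sum(weights)
    return sum(w * r for w, r in zip(weights, responses)) / total


# Two classifiers answer the same query; the first is more certain (lower variance),
# so the combined answer sits closer to its response.
print(combine_inverse_variance([0.9, 0.5], [0.01, 0.04]))
```

Because the weights are recomputed for every query, a classifier that is confident on one query and uncertain on another contributes differently each time, which is what makes the rule query specific.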


Clustering Similarity Comparison Using Density Profiles, Eric Bae, James Bailey, Guozhu Dong Jan 2006

Kno.e.sis Publications

The unsupervised nature of cluster analysis means that objects can be clustered in many ways, allowing different clustering algorithms to generate vastly different results. To address this, clustering comparison methods have traditionally been used to quantify the degree of similarity between alternative clusterings. However, existing techniques utilize only the point memberships to calculate the similarity, which can lead to unintuitive results. They also cannot be applied to analyze clusterings which only partially share points, which can be the case in stream clustering. In this paper we introduce a new measure named ADCO, which takes into account density profiles for each ...


An Online Discriminative Approach To Background Subtraction, Li Cheng, Shaojun Wang, Terry Caelli Jan 2006

Kno.e.sis Publications

We present a simple, principled approach to detecting foreground objects in video sequences in real-time. Our method is based on an on-line discriminative learning technique that is able to cope with illumination changes due to discontinuous switching, or illumination drifts caused by slower processes such as varying time of the day. Starting from a discriminative learning principle, we derive a training algorithm that, for each pixel, computes a weighted linear combination of selected past observations with time-decay. We present experimental results that show the proposed approach outperforms existing methods on both synthetic sequences and real video data.


Semi-Supervised Conditional Random Fields For Improved Sequence Segmentation And Labeling, Feng Jiao, Shaojun Wang, Chi-Hoon Lee, Russell Greiner, Dale Schuurmans Jan 2006

Kno.e.sis Publications

We present a new semi-supervised training procedure for conditional random fields (CRFs) that can be used to train sequence segmentors and labelers from a combination of labeled and unlabeled training data. Our approach is based on extending the minimum entropy regularization framework to the structured prediction case, yielding a training objective that combines unlabeled conditional entropy with labeled conditional likelihood. Although the training objective is no longer concave, it can still be used to improve an initial model (e.g. obtained from supervised training) by iterative ascent. We apply our new training algorithm to the problem of identifying gene and ...


An Investigation Of Codon Usage Bias Including Visualization And Quantification In Organisms Exhibiting Multiple Biases, Douglas W. Raiford, Travis E. Doom, Dan E. Krane, Michael L. Raymer Jan 2006

Kno.e.sis Publications

Prokaryotic genomic sequence data provides a rich resource for bioinformatic analytic algorithms. Information can be extracted in many ways from the sequence data. One often overlooked process involves investigating an organism’s codon usage. Degeneracy in the genetic code leads to multiple codons coding for the same amino acids. Organisms often preferentially utilize specific codons when coding for an amino acid. This biased codon usage can be a useful trait when predicting a gene’s expressivity or whether the gene originated from horizontal transfer. There can be multiple biases at play in a genome causing errors in the predictive ...
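
To make the notion of biased codon usage concrete, the sketch below counts the codons in a coding sequence and reports how often each of the six leucine codons is used relative to the others. It is a minimal illustration, not the visualization or quantification method of the paper; the function names and the toy sequence are assumptions made for the example.

```python
from collections import Counter
from typing import Dict, Sequence

# The six codons that encode leucine in the standard genetic code.
LEUCINE_CODONS = ("TTA", "TTG", "CTT", "CTC", "CTA", "CTG")


def codon_counts(seq: str) -> Counter:
    """Count non-overlapping codons, reading in frame from position 0."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return Counter(seq[i:i + 3] for i in range(0, usable, 3))


def synonymous_fractions(
    seq: str, codons: Sequence[str] = LEUCINE_CODONS
) -> Dict[str, float]:
    """Fraction of each synonymous codon among all uses of that codon family."""
    counts = codon_counts(seq)
    total = sum(counts[c] for c in codons)
    if total == 0:
        return {c: 0.0 for c in codons}
    return {c: counts[c] / total for c in codons}


# Toy sequence: CTG is used three times and TTA once, so CTG dominates.
print(synonymous_fractions("CTGCTGCTGTTAATG"))
```

An unbiased organism would show roughly equal fractions across the synonymous codons; strongly unequal fractions indicate a preference.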


Knowledge Modeling And Its Application In Life Sciences: A Tale Of Two Ontologies, Satya S. Sahoo, Christopher Thomas, Amit P. Sheth, William S. York, Samir Tartir Jan 2006

Kno.e.sis Publications

High throughput glycoproteomics, similar to genomics and proteomics, involves extremely large volumes of distributed, heterogeneous data as a basis for identification and quantification of a structurally diverse collection of biomolecules. Sharing, comparing, querying, and most critically correlating datasets using the native biological relationships are some of the challenges faced by glycobiology researchers. As a solution for these challenges, we are building a semantic structure, using a suite of ontologies, which supports management of data and information at each step of the experimental lifecycle. This framework will enable researchers to leverage the large scale of glycoproteomics ...


Data Processing In Space, Time, And Semantics Dimensions, Farshad Hakimpour, Boanerges Aleman-Meza, Matthew Perry, Amit P. Sheth Jan 2006

Kno.e.sis Publications

This work presents an experimental system for data processing in space, time and semantics dimensions using current Semantic Web technologies. The paper describes how we obtain geographic and event data from Internet sources and also how we integrate them into an RDF store. We briefly introduce a set of functionalities in space, time and semantics dimensions. These functionalities are implemented based on our existing technology for main-memory based RDF data processing developed in the LSDIS Lab. A number of these functionalities are exposed as REST Web services. We present two sample client side applications that are developed using a combination ...


Predicting Domain Specific Entities With Limited Background Knowledge, Christopher Thomas, Amit P. Sheth Jan 2006

Kno.e.sis Publications

This paper proposes a framework for automatic recognition of domain-specific entities from text, given limited background knowledge, e.g. in the form of an ontology. The algorithm exploits several lightweight natural language processing techniques, such as tokenization and stemming, as well as statistical techniques, such as singular value decomposition (SVD), to suggest domain relatedness of unknown entities.