Open Access. Powered by Scholars. Published by Universities.®

Databases and Information Systems Commons

2005

Articles 151 - 176 of 176

Full-Text Articles in Databases and Information Systems

A Generative Programming Approach To Interactive Information Retrieval: Insights And Experiences, Saverio Perugini, Naren Ramakrishnan Jan 2005

Computer Science Faculty Publications

We describe the application of generative programming to a problem in interactive information retrieval. The particular interactive information retrieval problem we study is the support for "out-of-turn interaction" with a website – how a user can communicate input to a website when the site is not soliciting such information on the current page, but will do so on a subsequent page. Our solution approach makes generous use of program transformations (partial evaluation, currying, and slicing) to delay the site’s current solicitation for input until after the user’s out-of-turn input is processed. We illustrate how studying out-of-turn interaction through a generative …


The Good, Bad And The Indifferent: Explorations In Recommender System Health, Benjamin J. Keller, Sun-Mi Kim, N. Srinivas Vemuri, Naren Ramakrishnan, Saverio Perugini Jan 2005

Computer Science Faculty Publications

Our work is based on the premise that analysis of the connections exploited by a recommender algorithm can provide insight into the algorithm that could be useful to predict its performance in a fielded system. We use the jumping connections model defined by Mirza et al. [6], which describes the recommendation process in terms of graphs. Here we discuss our work that has come out of trying to understand algorithm behavior in terms of these graphs. We start by describing a natural extension of the jumping connections model of Mirza et al., and then discuss observations that have come from …


Applying Scenario-Based Design And Claim Analysis To The Design Of A Digital Library Of Geography Examination Resources, Yin-Leng Theng, Dion Hoe-Lian Goh, Ee Peng Lim, Zehua Liu, Ming Yin, Natalie Lee-San Pang, Patricia Bao-Bao Wong Jan 2005

Research Collection School Of Computing and Information Systems

This paper describes the application of Carroll’s scenario-based design and claims analysis as a means of refinement to the initial design of a digital library of geographical resources (GeogDL) to prepare Singapore students to take a national examination in geography. GeogDL is built on top of G-Portal, a digital library providing services over geospatial and georeferenced Web content. Beyond improving the initial design of GeogDL, a main contribution of the paper is making explicit the use of Carroll’s strong theory-based but undercapitalized scenario-based design and claims analysis that inspired recommendations for the refinement of GeogDL. The paper concludes with an …


Web-Based Independent Study Program, Darryl Dwaine Scroggins Jan 2005

Theses Digitization Project

The Web-based Independent Study Program (WISP) is an on-line database program used to create and store educational records for members of Dikaios. (Dikaios is a Christian educators association that offers an independent study program.) The database allows home educators to create, store, edit, view, and print the forms and records that are required by Dikaios administrators and by the state of California. Further, the database helps students and home educators monitor student progress towards meeting high school graduation requirements.


Utility Computing: Certification Model, Costing Model, And Related Architecture Development, Saif Ahmed Faruqui Jan 2005

Theses Digitization Project

The purpose of the thesis was to propose one set of solutions to some of the challenges that are delaying the adoption of utility computing on a wider scale. These components enable effective deployment of utility computing, efficient look-up, and comparison of service offerings of different utility computing resource centers connected to the utility computing network.


PDF Shopping System With The Lightweight Currency Protocol, Yingzhuo Wang Jan 2005

Theses Digitization Project

This project is a web application for two types of bookstores: an E-Bookstore and a PDF-Bookstore. Both are document sellers; however, the E-Bookstore is not a currency user. The PDF-Bookstore sells PDF documents and issues a lightweight currency called Scart. Customers can sell their PDF documents to earn Scart currency and buy PDF documents by paying with Scart.


Ontology-Assisted Mining Of RDF Documents, Tao Jiang, Ah-Hwee Tan Jan 2005

Research Collection School Of Computing and Information Systems

Resource description framework (RDF) is becoming a popular encoding language for describing and interchanging metadata of web resources. In this paper, we propose an Apriori-based algorithm for mining association rules (AR) from RDF documents. We treat relations (RDF statements) as items in traditional AR mining to mine associations among relations. The algorithm further makes use of a domain ontology to provide generalization of relations. To obtain compact rule sets, we present a generalized pruning method for removing uninteresting rules. We illustrate a potential usage of AR mining on RDF documents for detecting patterns of terrorist activities. Experiments conducted based on …
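The idea of treating RDF statements as items can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the `parents` map stands in for a domain ontology, the triples are made-up sample data, and `generalize` and `frequent_itemsets` are hypothetical helper names. It shows only the Apriori-style level-wise search for frequent sets of statements, before and after generalizing subjects and objects to their ontology classes.

```python
from itertools import combinations

def generalize(statement, parents):
    """Lift subject and object to their parent class (assumed one-level ontology)."""
    s, p, o = statement
    return (parents.get(s, s), p, parents.get(o, o))

def frequent_itemsets(transactions, min_support, max_size=2):
    """Level-wise Apriori search: keep itemsets found in >= min_support transactions."""
    frequent = {}
    candidates = {frozenset([item]) for t in transactions for item in t}
    size = 1
    while candidates and size <= max_size:
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        items = sorted({i for c in level for i in c})
        size += 1
        # Apriori pruning: a candidate survives only if all its subsets were frequent.
        candidates = {
            frozenset(c)
            for c in combinations(items, size)
            if all(frozenset(s) in level for s in combinations(c, size - 1))
        }
    return frequent

# Hypothetical ontology fragment and RDF documents (sets of triples).
parents = {"Alice": "Person", "Bob": "Person", "Paris": "City", "Rome": "City"}
documents = [
    {("Alice", "visits", "Paris"), ("Alice", "buys", "Ticket")},
    {("Bob", "visits", "Rome"), ("Bob", "buys", "Ticket")},
    {("Alice", "visits", "Rome")},
]

raw = frequent_itemsets([frozenset(d) for d in documents], min_support=2)
lifted = frequent_itemsets(
    [frozenset(generalize(st, parents) for st in d) for d in documents],
    min_support=2,
)
```

On this toy data no raw statement repeats often enough to be frequent, while the generalized statements do, which is the kind of effect ontology-based generalization is meant to produce.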


A Random Rotation Perturbation Approach To Privacy Preserving Data Classification, Keke Chen, Ling Liu Jan 2005

Kno.e.sis Publications

This paper presents a random rotation perturbation approach for privacy preserving data classification. Concretely, we identify the importance of classification-specific information with respect to the loss of information factor, and present a random rotation perturbation framework for privacy preserving data classification. Our approach has two unique characteristics. First, we identify that many classification models utilize the geometric properties of datasets, which can be preserved by geometric rotation. We prove that the three types of classifiers will deliver the same performance over the rotation perturbed dataset as over the original dataset. Second, we propose a multi-column privacy model to address the …
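The geometric claim in this abstract, that rotation preserves the properties many classifiers rely on, can be checked with a small Python sketch. It is an illustration under stated assumptions rather than the authors' framework: `random_orthogonal`, `perturb`, and the sample points are hypothetical, and the Gram-Schmidt construction yields a random orthogonal matrix, which is a rotation possibly combined with a reflection.

```python
import math
import random

def random_orthogonal(dim, rng):
    """Build a random orthogonal matrix via Gram-Schmidt on Gaussian vectors."""
    rows = []
    while len(rows) < dim:
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        for u in rows:
            dot = sum(a * b for a, b in zip(v, u))
            v = [a - dot * b for a, b in zip(v, u)]
        norm = math.sqrt(sum(a * a for a in v))
        if norm > 1e-9:  # skip the (unlikely) degenerate draw
            rows.append([a / norm for a in v])
    return rows

def perturb(points, matrix):
    """Apply the orthogonal transform to every data point."""
    return [[sum(m * x for m, x in zip(row, p)) for row in matrix] for p in points]

def distance(p, q):
    """Euclidean distance between two points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

rng = random.Random(0)
data = [[1.0, 2.0, 3.0], [4.0, 0.0, -1.0], [2.5, 2.5, 0.0]]
R = random_orthogonal(3, rng)
rotated = perturb(data, R)
```

Pairwise Euclidean distances in `rotated` match those in `data` up to floating-point error, so a distance-based classifier such as nearest neighbour would see the same geometry on the perturbed data.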


TOntoGen: A Synthetic Data Set Generator For Semantic Web Applications, Matthew Perry Jan 2005

Kno.e.sis Publications

No abstract provided.


Variational Bayesian Image Modelling, Li Chen, Feng Jiao, Dale Schuurmans, Shaojun Wang Jan 2005

Kno.e.sis Publications

We present a variational Bayesian framework for performing inference, density estimation and model selection in a special class of graphical models—Hidden Markov Random Fields (HMRFs). HMRFs are particularly well suited to image modelling and in this paper, we apply them to the problem of image segmentation. Unfortunately, HMRFs are notoriously hard to train and use because the exact inference problems they create are intractable. Our main contribution is to introduce an efficient variational approach for performing approximate inference of the Bayesian formulation of HMRFs, which we can then apply to the density estimation and model selection problems that arise when …


METEOR-S WSDI: A Scalable P2P Infrastructure Of Registries For Semantic Publication And Discovery Of Web Services, Kunal Verma, Kaarthik Sivashanmugam, Amit P. Sheth, Abhijit Patil, Swapna Oundhakar, John Miller Jan 2005

Kno.e.sis Publications

Web services are the new paradigm for distributed computing. They have much to offer towards interoperability of applications and integration of large scale distributed systems. To make Web services accessible to users, service providers use Web service registries to publish them. Current infrastructure of registries requires replication of all Web service publications in all Universal Business Registries. Large growth in number of Web services as well as the growth in the number of registries would make this replication impractical. In addition, the current Web service discovery mechanism is inefficient, as it does not support discovery based on the capabilities of …


From Semantic Search & Integration To Analytics, Amit P. Sheth Jan 2005

Kno.e.sis Publications

Semantics is seen as the key ingredient in the next phase of the Web infrastructure as well as the next generation of enterprise content management. Ontology is the centerpiece of the most prevalent semantic technologies and provides the basis of representing, acquiring, and utilizing knowledge. With the availability of several commercial products and many research tools, specifications and increasing adoption of Semantic Web standards such as RDF for metadata and OWL for ontology representation, ontology-driven techniques and systems have already enabled a new generation of industry strength semantic applications. In particular, Semagix's Freedom has powered applications in leading verticals such …


Discovering Informative Subgraphs In RDF Graphs, William H. Milnor, Cartic Ramakrishnan, Matthew Perry, Amit P. Sheth, John A. Miller, Krzysztof Kochut Jan 2005

Kno.e.sis Publications

Discovering patterns in graphs has long been an area of interest. In most contemporary approaches to such pattern discovery, either quantitative anomalies or frequency of substructure is used to measure the interestingness of a pattern. In this paper we address the issue of discovering informative sub-graphs within RDF graphs. We motivate our work with an example related to Semantic Search. A user might pose a question of the form 'What are the most relevant ways in which entity X is related to entity Y?', the response to which is a subgraph connecting X to Y. Relevance of the …


TaxaMiner: An Experimentation Framework For Automated Taxonomy Bootstrapping, Vipul Kashyap, Cartic Ramakrishnan, Christopher Thomas, Amit P. Sheth Jan 2005

Kno.e.sis Publications

Construction of domain ontologies on the Semantic Web is a human and resource intensive process, and efforts to reduce this cost are crucial for the Semantic Web to scale. We present a framework for automated taxonomy construction that involves: (a) generation of a cluster hierarchy from a document corpus using statistical clustering and NLP techniques; (b) extraction of a topic hierarchy from this cluster hierarchy; and (c) assignment of labels to nodes in the topic hierarchy. Metrics for estimating topic hierarchy quality and parameters of an experimentation framework are identified. MEDLINE was the document corpus and the MeSH thesaurus was the gold standard.


Framework For Semantic Web Process Composition, Kaarthik Sivashanmugam, John A. Miller, Amit P. Sheth, Kunal Verma Jan 2005

Kno.e.sis Publications

Web services have the potential to revolutionize e-commerce by enabling businesses to interact with each other on the fly. To date, however, Web processes using Web services have been created mostly at the syntactic level. Current composition standards focus on building processes based on the interface description of the participating services. This rigid approach, with its strong coupling between the process and the interface of the participating services, does not allow businesses to dynamically change partners and services. As shown in this article, Web process composition techniques can be enhanced by using semantic process templates to capture the semantic requirements …


GLYDE - An Expressive XML Standard For The Representation Of Glycan, Satya S. Sahoo, Christopher Thomas, Amit P. Sheth, Cory Andrew Henson, William S. York Jan 2005

Kno.e.sis Publications

The amount of glycomics data being generated is rapidly increasing as a result of improvements in analytical and computational methods. Correlation and analysis of this large, distributed data set requires an extensible and flexible representational standard that is also ‘understood’ by a wide range of software applications. An XML-based data representation standard that faithfully captures essential structural details of a glycan moiety along with additional information (such as data provenance) to aid the interpretation and usage of glycan data, will facilitate the exchange of glycomics data across the scientific community. To meet this need, we introduce GLYcan Data Exchange (GLYDE) …


The "Best K" For Entropy-Based Categorical Data Clustering, Keke Chen, Ling Liu Jan 2005

Kno.e.sis Publications

With the growing demand on cluster analysis for categorical data, a handful of categorical clustering algorithms have been developed. Surprisingly, to our knowledge, none has satisfactorily addressed the important problem for categorical clustering – how can we determine the best K number of clusters for a categorical dataset? Since categorical data does not have the inherent distance function as the similarity measure, traditional cluster validation techniques based on the geometry shape and density distribution cannot be applied to answer this question. In this paper, we investigate the entropy property of the categorical data and propose a BkPlot method for determining …
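To make the entropy property concrete, the following Python sketch computes a size-weighted entropy for a clustering of categorical rows. It is not the BkPlot method, whose details are truncated above. The function names and sample rows are hypothetical, and summing per-column entropies (which treats columns as independent) is a simplifying assumption of this sketch.

```python
import math
from collections import Counter

def column_entropy(rows, col):
    """Shannon entropy (in bits) of the values in one categorical column."""
    counts = Counter(r[col] for r in rows)
    n = len(rows)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def cluster_entropy(rows):
    """Entropy of one cluster, taken here as the sum over its columns (assumption)."""
    return sum(column_entropy(rows, c) for c in range(len(rows[0])))

def expected_entropy(clusters):
    """Size-weighted average entropy over all clusters of a partition."""
    total = sum(len(c) for c in clusters)
    return sum(len(c) / total * cluster_entropy(c) for c in clusters)

# Hypothetical categorical dataset with two obvious groups.
rows = [("a", "x"), ("a", "x"), ("b", "y"), ("b", "y")]
one_cluster = expected_entropy([rows])
two_clusters = expected_entropy([rows[:2], rows[2:]])
```

Here `one_cluster` evaluates to 2 bits and `two_clusters` to 0, so splitting into the two natural groups removes all within-cluster disorder. How such a quantity changes as K varies is the kind of signal an entropy-based choice of K can build on.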


Semantic Web & Semantic Web Services: Applications In Healthcare And Scientific Research, Amit P. Sheth Jan 2005

Kno.e.sis Publications

No abstract provided.


WSDL-S: Adding Semantics To WSDL, John Miller, Kunal Verma, Preeda Rajasekaran, Amit P. Sheth, Rohit Aggarwal, Kaarthik Sivashanmugam Jan 2005

Kno.e.sis Publications

Web services have primarily been designed for providing inter-operability between business applications. Current technologies assume a large amount of human interaction for integrating two applications. This is primarily because business process integration requires understanding of the data and functions of the involved entities. Semantic Web technologies, powered by description logic based languages like OWL[1], aim to add greater meaning to Web content by annotating the data with ontologies. Ontologies provide a mechanism for sharing conceptualizations of domains. This allows agents to get an understanding of users’ Web content and greatly reduces human interaction for meaningful Web …


Adaptive Decision Support For Academic Course Scheduling Using Intelligent Software Agents, Prithviraj Dasgupta, Deepak Khazanchi Jan 2005

Information Systems and Quantitative Analysis Faculty Publications

Academic course scheduling is a complex operation that requires interaction between different users, including instructors and course schedulers, to satisfy conflicting constraints in an optimal manner. Traditionally, this problem has been addressed as a constraint satisfaction problem where the constraints are stationary over time. In this paper, we address academic course scheduling as a dynamic decision support problem using an agent-enabled adaptive decision support system. We describe the Intelligent Agent Enabled Decision Support (IAEDS) system, which employs software agents to assist humans in making strategic decisions under dynamic and uncertain conditions. The IAEDS system has a …


Non-Verbal Communication With Autistic Children Using Digital Libraries, Gondy A. Leroy, John Huang '05, Serena Chuang '05, Marjorie H. Charlop Jan 2005

CGU Faculty Publications and Research

Autism spectrum disorder (ASD) has become one of the most prevalent mental disorders over the last few years and its prevalence is still growing. The disorder is characterized by a wide variety of symptoms such as lack of social behavior, extreme withdrawal, and problems communicating. Because of the diversity of symptoms and the wide variation in their severity, each autistic child has different needs and requires individualized therapy. This leads to long waiting lists for therapy.


Recommender Systems Research, Saverio Perugini Jan 2005

Computer Science Faculty Publications

We outline the history of recommender systems from their roots in information retrieval and filtering to their role in today’s Internet economy. Recommender systems attempt to reduce information overload and retain customers by selecting a subset of items from a universal set based on user preferences. Research in recommender systems lies at the intersection of several areas of computer science, such as artificial intelligence and human-computer interaction, and has progressed to an important research area of its own. It is important to note that recommendations are not delivered within a vacuum, but rather cast within an informal community of users …


Towards Persistent Resource Identification With The Uniform Resource Name, Luke Brown Jan 2005

Theses : Honours

The exponential growth of the Internet, and the subsequent reliance on the resources it connects, has exposed a clear need for an Internet identifier which remains accessible over time. Such identifiers have been dubbed persistent identifiers owing to the promise of reliability they imply. Persistent naming systems exist at present; however, it is the resolution of these systems into what Kunze (2003) calls "persistent actionable identifiers" which is the focus of this work. Actionable identifiers can be thought of as identifiers which are accessible in a simple fashion such as through a web browser or through a specific application. This …


Introduction: Data Communication And Topology Algorithms For Sensor Networks, Stephan Olariu, David Simplot-Ryl, Ivan Stojmenovic Jan 2005

Computer Science Faculty Publications

(First paragraph) We are very proud and honored to have been entrusted to be Guest Editors for this special issue. Papers were sought to comprehensively cover the algorithmic issues in the “hot” area of sensor networking. The concentration was on network layer problems, which can be divided into two groups: data communication problems and topology control problems. We wish to briefly introduce the five papers appearing in this special issue. They cover specific problems such as time division for reduced collision, fault tolerant clustering, self-stabilizing graph optimization algorithms, key pre-distribution for secure communication, and distributed storage based on spanning trees …


Exploring Bit-Difference For Approximate KNN Search In High-Dimensional Databases, Bin Cui, Heng Tao Shen, Jialie Shen, Kian-Lee Tan Jan 2005

Research Collection School Of Computing and Information Systems

In this paper, we develop a novel index structure to support efficient approximate k-nearest neighbor (KNN) queries in high-dimensional databases. In high-dimensional spaces, the computational cost of the distance (e.g., Euclidean distance) between two points contributes a dominant portion of the overall query response time for memory processing. To reduce the distance computation, we first propose a structure (BID) using BIt-Difference to answer approximate KNN queries. The BID employs one bit to represent each feature vector of point and the number of bit-difference is used to prune the further points. To handle real datasets, which are typically skewed, we enhance …
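The pruning idea can be sketched in a few lines of Python. This is one possible reading of the abstract, not the BID structure itself: it assumes one bit per dimension, set when a coordinate exceeds that dimension's mean (the thresholds are an assumption), and the function names and the `candidates` parameter are hypothetical. Points whose bit signatures differ most from the query's are pruned before any exact distance is computed.

```python
import math

def bit_signature(point, thresholds):
    """One bit per dimension: 1 if the coordinate exceeds its threshold."""
    return tuple(1 if x > t else 0 for x, t in zip(point, thresholds))

def bit_difference(a, b):
    """Number of positions where two signatures disagree (Hamming distance)."""
    return sum(x != y for x, y in zip(a, b))

def approximate_knn(query, points, k, candidates):
    """Shortlist by bit difference, then rank the shortlist by exact distance."""
    dim = len(query)
    thresholds = [sum(p[d] for p in points) / len(points) for d in range(dim)]
    q_sig = bit_signature(query, thresholds)
    ranked = sorted(
        range(len(points)),
        key=lambda i: bit_difference(q_sig, bit_signature(points[i], thresholds)),
    )
    shortlist = ranked[:candidates]  # only these get an exact distance computation
    shortlist.sort(key=lambda i: math.dist(query, points[i]))
    return shortlist[:k]

points = [[0, 0], [1, 1], [9, 9], [10, 10], [0, 10]]
nearest = approximate_knn([9.5, 9.5], points, k=2, candidates=3)
```

The result is approximate by construction: a true neighbour that falls just across a threshold can be pruned, which is the trade-off for computing fewer exact distances.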


Linear Correlation Discovery In Databases: A Data Mining Approach, Cecil Chua, Roger Hsiang-Li Chiang, Ee Peng Lim Jan 2005

Research Collection School Of Computing and Information Systems

Very little research in knowledge discovery has studied how to incorporate statistical methods to automate linear correlation discovery (LCD). We present an automatic LCD methodology that adopts statistical measurement functions to discover correlations from databases’ attributes. Our methodology automatically pairs attribute groups having potential linear correlations, measures the linear correlation of each pair of attribute groups, and confirms the discovered correlation. The methodology is evaluated in two sets of experiments. The results demonstrate the methodology’s ability to facilitate linear correlation discovery for databases with a large amount of data.
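The pair-then-measure step described above can be approximated with a short Python sketch. It is a simplification under stated assumptions, not the authors' methodology: it pairs single attributes rather than attribute groups, uses the Pearson coefficient as the measurement function, skips the confirmation step, and the table, threshold, and function names are hypothetical.

```python
import math
from itertools import combinations

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric columns."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column has no defined linear correlation
    return cov / math.sqrt(vx * vy)

def correlated_pairs(table, threshold=0.9):
    """Measure every attribute pair and keep those with |r| >= threshold."""
    found = []
    for a, b in combinations(sorted(table), 2):
        r = pearson(table[a], table[b])
        if abs(r) >= threshold:
            found.append((a, b, r))
    return found

# Hypothetical table stored column-wise.
table = {
    "price": [10.0, 20.0, 30.0, 40.0],
    "tax": [1.0, 2.0, 3.0, 4.0],
    "noise": [5.0, -1.0, 4.0, 0.0],
}
pairs = correlated_pairs(table)
```

On this toy table only `price` and `tax` are reported. Exhaustive pairing grows quadratically with the number of attributes, which suggests why an automated methodology would need a cheaper way to select candidate pairs on large databases.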