Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Databases and Information Systems

2006

Articles 1 - 30 of 148

Full-Text Articles in Physical Sciences and Mathematics

ITR/IM: Enabling The Creation And Use Of Geogrids For Next Generation Geospatial Information, Peggy Agouris, Mary-Kate Beard-Tisdale, Chaitanya Baru, Sarah Nusser Dec 2006

University of Maine Office of Research Administration: Grant Reports

The objective of this project is to advance science in information management, focusing in particular on geospatial information. It addresses the development of concepts, algorithms, and system architectures to enable users on a grid to query, analyze, and contribute to multivariate, quality-aware geospatial information. The approach consists of three complementary research areas: (1) establishing a statistical framework for assessing geospatial data quality; (2) developing uncertainty-based query processing capabilities; and (3) supporting the development of space- and accuracy-aware adaptive systems for geospatial datasets. The results of this project will support the extension of the concept of the computational grid to facilitate …


Data Management Plans: Stages, Components, And Activities, Abbas S. Tavakoli, Kirby Jackson, Linda Moneyham, Kenneth D. Phillips, Carolyn Murdaugh, Gene Meding Dec 2006

Applications and Applied Mathematics: An International Journal (AAM)

Data management strategies have become increasingly important as new computer technologies allow for larger and more complex data sets to be analyzed easily. As a consequence, data management has become a specialty requiring specific skills and knowledge. Many new investigators have no formal training in management of data sets. This paper describes common basic strategies critical to the management of data as applied to a data set from a longitudinal study. The stages of data management are identified. Moreover, key components and strategies at each stage are described.


Towards Effective Content-Based Music Retrieval With Multiple Acoustic Feature Composition, Jialie Shen, John Shepherd, Ngu Ahh Dec 2006

Research Collection School Of Computing and Information Systems

In this paper, we present a new approach to constructing music descriptors to support efficient content-based music retrieval and classification. The system applies multiple musical properties combined with a hybrid architecture based on principal component analysis (PCA) and a multilayer perceptron neural network. This architecture enables straightforward incorporation of multiple musical feature vectors, based on properties such as timbral texture, pitch, and rhythm structure, into a single low-dimensioned vector that is more effective for classification than the larger individual feature vectors. The use of supervised training enables incorporation of human musical perception that further enhances the classification process. We compare …


Implicit Online Learning With Kernels, Li Cheng, S. V. N. Vishwanathan, Dale Schuurmans, Shaojun Wang, Terry Caelli Dec 2006

Kno.e.sis Publications

We present two new algorithms for online learning in reproducing kernel Hilbert spaces. Our first algorithm, ILK (implicit online learning with kernels), employs a new, implicit update technique that can be applied to a wide variety of convex loss functions. We then introduce a bounded memory version, SILK (sparse ILK), that maintains a compact representation of the predictor without compromising solution quality, even in non-stationary environments. We prove loss bounds and analyze the convergence rate of both. Experimental evidence shows that our proposed algorithms outperform current methods on synthetic and real data.


Regression Cubes With Lossless Compression And Aggregation, Yixin Chen, Guozhu Dong, Jiawei Han, Jian Pei, Benjamin W. Wah, Jianyong Wang Dec 2006

Kno.e.sis Publications

As OLAP engines are widely used to support multidimensional data analysis, it is desirable to support in data cubes advanced statistical measures, such as regression and filtering, in addition to the traditional simple measures such as count and average. Such new measures will allow users to model, smooth, and predict the trends and patterns of data. Existing algorithms for simple distributive and algebraic measures are inadequate for efficient computation of statistical measures in a multidimensional space. In this paper, we propose a fundamentally new class of measures, compressible measures, in order to support efficient computation of the statistical models. For …


Yellow Tree: A Distributed Main-Memory Spatial Index Structure For Moving Objects, Hariharan Gowrisankar Dec 2006

Electronic Theses and Dissertations

Mobile devices equipped with wireless technologies to communicate and positioning systems to locate objects of interest are commonplace today, providing the impetus to develop location-aware applications. At the heart of location-aware applications are moving objects, or objects that continuously change location over time, such as cars in transportation networks, pedestrians, or postal packages. Location-aware applications tend to support the tracking of very large numbers of such moving objects as well as many users that are interested in finding out about the locations of other moving objects. Such location-aware applications rely on support from database management systems to model, …


Query-Based Watermarking For XML Data, Xuan Zhou, Hwee Hwa Pang, Kian-Lee Tan Dec 2006

Research Collection School Of Computing and Information Systems

As an increasing amount of XML data is exchanged over the internet, copyright protection of this type of data is becoming an important requirement for many applications. In this paper, we introduce a rights protection scheme for XML data based on digital watermarking. One of the main challenges for watermarking XML data is that the data could be easily reorganized by an adversary in an attempt to destroy any embedded watermark. To overcome this, we propose a query-based watermarking scheme, which creates queries to identify available watermarking capacity, such that watermarks could be recovered from reorganized data through query rewriting. The …


Clique Percolation For Finding Naturally Cohesive And Overlapping Document Clusters, Wei Gao, Kam-Fai Wong, Yunqing Xia, Ruifeng Xu Dec 2006

Research Collection School Of Computing and Information Systems

Techniques for finding document clusters mostly depend on models that impose strong explicit and/or implicit a priori assumptions. As a consequence, the clustering effects tend to be unnatural and stray away from the intrinsic grouping nature of a document collection. We apply a novel graph-theoretic technique called Clique Percolation Method (CPM) for document clustering. In this method, a process of enumerating highly cohesive maximal document cliques is performed in a random graph, where those strongly adjacent cliques are mingled to form naturally overlapping clusters. Our clustering results can unveil the inherent structural connections of the underlying data. Experiments show that CPM …


Designing Web Sites For Customer Loyalty Across Business Domains: A Multilevel Analysis, S. Mithas, Narayanasamy Ramasubbu, M. S. Krishnan, C. Fornell Dec 2006

Research Collection School Of Computing and Information Systems

Web sites are important components of Internet strategy for organizations. This paper develops a theoretical model for understanding the effect of Web site design elements on customer loyalty to a Web site. We show the relevance of the business domain of a Web site to gain a contextual understanding of the relative importance of Web site design elements. We use a hierarchical linear modeling approach to model multilevel and cross-level interactions that have not been explicitly considered in previous research. By analyzing data on more than 12,000 online customer surveys for 43 Web sites in several business domains, we find that …


Continuous Monitoring Of kNN Queries In Wireless Sensor Networks, Yuxia Yao, Xueyan Tang, Ee Peng Lim Dec 2006

Research Collection School Of Computing and Information Systems

Wireless sensor networks have been widely used for civilian and military applications, such as environmental monitoring and vehicle tracking. In these applications, continuous query processing is often required, and efficient evaluation of such queries is a critical requirement. Due to the limited power supply of sensor nodes, energy efficiency is a major performance measure in such query evaluation. In this paper, we focus on continuous kNN query processing. We observe that centralized data storage and monitoring schemes do not favor energy efficiency. We therefore propose a localized scheme to monitor long-running nearest neighbor queries in sensor networks. …


Measuring Qualities Of Articles Contributed By Online Communities, Ee Peng Lim, Ba-Quy Vuong, Hady W. Lauw, Aixin Sun Dec 2006

Research Collection School Of Computing and Information Systems

Using open source Web editing software (e.g., wiki), online community users can now easily edit, review and publish articles collaboratively. While much useful knowledge can be derived from these articles, content users and critics are often concerned about their quality. In this paper, we develop two models, namely the basic model and the peer review model, for measuring the quality of these articles and the authority of their contributors. We represent collaboratively edited articles and their contributors in a bipartite graph. While the basic model measures an article's quality using both the authorities of contributors and the amount of contribution from each …


Towards Effective Content-Based Music Retrieval With Multiple Acoustic Feature Combination, Jialie Shen, John Shepherd, Ann H. H. Ngu Dec 2006

Research Collection School Of Computing and Information Systems

In this paper, we present a new approach to constructing music descriptors to support efficient content-based music retrieval and classification. The system applies multiple musical properties combined with a hybrid architecture based on principal component analysis (PCA) and a multilayer perceptron neural network. This architecture enables straightforward incorporation of multiple musical feature vectors, based on properties such as timbral texture, pitch, and rhythm structure, into a single low-dimensioned vector that is more effective for classification than the larger individual feature vectors. The use of supervised training enables incorporation of human musical perception that further enhances the classification process. We compare …


On The Lower Bound Of Local Optimums In K-Means Algorithms, Zhenjie Zhang, Bing Tian Dai, Anthony K.H. Tung Dec 2006

Research Collection School Of Computing and Information Systems

No abstract provided.


Rapid Identification Of Column Heterogeneity, Bing Tian Dai, Nick Koudas, Beng Chin Ooi, Divesh Srivastava, Suresh Venkatasubramanian Dec 2006

Research Collection School Of Computing and Information Systems

No abstract provided.


Modeling Heterogeneous User Churn And Local Resilience Of Unstructured P2P Networks, Zhongmei Yao, Derek Leonard, Dmitri Loguinov, Xiaoming Wang Nov 2006

Computer Science Faculty Publications

Previous analytical results on the resilience of unstructured P2P systems have not explicitly modeled heterogeneity of user churn (i.e., difference in online behavior) or the impact of in-degree on system resilience. To overcome these limitations, we introduce a generic model of heterogeneous user churn, derive the distribution of the various metrics observed in prior experimental studies (e.g., lifetime distribution of joining users, joint distribution of session time of alive peers, and residual lifetime of a randomly selected user), derive several closed-form results on the transient behavior of in-degree, and eventually obtain the joint in/out degree isolation probability as a simple …


Active Semantic Electronic Medical Record, Amit P. Sheth, Sangeeta Agrawal, Jonathan Lathem, Nicole Oldham, H. Wingate, K. Gallagher Nov 2006

Kno.e.sis Publications

The healthcare industry is rapidly advancing towards the widespread use of electronic medical records systems to manage the increasingly large amount of patient data and reduce medical errors. In addition to patient data, there is a large amount of data describing procedures, treatments, diagnoses, drugs, insurance plans, coverage, formularies and the relationships between these data sets. While practices have benefited from the use of EMRs, infusing these essential programs with rich domain knowledge and rules can greatly enhance their performance and ability to support clinical decisions. The Active Semantic Electronic Medical Record (ASEMR) application discussed here uses Semantic Web technologies to …


{Ontology: Resource} X {Matching : Mapping} X {Schema : Instance} :: Components Of The Same Challenge, Amit P. Sheth Nov 2006

Kno.e.sis Publications

Ontologies enable us to elevate syntactic and structural processing in an information system/Web to an information system/Web powered with semantic processing. Experience has shown that monolithic and tightly coupled approaches seldom succeed, and the majority of information systems and applications will need to deal with a plurality of ontologies in a loosely coupled environment (i.e., independently evolving ontologies and inter-ontology relationships, existence of different contexts for different users/applications, etc.). Development of such loosely coupled multi-ontology environments entails development of techniques for ontology mapping/alignment, multi-ontology query processing, and much more.


California State Information Technology Strategic Plan 2006, Office Of The State Chief Information Officer Nov 2006

California Agencies

No abstract provided.


How To Reason With OWL In A Logic Programming System, Markus Krotzsch, Pascal Hitzler, Denny Vrandecic, Michael Sintek Nov 2006

Computer Science and Engineering Faculty Publications

Logic programming has always been a major ontology modeling paradigm, and is frequently being used in large research projects and industrial applications, e.g., by means of the F-Logic reasoning engine OntoBroker or the TRIPLE query, inference, and transformation language and system. At the same time, the Web Ontology Language OWL has been recommended by the W3C for modeling ontologies for the Web. Naturally, it is desirable to investigate the interoperability between both paradigms. In this paper, we do so by studying an expressive fragment of OWL DL for which reasoning can be reduced to the evaluation of Horn logic programs. …


On The Complexity Of Horn Description Logics, Markus Krotzsch, Sebastian Rudolph, Pascal Hitzler Nov 2006

Computer Science and Engineering Faculty Publications

Horn-SHIQ has been identified as a fragment of the description logic SHIQ for which inferencing is in PTIME with respect to the size of the ABox. This enables reasoning with larger ABoxes in situations where the TBox is static, and represents one approach towards tractable description logic reasoning. In this paper, we show that reasoning in Horn-SHIQ, in spite of its low data complexity, is EXPTIME-hard with respect to the overall size of the knowledge base. While this result is not unexpected, the proof is not a mere modification of existing reductions since …


A Framework For Schema-Driven Relationship Discovery From Unstructured Text, Cartic Ramakrishnan, Krzysztof Kochut, Amit P. Sheth Nov 2006

Kno.e.sis Publications

We address the issue of extracting implicit and explicit relationships between entities in biomedical text. We argue that entities seldom occur in text in their simple form and that relationships in text relate the modified, complex forms of entities with each other. We present a rule-based method for (1) the extraction of such complex entities, (2) the extraction of relationships between them, and (3) the conversion of such relationships into RDF. Furthermore, we present results that clearly demonstrate the utility of the generated RDF in discovering knowledge from text corpora by means of locating paths composed of the extracted relationships.


Integration Of Wikipedia And A Geography Digital Library, Ee Peng Lim, Zhe Wang, Darwin Sadeli, Yuanyuan Li, Chew-Hung Chang, Kalyani Chatterjea, Dion Hoe-Lian Goh, Yin-Leng Theng, Jun Zhang, Aixin Sun Nov 2006

Research Collection School Of Computing and Information Systems

In this paper, we address the problem of integrating Wikipedia, an online encyclopedia, and G-Portal, a web-based digital library, in the geography domain. The integration facilitates the sharing of data and services between the two web applications that are of great value in learning. We first present an overall system architecture for supporting such an integration and address the metadata extraction problem associated with it. In metadata extraction, we focus on extracting and constructing metadata for geo-political regions, namely cities and countries. Some empirical performance results will be presented. The paper will also describe the adaptations of G-Portal and Wikipedia …


Understanding User Perceptions On Usefulness And Usability Of An Integrated Wiki-G-Portal, Yin-Leng Theng, Yuanyuan Li, Ee Peng Lim, Zhe Wang, Dion Hoe-Lian Goh, Chew-Hung Chang, Kalyani Chatterjea, Jun Zhang Nov 2006

Research Collection School Of Computing and Information Systems

This paper describes a pilot study on Wiki-G-Portal, a project integrating Wikipedia, an online encyclopedia, into G-Portal, a Web-based digital library of geography resources. Initial findings from the pilot study suggest positive perceptions of the usefulness and usability of Wiki-G-Portal, as well as of subjects' attitude and intention to use it.


A Model For Anticipatory Event Detection, Qi He, Kuiyu Chang, Ee Peng Lim Nov 2006

Research Collection School Of Computing and Information Systems

Event detection is a very important area of research that discovers new events reported in a stream of text documents. Previous research in event detection has largely focused on finding the first story and tracking the events of a specific topic. A topic is simply a set of related events defined by user-supplied keywords with no associated semantics and little domain knowledge. We therefore introduce the Anticipatory Event Detection (AED) problem: given some user-preferred event transition in a topic, detect the occurrence of the transition for the stream of news covering the topic. We confine the events to …


Metadata Basics: A Literature Survey And Subject Analysis, Nicole Mitchell Oct 2006

The Southeastern Librarian

Librarians today are wrestling with an ever-changing digital environment. In some way or another, we must all adapt to new technologies, skills, and ways of thinking. What comes to mind when you hear the word “metadata”? Is it intimidating? Do metadatists and catalogers explain the term adequately? While this article by no means captures all there is about metadata, it is intended to provide librarians with a basic understanding of what is involved in metadata work.


Fast Tracking Of Near-Duplicate Keyframes In Broadcast Domain With Transitivity Propagation, Chong-Wah Ngo, Wan-Lei Zhao, Yu-Gang Jiang Oct 2006

Research Collection School Of Computing and Information Systems

The identification of near-duplicate keyframe (NDK) pairs is a useful task for a variety of applications such as news story threading and content-based video search. In this paper, we propose a novel approach for the discovery and tracking of NDK pairs and threads in the broadcast domain. The detection of NDKs in a large data set is a challenging task because, when the data set grows linearly, the computational cost grows quadratically, and so does the number of false alarms. This paper explores the symmetric and transitive nature of near-duplicates for the effective …
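
The symmetric and transitive property mentioned in this abstract can be illustrated with a union-find structure: if keyframe A is a near-duplicate of B, and B of C, then all three land in one thread without ever comparing A and C directly. The sketch below is only an illustration under that assumption, not the authors' method; the function name `group_near_duplicates` and the keyframe ids are hypothetical, and the pairwise detections are taken as given.

```python
from collections import defaultdict


def group_near_duplicates(pairs):
    """Group keyframes into threads from already-detected near-duplicate pairs.

    `pairs` is an iterable of (keyframe_id, keyframe_id) tuples (hypothetical ids).
    Symmetry and transitivity are applied with a union-find forest.
    """
    parent = {}

    def find(x):
        # Register unseen keyframes as their own root, then walk to the root.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two threads

    threads = defaultdict(set)
    for keyframe in list(parent):
        threads[find(keyframe)].add(keyframe)
    return sorted(sorted(t) for t in threads.values())


if __name__ == "__main__":
    detected = [("k1", "k2"), ("k2", "k3"), ("k4", "k5")]
    print(group_near_duplicates(detected))
```

Here `k1` and `k3` end up in the same thread although they were never compared, which is the saving that transitivity offers over exhaustive pairwise matching.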


Extracting Link Chains Of Relationship Instances From A Website, Myo-Myo Naing, Ee Peng Lim, Roger Hsiang-Li Chiang Oct 2006

Research Collection School Of Computing and Information Systems

Web pages from a Web site can often be associated with concepts in an ontology, and pairs of Web pages also can be associated with relationships between concepts. With such associations, the Web site can be searched, browsed, or even reorganized based on the concept and relationship labels of its Web pages. In this article, we study the link chain extraction problem that is critical to the extraction of Web pages that are related. A link chain is an ordered list of anchor elements linking two Web pages related by some semantic relationship. We propose a link chain extraction method …


Service Pattern Discovery Of Web Service Mining In Web Service Registry-Repository, Qianhui Althea Liang, Jen-Yao Chung, Steven M. Miller, Yang Ouyang Oct 2006

Research Collection School Of Computing and Information Systems

This paper presents and elaborates the concept of Web service usage patterns and pattern discovery through service mining. We define three different levels of service usage data: i) user request level, ii) template level and iii) instance level. At each level, we investigate patterns of service usage data and the discovery of these patterns. An algorithm for service pattern discovery at the template level is presented. We show the system architecture of a service-mining enabled service registry repository. Web service patterns, pattern discovery and pattern mining support the discovery and composition of complex services, which in turn supports the application …


Audio Similarity Measure By Graph Modeling And Matching, Yuxin Peng, Chong-Wah Ngo, Cuihua Fang, Xiaoou Chen, Jianguo Xiao Oct 2006

Research Collection School Of Computing and Information Systems

This paper proposes a new approach for the similarity measure and ranking of audio clips by graph modeling and matching. Instead of using frame-based or salient-based features to measure the acoustical similarity of audio clips, segment-based similarity is proposed. The novelty of our approach lies in two aspects: segment-based representation, and the similarity measure and ranking based on four kinds of similarity factors. In segment-based representation, segments not only capture the change property of an audio clip, but also keep and present the change relation and temporal order of audio features. In the similarity measure and ranking, four kinds of similarity …


Natural Document Clustering By Clique Percolation In Random Graphs, Wei Gao, Kam-Fai Wong Oct 2006

Research Collection School Of Computing and Information Systems

Document clustering techniques mostly depend on models that impose explicit and/or implicit a priori assumptions as to the number, size, and disjunction characteristics of clusters, and/or the probability distribution of clustered data. As a result, the clustering effects tend to be unnatural and stray away more or less from the intrinsic grouping nature among the documents in a corpus. We propose a novel graph-theoretic technique called Clique Percolation Clustering (CPC). It models clustering as a process of enumerating adjacent maximal cliques in a random graph that unveils inherent structure of the underlying data, in which we unleash the commonly practiced constraints in …
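
As a rough illustration of the idea of merging adjacent maximal cliques into overlapping clusters, the sketch below implements a generic clique percolation step in plain Python. It is not the authors' CPC procedure: the adjacency rule (two cliques are adjacent when they share at least k-1 nodes), the parameter `k`, and the function names are assumptions made for this example, and the document similarity graph is taken as given.

```python
def maximal_cliques(adj):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(frozenset(r))  # r cannot be extended: it is maximal
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}  # v is done as a candidate
            x = x | {v}  # and must be excluded from later cliques

    expand(set(), set(adj), set())
    return cliques


def clique_percolation(edges, k=3):
    """Merge maximal cliques of size >= k that overlap in at least k-1 nodes.

    Returns a sorted list of clusters; a node may appear in several clusters.
    """
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)

    cliques = [c for c in maximal_cliques(adj) if len(c) >= k]
    clusters, seen = [], set()
    for start in range(len(cliques)):
        if start in seen:
            continue
        seen.add(start)
        stack, members = [start], set()
        while stack:  # flood-fill over the clique adjacency relation
            i = stack.pop()
            members |= cliques[i]
            for j in range(len(cliques)):
                if j not in seen and len(cliques[i] & cliques[j]) >= k - 1:
                    seen.add(j)
                    stack.append(j)
        clusters.append(sorted(members))
    return sorted(clusters)


if __name__ == "__main__":
    # Two triangles of documents that share only document "c".
    links = [("a", "b"), ("b", "c"), ("a", "c"),
             ("c", "d"), ("d", "e"), ("c", "e")]
    print(clique_percolation(links, k=3))
```

In the usage example the two triangles share a single document, which is below the k-1 overlap threshold, so they remain separate clusters that both contain `c`. This overlapping membership is what distinguishes the approach from partitioning methods that force each document into exactly one cluster.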