Open Access. Powered by Scholars. Published by Universities.®

Databases and Information Systems Commons

Articles 1 - 30 of 56

Full-Text Articles in Databases and Information Systems

Towards Effective Content-Based Music Retrieval With Multiple Acoustic Feature Composition, Jialie Shen, John Shepherd, Ngu Ahh Dec 2006

Research Collection School Of Computing and Information Systems

In this paper, we present a new approach to constructing music descriptors to support efficient content-based music retrieval and classification. The system applies multiple musical properties combined with a hybrid architecture based on principal component analysis (PCA) and a multilayer perceptron neural network. This architecture enables straightforward incorporation of multiple musical feature vectors, based on properties such as timbral texture, pitch, and rhythm structure, into a single low-dimensioned vector that is more effective for classification than the larger individual feature vectors. The use of supervised training enables incorporation of human musical perception that further enhances the classification process. We compare …


Query-Based Watermarking For XML Data, Xuan Zhou, Hwee Hwa Pang, Kian-Lee Tan Dec 2006

Research Collection School Of Computing and Information Systems

As an increasing amount of XML data is exchanged over the internet, copyright protection of this type of data is becoming an important requirement for many applications. In this paper, we introduce a rights protection scheme for XML data based on digital watermarking. One of the main challenges for watermarking XML data is that the data could be easily reorganized by an adversary in an attempt to destroy any embedded watermark. To overcome this, we propose a query-based watermarking scheme, which creates queries to identify available watermarking capacity, such that watermarks could be recovered from reorganized data through query rewriting. The …


Clique Percolation For Finding Naturally Cohesive And Overlapping Document Clusters, Wei Gao, Kam-Fai Wong, Yunqing Xia, Ruifeng Xu Dec 2006

Research Collection School Of Computing and Information Systems

Techniques for finding document clusters mostly depend on models that impose strong explicit and/or implicit a priori assumptions. As a consequence, the clustering effects tend to be unnatural and stray away from the intrinsic grouping nature of a document collection. We apply a novel graph-theoretic technique called the Clique Percolation Method (CPM) for document clustering. In this method, a process of enumerating highly cohesive maximal document cliques is performed in a random graph, where those strongly adjacent cliques are mingled to form naturally overlapping clusters. Our clustering results can unveil the inherent structural connections of the underlying data. Experiments show that CPM …
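
The clique-merging idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it follows the commonly used formulation of clique percolation, in which maximal cliques of at least k nodes are merged whenever they share at least k - 1 nodes. The document graph, the names `maximal_cliques` and `clique_percolation`, and the assumption that an edge links two sufficiently similar documents are all hypothetical.

```python
from itertools import combinations

def maximal_cliques(adj):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques

def clique_percolation(adj, k):
    """Merge maximal cliques (size >= k) that share at least k - 1 nodes."""
    cliques = [c for c in maximal_cliques(adj) if len(c) >= k]
    parent = list(range(len(cliques)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) >= k - 1:
            parent[find(i)] = find(j)

    clusters = {}
    for i, clique in enumerate(cliques):
        clusters.setdefault(find(i), set()).update(clique)
    return sorted(sorted(c) for c in clusters.values())

# Hypothetical document graph: an edge means two documents are similar enough.
edges = [("d1", "d2"), ("d1", "d3"), ("d2", "d3"), ("d2", "d4"),
         ("d3", "d4"), ("d4", "d5"), ("d4", "d6"), ("d5", "d6")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

clusters = clique_percolation(adj, k=3)
print(clusters)
```

In this toy graph, document `d4` ends up in both clusters, which is the kind of overlap the abstract refers to.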


Designing Web Sites For Customer Loyalty Across Business Domains: A Multilevel Analysis, S. Mithas, Narayanasamy Ramasubbu, M. S. Krishnan, C. Fornell Dec 2006

Research Collection School Of Computing and Information Systems

Web sites are important components of Internet strategy for organizations. This paper develops a theoretical model for understanding the effect of Web site design elements on customer loyalty to a Web site. We show the relevance of the business domain of a Web site to gaining a contextual understanding of the relative importance of Web site design elements. We use a hierarchical linear modeling approach to model multilevel and cross-level interactions that have not been explicitly considered in previous research. By analyzing data on more than 12,000 online customer surveys for 43 Web sites in several business domains, we find that …


Continuous Monitoring Of kNN Queries In Wireless Sensor Networks, Yuxia Yao, Xueyan Tang, Ee Peng Lim Dec 2006

Research Collection School Of Computing and Information Systems

Wireless sensor networks have been widely used for civilian and military applications, such as environmental monitoring and vehicle tracking. In these applications, continuous query processing is often required, and its efficient evaluation is a critical requirement. Due to the limited power supply of sensor nodes, energy efficiency is a major performance measure in such query evaluation. In this paper, we focus on continuous kNN query processing. We observe that centralized data storage and monitoring schemes do not favor energy efficiency. We therefore propose a localized scheme to monitor long-running nearest neighbor queries in sensor networks. …


Measuring Qualities Of Articles Contributed By Online Communities, Ee Peng Lim, Ba-Quy Vuong, Hady W. Lauw, Aixin Sun Dec 2006

Research Collection School Of Computing and Information Systems

Using open source Web editing software (e.g., wiki), online community users can now easily edit, review and publish articles collaboratively. While much useful knowledge can be derived from these articles, content users and critics are often concerned about their quality. In this paper, we develop two models, namely the basic model and the peer review model, for measuring the quality of these articles and the authority of their contributors. We represent collaboratively edited articles and their contributors in a bipartite graph. While the basic model measures an article's quality using both the authorities of contributors and the amount of contribution from each …


Towards Effective Content-Based Music Retrieval With Multiple Acoustic Feature Combination, Jialie Shen, John Shepherd, Ann H. H. Ngu Dec 2006

Research Collection School Of Computing and Information Systems

In this paper, we present a new approach to constructing music descriptors to support efficient content-based music retrieval and classification. The system applies multiple musical properties combined with a hybrid architecture based on principal component analysis (PCA) and a multilayer perceptron neural network. This architecture enables straightforward incorporation of multiple musical feature vectors, based on properties such as timbral texture, pitch, and rhythm structure, into a single low-dimensioned vector that is more effective for classification than the larger individual feature vectors. The use of supervised training enables incorporation of human musical perception that further enhances the classification process. We compare …


On The Lower Bound Of Local Optimums In K-Means Algorithms, Zhenjie Zhang, Bing Tian Dai, Anthony K.H. Tung Dec 2006

Research Collection School Of Computing and Information Systems

No abstract provided.


Rapid Identification Of Column Heterogeneity, Bing Tian Dai, Nick Koudas, Beng Chin Ooi, Divesh Srivastava, Suresh Venkatasubramanian Dec 2006

Research Collection School Of Computing and Information Systems

No abstract provided.


Integration Of Wikipedia And A Geography Digital Library, Ee Peng Lim, Zhe Wang, Darwin Sadeli, Yuanyuan Li, Chew-Hung Chang, Kalyani Chatterjea, Dion Hoe-Lian Goh, Yin-Leng Theng, Jun Zhang, Aixin Sun Nov 2006

Research Collection School Of Computing and Information Systems

In this paper, we address the problem of integrating Wikipedia, an online encyclopedia, and G-Portal, a web-based digital library, in the geography domain. The integration facilitates the sharing of data and services between the two web applications that are of great value in learning. We first present an overall system architecture for supporting such an integration and address the metadata extraction problem associated with it. In metadata extraction, we focus on extracting and constructing metadata for geo-political regions, namely cities and countries. Some empirical performance results will be presented. The paper will also describe the adaptations of G-Portal and Wikipedia …


Understanding User Perceptions On Usefulness And Usability Of An Integrated Wiki-G-Portal, Yin-Leng Theng, Yuanyuan Li, Ee Peng Lim, Zhe Wang, Dion Hoe-Lian Goh, Chew-Hung Chang, Kalyani Chatterjea, Jun Zhang Nov 2006

Research Collection School Of Computing and Information Systems

This paper describes a pilot study on Wiki-G-Portal, a project integrating Wikipedia, an online encyclopedia, into G-Portal, a Web-based digital library of geography resources. Initial findings from the pilot study suggest positive perceptions of the usefulness and usability of Wiki-G-Portal, as well as of subjects' attitude and intention to use.


A Model For Anticipatory Event Detection, Qi He, Kuiyu Chang, Ee Peng Lim Nov 2006

Research Collection School Of Computing and Information Systems

Event detection is a very important area of research that discovers new events reported in a stream of text documents. Previous research in event detection has largely focused on finding the first story and tracking the events of a specific topic. A topic is simply a set of related events defined by user-supplied keywords with no associated semantics and little domain knowledge. We therefore introduce the Anticipatory Event Detection (AED) problem: given some user-preferred event transition in a topic, detect the occurrence of the transition for the stream of news covering the topic. We confine the events to …


Fast Tracking Of Near-Duplicate Keyframes In Broadcast Domain With Transitivity Propagation, Chong-Wah Ngo, Wan-Lei Zhao, Yu-Gang Jiang Oct 2006

Research Collection School Of Computing and Information Systems

The identification of near-duplicate keyframe (NDK) pairs is a useful task for a variety of applications such as news story threading and content-based video search. In this paper, we propose a novel approach for the discovery and tracking of NDK pairs and threads in the broadcast domain. The detection of NDKs in a large data set is a challenging task because, when the data set grows linearly, the computational cost grows quadratically, and so does the number of false alarms. This paper explores the symmetric and transitive nature of near-duplicates for the effective …


Extracting Link Chains Of Relationship Instances From A Website, Myo-Myo Naing, Ee Peng Lim, Roger Hsiang-Li Chiang Oct 2006

Research Collection School Of Computing and Information Systems

Web pages from a Web site can often be associated with concepts in an ontology, and pairs of Web pages also can be associated with relationships between concepts. With such associations, the Web site can be searched, browsed, or even reorganized based on the concept and relationship labels of its Web pages. In this article, we study the link chain extraction problem that is critical to the extraction of Web pages that are related. A link chain is an ordered list of anchor elements linking two Web pages related by some semantic relationship. We propose a link chain extraction method …


Service Pattern Discovery Of Web Service Mining In Web Service Registry-Repository, Qianhui Althea Liang, Jen-Yao Chung, Steven M. Miller, Yang Ouyang Oct 2006

Research Collection School Of Computing and Information Systems

This paper presents and elaborates the concept of Web service usage patterns and pattern discovery through service mining. We define three different levels of service usage data: i) user request level, ii) template level and iii) instance level. At each level, we investigate patterns of service usage data and the discovery of these patterns. An algorithm for service pattern discovery at the template level is presented. We show the system architecture of a service-mining-enabled service registry-repository. Web service patterns, pattern discovery and pattern mining support the discovery and composition of complex services, which in turn supports the application …


Audio Similarity Measure By Graph Modeling And Matching, Yuxin Peng, Chong-Wah Ngo, Cuihua Fang, Xiaoou Chen, Jianguo Xiao Oct 2006

Research Collection School Of Computing and Information Systems

This paper proposes a new approach for the similarity measure and ranking of audio clips by graph modeling and matching. Instead of using frame-based or salient-based features to measure the acoustical similarity of audio clips, segment-based similarity is proposed. The novelty of our approach lies in two aspects: segment-based representation, and the similarity measure and ranking based on four kinds of similarity factors. In segment-based representation, segments not only capture the change property of an audio clip, but also keep and present the change relation and temporal order of audio features. In the similarity measure and ranking, four kinds of similarity …


Natural Document Clustering By Clique Percolation In Random Graphs, Wei Gao, Kam-Fai Wong Oct 2006

Research Collection School Of Computing and Information Systems

Document clustering techniques mostly depend on models that impose explicit and/or implicit a priori assumptions as to the number, size, and disjunction characteristics of clusters, and/or the probability distribution of clustered data. As a result, the clustering effects tend to be unnatural and stray away more or less from the intrinsic grouping nature among the documents in a corpus. We propose a novel graph-theoretic technique called Clique Percolation Clustering (CPC). It models clustering as a process of enumerating adjacent maximal cliques in a random graph that unveils inherent structure of the underlying data, in which we unleash the commonly practiced constraints in …


Wireless Indoor Positioning System With Enhanced Nearest Neighbors In Signal Space Algorithm, Quang Tran, Juki Wirawan Tantra, Ah-Hwee Tan, Kin-Choong Yow, Dongyu Qiu Sep 2006

Research Collection School Of Computing and Information Systems

With the rapid development and wide deployment of wireless Local Area Networks (WLANs), WLAN-based positioning systems employing signal-strength-based techniques have become an attractive solution for location estimation in indoor environments. In recent years, a number of such systems have been presented, and most of them use the common Nearest Neighbor in Signal Space (NNSS) algorithm. In this paper, we propose an enhancement to the NNSS algorithm. We analyze the enhancement to show its effectiveness. The performance of the enhanced NNSS algorithm is evaluated with different values of the parameters. Based on the performance evaluation and analysis, we recommend some …
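
For orientation, the baseline NNSS idea that this abstract builds on can be sketched as follows. This shows only the common baseline, not the enhancement proposed in the paper, which the abstract does not describe. The radio map values, the Euclidean distance in signal space, and the name `nnss_locate` are illustrative assumptions.

```python
import math

def nnss_locate(radio_map, observed):
    """Return the surveyed location whose stored signal-strength vector
    is closest (Euclidean distance in signal space) to the observed vector."""
    best_location, best_distance = None, math.inf
    for location, fingerprint in radio_map.items():
        distance = math.dist(fingerprint, observed)
        if distance < best_distance:
            best_location, best_distance = location, distance
    return best_location, best_distance

# Hypothetical radio map: (x, y) position in metres -> mean received
# signal strength in dBm from three access points.
radio_map = {
    (0.0, 0.0): (-40.0, -70.0, -80.0),
    (5.0, 0.0): (-55.0, -60.0, -75.0),
    (5.0, 5.0): (-70.0, -45.0, -60.0),
}

location, distance = nnss_locate(radio_map, (-57.0, -62.0, -73.0))
print(location, round(distance, 2))
```

The lookup is a plain linear scan over the surveyed points, which is adequate for a small radio map.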


Discovering Image-Text Associations For Cross-Media Web Information Fusion, Tao Jiang, Ah-Hwee Tan Sep 2006

Research Collection School Of Computing and Information Systems

The diverse and distributed nature of the information published on the World Wide Web has made it difficult to collate and track information related to specific topics. Whereas most existing work on web information fusion has focused on multiple document summarization, this paper presents a novel approach for discovering associations between images and text segments, which subsequently can be used to support cross-media web content summarization. Specifically, we employ a similarity-based multilingual retrieval model and adopt a vague transformation technique for measuring the information similarity between visual features and textual features. The experimental results on a terrorist domain document set …


Three Architectures For Trusted Data Dissemination In Edge Computing, Shen-Tat Goh, Hwee Hwa Pang, Robert H. Deng, Feng Bao Sep 2006

Research Collection School Of Computing and Information Systems

Edge computing pushes application logic and the underlying data to the edge of the network, with the aim of improving availability and scalability. As the edge servers are not necessarily secure, there must be provisions for users to validate the results—that values in the result tuples are not tampered with, that no qualifying data are left out, that no spurious tuples are introduced, and that a query result is not actually the output from a different query. This paper aims to address the challenges of ensuring data integrity in edge computing. We study three schemes that enable users to check …


Multi-Learner Based Recursive Supervised Training, Laxmi R. Iyer, Kiruthika Ramanathan, Sheng-Uei Guan Sep 2006

Research Collection School Of Computing and Information Systems

In this paper, we propose the multi-learner based recursive supervised training (MLRT) algorithm, which uses the existing framework of recursive task decomposition, by training the entire dataset, picking out the best learnt patterns, and then repeating the process with the remaining patterns. Instead of having a single learner to classify all datasets during each recursion, an appropriate learner is chosen from a set of three learners, based on the subset of data being trained, thereby avoiding the time overhead associated with the genetic algorithm learner utilized in previous approaches. In this way MLRT seeks to identify the inherent characteristics of …


Continuous Nearest Neighbor Monitoring In Road Networks, Kyriakos Mouratidis, Man Lung Yiu, Dimitris Papadias, Nikos Mamoulis Sep 2006

Research Collection School Of Computing and Information Systems

Recent research has focused on continuous monitoring of nearest neighbors (NN) in highly dynamic scenarios, where the queries and the data objects move frequently and arbitrarily. All existing methods, however, assume the Euclidean distance metric. In this paper we study k-NN monitoring in road networks, where the distance between a query and a data object is determined by the length of the shortest path connecting them. We propose two methods that can handle arbitrary object and query moving patterns, as well as fluctuations of edge weights. The first one maintains the query results by processing only updates that may invalidate …
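
The network distance used in this abstract can be illustrated with a one-off (snapshot) k-NN search. This is a sketch under simplifying assumptions, not either of the monitoring methods the paper proposes: data objects are assumed to sit on network nodes, the graph is a hypothetical adjacency list, and `network_knn` is an illustrative name. It expands the network from the query node with Dijkstra's algorithm and stops once k objects have been reached.

```python
import heapq

def network_knn(graph, objects_at, source, k):
    """Return the k data objects with the smallest shortest-path distance
    from `source`, as (object, distance) pairs in increasing distance."""
    best = {source: 0.0}
    frontier = [(0.0, source)]
    settled = set()
    result = []
    while frontier and len(result) < k:
        dist, node = heapq.heappop(frontier)
        if node in settled:
            continue
        settled.add(node)
        # Nodes are settled in nondecreasing distance, so objects found
        # here are appended in the right order.
        for obj in objects_at.get(node, ()):
            result.append((obj, dist))
        for neighbor, weight in graph.get(node, ()):
            candidate = dist + weight
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                heapq.heappush(frontier, (candidate, neighbor))
    return result[:k]

# Hypothetical road network: node -> list of (neighbor, edge length).
graph = {
    "a": [("b", 2.0), ("c", 5.0)],
    "b": [("a", 2.0), ("c", 1.0), ("d", 4.0)],
    "c": [("a", 5.0), ("b", 1.0), ("d", 1.0)],
    "d": [("b", 4.0), ("c", 1.0)],
}
objects_at = {"b": ["taxi1"], "c": ["taxi3"], "d": ["taxi2"]}

nearest = network_knn(graph, objects_at, "a", k=2)
print(nearest)
```

Rerunning such a search after every object, query, or edge-weight update would be the naive way to keep results current; the abstract indicates the paper's methods avoid this by processing only the updates that can affect a result.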


Masking Page Reference Patterns In Encryption Databases On Untrusted Storage, Xi Ma, Hwee Hwa Pang, Kian-Lee Tan Sep 2006

Research Collection School Of Computing and Information Systems

To support ubiquitous computing, the underlying data have to be persistent and available anywhere, anytime. The data thus have to migrate from devices that are local to individual computers to shared storage volumes that are accessible over open networks. This potentially exposes the data to heightened security risks. In particular, the activity on a database exhibits regular page reference patterns that could help attackers learn logical links among physical pages and then launch additional attacks. We propose two countermeasures to mitigate the risk of attacks initiated through analyzing the shared storage server’s activity for those page patterns. The first countermeasure relocates …


CUHK At ImageCLEF 2005: Cross-Language And Cross Media Image Retrieval, Steven Hoi, Jianke Zhu, Michael R. Lyu Sep 2006

Research Collection School Of Computing and Information Systems

In this paper, we describe our studies of cross-language and cross-media image retrieval at ImageCLEF 2005. This is the first participation of our CUHK (The Chinese University of Hong Kong) group at ImageCLEF. The task in which we participated is the “bilingual ad hoc retrieval” task. Our participation has three major focuses and contributions. The first is the empirical evaluation of language models and smoothing strategies for cross-language image retrieval. The second is the evaluation of cross-media image retrieval, i.e., combining text and visual contents for image retrieval. The last is the evaluation of bilingual image retrieval …


Collaborative Image Retrieval Via Regularized Metric Learning, Luo Si, Rong Jin, Steven C. H. Hoi, Michael R. Lyu Aug 2006

Research Collection School Of Computing and Information Systems

In content-based image retrieval (CBIR), relevant images are identified based on their similarities to query images. Most CBIR algorithms are hindered by the semantic gap between the low-level image features used for computing image similarity and the high-level semantic concepts conveyed in images. One way to reduce the semantic gap is to utilize the log data of users' feedback that CBIR systems have collected over time, an approach also called “collaborative image retrieval.” In this paper, we present a novel metric learning approach, named “regularized metric learning,” for collaborative image retrieval, which learns a distance metric by exploring …


A Hybrid Architecture Combining Reactive Plan Execution And Reactive Learning, Samin Karim, Liz Sonenberg, Ah-Hwee Tan Aug 2006

Research Collection School Of Computing and Information Systems

Developing software agents has been complicated by the problem of how knowledge should be represented and used. Many researchers have observed that agents do not require complex representations, and that in many cases it suffices to use “the world” as their representation. However, the problem of introspection, both by the agents themselves and by (human) domain experts, requires a knowledge representation with a higher level of abstraction that is more ‘understandable’. Learning and adaptation in agents has traditionally required knowledge to be represented at an arbitrary, low level of abstraction. We seek to create an agent that has the capability …


An Energy-Efficient And Access Latency Optimized Indexing Scheme For Wireless Data Broadcast, Yuxia Yao, Xueyan Tang, Ee Peng Lim, Aixin Sun Aug 2006

Research Collection School Of Computing and Information Systems

Data broadcast is an attractive data dissemination method in mobile environments. To improve energy efficiency, existing air indexing schemes for data broadcast have focused on reducing tuning time only, i.e., the duration that a mobile client stays active in data accesses. On the other hand, existing broadcast scheduling schemes have aimed at reducing access latency through nonflat data broadcast to improve responsiveness only. Not much work has addressed the energy efficiency and responsiveness issues concurrently. This paper proposes an energy-efficient indexing scheme called MHash that optimizes tuning time and access latency in an integrated fashion. MHash reduces tuning time by …


Learning The Unified Kernel Machines For Classification, Steven C. H. Hoi, Michael R. Lyu, Edward Y. Chang Aug 2006

Research Collection School Of Computing and Information Systems

Kernel machines have been shown to be state-of-the-art learning techniques for classification. In this paper, we propose a novel general framework of learning the Unified Kernel Machines (UKM) from both labeled and unlabeled data. Our proposed framework integrates supervised learning, semi-supervised kernel learning, and active learning in a unified solution. In the suggested framework, we particularly focus our attention on designing a new semi-supervised kernel learning method, i.e., Spectral Kernel Learning (SKL), which is built on the principles of kernel target alignment and unsupervised kernel design. Our algorithm is related to an equivalent quadratic programming problem that can be efficiently …


Bias And Controversy: Beyond The Statistical Deviation, Hady W. Lauw, Ee Peng Lim, Ke Wang Aug 2006

Research Collection School Of Computing and Information Systems

In this paper, we investigate how deviation in evaluation activities may reveal bias on the part of reviewers and controversy on the part of evaluated objects. We focus on a 'data-centric approach' where the evaluation data is assumed to represent the 'ground truth'. The standard statistical approaches take evaluation and deviation at face value. We argue that attention should be paid to the subjectivity of evaluation, judging the evaluation score not just on 'what is being said' (deviation), but also on 'who says it' (reviewer) as well as on 'whom it is said about' (object). Furthermore, we observe that bias …


Hierarchical Hidden Markov Model For Rushes Structuring And Indexing, Chong-Wah Ngo, Zailiang Pan, Xiaoyong Wei Jul 2006

Research Collection School Of Computing and Information Systems

Rushes footage is considered a cheap gold mine with the potential for reuse in the broadcasting and filmmaking industries. However, it is difficult to mine the "gold" from the rushes since usually only minimal metadata is available. This paper focuses on the structuring and indexing of the rushes to facilitate mining and retrieval of the "gold". We present a new approach for rushes structuring and indexing based on motion features. We model the problem by a two-level Hierarchical Hidden Markov Model (HHMM). The HHMM, on one hand, represents the semantic concepts in its higher level to provide simultaneous structuring and indexing, on …