Open Access. Powered by Scholars. Published by Universities.®

Databases and Information Systems Commons

Full-Text Articles in Databases and Information Systems

Authenticating The Query Results Of Text Search Engines, Hwee Hwa Pang, Kyriakos Mouratidis Aug 2008

Research Collection School Of Computing and Information Systems

The number of successful attacks on the Internet shows that it is very difficult to guarantee the security of online search engines. A breached server that is not detected in time may return incorrect results to the users. To prevent that, we introduce a methodology for generating an integrity proof for each search result. Our solution is targeted at search engines that perform similarity-based document retrieval, and utilize an inverted list implementation (as most search engines do). We formulate the properties that define a correct result, map the task of processing a text search query to adaptations of existing threshold-based …


Active Kernel Learning, Steven C. H. Hoi, Rong Jin Jul 2008

Research Collection School Of Computing and Information Systems

Identifying the appropriate kernel function/matrix for a given dataset is essential to all kernel-based learning techniques. A number of kernel learning algorithms have been proposed to learn kernel functions or matrices from side information (e.g., either labeled examples or pairwise constraints). However, most previous studies are limited to “passive” kernel learning in which side information is provided beforehand. In this paper we present a framework of Active Kernel Learning (AKL) that actively identifies the most informative pairwise constraints for kernel learning. The key challenge of active kernel learning is how to measure the informativeness of an example pair given its …


Estimating Local Optimums In EM Algorithm Over Gaussian Mixture Model, Zhenjie Zhang, Bing Tian Dai, Anthony K.H. Tung Jul 2008

Research Collection School Of Computing and Information Systems

The EM algorithm is a very popular iteration-based method for estimating the parameters of a Gaussian Mixture Model from a large observation set. However, in most cases, the EM algorithm is not guaranteed to converge to the global optimum. Instead, it stops at a local optimum, which can be much worse than the global optimum.


Comments-Oriented Document Summarization: Understanding Documents With Readers' Feedback, Meishan Hu, Aixin Sun, Ee Peng Lim Jul 2008

Research Collection School Of Computing and Information Systems

Comments left by readers on Web documents contain valuable information that can be utilized in different information retrieval tasks including document search, visualization, and summarization. In this paper, we study the problem of comments-oriented document summarization and aim to summarize a Web document (e.g., a blog post) by considering not only its content, but also the comments left by its readers. We identify three relations (namely, topic, quotation, and mention) by which comments can be linked to one another, and model the relations in three graphs. The importance of each comment is then scored by: (i) graph-based method, where the …


A Self-Organizing Neural Model For Multimedia Information Fusion, Luong-Dong Nguyen, Kia-Yan Woon, Ah-Hwee Tan Jul 2008

Research Collection School Of Computing and Information Systems

This paper presents a self-organizing network model for the fusion of multimedia information. By synchronizing the encoding of information across multiple media channels, the neural model known as fusion Adaptive Resonance Theory (fusion ART) generates clusters that encode the associative mappings across multimedia information in a real-time and continuous manner. In addition, by incorporating a semantic category channel, fusion ART further enables multimedia information to be fused into predefined themes or semantic categories. We illustrate the fusion ART’s functionalities through experiments on two multimedia data sets in the terrorist domain and show the viability of the proposed approach.


User Guidance Of Resource-Adaptive Systems, João Pedro Sousa, Rajesh Krishna Balan, Vahe Poladian, David Garlan, Mahadev Satyanarayanan Jul 2008

Research Collection School Of Computing and Information Systems

This paper presents a framework for engineering resource-adaptive software systems targeted at small mobile devices. The proposed framework empowers users to control tradeoffs among a rich set of service-specific aspects of quality of service. After motivating the problem, the paper proposes a model for capturing user preferences with respect to quality of service, and illustrates prototype user interfaces to elicit such models. The paper then describes the extensions and integration work made to accommodate the proposed framework on top of an existing software infrastructure for ubiquitous computing. The research question addressed here is the feasibility of coordinating resource allocation and …


Ranked Reverse Nearest Neighbor Search, Ken C. K. Lee, Baihua Zheng, Wang-Chien Lee Jul 2008

Research Collection School Of Computing and Information Systems

Given a set of data points P and a query point q in a multidimensional space, a Reverse Nearest Neighbor (RNN) query finds data points in P whose nearest neighbors are q. A Reverse k-Nearest Neighbor (RkNN) query (where k ≥ 1) generalizes the RNN query to find data points whose kNNs include q. Under RkNN query semantics, q is said to have influence on all those answer data points. The degree of q's influence on a data point p (∈ P) is denoted by κp where q is the κp-th NN of p. We introduce a new variant of RNN query, namely, …
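
As a rough illustration of the RNN and RkNN definitions in this abstract, the following brute-force Python sketch computes κp for each data point and returns the points with κp ≤ k. The function names `influence_rank` and `reverse_knn` are hypothetical, the data points are assumed to be distinct coordinate tuples, and this is not the algorithm proposed in the paper.

```python
from math import dist


def influence_rank(p, q, points):
    """Return kappa_p, the rank of q among the neighbors of p.

    The neighbors of p are the other data points plus the query q.
    Assumes the data points are distinct.
    """
    d_q = dist(p, q)
    # Count data points (other than p) strictly closer to p than q is.
    closer = sum(1 for o in points if o != p and dist(p, o) < d_q)
    return closer + 1


def reverse_knn(q, points, k=1):
    """Brute-force RkNN: data points whose k nearest neighbors include q."""
    return [p for p in points if influence_rank(p, q, points) <= k]


# Toy one-dimensional layout embedded in the plane.
P = [(0, 0), (3, 0), (10, 0), (11, 0)]
q = (4, 0)
print(reverse_knn(q, P, k=1))  # only (3, 0) has q as its nearest neighbor
print(reverse_knn(q, P, k=2))  # every point has q among its 2 nearest
```

The sketch runs in quadratic time, which is fine for a toy example but not for the large datasets an indexed method would target.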


Semi-Supervised Ensemble Ranking, Steven C. H. Hoi, Rong Jin Jul 2008

Research Collection School Of Computing and Information Systems

Ranking plays a central role in many Web search and information retrieval applications. Ensemble ranking, sometimes called meta-search, aims to improve the retrieval performance by combining the outputs from multiple ranking algorithms. Many ensemble ranking approaches employ supervised learning techniques to learn appropriate weights for combining multiple rankers. The main shortcoming with these approaches is that the learned weights for ranking algorithms are query independent. This is suboptimal since a ranking algorithm could perform well for certain queries but poorly for others. In this paper, we propose a novel semi-supervised ensemble ranking (SSER) algorithm that learns query-dependent weights when combining …


Tree-Based Partition Querying: A Methodology For Computing Medoids In Large Spatial Datasets, Kyriakos Mouratidis, Dimitris Papadias, Spiros Papadimitriou Jul 2008

Research Collection School Of Computing and Information Systems

Besides traditional domains (e.g., resource allocation, data mining applications), algorithms for medoid computation and related problems will play an important role in numerous emerging fields, such as location based services and sensor networks. Since the k-medoid problem is NP hard, all existing work deals with approximate solutions on relatively small datasets. This paper aims at efficient methods for very large spatial databases, motivated by: (i) the high and ever increasing availability of spatial data, and (ii) the need for novel query types and improved services. The proposed solutions exploit the intrinsic grouping properties of a data partition index in order …


A Multimodal And Multilevel Ranking Scheme For Large-Scale Video Retrieval, Steven C. H. Hoi, Michael R. Lyu Jun 2008

Research Collection School Of Computing and Information Systems

A critical issue of large-scale multimedia retrieval is how to develop an effective framework for ranking the search results. This problem is particularly challenging for content-based video retrieval due to some issues such as short text queries, insufficient sample learning, fusion of multimodal contents, and large-scale learning with huge media data. In this paper, we propose a novel multimodal and multilevel (MMML) ranking framework to attack the challenging ranking problem of content-based video retrieval. We represent the video retrieval task by graphs and suggest a graph based semi-supervised ranking (SSR) scheme, which can learn with small samples effectively and integrate …


Visual Analytics For Supporting Entity Relationship Discovery On Text Data, Hanbo Dai, Ee Peng Lim, Hady W. Lauw, Hwee Hwa Pang Jun 2008

Research Collection School Of Computing and Information Systems

To conduct content analysis over text data, one may look out for important named objects and entities that refer to real world instances, synthesizing them into knowledge relevant to a given information seeking task. In this paper, we introduce a visual analytics tool called ER-Explorer to support such an analysis task. ER-Explorer consists of a data model known as TUBE and a set of data manipulation operations specially designed for examining entities and relationships in text. As part of TUBE, a set of interestingness measures is defined to help explore entities and their relationships. We illustrate the use of ER-Explorer …


WikiNetViz: Visualizing Friends And Adversaries In Implicit Social Networks, Minh-Tam Le, Hoang-Vu Dang, Ee Peng Lim, Anwitaman Datta Jun 2008

Research Collection School Of Computing and Information Systems

When multiple users with diverse backgrounds and beliefs edit Wikipedia together, disputes often arise due to disagreements among the users. In this paper, we introduce a novel visualization tool known as WikiNetViz to visualize and analyze disputes among users in a dispute-induced social network. WikiNetViz is designed to quantify the degree of dispute between a pair of users using the article history. Each user (and article) is also assigned a controversy score by our proposed controversy rank model so as to measure the degree of controversy of a user (and an article) by the amount of disputes between the user …


An Office Survival Guide, M. Thulasidas Jun 2008

Research Collection School Of Computing and Information Systems

In the unforgiving, dog-eat-dog corporate jungle, when you find yourself in a new corporate setting, you need to be sure of the welcome. More importantly, you need to prove yourself worthy of it.


Self-Organizing Neural Models Integrating Rules And Reinforcement Learning, Teck-Hou Teng, Zhong-Ming Tan, Ah-Hwee Tan Jun 2008

Research Collection School Of Computing and Information Systems

Traditional approaches to integrating knowledge into neural networks are concerned mainly with supervised learning. This paper presents how a family of self-organizing neural models known as fusion architecture for learning, cognition and navigation (FALCON) can incorporate a priori knowledge and perform knowledge refinement and expansion through reinforcement learning. Symbolic rules are formulated based on pre-existing know-how and inserted into FALCON as a priori knowledge. The availability of knowledge enables FALCON to start performing earlier in the initial learning trials. Through a temporal-difference (TD) learning method, the inserted rules can be refined and expanded according to the evaluative feedback signals received …


Context Modeling With Evolutionary Fuzzy Cognitive Map In Interactive Storytelling, Yundong Cai, Chunyan Miao, Ah-Hwee Tan, Zhiqi Shen Jun 2008

Research Collection School Of Computing and Information Systems

To generate a believable and dynamic virtual world is a great challenge in interactive storytelling. In this paper, we propose a model, namely evolutionary fuzzy cognitive map (E-FCM), to model the dynamic causal relationships among different context variables. As an extension to conventional FCM, E-FCM models not only the fuzzy causal relationships among the variables, but also the probabilistic property of causal relationships, and asynchronous activity update of the concepts. With this model, the context variables evolve in a dynamic and uncertain manner with the according evolving time. As a result, the virtual world is presented more realistically and dynamically.


Capacity Constrained Assignment In Spatial Databases, Hou U Leong, Man Lung Yiu, Kyriakos Mouratidis, Nikos Mamoulis Jun 2008

Research Collection School Of Computing and Information Systems

Given a point set P of customers (e.g., WiFi receivers) and a point set Q of service providers (e.g., wireless access points), where each q ∈ Q has a capacity q.k, the capacity constrained assignment (CCA) is a matching M ⊆ Q × P such that (i) each point q ∈ Q (p ∈ P) appears at most k times (at most once) in M, (ii) the size of M is maximized (i.e., it comprises min{|P|, Σ_{q∈Q} q.k} pairs), and (iii) the total assignment cost (i.e., the sum of Euclidean distances within all pairs) is minimized. Thus, the CCA problem is …
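
To make the three CCA conditions concrete, here is a minimal brute-force Python sketch for tiny inputs. It expands each provider into as many slots as its capacity, fixes the smaller side, and enumerates ordered choices from the larger side, keeping the cheapest full-size matching. The function name `cca_brute_force` and the `(point, capacity)` input layout are assumptions made for illustration; this exhaustive search is not the method the paper proposes and is only practical for a handful of points.

```python
from itertools import permutations
from math import dist


def cca_brute_force(customers, providers):
    """Exhaustive capacity constrained assignment for tiny inputs.

    customers: list of points (tuples).
    providers: list of (point, capacity) pairs.
    Returns (total_cost, matching), where matching is a list of
    (provider_point, customer_point) pairs.
    """
    # One slot per unit of capacity, so a provider can appear up to
    # its capacity in the matching while each customer appears once.
    slots = [q for q, k in providers for _ in range(k)]
    # The matching size is the largest possible: min(|P|, total capacity).
    size = min(len(customers), len(slots))

    if len(slots) >= len(customers):
        # Every customer is served; choose which slots serve them.
        candidates = (list(zip(choice, customers))
                      for choice in permutations(slots, size))
    else:
        # Every slot is used; choose which customers are served.
        candidates = (list(zip(slots, choice))
                      for choice in permutations(customers, size))

    def cost(matching):
        # Total assignment cost: sum of Euclidean distances over all pairs.
        return sum(dist(s, p) for s, p in matching)

    best = min(candidates, key=cost)
    return cost(best), best


# Two access points with capacities 2 and 1, three receivers.
total, matching = cca_brute_force(
    [(0, 0), (1, 0), (10, 0)],
    [((0, 1), 2), ((10, 1), 1)],
)
print(total, matching)
```

In the example, the two receivers near the origin share the provider with capacity 2 and the remaining receiver takes the other provider.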


Predicting Trusts Among Users Of Online Communities: An Epinions Case Study, Haifeng Liu, Ee Peng Lim, Hady W. Lauw, Minh-Tam Le, Aixin Sun, Jaideep Srivastava, Young Ae Kim Jun 2008

Research Collection School Of Computing and Information Systems

Trust between a pair of users is an important piece of information for users in an online community (such as electronic commerce websites and product review websites) where users may rely on trust information to make decisions. In this paper, we address the problem of predicting whether a user trusts another user. Most prior work infers unknown trust ratings from known trust ratings. The effectiveness of this approach depends on the connectivity of the known web of trust and can be quite poor when the connectivity is very sparse which is often the case in an online community. In this …


Semi-Supervised SVM Batch Mode Active Learning For Image Retrieval, Steven Hoi, Rong Jin, Jianke Zhu, Michael R. Lyu Jun 2008

Research Collection School Of Computing and Information Systems

Active learning has been shown as a key technique for improving content-based image retrieval (CBIR) performance. Among various methods, support vector machine (SVM) active learning is popular for its application to relevance feedback in CBIR. However, the regular SVM active learning has two main drawbacks when used for relevance feedback. First, SVM often suffers from learning with a small number of labeled examples, which is the case in relevance feedback. Second, SVM active learning usually does not take into account the redundancy among examples, and therefore could select multiple examples in relevance feedback that are similar (or even identical) to …


Semi-Supervised Distance Metric Learning For Collaborative Image Retrieval, Steven Hoi, Wei Liu, Shih-Fu Chang Jun 2008

Research Collection School Of Computing and Information Systems

Typical content-based image retrieval (CBIR) solutions with regular Euclidean metric usually cannot achieve satisfactory performance due to the semantic gap challenge. Hence, relevance feedback has been adopted as a promising approach to improve the search performance. In this paper, we propose a novel idea of learning with historical relevance feedback log data, and adopt a new paradigm called “Collaborative Image Retrieval” (CIR). To effectively explore the log data, we propose a novel semi-supervised distance metric learning technique, called “Laplacian Regularized Metric Learning” (LRML), for learning robust distance metrics for CIR. Different from previous methods, the proposed LRML method integrates both …


Verifying Completeness Of Relational Query Answers From Online Servers, Hwee Hwa Pang, Kian-Lee Tan May 2008

Research Collection School Of Computing and Information Systems

The number of successful attacks on the Internet shows that it is very difficult to guarantee the security of online servers over extended periods of time. A breached server that is not detected in time may return incorrect query answers to users. In this article, we introduce authentication schemes for users to verify that their query answers from an online server are complete (i.e., no qualifying tuples are omitted) and authentic (i.e., all the result values are legitimate). We introduce a scheme that supports range selection, projection as well as primary key-foreign key join queries on relational databases. We also …


Stress Test, M. Thulasidas May 2008

Research Collection School Of Computing and Information Systems

Ultimately, the risk factors that create stress in professional life do not generate any reward.


Building A Web Of Trust Without Explicit Trust Ratings, Young Ae Kim, Minh-Tam Le, Hady W. Lauw, Ee Peng Lim, Haifeng Liu, Jaideep Srivastava Apr 2008

Research Collection School Of Computing and Information Systems

A satisfactory and robust trust model is gaining importance in addressing information overload, and helping users collect reliable information in online communities. Current research on trust prediction strongly relies on a web of trust, which is directly collected from users based on previous experience. However, the web of trust is not always available in online communities and even though it is available, it is often too sparse to predict the trust value between two unacquainted people with high accuracy. In this paper, we propose a framework to derive degree of trust based on users' expertise and users' affinity for certain …


Validating Multi-Column Schema Matchings By Type, Bing Tian Dai, Nick Koudas, Divesh Srivastava, Anthony K.H. Tung, Suresh Venkatasubramanian Apr 2008

Research Collection School Of Computing and Information Systems

Validation of multi-column schema matchings is essential for successful database integration. This task is especially difficult when the databases to be integrated contain little overlapping data, as is often the case in practice (e.g., customer bases of different companies). Based on the intuition that values present in different columns related by a schema matching will have similar "semantic type", and that this can be captured using distributions over values ("statistical types"), we develop a method for validating 1-1 and compositional schema matchings. Our technique is based on three key technical ideas. First, we propose a generic measure for comparing two …


E-Government Implementation: A Macro Analysis Of Singapore's E-Government Initiatives, Calvin M.L. Chan, Yi Meng Lau, Shan L. Pan Apr 2008

Research Collection School Of Computing and Information Systems

This paper offers a macro perspective of the various activities involved in the implementation of e-government through an interpretive analysis of the various e-government-related initiatives undertaken by the Singapore Government. The analysis led to the identification of four main components in the implementation of e-government, namely (i) information content, (ii) ICT infrastructure, (iii) e-government infostructure, and (iv) e-government promotion. These four components were then conceptually integrated into the e-Government Implementation Framework. This paper suggests that this framework can either be used as a descriptive tool to organize and coordinate various e-government initiatives, or be used as a prescriptive structure to …


On-Line Discovery Of Hot Motion Paths, Dimitris Sacharidis, Kostas Patroumpas, Manolis Terrovitis, Verena Kantere, Michalis Potamias, Kyriakos Mouratidis, Timos Sellis Mar 2008

Research Collection School Of Computing and Information Systems

We consider an environment of numerous moving objects, equipped with location-sensing devices and capable of communicating with a central coordinator. In this setting, we investigate the problem of maintaining hot motion paths, i.e., routes frequently followed by multiple objects over the recent past. Motion paths approximate portions of objects' movement within a tolerance margin that depends on the uncertainty inherent in positional measurements. Discovery of hot motion paths is important to applications requiring classification/profiling based on monitored movement patterns, such as targeted advertising, resource allocation, etc. To achieve this goal, we delegate part of the path extraction process to objects, …


Processing Transitive Nearest-Neighbor Queries In Multi-Channel Access Environments, Xiao Zhang, Wang-Chien Lee, Prasnjit Mitra, Baihua Zheng Mar 2008

Research Collection School Of Computing and Information Systems

Wireless broadcast is an efficient way to disseminate information due to its good scalability [10]. Existing works typically assume mobile devices, such as cell phones and PDAs, can access only one channel at a time. In this paper, we consider a near-future scenario where a mobile device has the ability to process queries using information simultaneously received from multiple channels. We focus on the query processing of the transitive nearest neighbor (TNN) search [19]. Two TNN algorithms developed for a single broadcast channel environment are adapted to our new broadcast environment. Based on the obtained insights, we propose …


Integrating Temporal Difference Methods And Self‐Organizing Neural Networks For Reinforcement Learning With Delayed Evaluative Feedback, Ah-Hwee Tan, Ning Lu, Dan Xiao Feb 2008

Research Collection School Of Computing and Information Systems

This paper presents a neural architecture for learning category nodes encoding mappings across multimodal patterns involving sensory inputs, actions, and rewards. By integrating adaptive resonance theory (ART) and temporal difference (TD) methods, the proposed neural model, called TD fusion architecture for learning, cognition, and navigation (TD-FALCON), enables an autonomous agent to adapt and function in a dynamic environment with immediate as well as delayed evaluative feedback (reinforcement) signals. TD-FALCON learns the value functions of the state-action space estimated through on-policy and off-policy TD learning methods, specifically state-action-reward-state-action (SARSA) and Q-learning. The learned value functions are then used to determine the …


On Ranking Controversies In Wikipedia: Models And Evaluation, Ba-Quy Vuong, Ee Peng Lim, Aixin Sun, Minh-Tam Le, Hady Wirawan Lauw, Kuiyu Chang Feb 2008

Research Collection School Of Computing and Information Systems

Wikipedia is a very large and successful Web 2.0 example. As the number of Wikipedia articles and contributors grows at a very fast pace, there are also increasing disputes occurring among the contributors. Disputes often happen in articles with controversial content. They also occur frequently among contributors who are "aggressive" or controversial in their personalities. In this paper, we aim to identify controversial articles in Wikipedia. We propose three models, namely the Basic model and two Controversy Rank (CR) models. These models draw clues from collaboration and edit history instead of interpreting the actual articles or edited content. While …


Face Annotation Using Transductive Kernel Fisher Discriminant, Jianke Zhu, Steven C. H. Hoi, Michael R. Lyu Jan 2008

Research Collection School Of Computing and Information Systems

Face annotation in images and videos enjoys many potential applications in multimedia information retrieval. Face annotation usually requires a large amount of hand-labeled training data in order to build effective classifiers. This is particularly challenging when annotating faces on large-scale collections of media data, in which huge labeling efforts would be very expensive. As a result, traditional supervised face annotation methods often suffer from insufficient training data. To attack this challenge, in this paper, we propose a novel Transductive Kernel Fisher Discriminant (TKFD) scheme for face annotation, which outperforms traditional supervised annotation methods with little training data. The main idea of …


Enhancing Recursive Supervised Learning Using Clustering And Combinatorial Optimization (RSL-CC), Kiruthika Ramanathan, Sheng Uei Guan Jan 2008

Research Collection School Of Computing and Information Systems

The use of a team of weak learners to learn a dataset has been shown to be better than the use of one single strong learner. In fact, the idea is so successful that boosting, an algorithm combining several weak learners for supervised learning, has been considered to be one of the best off-the-shelf classifiers. However, some problems still remain, including determining the optimal number of weak learners and the overfitting of data. In an earlier work, we developed the RPHP algorithm which solves both these problems by using a combination of genetic algorithm, weak learner and pattern distributor. In this paper, …