Open Access. Powered by Scholars. Published by Universities.®

Databases and Information Systems Commons

Research Collection School Of Computing and Information Systems

2010

Articles 91 - 106 of 106

Full-Text Articles in Databases and Information Systems

Top-K Aggregation Queries Over Large Networks, Xifeng Yan, Bin He, Feida Zhu, Jiawei Han Mar 2010

Searching and mining large graphs today is critical to a variety of application domains, ranging from personalized recommendation in social networks to searches for functional associations in biological pathways. In these domains, there is a need to perform aggregation operations on large-scale networks. Unfortunately, the existing implementation of aggregation operations on relational databases does not guarantee superior performance in network space, especially when it involves edge traversals and joins of gigantic tables. In this paper, we investigate neighborhood aggregation queries: find nodes that have the top-k highest aggregate values over their h-hop neighbors. While these basic queries are common in …
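To make the query definition concrete, here is a minimal brute-force sketch in Python. It is not the method proposed in the paper (the abstract is truncated before describing it); it only illustrates what a top-k h-hop neighborhood aggregation returns. The function names, the adjacency-dict representation, the choice of SUM as the aggregate, and the decision to count a node as part of its own neighborhood are all assumptions made for illustration.

```python
from collections import deque
import heapq


def h_hop_neighbors(adj, source, h):
    """Return every node within h hops of source (source included, by assumption)."""
    seen = {source}
    frontier = deque([(source, 0)])
    while frontier:
        node, dist = frontier.popleft()
        if dist == h:
            continue  # do not expand beyond h hops
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, dist + 1))
    return seen


def top_k_neighborhood_sum(adj, score, h, k):
    """Naive baseline: run one BFS per node, sum the scores, keep the k largest."""
    totals = {
        u: sum(score.get(v, 0) for v in h_hop_neighbors(adj, u, h))
        for u in adj
    }
    # Ties are broken by node id so the result is deterministic.
    return heapq.nlargest(k, totals.items(), key=lambda kv: (kv[1], kv[0]))


# Hypothetical path graph a - b - c - d with per-node scores.
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
score = {"a": 1, "b": 2, "c": 3, "d": 4}
result = top_k_neighborhood_sum(adj, score, h=1, k=2)
```

This baseline performs one traversal per node, which is exactly the kind of cost that becomes prohibitive on large networks and motivates more efficient approaches.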


Estimating The Quality Of Postings In The Real-Time Web, Hady W. Lauw, Alexandros Ntoulas, Krishnaram Kenthapadi Feb 2010

Millions of users are posting their status updates, interesting findings, news, ideas and observations in real-time on microblogging services such as Twitter, Jaiku and Plurk. This real-time Web can be a great resource of valuable timely information. Since the real-time Web is completely open and decentralized and anyone may post information at whim, distinguishing interesting and popular postings from the mundane ones is a challenging task. In this paper we study the problem of estimating the quality (or “interestingness”) of postings in the real-time Web. We identify several important factors that are indicative of the quality of postings, and present …


Privacy-Preserving Similarity-Based Text Retrieval, Hwee Hwa Pang, Jialie Shen, Ramayya Krishnan Feb 2010

Users of online services are increasingly wary that their activities could disclose confidential information on their business or personal activities. It would be desirable for an online document service to perform text retrieval for users, while protecting the privacy of their activities. In this article, we introduce a privacy-preserving, similarity-based text retrieval scheme that (a) prevents the server from accurately reconstructing the term composition of queries and documents, and (b) anonymizes the search results from unauthorized observers. At the same time, our scheme preserves the relevance-ranking of the search server, and enables accounting of the number of documents that each …


TwitterRank: Finding Topic-Sensitive Influential Twitterers, Jianshu Weng, Ee Peng Lim, Jing Jiang, Qi He Feb 2010

This paper focuses on the problem of identifying influential users of micro-blogging services. Twitter, one of the most notable micro-blogging services, employs a social-networking model called "following", in which each user can choose who she wants to "follow" to receive tweets from without requiring the latter to give permission first. In a dataset prepared for this study, it is observed that (1) 72.4% of the users in Twitter follow more than 80% of their followers, and (2) 80.5% of the users have 80% of users they are following follow them back. Our study reveals that the presence of "reciprocity" can …
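The two reciprocity statistics quoted in the abstract can be computed from a list of follow edges roughly as follows. This is a sketch under stated assumptions, not the authors' code: the function name, the `(follower, followee)` edge format, the treatment of users with no followers or no followees (counted as not meeting the threshold), and the exact inequalities used are all hypothetical choices.

```python
from collections import defaultdict


def reciprocity_stats(follows, threshold=0.8):
    """Return two fractions over all users seen in the edge list:
    (1) users who follow more than `threshold` of their followers,
    (2) users for whom at least `threshold` of their followees follow back.
    """
    following = defaultdict(set)  # user -> accounts the user follows
    followers = defaultdict(set)  # user -> accounts following the user
    for src, dst in follows:
        following[src].add(dst)
        followers[dst].add(src)

    users = set(following) | set(followers)
    follows_back = 0
    followed_back = 0
    for u in users:
        out_set = following.get(u, set())
        in_set = followers.get(u, set())
        mutual = out_set & in_set  # reciprocal relationships
        if in_set and len(mutual) / len(in_set) > threshold:
            follows_back += 1
        if out_set and len(mutual) / len(out_set) >= threshold:
            followed_back += 1

    n = len(users)
    return follows_back / n, followed_back / n


# Hypothetical toy data: a and b follow each other, a and c follow each other,
# and d follows a without being followed back.
edges = [("a", "b"), ("b", "a"), ("a", "c"), ("c", "a"), ("d", "a")]
stats = reciprocity_stats(edges)
```

In the toy data, `a` follows only two of its three followers and so misses the first threshold, while `d` is never followed back and misses the second.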


Player Performance Prediction In Massively Multiplayer Online Role-Playing Games (MMORPGs), Kyong Jin Shim, Richa Sharan, Jaideep Srivastava Feb 2010

Recent years have seen an ever-increasing number of people interacting in the online space. Massively multiplayer online role-playing games (MMORPGs) are personal computer or console-based digital games where thousands of players can simultaneously sign on to the same online, persistent virtual world to interact and collaborate with each other through their in-game characters. In recent years, researchers have found virtual environments to be a sound venue for studying learning, collaboration, social participation, literacy in online space, and learning trajectory at the individual level as well as at the group level. While many games today provide web and GUI-based reports …


Efficient Valid Scope For Location-Dependent Spatial Queries In Mobile Environments, Ken C. K. Lee, Wang-Chien Lee, Hong Va Leong, Brandon Unger, Baihua Zheng Feb 2010

In mobile environments, mobile clients can access information with respect to their locations by submitting Location-Dependent Spatial Queries (LDSQs) to Location-Based Service (LBS) servers. Owing to scarce wireless channel bandwidth and limited client battery life, frequent LDSQ submission from clients must be avoided. Observing that LDSQs issued from a client located at nearby positions would likely return the same query results, we explore the idea of valid scope, which represents a spatial area in which a set of LDSQs will retrieve exactly the same set of query results. With a valid scope derived and an LDSQ result cached, a client …


Dual Phase Learning For Large Scale Video Gait Recognition, Jialie Shen, Hwee Hwa Pang, Dacheng Tao, Xuelong Li Jan 2010

Accurate gait recognition from video is a complex process involving heterogeneous features, and is still being developed actively. This article introduces a novel framework, called GC2F, for effective and efficient gait recognition and classification. Adopting a “refinement-and-classification” principle, the framework comprises two components: 1) a classifier to generate advanced probabilistic features from low level gait parameters; and 2) a hidden classifier layer (based on multilayer perceptron neural network) to model the statistical properties of different subject classes. To validate our framework, we have conducted comprehensive experiments with a large test collection, and observed significant improvements in identification accuracy relative to …


Smart Media: Bridging Interactions And Services For The Smart Internet, Margaret-Anne Storey, Lars Grammel, Christoph Treude Jan 2010

This chapter describes a need for Smart Media to enhance the vision of the Smart Internet. Smart Media is introduced as a mechanism to bridge Smart Services and Smart Interactions. Smart Media extends the existing notions of Media in HCI such as Hypermedia, New Media, Adaptive Hypermedia, and Social Media. There are three main contributions from this paper: (1) A historical perspective of media in HCI and how media could benefit from smartness; (2) through some high level sample scenarios, a proposal for Smart Media to meet the vision of the Smart Internet; and (3) a detailed example of how …


Information Integration For Graph Databases, Ee Peng Lim, Aixin Sun, Anwitaman Datta, Chang Kuiyu Jan 2010

With increasing interest in querying and analyzing graph data from multiple sources, algorithms and tools to integrate different graphs become very important. Integration of graphs can take place at the schema and instance levels. While links among graph nodes pose additional challenges to graph information integration, they can also serve as useful features for matching nodes representing real-world entities. This chapter introduces a general framework to perform graph information integration. It then gives an overview of the state-of-the-art research and tools in graph information integration.


CRCTOL: A Semantic Based Domain Ontology Learning System, Xing Jiang, Ah-Hwee Tan Jan 2010

Domain ontologies play an important role in supporting knowledge‐based applications in the Semantic Web. To facilitate the building of ontologies, text mining techniques have been used to perform ontology learning from texts. However, traditional systems employ shallow natural language processing techniques and focus only on concept and taxonomic relation extraction. In this paper we present a system, known as Concept‐Relation‐Concept Tuple‐based Ontology Learning (CRCTOL), for mining ontologies automatically from domain‐specific documents. Specifically, CRCTOL adopts a full text parsing technique and employs a combination of statistical and lexico‐syntactic methods, including a statistical algorithm that extracts key concepts from a document collection, …


Anonymous Query Processing In Road Networks, Kyriakos Mouratidis, Man Lung Yiu Jan 2010

The increasing availability of location-aware mobile devices has given rise to a flurry of location-based services (LBSs). Due to the nature of spatial queries, an LBS needs the user position in order to process her requests. On the other hand, revealing exact user locations to a (potentially untrusted) LBS may pinpoint their identities and breach their privacy. To address this issue, spatial anonymity techniques obfuscate user locations, forwarding to the LBS a sufficiently large region instead. Existing methods explicitly target processing in the Euclidean space and do not apply when proximity to the users is defined according to network distance …


Trust-Oriented Composite Service Selection With QoS Constraints, Lei Li, Yang Wang, Ee Peng Lim Jan 2010

In Service-Oriented Computing (SOC) environments, service clients interact with service providers for consuming services. From the viewpoint of service clients, the trust level of a service or a service provider is a critical factor to consider in service selection, particularly when a client is looking for a service from a large set of services or service providers. However, an invoked service may be composed of other services. The complex invocations in composite services greatly increase the complexity of trust-oriented service selection. In this paper, we propose novel approaches for composite service representation, trust evaluation and trust-oriented composite service selection (with QoS …


Keep It Simple With Time: A Reexamination Of Probabilistic Topic Detection Models, Qi He, Kuiyu Chang, Ee Peng Lim, Arindam Banerjee Jan 2010

Topic detection (TD) is a fundamental research issue in the Topic Detection and Tracking (TDT) community with practical implications; TD helps analysts to separate the wheat from the chaff among the thousands of incoming news streams. In this paper, we propose a simple and effective topic detection model called the temporal Discriminative Probabilistic Model (DPM), which is shown to be theoretically equivalent to the classic vector space model with feature selection and temporally discriminative weights. We compare DPM to its various probabilistic cousins, ranging from mixture models like von Mises-Fisher (vMF) to mixed membership models like Latent Dirichlet Allocation (LDA). …


Modeling Anticipatory Event Transitions, He Qi, Kuiyu Chang, Ee Peng Lim Jan 2010

Major world events such as terrorist attacks, natural disasters, and wars typically progress through various representative stages/states in time. For example, a volcanic eruption could lead to earthquakes, tsunamis, aftershocks, evacuation, rescue efforts, international relief support, rebuilding, and resettlement. By analyzing various types of catastrophic and historical events, we can derive corresponding event transition models to embed useful information at each state. The knowledge embedded in these models can be extremely valuable. For instance, a transition model of the 1918-1920 flu pandemic could be used for the planning and allocation of resources to decisively respond to future occurrences of …


Wikipedia2Onto: Building Concept Ontology Automatically, Experimenting With Web Image Retrieval, Huan Wang, Xing Jiang, Liang-Tien Chia, Ah-Hwee Tan Jan 2010

Given its effectiveness in helping to understand data, ontology has been used in various domains including artificial intelligence, biomedical informatics and library science. What we have tried to promote is the use of ontology to better understand media (in particular, images) on the World Wide Web. This paper describes our preliminary attempt to construct a large-scale multi-modality ontology, called AutoMMOnto, for web image classification. In particular, to enable the automation of text ontology construction, we take advantage of both structural and content features of Wikipedia and formalize real world objects in terms of concepts and relationships. For the visual part, we train classifiers …


Applying Soft Cluster Analysis Techniques To Customer Interaction Information, Randall E. Duran, Li Zhang, Tom Hayhurst Jan 2010

The number of channels available for companies and customers to communicate with one another has increased dramatically over the past several decades. Although some market segmentation efforts utilize high-level customer interaction statistics, in-depth information regarding customers’ use of different communication channels is often ignored. Detailed customer interaction information can help companies improve the way that they market to customers by taking into consideration customers’ behaviour patterns and preferences. However, a key challenge of interpreting customer contact information is that many channels have only been in existence for a relatively short period of time, and thus, there is limited understanding and …