Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Full-Text Articles in Physical Sciences and Mathematics

I’m A Virus Harming The Earth, M. Thulasidas Dec 2007

Research Collection School Of Computing and Information Systems

We humans plunder raw materials from our host planet with an abandon seen only in viruses.


Self-Organizing Neural Architectures And Cooperative Learning In A Multiagent Environment, Dan Xiao, Ah-Hwee Tan Dec 2007

Research Collection School Of Computing and Information Systems

Temporal-Difference–Fusion Architecture for Learning, Cognition, and Navigation (TD-FALCON) is a generalization of adaptive resonance theory (a class of self-organizing neural networks) that incorporates TD methods for real-time reinforcement learning. In this paper, we investigate how a team of TD-FALCON networks may cooperate to learn and function in a dynamic multiagent environment based on minefield navigation and predator/prey pursuit tasks. Experiments on the navigation task demonstrate that TD-FALCON agent teams are able to adapt and function well in a multiagent environment without an explicit mechanism of collaboration. In comparison, traditional Q-learning agents using gradient-descent-based feedforward neural networks, trained with the …


Multi-Order Neurons For Evolutionary Higher Order Clustering And Growth, Kiruthika Ramanathan, Sheng Uei Guan Dec 2007

Research Collection School Of Computing and Information Systems

This letter proposes to use multiorder neurons for clustering irregularly shaped data arrangements. Multiorder neurons are an evolutionary extension of the use of higher-order neurons in clustering. Higher-order neurons parametrically model complex neuron shapes by replacing the classic synaptic weight by higher-order tensors. The multiorder neuron goes one step further and eliminates two problems associated with higher-order neurons. First, it uses evolutionary algorithms to select the best neuron order for a given problem. Second, it obtains more information about the underlying data distribution by identifying the correct order for a given cluster of patterns. Empirically we observed that when the …


Preventing Location-Based Identity Inference In Anonymous Spatial Queries, Panos Kalnis, Gabriel Ghinita, Kyriakos Mouratidis, Dimitris Papadias Dec 2007

Research Collection School Of Computing and Information Systems

The increasing trend of embedding positioning capabilities (for example, GPS) in mobile devices facilitates the widespread use of location-based services. For such applications to succeed, privacy and confidentiality are essential. Existing privacy-enhancing techniques rely on encryption to safeguard communication channels, and on pseudonyms to protect user identities. Nevertheless, the query contents may disclose the physical location of the user. In this paper, we present a framework for preventing location-based identity inference of users who issue spatial queries to location-based services. We propose transformations based on the well-established K-anonymity concept to compute exact answers for range and nearest neighbor search, without …


SLOQUE: Slot-Based Query Expansion For Complex Questions, Maggy Anastasia Suryanto, Ee Peng Lim, Aixin Sun, Roger Hsiang-Li Chiang Nov 2007

Research Collection School Of Computing and Information Systems

Searching for answers to complex questions is a challenging IR task. In this paper, we examine the use of query templates with semantic slots to formulate slot-based queries. These queries have query terms assigned to entity and relationship slots. We develop several query expansion methods for slot-based queries so as to improve their retrieval effectiveness on a document collection. Each method consists of a combination of a term scoring scheme, a term scoring formula, and a term assignment scheme. Our preliminary experiments evaluate these different slot-based query expansion methods on a collection of news documents, and conclude that: (1) slot-based queries yield better retrieval accuracy compared …


Experimenting VIREO-374: Bag-Of-Visual-Words And Visual-Based Ontology For Semantic Video Indexing And Search, Chong-Wah Ngo, Yu-Gang Jiang, Xiaoyong Wei, Feng Wang, Wanlei Zhao, Hung-Khoon Tan, Xiao Wu Nov 2007

Research Collection School Of Computing and Information Systems

In this paper, we present our approaches and results of high-level feature extraction and automatic video search in TRECVID-2007.


Comment-Oriented Blog Summarization By Sentence Extraction, Meishan Hu, Ee Peng Lim, Aixin Sun Nov 2007

Research Collection School Of Computing and Information Systems

Much existing research on blogs has focused on posts only, ignoring their comments. Our user study on summarizing blog posts, however, showed that reading comments does change one's understanding of blog posts. In this research, we aim to extract representative sentences from a blog post that best represent the topics discussed among its comments. The proposed solution first derives representative words from comments and then selects sentences containing representative words. The representativeness of words is measured using ReQuT (i.e., Reader, Quotation, and Topic). Evaluated on human-labeled sentences, ReQuT together with summation-based sentence selection showed promising results.


On Improving Wikipedia Search Using Article Quality, Meiqun Hu, Ee Peng Lim, Aixin Sun, Hady Wirawan Lauw, Ba-Quy Vuong Nov 2007

Research Collection School Of Computing and Information Systems

Wikipedia is presently the largest free-and-open online encyclopedia collaboratively edited and maintained by volunteers. While Wikipedia offers full-text search to its users, the accuracy of its relevance-based search can be compromised by poor quality articles edited by non-experts and inexperienced contributors. In this paper, we propose a framework that re-ranks Wikipedia search results considering article quality. We develop two quality measurement models, namely Basic and PeerReview, to derive article quality based on co-authoring data gathered from articles' edit history. Compared with Wikipedia's full-text search engine, Google and Wikiseek, our experimental results showed that (i) quality-only ranking produced by PeerReview gives …


Measuring Article Quality In Wikipedia: Models And Evaluation, Meiqun Hu, Ee Peng Lim, Aixin Sun, Hady W. Lauw, Ba-Quy Vuong Nov 2007

Research Collection School Of Computing and Information Systems

Wikipedia has grown to be the world's largest and busiest free encyclopedia, in which articles are collaboratively written and maintained by volunteers online. Despite its success as a means of knowledge sharing and collaboration, the public has never stopped criticizing the quality of Wikipedia articles edited by non-experts and inexperienced contributors. In this paper, we investigate the problem of assessing the quality of articles in collaborative authoring of Wikipedia. We propose three article quality measurement models that make use of the interaction data between articles and their contributors derived from the article edit history. Our Basic model is designed based …


Analyzing Service Usage Patterns: Methodology And Simulation, Qianhui (Althea) Liang, Jen-Yao Chung Oct 2007

Research Collection School Of Computing and Information Systems

This paper proposes that service mining technology will power the construction of new business services via both intra- and inter-enterprise service assembly within the Service Oriented Architecture (SOA) framework. We investigate the methodologies of service mining at the component level of service usage. We also demonstrate how mining of service usage patterns is intended to be used to improve different aspects of service composition. Simulation experiments conducted for mining at the component level are analyzed. The processing details within a general service mining deployment are demonstrated.


A Multitude Of Opinions: Mining Online Rating Data, Hady Wirawan Lauw, Ee Peng Lim Oct 2007

Research Collection School Of Computing and Information Systems

Online rating systems are a popular feature of Web 2.0 applications. Such a system typically involves a set of reviewers assigning rating scores (based on various evaluation criteria) to a set of objects. We identify two objectives for research on online rating data, namely achieving effective evaluation of objects and learning behaviors of reviewers/objects. These two objectives have conventionally been pursued separately. We argue that the future research direction should focus on the integration of these two objectives, as well as the integration between rating data and other types of data.


I Tube, You Tube, Everybody Tubes: Analyzing The World’s Largest User Generated Content Video System, Meeyoung Cha, Haewoon Kwak, Pablo Rodriguez, Yong-Yeol Ahn, Sue Moon Oct 2007

Research Collection School Of Computing and Information Systems

User Generated Content (UGC) is re-shaping the way people watch video and TV, with millions of video producers and consumers. In particular, UGC sites are creating new viewing patterns and social interactions, empowering users to be more creative, and developing new business opportunities. To better understand the impact of UGC systems, we have analyzed YouTube, the world's largest UGC VoD system. Based on a large amount of data collected, we provide an in-depth study of YouTube and other similar UGC systems. In particular, we study the popularity life-cycle of videos, the intrinsic statistical properties of requests and their relationship with …


Efficient Discovery Of Frequent Approximate Sequential Patterns, Feida Zhu, Xifeng Yan, Jiawei Han, Philip S. Yu Oct 2007

Research Collection School Of Computing and Information Systems

We propose an efficient algorithm for mining frequent approximate sequential patterns under the Hamming distance model. Our algorithm gains its efficiency by adopting a "break-down-and-build-up" methodology. The "breakdown" is based on the observation that all occurrences of a frequent pattern can be classified into groups, which we call strands. We developed efficient algorithms to quickly mine out all strands by iterative growth. In the "build-up" stage, these strands are grouped up to form the support sets from which all approximate patterns would be identified. A salient feature of our algorithm is its ability to grow the frequent patterns by iteratively …


gApprox: Mining Frequent Approximate Patterns From A Massive Network, Chen Chen, Xifeng Yan, Feida Zhu, Jiawei Han Oct 2007

Research Collection School Of Computing and Information Systems

Recently, a large number of graphs with massive sizes and complex structures have arisen in many new applications, such as biological networks, social networks, and the Web, demanding powerful data mining methods. Due to inherent noise or data diversity, it is crucial to address the issue of approximation if one wants to mine patterns that are potentially interesting with tolerable variations. In this paper, we investigate the problem of mining frequent approximate patterns from a massive network and propose a method called gApprox. gApprox not only finds approximate network patterns, which is the key for many knowledge discovery applications on …


OM-Based Video Shot Retrieval By One-To-One Matching, Yuxin Peng, Chong-Wah Ngo, Jianguo Xiao Oct 2007

Research Collection School Of Computing and Information Systems

This paper proposes a new approach for shot-based retrieval by optimal matching (OM), which provides an effective mechanism for the similarity measure and ranking of shots by one-to-one matching. In the proposed approach, a weighted bipartite graph is constructed to model the color similarity between two shots. Then OM based on the Kuhn-Munkres algorithm is employed to compute the maximum weight of the constructed bipartite graph as the shot similarity value by one-to-one matching among frames. To improve the speed of OM, two improved algorithms are also proposed: bipartite graph construction based on subshots and bipartite graph construction based on …


Evolutionary Combinatorial Optimization For Recursive Supervised Learning With Clustering, Kiruthika Ramanathan, Sheng Uei Guan Sep 2007

Research Collection School Of Computing and Information Systems

The idea of using a team of weak learners to learn a dataset is a successful one in the literature. In this paper, we explore a recursive incremental approach to ensemble learning. Patterns are clustered according to the output space of the problem, i.e., natural clusters are formed based on patterns belonging to each class. A combinatorial optimization problem is therefore formed, which is solved using evolutionary algorithms. The evolutionary algorithms identify the "easy" and the "difficult" clusters in the system. The removal of the easy patterns then gives way to the focused learning of the more complicated …


Cross-Language And Cross-Media Image Retrieval: An Empirical Study At ImageCLEF2007, Steven C. H. Hoi Sep 2007

Research Collection School Of Computing and Information Systems

This paper summarizes our empirical study of cross-language and cross-media image retrieval at the CLEF image retrieval track (ImageCLEF2007). This year, we participated in the ImageCLEF photo retrieval task, whose goal is to search natural photos using queries with both textual and visual information. In this paper, we empirically evaluate our solutions for the image retrieval tasks in three aspects. First, we study the application of language models and smoothing strategies for text-based image retrieval, particularly addressing the short text query issue. Secondly, we study the cross-media …


Overview Of The ImageCLEF 2007 Object Retrieval Task, Thomas Deselaers, Steven C. H. Hoi Sep 2007

Research Collection School Of Computing and Information Systems

We describe the object retrieval task of ImageCLEF 2007, give an overview of the methods of the participating groups, and present and discuss the results. The task was based on the widely used PASCAL object recognition data to train object recognition methods and on the IAPR TC-12 benchmark dataset from which images of objects of the ten different classes bicycles, buses, cars, motorbikes, cats, cows, dogs, horses, sheep, and persons had to be retrieved. Seven international groups participated using a wide variety of methods. The results of the evaluation show that the task was very challenging and that different methods …


NTU: Solution For The Object Retrieval Task Of The ImageCLEF2007, Steven C. H. Hoi Sep 2007

Research Collection School Of Computing and Information Systems

Object retrieval is an interdisciplinary research problem between object recognition and content-based image retrieval (CBIR). It is commonly expected that object retrieval can be solved more effectively with the joint maximization of CBIR and object recognition techniques. We study a typical CBIR solution with application to the object retrieval tasks [26,27]. We expect that the empirical study in this work will serve as a baseline for future research when using CBIR techniques for object recognition.


Who’s Creating?, M. Thulasidas Sep 2007

Research Collection School Of Computing and Information Systems

We don’t read to retain information or knowledge any more. We search, scan, locate keywords, browse and bookmark. The Internet is doing to our reading habits what the calculator did to our arithmetic abilities. Knowledge is not cheap, although our easy access to it through the Internet may indicate otherwise. When we all become users of information, our knowledge will stop at its current level because nobody will be creating it any more.


Column Heterogeneity As A Measure Of Data Quality, Bing Tian Dai, Nick Koudas, Beng Chin Ooi, Divesh Srivastava, Suresh Venkatasubramanian Sep 2007

Research Collection School Of Computing and Information Systems

Data quality is a serious concern in every data management application, and a variety of quality measures have been proposed, including accuracy, freshness and completeness, to capture the common sources of data quality degradation. We identify and focus attention on a novel measure, column heterogeneity, that seeks to quantify the data quality problems that can arise when merging data from different sources. We identify desiderata that a column heterogeneity measure should intuitively satisfy, and discuss a promising direction of research to quantify database column heterogeneity based on using a novel combination of cluster entropy and soft clustering. Finally, we present …


Cost-Time Sensitive Decision Tree With Missing Values, Shichao Zhang, Xiaofeng Zhu, Jilian Zhang, Chengqi Zhang Aug 2007

Research Collection School Of Computing and Information Systems

Cost-sensitive decision tree learning is important and popular in the machine learning and data mining communities. Much of the current literature focuses on misclassification cost and test cost. In real-world applications, however, time sensitivity should also be considered in cost-sensitive learning. In this paper, we regard the cost of time sensitivity in cost-sensitive learning as waiting cost (WC), and propose a novel splitting criterion for constructing a cost-time sensitive (CTS) decision tree that maximally decreases the intangible cost. Then, a hybrid test strategy that combines the sequential test with the batch test strategies is …


Mapping Better Business Strategies With GIS, Tin Seong Kam Aug 2007

Research Collection School Of Computing and Information Systems

The value of location as a business measure is fast becoming an important consideration for organisations. GIS (Geographical Information Systems), with its capability to manage, display, and analyse business information spatially, is emerging as a powerful location intelligence tool. In the US, Starbucks, Blockbuster, Hyundai, and thousands of other businesses use census data and GIS software to help them understand what types of people buy their products and services, and how to better market to these consumers. For example, McDonald’s in Japan uses a GIS system to overlay demographic information on maps to help identify promising new store sites. Singapore Management …


Are You Too Smart For Your Own Good?, M. Thulasidas Aug 2007

Research Collection School Of Computing and Information Systems

Knowledge can be a bad thing if others are taking credit for it. Technical knowledge is not always a good thing for you in the modern workplace.


Discovering And Exploiting Causal Dependencies For Robust Mobile Context-Aware Recommenders, Ghim-Eng Yap, Ah-Hwee Tan, Hwee Hwa Pang Jul 2007

Research Collection School Of Computing and Information Systems

Acquisition of context poses unique challenges to mobile context-aware recommender systems. The limited resources in these systems make minimizing their context acquisition a practical need, and the uncertainty in the mobile environment makes missing and erroneous context inputs a major concern. In this paper, we propose an approach based on Bayesian networks (BNs) for building recommender systems that minimize context acquisition. Our learning approach iteratively trims the BN-based context model until it contains only the minimal set of context parameters that are important to a user. In addition, we show that a two-tiered context model can effectively capture the causal …


Learning Causal Models For Noisy Biological Data Mining: An Application To Ovarian Cancer Detection, Ghim-Eng Yap, Ah-Hwee Tan, Hwee Hwa Pang Jul 2007

Research Collection School Of Computing and Information Systems

Undetected errors in the expression measurements from high-throughput DNA microarrays and protein spectroscopy could seriously affect the diagnostic reliability in disease detection. In addition to a high resilience against such errors, diagnostic models need to be more comprehensible so that a deeper understanding of the causal interactions among biological entities like genes and proteins may be possible. In this paper, we introduce a robust knowledge discovery approach that addresses these challenges. First, the causal interactions among the genes and proteins in the noisy expression data are discovered automatically through Bayesian network learning. Then, the diagnosis of a disease based on …


An Empirical Study On Large-Scale Content-Based Image Retrieval, Yuk Man Wong, Steven C. H. Hoi, Michael R. Lyu Jul 2007

Research Collection School Of Computing and Information Systems

One key challenge in content-based image retrieval (CBIR) is to develop a fast solution for indexing high-dimensional image contents, which is crucial to building large-scale CBIR systems. In this paper, we propose a scalable content-based image retrieval scheme using locality-sensitive hashing (LSH), and conduct extensive evaluations on a large image testbed of half a million images. To the best of our knowledge, few comprehensive studies have evaluated large-scale CBIR with half a million images. Our empirical results show that our proposed solution is able to scale for hundreds of thousands of images, which is promising for building Web-scale …


Continuous Medoid Queries Over Moving Objects, Stavros Papadopoulos, Dimitris Sacharidis, Kyriakos Mouratidis Jul 2007

Research Collection School Of Computing and Information Systems

In the k-medoid problem, given a dataset P, we are asked to choose k points in P as the medoids. The optimal medoid set minimizes the average Euclidean distance between the points in P and their closest medoid. Finding the optimal k medoids is NP-hard, and existing algorithms aim at approximate answers, i.e., they compute medoids that achieve a small, yet not minimal, average distance. Similarly in this paper, we also aim at approximate solutions. We consider, however, the continuous version of the problem, where the points in P move and our task is to maintain the medoid set on-the-fly …


On Searching Continuous Nearest Neighbors In Wireless Data Broadcast Systems, Baihua Zheng, Wang-Chien Lee, Dik Lun Lee Jul 2007

Research Collection School Of Computing and Information Systems

A continuous nearest neighbor (CNN) search, which retrieves the nearest neighbors corresponding to every point in a given query line segment, is important for location-based services such as vehicular navigation and tourist guides. It is infeasible to answer a CNN search by issuing a traditional nearest neighbor query at every point of the line segment due to the large number of queries generated and the overhead on bandwidth. Algorithms have been proposed recently to support CNN search in the traditional client-server systems but not in the environment of wireless data broadcast, where uplink communication channels from mobile devices to the …


Cross-Lingual Query Suggestion Using Query Logs Of Different Languages, Wei Gao, Cheng Niu, Jian-Yun Nie, Ming Zhou, Jian Hu, Kam-Fai Wong, Hsiao-Wuen Hon Jul 2007

Research Collection School Of Computing and Information Systems

Query suggestion aims to suggest relevant queries for a given query, helping users better specify their information needs. Previously, the suggested terms were mostly in the same language as the input query. In this paper, we extend it to cross-lingual query suggestion (CLQS): for a query in one language, we suggest similar or relevant queries in other languages. This is very important to scenarios of cross-language information retrieval (CLIR) and cross-lingual keyword bidding for search engine advertisement. Instead of relying on existing query translation technologies for CLQS, we present an effective means to map the input query of one …