Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Open Access. Powered by Scholars. Published by Universities.®

Articles 31 - 60 of 89

Full-Text Articles in Physical Sciences and Mathematics

Accelerating Sequence Searching: Dimensionality Reduction Method, Guojie Song, Bin Cui, Baihua Zheng, Kunqing Xie, Dongqing Yang Sep 2009

Accelerating Sequence Searching: Dimensionality Reduction Method, Guojie Song, Bin Cui, Baihua Zheng, Kunqing Xie, Dongqing Yang

Research Collection School Of Computing and Information Systems

Similarity search over long sequence dataset becomes increasingly popular in many emerging applications, such as text retrieval, genetic sequences exploring, etc. In this paper, a novel index structure, namely Sequence Embedding Multiset tree (SEM − tree), has been proposed to speed up the searching process over long sequences. The SEM-tree is a multi-level structure where each level represents the sequence data with different compression level of multiset, and the length of multiset increases towards the leaf level which contains original sequences. The multisets, obtained using sequence embedding algorithms, have the desirable property that they do not need to keep the …


Batch Mode Active Learning With Applications To Text Categorization And Image Retrieval, Steven C. H. Hoi, Rong Jin, Michael R. Lyu Sep 2009

Batch Mode Active Learning With Applications To Text Categorization And Image Retrieval, Steven C. H. Hoi, Rong Jin, Michael R. Lyu

Research Collection School Of Computing and Information Systems

Most machine learning tasks in data classification and information retrieval require manually labeled data examples in the training stage. The goal of active learning is to select the most informative examples for manual labeling in these learning tasks. Most of the previous studies in active learning have focused on selecting a single unlabeled example in each iteration. This could be inefficient, since the classification model has to be retrained for every acquired labeled example. It is also inappropriate for the setup of information retrieval tasks where the user's relevance feedback is often provided for the top K retrieved items. In …


Why Quants Fail, M. Thulasidas Sep 2009

Why Quants Fail, M. Thulasidas

Research Collection School Of Computing and Information Systems

Mathematical finance is built on a couple of assumptions. The most fundamental of them is the one on ma ket efficiency. It states that the market prices every asset fairly, and that the prices contain all the information available in the market.


Visible Reverse K-Nearest Neighbor Query Processing In Spatial Databases, Yunjun Gao, Baihua Zheng, Gencai Chen, Wang-Chien Lee, Ken C. K. Lee, Qing Li Sep 2009

Visible Reverse K-Nearest Neighbor Query Processing In Spatial Databases, Yunjun Gao, Baihua Zheng, Gencai Chen, Wang-Chien Lee, Ken C. K. Lee, Qing Li

Research Collection School Of Computing and Information Systems

Reverse nearest neighbor (RNN) queries have a broad application base such as decision support, profile-based marketing, resource allocation, etc. Previous work on RNN search does not take obstacles into consideration. In the real world, however, there are many physical obstacles (e.g., buildings) and their presence may affect the visibility between objects. In this paper, we introduce a novel variant of RNN queries, namely, visible reverse nearest neighbor (VRNN) search, which considers the impact of obstacles on the visibility of objects. Given a data set P, an obstacle set O, and a query point q in a 2D space, a VRNN …


Communication-Efficient Classification In P2p Networks, Hock Hee Ang, Vivekanand Gopalkrishnan, Wee Keong Ng, Steven C. H. Hoi Sep 2009

Communication-Efficient Classification In P2p Networks, Hock Hee Ang, Vivekanand Gopalkrishnan, Wee Keong Ng, Steven C. H. Hoi

Research Collection School Of Computing and Information Systems

Distributed classification aims to learn with accuracy comparable to that of centralized approaches but at far lesser communication and computation costs. By nature, P2P networks provide an excellent environment for performing a distributed classification task due to the high availability of shared resources, such as bandwidth, storage space, and rich computational power. However, learning in P2P networks is faced with many challenging issues; viz., scalability, peer dynamism, asynchronism and fault-tolerance. In this paper, we address these challenges by presenting CEMPaR—a communication-efficient framework based on cascading SVMs that exploits the characteristics of DHT-based lookup protocols. CEMPaR is designed to be robust …


Exploiting Bilingual Information To Improve Web Search, Wei Gao, John Bitzer, Ming Zhou, Kam-Fai Wong Aug 2009

Exploiting Bilingual Information To Improve Web Search, Wei Gao, John Bitzer, Ming Zhou, Kam-Fai Wong

Research Collection School Of Computing and Information Systems

Web search quality can vary widely across languages, even for the same information need. We propose to exploit this variation in quality by learning a ranking function on bilingual queries: queries that appear in query logs for two languages but represent equivalent search interests. For a given bilingual query, along with corresponding monolingual query log and monolingual ranking, we generate a ranking on pairs of documents, one from each language. Then we learn a linear ranking function which exploits bilingual features on pairs of documents, as well as standard monolingual features. Finally, we show how to reconstruct monolingual ranking from …


Setting Discrete Bid Levels Adaptively In Repeated Auctions, Jilian Zhang, Hoong Chuin Lau, Jialie Shen Aug 2009

Setting Discrete Bid Levels Adaptively In Repeated Auctions, Jilian Zhang, Hoong Chuin Lau, Jialie Shen

Research Collection School Of Computing and Information Systems

The success of an auction design often hinges on its ability to set parameters such as reserve price and bid levels that will maximize an objective function such as the auctioneer revenue. Works on designing adaptive auction mechanisms have emerged recently, and the challenge is in learning different auction parameters by observing the bidding in previous auctions. In this paper, we propose a non-parametric method for determining discrete bid levels dynamically so as to maximize the auctioneer revenue. First, we propose a non-parametric kernel method for estimating the probabilities of closing price with past auction data. Then a greedy strategy …


Multi-Task Transfer Learning For Weakly-Supervised Relation Extraction, Jing Jiang Aug 2009

Multi-Task Transfer Learning For Weakly-Supervised Relation Extraction, Jing Jiang

Research Collection School Of Computing and Information Systems

Creating labeled training data for relation extraction is expensive. In this paper, we study relation extraction in a special weakly-supervised setting when we have only a few seed instances of the target relation type we want to extract but we also have a large amount of labeled instances of other relation types. Observing that different relation types can share certain common structures, we propose to use a multi-task learning method coupled with human guidance to address this weakly-supervised relation extraction problem. The proposed framework models the commonality among different relation types through a shared weight vector, enables knowledge learned from …


Optimal-Location-Selection Query Processing In Spatial Databases, Yunjun Gao, Baihua Zheng, Gencai Chen, Qing Li Aug 2009

Optimal-Location-Selection Query Processing In Spatial Databases, Yunjun Gao, Baihua Zheng, Gencai Chen, Qing Li

Research Collection School Of Computing and Information Systems

This paper introduces and solves a novel type of spatial queries, namely, Optimal-Location-Selection (OLS) search, which has many applications in real life. Given a data object set D_A, a target object set D_B, a spatial region R, and a critical distance d_c in a multidimensional space, an OLS query retrieves those target objects in D_B that are outside R but have maximal optimality. Here, the optimality of a target object b \in D_B located outside R is defined as the number of the data objects from D_A that are inside R and meanwhile have their distances to b not exceeding …


Inferring Player Rating From Performance Data In Massively Multiplayer Online Role-Playing Games (Mmorpgs), Kyong Jin Shim, Muhammad Aurangzeb Ahmad, Nishith Pathak, Jaideep Srivastava Aug 2009

Inferring Player Rating From Performance Data In Massively Multiplayer Online Role-Playing Games (Mmorpgs), Kyong Jin Shim, Muhammad Aurangzeb Ahmad, Nishith Pathak, Jaideep Srivastava

Research Collection School Of Computing and Information Systems

This paper examines online player performance in EverQuest II, a popular massively multiplayer online role-playing game (MMORPG) developed by Sony Online Entertainment. The study uses the game's player performance data to devise performance metrics for online players. We report three major findings. First, we show that the game's point-scaling system overestimates performances of lower level players and underestimates performances of higher level players. We present a novel point-scaling system based on the game's player performance data that addresses the underestimation and overestimation problems. Second, we present a highly accurate predictive model for player performance as a function of past behavior. …


A Fair Assignment Algorithm For Multiple Preference Queries, Leong Hou U, Nikos Mamoulis, Kyriakos Mouratidis Aug 2009

A Fair Assignment Algorithm For Multiple Preference Queries, Leong Hou U, Nikos Mamoulis, Kyriakos Mouratidis

Research Collection School Of Computing and Information Systems

Consider an internship assignment system, where at the end of each academic year, interested university students search and apply for available positions, based on their preferences (e.g., nature of the job, salary, office location, etc). In a variety of facility, task or position assignment contexts, users have personal preferences expressed by different weights on the attributes of the searched objects. Although individual preference queries can be evaluated by selecting the object in the database with the highest aggregate score, in the case of multiple simultaneous requests, a single object cannot be assigned to more than one users. The challenge is …


Scalable Verification For Outsourced Dynamic Databases, Hwee Hwa Pang, Jilian Zhang, Kyriakos Mouratidis Aug 2009

Scalable Verification For Outsourced Dynamic Databases, Hwee Hwa Pang, Jilian Zhang, Kyriakos Mouratidis

Research Collection School Of Computing and Information Systems

Query answers from servers operated by third parties need to be verified, as the third parties may not be trusted or their servers may be compromised. Most of the existing authentication methods construct validity proofs based on the Merkle hash tree (MHT). The MHT, however, imposes severe concurrency constraints that slow down data updates. We introduce a protocol, built upon signature aggregation, for checking the authenticity, completeness and freshness of query answers. The protocol offers the important property of allowing new data to be disseminated immediately, while ensuring that outdated values beyond a pre-set age can be detected. We also …


Ssnetviz: A Visualization Engine For Heterogeneous Semantic Social Networks, Ee Peng Lim, Maureen Maureen, Nelman Lubis Ibrahim, Aixin Sun, Anwitaman Datta, Kuiyu Chang Aug 2009

Ssnetviz: A Visualization Engine For Heterogeneous Semantic Social Networks, Ee Peng Lim, Maureen Maureen, Nelman Lubis Ibrahim, Aixin Sun, Anwitaman Datta, Kuiyu Chang

Research Collection School Of Computing and Information Systems

SSnetViz is an ongoing research to design and implement a visualization engine for heterogeneous semantic social networks. A semantic social network is a multi-modal network that contains nodes representing di®erent types of people or object entities, and edges representing relationships among them. When multiple heterogeneous semantic social networks are to be visualized together, SSnetViz provides a suite of functions to store heterogeneous semantic social networks, to integrate them for searching and analysis. We will illustrate these functions using social networks related to terrorism research, one crafted by domain experts and another from Wikipedia.


A Distributed Spatial Index For Error-Prone Wireless Data Broadcast, Baihua Zheng, Wang-Chien Lee, Ken C. K. Lee, Dik Lun Lee, Min Shao Aug 2009

A Distributed Spatial Index For Error-Prone Wireless Data Broadcast, Baihua Zheng, Wang-Chien Lee, Ken C. K. Lee, Dik Lun Lee, Min Shao

Research Collection School Of Computing and Information Systems

Information is valuable to users when it is available not only at the right time but also at the right place. To support efficient location-based data access in wireless data broadcast systems, a distributed spatial index (called DSI) is presented in this paper. DSI is highly efficient because it has a linear yet fully distributed structure that naturally shares links in different search paths. DSI is very resilient to the error-prone wireless communication environment because interrupted search operations based on DSI can be resumed easily. It supports search algorithms for classical location-based queries such as window queries and kNN queries …


On Efficient Mutual Nearest Neighbor Query Processing In Spatial Databases, Yunjun Gao, Baihua Zheng, Gencai Chen, Qing Li Aug 2009

On Efficient Mutual Nearest Neighbor Query Processing In Spatial Databases, Yunjun Gao, Baihua Zheng, Gencai Chen, Qing Li

Research Collection School Of Computing and Information Systems

This paper studies a new form of nearest neighbor queries in spatial databases, namely, mutual nearest neighbour (MNN) search. Given a set D of objects and a query object q, an MNN query returns from D, the set of objects that are among the k1 (≥ 1) nearest neighbors (NNs) of q; meanwhile, have q as one of their k2(≥ 1) NNs. Although MNN queries are useful in many applications involving decision making, data mining, and pattern recognition, it cannot be efficiently handled by existing spatial query processing approaches. In this paper, we present …


Symphony: Enabling Search-Driven Applications, John C. Shafer, Rakesh Agrawal, Hady W. Lauw Aug 2009

Symphony: Enabling Search-Driven Applications, John C. Shafer, Rakesh Agrawal, Hady W. Lauw

Research Collection School Of Computing and Information Systems

We present the design of Symphony, a platform that enables non-developers to build and deploy a new class of search-driven applications that combine their data and domain expertise with content from search engines and other web services. The Symphony prototype has been built on top of Microsoft’s Live Search infrastructure. While Symphony naturally makes use of the customization capabilities exposed by Live Search, its distinguishing feature is the capability it provides to the application creator to combine their proprietary data and domain expertise with content obtained from Live Search. They can also integrate specialized data obtained from web services to …


Learning And Inferencing In User Ontology For Personalized Semantic Web Search, Xing Jiang, Ah-Hwee Tan Jul 2009

Learning And Inferencing In User Ontology For Personalized Semantic Web Search, Xing Jiang, Ah-Hwee Tan

Research Collection School Of Computing and Information Systems

User modeling is aimed at capturing the users’ interests in a working domain, which forms the basis of providing personalized information services. In this paper, we present an ontology based user model, called user ontology, for providing personalized information service in the Semantic Web. Different from the existing approaches that only use concepts and taxonomic relations for user modeling, the proposed user ontology model utilizes concepts, taxonomic relations, and non-taxonomic relations in a given domain ontology to capture the users’ interests. As a customized view of the domain ontology, a user ontology provides a richer and more precise representation of …


Continuous Obstructed Nearest Neighbor Queries In Spatial Databases, Yunjun Gao, Baihua Zheng Jul 2009

Continuous Obstructed Nearest Neighbor Queries In Spatial Databases, Yunjun Gao, Baihua Zheng

Research Collection School Of Computing and Information Systems

In this paper, we study a novel form of continuous nearest neighbor queries in the presence of obstacles, namely continuous obstructed nearest neighbor (CONN) search. It considers the impact of obstacles on the distance between objects, which is ignored by most of spatial queries. Given a data set P, an obstacle set O, and a query line segment q in a two-dimensional space, a CONN query retrieves the nearest neighbor of each point on q according to the obstructed distance, i.e., the shortest path between them without crossing any obstacle. We formulate CONN search, analyze its unique properties, and develop …


Spatial Cloaking Revisited: Distinguishing Information Leakage From Anonymity, Kar Way Tan, Yimin Lin, Kyriakos Mouratidis Jul 2009

Spatial Cloaking Revisited: Distinguishing Information Leakage From Anonymity, Kar Way Tan, Yimin Lin, Kyriakos Mouratidis

Research Collection School Of Computing and Information Systems

Location-based services (LBS) are receiving increasing popularity as they provide convenience to mobile users with on-demand information. The use of these services, however, poses privacy issues as the user locations and queries are exposed to untrusted LBSs. Spatial cloaking techniques provide privacy in the form of k-anonymity; i.e., they guarantee that the (location of the) querying user u is indistinguishable from at least k-1 others, where k is a parameter specified by u at query time. To achieve this, they form a group of k users, including u, and forward their minimum bounding rectangle (termed anonymzing spatial region, ASR) to …


Active Learning For Causal Bayesian Network Structure With Non-Symmetrical Entropy, Li G., Tze-Yun Leong Jul 2009

Active Learning For Causal Bayesian Network Structure With Non-Symmetrical Entropy, Li G., Tze-Yun Leong

Research Collection School Of Computing and Information Systems

Causal knowledge is crucial for facilitating comprehension, diagnosis, prediction, and control in automated reasoning. Active learning in causal Bayesian networks involves interventions by manipulating specific variables, and observing the patterns of change over other variables to derive causal knowledge. In this paper, we propose a new active learning approach that supports interventions with node selection. Our method admits a node selection criterion based on non-symmetrical entropy from the current data and a stop criterion based on structure entropy of the resulting networks. We examine the technical challenges and practical issues involved. Experimental results on a set of benchmark Bayesian networks …


Compositemap: A Novel Framework For Music Similarity Measure, Bingjun Zhang, Jialie Shen, Qiaoliang Xiang, Ye Wang Jul 2009

Compositemap: A Novel Framework For Music Similarity Measure, Bingjun Zhang, Jialie Shen, Qiaoliang Xiang, Ye Wang

Research Collection School Of Computing and Information Systems

With the continuing advances in data storage and communication technology, there has been an explosive growth of music information from different application domains. As an effective technique for organizing, browsing, and searching large data collections, music information retrieval is attracting more and more attention. How to measure and model the similarity between different music items is one of the most fundamental yet challenging research problems. In this paper, we introduce a novel framework based on a multimodal and adaptive similarity measure for various applications. Distinguished from previous approaches, our system can effectively combine music properties from different aspects into a …


A Bayesian Approach Integrating Regional And Global Features For Image Semantic Learning, Luong-Dong Nguyen, Ghim-Eng Yap, Ying Liu, Ah-Hwee Tan, Liang-Tien Chia, Joo-Hwee Lim Jul 2009

A Bayesian Approach Integrating Regional And Global Features For Image Semantic Learning, Luong-Dong Nguyen, Ghim-Eng Yap, Ying Liu, Ah-Hwee Tan, Liang-Tien Chia, Joo-Hwee Lim

Research Collection School Of Computing and Information Systems

In content-based image retrieval, the “semantic gap” between visual image features and user semantics makes it hard to predict abstract image categories from low-level features. We present a hybrid system that integrates global features (Gfeatures) and region features (R-features) for predicting image semantics. As an intermediary between image features and categories, we introduce the notion of mid-level concepts, which enables us to predict an image’s category in three steps. First, a G-prediction system uses G-features to predict the probability of each category for an image. Simultaneously, a R-prediction system analyzes R-features to identify the probabilities of mid-level concepts in that …


Bonus Plans Of Mice And Men, M. Thulasidas Jul 2009

Bonus Plans Of Mice And Men, M. Thulasidas

Research Collection School Of Computing and Information Systems

Our best-laid plans often go awry. We see it all the time at a personal level -- accidents (both good and bad), deaths (both of loved ones and rich uncles), births, and lotteries all conspire to reshuffle our priorities and render our plans null and void. In fact, there is nothing like a solid misfortune to get us to put things in perspective. This opportunity may be the proverbial silver lining we are constantly advised to see. What is true at a personal level holds true also at a larger scale. The industry-wide financial meltdown has imparted a philosophical clarity …


Simplenpkl: Simple Non-Parametric Kernel Learning, Jinfeng Zhuang, Ivor Tsang, Steven C. H. Hoi Jun 2009

Simplenpkl: Simple Non-Parametric Kernel Learning, Jinfeng Zhuang, Ivor Tsang, Steven C. H. Hoi

Research Collection School Of Computing and Information Systems

Previous studies of Non-Parametric Kernel (NPK) learning usually reduce to solving some Semi-Definite Programming (SDP) problem by a standard SDP solver. However, time complexity of standard interior-point SDP solvers could be as high as O(n6.5). Such intensive computation cost prohibits NPK learning applicable to real applications, even for data sets of moderate size. In this paper, we propose an efficient approach to NPK learning from side information, referred to as SimpleNPKL, which can efficiently learn non-parametric kernels from large sets of pairwise constraints. In particular, we show that the proposed SimpleNPKL with linear loss has a closed-form solution that can …


Information Sharing And Strategic Signaling In Supply Chains, Robert J. Kauffman, Hamid Mohtadi Jun 2009

Information Sharing And Strategic Signaling In Supply Chains, Robert J. Kauffman, Hamid Mohtadi

Research Collection School Of Computing and Information Systems

Information sharing in procurement occurs in rich and varied industry contexts in which managerial decisions are made and organizational strategy is formulated. We explore how information sharing ought to work in procurement contexts that involve investments in inter-organizational information systems (IOS) and collaborative planning, forecasting and replenishment (CPFR) practices. How and under what circumstances does a firm that plays the role of a supply chain buyer decide to share information on key variables, such as point-of-sale consumer demand data with its supplier, up the supply chain? This is a key issue that crosses the boundary between supply chain management and …


A Revisit Of Generative Model For Automatic Image Annotation Using Markov Random Fields, Yu Xiang, Xiangdong Zhou, Tat-Seng Chua, Chong-Wah Ngo Jun 2009

A Revisit Of Generative Model For Automatic Image Annotation Using Markov Random Fields, Yu Xiang, Xiangdong Zhou, Tat-Seng Chua, Chong-Wah Ngo

Research Collection School Of Computing and Information Systems

Much research effort on Automatic Image Annotation (AIA) has been focused on Generative Model, due to its well formed theory and competitive performance as compared with many well designed and sophisticated methods. However, when considering semantic context for annotation, the model suffers from the weak learning ability. This is mainly due to the lack of parameter setting and appropriate learning strategy for characterizing the semantic context in the traditional generative model. In this paper, we present a new approach based on Multiple Markov Random Fields (MRF) for semantic context modeling and learning. Differing from previous MRF related AIA approach, we …


Intentional Learning Agent Architecture, Budhitama Subagdja, Liz Sonenberg, Iyad Rahwan Jun 2009

Intentional Learning Agent Architecture, Budhitama Subagdja, Liz Sonenberg, Iyad Rahwan

Research Collection School Of Computing and Information Systems

Dealing with changing situations is a major issue in building agent systems. When the time is limited, knowledge is unreliable, and resources are scarce, the issue becomes more challenging. The BDI (Belief-Desire-Intention) agent architecture provides a model for building agents that addresses that issue. The model can be used to build intentional agents that are able to reason based on explicit mental attitudes, while behaving reactively in changing circumstances. However, despite the reactive and deliberative features, a classical BDI agent is not capable of learning. Plans as recipes that guide the activities of the agent are assumed to be static. …


Cyber Attacks: Cross-Country Interdependence And Enforcement, Qiu-Hong Wang, Seung Hyun Kim Jun 2009

Cyber Attacks: Cross-Country Interdependence And Enforcement, Qiu-Hong Wang, Seung Hyun Kim

Research Collection School Of Computing and Information Systems

This study empirically characterizes the interdependence in cyber attacks and examines theimpact from the first international treaty against cybercrimes (Convention on Cybercrimes:Europe Treaty Series No. 185). With the data covering 62 countries over the period from year2003 to 2007, we find that, international cooperation in enforcement as measured by theindicator of joining the Convention on Cybercrimes, deterred cyber attacks originating from anyparticular country by 15.81% ~ 24.77% (in 95% confidence interval). Second, joining theConvention also affected the interdependence in cyber attacks from two angels. First, for anypair of country, closer status in joining or not joining the Convention was associated …


Nonrigid Shape Recovery By Gaussian Process Regression, Jianke Zhu, Steven C. H. Hoi, Michael R. Liu Jun 2009

Nonrigid Shape Recovery By Gaussian Process Regression, Jianke Zhu, Steven C. H. Hoi, Michael R. Liu

Research Collection School Of Computing and Information Systems

Most state-of-the-art nonrigid shape recovery methods usually use explicit deformable mesh models to regularize surface deformation and constrain the search space. These triangulated mesh models heavily relying on the quadratic regularization term are difficult to accurately capture large deformations, such as severe bending. In this paper, we propose a novel Gaussian process regression approach to the nonrigid shape recovery problem, which does not require to involve a predefined triangulated mesh model. By taking advantage of our novel Gaussian process regression formulation together with a robust coarse-to-fine optimization scheme, the proposed method is fully automatic and is able to handle large …


Predicting Outcome For Collaborative Featured Article Nomination In Wikipedia, Meiqun Hu, Ee Peng Lim, Ramayya Krishnan May 2009

Predicting Outcome For Collaborative Featured Article Nomination In Wikipedia, Meiqun Hu, Ee Peng Lim, Ramayya Krishnan

Research Collection School Of Computing and Information Systems

In Wikipedia, good articles are wanted. While Wikipedia relies on collaborative effort from online volunteers for quality checking, the process of selecting top quality articles is time consuming. At present, the duty of decision making is shouldered by only a couple of administrators. Aiming to assist in the quality checking cycles so as to cope with the exponential growth of online contributions to Wikipedia, this work studies the task of predicting the outcome of featured article (FA) nominations. We analyze FA candidate (FAC) sessions collected over a period of 3.5 years, and examine the extent to which consensus has been …