Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 30 of 34

Full-Text Articles in Physical Sciences and Mathematics

Detecting Anomalies In Bipartite Graphs With Mutual Dependency Principles, Hanbo Dai, Feida Zhu, Ee Peng Lim, Hwee Hwa Pang Dec 2012

Research Collection School Of Computing and Information Systems

Bipartite graphs can model many real life applications including users-rating-products in online marketplaces, users-clicking-webpages on the World Wide Web and users referring users in social networks. In these graphs, the anomalousness of nodes in one partite often depends on that of their connected nodes in the other partite. Previous studies have shown that this dependency can be positive (the anomalousness of a node in one partite increases or decreases along with that of its connected nodes in the other partite) or negative (the anomalousness of a node in one partite rises or falls in opposite direction to that of its …


Visualization For Anomaly Detection And Data Management By Leveraging Network, Sensor And Gis Techniques, Zhaoxia Wang, Chee Seng Chong, Rick S. M. Goh, Wanqing Zhou, Dan Peng, Hoong Chor Chin Dec 2012

Research Collection School Of Computing and Information Systems

This paper studies the importance of visualization for discerning and interpreting patterns of data and its application for solving real problems, such as anomaly detection and data management. There are various ways to realize visualization to cater to the needs of numerous real life applications. Depending on needs, a combination of some of these ways may be required for presenting an effective visualization. The authors present visualization schemes for anomaly detection/condition monitoring and data management by leveraging network techniques and combining them with modern techniques such as sensor, database, mobile communication, GPS and GIS techniques. Two case studies are presented …


A Survey Of Recommender Systems In Twitter, Su Mon Kywe, Ee Peng Lim, Feida Zhu Dec 2012

Research Collection School Of Computing and Information Systems

Twitter is a social information network where short messages or tweets are shared among a large number of users through a very simple messaging mechanism. With a population of more than 100M users generating more than 300M tweets each day, Twitter users can be easily overwhelmed by the massive amount of information available and the huge number of people they can interact with. To overcome the above information overload problem, recommender systems can be introduced to help users make the appropriate selection. Researchers have begun to study recommendation problems in Twitter but their work usually addresses individual recommendation tasks. There …


Cost-Sensitive Online Classification, Jialei Wang, Peilin Zhao, Steven C. H. Hoi Dec 2012

Research Collection School Of Computing and Information Systems

Both cost-sensitive classification and online learning have been extensively studied in data mining and machine learning communities, respectively. However, very few studies address an important intersecting problem, that is, “Cost-Sensitive Online Classification”. In this paper, we formally study this problem, and propose a new framework for Cost-Sensitive Online Classification by directly optimizing cost-sensitive measures using online gradient descent techniques. Specifically, we propose two novel cost-sensitive online classification algorithms, which are designed to directly optimize two well-known cost-sensitive measures: (i) maximization of weighted sum of sensitivity and specificity, and (ii) minimization of weighted misclassification cost. We analyze the theoretical bounds of …
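The two cost-sensitive measures named in this abstract can be illustrated with a small sketch. It only evaluates the measures on a batch of predictions; it is not the authors' online algorithm. The function name, the +1/-1 label encoding, and the default weights (w_pos, c_pos, c_neg) are assumptions chosen for illustration.

```python
def cost_sensitive_measures(y_true, y_pred, w_pos=0.5, c_pos=0.9, c_neg=0.1):
    """Evaluate two cost-sensitive measures on labels encoded as +1 / -1.

    w_pos, c_pos and c_neg are illustrative weights, not values from the paper.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)    # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == -1)   # missed positives
    tn = sum(1 for t, p in pairs if t == -1 and p == -1)  # true negatives
    fp = sum(1 for t, p in pairs if t == -1 and p == 1)   # false alarms

    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0

    # (i) weighted sum of sensitivity and specificity (higher is better)
    weighted_sum = w_pos * sensitivity + (1.0 - w_pos) * specificity
    # (ii) weighted misclassification cost (lower is better)
    weighted_cost = c_pos * fn + c_neg * fp
    return weighted_sum, weighted_cost


y_true = [1, 1, -1, -1, -1, -1]
y_pred = [1, -1, -1, -1, -1, 1]
weighted_sum, weighted_cost = cost_sensitive_measures(y_true, y_pred)
print(weighted_sum, weighted_cost)
```

With the default weights, a missed positive costs nine times as much as a false alarm, which is the kind of asymmetry that plain accuracy cannot express.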


Fast And Accurate Psd Matrix Estimation By Row Reduction, Hiroshi Kuwajima, Takashi Washio, Ee Peng Lim Nov 2012

Research Collection School Of Computing and Information Systems

Fast and accurate estimation of missing relations, e.g., similarity, distance and kernel, among objects is now one of the most important techniques required by major data mining tasks, because the missing information of the relations is needed in many applications such as economics, psychology, and social network communities. Though some approaches have been proposed in the last several years, the practical balance between their required computation amount and obtained accuracy is insufficient for some classes of relation estimation. The objective of this paper is to formalize a problem to quickly and efficiently estimate missing relations among objects from the …


Impact Of Multimedia In Sina Weibo: Popularity And Life Span, Xun Zhao, Feida Zhu, Weining Qian, Aoying Zhou Nov 2012

Research Collection School Of Computing and Information Systems

Multimedia contents such as images and videos are widely used in social network sites nowadays. Sina Weibo, a Chinese microblogging service, is one of the first microblog platforms to incorporate multimedia content sharing features. This work provides statistical analysis on how multimedia contents are produced, consumed, and propagated in Sina Weibo. Based on 230 million tweets and 1.8 million user profiles in Sina Weibo, we study the impact of multimedia contents on the popularity of both users and tweets as well as tweet life span. Our preliminary study shows that multimedia tweets dominate pure text ones in Sina Weibo. Multimedia contents …


Divad: A Dynamic And Interactive Visual Analytical Dashboard For Exploring And Analyzing Transport Data, Tin Seong Kam, Ketan Barshikar, Shaun Jun Hua Tan Nov 2012

Research Collection School Of Computing and Information Systems

The advances in location-based data collection technologies such as GPS, RFID etc. and the rapid reduction of their costs provide us with a huge and continuously increasing amount of data about movement of vehicles, people and goods in an urban area. This explosive growth of geospatially-referenced data has far outpaced the planner’s ability to utilize and transform the data into insightful information thus creating an adverse impact on the return on the investment made to collect and manage this data. Addressing this pressing need, we designed and developed DIVAD, a dynamic and interactive visual analytics dashboard to allow city planners …


A Unified Learning Framework For Auto Face Annotation By Mining Web Facial Images, Dayong Wang, Steven C. H. Hoi, Ying He Nov 2012

Research Collection School Of Computing and Information Systems

Auto face annotation plays an important role in many real-world multimedia information and knowledge management systems. Recently there has been a surge of research interest in mining weakly-labeled facial images on the internet to tackle this long-standing research challenge in computer vision and image understanding. In this paper, we present a novel unified learning framework for face annotation by mining weakly labeled web facial images through interdisciplinary efforts of combining sparse feature representation, content-based image retrieval, transductive learning and inductive learning techniques. In particular, we first introduce a new search-based face annotation paradigm using transductive learning, and then propose an effective …


Influentials, Novelty, And Social Contagion: The Viral Power Of Average Friends, Close Communities, And Old News, Nicholas Harrigan, Palakorn Achananuparp, Ee Peng Lim Oct 2012

Research Collection School Of Computing and Information Systems

What is the effect of (1) popular individuals, and (2) community structures on the retransmission of socially contagious behavior? We examine a community of Twitter users over a five month period, operationalizing social contagion as ‘retweeting’, and social structure as the count of subgraphs (small patterns of ties and nodes) between users in the follower/following network. We find that popular individuals act as ‘inefficient hubs’ for social contagion: they have limited attention, are overloaded with inputs, and therefore display limited responsiveness to viral messages. We argue this contradicts the ‘law of the few’ and ‘influentials hypothesis’. We find that community …


Entity Synonyms For Structured Web Search, Tao Cheng, Hady W. Lauw, Stelios Paparizos Oct 2012

Research Collection School Of Computing and Information Systems

Nowadays, many queries issued to search engines aim at finding values from structured data (e.g., movie showtime of a specific location). In such scenarios, there is often a mismatch between the values of structured data (how content creators describe entities) and the web queries (how different users try to retrieve them). Therefore, recognizing the alternative ways people use to reference an entity is crucial for structured web search. In this paper, we study the problem of automatic generation of entity synonyms over structured data toward closing the gap between users and structured data. We propose an offline, data-driven …


Self-Regulating Action Exploration In Reinforcement Learning, Teck-Hou Teng, Ah-Hwee Tan Oct 2012

Research Collection School Of Computing and Information Systems

The basic tenet of a learning process is for an agent to learn for only as much and as long as it is necessary. With reinforcement learning, the learning process is divided between exploration and exploitation. Given the complexity of the problem domain and the randomness of the learning process, the exact duration of the reinforcement learning process can never be known with certainty. Using an inaccurate number of training iterations leads either to the non-convergence or the over-training of the learning agent. This work addresses such issues by proposing a technique to self-regulate the exploration rate and training duration …


Modeling Concept Dynamics For Large Scale Music Search, Jialie Shen, Hwee Hwa Pang, Meng Wang, Shuicheng Yan Aug 2012

Research Collection School Of Computing and Information Systems

Continuing advances in data storage and communication technologies have led to an explosive growth in digital music collections. To cope with their increasing scale, we need effective Music Information Retrieval (MIR) capabilities like tagging, concept search and clustering. Integral to MIR is a framework for modelling music documents and generating discriminative signatures for them. In this paper, we introduce a multimodal, layered learning framework called DMCM. Distinguished from the existing approaches that encode music as an ensemble of order-less feature vectors, our framework extracts from each music document a variety of acoustic features, and translates them into low-level encodings over …


Online Feature Selection For Mining Big Data, Steven C. H. Hoi, Jialei Wang, Peilin Zhao, Rong Jin Aug 2012

Research Collection School Of Computing and Information Systems

Most studies of online learning require accessing all the attributes/features of training instances. Such a classical setting is not always appropriate for real-world applications when data instances are of high dimensionality or when it is expensive to acquire the full set of attributes/features. To address this limitation, we investigate the problem of Online Feature Selection (OFS) in which the online learner is only allowed to maintain a classifier involving a small and fixed number of features. The key challenge of Online Feature Selection is how to make accurate predictions using a small and fixed number of active features. …


Boosting Multi-Kernel Locality-Sensitive Hashing For Scalable Image Retrieval, Hao Xia, Steven C. H. Hoi, Pengcheng Wu, Rong Jin Aug 2012

Research Collection School Of Computing and Information Systems

Similarity search is a key challenge for multimedia retrieval applications where data are usually represented in high-dimensional space. Among various algorithms proposed for similarity search in high-dimensional space, Locality-Sensitive Hashing (LSH) is the most popular one, which recently has been extended to Kernelized Locality-Sensitive Hashing (KLSH) by exploiting kernel similarity for better retrieval efficacy. Typically, KLSH works only with a single kernel, which is often limited in real-world multimedia applications, where data may originate from multiple resources or can be represented in several different forms. For example, in content-based multimedia retrieval, a variety of features can be extracted to represent …
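For readers unfamiliar with the baseline this abstract builds on, the following is a minimal sketch of plain random-hyperplane LSH for cosine similarity. It is not KLSH and not the boosted multi-kernel method of the paper; the function names, the number of bits, and the fixed seed are assumptions made for illustration.

```python
import random


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random Gaussian hyperplanes in dim dimensions."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def hash_code(vector, hyperplanes):
    """One bit per hyperplane: 1 if the vector lies on its non-negative side."""
    return tuple(
        1 if sum(h * x for h, x in zip(plane, vector)) >= 0 else 0
        for plane in hyperplanes
    )


def hamming(a, b):
    """Number of differing bits; a small distance suggests a small angle."""
    return sum(x != y for x, y in zip(a, b))


planes = make_hyperplanes(num_bits=16, dim=3)
v = [1.0, 2.0, 3.0]
scaled = [2.0 * x for x in v]    # same direction as v
opposite = [-x for x in v]       # opposite direction to v

print(hamming(hash_code(v, planes), hash_code(scaled, planes)))
print(hamming(hash_code(v, planes), hash_code(opposite, planes)))
```

Vectors pointing in the same direction receive identical codes, so candidates can be retrieved by bucket lookup instead of a full scan. Kernelized variants replace the explicit dot product with kernel evaluations.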


A Non-Parametric Visual-Sense Model Of Images: Extending The Cluster Hypothesis Beyond Text, Kong-Wah Wan, Ah-Hwee Tan, Joo-Hwee Lim, Liang-Tien Chia Aug 2012

Research Collection School Of Computing and Information Systems

The main challenge of a search engine is to find information that is relevant and appropriate. However, this can become difficult when queries are issued using ambiguous words. Rijsbergen first hypothesized a clustering approach for web pages wherein closely associated pages are treated as a semantic group with the same relevance to the query (Rijsbergen 1979). In this paper, we extend Rijsbergen’s cluster hypothesis to multimedia content such as images. Given a user query, the polysemy in the return image set is related to the many possible meanings of the query. We develop a method to cluster the polysemous images …


Shortest Path Computation With No Information Leakage, Kyriakos Mouratidis, Man Lung Yiu Aug 2012

Research Collection School Of Computing and Information Systems

Shortest path computation is one of the most common queries in location-based services (LBSs). Although particularly useful, such queries raise serious privacy concerns. Exposing to a (potentially untrusted) LBS the client’s position and her destination may reveal personal information, such as social habits, health condition, shopping preferences, lifestyle choices, etc. The only existing method for privacy-preserving shortest path computation follows the obfuscation paradigm; it prevents the LBS from inferring the source and destination of the query with a probability higher than a threshold. This implies, however, that the LBS still deduces some information (albeit not exact) about the client’s location …


Finding Bursty Topics From Microblogs, Qiming Diao, Jing Jiang, Feida Zhu, Ee Peng Lim Jul 2012

Research Collection School Of Computing and Information Systems

Microblogs such as Twitter reflect the general public’s reactions to major events. Bursty topics from microblogs reveal what events have attracted the most online attention. Although bursty event detection from text streams has been studied before, previous work may not be suitable for microblogs because compared with other text streams such as news articles and scientific publications, microblog posts are particularly diverse and noisy. To find topics that have bursty patterns on microblogs, we propose a topic model that simultaneously captures two observations: (1) posts published around the same time are more likely to have the same topic, and (2) …


Identifying Event-Related Bursts Via Social Media Activities, Xin Zhao, Baihan Shu, Jing Jiang, Yang Song, Hongfei Yan, Xiaoming Li Jul 2012

Research Collection School Of Computing and Information Systems

Activities on social media increase at a dramatic rate. When an external event happens, there is a surge in the degree of activities related to the event. These activities may be temporally correlated with one another, but they may also capture different aspects of an event and therefore exhibit different bursty patterns. In this paper, we propose to identify event-related bursts via social media activities. We study how to correlate multiple types of activities to derive a global bursty pattern. To model smoothness of one state sequence, we propose a novel function which can capture the state context. The experiments …


Logistics Orchestration Modeling And Evaluation For Humanitarian Relief, Hoong Chuin Lau, Zhengping Li, Xin Du, Heng Jiang, Robert De Souza Jul 2012

Research Collection School Of Computing and Information Systems

This paper proposes an orchestration model for post-disaster response that is aimed at automating the coordination of scarce resources that minimizes the loss of human lives. In our setting, different teams are treated as agents and their activities are "orchestrated" to optimize rescue performance. Results from simulation are analysed to evaluate the performance of the optimization model.


Joint Learning For Coreference Resolution With Markov Logic, Yang Song, Jing Jiang, Xin Zhao, Sujian Li, Houfeng Wang Jul 2012

Research Collection School Of Computing and Information Systems

Pairwise coreference resolution models must merge pairwise coreference decisions to generate final outputs. Traditional merging methods adopt different strategies such as the best first method and enforcing the transitivity constraint, but most of these methods are used independently of the pairwise learning methods as an isolated inference procedure at the end. We propose a joint learning model which combines pairwise classification and mention clustering with Markov logic. Experimental results show that our joint learning system outperforms independent learning systems. Our system gives a better performance than all the learning-based systems from the CoNLL-2011 shared task on the same dataset. Compared …


Enhancing Access Privacy Of Range Retrievals Over B+Trees, Hwee Hwa Pang, Jilian Zhang, Kyriakos Mouratidis Jul 2012

Research Collection School Of Computing and Information Systems

Users of databases that are hosted on shared servers cannot take for granted that their queries will not be disclosed to unauthorized parties. Even if the database is encrypted, an adversary who is monitoring the I/O activity on the server may still be able to infer some information about a user query. For the particular case of a B+-tree that has its nodes encrypted, we identify properties that enable the ordering among the leaf nodes to be deduced. These properties allow us to construct adversarial algorithms to recover the B+-tree structure from the I/O traces generated by range queries. Combining …


Topic Discovery From Tweet Replies, Bingtian Dai, Ee Peng Lim, Philips Kokoh Prasetyo Jul 2012

Research Collection School Of Computing and Information Systems

Twitter is a popular online social information network service which allows people to read and post messages of up to 140 characters, known as “tweets”. In this paper, we focus on the tweets between pairs of individuals, i.e., the tweet replies, and propose a generative model to discover topics among groups of Twitter users. Our model has then been evaluated with a tweet dataset to show its effectiveness.


Virality And Susceptibility In Information Diffusions, Tuan-Anh Hoang, Ee Peng Lim Jun 2012

Research Collection School Of Computing and Information Systems

Viral diffusion allows a piece of information to widely and quickly spread within the network of users through word-of-mouth. In this paper, we study the problem of modeling both item and user factors that contribute to viral diffusion in the Twitter network. We identify three behavioral factors, namely user virality, user susceptibility and item virality, that contribute to viral diffusion. Instead of modeling these factors independently as done in previous research, we propose a model that measures all the factors simultaneously considering their mutual dependencies. The model has been evaluated on both synthetic and real datasets. The experiments show that our …


Visualizing Media Bias Through Twitter, Jisun An, Meeyoung Cha, Krishna Gummadi, Jon Crowcroft, Daniele Quercia Jun 2012

Research Collection School Of Computing and Information Systems

Traditional media outlets are known to report political news in a biased way, potentially affecting the political beliefs of the audience and even altering their voting behaviors. Therefore, tracking bias in everyday news and building a platform where people can receive balanced news information is important. We propose a model that maps the news media sources along a dimensional dichotomous political spectrum using the co-subscriptions relationships inferred by Twitter links. By analyzing 7 million follow links, we show that the political dichotomy naturally arises on Twitter when we only consider direct media subscription. Furthermore, we demonstrate a real-time Twitter-based application …


Modeling Diffusion In Social Networks Using Network Properties, Duc Minh Luu, Ee Peng Lim, Tuan Anh Hoang, Chong Tat Freddy Chua Jun 2012

Research Collection School Of Computing and Information Systems

Diffusion of items occurs in social networks due to spreading of items through word of mouth and exogenous factors. These items may be news, products, videos, advertisements or contagious viruses. When a user purchases or consumes one of such items, we say that she adopts the item and she becomes an item adopter. Previous research has studied the diffusion process at both the macro and micro levels. The former models the number of item adopters in the diffusion process while the latter determines which individuals adopt the item. Both macro and micro level models have their merits and limitations. In this paper, …


Organizing User Search Histories, Heasoo Hwang, Hady W. Lauw, Lise Getoor, Alexandros Ntoulas May 2012

Research Collection School Of Computing and Information Systems

Users are increasingly pursuing complex task-oriented goals on the web, such as making travel arrangements, managing finances, or planning purchases. To this end, they usually break down the tasks into a few codependent steps and issue multiple queries around these steps repeatedly over long periods of time. To better support users in their long-term information quests on the web, search engines keep track of their queries and clicks while searching online. In this paper, we study the problem of organizing a user's historical queries into groups in a dynamic and automated fashion. Automatically identifying query groups is helpful for a …


Mining Social Dependencies In Dynamic Interaction Networks, Freddy Chong-Tat Chua, Hady W. Lauw, Ee Peng Lim Apr 2012

Research Collection School Of Computing and Information Systems

User-to-user interactions have become ubiquitous in Web 2.0. Users exchange emails, post on newsgroups, tag web pages, co-author papers, etc. Through these interactions, users co-produce or co-adopt content items (e.g., words in emails, tags in social bookmarking sites). We model such dynamic interactions as a user interaction network, which relates users, interactions, and content items over time. After some interactions, a user may produce content that is more similar to those produced by other users previously. We term this effect social dependency, and we seek to mine from such networks the degree to which a user may be socially dependent …


Obfuscating The Topical Intention In Enterprise Text Search, Hwee Hwa Pang, Xiaokui Xiao, Jialie Shen Apr 2012

Research Collection School Of Computing and Information Systems

The text search queries in an enterprise can reveal the users' topic of interest, and in turn confidential staff or business information. To safeguard the enterprise from consequences arising from a disclosure of the query traces, it is desirable to obfuscate the true user intention from the search engine, without requiring it to be re-engineered. In this paper, we advocate a unique approach to profile the topics that are relevant to the user intention. Based on this approach, we introduce an (ε₁, ε₂)-privacy model that allows a user to stipulate that topics relevant to her intention …


Quality And Leniency In Online Collaborative Rating Systems, Hady W. Lauw, Ee Peng Lim, Ke Wang Mar 2012

Research Collection School Of Computing and Information Systems

The emerging trend of social information processing has resulted in Web users’ increased reliance on user-generated content contributed by others for information searching and decision making. Rating scores, a form of user-generated content contributed by reviewers in online rating systems, allow users to leverage others’ opinions in the evaluation of objects. In this article, we focus on the problem of summarizing the rating scores given to an object into an overall score that reflects the object’s quality. We observe that the existing approaches for summarizing scores largely ignore the effect of reviewers exercising different standards in assigning scores. Instead of …


Road: A New Spatial Object Search Framework For Road Networks, Ken C. K. Lee, Wang-Chien Lee, Baihua Zheng, Yuan Tian Mar 2012

Research Collection School Of Computing and Information Systems

In this paper, we present a new system framework called ROAD for spatial object search on road networks. ROAD is extensible to diverse object types and efficient for processing various location-dependent spatial queries (LDSQs), as it maintains objects separately from a given network and adopts an effective search space pruning technique. Based on our analysis on the two essential operations for LDSQ processing, namely, network traversal and object lookup, ROAD organizes a large road network as a hierarchy of interconnected regional subnetworks (called Rnets). Each Rnet is augmented with 1) shortcuts and 2) object abstracts to accelerate network traversals and …