Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons


Articles 1 - 30 of 89

Full-Text Articles in Physical Sciences and Mathematics

DUOL: A Double Updating Approach For Online Learning, Peilin Zhao, Steven C. H. Hoi, Rong Jin Dec 2009


Research Collection School Of Computing and Information Systems

In most online learning algorithms, the weights assigned to the misclassified examples (or support vectors) remain unchanged during the entire learning process. This is clearly insufficient since when a new misclassified example is added to the pool of support vectors, we generally expect it to affect the weights for the existing support vectors. In this paper, we propose a new online learning method, termed Double Updating Online Learning, or DUOL for short. Instead of only assigning a fixed weight to the misclassified example received in current trial, the proposed online learning algorithm also tries to update the weight for one …
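The conventional behaviour the abstract criticises can be illustrated with a kernel perceptron, a standard online learner chosen here only as an example. This is not DUOL itself: the truncated abstract does not give the double-updating rule. The function names, the RBF kernel, and the `gamma` parameter are assumptions made for this sketch. Each misclassified example joins the support set with a fixed weight that is never revisited.

```python
import math


def rbf_kernel(x, z, gamma=1.0):
    """Gaussian (RBF) kernel between two equal-length numeric sequences."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))


def kernel_perceptron(stream, kernel=rbf_kernel):
    """Run one pass of a kernel perceptron over (x, y) pairs with y in {-1, +1}.

    Returns the support set as (x, y, alpha) triples and the mistake count.
    """
    support = []  # each entry: (example, label, weight)
    mistakes = 0
    for x, y in stream:
        # Prediction score is a weighted kernel sum over stored support vectors.
        score = sum(alpha * ys * kernel(xs, x) for xs, ys, alpha in support)
        if y * score <= 0:
            # Misclassified: store the example with a fixed weight of 1.0.
            # Weights of earlier support vectors are left untouched, which is
            # the limitation the abstract points out.
            support.append((x, y, 1.0))
            mistakes += 1
    return support, mistakes
```

On a stream such as `[((0.0,), 1), ((1.0,), -1), ((0.1,), 1), ((0.9,), -1)]`, the first two examples are stored as support vectors and every stored weight stays at 1.0 for the rest of the pass.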


On Strategies For Imbalanced Text Classification Using SVM: A Comparative Study, Aixin Sun, Ee Peng Lim, Ying Liu Dec 2009


Research Collection School Of Computing and Information Systems

Many real-world text classification tasks involve imbalanced training examples. The strategies proposed to address the imbalanced classification (e.g., resampling, instance weighting), however, have not been systematically evaluated in the text domain. In this paper, we conduct a comparative study on the effectiveness of these strategies in the context of imbalanced text classification using Support Vector Machines (SVM) classifier. SVM is the interest in this study for its good classification accuracy reported in many text classification tasks. We propose a taxonomy to organize all proposed strategies following the training and the test phases in text classification tasks. Based on the taxonomy, …


To Trust Or Not To Trust? Predicting Online Trusts Using Trust Antecedent Framework, Viet-An Nguyen, Ee Peng Lim, Jing Jiang, Aixin Sun Dec 2009


Research Collection School Of Computing and Information Systems

This paper analyzes the trustor and trustee factors that lead to inter-personal trust using a well studied Trust Antecedent framework in management science. To apply these factors to trust ranking problem in online rating systems, we derive features that correspond to each factor and develop different trust ranking models. The advantage of this approach is that features relevant to trust can be systematically derived so as to achieve good prediction accuracy. Through a series of experiments on real data from Epinions, we show that even a simple model using the derived features yields good accuracy and outperforms MoleTrust, a trust …


Cyber Attacks: Does Physical Boundary Matter?, Qiu-Hong Wang, Seung-Hyun Kim Dec 2009


Research Collection School Of Computing and Information Systems

Information security issues are characterized by interdependence. Particularly, cyber criminals can easily cross national boundaries and exploit jurisdictional limitations between countries. Thus, whether cyber attacks are spatially autocorrelated is a strategic issue for government authorities and a tactical issue for insurance companies. Through an empirical study of cyber attacks across 62 countries during the period 2003-2007, we find little evidence of the spatial autocorrelation of cyber attacks at any week. However, after considering economic opportunity, IT infrastructure, international collaboration in enforcement and conventional crimes, we find strong evidence that cyber attacks were indeed spatially autocorrelated as they moved over time. …


A Robust Damage Assessment Model For Corrupted Database Systems, Ge Fu, Hong Zhu, Yingjiu Li Dec 2009


Research Collection School Of Computing and Information Systems

An intrusion tolerant database uses damage assessment techniques to detect damage propagation scales in a corrupted database system. Traditional damage assessment approaches in an intrusion tolerant database system can only locate damages which are caused by reading corrupted data. In fact, there are many other damage spreading patterns that have not been considered in traditional damage assessment models. In this paper, we systematically analyze inter-transaction dependency relationships that have been neglected in the previous research and propose four different dependency relationships between transactions which may cause damage propagation. We extend existing damage assessment model based on the four novel dependency …


Learning Bregman Distance Functions And Its Application For Semi-Supervised Clustering, Lei Wu, Rong Jin, Steven C. H. Hoi, Jianke Zhu, Nenghai Yu Dec 2009


Research Collection School Of Computing and Information Systems

Learning distance functions with side information plays a key role in many machine learning and data mining applications. Conventional approaches often assume a Mahalanobis distance function. These approaches are limited in two aspects: (i) they are computationally expensive (even infeasible) for high dimensional data because the size of the metric is in the square of dimensionality; (ii) they assume a fixed metric for the entire input space and therefore are unable to handle heterogeneous data. In this paper, we propose a novel scheme that learns nonlinear Bregman distance functions from side information using a nonparametric approach that is similar to …
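Limitation (i) above can be made concrete with a minimal sketch of a squared Mahalanobis distance, d(x, y) = (x - y)^T M (x - y). The function names are hypothetical and the matrix `M` is assumed to be symmetric positive semidefinite; this illustrates the conventional metric only, not the Bregman distance scheme the paper proposes.

```python
def mahalanobis_sq(x, y, M):
    """Squared Mahalanobis distance (x - y)^T M (x - y) for a d x d matrix M."""
    diff = [a - b for a, b in zip(x, y)]
    d = len(diff)
    # Matrix-vector product M @ diff, written out with plain lists.
    m_diff = [sum(M[i][j] * diff[j] for j in range(d)) for i in range(d)]
    return sum(a * b for a, b in zip(diff, m_diff))


def metric_parameter_count(d):
    """Number of entries in a full d x d metric matrix (quadratic in d)."""
    return d * d
```

With the identity matrix the function reduces to the squared Euclidean distance. A 1,000-dimensional input already requires a matrix with 1,000,000 entries, which is the quadratic growth the abstract refers to.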


Coherent Phrase Model For Efficient Image Near-Duplicate Retrieval, Yiqun Hu, Xiangang Cheng, Liang-Tien Chia, Xing Xie, Deepu Rajan, Ah-Hwee Tan Dec 2009


Research Collection School Of Computing and Information Systems

This paper presents an efficient and effective solution for retrieving image near-duplicates (IND) from an image database. We introduce the coherent phrase model which incorporates the coherency of local regions to reduce the quantization error of the bag-of-words (BoW) model. In this model, local regions are characterized by visual phrase of multiple descriptors instead of visual word of single descriptor. We propose two types of visual phrase to encode the coherency in feature and spatial domain, respectively. The proposed model reduces the number of false matches by using this coherency and generates sparse representations of images. Compared to other methods, the …


Wake Up Or Fall Asleep: Value Implication Of Trusted Computing, Nan Hu, Jianhui Huang, Ling Liu, Yingjiu Li, Dan Ma Dec 2009


Research Collection School Of Computing and Information Systems

More than 10 years have passed since trusted computing (TC) technology was introduced to the market; however, there is still no consensus about its value. The increasing importance of user and enterprise security and the security promised by TC, coupled with the increasing tension between the proponents and the opponents of TC, make it timely to investigate the value relevance of TC in terms of both capital market and accounting performance. Based on both price and volume studies, we found that news releases related to the adoption of the TC technology had no information content. All investors, regardless of whether …


Online Fault Detection Of Induction Motors Using Independent Component Analysis And Fuzzy Neural Network, Zhaoxia Wang, C. S. Chang, X. German, W.W. Tan Nov 2009


Research Collection School Of Computing and Information Systems

This paper proposes the use of independent component analysis and fuzzy neural network for online fault detection of induction motors. The most dominating components of the stator currents measured from laboratory motors are directly identified by an improved method of independent component analysis, which are then used to obtain signatures of the stator current with different faults. The signatures are used to train a fuzzy neural network for detecting induction-motor problems such as broken rotor bars and bearing fault. Using signals collected from laboratory motors, the robustness of the proposed method for online fault detection is demonstrated for various motor …


VIREO/DVMM At TRECVID 2009: High-Level Feature Extraction, Automatic Video Search, And Content-Based Copy Detection, Chong-Wah Ngo, Yu-Gang Jiang, Xiao-Yong Wei, Wanlei Zhao, Yang Liu, Jun Wang, Shiai Zhu, Shih-Fu Chang Nov 2009


Research Collection School Of Computing and Information Systems

This paper presents an overview and comparative analysis of our systems designed for 3 TRECVID 2009 tasks: high-level feature extraction, automatic search, and content-based copy detection.


Ensemble And Individual Noise Reduction Method For Induction-Motor Signature Analysis, Zhaoxia Wang, C.S. Chang, Tw Chua, W.W Tan Nov 2009


Research Collection School Of Computing and Information Systems

Unlike a fixed-frequency power supply, the voltage supplying an inverter-fed motor is heavily corrupted by noises, which are produced from high-frequency switching leading to noisy stator currents. To extract useful information from stator current measurements, a theoretically sound and robust denoising method is required. The effective filtering of these noises is difficult with certain frequency-domain techniques, such as Fourier transform or Wavelet analysis, because some noises have frequencies overlapping with those of the actual signals, and some have high noise-to-frequency ratios. In order to analyze the statistical signatures of different types of signals, a certain number is required of the individual signals to be de-noised without sacrificing the individual characteristic and quantity of the …


Trust Relationship Prediction Using Online Product Review Data, Nan Ma, Ee Peng Lim, Viet-An Nguyen, Aixin Sun Nov 2009


Research Collection School Of Computing and Information Systems

Trust between users is an important piece of knowledge that can be exploited in search and recommendation. Given that user-supplied trust relationships are usually very sparse, we study the prediction of trust relationships using user interaction features in an online user generated review application context. We show that trust relationship prediction can achieve better accuracy when one adopts personalized and cluster-based classification methods. The former trains one classifier for each user using user-specific training data. The cluster-based method first constructs user clusters before training one classifier for each user cluster. Our proposed methods have been evaluated in a series of experiments …


What Makes Categories Difficult To Classify?, Aixin Sun, Ee Peng Lim, Ying Liu Nov 2009


Research Collection School Of Computing and Information Systems

In this paper, we try to predict which category will be less accurately classified compared with other categories in a classification task that involves multiple categories. The categories with poor predicted performance will be identified before any classifiers are trained and additional steps can be taken to address the predicted poor accuracies of these categories. Inspired by the work on query performance prediction in ad-hoc retrieval, we propose to predict classification performance using two measures, namely, category size and category coherence. Our experiments on 20-Newsgroup and Reuters-21578 datasets show that the Spearman rank correlation coefficient between the predicted rank of …
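The Spearman rank correlation coefficient mentioned above can be computed as the Pearson correlation of the two rank vectors. The sketch below is a plain illustration of that standard definition, using average ranks for ties; the function names are hypothetical and nothing here is taken from the paper's experimental setup.

```python
import math


def average_ranks(values):
    """Return 1-based ranks of values, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the rank vectors.

    Raises ZeroDivisionError if either input is constant (zero rank variance).
    """
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)
```

In the setting the abstract describes, `xs` would hold the predicted ranking of categories and `ys` their observed classification performance; a value near 1 would indicate that the predicted difficulty ordering matches the observed one.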


Trust-Oriented Composite Services Selection And Discovery, Lei Li, Yan Wang, Ee Peng Lim Nov 2009


Research Collection School Of Computing and Information Systems

In Service-Oriented Computing (SOC) environments, service clients interact with service providers for consuming services. From the viewpoint of service clients, the trust level of a service or a service provider is a critical issue to consider in service selection and discovery, particularly when a client is looking for a service from a large set of services or service providers. However, a service may invoke other services offered by different providers forming composite services. The complex invocations in composite services greatly increase the complexity of trust-oriented service selection and discovery. In this paper, we propose novel approaches for composite service representation, …


UDel/SMU At TREC 2009 Entity Track, Wei Zheng, Swapna Gottipati, Jing Jiang, Hui Fang Nov 2009


Research Collection School Of Computing and Information Systems

We report our methods and experiment results from the collaborative participation of the InfoLab group from University of Delaware and the School of Information Systems from Singapore Management University in the TREC 2009 Entity track. Our general goal is to study how we may apply language modeling approaches and natural language processing techniques to the task. Specifically, we proposed to find supporting information based on segment retrieval, to extract entities using Stanford NER tagger, and to rank entities based on a previously proposed probabilistic framework for expert finding.


Distribution-Based Concept Selection For Concept-Based Video Retrieval, Juan Cao, Hongfang Jing, Chong-Wah Ngo, Yongdong Zhang Oct 2009


Research Collection School Of Computing and Information Systems

Query-to-concept mapping plays a key role in concept-based video retrieval. Conventional approaches try to find concepts that are likely to co-occur in the relevant shots from the lexical or statistical aspects. However, the high probability of co-occurrence alone cannot ensure its effectiveness to distinguish the relevant shots from the irrelevant ones. In this paper, we propose distribution-based concept selection (DBCS) for query-to-concept mapping by analyzing concept score distributions of within and between relevant and irrelevant sets. In view of the imbalance between relevant and irrelevant examples, two variants of DBCS are proposed respectively by considering the two-sided and one-sided …


Analyzing The Video Popularity Characteristics Of Large-Scale User Generated Content Systems, Meeyoung Cha, Haewoon Kwak, Pablo Rodriguez, Yong-Yeol Ahn, Sue Moon Oct 2009


Research Collection School Of Computing and Information Systems

User generated content (UGC), now with millions of video producers and consumers, is re-shaping the way people watch video and TV. In particular, UGC sites are creating new viewing patterns and social interactions, empowering users to be more creative, and generating new business opportunities. Compared to traditional video-on-demand (VoD) systems, UGC services allow users to request videos from a potentially unlimited selection in an asynchronous fashion. To better understand the impact of UGC services, we have analyzed the world's largest UGC VoD system, YouTube, and a popular similar system in Korea, Daum Videos. In this paper, we first empirically show …


Continuous Monitoring Of Spatial Queries In Wireless Broadcast Environments, Kyriakos Mouratidis, Spiridon Bakiras, Dimitris Papadias Oct 2009


Research Collection School Of Computing and Information Systems

Wireless data broadcast is a promising technique for information dissemination that leverages the computational capabilities of the mobile devices in order to enhance the scalability of the system. Under this environment, the data are continuously broadcast by the server, interleaved with some indexing information for query processing. Clients may then tune in the broadcast channel and process their queries locally without contacting the server. Previous work on spatial query processing for wireless broadcast systems has only considered snapshot queries over static data. In this paper, we propose an air indexing framework that 1) outperforms the existing (i.e., snapshot) techniques in …


Scalable Detection Of Partial Near-Duplicate Videos By Visual-Temporal Consistency, Hung-Khoon Tan, Chong-Wah Ngo, Richang Hong, Tat-Seng Chua Oct 2009


Research Collection School Of Computing and Information Systems

Following the exponential growth of social media, there now exist huge repositories of videos online. Among the huge volumes of videos, there exist large numbers of near-duplicate videos. Most existing techniques either focus on the fast retrieval of full copies or near-duplicates, or consider localization in a heuristic manner. This paper considers the scalable detection and localization of partial near-duplicate videos by jointly considering visual similarity and temporal consistency. Temporal constraints are embedded into a network structure as directed edges. Through the structure, partial alignment is novelly converted into a network flow problem where highly efficient solutions exist. To precisely …


Domain Adaptive Semantic Diffusion For Large Scale Context-Based Video Annotation, Yu-Gang Jiang, Jun Wang, Shih-Fu Chang, Chong-Wah Ngo Oct 2009


Research Collection School Of Computing and Information Systems

Learning to cope with domain change has been known as a challenging problem in many real-world applications. This paper proposes a novel and efficient approach, named domain adaptive semantic diffusion (DASD), to exploit semantic context while considering the domain-shift-of-context for large scale video concept annotation. Starting with a large set of concept detectors, the proposed DASD refines the initial annotation results using graph diffusion technique, which preserves the consistency and smoothness of the annotation over a semantic graph. Different from the existing graph learning methods which capture relations among data samples, the semantic graph treats concepts as nodes and the …


Parallel Sets In The Real World: Three Case Studies, Robert Kosara, Caroline Ziemkiewicz, F. Joseph Iii Mako, Tin Seong Kam Oct 2009


Research Collection School Of Computing and Information Systems

Parallel Sets are a visualization technique for categorical data. We recently released an implementation to the public in an effort to make our research useful to real users. This paper presents three case studies of Parallel Sets in use with real data.


First ACM SIGMM International Workshop On Social Media (WSM'09), Suzanne Boll, Steven C. H. Hoi, Jiebo Luo, Rong Jin, Dong Xu, Irwin King Oct 2009


Research Collection School Of Computing and Information Systems

No abstract provided.


Distance Metric Learning From Uncertain Side Information With Application To Automated Photo Tagging, Lei Wu, Steven C. H. Hoi, Rong Jin, Jianke Zhu, Nenghai Yu Oct 2009


Research Collection School Of Computing and Information Systems

Automated photo tagging is essential to make massive unlabeled photos searchable by text search engines. Conventional image annotation approaches, though working reasonably well on small testbeds, are either computationally expensive or inaccurate when dealing with large-scale photo tagging. Recently, with the popularity of social networking websites, we observe a massive number of user-tagged images, referred to as "social images", that are available on the web. Unlike traditional web images, social images often contain tags and other user-generated content, which offer a new opportunity to resolve some long-standing challenges in multimedia. In this work, we aim to address the challenge of …


First ACM SIGMM International Workshop On Social Media (WSM'09), Suzanne Boll, Steven C. H. Hoi, Jiebo Luo, Rong Jin, Dong Xu, Irwin King Oct 2009


Research Collection School Of Computing and Information Systems

The ACM SIGMM International Workshop on Social Media (WSM’09) is the first workshop held in conjunction with the ACM International Multimedia Conference (MM’09) at Beijing, P.R. China, 2009. This workshop provides a forum for researchers and practitioners from all over the world to share information on their latest investigations on social media analysis, exploration, search, mining, and emerging new social media applications.


Semantics-Preserving Bag-Of-Words Models For Efficient Image Annotation, Lei Wu, Steven C. H. Hoi, Nenghai Yu Oct 2009


Research Collection School Of Computing and Information Systems

The Bag-of-Words (BoW) model is a promising image representation for annotation. One critical limitation of existing BoW models is the semantic loss during the codebook generation process, in which BoW simply clusters visual words in Euclidean space. However, distance between two visual words in Euclidean space does not necessarily reflect the semantic distance between the two concepts, due to the semantic gap between low-level features and high-level semantics. In this paper, we propose a novel scheme for learning a codebook such that semantically related features will be mapped to the same visual word. In particular, we consider the distance between …


Unsupervised Face Alignment By Robust Nonrigid Mapping, Jianke Zhu, Luc Van Gool, Steven C. H. Hoi Oct 2009


Research Collection School Of Computing and Information Systems

We propose a novel approach to unsupervised facial image alignment. Unlike previous approaches, which are confined to affine transformations on either the entire face or separate patches, we extract a nonrigid mapping between facial images. Based on a regularized face model, we frame unsupervised face alignment into the Lucas-Kanade image registration approach. We propose a robust optimization scheme to handle appearance variations. The method is fully automatic and can cope with pose variations and expressions, all in an unsupervised manner. Experiments on a large set of images showed that the approach is effective.


Mining Globally Distributed Frequent Subgraphs In A Single Labeled Graph, Xing Jiang, Hui Xiong, Chen Wang, Ah-Hwee Tan Oct 2009


Research Collection School Of Computing and Information Systems

Recent years have observed increasing efforts on graph mining and many algorithms have been developed for this purpose. However, most of the existing algorithms are designed for discovering frequent subgraphs in a set of labeled graphs only. Also, the few algorithms that find frequent subgraphs in a single labeled graph typically identify subgraphs appearing regionally in the input graph. In contrast, for real-world applications, it is commonly required that the identified frequent subgraphs in a single labeled graph should also be globally distributed. This paper thus fills this crucial void by proposing a new measure, termed G-Measure, to find globally …


Streaming 3D Meshes Using Spectral Geometry Images, Ying He, Boon Seng Chew, Dayong Wang, Steven C. H. Hoi, Lap Pui Chau Oct 2009


Research Collection School Of Computing and Information Systems

The transmission of 3D models in the form of Geometry Images (GI) is an emerging and appealing concept due to the reduction in complexity from R3 to image space and wide availability of mature image processing tools and standards. However, geometry images often suffer from the artifacts and error during compression and transmission. Thus, there is a need to address the artifact reduction, error resilience and protection of such data information during the transmission across an error prone network. In this paper, we introduce a new concept, called Spectral Geometry Images (SGI), which naturally combines the powerful spectral analysis with …


Why Quants Fail, M. Thulasidas Sep 2009


Research Collection School Of Computing and Information Systems

Mathematical finance is built on a couple of assumptions. The most fundamental of them is the one on market efficiency. It states that the market prices every asset fairly, and that the prices contain all the information available in the market.


Accelerating Sequence Searching: Dimensionality Reduction Method, Guojie Song, Bin Cui, Baihua Zheng, Kunqing Xie, Dongqing Yang Sep 2009


Research Collection School Of Computing and Information Systems

Similarity search over long sequence datasets has become increasingly popular in many emerging applications, such as text retrieval and genetic sequence exploration. In this paper, a novel index structure, namely the Sequence Embedding Multiset tree (SEM-tree), is proposed to speed up the searching process over long sequences. The SEM-tree is a multi-level structure where each level represents the sequence data with a different compression level of multiset, and the length of the multiset increases towards the leaf level, which contains the original sequences. The multisets, obtained using sequence embedding algorithms, have the desirable property that they do not need to keep the …