Open Access. Powered by Scholars. Published by Universities.®

Databases and Information Systems Commons

Articles 1 - 14 of 14

Full-Text Articles in Databases and Information Systems

Collaborative Online Ranking Algorithms For Multitask Learning, Guangxia Li, Peilin Zhao, Tao Mei, Peng Yang, Yulong Shen, Julian K. Y. Chang, Steven C. H. Hoi Oct 2019

Research Collection School Of Computing and Information Systems

There are many applications in which it is desirable to rank or order instances that belong to several different but related problems or tasks. Although each ranking problem is unique, it often shares characteristics with the other problems in the group. Conventional ranking methods treat each task independently without considering the latent commonalities. In this paper, we study the problem of learning to rank instances that belong to multiple related tasks from the multitask learning perspective. We consider a case in which the information that is learned for a task can be used to enhance the learning of other tasks and propose …


Detecting Cyberattacks In Industrial Control Systems Using Online Learning Algorithms, Guangxia Li, Yulong Shen, Peilin Zhao, Xiao Lu, Jia Liu, Yangyang Liu, Steven C. H. Hoi Oct 2019

Research Collection School Of Computing and Information Systems

Industrial control systems are critical to the operation of industrial facilities, especially critical infrastructure such as refineries, power grids, and transportation systems. As with other information systems, a significant threat to industrial control systems is the attack from cyberspace: the offensive maneuvers launched by "anonymous" in the digital world that target computer-based assets with the goal of compromising a system's functions or probing for information. Owing to the importance of industrial control systems, and the possibly devastating consequences of being attacked, significant efforts have been made to secure industrial control systems from cyberattacks. Among them are intrusion detection systems that …


Large Scale Online Multiple Kernel Regression With Application To Time-Series Prediction, Doyen Sahoo, Steven C. H. Hoi, Bin Lin Jan 2019

Research Collection School Of Computing and Information Systems

Kernel-based regression represents an important family of learning techniques for solving challenging regression tasks with non-linear patterns. Despite being studied extensively, most existing methods suffer from two major drawbacks: (i) they are often designed for solving regression tasks in a batch learning setting, making them not only computationally inefficient but also poorly scalable in real-world applications where data arrives sequentially; and (ii) they usually assume that a fixed kernel function is given prior to the learning task, which could result in poor performance if the chosen kernel is inappropriate. To overcome these drawbacks, this work …


Online Active Learning With Expert Advice, Shuji Hao, Peiying Hu, Peilin Zhao, Steven C. H. Hoi, Chunyan Miao Jul 2018

Research Collection School Of Computing and Information Systems

In the literature, methods for learning with expert advice usually assume that a learner always obtains the true label of every incoming training instance at the end of each trial. However, in many real-world applications, acquiring the true labels of all instances can be both costly and time consuming, especially for large-scale problems. For example, in social media, data streams usually arrive at high speed and in high volume, and it is nearly impossible and highly costly to label all of the instances. In this article, we address this problem with active learning with expert advice, where the ground truth of an …


Distributed Multi-Task Classification: A Decentralized Online Learning Approach, Chi Zhang, Peilin Zhao, Shuji Hao, Yeng Chai Soh, Bu Sung Lee, Chunyan Miao, Steven C. H. Hoi Apr 2018

Research Collection School Of Computing and Information Systems

Although dispersing a single task to distributed learning nodes has been intensively studied in previous research, multi-task learning on distributed networks is still an area that has not been fully explored, especially under decentralized settings. The challenge lies in the fact that different tasks may have different optimal learning weights while communication through the distributed network forces all tasks to converge to a unique classifier. In this paper, we present a novel algorithm to overcome this challenge and enable learning multiple tasks simultaneously on a decentralized distributed network. Specifically, the learning framework can be separated into two phases: (i) …


Sparse Passive-Aggressive Learning For Bounded Online Kernel Methods, Jing Lu, Doyen Sahoo, Peilin Zhao, Steven C. H. Hoi Feb 2018

Research Collection School Of Computing and Information Systems

One critical deficiency of traditional online kernel learning methods is their unbounded and growing number of support vectors in the online learning process, making them inefficient and non-scalable for large-scale applications. Recent studies on scalable online kernel learning have attempted to overcome this shortcoming, e.g., by imposing a constant budget on the number of support vectors. Although they attempt to bound the number of support vectors at each online learning iteration, most of them fail to bound the number of support vectors for the final output hypothesis, which is often obtained by averaging the series of hypotheses over all the …


Scalable Online Kernel Learning, Jing Lu Nov 2017

Dissertations and Theses Collection (Open Access)

One critical deficiency of traditional online kernel learning methods is their increasing and unbounded number of support vectors (SV’s), making them inefficient and non-scalable for large-scale applications. Recent studies on budget online learning have attempted to overcome this shortcoming by bounding the number of SV’s. Despite being extensively studied, budget algorithms usually suffer from several drawbacks.
First of all, although existing algorithms attempt to bound the number of SV’s at each iteration, most of them fail to bound the number of SV’s for the final averaged classifier, which is commonly used for online-to-batch conversion. To solve this problem, we propose …


Collaborative Topic Regression For Online Recommender Systems: An Online And Bayesian Approach, Chenghao Liu, Tao Jin, Steven C. H. Hoi, Peilin Zhao, Jianling Sun May 2017

Research Collection School Of Computing and Information Systems

Collaborative Topic Regression (CTR) combines ideas of probabilistic matrix factorization (PMF) and topic modeling (such as LDA) for recommender systems, and has gained increasing success in many applications. Despite enjoying many advantages, the existing Batch Decoupled Inference algorithm for the CTR model has some critical limitations. First, it is designed to work in a batch learning manner, making it unsuitable for streaming data or big data in real-world recommender systems. Second, in the existing algorithm, the item-specific topic proportions of LDA are fed to the downstream PMF but the rating information is not exploited in discovering …


Soal: Second-Order Online Active Learning, Shuji Hao, Peilin Zhao, Jing Lu, Steven C. H. Hoi, Chunyan Miao, Chi Zhang Feb 2017

Research Collection School Of Computing and Information Systems

This paper investigates the problem of online active learning for training classification models from sequentially arriving data. This is more challenging than conventional online learning tasks since the learner not only needs to figure out how to effectively update the classifier but also needs to decide when to query the label of an incoming instance given a limited label budget. The existing online active learning approaches are often based on first-order online learning methods, which generally suffer from slow convergence rates and suboptimal exploitation of available information when querying the labeled data. To overcome the limitations, …


Soft Confidence-Weighted Learning, Jialei Wang, Peilin Zhao, Steven C. H. Hoi Sep 2016

Research Collection School Of Computing and Information Systems

Online learning plays an important role in many big data mining problems because of its high efficiency and scalability. In the literature, many online learning algorithms using gradient information have been applied to solve online classification problems. Recently, more effective second-order algorithms have been proposed, where the correlation between the features is utilized to improve the learning efficiency. Among them, Confidence-Weighted (CW) learning algorithms are very effective, which assume that the classification model is drawn from a Gaussian distribution, which enables the model to be effectively updated with the second-order information of the data stream. Despite being studied actively, these CW algorithms cannot handle nonseparable datasets and noisy datasets very …


Large Scale Online Kernel Learning, Jing Lu, Steven C. H. Hoi, Jialei Wang, Peilin Zhao, Zhi-Yong Liu Apr 2016

Research Collection School Of Computing and Information Systems

In this paper, we present a new framework for large scale online kernel learning, making kernel methods efficient and scalable for large-scale online learning applications. Unlike the regular budget online kernel learning scheme that usually uses some budget maintenance strategies to bound the number of support vectors, our framework explores a completely different approach of kernel functional approximation techniques to make the subsequent online learning task efficient and scalable. Specifically, we present two different online kernel machine learning algorithms: (i) Fourier Online Gradient Descent (FOGD) algorithm that applies the random Fourier features for approximating kernel functions; and (ii) Nyström Online …
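The random Fourier feature idea named in the abstract above can be illustrated with a minimal sketch. This is not the authors' FOGD implementation: it assumes a Gaussian (RBF) kernel with bandwidth `sigma`, a hinge loss, and a fixed step size `eta`, and the function names `make_fourier_map` and `fogd_sketch` are hypothetical. The point it shows is that the model is a weight vector of fixed length, so no support vectors accumulate.

```python
import math
import random


def make_fourier_map(dim, n_features, sigma, seed=0):
    """Build a random feature map z with z(x).z(y) ~ exp(-||x-y||^2 / (2 sigma^2)).

    Assumption: Gaussian kernel, so frequencies are drawn from N(0, 1/sigma^2).
    """
    rng = random.Random(seed)
    freqs = [[rng.gauss(0.0, 1.0 / sigma) for _ in range(dim)]
             for _ in range(n_features)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    scale = math.sqrt(2.0 / n_features)

    def feature_map(x):
        return [scale * math.cos(sum(w * xi for w, xi in zip(freq, x)) + b)
                for freq, b in zip(freqs, phases)]

    return feature_map


def fogd_sketch(stream, feature_map, n_features, eta=0.5):
    """Online gradient descent on the hinge loss in the random feature space.

    `stream` yields (x, y) pairs with y in {-1, +1}. Returns the weight
    vector and the number of online mistakes.
    """
    w = [0.0] * n_features
    mistakes = 0
    for x, y in stream:
        z = feature_map(x)
        score = sum(wi * zi for wi, zi in zip(w, z))
        if y * score <= 0:
            mistakes += 1
        if y * score < 1:  # hinge loss is active: take a gradient step
            w = [wi + eta * y * zi for wi, zi in zip(w, z)]
    return w, mistakes
```

Memory use is governed by `n_features`, chosen up front, rather than by the number of mistakes made on the stream; the trade-off is that the kernel is only approximated, with the error shrinking as `n_features` grows.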


Confidence Weighted Mean Reversion Strategy For Online Portfolio Selection, Bin Li, Steven C. H. Hoi, Peilin Zhao, Vivekanand Gopalkrishnan Mar 2013

Research Collection School Of Computing and Information Systems

Online portfolio selection has been attracting increasing attention from the data mining and machine learning communities. All existing online portfolio selection strategies focus on the first-order information of a portfolio vector, though the second-order information may also be beneficial to a strategy. Moreover, empirical evidence shows that relative stock prices may follow the mean reversion property, which has not been fully exploited by existing strategies. This article proposes a novel online portfolio selection strategy named Confidence Weighted Mean Reversion (CWMR). Inspired by the mean reversion principle in finance and the confidence weighted online learning technique in machine learning, CWMR …


On-Line Portfolio Selection With Moving Average Reversion, Bin Li, Steven C. H. Hoi Jul 2012

Research Collection School Of Computing and Information Systems

On-line portfolio selection has attracted increasing interest in the machine learning and AI communities recently. Empirical evidence shows that a stock's high and low prices are temporary and that stock price relatives are likely to follow the mean reversion phenomenon. While the existing mean reversion strategies are shown to achieve good empirical performance on many real datasets, they often make the single-period mean reversion assumption, which is not always satisfied in some real datasets, leading to poor performance when the assumption does not hold. To overcome the limitation, this article proposes a multiple-period mean reversion, or so-called Moving Average Reversion (MAR), and a …
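As a rough illustration of multiple-period mean reversion, one way to form a prediction is to assume the next price reverts to a simple moving average of recent prices, so the predicted price relative is that average divided by the latest price. The sketch below rests on that assumption only; the function name `moving_average_relative` and the `window` parameter are hypothetical, and the portfolio update step that the truncated abstract goes on to describe is not shown.

```python
def moving_average_relative(prices, window):
    """Predict the next price relative as (moving average) / (latest price).

    `prices` is a chronological list of one asset's prices; `window` is the
    number of most recent periods averaged. A single-period assumption would
    instead look only at the previous period.
    """
    if window < 1 or len(prices) < window:
        raise ValueError("need at least `window` prices and window >= 1")
    recent = prices[-window:]
    return (sum(recent) / window) / prices[-1]
```

For prices 10, 12, 8 and a window of 3, the moving average is 10, so the predicted relative is 10 / 8 = 1.25: the asset is currently below its recent average and is expected to rebound.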


Double Updating Online Learning, Peilin Zhao, Steven C. H. Hoi, Rong Jin May 2011

Research Collection School Of Computing and Information Systems

In most kernel-based online learning algorithms, when an incoming instance is misclassified, it is added to the pool of support vectors and assigned a weight, which often remains unchanged during the rest of the learning process. This is clearly insufficient since when a new support vector is added, we generally expect the weights of the other existing support vectors to be updated in order to reflect the influence of the added support vector. In this paper, we propose a new online learning method, termed Double Updating Online Learning, or DUOL for short, that explicitly addresses this problem. …
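The conventional behaviour the abstract criticizes can be seen in a kernel perceptron, sketched below. This is the baseline scheme, not DUOL: each misclassified instance joins the support-vector pool with a weight that is never revisited. The RBF kernel, its `gamma` value, and the function names are assumptions made for illustration.

```python
import math


def rbf(x, y, gamma=1.0):
    """Gaussian kernel exp(-gamma * ||x - y||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def kernel_perceptron(stream, kernel=rbf):
    """Single-update kernel perceptron over (x, y) pairs with y in {-1, +1}.

    Returns the support-vector pool as a list of (instance, weight) pairs.
    """
    support = []
    for x, y in stream:
        score = sum(alpha * kernel(sx, x) for sx, alpha in support)
        if y * score <= 0:
            # Mistake: add the instance with weight y. Existing weights are
            # left untouched, which is the limitation described above.
            support.append((x, float(y)))
    return support
```

Every weight in the returned pool is exactly +1 or -1, fixed at the moment the instance was added, and the pool grows with each mistake.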