Physical Sciences and Mathematics | Open Access Articles

Active Code Learning: Benchmarking Sample-Efficient Training Of Code Models, Qiang Hu, Yuejun Guo, Xiaofei Xie, Maxime Cordy, Lei Ma, Mike Papadakis, Yves Le Traon Jan 2024

Research Collection School Of Computing and Information Systems

The costly human effort required to prepare the training data of machine learning (ML) models hinders their practical development and usage in software engineering (ML4Code), especially for those with limited budgets. Therefore, efficiently training models of code with less human effort has become an emerging problem. Active learning addresses this issue: it allows developers to train a model on less data while still reaching the desired performance, and it has been well studied in the computer vision and natural language processing domains. Unfortunately, no work has explored the effectiveness of active learning for code …

Real: A Representative Error-Driven Approach For Active Learning, Cheng Chen, Yong Wang, Lizi Liao, Yueguo Chen, Xiaoyong Du Sep 2023

Research Collection School Of Computing and Information Systems

Given a limited labeling budget, active learning (AL) aims to sample the most informative instances from an unlabeled pool to acquire labels for subsequent model training. To achieve this, AL typically measures the informativeness of unlabeled instances based on uncertainty and diversity. However, it does not consider erroneous instances and their neighborhood error density, which have great potential to improve the model performance. To address this limitation, we propose Real, a novel approach to select data instances with Representative Errors for Active Learning. It identifies minority predictions as pseudo errors within a cluster and allocates an adaptive sampling budget for …
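The uncertainty-based selection that the abstract describes as typical for AL can be illustrated with a minimal sketch. This is not the Real method itself; it is only a plain entropy-based baseline. The function names and the toy probabilities are hypothetical and chosen for illustration.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def select_most_uncertain(pool_probs, budget):
    """Return indices of the `budget` pool instances with the highest entropy.

    pool_probs: list of per-instance class-probability lists, i.e. the
    (hypothetical) model outputs for the unlabeled pool.
    """
    ranked = sorted(range(len(pool_probs)),
                    key=lambda i: entropy(pool_probs[i]),
                    reverse=True)
    return ranked[:budget]

# Toy predictions for four unlabeled instances (made-up numbers).
pool_probs = [
    [0.98, 0.02],  # confident prediction
    [0.55, 0.45],  # uncertain prediction
    [0.80, 0.20],
    [0.50, 0.50],  # maximally uncertain prediction
]
picked = select_most_uncertain(pool_probs, budget=2)
print(picked)  # indices to send to the annotator
```

The selected instances would then be labeled and added to the training set before the model is retrained. A diversity criterion, which the abstract also mentions, is deliberately left out of this sketch.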

Active Learning Of Discriminative Subgraph Patterns For API Misuse Detection, Hong Jin Kang, David Lo Feb 2022

Research Collection School Of Computing and Information Systems

A common cause of bugs and vulnerabilities is the violation of usage constraints associated with Application Programming Interfaces (APIs). API misuses are common in software projects, and while there have been techniques proposed to detect such misuses, studies have shown that they fail to reliably detect misuses while reporting many false positives. One limitation of prior work is the inability to reliably identify correct patterns of usage. Many approaches confuse a usage pattern’s frequency for correctness. Due to the variety of alternative usage patterns that may be uncommon but correct, anomaly detection-based techniques have limited success in identifying misuses. We …

Second-Order Online Active Learning And Its Applications, Shuji Hao, Jing Lu, Peilin Zhao, Chi Zhang, Steven C. H. Hoi, Chunyan Miao Nov 2017

Research Collection School Of Computing and Information Systems

The goal of online active learning is to learn predictive models from a sequence of unlabeled data given a limited label query budget. Unlike conventional online learning tasks, online active learning is considerably more challenging for two reasons. Firstly, it is difficult to design an effective query strategy to decide when it is appropriate to query the label of an incoming instance given a limited query budget. Secondly, it is also challenging to decide how to update the predictive models effectively whenever the true label of an instance is queried. Most existing approaches for online active learning are based on a family of first-order online …

Active Crowdsourcing For Annotation, Shuji Hao, Chunyan Miao, Steven C. H. Hoi, Peilin Zhao Dec 2015

Research Collection School Of Computing and Information Systems

Crowdsourcing has shown great potential in obtaining large-scale and cheap labels for different tasks. However, obtaining reliable labels is challenging for several reasons, such as noisy annotators and limited budgets. State-of-the-art approaches either suffer in some noisy scenarios or rely on unlimited resources to acquire reliable labels. In this article, we adopt the learning with expert (a.k.a. worker in crowdsourcing) advice framework to robustly infer accurate labels by considering the reliability of each worker. However, in order to accurately predict the reliability of each worker, traditional learning with expert advice will consult with external oracles (a.k.a. domain experts) …

Online Passive Aggressive Active Learning And Its Applications, Jing Lu, Peilin Zhao, Steven C. H. Hoi Nov 2014

Research Collection School Of Computing and Information Systems

We investigate online active learning techniques for classification tasks in data stream mining applications. Unlike traditional learning approaches (either batch or online learning) that often need to request the class label of each incoming instance, online active learning queries only a subset of informative incoming instances to update the classification model, which aims to maximize classification performance using minimal human labeling effort during the entire online stream data mining task. In this paper, we present a new family of algorithms for online active learning called Passive-Aggressive Active (PAA) learning algorithms by adapting the popular Passive-Aggressive algorithms in an online active …
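The idea of querying only a subset of incoming instances and updating with a Passive-Aggressive step can be sketched as follows. This is a minimal illustration under stated assumptions, not necessarily the algorithm from the paper: it assumes a randomized query rule in which the query probability shrinks as the prediction margin grows, and it uses the basic Passive-Aggressive update for binary labels in {-1, +1}. The names `online_active_pa`, `oracle`, and `delta` are hypothetical.

```python
import random

def dot(w, x):
    """Inner product of two equal-length vectors."""
    return sum(wi * xi for wi, xi in zip(w, x))

def online_active_pa(stream, oracle, delta=1.0, seed=0):
    """Process a stream once, querying labels only for low-margin instances.

    stream: iterable of feature vectors (lists of floats).
    oracle: callable returning the true label (-1 or +1) for a vector;
            stands in for the human annotator (assumption).
    delta:  smoothing constant of the assumed query rule; larger values
            lead to more queries.
    """
    rng = random.Random(seed)
    w = None
    queries = 0
    for x in stream:
        if w is None:
            w = [0.0] * len(x)
        margin = dot(w, x)
        # Assumed query rule: ask with probability delta / (delta + |margin|),
        # so confident predictions are queried less often.
        if rng.random() < delta / (delta + abs(margin)):
            y = oracle(x)
            queries += 1
            loss = max(0.0, 1.0 - y * margin)  # hinge loss
            norm_sq = dot(x, x)
            if loss > 0.0 and norm_sq > 0.0:
                tau = loss / norm_sq  # basic Passive-Aggressive step size
                w = [wi + tau * y * xi for wi, xi in zip(w, x)]
    return w, queries

# Toy stream: the label is the sign of the first feature (made-up data).
stream = [[1.0, 0.0], [-1.0, 0.0], [2.0, 0.0], [-2.0, 0.0]]
w, queries = online_active_pa(stream, oracle=lambda x: 1 if x[0] > 0 else -1)
print(w, queries)
```

On this toy stream the first instance is always queried, because a zero margin gives a query probability of one. Later instances are queried only some of the time, and they incur no loss, so the model is left unchanged when they are queried.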

Active Code Search: Incorporating User Feedback To Improve Code Search Relevance, Shaowei Wang, David Lo, Lingxiao Jiang Sep 2014

Research Collection School Of Computing and Information Systems

Code search techniques return relevant code fragments given a user query. They typically work in a passive mode: given a user query, a static list of code fragments sorted by the relevance scores decided by a code search technique is returned to the user. A user will go through the sorted list of returned code fragments from top to bottom. As the user checks each code fragment one by one, he or she will naturally form an opinion about the true relevance of the code fragment. In an active model, those opinions will be taken as feedback to the search …

Evolving An Information Systems Capstone Course To Align With The Fast Changing Singapore Marketplace, Chris Boesch, Benjamin Kok Siew Gan Jun 2014

Research Collection School Of Computing and Information Systems

Every year, around fifty-five undergraduate teams of four to six students are required to complete a capstone course for the School of Information Systems at Singapore Management University. Each team spends approximately five months working with an industry sponsor using the latest tools and techniques. Students actively learn by implementing a system to solve a real-world problem. In addition to delivering value to the local sponsor, our students learn specialized skills currently needed in the marketplace, which might not yet be incorporated into electives and core courses. In this paper, we discuss the tradeoffs of providing students and project …

Active Learning With Expert Advice, Peilin Zhao, Steven C. H. Hoi, Jinfeng Zhuang Jul 2013

Research Collection School Of Computing and Information Systems

Conventional learning with expert advice methods assume a learner always receives the outcome (e.g., class labels) of every incoming training instance at the end of each trial. In real applications, acquiring the outcome from an oracle can be costly or time consuming. In this paper, we address a new problem of active learning with expert advice, where the outcome of an instance is disclosed only when it is requested by the online learner. Our goal is to learn an accurate prediction model while asking the oracle as few questions as possible. To address this challenge, we propose …

Learning The Unified Kernel Machines For Classification, Steven C. H. Hoi, Michael R. Lyu, Edward Y. Chang Aug 2006

Research Collection School Of Computing and Information Systems

Kernel machines have been shown to be state-of-the-art learning techniques for classification. In this paper, we propose a novel general framework of learning the Unified Kernel Machines (UKM) from both labeled and unlabeled data. Our proposed framework integrates supervised learning, semi-supervised kernel learning, and active learning in a unified solution. In the suggested framework, we particularly focus our attention on designing a new semi-supervised kernel learning method, i.e., Spectral Kernel Learning (SKL), which is built on the principles of kernel target alignment and unsupervised kernel design. Our algorithm is related to an equivalent quadratic programming problem that can be efficiently …
