Physical Sciences and Mathematics | Open Access Articles

Stealthy Backdoor Attack For Code Models, Zhou Yang, Bowen Xu, Jie M. Zhang, Hong Jin Kang, Jieke Shi, Junda He, David Lo Jan 2024

Research Collection School Of Computing and Information Systems

Code models, such as CodeBERT and CodeT5, offer general-purpose representations of code and play a vital role in supporting downstream automated software engineering tasks. Most recently, code models were revealed to be vulnerable to backdoor attacks. A code model that is backdoor-attacked can behave normally on clean examples but will produce pre-defined malicious outputs on examples injected with triggers that activate the backdoors. Existing backdoor attacks on code models use unstealthy and easy-to-detect triggers. This paper aims to investigate the vulnerability of code models with backdoor attacks. To this end, we propose AFRAIDOOR (Adversarial Feature as Adaptive Backdoor). AFRAIDOOR achieves stealthiness …

Robust Test Selection For Deep Neural Networks, Weifeng Sun, Meng Yan, Zhongxin Liu, David Lo Dec 2023

Research Collection School Of Computing and Information Systems

Deep Neural Networks (DNNs) have been widely used in various domains, such as computer vision and software engineering. Although many DNNs have been deployed to assist various tasks in the real world, they, like traditional software, suffer from defects that may lead to severe outcomes. DNN testing is one of the most widely used methods to ensure the quality of DNNs. Such a method needs rich test inputs with oracle information (expected output) to reveal the incorrect behaviors of a DNN model. However, manually labeling all the collected test inputs is a labor-intensive task, which delays the quality assurance …

Visually Analyzing Company-Wide Software Service Dependencies: An Industrial Case Study, Sebastian Baltes, Brian Pfitzmann, Thomas Kowark, Christoph Treude, Fabian Beck Oct 2023

Research Collection School Of Computing and Information Systems

Managing dependencies between software services is a crucial task for any company operating cloud applications. Visualizations can help to understand and maintain these complex dependencies. In this paper, we present a force-directed service dependency visualization and filtering tool that has been developed and used within SAP. The tool's use cases include guiding service retirement as well as understanding service deployment landscapes and their relationship to the company's organizational structure. We report how we built and adapted the tool under strict time constraints to address the requirements of our users. We further share insights on how we enabled internal adoption. For …

Edge Distraction-Aware Salient Object Detection, Sucheng Ren, Wenxi Liu, Jianbo Jiao, Guoqiang Han, Shengfeng He Sep 2023

Research Collection School Of Computing and Information Systems

Integrating low-level edge features has been proven effective in preserving clear boundaries of salient objects. However, the locality of edge features makes it difficult to capture globally salient edges, leading to distraction in the final predictions. To address this problem, we propose to produce distraction-free edge features by incorporating cross-scale holistic interdependencies between high-level features. In particular, we first formulate our edge feature extraction process as a boundary-filling problem. In this way, we force edge features to focus on closed boundaries instead of disconnected background edges. Second, we propose to explore cross-scale holistic contextual connections between every …

GraphSearchNet: Enhancing GNNs Via Capturing Global Dependencies For Semantic Code Search, Shangqing Liu, Xiaofei Xie, Jingkai Siow, Lei Ma, Guozhu Meng, Yang Liu Jan 2023

Research Collection School Of Computing and Information Systems

Code search aims to retrieve accurate code snippets based on a natural language query to improve software productivity and quality. With the massive amount of available programs (e.g., on GitHub or Stack Overflow), identifying and localizing the precise code is critical for software developers. In addition, deep learning has recently been widely applied to different code-related scenarios, e.g., vulnerability detection and source code summarization. However, automated deep code search is still challenging since it requires a high-level semantic mapping between code and natural language queries. Most existing deep learning-based approaches for code search rely on the sequential text, i.e., …

Just-In-Time Defect Identification And Localization: A Two-Phase Framework, Meng Yan, Xin Xia, Yuanrui Fan, Ahmed E. Hassan, David Lo, Shanping Li Jan 2022

Research Collection School Of Computing and Information Systems

Defect localization aims to locate buggy program elements (e.g., buggy files, methods, or lines of code) based on defect symptoms, e.g., bug reports or program spectrum. However, by the time we receive the defect symptoms, the defect has been exposed and negative impacts have been introduced. Thus, one challenging task is whether we can locate buggy program elements early, before the defect symptoms appear (e.g., when buggy program elements are being checked in). We refer to this type of defect localization as “Just-In-Time (JIT) Defect localization”. Although many prior studies have proposed various JIT defect identification methods to identify …

A Large Scale Study Of Long-Time Contributor Prediction For GitHub Projects, Lingfeng Bao, Xin Xia, David Lo, Gail C. Murphy Jun 2021

Research Collection School Of Computing and Information Systems

The continuous contributions made by long-time contributors (LTCs) are a key factor enabling open source software (OSS) projects to succeed and survive. We study GitHub as it has a large number of OSS projects and millions of contributors, which enables the study of the transition from newcomers to LTCs. In this paper, we investigate whether we can effectively predict which newcomers in OSS projects will become LTCs based on their activity data collected from GitHub. We collect GitHub data from GHTorrent, a mirror of GitHub data. We select the most popular 917 projects, which contain 75,046 contributors. …

An Empirical Study Of Release Note Production And Usage In Practice, Tingting Bi, Xin Xia, David Lo, John Grundy, Thomas Zimmermann Nov 2020

Research Collection School Of Computing and Information Systems

The release note is one of the most important software artifacts and serves as a bridge for communication among stakeholders. Release notes contain a set of crucial information, such as descriptions of enhancements, improvements, potential issues, development, evolution, testing, and maintenance of projects throughout the whole development lifecycle. A comprehensive understanding of what makes a good release note and how to write one for different stakeholders would be highly beneficial. However, in practice, the release note is often neglected by stakeholders and has not to date been systematically investigated by researchers. In this paper, we conduct a mixed-methods study …

Chaff From The Wheat: Characterizing And Determining Valid Bug Reports, Yuanrui Fan, Xin Xia, David Lo, Ahmed E. Hassan May 2020

Research Collection School Of Computing and Information Systems

Developers use bug reports to triage and fix bugs. When triaging a bug report, developers must decide whether the bug report is valid (i.e., a real bug). A large number of bug reports are submitted every day, and many of them end up being invalid. Manually determining whether a bug report is valid is a difficult and tedious task. Thus, an approach that can automatically analyze the validity of a bug report and determine whether a report is valid can help developers prioritize their triaging tasks and avoid wasting time and effort on invalid bug reports. In this study, motivated by the …

VT-Revolution: Interactive Programming Video Tutorial Authoring And Watching System, Lingfeng Bao, Zhenchang Xing, Xin Xia, David Lo Feb 2018

Research Collection School Of Computing and Information Systems

Procedural knowledge describes actions and manipulations that are carried out to complete programming tasks. An effective way to document procedural knowledge is programming video tutorials. Existing solutions for adding interactive workflow and elements to programming videos face a dilemma between the level of desired interaction and the effort required for authoring tutorials. In this work, we tackle this dilemma by designing and building a programming video tutorial authoring system that leverages operating system level instrumentation to log workflow history while tutorial authors are creating programming videos, and the corresponding tutorial watching system that enhances the learning experience of video tutorials …
