Open Access. Powered by Scholars. Published by Universities.®

Digital Commons Network

Full-Text Articles in Entire DC Network

Granular3D: Delving Into Multi-Granularity 3D Scene Graph Prediction, Kaixiang Huang, Jingru Yang, Jin Wang, Shengfeng He, Zhan Wang, Haiyan He, Qifeng Zhang, Guodong Lu Sep 2024

Research Collection School Of Computing and Information Systems

This paper addresses the significant challenges in 3D Semantic Scene Graph (3DSSG) prediction, essential for understanding complex 3D environments. Traditional approaches, primarily using PointNet and Graph Convolutional Networks, struggle with effectively extracting multi-grained features from intricate 3D scenes, largely due to a focus on global scene processing and single-scale feature extraction. To overcome these limitations, we introduce Granular3D, a novel approach that shifts the focus towards multi-granularity analysis by predicting relation triplets from specific sub-scenes. One key is the Adaptive Instance Enveloping Method (AIEM), which establishes an approximate envelope structure around irregular instances, providing shape-adaptive local point cloud sampling, thereby …


Hierarchical Damage Correlations For Old Photo Restoration, Weiwei Cai, Xuemiao Xu, Jiajia Xu, Huaidong Zhang, Haoxin Yang, Kun Zhang, Shengfeng He Jul 2024

Research Collection School Of Computing and Information Systems

Restoring old photographs can preserve cherished memories. Previous methods handled diverse damages within the same network structure, which proved impractical. In addition, these methods cannot exploit correlations among artifacts, especially in scratches versus patch-misses issues. Hence, a tailored network is particularly crucial. In light of this, we propose a unified framework consisting of two key components: ScratchNet and PatchNet. In detail, ScratchNet employs the parallel Multi-scale Partial Convolution Module to effectively repair scratches, learning from multi-scale local receptive fields. In contrast, the patch-misses necessitate the network to emphasize global information. To this end, we incorporate a transformer-based encoder and decoder …


Ethical Considerations Toward Protestware, Marc Cheong, Raula Kula, Christoph Treude Jun 2024

Research Collection School Of Computing and Information Systems

This article looks into possible scenarios where developers might consider turning their free and open source software into protestware. Using different frameworks commonly used in artificial intelligence (AI) ethics, we extend the applications of AI ethics to the study of protestware.


Smart HPA: A Resource-Efficient Horizontal Pod Auto-Scaler For Microservice Architectures, Hussain Ahmad, Christoph Treude, Markus Wagner, Claudia Szabo Jun 2024

Research Collection School Of Computing and Information Systems

Microservice architectures have gained prominence in both academia and industry, offering enhanced agility, reusability, and scalability. To simplify scaling operations in microservice architectures, container orchestration platforms such as Kubernetes feature Horizontal Pod Auto-scalers (HPAs) designed to adjust the resources of microservices to accommodate fluctuating workloads. However, existing HPAs are not suitable for resource-constrained environments, as they make scaling decisions based on the individual resource capacities of microservices, leading to service unavailability and performance degradation. Furthermore, HPA architectures exhibit several issues, including inefficient data processing and a lack of coordinated scaling operations. To address these concerns, we propose Smart HPA, a …
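For context on the per-service scaling decision that the abstract criticizes, the following is a minimal Python sketch of the replica rule described in the Kubernetes documentation for the stock Horizontal Pod Autoscaler: desired replicas = ceil(current replicas × current metric / target metric), with no change when the ratio is within a tolerance of 1. It is not Smart HPA. The function name, the `min_replicas`/`max_replicas` defaults, and the simplified clamping are assumptions made for illustration.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    min_replicas: int = 1,    # hypothetical default for this sketch
    max_replicas: int = 10,   # hypothetical default for this sketch
    tolerance: float = 0.1,   # Kubernetes documents 0.1 as the default tolerance
) -> int:
    """Return the replica count for ONE service, considered in isolation."""
    ratio = current_metric / target_metric
    # Skip scaling when the metric is already close enough to the target.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # Clamp to the per-service bounds; no other service is consulted.
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # A service running 4 replicas at 90% utilisation against a 60% target.
    print(desired_replicas(4, 90.0, 60.0))
```

Because each service is clamped only against its own `max_replicas`, a service that hits its ceiling cannot borrow capacity that another service leaves unused, which is the kind of uncoordinated decision the abstract refers to.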


An Evaluation Of Heart Rate Monitoring With In-Ear Microphones Under Motion, Kayla-Jade Butkow, Ting Dang, Andrea Ferlini, Dong Ma, Yang Liu, Cecilia Mascolo May 2024

Research Collection School Of Computing and Information Systems

With the soaring adoption of in-ear wearables, the research community has started investigating suitable in-ear heart rate detection systems. Heart rate is a key physiological marker of cardiovascular health and physical fitness. Continuous and reliable heart rate monitoring with wearable devices has therefore gained increasing attention in recent years. Existing heart rate detection systems in wearables mainly rely on photoplethysmography (PPG) sensors; however, these are notorious for poor performance in the presence of human motion. In this work, leveraging the occlusion effect that enhances low-frequency bone-conducted sounds in the ear canal, we investigate for the first time in-ear audio-based motion-resilient …


Large Language Models For Qualitative Research In Software Engineering: Exploring Opportunities And Challenges, Muneera Bano, Rashina Hoda, Didar Zowghi, Christoph Treude May 2024

Research Collection School Of Computing and Information Systems

The recent surge in the integration of Large Language Models (LLMs) like ChatGPT into qualitative research in software engineering, much like in other professional domains, demands a closer inspection. This vision paper seeks to explore the opportunities of using LLMs in qualitative research to address many of its legacy challenges as well as potential new concerns and pitfalls arising from the use of LLMs. We share our vision for the evolving role of the qualitative researcher in the age of LLMs and contemplate how they may utilize LLMs at various stages of their research experience.


BreathPro: Monitoring Breathing Mode During Running With Earables, Changshuo Hu, Thivya Kandappu, Yang Liu, Cecilia Mascolo, Dong Ma May 2024

Research Collection School Of Computing and Information Systems

Running is a popular and accessible form of aerobic exercise, significantly benefiting our health and wellness. By monitoring a range of running parameters with wearable devices, runners can gain a deep understanding of their running behavior, facilitating performance improvement in future runs. Among these parameters, breathing, which fuels our bodies with oxygen and expels carbon dioxide, is crucial to improving the efficiency of running. While previous studies have made substantial progress in measuring breathing rate, exploration of additional breathing monitoring during running is still lacking. In this work, we fill this gap by presenting BreathPro, the first breathing mode monitoring …


SwapVid: Integrating Video Viewing And Document Exploration With Direct Manipulation, Taichi Murakami, Kazuyuki Fujita, Kotaro Hara, Kazuki Takashima, Yoshifumi Kitamura May 2024

Research Collection School Of Computing and Information Systems

Videos accompanied by documents (document-based videos) enable presenters to share contents beyond videos and audiences to use them for detailed content comprehension. However, concurrently exploring multiple channels of information could be taxing. We propose SwapVid, a novel interface for viewing and exploring document-based videos. SwapVid seamlessly integrates a video and a document into a single view and lets the content behave as both a video and a document; it adaptively switches a document-based video to act as a video or a document upon direct manipulation (e.g., scrolling the document, manipulating the video timeline). We conducted a user study with twenty participants, comparing SwapVid …


Vaid: Indexing View Designs In Visual Analytics System, Lu Ying, Aoyu Wu, Haotian Li, Zikun Deng, Ji Lan, Jiang Wu, Yong Wang, Huamin Qu, Dazhen Deng, Yingcai Wu May 2024

Research Collection School Of Computing and Information Systems

Visual analytics (VA) systems have been widely used in various application domains. However, VA systems are complex in design, which imposes a serious problem: although the academic community constantly designs and implements new designs, the designs are difficult to query, understand, and refer to by subsequent designers. To mark a major step forward in tackling this problem, we index VA designs in an expressive and accessible way, transforming the designs into a structured format. We first conducted a workshop study with VA designers to learn user requirements for understanding and retrieving professional designs in VA systems. Thereafter, we came up …


Teaching Software Development For Real-World Problems Using A Microservice-Based Collaborative Problem-Solving Approach, Yi Meng Lau, Christian Michael Koh, Lingxiao Jiang Apr 2024

Research Collection School Of Computing and Information Systems

Experienced and skillful software developers are needed in organizations to develop software products effective for their business with shortened time-to-market. Such developers will not only need to code but also be able to work in teams and collaboratively solve real-world problems that organizations are facing. It is challenging for educators to nurture students to become such developers with strong technical, social, and cognitive skills. Towards addressing the challenge, this study presents a Collaborative Software Development Project Framework for a course that focuses on learning microservices architectures and developing a software application for a real-world business. Students get to work in teams to …


W4-Groups: Modeling The Who, What, When And Where Of Group Behavior Via Mobility Sensing, Akansha Atrey, Camellia Zakaria, Rajesh Krishna Balan, Prashant Shenoy Apr 2024

Research Collection School Of Computing and Information Systems

Human social interactions occur in group settings of varying sizes and locations, depending on the type of social activity. The ability to distinguish group formations based on their purposes transforms how group detection mechanisms function. Not only should such tools support the effective detection of serendipitous encounters, but they can derive categories of relation types among users. Determining who is involved, what activity is performed, and when and where the activity occurs are critical to understanding group processes in greater depth, including supporting goal-oriented applications (e.g., performance, productivity, and mental health) that require sensing social factors. In this work, we …


"My GitHub Sponsors Profile Is Live!": Investigating The Impact Of Twitter/X Mentions On GitHub Sponsors, Youmei Fan, Tao Xiao, Hideaki Hata, Christoph Treude, Kenichi Matsumoto Apr 2024

Research Collection School Of Computing and Information Systems

GitHub Sponsors was launched in 2019, enabling donations to open-source software developers to provide financial support, as per GitHub’s slogan: “Invest in the projects you depend on”. However, a 2022 study on GitHub Sponsors found that only two-fifths of developers who were seeking sponsorship received a donation. The study found that, other than internal actions (such as offering perks to sponsors), developers had advertised their GitHub Sponsors profiles on social media, such as Twitter (also known as X). Therefore, in this work, we investigate the impact of tweets that contain links to GitHub Sponsors profiles on sponsorship, as well as …


Going Viral: Case Studies On The Impact Of Protestware, Youmei Fan, Dong Wang, Supastsara Wattanakriengkrai, Hathaichanok Damrongsiri, Christoph Treude, Hideaki Hata, Raula Gaikovina Kula Apr 2024

Research Collection School Of Computing and Information Systems

Maintainers are now self-sabotaging their work in order to take political or economic stances, a practice referred to as "protestware". In this poster, we present our approach to understand how the discourse about such an attack went viral, how it is received by the community, and whether developers respond to the attack in a timely manner. We study two notable protestware cases, i.e., Colors.js and es5-ext, comparing with discussions of a typical security vulnerability as a baseline, i.e., Ua-parser, and perform a thematic analysis of more than two thousand protest-related posts to extract the different narratives when discussing protestware.


The Impact Of Bug Localization Based On Crash Report Mining: A Developers' Perspective, Marcos Medeiros, Uirá Kulesza, Roberta Coelho, Rodrigo Bonifacio, Christoph Treude, Eiji Adachi Barbosa Apr 2024

Research Collection School Of Computing and Information Systems

Developers often use crash reports to understand the root cause of bugs. However, locating the buggy source code snippet from such information is a challenging task, mainly when the log database contains many crash reports. To mitigate this issue, recent research has proposed and evaluated approaches for grouping crash report data and using stack trace information to locate bugs. The effectiveness of such approaches has been evaluated by mainly comparing the candidate buggy code snippets with the actual changed code in bug-fix commits—which happens in the context of retrospective repository mining studies. Therefore, the existing literature still lacks discussing the …


Beyond A Joke: Dead Code Elimination Can Delete Live Code, Haoxin Tu, Lingxiao Jiang, Debin Gao, He Jiang Apr 2024

Research Collection School Of Computing and Information Systems

Dead Code Elimination (DCE) is a fundamental compiler optimization technique that removes dead code (e.g., unreachable or reachable but whose results are unused) in the program to produce smaller or faster executables. However, since compiler optimizations are typically aggressively performed and there are complex relationships/interplay among a vast number of compiler optimizations (including DCE), it is not known whether DCE is indeed correctly performed and will only delete dead code in practice. In this study, we open a new research problem to investigate: can DCE happen to erroneously delete live code? To tackle this problem, we design a new approach …


Extracting Relevant Test Inputs From Bug Reports For Automatic Test Case Generation, Wendkuuni C. Ouédraogo, Laura Plein, Kader Kaboré, Andrew Habib, Jacques Klein, David Lo, Tegawende F. Bissyandé Apr 2024

Research Collection School Of Computing and Information Systems

The pursuit of automating software test case generation, particularly for unit tests, has become increasingly important due to the labor-intensive nature of manual test generation [6]. However, a significant challenge in this domain is the inability of automated approaches to generate relevant inputs, which compromises the efficacy of the tests [6].


Bidirectional Paper-Repository Tracing In Software Engineering, Daniel Garijo, Miguel Arroyo, Esteban González Guardia, Christoph Treude, Nicola Tarocco Apr 2024

Research Collection School Of Computing and Information Systems

While computer science papers frequently include their associated code repositories, establishing a clear link between papers and their corresponding implementations may be challenging due to the number of code repositories used in research publications. In this paper we describe a lightweight method for effectively identifying bidirectional links between papers and repositories from both LaTeX and PDF sources. We have used our approach to analyze more than 14000 PDF and LaTeX files in the Software Engineering category of arXiv, generating a dataset of more than 1400 paper-code implementations and assessing current citation practices on it.
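As a rough illustration of one direction of such a link (paper to repository), the Python sketch below pulls candidate repository URLs out of LaTeX or extracted PDF text with a regular expression. It is not the authors' method: the pattern, the two hosts it recognizes, and the function name are assumptions made for this example, and the reverse direction (a repository pointing back to its paper) is not covered.

```python
import re

# Hypothetical pattern: matches https://<host>/<owner>/<name> for two common hosts.
REPO_URL = re.compile(
    r"https?://(?:www\.)?(github\.com|gitlab\.com)/([\w.-]+)/([\w.-]+)",
    re.IGNORECASE,
)


def extract_repositories(text: str) -> list[str]:
    """Return de-duplicated repository URLs in order of first appearance."""
    found: list[str] = []
    for host, owner, name in REPO_URL.findall(text):
        # Drop sentence-final periods and a trailing ".git" suffix.
        name = name.rstrip(".")
        if name.endswith(".git"):
            name = name[:-4]
        url = f"https://{host.lower()}/{owner}/{name}"
        if url not in found:
            found.append(url)
    return found


if __name__ == "__main__":
    # Placeholder repository name, not a real project.
    sample = r"Our code is at \url{https://github.com/example-org/example-tool}."
    print(extract_repositories(sample))
```

A paper that merely cites a third-party tool would also match, so a real pipeline would need some way to tell a paper's own implementation apart from repositories it only mentions.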


Redriver: Runtime Enforcement For Autonomous Vehicles, Yang Sun, Christopher M. Poskitt, Xiaodong Zhang, Jun Sun Apr 2024

Research Collection School Of Computing and Information Systems

Autonomous driving systems (ADSs) integrate sensing, perception, drive control, and several other critical tasks in autonomous vehicles, motivating research into techniques for assessing their safety. While there are several approaches for testing and analysing them in high-fidelity simulators, ADSs may still encounter additional critical scenarios beyond those covered once they are deployed on real roads. An additional level of confidence can be established by monitoring and enforcing critical properties when the ADS is running. Existing work, however, is only able to monitor simple safety properties (e.g., avoidance of collisions) and is limited to blunt enforcement mechanisms such as hitting the …


Exploring The Potential Of ChatGPT In Automated Code Refinement: An Empirical Study, Qi Guo, Shangqing Liu, Junming Cao, Xiaohong Li, Xin Peng, Xiaofei Xie, Bihuan Chen Apr 2024

Research Collection School Of Computing and Information Systems

Code review is an essential activity for ensuring the quality and maintainability of software projects. However, it is a time-consuming and often error-prone task that can significantly impact the development process. Recently, ChatGPT, a cutting-edge language model, has demonstrated impressive performance in various natural language processing tasks, suggesting its potential to automate code review processes. However, it is still unclear how well ChatGPT performs in code review tasks. To fill this gap, in this paper, we conduct the first empirical study to understand the capabilities of ChatGPT in code review tasks, specifically focusing on automated code refinement based on given …


Improving Automated Code Reviews: Learning From Experience, Hong Yi Lin, Patanamon Thongtanunam, Christoph Treude, Wachiraphan Charoenwet Apr 2024

Research Collection School Of Computing and Information Systems

Modern code review is a critical quality assurance process that is widely adopted in both industry and open source software environments. This process can help newcomers learn from the feedback of experienced reviewers; however, it often brings a large workload and stress to reviewers. To alleviate this burden, the field of automated code reviews aims to automate the process, teaching large language models to provide reviews on submitted code, just as a human would. A recent approach pre-trained and fine-tuned the code intelligent language model on a large-scale code review corpus. However, such techniques did not fully utilise quality reviews …


Dronlomaly: Runtime Log-Based Anomaly Detector For DJI Drones, Wei Minn, Naing Tun Yan, Lwin Khin Shar, Lingxiao Jiang Apr 2024

Research Collection School Of Computing and Information Systems

We present an automated tool for real-time detection of anomalous behaviors while a DJI drone is executing a flight mission. The tool takes sensor data logged by the drone at fixed time intervals and performs anomaly detection using a Bi-LSTM model. The model is trained on baseline flight logs from a successful mission conducted physically or via a simulator. The tool has two modules: the first module is responsible for sending the log data to the remote controller station, and the second module is run as a service in the remote controller station powered by a Bi-LSTM model, which receives the …


Marco: A Stochastic Asynchronous Concolic Explorer, Jie Hu, Yue Duan, Heng Yin Apr 2024

Research Collection School Of Computing and Information Systems

Concolic execution is a powerful program analysis technique for code path exploration. Despite recent advances that greatly improved the efficiency of concolic execution engines, path constraint solving remains a major bottleneck of concolic testing. An intelligent scheduler for inputs/branches becomes even more crucial. Our studies show that the previously under-studied branch-flipping policy adopted by state-of-the-art concolic execution engines has several limitations. We propose to assess each branch by its potential for new code coverage from a global view, concerning the path divergence probability at each branch. To validate this idea, we implemented a prototype Marco and evaluated it against the …


ACAV: A Framework For Automatic Causality Analysis In Autonomous Vehicle Accident Recordings, Huijia Sun, Christopher M. Poskitt, Yang Sun, Jun Sun, Yuqi Chen Apr 2024

Research Collection School Of Computing and Information Systems

The rapid progress of autonomous vehicles (AVs) has brought the prospect of a driverless future closer than ever. Recent fatalities, however, have emphasized the importance of safety validation through large-scale testing. Multiple approaches achieve this fully automatically using high-fidelity simulators, i.e., by generating diverse driving scenarios and evaluating autonomous driving systems (ADSs) against different test oracles. While effective at finding violations, these approaches do not identify the decisions and actions that caused them, information that is critical for improving the safety of ADSs. To address this challenge, we propose ACAV, an automated framework designed to conduct causality analysis for …


Experience Report: Identifying Common Misconceptions And Errors Of Novice Programmers With ChatGPT, Hua Leong Fwa Apr 2024

Research Collection School Of Computing and Information Systems

Identifying the misconceptions of novice programmers is pertinent for informing instructors of the challenges faced by their students in learning computer programming. In the current literature, custom tools and test scripts were developed and, in most cases, manual effort was required to go through the individual programs to identify and categorize the errors latent within the students' code submissions. This entails investment of substantial effort and time from the instructors. In this study, we thus propose the use of ChatGPT in identifying and categorizing the errors. Using prompts that were seeded only with the student's code and the model code solution …


Encoding Version History Context For Better Code Representation, Huy Nguyen, Christoph Treude, Patanamon Thongtanunam Apr 2024

Research Collection School Of Computing and Information Systems

With the exponential growth of AI tools that generate source code, understanding software has become crucial. When developers comprehend a program, they may refer to additional contexts to look for information, e.g. program documentation or historical code versions. Therefore, we argue that encoding this additional contextual information could also benefit code representation for deep learning. Recent papers incorporate contextual data (e.g. call hierarchy) into vector representation to address program comprehension problems. This motivates further studies to explore additional contexts, such as version history, to enhance models' understanding of programs. That is, insights from version history enable recognition of patterns in …


Unleashing The Power Of Clippy In Real-World Rust Projects, Chunmiao Li, Yijun Yu, Haitao Wu, Luca Carlig, Shijie Nie, Lingxiao Jiang Apr 2024

Research Collection School Of Computing and Information Systems

The error messages generated by the Rust compiler (rustc) are useful for developers to identify and diagnose suspicious code segments. Complementing the compiler, linters can also play an important role in promoting the adherence to certain coding style conventions and best practices. Prominent linters utilized in the Rust ecosystem include Clippy [1] and Rustfmt [2]. Among them, the Rust community particularly emphasizes the importance of heeding the warnings provided by Clippy to mitigate common errors and promote the adoption of idiomatic conventions. Clippy provides a set of more than 600 lints in addition to the built-in rustc lints. These …


Githubinclusifier: Finding And Fixing Non-Inclusive Language In GitHub Repositories, Liam Todd, John Grundy, Christoph Treude Apr 2024

Research Collection School Of Computing and Information Systems

Non-inclusive language in software artefacts has been recognised as a serious problem. We describe a tool to find and fix non-inclusive language in a variety of GitHub repository artefacts. These include various README files, PDFs, code comments, and code. A wide variety of non-inclusive language, including racist, ageist, ableist, violent and other terms, is located and issues are created, tagging the artefacts for checking. Suggested fixes can be generated using third-party LLM APIs, and approved changes made to documents, including code refactorings, and committed to the repository. The tool and evaluation data are available from: https://github.com/LiamTodd/github-inclusifier


Classifying Source Code: How Far Can Compressor-Based Classifiers Go?, Zhou Yang Apr 2024

Research Collection School Of Computing and Information Systems

Pre-trained language models of code, which are built upon large-scale datasets, millions of trainable parameters, and high computational resources cost, have achieved phenomenal success. Recently, researchers have proposed a compressor-based classifier (Cbc); it trains no parameters but is found to outperform BERT. We conduct the first empirical study to explore whether this lightweight alternative can accurately classify source code. Our study is more than applying Cbc to code-related tasks. We first identify an issue that the original implementation overestimates Cbc. After correction, Cbc's performance on defect prediction drops from 80.7% to 63.0%, which is still comparable to CodeBERT (63.7%). We …
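To make the idea of a parameter-free, compressor-based classifier concrete, here is a minimal Python sketch that combines normalized compression distance (NCD) over gzip with a k-nearest-neighbour vote. This is one common way such classifiers are built and is an assumption here, not necessarily the implementation evaluated in the study; the function names and the toy training data are hypothetical.

```python
import gzip
from collections import Counter


def compressed_len(text: str) -> int:
    """Length in bytes of the gzip-compressed UTF-8 encoding of text."""
    return len(gzip.compress(text.encode("utf-8")))


def ncd(a: str, b: str) -> float:
    """Normalized compression distance: smaller means more shared structure."""
    ca, cb = compressed_len(a), compressed_len(b)
    cab = compressed_len(a + " " + b)
    return (cab - min(ca, cb)) / max(ca, cb)


def classify(sample: str, train: list[tuple[str, str]], k: int = 1) -> str:
    """Label a sample by majority vote among its k nearest training items."""
    ranked = sorted(train, key=lambda pair: ncd(sample, pair[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Toy, hand-written training set for illustration only.
    train = [
        ("def add(a, b):\n    return a + b\n\nprint(add(1, 2))\n", "python"),
        ("SELECT name, email FROM users WHERE id = 42 ORDER BY name;", "sql"),
    ]
    print(classify("def sub(a, b):\n    return a - b\n", train))
```

Nothing is trained: all of the work happens at prediction time, when the sample is compressed together with each training item, so the cost of one prediction grows with the size of the training set.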


Creative And Correct: Requesting Diverse Code Solutions From AI, Scott Blyth, Christoph Treude, Markus Wagner Apr 2024

Research Collection School Of Computing and Information Systems

AI foundation models have the capability to produce a wide array of responses to a single prompt, a feature that is highly beneficial in software engineering to generate diverse code solutions. However, this advantage introduces a significant trade-off between diversity and correctness. In software engineering tasks, diversity is key to exploring design spaces and fostering creativity, but the practical value of these solutions is heavily dependent on their correctness. Our study systematically investigates this trade-off using experiments with HumanEval tasks, exploring various parameter settings and prompting strategies. We assess the diversity of code solutions using similarity metrics from the code …


Enhancing Source Code Representations For Deep Learning With Static Analysis, Xueting Guan, Christoph Treude Apr 2024

Research Collection School Of Computing and Information Systems

Deep learning techniques applied to program analysis tasks such as code classification, summarization, and bug detection have seen widespread interest. Traditional approaches, however, treat programming source code as natural language text, which may neglect significant structural or semantic details. Additionally, most current methods of representing source code focus solely on the code, without considering beneficial additional context. This paper explores the integration of static analysis and additional context such as bug reports and design patterns into source code representations for deep learning models. We use the Abstract Syntax Tree-based Neural Network (ASTNN) method and augment it with additional context information …