Open Access. Powered by Scholars. Published by Universities.®

Computer Sciences Commons

Articles 1 - 30 of 1665

Full-Text Articles in Computer Sciences

Hierarchical Damage Correlations For Old Photo Restoration, Weiwei Cai, Xuemiao Xu, Jiajia Xu, Huaidong Zhang, Haoxin Yang, Kun Zhang, Shengfeng He Jul 2024

Research Collection School Of Computing and Information Systems

Restoring old photographs can preserve cherished memories. Previous methods handled diverse damages within the same network structure, which proved impractical. In addition, these methods cannot exploit correlations among artifacts, especially in scratches versus patch-misses issues. Hence, a tailored network is particularly crucial. In light of this, we propose a unified framework consisting of two key components: ScratchNet and PatchNet. In detail, ScratchNet employs the parallel Multi-scale Partial Convolution Module to effectively repair scratches, learning from multi-scale local receptive fields. In contrast, the patch-misses necessitate the network to emphasize global information. To this end, we incorporate a transformer-based encoder and decoder …


Ethical Considerations Toward Protestware, Marc Cheong, Raula Kula, Christoph Treude Jun 2024

Research Collection School Of Computing and Information Systems

This article looks into possible scenarios where developers might consider turning their free and open source software into protestware. Using different frameworks commonly used in artificial intelligence (AI) ethics, we extend the applications of AI ethics to the study of protestware.


Large Language Models For Qualitative Research In Software Engineering: Exploring Opportunities And Challenges, Muneera Bano, Rashina Hoda, Didar Zowghi, Christoph Treude May 2024

Research Collection School Of Computing and Information Systems

The recent surge in the integration of Large Language Models (LLMs) like ChatGPT into qualitative research in software engineering, much like in other professional domains, demands a closer inspection. This vision paper seeks to explore the opportunities of using LLMs in qualitative research to address many of its legacy challenges as well as potential new concerns and pitfalls arising from the use of LLMs. We share our vision for the evolving role of the qualitative researcher in the age of LLMs and contemplate how they may utilize LLMs at various stages of their research experience.


An Evaluation Of Heart Rate Monitoring With In-Ear Microphones Under Motion, Kayla-Jade Butkow, Ting Dang, Andrea Ferlini, Dong Ma, Yang Liu, Cecilia Mascolo May 2024

Research Collection School Of Computing and Information Systems

With the soaring adoption of in-ear wearables, the research community has started investigating suitable in-ear heart rate detection systems. Heart rate is a key physiological marker of cardiovascular health and physical fitness. Continuous and reliable heart rate monitoring with wearable devices has therefore gained increasing attention in recent years. Existing heart rate detection systems in wearables mainly rely on photoplethysmography (PPG) sensors; however, these are notorious for poor performance in the presence of human motion. In this work, leveraging the occlusion effect that enhances low-frequency bone-conducted sounds in the ear canal, we investigate for the first time in-ear audio-based motion-resilient …


Redriver: Runtime Enforcement For Autonomous Vehicles, Yang Sun, Christopher M. Poskitt, Xiaodong Zhang, Jun Sun Apr 2024

Research Collection School Of Computing and Information Systems

Autonomous driving systems (ADSs) integrate sensing, perception, drive control, and several other critical tasks in autonomous vehicles, motivating research into techniques for assessing their safety. While there are several approaches for testing and analysing them in high-fidelity simulators, ADSs may still encounter additional critical scenarios beyond those covered once they are deployed on real roads. An additional level of confidence can be established by monitoring and enforcing critical properties when the ADS is running. Existing work, however, is only able to monitor simple safety properties (e.g., avoidance of collisions) and is limited to blunt enforcement mechanisms such as hitting the …


Exploring The Potential Of ChatGPT In Automated Code Refinement: An Empirical Study, Qi Guo, Shangqing Liu, Junming Cao, Xiaohong Li, Xin Peng, Xiaofei Xie, Bihuan Chen Apr 2024

Research Collection School Of Computing and Information Systems

Code review is an essential activity for ensuring the quality and maintainability of software projects. However, it is a time-consuming and often error-prone task that can significantly impact the development process. Recently, ChatGPT, a cutting-edge language model, has demonstrated impressive performance in various natural language processing tasks, suggesting its potential to automate code review processes. However, it is still unclear how well ChatGPT performs in code review tasks. To fill this gap, in this paper, we conduct the first empirical study to understand the capabilities of ChatGPT in code review tasks, specifically focusing on automated code refinement based on given …


Teaching Software Development For Real-World Problems Using A Microservice-Based Collaborative Problem-Solving Approach, Yi Meng Lau, Christian Michael Koh, Lingxiao Jiang Apr 2024

Research Collection School Of Computing and Information Systems

Experienced and skillful software developers are needed in organizations to develop software products effective for their business with shortened time-to-market. Such developers will not only need to code but also be able to work in teams and collaboratively solve real-world problems that organizations are facing. It is challenging for educators to nurture students to become such developers with strong technical, social, and cognitive skills. Towards addressing the challenge, this study presents a Collaborative Software Development Project Framework for a course that focuses on learning microservices architectures and developing a software application for a real-world business. Students get to work in teams to …


Marco: A Stochastic Asynchronous Concolic Explorer, Jie Hu, Yue Duan, Heng Yin Apr 2024

Research Collection School Of Computing and Information Systems

Concolic execution is a powerful program analysis technique for code path exploration. Despite recent advances that greatly improved the efficiency of concolic execution engines, path constraint solving remains a major bottleneck of concolic testing. An intelligent scheduler for inputs/branches becomes even more crucial. Our studies show that the previously under-studied branch-flipping policy adopted by state-of-the-art concolic execution engines has several limitations. We propose to assess each branch by its potential for new code coverage from a global view, concerning the path divergence probability at each branch. To validate this idea, we implemented a prototype Marco and evaluated it against the …


ACAV: A Framework For Automatic Causality Analysis In Autonomous Vehicle Accident Recordings, Huijia Sun, Christopher M. Poskitt, Yang Sun, Jun Sun, Yuqi Chen Apr 2024

Research Collection School Of Computing and Information Systems

The rapid progress of autonomous vehicles (AVs) has brought the prospect of a driverless future closer than ever. Recent fatalities, however, have emphasized the importance of safety validation through large-scale testing. Multiple approaches achieve this fully automatically using high-fidelity simulators, i.e., by generating diverse driving scenarios and evaluating autonomous driving systems (ADSs) against different test oracles. While effective at finding violations, these approaches do not identify the decisions and actions that caused them -- information that is critical for improving the safety of ADSs. To address this challenge, we propose ACAV, an automated framework designed to conduct causality analysis for …


Ur2m: Uncertainty And Resource-Aware Event Detection On Microcontrollers, Hong Jia, Young D. Kwon, Dong Ma, Nhat Pham, Lorena Qendro, Tam Vu, Cecilia Mascolo Mar 2024

Research Collection School Of Computing and Information Systems

Traditional machine learning techniques are prone to generating inaccurate predictions when confronted with shifts in the distribution of data between the training and testing phases. This vulnerability can lead to severe consequences, especially in applications such as mobile healthcare. Uncertainty estimation has the potential to mitigate this issue by assessing the reliability of a model's output. However, existing uncertainty estimation techniques often require substantial computational resources and memory, making them impractical for implementation on microcontrollers (MCUs). This limitation hinders the feasibility of many important on-device wearable event detection (WED) applications, such as heart attack detection. In this paper, we present …


Fixing Your Own Smells: Adding A Mistake-Based Familiarization Step When Teaching Code Refactoring, Ivan Wei Han Tan, Christopher M. Poskitt Mar 2024

Research Collection School Of Computing and Information Systems

Programming problems can be solved in a multitude of functionally correct ways, but the quality of these solutions (e.g. readability, maintainability) can vary immensely. When code quality is poor, symptoms emerge in the form of 'code smells', which are specific negative characteristics (e.g. duplicate code) that can be resolved by applying refactoring patterns. Many undergraduate computing curricula train students on this software engineering practice, often doing so via exercises on unfamiliar instructor-provided code. Our observation, however, is that this makes it harder for novices to internalise refactoring as part of their own development practices. In this paper, we propose a …


DiTMoS: Delving Into Diverse Tiny-Model Selection On Microcontrollers, Xiao Ma, Shengfeng He, Hezhe Qiao, Dong Ma Mar 2024

Research Collection School Of Computing and Information Systems

Enabling efficient and accurate deep neural network (DNN) inference on microcontrollers is non-trivial due to the constrained on-chip resources. Current methodologies primarily focus on compressing larger models yet at the expense of model accuracy. In this paper, we rethink the problem from the inverse perspective by constructing small/weak models directly and improving their accuracy. Thus, we introduce DiTMoS, a novel DNN training and inference framework with a selector-classifiers architecture, where the selector routes each input sample to the appropriate classifier for classification. DiTMoS is grounded on a key insight: a composition of weak models can exhibit high diversity and the …


Delving Into Multimodal Prompting For Fine-Grained Visual Classification, Xin Jiang, Hao Tang, Junyao Gao, Xiaoyu Du, Shengfeng He, Zechao Li Feb 2024

Research Collection School Of Computing and Information Systems

Fine-grained visual classification (FGVC) involves categorizing fine subdivisions within a broader category, which poses challenges due to subtle inter-class discrepancies and large intra-class variations. However, prevailing approaches primarily focus on uni-modal visual concepts. Recent advancements in pre-trained vision-language models have demonstrated remarkable performance in various high-level vision tasks, yet the applicability of such models to FGVC tasks remains uncertain. In this paper, we aim to fully exploit the capabilities of cross-modal description to tackle FGVC tasks and propose a novel multimodal prompting solution, denoted as MP-FGVC, based on the contrastive language-image pre-training (CLIP) model. Our MP-FGVC comprises a multimodal prompts …


Detecting Outdated Code Element References In Software Repository Documentation, Wen Siang Tan, Markus Wagner, Christoph Treude Feb 2024

Research Collection School Of Computing and Information Systems

Outdated documentation is a pervasive problem in software development, preventing effective use of software, and misleading users and developers alike. We posit that one possible reason why documentation becomes out of sync so easily is that developers are unaware of when their source code modifications render the documentation obsolete. Ensuring that the documentation is always in sync with the source code takes considerable effort, especially for large codebases. To address this situation, we propose an approach that can automatically detect code element references that survive in the documentation after all source code instances have been deleted. In this work, we …
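The idea of flagging code element references that remain in documentation after the code is gone can be illustrated with a minimal sketch. This is not the authors' approach, which the truncated abstract does not detail; it assumes that code elements appear in backticks in the documentation and that a plain whole-word text search over the source files is enough to decide whether an element still exists. The function name `stale_references` and both parameters are hypothetical.

```python
import re

# Hypothetical helper: names and matching rules are illustrative assumptions,
# not the method described in the listed paper.
def stale_references(doc_text: str, source_texts: list[str]) -> list[str]:
    """Return backticked identifiers in doc_text that occur in no source text."""
    # Candidate code elements: dotted identifiers wrapped in backticks.
    pattern = r"`([A-Za-z_][A-Za-z0-9_]*(?:\.[A-Za-z_][A-Za-z0-9_]*)*)`"
    referenced = re.findall(pattern, doc_text)
    source = "\n".join(source_texts)
    stale = []
    # dict.fromkeys de-duplicates while keeping first-seen order.
    for name in dict.fromkeys(referenced):
        # Whole-word search so that `run` does not match inside `run_server`.
        if not re.search(r"\b" + re.escape(name) + r"\b", source):
            stale.append(name)
    return stale


if __name__ == "__main__":
    doc = "Call `load_config` first, then `run_server`."
    src = ["def run_server():\n    pass\n"]
    print(stale_references(doc, src))
```

A real detector would also need to compare against the repository history, so that only references to deleted elements (rather than names that never existed in the code) are reported.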


Vibmilk: Non-Intrusive Milk Spoilage Detection Via Smartphone Vibration, Yuezhong Wu, Wei Song, Yanxiang Wang, Dong Ma, Weitao Xu, Mahbub Hassan, Wen Hu Feb 2024

Research Collection School Of Computing and Information Systems

Quantifying the chemical process of milk spoilage is challenging due to the need for bulky, expensive equipment that is not user-friendly for milk producers or customers. This lack of a convenient and accurate milk spoilage detection system can cause two significant issues. First, people who consume spoiled milk may experience serious health problems. Secondly, milk manufacturers typically provide a “best before” date to indicate freshness, but this date only shows the highest quality of the milk, not the last day it can be safely consumed, leading to significant milk waste. A practical and efficient solution to this problem is proposed …


Remote Multi-Person Heart Rate Monitoring With Smart Speakers: Overcoming Separation Constraint, Ngoc Doan Thu Tran, Dong Ma, Rajesh Krishna Balan Jan 2024

Research Collection School Of Computing and Information Systems

Heart rate is a key vital sign that can be used to understand an individual’s health condition. Recently, remote sensing techniques, especially acoustic-based sensing, have received increasing attention for their ability to non-invasively detect heart rate via commercial mobile devices such as smartphones and smart speakers. However, due to signal interference, existing methods have primarily focused on monitoring a single user and required a large separation between them when monitoring multiple people. These limitations hinder many common use cases such as couples sharing the same bed or two or more people located in close proximity. In this paper, we present …


Provably Secure Decisions Based On Potentially Malicious Information, Dongxia Wang, Tim Muller, Jun Sun Jan 2024

Research Collection School Of Computing and Information Systems

There are various security-critical decisions routinely made, on the basis of information provided by peers: routing messages, user reports, sensor data, navigational information, blockchain updates, etc. Jury theorems were proposed in sociology to make decisions based on information from peers, which assume peers may be mistaken with some probability. We focus on attackers in a system, which manifest as peers that strategically report fake information to manipulate decision making. We define the property of robustness: a lower bound probability of deciding correctly, regardless of what information attackers provide. When peers are independently selected, we propose an optimal, robust decision mechanism …
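To make the notion of a lower-bound probability of deciding correctly concrete, the following sketch computes a classic jury-theorem quantity for simple majority voting. It is an illustration under stated assumptions, not the mechanism proposed in the paper: honest peers are assumed independent and each correct with probability `p`, the decision is binary, and every attacker is assumed to report the wrong answer, which is the worst case for a majority vote. The function name `majority_correct` is hypothetical.

```python
from math import comb

# Illustrative assumption: simple majority voting over a binary decision,
# not the optimal mechanism from the listed paper.
def majority_correct(n_honest: int, p: float, n_attackers: int = 0) -> float:
    """Probability that a strict majority of all votes is correct.

    Each honest peer is independently correct with probability p;
    every attacker is assumed to vote for the wrong answer.
    """
    total = n_honest + n_attackers
    needed = total // 2 + 1  # correct votes required for a strict majority
    # Sum the binomial probabilities of having at least `needed` correct honest votes.
    return sum(
        comb(n_honest, k) * p**k * (1 - p) ** (n_honest - k)
        for k in range(needed, n_honest + 1)
    )


if __name__ == "__main__":
    print(majority_correct(3, 0.8))     # three honest peers, no attackers
    print(majority_correct(3, 0.8, 1))  # the same peers plus one attacker
```

With three honest peers at `p = 0.8`, adding a single attacker lowers the probability of a correct majority from about 0.896 to about 0.512, which shows why a bound that holds regardless of attacker behaviour is useful.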


ConceptThread: Visualizing Threaded Concepts In MOOC Videos, Zhiguang Zhou, Li Ye, Lihong Cai, Lei Wang, Yigang Wang, Yongheng Wang, Wei Chen, Yong Wang Jan 2024

Research Collection School Of Computing and Information Systems

Massive Open Online Course (MOOC) platforms have become increasingly popular in recent years. Online learners need to watch the whole course video on MOOC platforms to learn the underlying new knowledge, which is often tedious and time-consuming due to the lack of a quick overview of the covered knowledge and its structure. In this paper, we propose ConceptThread, a visual analytics approach to effectively show the concepts and the relations among them to facilitate effective online learning. Specifically, given that the majority of MOOC videos contain slides, we first leverage video processing and speech analysis techniques, including shot recognition, …


Big Code Search: A Bibliography, Kisub Kim, Sankalp Ghatpande, Dongsun Kim, Xin Zhou, Kui Liu, Tegawende F. Bissyande, Jacques Klein, Yves Le Traon Jan 2024

Research Collection School Of Computing and Information Systems

Code search is an essential task in software development. Developers often search the internet and other code databases for necessary source code snippets to ease the development efforts. Code search techniques also help learn programming as novice programmers or students can quickly retrieve (hopefully good) examples already used in actual software projects. Given the recurrence of the code search activity in software development, there is an increasing interest in the research community. To improve the code search experience, the research community suggests many code search tools and techniques. These tools and techniques leverage several different ideas and claim a better …


ClearSpeech: Improving Voice Quality Of Earbuds Using Both In-Ear And Out-Ear Microphones, Dong Ma, Ting Dang, Ming Ding, Rajesh Krishna Balan Jan 2024

Research Collection School Of Computing and Information Systems

Wireless earbuds have been gaining increasing popularity, and using them to make phone calls or issue voice commands requires the earbud microphones to pick up human speech. When the speaker is in a noisy environment, speech quality degrades significantly and requires speech enhancement (SE). In this paper, we present ClearSpeech, a novel deep-learning-based SE system designed for wireless earbuds. Specifically, by jointly using the earbud’s in-ear and out-ear microphones, we devised a suite of techniques to effectively fuse the two signals and enhance the magnitude and phase of the speech spectrogram. We built an earbud prototype to evaluate ClearSpeech under …


Conversational Localization: Indoor Human Localization Through Intelligent Conversation, Sheshadri Smitha, Kotaro Hara Jan 2024

Research Collection School Of Computing and Information Systems

We propose a novel sensorless approach to indoor localization by leveraging natural language conversations with users, which we call conversational localization. To show the feasibility of conversational localization, we develop a proof-of-concept system that guides users to describe their surroundings in a chat and estimates their position based on the information they provide. We devised a modular architecture for our system with four modules. First, we construct an entity database with available image-based floor maps. Second, we enable the dynamic identification and scoring of information provided by users through our utterance processing module. Then, we implement a conversational agent that …


Stealthy Backdoor Attack For Code Models, Zhou Yang, Bowen Xu, Jie M. Zhang, Hong Jin Kang, Jieke Shi, Junda He, David Lo Jan 2024

Research Collection School Of Computing and Information Systems

Code models, such as CodeBERT and CodeT5, offer general-purpose representations of code and play a vital role in supporting downstream automated software engineering tasks. Most recently, code models were revealed to be vulnerable to backdoor attacks. A code model that is backdoor-attacked can behave normally on clean examples but will produce pre-defined malicious outputs on examples injected with triggers that activate the backdoors. Existing backdoor attacks on code models use unstealthy and easy-to-detect triggers. This paper aims to investigate the vulnerability of code models with stealthy backdoor attacks. To this end, we propose AFRAIDOOR (Adversarial Feature as Adaptive Backdoor). AFRAIDOOR achieves stealthiness …


Active Code Learning: Benchmarking Sample-Efficient Training Of Code Models, Qiang Hu, Yuejun Guo, Xiaofei Xie, Maxime Cordy, Lei Ma, Mike Papadakis, Yves Le Traon Jan 2024

Research Collection School Of Computing and Information Systems

The costly human effort required to prepare the training data of machine learning (ML) models hinders their practical development and usage in software engineering (ML4Code), especially for those with limited budgets. Therefore, efficiently training models of code with less human effort has become an emergent problem. Active learning is one technique that addresses this issue: it allows developers to train a model with reduced data while still reaching the desired performance, and it has been well studied in the computer vision and natural language processing domains. Unfortunately, there is no such work that explores the effectiveness of active learning for code …


Learning An Interpretable Stylized Subspace For 3D-Aware Animatable Artforms, Chenxi Zheng, Bangzhen Liu, Xuemiao Xu, Huaidong Zhang, Shengfeng He Jan 2024

Research Collection School Of Computing and Information Systems

Throughout history, static paintings have captivated viewers within display frames, yet the possibility of making these masterpieces vividly interactive remains intriguing. This research paper introduces 3DArtmator, a novel approach that aims to represent artforms in a highly interpretable stylized space, enabling 3D-aware animatable reconstruction and editing. Our rationale is to transfer the interpretability and 3D controllability of the latent space in a 3D-aware GAN to a stylized sub-space of a customized GAN, revitalizing the original artforms. To this end, the proposed two-stage optimization framework of 3DArtmator begins with discovering an anchor in the original latent space that accurately mimics the …


Better Pay Attention Whilst Fuzzing, Shunkai Zhu, Jingyi Wang, Jun Sun, Jie Yang, Xingwei Lin, Liyi Zhang, Peng Cheng Dec 2023

Research Collection School Of Computing and Information Systems

Fuzzing is one of the prevailing methods for vulnerability detection. However, even state-of-the-art fuzzing methods become ineffective after some period of time, i.e., the coverage hardly improves because existing methods are ineffective at focusing the attention of fuzzing on covering the hard-to-trigger program paths. In other words, they cannot generate inputs that can break the bottleneck due to the fundamental difficulty in capturing the complex relations between the test inputs and program coverage. In particular, existing fuzzers suffer from the following main limitations: 1) lacking an overall analysis of the program to identify the most “rewarding” seeds, and 2) lacking …


A Reliable And Secure Mobile Cyber-Physical Digital Microfluidic Biochip For Intelligent Healthcare, Yinan Yao, Decheng Qiu, Huangda Liu, Zhongliao Yang, Ximeng Liu, Yang Yang, Chen Dong Dec 2023

Research Collection School Of Computing and Information Systems

Digital microfluidics, as an emerging technology with great potential, diversifies the platforms for biochemical applications, such as protein dilution and sewage detection. At present, the vast majority of universal cyber-physical digital microfluidic biochips (DMFBs) transmit data through wires via personal computers and microcontrollers (such as Arduino); consequently, they are susceptible to various security threats and, with the popularity of wireless devices, are gradually losing competitiveness. On the premise that security be ensured first and foremost, wireless, portable, safe, and economical DMFBs are needed to expand their application fields, engage more users, and cater to the trend of future wireless communication. To this end, a new …


A Closer Look At The Security Risks In The Rust Ecosystem, Xiaoye Zheng, Zhiyuan Wan, Yun Zhang, Rui Chang, David Lo Dec 2023

Research Collection School Of Computing and Information Systems

Rust is an emerging programming language designed for the development of systems software. To facilitate the reuse of Rust code, crates.io, as a central package registry of the Rust ecosystem, hosts thousands of third-party Rust packages. The openness of crates.io enables the growth of the Rust ecosystem but comes with security risks, as evidenced by severe security advisories. Although Rust guarantees a software program to be safe via programming language features and strict compile-time checking, the unsafe keyword in Rust allows developers to bypass compiler safety checks for certain regions of code. Prior studies empirically investigate the memory safety and concurrency bugs …


C³: Code Clone-Based Identification Of Duplicated Components, Yanming Yang, Ying Zou, Xing Hu, David Lo, Chao Ni, John C. Grundy, Xin Xia Dec 2023

Research Collection School Of Computing and Information Systems

Reinventing the wheel is a detrimental programming practice in software development that frequently results in the introduction of duplicated components. This practice not only leads to increased maintenance and labor costs but also poses a higher risk of propagating bugs throughout the system. Despite numerous issues introduced by duplicated components in software, the identification of component-level clones remains a significant challenge that existing studies struggle to effectively tackle. Specifically, existing methods face two primary limitations that are challenging to overcome: 1) Measuring the similarity between different components presents a challenge due to the significant size differences among them; 2) Identifying …


On The Usage Of Continual Learning For Out-Of-Distribution Generalization In Pre-Trained Language Models Of Code, Martin Weyssow, Xin Zhou, Kisub Kim, David Lo, Houari A. Sahraoui Dec 2023

Research Collection School Of Computing and Information Systems

Pre-trained language models (PLMs) have become a prevalent technique in deep learning for code, utilizing a two-stage pre-training and fine-tuning procedure to acquire general knowledge about code and specialize in a variety of downstream tasks. However, the dynamic nature of software codebases poses a challenge to the effectiveness and robustness of PLMs. In particular, world-realistic scenarios potentially lead to significant differences between the distribution of the pre-training and test data, i.e., distribution shift, resulting in a degradation of the PLM's performance on downstream tasks. In this paper, we stress the need for adapting PLMs of code to software data whose …


Memory Network-Based Interpreter Of User Preferences In Content-Aware Recommender Systems, Nhu Thuat Tran, Hady W. Lauw Dec 2023

Research Collection School Of Computing and Information Systems

This article introduces a novel architecture for two objectives, recommendation and interpretability, in a unified model. We leverage textual content as a source of interpretability in content-aware recommender systems. The goal is to characterize user preferences with a set of human-understandable attributes, each described by a single word, enabling comprehension of user interests behind item adoptions. This is achieved via a dedicated architecture, which is interpretable by design, involving two components for recommendation and interpretation. In particular, we seek an interpreter, which accepts a holistic user representation from a recommender to output a set of activated attributes describing user preferences. …