Open Access. Powered by Scholars. Published by Universities.®

Computer Sciences Commons

Articles 1 - 30 of 40

Full-Text Articles in Computer Sciences

A Closer Look At The Security Risks In The Rust Ecosystem, Xiaoye Zheng, Zhiyuan Wan, Yun Zhang, Rui Chang, David Lo Dec 2023

Research Collection School Of Computing and Information Systems

Rust is an emerging programming language designed for the development of systems software. To facilitate the reuse of Rust code, crates.io, as a central package registry of the Rust ecosystem, hosts thousands of third-party Rust packages. The openness of crates.io enables the growth of the Rust ecosystem but comes with security risks, as evidenced by severe security advisories. Although Rust guarantees a software program to be safe via programming language features and strict compile-time checking, the unsafe keyword in Rust allows developers to bypass compiler safety checks for certain regions of code. Prior studies empirically investigate the memory safety and concurrency bugs …


Web Apis: Features, Issues, And Expectations: A Large-Scale Empirical Study Of Web Apis From Two Publicly Accessible Registries Using Stack Overflow And A User Survey, Neng Zhang, Ying Zou, Xin Xia, David Lo, Shanping Li Feb 2023

Research Collection School Of Computing and Information Systems

With the increasing adoption of services-oriented computing and cloud computing technologies, web APIs have become the fundamental building blocks for constructing software applications. Web APIs are developed and published on the internet. The functionality of web APIs can be used to facilitate the development of software applications. There are numerous studies on retrieving and recommending candidate web APIs based on user requirements from a large set of web APIs. However, there are very limited studies on the features of web APIs that make them more likely to be used and the issues of using web APIs in practice. Moreover, users' …


Real World Projects, Real Faults: Evaluating Spectrum Based Fault Localization Techniques On Python Projects, Ratnadira Widyasari, Gede Artha Azriadi Prana, Stefanus Agus Haryono, Shaowei Wang, David Lo Nov 2022

Research Collection School Of Computing and Information Systems

Spectrum Based Fault Localization (SBFL) is a statistical approach to identify faulty code within a program given program spectra (i.e., records of program elements executed by passing and failing test cases). Several SBFL techniques have been proposed over the years, but most evaluations of those techniques were done only on Java and C programs, and frequently involved artificial faults. Considering the current popularity of Python, indicated by the results of the Stack Overflow survey among developers in 2020, it becomes increasingly important to understand how SBFL techniques perform on Python projects. However, this remains an understudied topic. In this …
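
To make the idea of scoring program elements from spectra concrete, here is a minimal sketch that ranks elements with the Ochiai formula, one well-known SBFL metric. The abstract does not say which techniques the study evaluates, so the choice of Ochiai, the function name `ochiai_scores`, and the toy coverage data are all assumptions made for illustration.

```python
import math


def ochiai_scores(coverage, outcomes):
    """Compute Ochiai suspiciousness for each program element.

    coverage: list of sets, one per test, holding the elements that test executed.
    outcomes: list of bools, one per test, True if the test passed.
    (Both inputs are a hypothetical representation of program spectra.)
    """
    total_failed = sum(1 for passed in outcomes if not passed)
    elements = set().union(*coverage) if coverage else set()
    scores = {}
    for element in elements:
        # ef / ep: failing / passing tests that executed this element
        ef = sum(1 for cov, ok in zip(coverage, outcomes) if element in cov and not ok)
        ep = sum(1 for cov, ok in zip(coverage, outcomes) if element in cov and ok)
        denominator = math.sqrt(total_failed * (ef + ep))
        scores[element] = ef / denominator if denominator else 0.0
    return scores


# Toy spectra: three tests over three statements; the first test passes.
coverage = [{"s1", "s2"}, {"s1", "s3"}, {"s1", "s2", "s3"}]
outcomes = [True, False, False]
scores = ochiai_scores(coverage, outcomes)
# Most suspicious first; ties broken by element name.
ranking = sorted(scores, key=lambda e: (-scores[e], e))
print(ranking)
```

In this toy input, `s3` is executed only by failing tests, so it receives the highest score; an element executed by every test, such as `s1`, is penalized for also appearing in the passing run.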


Defining Smart Contract Defects On Ethereum, Jiachi Chen, Xin Xia, David Lo, John Grundy, Xiapu Luo, Ting Chen Jan 2022

Research Collection School Of Computing and Information Systems

Smart contracts are programs running on a blockchain. They are immutable, and hence cannot be patched for bugs once deployed. Thus it is critical to ensure they are bug-free and well-designed before deployment. A contract defect is an error, flaw or fault in a smart contract that causes it to produce an incorrect or unexpected result, or to behave in unintended ways. The detection of contract defects is a method to avoid potential bugs and improve the design of existing code. Since smart contracts contain numerous distinctive features, such as the gas system and decentralization, it is important …


Smart Contract Security: A Practitioners' Perspective, Zhiyuan Wan, Xin Xia, David Lo, Jiachi Chen, Xiapu Luo, Xiaohu Yang May 2021

Research Collection School Of Computing and Information Systems

Smart contracts have been plagued by security incidents, which resulted in substantial financial losses. Given numerous research efforts in addressing the security issues of smart contracts, we wondered how software practitioners build security into smart contracts in practice. We performed a mixture of qualitative and quantitative studies with 13 interviewees and 156 survey respondents from 35 countries across six continents to understand practitioners' perceptions and practices on smart contract security. Our study uncovers practitioners' motivations and deterrents of smart contract security, as well as how security efforts and strategies fit into the development lifecycle. We also find that blockchain platforms …


Do Users Care About Ad's Performance Costs? Exploring The Effects Of The Performance Costs Of In-App Ads On User Experience, Cuiyun Gao, Jichuan Zeng, Federica Sarro, David Lo, Irwin King, Michael R. Lyu Apr 2021

Research Collection School Of Computing and Information Systems

Context: In-app advertising is the primary source of revenue for many mobile apps. The cost of advertising (ad cost) is non-negligible for app developers to ensure a good user experience and continuous profits. Previous studies mainly focus on addressing the hidden performance costs generated by ads, including consumption of memory, CPU, data traffic, and battery. However, there is no research on analyzing users’ perceptions of ads’ performance costs to our knowledge. Objective: To fill this gap and better understand the effects of performance costs of in-app ads on user experience, we conduct a study on analyzing user concerns about ads’ performance …


An Exploratory Study On The Introduction And Removal Of Different Types Of Technical Debt In Deep Learning Frameworks, Jiakun Liu, Qiao Huang, Xin Xia, Emad Shihab, David Lo, Shanping Li Feb 2021

Research Collection School Of Computing and Information Systems

To complete tasks faster, developers often have to sacrifice the quality of the software. Such compromised practice results in an increasing burden on developers in future development. The metaphor of technical debt describes this practice. Prior research has illustrated the negative impact of technical debt, and many researchers investigated how developers deal with a certain type of technical debt. However, few studies focused on the removal of different types of technical debt in practice. To fill this gap, we use the introduction and removal of different types of self-admitted technical debt (i.e., SATD) in 7 deep learning frameworks as an example. …


An Empirical Study Of The Dependency Networks Of Deep Learning Libraries, Junxiao Han, Shuiguang Deng, David Lo, Chen Zhi, Jianwei Yin, Xin Xia Sep 2020

Research Collection School Of Computing and Information Systems

Deep Learning techniques have been prevalent in various domains, and more and more open source projects on GitHub rely on deep learning libraries to implement their algorithms. To that end, they should always keep pace with the latest versions of deep learning libraries to make the best use of them. Aptly managing the versions of deep learning libraries can help projects avoid crashes or security issues caused by deep learning libraries. Unfortunately, very few studies have been done on the dependency networks of deep learning libraries. In this paper, we take the first step to perform an exploratory …


Is Using Deep Learning Frameworks Free?: Characterizing Technical Debt In Deep Learning Frameworks, Jiakun Liu, Qiao Huang, Xin Xia, Emad Shihab, David Lo, Shanping Li Jun 2020

Research Collection School Of Computing and Information Systems

Developers of deep learning applications (shortened as application developers) commonly use deep learning frameworks in their projects. However, due to time pressure, market competition, and cost reduction, developers of deep learning frameworks (shortened as framework developers) often have to sacrifice software quality to satisfy a shorter completion time. This practice leads to technical debt in deep learning frameworks, which results in an increasing burden on both the application developers and the framework developers in future development. In this paper, we analyze the comments indicating technical debt (self-admitted technical debt) in 7 of the most popular open-source deep learning frameworks. Although framework …


Are The Code Snippets What We Are Searching For? A Benchmark And An Empirical Study On Code Search With Natural-Language Queries, Shuhan Yan, Hang Yu, Yuting Chen, Beijun Shen Feb 2020

Research Collection School Of Computing and Information Systems

Code search methods, especially those that allow programmers to raise queries in a natural language, play an important role in software development. They help to improve programmers' productivity by returning sample code snippets from the Internet and/or source-code repositories for their natural-language queries. Meanwhile, there are many code search methods in the literature that support natural-language queries. Difficulties exist in recognizing the strengths and weaknesses of each method and choosing the right one for different usage scenarios, because (1) the implementations of those methods and the datasets for evaluating them are usually not publicly available, and (2) some methods leverage …


Memory And Resource Leak Defects And Their Repairs In Java Projects, Mohammadreza Ghanavati, Diego Costa, Janos Seboek, David Lo, Artur Andrzejak Jan 2020

Research Collection School Of Computing and Information Systems

Despite huge software engineering efforts and programming language support, resource and memory leaks are still a troublesome issue, even in memory-managed languages such as Java. Understanding the properties of leak-inducing defects, how the leaks manifest, and how they are repaired is an essential prerequisite for designing better approaches for avoidance, diagnosis, and repair of leak-related bugs. We conduct a detailed empirical study on 452 issues from 10 large open-source Java projects. The study proposes taxonomies for the leak types, for the defects causing them, and for the repair actions. We investigate, under several aspects, the distributions within each taxonomy and …


An Empirical Study Towards Characterizing Deep Learning Development And Deployment Across Different Frameworks And Platforms, Qianyu Guo, Sen Chen, Xiaofei Xie, Lei Ma, Qiang Hu, Hongtao Liu, Yang Liu, Jianjun Zhao, Xiaohong Li Nov 2019

Research Collection School Of Computing and Information Systems

Deep Learning (DL) has recently achieved tremendous success. A variety of DL frameworks and platforms play a key role in catalyzing such progress. However, the differences in architecture designs and implementations of existing frameworks and platforms bring new challenges for DL software development and deployment. Until now, there has been no study on how various mainstream frameworks and platforms influence both DL software development and deployment in practice. To fill this gap, we take the first step towards understanding how the most widely-used DL frameworks and platforms support the DL software development and deployment. We conduct a systematic study on these frameworks …


How Does Machine Learning Change Software Development Practices?, Zhiyuan Wan, Xin Xia, David Lo, Gail C. Murphy Aug 2019

Research Collection School Of Computing and Information Systems

Adding an ability for a system to learn inherently adds uncertainty into the system. Given the rising popularity of incorporating machine learning into systems, we wondered how the addition alters software development practices. We performed a mixture of qualitative and quantitative studies with 14 interviewees and 342 survey respondents from 26 countries across four continents to elicit significant differences between the development of machine learning systems and the development of non-machine-learning systems. Our study uncovers significant differences in various aspects of software engineering (e.g., requirements, design, testing, and process) and work characteristics (e.g., skill variety, problem solving and task identity). …


Why Is My Code Change Abandoned?, Qingye Wang, Xin Xia, David Lo, Shanping Li Jun 2019

Research Collection School Of Computing and Information Systems

Software developers contribute numerous changes every day to the code review systems. However, not all submitted changes are merged into a codebase because they might not pass the code review process. Some changes would be abandoned or be asked for resubmission after improvement, which results in more workload for developers and reviewers, and more delays to deliverables. To understand the underlying reasons why changes are abandoned, we conduct an empirical study on the code review of four open source projects (Eclipse, LibreOffice, OpenStack, and Qt). First, we manually analyzed 1459 abandoned changes. Second, we leveraged the open card sorting method to …


On The Impact Of Refactoring On The Relationship Between Quality Attributes And Design Metrics, Mohamed Wiem Mkaouer, Eman Abdullah Alomar, Ali Ouni, Marouane Kessentini May 2019

Articles

Refactoring is a critical task in software maintenance and is generally performed to enforce the best design and implementation practices or to cope with design defects. Several studies have attempted to detect refactoring activities by mining software repositories, which allows researchers to collect, analyze, and derive actionable data-driven insights about refactoring practices within software projects. Aim: We aim at identifying, among the various quality models presented in the literature, the ones that are more in line with the developer’s vision of quality optimization, when they explicitly mention that they are refactoring to improve them. Method: We extract a large corpus of design-related refactoring activities …


On Reliability Of Patch Correctness Assessment, Xuan-Bach D. Le, Lingfeng Bao, David Lo, Xin Xia, Shanping Li, Corina S. Pasareanu May 2019

Research Collection School Of Computing and Information Systems

Current state-of-the-art automatic software repair (ASR) techniques rely heavily on incomplete specifications, or test suites, to generate repairs. This, however, may cause ASR tools to generate repairs that are incorrect and hard to generalize. To assess patch correctness, researchers have been following two methods separately: (1) Automated annotation, wherein patches are automatically labeled by an independent test suite (ITS) – a patch passing the ITS is regarded as correct or generalizable, and incorrect otherwise, (2) Author annotation, wherein authors of ASR techniques manually annotate the correctness labels of patches generated by their and competing tools. While automated annotation cannot ascertain …


Characterizing And Identifying Reverted Commits, Meng Yan, Xin Xia, David Lo, Ahmed E. Hassan, Shanping Li Mar 2019

Research Collection School Of Computing and Information Systems

In practice, a popular and coarse-grained approach for recovering from a problematic commit is to revert it (i.e., undoing the change). However, reverted commits could induce some issues for software development, such as impeding the development progress and increasing the difficulty for maintenance. In order to mitigate these issues, we set out to explore the following central question: can we characterize and identify which commits will be reverted? In this paper, we characterize commits using 27 commit features and build an identification model to identify commits that will be reverted. We first identify reverted commits by analyzing commit messages and …
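
As a rough illustration of identifying reverted commits from commit messages, the sketch below looks for the line that `git revert` writes into its default message ("This reverts commit <hash>."). The abstract is truncated before it describes the authors' full procedure, so this pattern-matching step, the function name `find_reverted_hashes`, and the sample messages are assumptions, not the paper's method.

```python
import re

# Default body line produced by `git revert`; the hash may be abbreviated.
REVERT_PATTERN = re.compile(r"This reverts commit ([0-9a-f]{7,40})")


def find_reverted_hashes(messages):
    """Return the set of commit hashes named as reverted in the given messages."""
    reverted = set()
    for message in messages:
        reverted.update(REVERT_PATTERN.findall(message))
    return reverted


# Hypothetical commit messages for demonstration only.
messages = [
    "Fix null check in parser",
    'Revert "Add cache layer"\n\n'
    "This reverts commit 1a2b3c4d5e6f7a8b9c0d1e2f3a4b5c6d7e8f9a0b.",
]
print(find_reverted_hashes(messages))
```

A message-based heuristic like this misses reverts whose messages were edited by hand, which is one reason a study might combine it with other signals.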


Break The Dead End Of Dynamic Slicing: Localizing Data And Control Omission Bug, Yun Lin, Jun Sun, Lyly Tran, Guangdong Bai, Haijun Wang, Jin Song Dong Sep 2018

Research Collection School Of Computing and Information Systems

Dynamic slicing is a common way of identifying the root cause when a program fault is revealed. With the dynamic slicing technique, the programmers can follow data and control flow along the program execution trace to the root cause. However, the technique usually fails to work on omission bugs, i.e., the faults which are caused by failing to execute some code. In many cases, dynamic slicing over-skips the root cause when an omission bug happens, leading the debugging process to a dead end. In this work, we conduct an empirical study on the omission bugs in the Defects4J bug repository. Our …


Proactive Empirical Assessment Of New Language Feature Adoption Via Automated Refactoring: The Case Of Java 8 Default Methods, Raffi T. Khatchadourian, Hidehiko Masuhara Apr 2018

Publications and Research

Programming languages and platforms improve over time, sometimes resulting in new language features that offer many benefits. However, despite these benefits, developers may not always be willing to adopt them in their projects for various reasons. In this paper, we describe an empirical study where we assess the adoption of a particular new language feature. Studying how developers use (or do not use) new language features is important in programming language research and engineering because it gives designers insight into the usability of the language to create meaningful programs in that language. This knowledge, in turn, can drive future innovations …


What Do Developers Search For On The Web?, Xin Xia, Lingfeng Bao, David Lo, Pavneet Singh Kochhar, Ahmed E. Hassan, Zhenchang Xing Dec 2017

Research Collection School Of Computing and Information Systems

Developers commonly make use of a web search engine such as Google to locate online resources to improve their productivity. A better understanding of what developers search for could help us understand their behaviors and the problems that they meet during the software development process. Unfortunately, we have a limited understanding of what developers frequently search for and of the search tasks that they often find challenging. To address this gap, we collected search queries from 60 developers and surveyed 235 software engineers from more than 21 countries across five continents. In particular, we asked our survey participants to rate the …


Bug Characteristics In Blockchain Systems: A Large-Scale Empirical Study, Zhiyuan Wan, David Lo, Xin Xia, Liang Cai Jun 2017

Research Collection School Of Computing and Information Systems

Bugs severely hurt blockchain system dependability. A thorough understanding of blockchain bug characteristics is required to design effective tools for preventing, detecting and mitigating bugs. We perform an empirical study on bug characteristics in eight representative open source blockchain systems. First, we manually examine 1,108 bug reports to understand the nature of the reported bugs. Second, we leverage card sorting to label the bug reports, and obtain ten bug categories in blockchain systems. We further investigate the frequency distribution of bug categories across projects and programming languages. Finally, we study the relationship between bug categories and bug fixing time. The …


Characterizing Malicious Android Apps By Mining Topic-Specific Data Flow Signatures, Xinli Yang, David Lo, Li Li, Xin Xia, Tegawendé F. Bissyande, Jacques Klein Apr 2017

Research Collection School Of Computing and Information Systems

Context: State-of-the-art works on automated detection of Android malware have leveraged app descriptions to spot anomalies w.r.t. the functionality implemented, or have used data flow information as a feature to discriminate malicious from benign apps. Although these works have yielded promising performance, we hypothesize that these performances can be improved by a better understanding of malicious behavior. Objective: To characterize malicious apps, we take into account both information on app descriptions, which are indicative of apps’ topics, and information on sensitive data flow, which can be relevant to discriminate malware from benign apps. Method: In this paper, we propose a topic-specific approach to …


What Security Questions Do Developers Ask? A Large-Scale Study Of Stack Overflow Posts, Xinli Yang, David Lo, Xin Xia, Zhi-Yuan Wan, Jian-Ling Sun Sep 2016

Research Collection School Of Computing and Information Systems

Security has always been a popular and critical topic. With the rapid development of information technology, it is always attracting people’s attention. However, since security has a long history, it covers a wide range of topics which change a lot, from classic cryptography to recently popular mobile security. There is a need to investigate security-related topics and trends, which can be a guide for security researchers, security educators and security practitioners. To address the above-mentioned need, in this paper, we conduct a large-scale study on security-related questions on Stack Overflow. Stack Overflow is a popular on-line question and answer site …


Tasker: Behavioral Insights Via Campus-Based Experimental Mobile Crowd-Sourcing, Thivya Kandappu, Jaiman, Tandriansyah, Archan Misra, Shih-Fen Cheng, Chen, Hoong Chuin Lau, Deepthi Chander, Koustuv Dasgupta Sep 2016

Research Collection School Of Computing and Information Systems

While mobile crowd-sourcing has become a game-changer for many urban operations, such as last mile logistics and municipal monitoring, we believe that the design of such crowdsourcing strategies must better accommodate the real-world behavioral preferences and characteristics of users. To provide a real-world testbed to study the impact of novel mobile crowd-sourcing strategies, we have designed, developed and experimented with a real-world mobile crowd-tasking platform on the SMU campus, called TA$Ker. We enhanced the TA$Ker platform to support several new features (e.g., task bundling, differential pricing and cheating analytics) and experimentally investigated these features via a two-month deployment of TA$Ker, …


Practitioners' Expectations On Automated Fault Localization, Pavneet Singh Kochhar, Xin Xia, David Lo, Shanping Li Jul 2016

Research Collection School Of Computing and Information Systems

Software engineering practitioners often spend a significant amount of time and effort to debug. To help practitioners perform this crucial task, hundreds of papers have proposed various fault localization techniques. Fault localization helps practitioners to find the location of a defect given its symptoms (e.g., program failures). These localization techniques have pinpointed the locations of bugs of various systems of diverse sizes, with varying degrees of success, and for various usage scenarios. Unfortunately, it is unclear whether practitioners appreciate this line of research. To fill this gap, we performed an empirical study by surveying 386 practitioners from more than 30 countries …


Empirical Investigation Of Key Business Factors For Digital Game Performance, Saiqa Aleem, Luiz Fernando Capretz, Faheem Ahmed Oct 2015

Electrical and Computer Engineering Publications

Game development is an interdisciplinary concept that embraces software engineering, business, management, and artistic disciplines. This research facilitates a better understanding of the business dimension of digital games. The main objective of this research is to investigate empirically the effect of business factors on the performance of digital games in the market and to answer the research questions asked in this study. Game development organizations are facing high pressure and competition in the digital game industry. Business has become a crucial dimension, especially for game development organizations. The main contribution of this paper is to investigate empirically the influence of …


An Empirical Assessment Of Bellon's Clone Benchmark, Alan Charpentier, Jean-Rémy Falleri, David Lo, Laurent Reveillere Apr 2015

Research Collection School Of Computing and Information Systems

Context: Clone benchmarks are essential to the assessment and improvement of clone detection tools and algorithms. Among existing benchmarks, Bellon's benchmark is widely used by the research community. However, a serious threat to the validity of this benchmark is that reference clones it contains have been manually validated by Bellon alone. Other persons may disagree with Bellon's judgment. Objective: In this paper, we perform an empirical assessment of Bellon's benchmark. Method: We seek the opinion of eighteen participants on a subset of Bellon's benchmark to determine if researchers should trust the reference clones it contains. Results: Our experiment shows that …


To What Extent Could We Detect Field Defects? An Extended Empirical Study Of False Negatives In Static Bug Finding Tools, Ferdian Thung, Lucia Lucia, David Lo, Lingxiao Jiang, Foyzur Rahman, Premkumar Devanbu Sep 2014

Research Collection School Of Computing and Information Systems

Software defects can cause much loss. Static bug-finding tools are designed to detect and remove software defects and are believed to be effective. However, do such tools in fact help prevent actual defects that occur in the field and are reported by users? If these tools had been used, would they have detected these field defects, and generated warnings that would direct programmers to fix them? To answer these questions, we perform an empirical study that investigates the effectiveness of five state-of-the-art static bug-finding tools (FindBugs, JLint, PMD, CheckStyle, and JCSC) on hundreds of reported and fixed defects extracted from three open …


An Empirical Study Of Adoption Of Software Testing In Open Source Projects, Pavneet Singh Kochhar, Tegawende F. Bissyande, David Lo, Lingxiao Jiang Jun 2014

David LO

In software engineering, testing is a crucial activity that is designed to ensure the quality of program code. For this activity, software teams spend substantial resources constructing test cases to thoroughly assess the correctness of software functionality. What is the proportion of open source projects that include test cases? What is the effect of the number of developers on the number of test cases? In this study, we explore open source projects and investigate the correlation between the presence of test cases and various project development characteristics, including the number of lines of code, the size of development teams and the …