Open Access. Powered by Scholars. Published by Universities.®

Computer Sciences Commons

Software Engineering

Research Collection School Of Computing and Information Systems

Articles 1 - 30 of 36

Full-Text Articles in Computer Sciences

Large Language Models For Qualitative Research In Software Engineering: Exploring Opportunities And Challenges, Muneera Bano, Rashina Hoda, Didar Zowghi, Christoph Treude May 2024

Research Collection School Of Computing and Information Systems

The recent surge in the integration of Large Language Models (LLMs) like ChatGPT into qualitative research in software engineering, much like in other professional domains, demands a closer inspection. This vision paper seeks to explore the opportunities of using LLMs in qualitative research to address many of its legacy challenges as well as potential new concerns and pitfalls arising from the use of LLMs. We share our vision for the evolving role of the qualitative researcher in the age of LLMs and contemplate how they may utilize LLMs at various stages of their research experience.


Fixing Your Own Smells: Adding A Mistake-Based Familiarization Step When Teaching Code Refactoring, Ivan Wei Han Tan, Christopher M. Poskitt Mar 2024

Research Collection School Of Computing and Information Systems

Programming problems can be solved in a multitude of functionally correct ways, but the quality of these solutions (e.g. readability, maintainability) can vary immensely. When code quality is poor, symptoms emerge in the form of 'code smells', which are specific negative characteristics (e.g. duplicate code) that can be resolved by applying refactoring patterns. Many undergraduate computing curricula train students on this software engineering practice, often doing so via exercises on unfamiliar instructor-provided code. Our observation, however, is that this makes it harder for novices to internalise refactoring as part of their own development practices. In this paper, we propose a …


DexBERT: Effective, Task-Agnostic And Fine-Grained Representation Learning Of Android Bytecode, Tiezhu Sun, Kevin Allix, Kisub Kim, Xin Zhou, Dongsun Kim, David Lo, Tegawendé F. Bissyandé, Jacques Klein Oct 2023

Research Collection School Of Computing and Information Systems

The automation of an increasingly large number of software engineering tasks is becoming possible thanks to Machine Learning (ML). One foundational building block in the application of ML to software artifacts is the representation of these artifacts (e.g., source code or executable code) into a form that is suitable for learning. Traditionally, researchers and practitioners have relied on manually selected features, based on expert knowledge, for the task at hand. Such knowledge is sometimes imprecise and generally incomplete. To overcome this limitation, many studies have leveraged representation learning, delegating to ML itself the job of automatically devising suitable …


She Elicits Requirements And He Tests: Software Engineering Gender Bias In Large Language Models, Christoph Treude, Hideaki Hata May 2023

Research Collection School Of Computing and Information Systems

Implicit gender bias in software development is a well-documented issue, such as the association of technical roles with men. To address this bias, it is important to understand it in more detail. This study uses data mining techniques to investigate the extent to which 56 tasks related to software development, such as assigning GitHub issues and testing, are affected by implicit gender bias embedded in large language models. We systematically translated each task from English into a genderless language and back, and investigated the pronouns associated with each task. Based on translating each task 100 times in different permutations, we …


TechSumBot: A Stack Overflow Answer Summarization Tool For Technical Query, Chengran Yang, Bowen Xu, Jiakun Liu, David Lo May 2023

Research Collection School Of Computing and Information Systems

Stack Overflow is a popular platform for developers to seek solutions to programming-related problems. However, prior studies identified that developers may suffer from the redundant, useless, and incomplete information retrieved by the Stack Overflow search engine. To help developers better utilize the Stack Overflow knowledge, researchers proposed tools to summarize answers to a Stack Overflow question. However, existing tools use hand-crafted features to assess the usefulness of each answer sentence and fail to remove semantically redundant information in the result. Besides, existing tools only focus on a certain programming language and cannot retrieve up-to-date, newly posted knowledge from Stack Overflow. …


I Know What You Are Searching For: Code Snippet Recommendation From Stack Overflow Posts, Zhipeng Gao, Xin Xia, David Lo, John C. Grundy, Xindong Zhang, Zhenchang Xing Jan 2023

Research Collection School Of Computing and Information Systems

Stack Overflow has been heavily used by software developers to seek programming-related information. More and more developers use Community Question and Answer forums, such as Stack Overflow, to search for code examples of how to accomplish a certain coding task. This is often considered to be more efficient than working from source documentation, tutorials, or full worked examples. However, due to the complexity of these online Question and Answer forums and the very large volume of information they contain, developers can be overwhelmed by the sheer volume of available information. This makes it hard to find and/or even be aware …


iTiger: An Automatic Issue Title Generation Tool, Ting Zhang, Ivana Clairine Irsan, Ferdian Thung, Donggyun Han, David Lo, Lingxiao Jiang Nov 2022

Research Collection School Of Computing and Information Systems

In both commercial and open-source software, bug reports or issues are used to track bugs or feature requests. However, the quality of issues can differ a lot. Prior research has found that bug reports with good quality tend to gain more attention than the ones with poor quality. As an essential component of an issue, title quality is an important aspect of issue quality. Moreover, issues are usually presented in a list view, where only the issue title and some metadata are present. In this case, a concise and accurate title is crucial for readers to grasp the general concept …


RecipeGen++: An Automated Trigger Action Programs Generator, Imam Nur Bani Yusuf, Diyanah Abdul Jamal, Lingxiao Jiang, David Lo Nov 2022

Research Collection School Of Computing and Information Systems

Trigger Action Programs (TAPs) are event-driven rules that allow users to automate smart-devices and internet services. Users can write TAPs by specifying triggers and actions from a set of predefined channels and functions. Despite its simplicity, composing TAPs can still be challenging for users due to the enormous search space of available triggers and actions. The growing popularity of TAPs is followed by the increasing number of supported devices and services, resulting in a huge number of possible combinations between triggers and actions. Motivated by such a fact, we improve our prior work and propose RecipeGen++, a deep-learning-based approach that …


Including Everyone, Everywhere: Understanding Opportunities And Challenges Of Geographic Gender-Inclusion In OSS, Gede Artha Azriadi Prana, Denae Ford, Ayushi Rastogi, David Lo, Rahul Purandare, Nachiappan Nagappan Feb 2022

Research Collection School Of Computing and Information Systems

The gender gap is a significant concern facing the software industry as development becomes more geographically distributed. Widely shared reports indicate that gender differences may be specific to each region. However, how complete can these reports be with little to no research reflective of the Open Source Software (OSS) process and communities software is now commonly developed in? Our study presents a multi-region geographical analysis of gender inclusion on GitHub. This mixed-methods approach includes quantitatively investigating differences in gender inclusion in projects across geographic regions and investigating these trends over time using data from contributions to 21,456 project repositories. …


A Survey On Deep Learning For Software Engineering, Yanming Yang, Xin Xia, David Lo Jan 2022

Research Collection School Of Computing and Information Systems

In 2006, Geoffrey Hinton proposed the concept of training "Deep Neural Networks (DNNs)" and an improved model training method to break the bottleneck of neural network development. More recently, the introduction of AlphaGo in 2016 demonstrated the powerful learning ability of deep learning and its enormous potential. Deep learning has been increasingly used to develop state-of-the-art software engineering (SE) research tools due to its ability to boost performance for various SE tasks. There are many factors, e.g., deep learning model selection, internal structure differences, and model optimization techniques, that may have an impact on the performance of DNNs applied in …


Predictive Models In Software Engineering: Challenges And Opportunities, Yanming Yang, Xin Xia, David Lo, Tingting Bi, John C. Grundy, Xiaohu Yang Jan 2022

Research Collection School Of Computing and Information Systems

Predictive models are one of the most important techniques that are widely applied in many areas of software engineering. There have been a large number of primary studies that apply predictive models and that present well-performed studies in various research domains, including software requirements, software design and development, testing and debugging, and software maintenance. This article is a first attempt to systematically organize knowledge in this area by surveying a body of 421 papers on predictive models published between 2009 and 2020. We describe the key models and approaches used, classify the different models, summarize the range of key application …


Research Artifact: The Potential Of Meta-Maintenance On GitHub, Hideaki Hata, Raula Kula, Takashi Ishio, Christoph Treude May 2021

Research Collection School Of Computing and Information Systems

This is a research artifact for the paper “Same File, Different Changes: The Potential of Meta-Maintenance on GitHub”. This artifact is a data repository including a list of the 32,007 studied repositories on GitHub, a list of the 401,610,677 targeted files, the results of the qualitative analysis for RQ2, RQ3, and RQ4, the results of the quantitative analysis for RQ5, and survey material for RQ6. The purpose of this artifact is to enable researchers to replicate the mixed-methods results of the paper and to reuse the results of our exploratory study for further software engineering research. This research artifact is available at https://github.com/NAIST-SE/MetaMaintenancePotential …


Same File, Different Changes: The Potential Of Meta-Maintenance On GitHub, Hideaki Hata, Raula Kula, Takashi Ishio, Christoph Treude May 2021

Research Collection School Of Computing and Information Systems

Online collaboration platforms such as GitHub have provided software developers with the ability to easily reuse and share code between repositories. With clone-and-own and forking becoming prevalent, maintaining these shared files is important, especially for keeping the most up-to-date version of reused code. Unlike related work, we propose the concept of meta-maintenance, i.e., tracking how the same files evolve in different repositories with the aim of providing useful maintenance opportunities for those files. We conduct an exploratory study by analyzing repositories from seven different programming languages to explore the potential of meta-maintenance. Our results indicate that a majority of active …


Automatic Solution Summarization For Crash Bugs, Haoye Wang, Xin Xia, David Lo, John C. Grundy, Xinyu Wang May 2021

Research Collection School Of Computing and Information Systems

The causes of software crashes can be hidden anywhere in the source code and development environment. When encountering software crashes, recurring bugs that are discussed on Q&A sites could provide developers with solutions to their crashing problems. However, it is difficult for developers to accurately search for relevant content on search engines, and developers have to spend a lot of manual effort to find the right solution from the returned results. In this paper, we present CRASOLVER, an approach that takes into account both the structural information of crash traces and the knowledge of crash-causing bugs to automatically summarize solutions …


RobOT: Robustness-Oriented Testing For Deep Learning Systems, Jingyi Wang, Jialuo Chen, Youcheng Sun, Xingjun Ma, Dongxia Wang, Jun Sun, Peng Cheng May 2021

Research Collection School Of Computing and Information Systems

Recently, there has been a significant growth of interest in applying software engineering techniques for the quality assurance of deep learning (DL) systems. One popular direction is deep learning testing, where adversarial examples (a.k.a. bugs) of DL systems are found either by fuzzing or guided search with the help of certain testing metrics. However, recent studies have revealed that the neuron coverage metrics commonly used by existing DL testing approaches are not correlated with model robustness. Nor are they an effective measure of confidence in model robustness after testing. In this work, we address this gap by …


Did Our Course Design On Software Architecture Meet Our Student’s Learning Expectations?, Eng Lieh Ouh, Benjamin Gan, Yunghans Irawan Oct 2020

Research Collection School Of Computing and Information Systems

This Innovative Practice Full Paper discusses our course design on software architecture to meet the learning expectations of two groups of software engineers. Software engineers with working experience frequently find themselves needing to upskill in their lifelong learning journey. Their learning expectations are shaped not just by their need to know but also by other learning characteristics such as their working experience. In many cases, we design courses based on the required learning outcomes and assessment criteria. In this paper, we wish to find out whether our course design on software architecture has met the learning expectations of our students …


How Practitioners Perceive Automated Bug Report Management Techniques, Weiqin Zou, David Lo, Zhenyu Chen, Xin Xia, Yang Feng, Baowen Xu Aug 2020

Research Collection School Of Computing and Information Systems

Bug reports play an important role in the process of debugging and fixing bugs. To reduce the burden of bug report managers and facilitate the process of bug fixing, a great amount of software engineering research has been invested into automated bug report management techniques. However, the verdict is still open whether such techniques are actually required and applicable outside of the theoretical research domain. To fill this gap, in this paper, we conducted a survey among 327 practitioners to gain their insights into various categories of automated bug report management techniques. Specifically, in the survey, we asked them to …


How Does Machine Learning Change Software Development Practices?, Zhiyuan Wan, Xin Xia, David Lo, Gail C. Murphy Aug 2019

Research Collection School Of Computing and Information Systems

Adding an ability for a system to learn inherently adds uncertainty into the system. Given the rising popularity of incorporating machine learning into systems, we wondered how the addition alters software development practices. We performed a mixture of qualitative and quantitative studies with 14 interviewees and 342 survey respondents from 26 countries across four continents to elicit significant differences between the development of machine learning systems and the development of non-machine-learning systems. Our study uncovers significant differences in various aspects of software engineering (e.g., requirements, design, testing, and process) and work characteristics (e.g., skill variety, problem solving and task identity). …


Single Image Reflection Removal Beyond Linearity, Qiang Wen, Yinjie Tan, Jing Qin, Wenxi Liu, Guoqiang Han, Shengfeng He Jun 2019

Research Collection School Of Computing and Information Systems

Due to the lack of paired data, the training of image reflection removal relies heavily on synthesizing reflection images. However, existing methods model reflection as a linear combination model, which cannot fully simulate the real-world scenarios. In this paper, we inject non-linearity into reflection removal from two aspects. First, instead of synthesizing reflection with a fixed combination factor or kernel, we propose to synthesize reflection images by predicting a non-linear alpha blending mask. This enables a free combination of different blurry kernels, leading to a controllable and diverse reflection synthesis. Second, we design a cascaded network for reflection removal with …


Recommending Who To Follow In The Software Engineering Twitter Space, Abhishek Sharma, Yuan Tian, Agus Sulistya, Dinusha Wijedasa, David Lo Nov 2018

Research Collection School Of Computing and Information Systems

With the advent of social media, developers are increasingly using it in their software development activities. Twitter is one of the popular social media platforms used by developers. A recent study by Singer et al. found that software developers use Twitter to “keep up with the fast-paced development landscape.” Unfortunately, due to the general-purpose nature of Twitter, it’s challenging for developers to use Twitter for their development activities. Our survey with 36 developers who use Twitter in their development activities highlights that developers are interested in following specialized software gurus who share relevant technical tweets. To help developers perform this task, in …


Mining Implicit Design Templates For Actionable Code Reuse, Yun Lin, Guozhu Meng, Yinxing Yue, Zhenchang Xing, Jun Sun, Xin Peng, Yang Liu, Wenyun Zhao, Jin Song Dong Nov 2017

Research Collection School Of Computing and Information Systems

In this paper, we propose an approach to detecting project-specific recurring designs in code base and abstracting them into design templates as reuse opportunities. The mined templates allow programmers to make further customization for generating new code. The generated code involves the code skeleton of recurring design as well as the semi-implemented code bodies annotated with comments to remind programmers of necessary modification. We implemented our approach as an Eclipse plugin called MICoDe. We evaluated our approach with a reuse simulation experiment and a user study involving 16 participants. The results of our simulation experiment on 10 open source Java …


Classification-Based Parameter Synthesis For Parametric Timed Automata, Jiaying Li, Jun Sun, Bo Gao, Étienne André Nov 2017

Research Collection School Of Computing and Information Systems

Parametric timed automata are designed to model timed systems with unknown parameters, often representing design uncertainties of external environments. In order to design a robust system, it is crucial to synthesize constraints on the parameters, which guarantee the system behaves according to certain properties. Existing approaches suffer from scalability issues. In this work, we propose to enhance existing approaches through classification-based learning. We sample multiple concrete values for parameters and model check the corresponding non-parametric models. Based on the checking results, we form conjectures on the constraint through classification techniques, which can be subsequently confirmed by existing model checkers for …


APIBot: Question Answering Bot For API Documentation, Yuan Tian, Ferdian Thung, Abhishek Sharma, David Lo Nov 2017

Research Collection School Of Computing and Information Systems

As the carrier of Application Programming Interfaces (APIs) knowledge, API documentation plays a crucial role in how developers learn and use an API. It is also a valuable information resource for answering API-related questions, especially when developers cannot find reliable answers to their questions online/offline. However, finding answers to API-related questions from API documentation might not be easy because one may have to manually go through multiple pages before reaching the relevant page, and then read and understand the information inside the relevant page to figure out the answers. To deal with this challenge, we develop APIBot, a bot that …


Improving Probability Estimation Through Active Probabilistic Model Learning, Jingyi Wang, Xiaohong Chen, Jun Sun, Shengchao Qin Nov 2017

Research Collection School Of Computing and Information Systems

It is often necessary to estimate the probability of certain events occurring in a system. For instance, knowing the probability of events triggering a shutdown sequence allows us to estimate the availability of the system. One approach is to run the system multiple times and then construct a probabilistic model to estimate the probability. When the probability of the event to be estimated is low, many system runs are necessary in order to generate an accurate estimation. For complex cyber-physical systems, each system run is costly and time-consuming, and thus it is important to reduce the number of system runs …


A Semantics Comparison Workbench For A Concurrent, Asynchronous, Distributed Programming Language, Claudio Corrodi, Alexander Heußner, Christopher M. Poskitt Nov 2017

Research Collection School Of Computing and Information Systems

A number of high-level languages and libraries have been proposed that offer novel and simple-to-use abstractions for concurrent, asynchronous, and distributed programming. The execution models that realise them, however, often change over time (whether to improve performance, or to extend them to new language features), potentially affecting behavioural and safety properties of existing programs. This is exemplified by SCOOP, a message-passing approach to concurrent object-oriented programming that has seen multiple changes proposed and implemented, with demonstrable consequences for an idiomatic usage of its core abstraction. We propose a semantics comparison workbench for SCOOP with fully and semi-automatic tools for analysing …


Code Coverage And Postrelease Defects: A Large-Scale Study On Open Source Projects, Pavneet Singh Kochhar, David Lo, Julia Lawall, Nachiappan Nagappan Sep 2017

Research Collection School Of Computing and Information Systems

Testing is a pivotal activity in ensuring the quality of software. Code coverage is a common metric used as a yardstick to measure the efficacy and adequacy of testing. However, does higher coverage actually lead to a decline in postrelease bugs? Do files that have higher test coverage actually have fewer bug reports? The direct relationship between code coverage and actual bug reports has not yet been analyzed via a comprehensive empirical study on real bugs. Past studies only involve a few software systems or artificially injected bugs (mutants). In this empirical study, we examine these questions in the context …


An Exploratory Study Of Functionality And Learning Resources Of Web APIs On ProgrammableWeb, Yuan Tian, Pavneet Singh Kochhar, David Lo Jun 2017

Research Collection School Of Computing and Information Systems

Web APIs provide various functionalities that can be leveraged by developers in building their applications. ProgrammableWeb, which is the largest and most active web API and mashup collection, provides a record of thousands of web APIs and mashups. However, important properties of this large number of web APIs, such as their functionality and support/resources for learning, have never been studied by existing research. In this study, we perform an exploratory analysis on functionality and learning resources of 9,883 web APIs and 4,315 mashups listed on ProgrammableWeb, and find that: (1) web APIs provide a wide range of functionalities …


How Long Will This Live? Discovering The Lifespans Of Software Engineering Ideas, Subhajit Datta, Santonu Sarkar, A. S. M Sajeev Jun 2016

Research Collection School Of Computing and Information Systems

We all want to be associated with long lasting ideas; as originators, or at least, expositors. For a tyro researcher or a seasoned veteran, knowing how long an idea will remain interesting in the community is critical in choosing and pursuing research threads. In the physical sciences, the notion of half-life is often evoked to quantify decaying intensity. In this paper, we study a corpus of 19,000+ papers written by 21,000+ authors across 16 software engineering publication venues from 1975 to 2010, to empirically determine the half-life of software engineering research topics. In the absence of any consistent and well-accepted …


Choosing Your Weapons: On Sentiment Analysis Tools For Software Engineering Research, Robbert Jongeling, Subhajit Datta, Alexander Serebrenik Oct 2015

Research Collection School Of Computing and Information Systems

Recent years have seen increasing attention to social aspects of software engineering, including studies of emotions and sentiments experienced and expressed by software developers. Most of these studies reuse existing sentiment analysis tools such as SentiStrength and NLTK. However, these tools have been trained on product reviews and movie reviews and, therefore, their results might not be applicable in the software engineering domain. In this paper we study whether the sentiment analysis tools agree with the sentiment recognized by human evaluators (as reported in an earlier study) as well as with each other. Furthermore, we evaluate the impact …


The Importance Of Being Isolated: An Empirical Study On Chromium Reviews, Subhajit Datta, Devarshi Bhatt, Manish Jain, Proshanta Sarkar, Santonu Sarkar Oct 2015

Research Collection School Of Computing and Information Systems

As large-scale software development has become more collaborative, and software teams more globally distributed, several studies have explored how developer interaction influences software development outcomes. The emphasis so far has been largely on outcomes like defect count, the time to close modification requests, etc. In this paper, we examine data from the Chromium project to understand how different aspects of developer discussion relate to the closure time of reviews. On the basis of analyzing reviews discussed by 2000+ developers, our results indicate that quicker closure of reviews owned by a developer relates to higher reception of information and insights …