Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Computer Sciences

Software engineering

Full-Text Articles in Physical Sciences and Mathematics

Large Language Models For Qualitative Research In Software Engineering: Exploring Opportunities And Challenges, Muneera Bano, Rashina Hoda, Didar Zowghi, Christoph Treude May 2024

Research Collection School Of Computing and Information Systems

The recent surge in the integration of Large Language Models (LLMs) like ChatGPT into qualitative research in software engineering, much like in other professional domains, demands a closer inspection. This vision paper seeks to explore the opportunities of using LLMs in qualitative research to address many of its legacy challenges as well as potential new concerns and pitfalls arising from the use of LLMs. We share our vision for the evolving role of the qualitative researcher in the age of LLMs and contemplate how they may utilize LLMs at various stages of their research experience.


ReScape: Transforming Coral-Reefscape Images For Quantitative Analysis, Zachary Ferris, Eraldo Ribeiro, Tomofumi Nagata, Robert Van Woesik Apr 2024

Ocean Engineering and Marine Sciences Faculty Publications

Ever since the first image of a coral reef was captured in 1885, people worldwide have been accumulating images of coral reefscapes that document the historic conditions of reefs. However, these innumerable reefscape images suffer from perspective distortion, which reduces the apparent size of distant taxa, rendering the images unusable for quantitative analysis of reef conditions. Here we solve this century-long distortion problem by developing a novel computer-vision algorithm, ReScape, which removes the perspective distortion from reefscape images by transforming them into top-down views, making them usable for quantitative analysis of reef conditions. In doing so, we demonstrate the …


Fixing Your Own Smells: Adding A Mistake-Based Familiarization Step When Teaching Code Refactoring, Ivan Wei Han Tan, Christopher M. Poskitt Mar 2024

Research Collection School Of Computing and Information Systems

Programming problems can be solved in a multitude of functionally correct ways, but the quality of these solutions (e.g. readability, maintainability) can vary immensely. When code quality is poor, symptoms emerge in the form of 'code smells', which are specific negative characteristics (e.g. duplicate code) that can be resolved by applying refactoring patterns. Many undergraduate computing curricula train students on this software engineering practice, often doing so via exercises on unfamiliar instructor-provided code. Our observation, however, is that this makes it harder for novices to internalise refactoring as part of their own development practices. In this paper, we propose a …


DexBERT: Effective, Task-Agnostic And Fine-Grained Representation Learning Of Android Bytecode, Tiezhu Sun, Kevin Allix, Kisub Kim, Xin Zhou, Dongsun Kim, David Lo, Tegawendé F. Bissyande, Jacques Klein Oct 2023

Research Collection School Of Computing and Information Systems

The automation of an increasingly large number of software engineering tasks is becoming possible thanks to Machine Learning (ML). One foundational building block in the application of ML to software artifacts is the representation of these artifacts (e.g., source code or executable code) into a form that is suitable for learning. Traditionally, researchers and practitioners have relied on manually selected features, based on expert knowledge, for the task at hand. Such knowledge is sometimes imprecise and generally incomplete. To overcome this limitation, many studies have leveraged representation learning, delegating to ML itself the job of automatically devising suitable …


TechSumBot: A Stack Overflow Answer Summarization Tool For Technical Query, Chengran Yang, Bowen Xu, Jiakun Liu, David Lo May 2023

Research Collection School Of Computing and Information Systems

Stack Overflow is a popular platform for developers to seek solutions to programming-related problems. However, prior studies identified that developers may suffer from the redundant, useless, and incomplete information retrieved by the Stack Overflow search engine. To help developers better utilize the Stack Overflow knowledge, researchers proposed tools to summarize answers to a Stack Overflow question. However, existing tools use hand-crafted features to assess the usefulness of each answer sentence and fail to remove semantically redundant information in the result. Besides, existing tools only focus on a certain programming language and cannot retrieve up-to-date, newly posted knowledge from Stack Overflow. …


She Elicits Requirements And He Tests: Software Engineering Gender Bias In Large Language Models, Christoph Treude, Hideaki Hata May 2023

Research Collection School Of Computing and Information Systems

Implicit gender bias in software development is a well-documented issue, such as the association of technical roles with men. To address this bias, it is important to understand it in more detail. This study uses data mining techniques to investigate the extent to which 56 tasks related to software development, such as assigning GitHub issues and testing, are affected by implicit gender bias embedded in large language models. We systematically translated each task from English into a genderless language and back, and investigated the pronouns associated with each task. Based on translating each task 100 times in different permutations, we …


Conversations With ChatGPT About C Programming: An Ongoing Study, James C. Davis, Yung-Hsiang Lu, George K. Thiruvathukal Mar 2023

Computer Science: Faculty Publications and Other Works

AI (Artificial Intelligence) Generative Models have attracted great attention in recent years. Generative models can be used to create new articles, visual arts, music composition, even computer programs from English specifications. Among all generative models, ChatGPT is becoming one of the most well-known since its public announcement in November 2022. GPT means Generative Pre-trained Transformer. ChatGPT is an online program that can interact with human users in text formats and is able to answer questions in many topics, including computer programming. Many computer programmers, including students and professionals, are considering the use of ChatGPT as an aid. The quality …


Introduction, Raffi T. Khatchadourian Jan 2023

Open Educational Resources

No abstract provided.


Reengineering And Refactoring, Raffi T. Khatchadourian Jan 2023

Open Educational Resources

No abstract provided.


I Know What You Are Searching For: Code Snippet Recommendation From Stack Overflow Posts, Zhipeng Gao, Xin Xia, David Lo, John C. Grundy, Xindong Zhang, Zhenchang Xing Jan 2023

Research Collection School Of Computing and Information Systems

Stack Overflow has been heavily used by software developers to seek programming-related information. More and more developers use Community Question and Answer forums, such as Stack Overflow, to search for code examples of how to accomplish a certain coding task. This is often considered to be more efficient than working from source documentation, tutorials, or full worked examples. However, due to the complexity of these online Question and Answer forums and the very large volume of information they contain, developers can be overwhelmed by the sheer volume of available information. This makes it hard to find and/or even be aware …


What Pakistani Computer Science And Software Engineering Students Think About Software Testing?, Luiz Fernando Capretz, Abdul Rehman Gilal Dec 2022

Electrical and Computer Engineering Publications

Software testing is one of the crucial supporting processes of the software life cycle. Unfortunately for the software industry, the role is stigmatized, partly due to misperception and partly due to treatment of the role. The present study aims to analyze the situation to explore what restricts computer science and software engineering students from taking up a testing career in the software industry. To conduct this study, we surveyed 88 Pakistani students taking computer science or software engineering degrees. The results showed that the present study supports previous work into the unpopularity of testing compared to other software life cycle …


An Empirical Study On The Classification Of Python Language Features Using Eye-Tracking, Jigyasa Chauhan Dec 2022

Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

Python, currently one of the most popular programming languages, is an object-oriented language that also provides language feature support for other programming paradigms, such as functional and procedural. It is not currently understood how support for multiple paradigms affects the ability of developers to comprehend that code. Understanding the predominant paradigm in code, and how developers classify the predominant paradigm, can benefit future research in program comprehension as the paradigm may factor into how people comprehend that code. Other researchers may want to look at how the paradigms in the code interact with various code smells. To investigate how …


The Minority In The Minority, Black Women In Computer Science Fields: A Phenomenological Study, Blanche' D. Anderson Nov 2022

Doctoral Dissertations and Projects

The purpose of this transcendental phenomenological study was to describe the lived experiences of Black women with a bachelor’s, master’s, or doctoral degree in computer science, currently employed in the United States. The theory guiding this study was Krumboltz’s social learning theory of career decision-making, as it provides a foundation for understanding how a combination of factors leads to an individual’s educational and occupational preferences and skills. This qualitative study answered the following central research question: What are the lived experiences of Black women with a bachelor’s, master’s, or doctoral degree in computer science, currently employed in the United States? …


An Empirical Study Of Artifacts And Security Risks In The Pre-Trained Model Supply Chain, Wenxin Jiang, Nicholas Synovic, Rohan Sethi, Aryan Indarapu, Matt Hyatt, Taylor R. Schorlemmer, George K. Thiruvathukal, James C. Davis Nov 2022

Computer Science: Faculty Publications and Other Works

Deep neural networks achieve state-of-the-art performance on many tasks, but require increasingly complex architectures and costly training procedures. Engineers can reduce costs by reusing a pre-trained model (PTM) and fine-tuning it for their own tasks. To facilitate software reuse, engineers collaborate around model hubs, collections of PTMs and datasets organized by problem domain. Although model hubs are now comparable in popularity and size to other software ecosystems, the associated PTM supply chain has not yet been examined from a software engineering perspective.

We present an empirical study of artifacts and security features in 8 model hubs. We indicate the potential …


RecipeGen++: An Automated Trigger Action Programs Generator, Imam Nur Bani Yusuf, Diyanah Abdul Jamal, Lingxiao Jiang, David Lo Nov 2022

Research Collection School Of Computing and Information Systems

Trigger Action Programs (TAPs) are event-driven rules that allow users to automate smart-devices and internet services. Users can write TAPs by specifying triggers and actions from a set of predefined channels and functions. Despite its simplicity, composing TAPs can still be challenging for users due to the enormous search space of available triggers and actions. The growing popularity of TAPs is followed by the increasing number of supported devices and services, resulting in a huge number of possible combinations between triggers and actions. Motivated by such a fact, we improve our prior work and propose RecipeGen++, a deep-learning-based approach that …


iTiger: An Automatic Issue Title Generation Tool, Ting Zhang, Ivana Clairine Irsan, Thung Ferdian, Donggyun Han, David Lo, Lingxiao Jiang Nov 2022

Research Collection School Of Computing and Information Systems

In both commercial and open-source software, bug reports or issues are used to track bugs or feature requests. However, the quality of issues can differ a lot. Prior research has found that bug reports with good quality tend to gain more attention than the ones with poor quality. As an essential component of an issue, title quality is an important aspect of issue quality. Moreover, issues are usually presented in a list view, where only the issue title and some metadata are present. In this case, a concise and accurate title is crucial for readers to grasp the general concept …


Testing Research Software: A Survey, Nasir U. Eisty, Jeffrey C. Carver Nov 2022

Computer Science Faculty Publications and Presentations

Background: Research software plays an important role in solving real-life problems, empowering scientific innovations, and handling emergency situations. Therefore, the correctness and trustworthiness of research software are of absolute importance. Software testing is an important activity for identifying problematic code and helping to produce high-quality software. However, testing of research software is difficult due to the complexity of the underlying science, relatively unknown results from scientific algorithms, and the culture of the research software community.

Aims: The goal of this paper is to better understand current testing practices, identify challenges, and provide recommendations on how to improve the testing process …


Including Everyone, Everywhere: Understanding Opportunities And Challenges Of Geographic Gender-Inclusion In OSS, Gede Artha Azriadi Prana, Denae Ford, Ayushi Rastogi, David Lo, Rahul Purandare, Nachiappan Nagappan Feb 2022

Research Collection School Of Computing and Information Systems

The gender gap is a significant concern facing the software industry as the development becomes more geographically distributed. Widely shared reports indicate that gender differences may be specific to each region. However, how complete can these reports be with little to no research reflective of the Open Source Software (OSS) process and communities software is now commonly developed in? Our study presents a multi-region geographical analysis of gender inclusion on GitHub. This mixed-methods approach includes quantitatively investigating differences in gender inclusion in projects across geographic regions and investigating these trends over time using data from contributions to 21,456 project repositories. …


Eclipse, OSGi, And The Java Model, Raffi T. Khatchadourian Jan 2022

Open Educational Resources

No abstract provided.


Abstract Syntax Trees (ASTs) And The Visitor Pattern, Raffi T. Khatchadourian Jan 2022

Open Educational Resources

No abstract provided.


Developers Perception Of Peer Code Review In Research Software Development, Nasir U. Eisty, Jeffrey C. Carver Jan 2022

Computer Science Faculty Publications and Presentations

Context: Research software is software developed by and/or used by researchers, across a wide variety of domains, to perform their research. Because of the complexity of research software, developers cannot conduct exhaustive testing. As a result, researchers have lower confidence in the correctness of the output of the software. Peer code review, a standard software engineering practice, has helped address this problem in other types of software.

Objective: Peer code review is less prevalent in research software than it is in other types of software. In addition, the literature does not contain any studies about the use of peer code …


Automatic Transformation Of Natural To Unified Modeling Language: A Systematic Review, Sharif Ahmed, Arif Ahmed, Nasir U. Eisty Jan 2022

Computer Science Faculty Publications and Presentations

Context: Processing Software Requirement Specifications (SRS) manually is time-consuming for requirement analysts in software engineering. Researchers have been working on making an automatic approach to ease this task. Most of the existing approaches require some intervention from an analyst or are challenging to use. Some automatic and semi-automatic approaches were developed based on heuristic rules or machine learning algorithms. However, there are various constraints to the existing approaches to UML generation, such as restrictions on ambiguity, length or structure, anaphora, incompleteness, atomicity of input text, requirements of domain ontology, etc. Objective: This study aims to better understand …


Predictive Models In Software Engineering: Challenges And Opportunities, Yanming Yang, Xin Xia, David Lo, Tingting Bi, John C. Grundy, Xiaohu Yang Jan 2022

Research Collection School Of Computing and Information Systems

Predictive models are one of the most important techniques that are widely applied in many areas of software engineering. There have been a large number of primary studies that apply predictive models and that present well-performed studies in various research domains, including software requirements, software design and development, testing and debugging, and software maintenance. This article is a first attempt to systematically organize knowledge in this area by surveying a body of 421 papers on predictive models published between 2009 and 2020. We describe the key models and approaches used, classify the different models, summarize the range of key application …


Software Engineering Approaches For TinyML Based IoT Embedded Vision: A Systematic Literature Review, Shashank Bangalore Lakshman, Nasir U. Eisty Jan 2022

Computer Science Faculty Publications and Presentations

Internet of Things (IoT) has catapulted human ability to control our environments through ubiquitous sensing, communication, computation, and actuation. Over the past few years, IoT has joined forces with Machine Learning (ML) to embed deep intelligence at the far edge. TinyML (Tiny Machine Learning) has enabled the deployment of ML models for embedded vision on extremely lean edge hardware, bringing the power of IoT and ML together. However, TinyML powered embedded vision applications are still in a nascent stage, and they are just starting to scale to widespread real-world IoT deployment. To harness the true potential of IoT and ML, …


A Survey On Deep Learning For Software Engineering, Yanming Yang, Xin Xia, David Lo Jan 2022

Research Collection School Of Computing and Information Systems

In 2006, Geoffrey Hinton proposed the concept of training "Deep Neural Networks (DNNs)" and an improved model training method to break the bottleneck of neural network development. More recently, the introduction of AlphaGo in 2016 demonstrated the powerful learning ability of deep learning and its enormous potential. Deep learning has been increasingly used to develop state-of-the-art software engineering (SE) research tools due to its ability to boost performance for various SE tasks. There are many factors, e.g., deep learning model selection, internal structure differences, and model optimization techniques, that may have an impact on the performance of DNNs applied in …


Comparing The Popularity Of Testing Careers Among Canadian, Indian, Chinese, And Malaysian Students, Luiz Fernando Capretz, Pradeep Waychal, Jingdong Jia, Shuib Basri Nov 2021

Electrical and Computer Engineering Publications

This study attempts to understand motivators and de-motivators that influence the decisions of software students to take up and sustain software testing careers across four different countries, Canada, India, China, and Malaysia. Towards that end, we have developed a cross-sectional, but simple, survey-based instrument. In this study we investigated how software engineering and computer science students perceive and value what they do and their environmental settings. This study found that very few students are keen to take up software testing careers - why is this happening with such an important task in the software life cycle? The common advantages of …


Developing A Suitability Assessment Criteria For Software Developers: Behavioral Assessment Using Psychometric Test, Jayati Gulati, Bharti Suri, Luiz Fernando Capretz, Bimlesh Wadhwa, Anu Singh Lather Oct 2021

Electrical and Computer Engineering Publications

A suitability assessment instrument for software developers was created using psychometric criteria that identify the impact of behavior on the performance of software engineers. The instrument uses a questionnaire to help both individuals and IT recruiters to identify the psychological factors that affect the working performance of software engineers. Our study identifies the relationship between the behavioral drivers and the programming abilities of the subjects. In order to evaluate the instrument, a total of 100 respondents were compared on the basis of their programming skills and nine behavioral drivers. It was concluded that there is a direct relationship between …


RobOT: Robustness-Oriented Testing For Deep Learning Systems, Jingyi Wang, Jialuo Chen, Youcheng Sun, Xingjun Ma, Dongxia Wang, Jun Sun, Peng Cheng May 2021

Research Collection School Of Computing and Information Systems

Recently, there has been a significant growth of interest in applying software engineering techniques for the quality assurance of deep learning (DL) systems. One popular direction is deep learning testing, where adversarial examples (a.k.a. bugs) of DL systems are found either by fuzzing or guided search with the help of certain testing metrics. However, recent studies have revealed that the neuron coverage metrics commonly used by existing DL testing approaches are not correlated to model robustness. They are also not an effective measure of confidence in the model robustness after testing. In this work, we address this gap by …


Automatic Solution Summarization For Crash Bugs, Haoye Wang, Xin Xia, David Lo, John C. Grundy, Xinyu Wang May 2021

Research Collection School Of Computing and Information Systems

The causes of software crashes can be hidden anywhere in the source code and development environment. When encountering software crashes, recurring bugs that are discussed on Q&A sites could provide developers with solutions to their crashing problems. However, it is difficult for developers to accurately search for relevant content on search engines, and developers have to spend a lot of manual effort to find the right solution from the returned results. In this paper, we present CRASOLVER, an approach that takes into account both the structural information of crash traces and the knowledge of crash-causing bugs to automatically summarize solutions …


Research Artifact: The Potential Of Meta-Maintenance On GitHub, Hideaki Hata, Raula Kula, Takashi Ishio, Christoph Treude May 2021

Research Collection School Of Computing and Information Systems

This is a research artifact for the paper “Same File, Different Changes: The Potential of Meta-Maintenance on GitHub”. This artifact is a data repository including a list of the 32,007 studied repositories on GitHub, a list of the 401,610,677 targeted files, the results of the qualitative analysis for RQ2, RQ3, and RQ4, the results of the quantitative analysis for RQ5, and survey material for RQ6. The purpose of this artifact is to enable researchers to replicate the paper's mixed-methods results, and to reuse the results of our exploratory study for further software engineering research. This research artifact is available at https://github.com/NAIST-SE/MetaMaintenancePotential …