Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 30 of 34

Full-Text Articles in Physical Sciences and Mathematics

Fixing Your Own Smells: Adding A Mistake-Based Familiarization Step When Teaching Code Refactoring, Ivan Wei Han Tan, Christopher M. Poskitt Mar 2024

Research Collection School Of Computing and Information Systems

Programming problems can be solved in a multitude of functionally correct ways, but the quality of these solutions (e.g. readability, maintainability) can vary immensely. When code quality is poor, symptoms emerge in the form of 'code smells', which are specific negative characteristics (e.g. duplicate code) that can be resolved by applying refactoring patterns. Many undergraduate computing curricula train students on this software engineering practice, often doing so via exercises on unfamiliar instructor-provided code. Our observation, however, is that this makes it harder for novices to internalise refactoring as part of their own development practices. In this paper, we propose a …


Dexbert: Effective, Task-Agnostic And Fine-Grained Representation Learning Of Android Bytecode, Tiezhu Sun, Kevin Allix, Kisub Kim, Xin Zhou, Dongsun Kim, David Lo, Tegawendé F. Bissyande, Jacques Klein Oct 2023

Research Collection School Of Computing and Information Systems

The automation of an increasingly large number of software engineering tasks is becoming possible thanks to Machine Learning (ML). One foundational building block in the application of ML to software artifacts is the representation of these artifacts (e.g., source code or executable code) into a form that is suitable for learning. Traditionally, researchers and practitioners have relied on manually selected features, based on expert knowledge, for the task at hand. Such knowledge is sometimes imprecise and generally incomplete. To overcome this limitation, many studies have leveraged representation learning, delegating to ML itself the job of automatically devising suitable …


Techsumbot: A Stack Overflow Answer Summarization Tool For Technical Query, Chengran Yang, Bowen Xu, Jiakun Liu, David Lo May 2023

Research Collection School Of Computing and Information Systems

Stack Overflow is a popular platform for developers to seek solutions to programming-related problems. However, prior studies identified that developers may suffer from the redundant, useless, and incomplete information retrieved by the Stack Overflow search engine. To help developers better utilize the Stack Overflow knowledge, researchers proposed tools to summarize answers to a Stack Overflow question. However, existing tools use hand-crafted features to assess the usefulness of each answer sentence and fail to remove semantically redundant information in the result. Besides, existing tools only focus on a certain programming language and cannot retrieve up-to-date, newly posted knowledge from Stack Overflow. …


I Know What You Are Searching For: Code Snippet Recommendation From Stack Overflow Posts, Zhipeng Gao, Xin Xia, David Lo, John C. Grundy, Xindong Zhang, Zhenchang Xing Jan 2023

Research Collection School Of Computing and Information Systems

Stack Overflow has been heavily used by software developers to seek programming-related information. More and more developers use Community Question and Answer forums, such as Stack Overflow, to search for code examples of how to accomplish a certain coding task. This is often considered to be more efficient than working from source documentation, tutorials, or full worked examples. However, due to the complexity of these online Question and Answer forums and the very large volume of information they contain, developers can be overwhelmed by the sheer volume of available information. This makes it hard to find and/or even be aware …


RecipeGen++: An Automated Trigger Action Programs Generator, Imam Nur Bani Yusuf, Diyanah Abdul Jamal, Lingxiao Jiang, David Lo Nov 2022

Research Collection School Of Computing and Information Systems

Trigger Action Programs (TAPs) are event-driven rules that allow users to automate smart-devices and internet services. Users can write TAPs by specifying triggers and actions from a set of predefined channels and functions. Despite its simplicity, composing TAPs can still be challenging for users due to the enormous search space of available triggers and actions. The growing popularity of TAPs is followed by the increasing number of supported devices and services, resulting in a huge number of possible combinations between triggers and actions. Motivated by such a fact, we improve our prior work and propose RecipeGen++, a deep-learning-based approach that …


Itiger: An Automatic Issue Title Generation Tool, Ting Zhang, Ivana Clairine Irsan, Thung Ferdian, Donggyun Han, David Lo, Lingxiao Jiang Nov 2022

Research Collection School Of Computing and Information Systems

In both commercial and open-source software, bug reports or issues are used to track bugs or feature requests. However, the quality of issues can differ a lot. Prior research has found that bug reports with good quality tend to gain more attention than the ones with poor quality. As an essential component of an issue, title quality is an important aspect of issue quality. Moreover, issues are usually presented in a list view, where only the issue title and some metadata are present. In this case, a concise and accurate title is crucial for readers to grasp the general concept …


Including Everyone, Everywhere: Understanding Opportunities And Challenges Of Geographic Gender-Inclusion In OSS, Gede Artha Azriadi Prana, Denae Ford, Ayushi Rastogi, David Lo, Rahul Purandare, Nachiappan Nagappan Feb 2022

Research Collection School Of Computing and Information Systems

The gender gap is a significant concern facing the software industry as the development becomes more geographically distributed. Widely shared reports indicate that gender differences may be specific to each region. However, how complete can these reports be with little to no research reflective of the Open Source Software (OSS) process and communities software is now commonly developed in? Our study presents a multi-region geographical analysis of gender inclusion on GitHub. This mixed-methods approach includes quantitatively investigating differences in gender inclusion in projects across geographic regions and investigating these trends over time using data from contributions to 21,456 project repositories. …


A Survey On Deep Learning For Software Engineering, Yanming Yang, Xin Xia, David Lo Jan 2022

Research Collection School Of Computing and Information Systems

In 2006, Geoffrey Hinton proposed the concept of training "Deep Neural Networks (DNNs)" and an improved model training method to break the bottleneck of neural network development. More recently, the introduction of AlphaGo in 2016 demonstrated the powerful learning ability of deep learning and its enormous potential. Deep learning has been increasingly used to develop state-of-the-art software engineering (SE) research tools due to its ability to boost performance for various SE tasks. There are many factors, e.g., deep learning model selection, internal structure differences, and model optimization techniques, that may have an impact on the performance of DNNs applied in …


Predictive Models In Software Engineering: Challenges And Opportunities, Yanming Yang, Xin Xia, David Lo, Tingting Bi, John C. Grundy, Xiaohu Yang Jan 2022

Research Collection School Of Computing and Information Systems

Predictive models are one of the most important techniques that are widely applied in many areas of software engineering. There have been a large number of primary studies that apply predictive models and that present well-performed studies in various research domains, including software requirements, software design and development, testing and debugging, and software maintenance. This article is a first attempt to systematically organize knowledge in this area by surveying a body of 421 papers on predictive models published between 2009 and 2020. We describe the key models and approaches used, classify the different models, summarize the range of key application …


Automatic Solution Summarization For Crash Bugs, Haoye Wang, Xin Xia, David Lo, John C. Grundy, Xinyu Wang May 2021

Research Collection School Of Computing and Information Systems

The causes of software crashes can be hidden anywhere in the source code and development environment. When encountering software crashes, recurring bugs that are discussed on Q&A sites could provide developers with solutions to their crashing problems. However, it is difficult for developers to accurately search for relevant content on search engines, and developers have to spend a lot of manual effort to find the right solution from the returned results. In this paper, we present CRASOLVER, an approach that takes into account both the structural information of crash traces and the knowledge of crash-causing bugs to automatically summarize solutions …


Robot: Robustness-Oriented Testing For Deep Learning Systems, Jingyi Wang, Jialuo Chen, Youcheng Sun, Xingjun Ma, Dongxia Wang, Jun Sun, Peng Cheng May 2021

Research Collection School Of Computing and Information Systems

Recently, there has been a significant growth of interest in applying software engineering techniques for the quality assurance of deep learning (DL) systems. One popular direction is deep learning testing, where adversarial examples (a.k.a. bugs) of DL systems are found either by fuzzing or guided search with the help of certain testing metrics. However, recent studies have revealed that the commonly used neuron coverage metrics by existing DL testing approaches are not correlated to model robustness. It is also not an effective measurement on the confidence of the model robustness after testing. In this work, we address this gap by …


Did Our Course Design On Software Architecture Meet Our Student’s Learning Expectations?, Eng Lieh Ouh, Benjamin Gan, Yunghans Irawan Oct 2020

Research Collection School Of Computing and Information Systems

This Innovative Practice Full Paper discusses our course design on software architecture to meet the learning expectations of two groups of software engineers. Software engineers with working experiences frequently find themselves needing to upskill in their lifelong learning journey. Their learning expectations are shaped not just by their need to know but also other learning characteristics such as their working experiences. In many cases, we design courses based on the required learning outcomes and assessment criteria. In this paper, we wish to find out whether our course design on software architecture has met the learning expectations of our students …


How Practitioners Perceive Automated Bug Report Management Techniques, Weiqin Zou, David Lo, Zhenyu Chen, Xin Xia, Yang Feng, Baowen Xu Aug 2020

Research Collection School Of Computing and Information Systems

Bug reports play an important role in the process of debugging and fixing bugs. To reduce the burden of bug report managers and facilitate the process of bug fixing, a great amount of software engineering research has been invested into automated bug report management techniques. However, the verdict is still open whether such techniques are actually required and applicable outside of the theoretical research domain. To fill this gap, in this paper, we conducted a survey among 327 practitioners to gain their insights into various categories of automated bug report management techniques. Specifically, in the survey, we asked them to …


How Does Machine Learning Change Software Development Practices?, Zhiyuan Wan, Xin Xia, David Lo, Gail C. Murphy Aug 2019

Research Collection School Of Computing and Information Systems

Adding an ability for a system to learn inherently adds uncertainty into the system. Given the rising popularity of incorporating machine learning into systems, we wondered how the addition alters software development practices. We performed a mixture of qualitative and quantitative studies with 14 interviewees and 342 survey respondents from 26 countries across four continents to elicit significant differences between the development of machine learning systems and the development of non-machine-learning systems. Our study uncovers significant differences in various aspects of software engineering (e.g., requirements, design, testing, and process) and work characteristics (e.g., skill variety, problem solving and task identity). …


Single Image Reflection Removal Beyond Linearity, Qiang Wen, Yinjie Tan, Jing Qin, Wenxi Liu, Guoqiang Han, Shengfeng He Jun 2019

Research Collection School Of Computing and Information Systems

Due to the lack of paired data, the training of image reflection removal relies heavily on synthesizing reflection images. However, existing methods model reflection as a linear combination model, which cannot fully simulate the real-world scenarios. In this paper, we inject non-linearity into reflection removal from two aspects. First, instead of synthesizing reflection with a fixed combination factor or kernel, we propose to synthesize reflection images by predicting a non-linear alpha blending mask. This enables a free combination of different blurry kernels, leading to a controllable and diverse reflection synthesis. Second, we design a cascaded network for reflection removal with …


Recommending Who To Follow In The Software Engineering Twitter Space, Abhishek Sharma, Yuan Tian, Agus Sulistya, Dinusha Wijedasa, David Lo Nov 2018

Research Collection School Of Computing and Information Systems

With the advent of social media, developers are increasingly using it in their software development activities. Twitter is one of the popular social mediums used by developers. A recent study by Singer et al. found that software developers use Twitter to “keep up with the fast-paced development landscape.” Unfortunately, due to the general-purpose nature of Twitter, it’s challenging for developers to use Twitter for their development activities. Our survey with 36 developers who use Twitter in their development activities highlights that developers are interested in following specialized software gurus who share relevant technical tweets. To help developers perform this task, in …


Mining Implicit Design Templates For Actionable Code Reuse, Yun Lin, Guozhu Meng, Yinxing Yue, Zhenchang Xing, Jun Sun, Xin Peng, Yang Liu, Wenyun Zhao, Jin Song Dong Nov 2017

Research Collection School Of Computing and Information Systems

In this paper, we propose an approach to detecting project-specific recurring designs in code base and abstracting them into design templates as reuse opportunities. The mined templates allow programmers to make further customization for generating new code. The generated code involves the code skeleton of recurring design as well as the semi-implemented code bodies annotated with comments to remind programmers of necessary modification. We implemented our approach as an Eclipse plugin called MICoDe. We evaluated our approach with a reuse simulation experiment and a user study involving 16 participants. The results of our simulation experiment on 10 open source Java …


Classification-Based Parameter Synthesis For Parametric Timed Automata, Jiaying Li, Jun Sun, Bo Gao, Étienne Andre Nov 2017

Research Collection School Of Computing and Information Systems

Parametric timed automata are designed to model timed systems with unknown parameters, often representing design uncertainties of external environments. In order to design a robust system, it is crucial to synthesize constraints on the parameters, which guarantee the system behaves according to certain properties. Existing approaches suffer from scalability issues. In this work, we propose to enhance existing approaches through classification-based learning. We sample multiple concrete values for parameters and model check the corresponding non-parametric models. Based on the checking results, we form conjectures on the constraint through classification techniques, which can be subsequently confirmed by existing model checkers for …


Improving Probability Estimation Through Active Probabilistic Model Learning, Jingyi Wang, Xiaohong Chen, Jun Sun, Shengchao Qin Nov 2017

Research Collection School Of Computing and Information Systems

It is often necessary to estimate the probability of certain events occurring in a system. For instance, knowing the probability of events triggering a shutdown sequence allows us to estimate the availability of the system. One approach is to run the system multiple times and then construct a probabilistic model to estimate the probability. When the probability of the event to be estimated is low, many system runs are necessary in order to generate an accurate estimation. For complex cyber-physical systems, each system run is costly and time-consuming, and thus it is important to reduce the number of system runs …


A Semantics Comparison Workbench For A Concurrent, Asynchronous, Distributed Programming Language, Claudio Corrodi, Alexander Heußner, Christopher M. Poskitt Nov 2017

Research Collection School Of Computing and Information Systems

A number of high-level languages and libraries have been proposed that offer novel and simple to use abstractions for concurrent, asynchronous, and distributed programming. The execution models that realise them, however, often change over time (whether to improve performance, or to extend them to new language features), potentially affecting behavioural and safety properties of existing programs. This is exemplified by SCOOP, a message-passing approach to concurrent object-oriented programming that has seen multiple changes proposed and implemented, with demonstrable consequences for an idiomatic usage of its core abstraction. We propose a semantics comparison workbench for SCOOP with fully and semi-automatic tools for analysing …


APIBot: Question Answering Bot For API Documentation, Yuan Tian, Ferdian Thung, Abhishek Sharma, David Lo Nov 2017

Research Collection School Of Computing and Information Systems

As the carrier of Application Programming Interfaces (APIs) knowledge, API documentation plays a crucial role in how developers learn and use an API. It is also a valuable information resource for answering API-related questions, especially when developers cannot find reliable answers to their questions online/offline. However, finding answers to API-related questions from API documentation might not be easy because one may have to manually go through multiple pages before reaching the relevant page, and then read and understand the information inside the relevant page to figure out the answers. To deal with this challenge, we develop APIBot, a bot that …


Code Coverage And Postrelease Defects: A Large-Scale Study On Open Source Projects, Pavneet Singh Kochhar, David Lo, Julia Lawall, Nachiappan Nagappan Sep 2017

Research Collection School Of Computing and Information Systems

Testing is a pivotal activity in ensuring the quality of software. Code coverage is a common metric used as a yardstick to measure the efficacy and adequacy of testing. However, does higher coverage actually lead to a decline in postrelease bugs? Do files that have higher test coverage actually have fewer bug reports? The direct relationship between code coverage and actual bug reports has not yet been analyzed via a comprehensive empirical study on real bugs. Past studies only involve a few software systems or artificially injected bugs (mutants). In this empirical study, we examine these questions in the context …


An Exploratory Study Of Functionality And Learning Resources Of Web APIs On ProgrammableWeb, Yuan Tian, Pavneet Singh Kochhar, David Lo Jun 2017

Research Collection School Of Computing and Information Systems

Web APIs provide various functionalities that can be leveraged by developers in building their applications. ProgrammableWeb, which is the largest and most active web API and mashup collection, provides a record of thousands of web APIs and mashups. However, important properties about this large number of web APIs, such as their functionality and support/resources for learning, have never been studied by the existing research work. In this study, we perform an exploratory analysis on functionality and learning resources of 9,883 web APIs and 4,315 mashups listed on ProgrammableWeb, and find that: (1) web APIs provide a wide range of functionalities …


The Habits Of Highly Effective Researchers: An Empirical Study, Subhajit Datta, Partha Basuchowdhuri, Surajit Acharya, Subhashis Majumder Jan 2017

Research Collection School Of Computing and Information Systems

Interest in the habits of influential individuals cuts across domains. As researchers, we are intrigued why few attain significant eminence in their fields, whereas many operate in obscurity. An empirical examination of this question has been made possible by the recent availability of large scale publication data. In this paper, we use information from the AMiner Paper Citation and Author Collaboration Networks to discern factors that relate to the impact of influential researchers across five domains in the computing discipline. We propose and apply a novel algorithm to identify influential vertices in co-authorship networks built from total corpora of 1,00,000+ papers …


How Long Will This Live? Discovering The Lifespans Of Software Engineering Ideas, Subhajit Datta, Santonu Sarkar, A. S. M Sajeev Jun 2016

Research Collection School Of Computing and Information Systems

We all want to be associated with long lasting ideas; as originators, or at least, expositors. For a tyro researcher or a seasoned veteran, knowing how long an idea will remain interesting in the community is critical in choosing and pursuing research threads. In the physical sciences, the notion of half-life is often evoked to quantify decaying intensity. In this paper, we study a corpus of 19,000+ papers written by 21,000+ authors across 16 software engineering publication venues from 1975 to 2010, to empirically determine the half-life of software engineering research topics. In the absence of any consistent and well-accepted …


Choosing Your Weapons: On Sentiment Analysis Tools For Software Engineering Research, Robbert Jongeling, Subhajit Datta, Alexander Serebrenik Oct 2015

Research Collection School Of Computing and Information Systems

Recent years have seen an increasing attention to social aspects of software engineering, including studies of emotions and sentiments experienced and expressed by the software developers. Most of these studies reuse existing sentiment analysis tools such as SentiStrength and NLTK. However, these tools have been trained on product reviews and movie reviews and, therefore, their results might not be applicable in the software engineering domain. In this paper we study whether the sentiment analysis tools agree with the sentiment recognized by human evaluators (as reported in an earlier study) as well as with each other. Furthermore, we evaluate the impact …


The Importance Of Being Isolated: An Empirical Study On Chromium Reviews, Subhajit Datta, Devarshi Bhatt, Manish Jain, Proshanta Sarkar, Santonu Sarkar Oct 2015

Research Collection School Of Computing and Information Systems

As large scale software development has become more collaborative, and software teams more globally distributed, several studies have explored how developer interaction influences software development outcomes. The emphasis so far has been largely on outcomes like defect count, the time to close modification requests etc. In the paper, we examine data from the Chromium project to understand how different aspects of developer discussion relate to the closure time of reviews. On the basis of analyzing reviews discussed by 2000+ developers, our results indicate that quicker closure of reviews owned by a developer relates to higher reception of information and insights …


Managing Technical Debt: Insights From Recent Empirical Evidence, Narayan Ramasubbu, Chris F. Kemerer, C. Jason Woodard Mar 2015

Research Collection School Of Computing and Information Systems

Technical debt refers to maintenance obligations that software teams accumulate as a result of their actions. Empirical research has led researchers to suggest three dimensions along which software development teams should map their technical-debt metrics: customer satisfaction needs, reliability needs, and the probability of technology disruption.


Orion: A Software Project Search Engine With Integrated Diverse Software Artifacts, Tegawende F. Bissyande, Ferdian Thung, David Lo, Lingxiao Jiang, Laurent Réveillère Jul 2013

Research Collection School Of Computing and Information Systems

Software projects produce a wealth of data that is leveraged in different tasks and for different purposes: researchers collect project data for building experimental datasets; software programmers reuse code from projects; developers often explore the opportunities for getting involved in the development of a project to gain or offer expertise. Finding relevant projects that suit one's needs is however currently challenging with the capabilities of existing search systems. We propose Orion, an integrated search engine architecture that combines information from different types of software repositories from multiple sources to facilitate the construction and execution of advanced search queries. Orion provides …


Mining Iterative Generators And Representative Rules For Software Specification Discovery, David Lo, Jinyan Li, Limsoon Wong, Siau-Cheng Khoo Feb 2011

Research Collection School Of Computing and Information Systems

Billions of dollars are spent annually on software-related cost. It is estimated that up to 45 percent of software cost is due to the difficulty in understanding existing systems when performing maintenance tasks (i.e., adding features, removing bugs, etc.). One of the root causes is that software products often come with poor, incomplete, or even without any documented specifications. In an effort to improve program understanding, Lo et al. have proposed iterative pattern mining which outputs patterns that are repeated frequently within a program trace, or across multiple traces, or both. Frequent iterative patterns reflect frequent program behaviors that likely …