Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Software Engineering

2018

Articles 31 - 60 of 258

Full-Text Articles in Physical Sciences and Mathematics

Ten Years Of Hunting For Similar Code For Fun And Profit (Keynote), Stephane Glondu, Lingxiao Jiang, Zhendong Su Nov 2018

Research Collection School Of Computing and Information Systems

In 2007, the Deckard paper was published at ICSE. Since its publication, it has led to much follow-up research and applications. The paper made two core contributions: a novel vector embedding of structured code for fast similarity detection, and an application of the embedding for clone detection, resulting in the Deckard tool. The vector embedding is simple and easy to adapt. Similar code detection is also fundamental for a range of classical and emerging problems in software engineering, security, and computer science education (e.g., code reuse, refactoring, porting, translation, synthesis, program repair, malware detection, and feedback generation). Both have buttressed …


Aligning Technical Debt Prioritization With Business Objectives: A Multiple-Case Study, Rodrigo Rebouças De Almeida, Uirá Kulesza, Christoph Treude, D’Angellys Cavalcanti Feitosa, Aliandro Higino Guedes Lima Nov 2018

Research Collection School Of Computing and Information Systems

Technical debt (TD) is a metaphor to describe the trade-off between short-term workarounds and long-term goals in software development. Despite being widely used to explain technical issues in business terms, industry and academia still lack a proper way to manage technical debt while explicitly considering business priorities. In this paper, we report on a multiple-case study of how two big software development companies handle technical debt items, and we show how taking the business perspective into account can improve the decision making for the prioritization of technical debt. We also propose a first step toward an approach that uses business …


Artefact: An R Implementation Of The Autospearman Function, Jirayus Jiarpakdee, Chakkrit Tantithamthavorn, Christoph Treude Nov 2018

Research Collection School Of Computing and Information Systems

This artefact is the implementation of AutoSpearman, an automated metric selection approach based on correlation analyses. The goal of AutoSpearman is to automatically mitigate correlated metrics prior to constructing analytical models. This artefact is implemented as an R package and is available in the GitHub repository. We provide descriptions and R code snippets for the installation of AutoSpearman and usage examples.
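The general idea of correlation-based metric filtering can be illustrated with a minimal Python sketch. This is not the AutoSpearman procedure or its R package API, whose details the abstract does not give; the greedy rule, the 0.7 threshold, and the names `spearman` and `drop_correlated` are assumptions made only for illustration.

```python
import math
from typing import Dict, List


def ranks(values: List[float]) -> List[float]:
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def spearman(x: List[float], y: List[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def drop_correlated(metrics: Dict[str, List[float]], threshold: float = 0.7) -> List[str]:
    """Greedily keep a metric only if it is not strongly correlated
    (|rho| >= threshold) with any metric already kept.
    The threshold and the greedy order are illustrative assumptions."""
    kept: List[str] = []
    for name in metrics:
        if all(abs(spearman(metrics[name], metrics[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical metric values for five modules.
example = {
    "loc": [10, 20, 30, 40, 50],
    "statements": [12, 25, 31, 48, 60],  # rises monotonically with loc
    "churn": [5, 1, 4, 2, 3],
}
selected = drop_correlated(example)
```

In this toy data, `statements` is a monotone function of `loc`, so the filter keeps only one of the two, while `churn` is retained.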


Autospearman: Automatically Mitigating Correlated Software Metrics For Interpreting Defect Models, Jirayus Jiarpakdee, Chakkrit Tantithamthavorn, Christoph Treude Nov 2018

Research Collection School Of Computing and Information Systems

The interpretation of defect models heavily relies on software metrics that are used to construct them. However, such software metrics are often correlated in defect models. Prior work often uses feature selection techniques to remove correlated metrics in order to improve the performance of defect models. Yet, the interpretation of defect models may be misleading if feature selection techniques produce subsets of inconsistent and correlated metrics. In this paper, we investigate the consistency and correlation of the subsets of metrics that are produced by nine commonly-used feature selection techniques. Through a case study of 13 publicly-available defect datasets, we find …


On The Sequential Massart Algorithm For Statistical Model Checking, Cyrille Jegourel, Jun Sun, Jin Song Dong Nov 2018

Research Collection School Of Computing and Information Systems

Several schemes have been provided in Statistical Model Checking (SMC) for the estimation of property occurrence based on predefined confidence and absolute or relative error. Simulations might, however, be costly if many samples are required, and the usual algorithms implemented in statistical model checkers tend to be conservative. Bayesian and rare event techniques can be used to reduce the sample size, but they cannot be applied without prerequisite or knowledge about the system under scrutiny. Recently, sequential algorithms based on Monte Carlo estimations and Massart bounds have been proposed to reduce the sample size while providing guarantees on error …
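As background for the "predefined confidence and absolute error" setting, the following Python sketch computes the fixed sample size given by the standard Chernoff-Hoeffding (Okamoto) bound, n ≥ ln(2/δ) / (2ε²), and runs a plain Monte Carlo estimate with it. It illustrates the conservative baseline only, not the sequential Massart algorithm the paper studies; the names `okamoto_sample_size`, `estimate_probability`, and the toy `simulate` function are assumptions.

```python
import math
import random
from typing import Callable


def okamoto_sample_size(epsilon: float, delta: float) -> int:
    """Smallest n with P(|p_hat - p| > epsilon) <= delta under the
    Chernoff-Hoeffding bound, for i.i.d. Bernoulli samples."""
    if not (0 < epsilon < 1) or not (0 < delta < 1):
        raise ValueError("epsilon and delta must lie strictly between 0 and 1")
    return math.ceil(math.log(2.0 / delta) / (2.0 * epsilon ** 2))


def estimate_probability(
    simulate: Callable[[random.Random], bool],
    epsilon: float,
    delta: float,
    seed: int = 0,
) -> float:
    """Monte Carlo estimate of the probability that one simulated run
    satisfies the property, using the fixed Okamoto sample size."""
    rng = random.Random(seed)
    n = okamoto_sample_size(epsilon, delta)
    successes = sum(1 for _ in range(n) if simulate(rng))
    return successes / n


# Hypothetical system: each run satisfies the property with probability 0.3.
def simulate(rng: random.Random) -> bool:
    return rng.random() < 0.3


estimate = estimate_probability(simulate, epsilon=0.05, delta=0.05)
```

Because the bound ignores the true probability, the sample size depends only on ε and δ, which is one reason such fixed-size schemes tend to be conservative.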


An Interpretable Neural Fuzzy Inference System For Predictions Of Underpricing In Initial Public Offerings, Di Wang, Xiaolin Qian, Chai Quek, Ah-Hwee Tan, Chunyan Miao, Xiaofeng Zhang, Geok See Ng, You Zhou Nov 2018

Research Collection School Of Computing and Information Systems

Due to their aptitude in both accurate data processing and human comprehensible reasoning, neural fuzzy inference systems have been widely adopted in various application domains as decision support systems. Especially in real-world scenarios such as decision making in financial transactions, the human experts may be more interested in knowing the comprehensive reasons for certain advice provided by a decision support system in addition to how confident the system is in such advice. In this paper, we apply an integrated autonomous computational model termed genetic algorithm and rough set incorporated neural fuzzy inference system (GARSINFIS) to predict underpricing in initial public …


Learning Probabilistic Models For Model Checking: An Evolutionary Approach And An Empirical Study, Jingyi Wang, Jun Sun, Qixia Yuan, Jun Pang Nov 2018

Research Collection School Of Computing and Information Systems

Many automated system analysis techniques (e.g., model checking, model-based testing) rely on first obtaining a model of the system under analysis. System modeling is often done manually, which is often considered a hindrance to adopting model-based system analysis and development techniques. To overcome this problem, researchers have proposed to automatically “learn” models based on sample system executions and shown that the learned models can be useful sometimes. There are however many questions to be answered. For instance, how much shall we generalize from the observed samples and how fast would learning converge? Or, would the analysis result based on …


Vt-Revolution: Interactive Programming Tutorials Made Possible, Lingfeng Bao, Zhenchang Xing, Xin Xia, David Lo, Shanping Li Nov 2018

Research Collection School Of Computing and Information Systems

Programming video tutorials showcase programming tasks and associated workflows. Although video tutorials are easy to create, it is often difficult to explore the captured workflows and interact with the programs in the videos. In this work, we propose a tool named VTRevolution – an interactive programming video tutorial authoring system. VTRevolution has two components: 1) a tutorial authoring system that leverages operating system level instrumentation to log workflow history while tutorial authors are creating programming video tutorials; 2) a tutorial watching system that enhances the learning experience of video tutorials by providing operation history and timeline-based browsing interactions. Our tutorial authoring system does not …


Infar: Insight Extraction From App Reviews, Cuiyun Gao, Jichuan Zeng, David Lo, Chin-Yew Lin, Michael R. Lyu, Irwin King Nov 2018

Research Collection School Of Computing and Information Systems

App reviews play an essential role for users to convey their feedback about using the app. The critical information contained in app reviews can assist app developers in maintaining and updating mobile apps. However, the noisy nature and large quantity of daily generated app reviews make it difficult to understand the essential information carried in app reviews. Several prior studies have proposed methods that can automatically classify or cluster user reviews into a few app topics (e.g., security). These methods usually act on a static collection of user reviews. However, due to the dynamic nature of user feedback (i.e., reviews keep coming …


Dsm: A Specification Mining Tool Using Recurrent Neural Network Based Language Model, Tien-Duy B. Le, Lingfeng Bao, David Lo Nov 2018

Research Collection School Of Computing and Information Systems

Formal specifications are important but often unavailable. Furthermore, writing these specifications is time-consuming and requires skills from developers. In this work, we present Deep Specification Miner (DSM), an automated tool that applies deep learning to mine finite-state automaton (FSA) based specifications. DSM accepts as input a set of execution traces to train a Recurrent Neural Network Language Model (RNNLM). From the input traces, DSM creates a Prefix Tree Acceptor (PTA) and leverages the inferred RNNLM to extract many features. These features are then forwarded to clustering algorithms for merging similar automata states in the PTA for assembling a number of …
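The Prefix Tree Acceptor step mentioned in this abstract can be sketched in a few lines of Python. This shows only the generic PTA construction from traces, not DSM's implementation, its RNNLM features, or its state-merging step; the function names and the example method-call traces are hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple

Transitions = Dict[int, Dict[str, int]]


def build_pta(traces: Iterable[List[str]]) -> Tuple[Transitions, Set[int]]:
    """Build a Prefix Tree Acceptor: a tree-shaped automaton in which
    traces sharing a prefix share the corresponding path from state 0."""
    transitions: Transitions = {0: {}}  # state 0 is the root (initial state)
    accepting: Set[int] = set()
    for trace in traces:
        state = 0
        for event in trace:
            next_state = transitions[state].get(event)
            if next_state is None:
                # Unseen prefix: allocate a fresh state for it.
                next_state = len(transitions)
                transitions[next_state] = {}
                transitions[state][event] = next_state
            state = next_state
        accepting.add(state)  # the end of each input trace is accepting
    return transitions, accepting


def accepts(transitions: Transitions, accepting: Set[int], trace: List[str]) -> bool:
    """Check whether the PTA accepts exactly this trace."""
    state = 0
    for event in trace:
        if event not in transitions[state]:
            return False
        state = transitions[state][event]
    return state in accepting


# Hypothetical execution traces of a file-like API.
example_traces = [
    ["open", "read", "close"],
    ["open", "write", "close"],
]
pta_transitions, pta_accepting = build_pta(example_traces)
```

A PTA accepts only the traces it was built from, which is why approaches of this kind then merge similar states to generalize.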


Using Finite-State Models For Log Differencing, Hen Amar, Lingfeng Bao, Nimrod Busany, David Lo, Shahar Maoz Nov 2018

Research Collection School Of Computing and Information Systems

Much work has been published on extracting various kinds of models from logs that document the execution of running systems. In many cases, however, for example in the context of evolution, testing, or malware analysis, engineers are interested not only in a single log but in a set of several logs, each of which originated from a different set of runs of the system at hand. Then, the difference between the logs is the main target of interest. In this work we investigate the use of finite-state models for log differencing. Rather than comparing the logs directly, we generate concise …


Improving Reusability Of Software Libraries Through Usage Pattern Mining, Mohamed Aymen Saied, Ali Ouni, Houari A. Sahraoui, Raula Gaikovina Kula, Katsuro Inoue, David Lo Nov 2018

Research Collection School Of Computing and Information Systems

Modern software systems are increasingly dependent on third-party libraries. It is widely recognized that using mature and well-tested third-party libraries can improve developers’ productivity, reduce time-to-market, and produce more reliable software. Today’s open-source repositories provide a wide range of libraries that can be freely downloaded and used. However, as software libraries are documented separately but intended to be used together, developers are unlikely to fully take advantage of these reuse opportunities. In this paper, we present a novel approach to automatically identify third-party library usage patterns, i.e., collections of libraries that are commonly used together by developers. Our approach employs …


Recommending Who To Follow In The Software Engineering Twitter Space, Abhishek Sharma, Yuan Tian, Agus Sulistya, Dinusha Wijedasa, David Lo Nov 2018

Research Collection School Of Computing and Information Systems

With the advent of social media, developers are increasingly using it in their software development activities. Twitter is one of the popular social mediums used by developers. A recent study by Singer et al. found that software developers use Twitter to “keep up with the fast-paced development landscape.” Unfortunately, due to the general-purpose nature of Twitter, it’s challenging for developers to use Twitter for their development activities. Our survey with 36 developers who use Twitter in their development activities highlights that developers are interested in following specialized software gurus who share relevant technical tweets. To help developers perform this task, in …


Leveling The Playing Field: Supporting Neurodiversity Via Virtual Realities, Louanne E. Boyd, Kendra Day, Natalia Stewart, Kaitlyn Abdo, Kathleen Lamkin, Erik J. Linstead Nov 2018

Mathematics, Physics, and Computer Science Faculty Articles and Research

Neurodiversity is a term that encapsulates the diverse expression of human neurology. By thinking in broad terms about neurological development, we can become focused on delivering a diverse set of design features to meet the needs of the human condition. In this work, we move toward developing virtual environments that support variations in sensory processing. If we understand that people have differences in sensory perception that result in their own unique sensory traits, many of which are clustered by diagnostic labels such as Autism Spectrum Disorder (ASD), Sensory Processing Disorder, Attention-Deficit/Hyperactivity Disorder, Rett syndrome, dyslexia, and so on, then we …


A Survey Of Software Metric Use In Research Software Development, Nasir U. Eisty, George K. Thiruvathukal, Jeffrey C. Carver Oct 2018

Computer Science: Faculty Publications and Other Works

Background: Breakthroughs in research increasingly depend on complex software libraries, tools, and applications aimed at supporting specific science, engineering, business, or humanities disciplines. The complexity and criticality of this software motivate the need for ensuring quality and reliability. Software metrics are a key tool for assessing, measuring, and understanding software quality and reliability. Aims: The goal of this work is to better understand how research software developers use traditional software engineering concepts, like metrics, to support and evaluate both the software and the software development process. One key aspect of this goal is to identify how the set of metrics …


Setting Up A Low-Cost Lab Management System For A Multi-Purpose Computing Laboratory Using Virtualisation Technology, Heng Ngee Mok, Wee Kiat Tan Oct 2018

Heng Ngee MOK

This paper describes how a generic computer laboratory equipped with 52 workstations is set up for teaching IT-related courses and other general purpose usage. The authors have successfully constructed a lab management system based on decentralised, client-side software virtualisation technology using Linux and free software tools from VMware that fulfils the requirements of fast "switch over" time between consecutive lab sessions, the ability to support a wide range of IT courses and usage scenarios, low cost, easy maintenance, and a sandboxed environment for potentially disruptive IT security lab exercises. Sufficient implementation details are provided so that readers can build a …


Liboblivious: A C++ Library For Oblivious Data Structures And Algorithms, Scott D. Constable, Steve Chapin Oct 2018

Electrical Engineering and Computer Science - Technical Reports

Infrastructure as a service (IaaS) is an enormously beneficial model for centralized data computation and storage. Yet, existing network-layer and hardware-layer security protections do not address a broad category of vulnerabilities known as side-channel attacks. Over the past several years, numerous techniques have been proposed at all layers of the software/hardware stack to prevent the inadvertent leakage of sensitive data. This report discusses a new technique which integrates seamlessly with C++ programs. We introduce a library, libOblivious, which provides thin wrappers around existing C++ standard template library classes, endowing them with the property of memory-trace obliviousness.


Issues In Reproducible Simulation Research, Ben G. Fitzpatrick Oct 2018

Annual Symposium on Biomathematics and Ecology Education and Research

No abstract provided.


Computer Games Are Serious Business And So Is Their Quality: Particularities Of Software Testing In Game Development From The Perspective Of Practitioners, Ronnie Santos, Cleyton Magalhaes, Luiz Fernando Capretz, Jorge Correia-Neto, Fabio Q. B. Silva Dr., Abdelrahman Saher Oct 2018

Electrical and Computer Engineering Publications

Over the last several decades, computer games started to have a significant impact on society. However, although a computer game is a type of software, the process to conceptualize, produce and deliver a game could involve unusual features. In software testing, for instance, studies demonstrated the hesitance of professionals to use automated testing techniques with games, due to the constant changes in requirements and design, and pointed out the need for creating testing tools that take into account the flexibility required for the game development process. Goal. This study aims to improve the current body of knowledge regarding software …


A Learning And Masking Approach To Secure Learning, Linh Nguyen, Sky Wang, Arunesh Sinha Oct 2018

Research Collection School Of Computing and Information Systems

Deep Neural Networks (DNNs) have been shown to be vulnerable to adversarial examples, which are data points cleverly constructed to fool the classifier. Such attacks can be devastating in practice, especially as DNNs are being applied to ever increasing critical tasks like image recognition in autonomous driving. In this paper, we introduce a new perspective on the problem. We do so by first defining robustness of a classifier to adversarial exploitation. Next, we show that the problem of adversarial example generation can be posed as a learning problem. We also categorize attacks in literature into high and low perturbation attacks; well-known …


Exploring Experiential Learning Model And Risk Management Process For An Undergraduate Software Architecture Course, Eng Lieh Ouh, Yunghans Irawan Oct 2018

Research Collection School Of Computing and Information Systems

This paper shares our insights on exploring the experiential learning model and risk management process to design an undergraduate software architecture course. The key challenge for undergraduate students to appreciate software architecture design is usually their limited experience in the software industry. In software architecture, the high-level design principles are heuristics lacking the absoluteness of first principles, which for inexperienced undergraduate students is a frustrating divergence from what they used to value. From an educator's perspective, teaching software architecture requires contending with the problem of how to express this level of abstraction practically and also make the learning realistic. In this paper, we propose a model adapting the concepts of experiential learning …


Augmenting And Structuring User Queries To Support Efficient Free-Form Code Search, Raphael Sirres, Tegawendé F. Bissyande, Dongsun Kim, David Lo, Jacques Klein, Kisub Kim, Yves Le Traon Oct 2018

Research Collection School Of Computing and Information Systems

Source code terms such as method names and variable types are often different from conceptual words mentioned in a search query. This vocabulary mismatch problem can make code search inefficient. In this paper, we present COde voCABUlary (CoCaBu), an approach to resolving the vocabulary mismatch problem when dealing with free-form code search queries. Our approach leverages common developer questions and the associated expert answers to augment user queries with the relevant, but missing, structural code entities in order to improve the performance of matching relevant code examples within large code repositories. To instantiate this approach, we build GitSearch, a code …


Visforum: A Visual Analysis System For Exploring User Groups In Online Forums, Siwei Fu, Yong Wang, Yi Yang, Qingqing Bi, Fangzhou Guo, Huamin Qu Oct 2018

Research Collection School Of Computing and Information Systems

User grouping in asynchronous online forums is a common phenomenon nowadays. People with similar backgrounds or shared interests like to get together in group discussions. As tens of thousands of archived conversational posts accumulate, challenges emerge for forum administrators and analysts to effectively explore user groups in large-volume threads and gain meaningful insights into the hierarchical discussions. Identifying and comparing groups in discussion threads are nontrivial, since the number of users and posts increases with time and noises may hamper the detection of user groups. Researchers in data mining fields have proposed a large body of algorithms to explore user …


Hawkeye: Towards A Desired Directed Grey-Box Fuzzer, Hongxu Chen, Yinxing Xue, Yuekang Li, Bihuan Chen, Xiaofei Xie, Xiuheng Wu, Yang Liu Oct 2018

Research Collection School Of Computing and Information Systems

Grey-box fuzzing is a practically effective approach to test real-world programs. However, most existing grey-box fuzzers lack directedness, i.e. the capability of executing towards user-specified target sites in the program. To emphasize existing challenges in directed fuzzing, we propose Hawkeye to feature four desired properties of directed grey-box fuzzers. Owing to a novel static analysis on the program under test and the target sites, Hawkeye precisely collects the information such as the call graph, function and basic block level distances to the targets. During fuzzing, Hawkeye evaluates exercised seeds based on both static information and the execution traces to generate …


March Of The Silent Bots, Paul Robert Griffin Oct 2018

MITB Thought Leadership Series

Self-intelligent software robots, or ‘bots’ are everywhere. These small pieces of code run automated tasks when you order a taxi, search for a restaurant or check the weather. Quietly beavering away, it is unknown how many bots exist, but undoubtedly this number is set to surge over time. Already, bots comprise roughly half of all internet traffic.


Disruptive Technology: Can The Banking Industry Harness Disruption For Competitive Edge?, Edgar Low Oct 2018

MITB Thought Leadership Series

Disruptive innovation was identified as a phenomenon more than two decades ago by prominent Harvard scholar Clayton Christensen. So you may wonder why established industries are only now waking up to the prospect of digital transformation - the banking industry in particular.


Measuring Program Comprehension: A Large-Scale Field Study With Professionals, Xin Xia, Lingfeng Bao, David Lo, Zhengchang Xing, Ahmed E. Hassan, Shanping Li Oct 2018

Research Collection School Of Computing and Information Systems

During software development and maintenance, developers spend a considerable amount of time on program comprehension activities. Previous studies show that program comprehension takes up as much as half of a developer's time. However, most of these studies are performed in a controlled setting, or with a small number of participants, and investigate the program comprehension activities only within the IDEs. However, developers' program comprehension activities go well beyond their IDE interactions. In this paper, we extend our ActivitySpace framework to collect and analyze Human-Computer Interaction (HCI) data across many applications (not just the IDEs). We follow Minelli et al.'s approach …


Overfitting In Semantics-Based Automated Program Repair, Dinh Xuan Bach Le, Ferdian Thung, David Lo, Claire Le Goues Oct 2018

Research Collection School Of Computing and Information Systems

The primary goal of Automated Program Repair (APR) is to automatically fix buggy software, to reduce the manual bug-fix burden that presently rests on human developers. Existing APR techniques can be generally divided into two families: semantics- vs. heuristics-based. Semantics-based APR uses symbolic execution and test suites to extract semantic constraints, and uses program synthesis to synthesize repairs that satisfy the extracted constraints. Heuristic-based APR generates large populations of repair candidates via source manipulation, and searches for the best among them. Both families largely rely on a primary assumption that a program is correctly patched if the generated patch leads …


I4S: Capturing Shopper’s In-Store Interactions, Sougata Sen, Archan Misra, Vigneshwaran Subbaraju, Karan Grover, Meeralakshmi Radhakrishnan, Rajesh K. Balan, Youngki Lee Oct 2018

Research Collection School Of Computing and Information Systems

In this paper, we present I4S, a system that identifies item interactions of customers in a retail store through sensor data fusion from smartwatches, smartphones and distributed BLE beacons. To identify these interactions, I4S builds a gesture-triggered pipeline that (a) detects the occurrence of “item picks”, and (b) performs fine-grained localization of such pickup gestures. By analyzing data collected from 31 shoppers visiting a mid-sized stationery store, we show that we can identify person-independent picking gestures with a precision of over 88%, and identify the rack from where the pick occurred with 91%+ precision (for popular racks).


Automating Intention Mining, Qiao Huang, Xin Xia, David Lo, Gail C. Murphy Oct 2018

Research Collection School Of Computing and Information Systems

Developers frequently discuss aspects of the systems they are developing online. The comments they post to discussions form a rich information source about the system. Intention mining, a process introduced by Di Sorbo et al., classifies sentences in developer discussions to enable further analysis. As one example of use, intention mining has been used to help build various recommenders for software developers. The technique introduced by Di Sorbo et al. to categorize sentences is based on linguistic patterns derived from two projects. The limited number of data sources used in this earlier work introduces questions about the comprehensiveness of intention …