Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 30 of 128

Full-Text Articles in Physical Sciences and Mathematics

VKSE-MO: Verifiable Keyword Search Over Encrypted Data In Multi-Owner Settings, Yinbin Miao, Jianfeng Ma, Ximeng Liu, Junwei Zhang, Zhiquan Liu Dec 2017

Research Collection School Of Computing and Information Systems

Searchable encryption (SE) techniques allow cloud clients to easily store data and search encrypted data in a privacy-preserving manner, where most SE schemes treat the cloud server as honest-but-curious. However, in practice, the cloud server is a semi-honest-but-curious third party, which only executes a fraction of search operations and returns a fraction of false search results to save its computational and bandwidth resources. Thus, it is important to provide a results verification method to guarantee the correctness of the search results. Existing SE schemes allow multiple data owners to upload different records to the cloud server, but these schemes have …


D-Watch: Embracing “Bad” Multipaths For Device-Free Localization With COTS RFID Devices, Ju Wang, Jie Xiong, Hongbo Jiang, Xiaojiang Chen, Dingyi Fang Dec 2017

Research Collection School Of Computing and Information Systems

Device-free localization, which does not require any device attached to the target, plays a critical role in many applications, such as intrusion detection and elderly monitoring. This paper introduces D-Watch, a device-free system built on top of low-cost commodity off-the-shelf (COTS) RFID hardware. Unlike previous works, which consider multipaths detrimental, D-Watch leverages the “bad” multipaths to provide decimeter-level localization accuracy without offline training. D-Watch harnesses the angle-of-arrival information from the RFID tags' backscatter signals. The key intuition is that whenever a target blocks a signal's propagation path, the signal power experiences a drop which can be …


Robust Human Activity Recognition Using Lesser Number Of Wearable Sensors, Di Wang, Edwin Candinegara, Junhui Hou, Ah-Hwee Tan, Chunyan Miao Dec 2017

Research Collection School Of Computing and Information Systems

In recent years, research on the recognition of human physical activities solely using wearable sensors has received more and more attention. Compared to other types of sensory devices such as surveillance cameras, wearable sensors are preferred in most activity recognition applications mainly due to their non-intrusiveness and pervasiveness. However, many existing activity recognition applications or experiments using wearable sensors were conducted in the confined laboratory settings using specifically developed gadgets. These gadgets may be useful for a small group of people in certain specific scenarios, but probably will not gain their popularity because they introduce additional costs and they are …


What Do Developers Search For On The Web?, Xin Xia, Lingfeng Bao, David Lo, Pavneet Singh Kochhar, Ahmed E. Hassan, Zhenchang Xing Dec 2017

Research Collection School Of Computing and Information Systems

Developers commonly make use of a web search engine such as Google to locate online resources to improve their productivity. A better understanding of what developers search for could help us understand their behaviors and the problems that they meet during the software development process. Unfortunately, we have a limited understanding of what developers frequently search for and of the search tasks that they often find challenging. To address this gap, we collected search queries from 60 developers and surveyed 235 software engineers from more than 21 countries across five continents. In particular, we asked our survey participants to rate the …


Secure Server-Aided Top-K Monitoring, Yujue Wang, Hwee Hwa Pang, Yanjiang Yang, Xuhua Ding Dec 2017

Research Collection School Of Computing and Information Systems

In a data streaming model, a data owner releases records or documents to a set of users with matching interests, in such a way that the match in interest can be calculated from the correlation between each pair of document and user query. For scalability and availability reasons, this calculation is delegated to third-party servers, which gives rise to the need to protect the integrity and privacy of the documents and user queries. In this paper, we propose a server-aided data stream monitoring scheme (DSM) to address the aforementioned integrity and privacy challenges, so that the users are able to …


GraphMP: An Efficient Semi-External-Memory Big Graph Processing System On A Single Machine, Peng Sun, Yonggang Wen, Nguyen Binh Duong Ta, Xiaokui Xiao Dec 2017

Research Collection School Of Computing and Information Systems

Recent studies showed that single-machine graph processing systems can be as highly competitive as cluster-based approaches on large-scale problems. While several out-of-core graph processing systems and computation models have been proposed, the high disk I/O overhead could significantly reduce performance in many practical cases. In this paper, we propose GraphMP to tackle big graph analytics on a single machine. GraphMP achieves low disk I/O overhead with three techniques. First, we design a vertex-centric sliding window (VSW) computation model to avoid reading and writing vertices on disk. Second, we propose a selective scheduling method to skip loading and processing unnecessary edge …


Learning Likely Invariants To Explain Why A Program Fails, Long H. Pham, Jun Sun, Lyly Tran Thi, Jingyi Wang, Xin Peng Nov 2017

Research Collection School Of Computing and Information Systems

Debugging is difficult. Recent studies show that automatic bug localization techniques have limited usefulness. One of the reasons is that programmers typically have to understand why the program fails before fixing it. In this work, we aim to help programmers understand a bug by automatically generating likely invariants which are violated in the failed tests. Given a program with an initial assertion and at least one test case failing the assertion, we first generate random test cases, identify potential bug locations through bug localization, and then generate program state mutation based on active learning techniques to identify a predicate 'explaining' …


Mining Implicit Design Templates For Actionable Code Reuse, Yun Lin, Guozhu Meng, Yinxing Yue, Zhenchang Xing, Jun Sun, Xin Peng, Yang Liu, Wenyun Zhao, Jin Song Dong Nov 2017

Research Collection School Of Computing and Information Systems

In this paper, we propose an approach to detecting project-specific recurring designs in code base and abstracting them into design templates as reuse opportunities. The mined templates allow programmers to make further customization for generating new code. The generated code involves the code skeleton of recurring design as well as the semi-implemented code bodies annotated with comments to remind programmers of necessary modification. We implemented our approach as an Eclipse plugin called MICoDe. We evaluated our approach with a reuse simulation experiment and a user study involving 16 participants. The results of our simulation experiment on 10 open source Java …


Fib: Squeezing Loop Invariants By Interpolation Between Forward/Backward Predicate Transformers, Shang-Wei Lin, Jun Sun, Hao Xiao, Yang Liu, David Sana, Henri Hansen Nov 2017

Research Collection School Of Computing and Information Systems

Loop invariant generation is a fundamental problem in program analysis and verification. In this work, we propose a new approach to automatically constructing inductive loop invariants. The key idea is to aggressively squeeze an inductive invariant based on Craig interpolants between forward and backward reachability analysis. We have evaluated our approach by a set of loop benchmarks, and experimental results show that our approach is promising.


Automatic Loop-Invariant Generation And Refinement Through Selective Sampling, Jiaying Li, Jun Sun, Li Li, Quang Loc Le, Shang-Wei Lin Nov 2017

Research Collection School Of Computing and Information Systems

Automatic loop-invariant generation is important in program analysis and verification. In this paper, we propose to generate loop-invariants automatically through learning and verification. Given a Hoare triple of a program containing a loop, we start with randomly testing the program, collect program states at run-time and categorize them based on whether they satisfy the invariant to be discovered. Next, classification techniques are employed to generate a candidate loop-invariant automatically. Afterwards, we refine the candidate through selective sampling so as to overcome the lack of sufficient test cases. Only after a candidate invariant cannot be improved further through selective sampling, we …


Classification-Based Parameter Synthesis For Parametric Timed Automata, Jiaying Li, Jun Sun, Bo Gao, Étienne André Nov 2017

Research Collection School Of Computing and Information Systems

Parametric timed automata are designed to model timed systems with unknown parameters, often representing design uncertainties of external environments. In order to design a robust system, it is crucial to synthesize constraints on the parameters, which guarantee the system behaves according to certain properties. Existing approaches suffer from scalability issues. In this work, we propose to enhance existing approaches through classification-based learning. We sample multiple concrete values for parameters and model check the corresponding non-parametric models. Based on the checking results, we form conjectures on the constraint through classification techniques, which can be subsequently confirmed by existing model checkers for …


EEG-Based Emotion Recognition Via Fast And Robust Feature Smoothing, Cheng Tang, Di Wang, Ah-Hwee Tan, Chunyan Miao Nov 2017

Research Collection School Of Computing and Information Systems

Electroencephalograph (EEG) signals reveal much of our brain states and have been widely used in emotion recognition. However, the recognition accuracy is hardly ideal mainly due to the following reasons: (i) the features extracted from EEG signals may not solely reflect one’s emotional patterns and their quality is easily affected by noise; and (ii) increasing feature dimension may enhance the recognition accuracy, but it often requires extra computation time. In this paper, we propose a feature smoothing method to alleviate the aforementioned problems. Specifically, we extract six statistical features from raw EEG signals and apply a simple yet cost-effective feature …


CapSense: Capacitor-Based Activity Sensing For Kinetic Energy Harvesting Powered Wearable Devices, Guohao Lan, Dong Ma, Weitao Xu, Mahbub Hassan, Wen Hu Nov 2017

Research Collection School Of Computing and Information Systems

We propose a new activity sensing method, CapSense, which detects activities of daily living (ADL) by sampling the voltage of the kinetic energy harvesting (KEH) capacitor at an ultra low sampling rate. Unlike conventional sensors that generate only instantaneous motion information of the subject, KEH capacitors accumulate and store human generated energy over time. Given that humans produce kinetic energy at distinct rates for different ADL, the KEH capacitor can be sampled only once in a while to observe the energy generation rate and identify the current activity. Thus, with CapSense, it is possible to avoid collecting time series motion …


On Locating Malicious Code In Piggybacked Android Apps, Li Li, Daoyuan Li, Tegawende F. Bissyande, Jacques Klein, Haipeng Cai, David Lo, Yves Le Traon Nov 2017

Research Collection School Of Computing and Information Systems

To devise efficient approaches and tools for detecting malicious packages in the Android ecosystem, researchers are increasingly required to have a deep understanding of malware. There is thus a need to provide a framework for dissecting malware and locating malicious program fragments within app code in order to build a comprehensive dataset of malicious samples. Towards addressing this need, we propose in this work a tool-based approach called HookRanker, which provides ranked lists of potentially malicious packages based on the way malware behaviour code is triggered. With experiments on a ground truth of piggybacked apps, we are able to automatically …


The Impact Of Coverage On Bug Density In A Large Industrial Software Project, Thomas Bach, Artur Andrzejak, Ralf Pannemans, David Lo Nov 2017

Research Collection School Of Computing and Information Systems

Measuring the quality of test suites is one of the major challenges of software testing. Code coverage identifies tested and untested parts of code and is frequently used to approximate test suite quality. Multiple previous studies have investigated the relationship between coverage ratio and test suite quality, without a clear consensus in the results. In this work we study whether covered code contains a smaller number of future bugs than uncovered code (assuming appropriate scaling). If this correlation holds and bug density is lower in covered code, coverage can be regarded as a meaningful metric to estimate the adequacy of testing. …


AnswerBot: Automated Generation Of Answer Summary To Developers’ Technical Questions, Bowen Xu, Zhenchang Xing, Xin Xia, David Lo Nov 2017

Research Collection School Of Computing and Information Systems

The prevalence of questions and answers on domain-specific Q&A sites like Stack Overflow constitutes a core knowledge asset for the software engineering domain. Although search engines can return a list of questions relevant to a user query about some technical question, the abundance of relevant posts and the sheer amount of information in them make it difficult for developers to digest them and find the most needed answers to their questions. In this work, we aim to help developers who want to quickly capture the key points of several answer posts relevant to a technical question before they read the details …


APIBot: Question Answering Bot For API Documentation, Yuan Tian, Ferdian Thung, Abhishek Sharma, David Lo Nov 2017

Research Collection School Of Computing and Information Systems

As the carrier of Application Programming Interfaces (APIs) knowledge, API documentation plays a crucial role in how developers learn and use an API. It is also a valuable information resource for answering API-related questions, especially when developers cannot find reliable answers to their questions online/offline. However, finding answers to API-related questions from API documentation might not be easy because one may have to manually go through multiple pages before reaching the relevant page, and then read and understand the information inside the relevant page to figure out the answers. To deal with this challenge, we develop APIBot, a bot that …


Introducing People With ASD To Crowd Work, Kotaro Hara, Jeffrey P. Bigham Nov 2017

Research Collection School Of Computing and Information Systems

Adults with Autism Spectrum Disorders (ASD) are unemployed at a high rate, in part because the constraints and expectations of traditional employment can be difficult for them. In this paper, we report on our work in introducing people with ASD to remote work on a crowdsourcing platform and a prototype tool we developed by working with participants. We conducted a six-week long user-centered design study with three participants with ASD. The early stage of the study focused on assessing the abilities of our participants to search and work on micro-tasks available on the crowdsourcing market. Based on our preliminary findings, …


Anomaly Detection For A Water Treatment System Using Unsupervised Machine Learning, Jun Inoue, Yoriyuki Yamagata, Yuqi Chen, Christopher M. Poskitt, Jun Sun Nov 2017

Research Collection School Of Computing and Information Systems

In this paper, we propose and evaluate the application of unsupervised machine learning to anomaly detection for a Cyber-Physical System (CPS). We compare two methods: Deep Neural Networks (DNN) adapted to time series data generated by a CPS, and one-class Support Vector Machines (SVM). These methods are evaluated against data from the Secure Water Treatment (SWaT) testbed, a scaled-down but fully operational raw water purification plant. For both methods, we first train detectors using a log generated by SWaT operating under normal conditions. Then, we evaluate the performance of both methods using a log generated by SWaT operating under 36 …


Improving Probability Estimation Through Active Probabilistic Model Learning, Jingyi Wang, Xiaohong Chen, Jun Sun, Shengchao Qin Nov 2017

Research Collection School Of Computing and Information Systems

It is often necessary to estimate the probability of certain events occurring in a system. For instance, knowing the probability of events triggering a shutdown sequence allows us to estimate the availability of the system. One approach is to run the system multiple times and then construct a probabilistic model to estimate the probability. When the probability of the event to be estimated is low, many system runs are necessary in order to generate an accurate estimation. For complex cyber-physical systems, each system run is costly and time-consuming, and thus it is important to reduce the number of system runs …


Enabling Phased Array Signal Processing For Mobile WiFi Devices, Kun Qian, Chenshu Wu, Zheng Yang, Zimu Zhou, Xu Wang, Yunhao Liu Nov 2017

Research Collection School Of Computing and Information Systems

Modern mobile devices are equipped with multiple antennas, which brings various wireless sensing applications such as accurate localization, contactless human detection, and wireless human-device interaction. A key enabler for these applications is phased array signal processing, especially Angle of Arrival (AoA) estimation. However, accurate AoA estimation on commodity devices is non-trivial due to limited number of antennas and uncertain phase offsets. Previous works either rely on elaborate calibration or involve contrived human interactions. In this paper, we aim to enable practical AoA measurements on commodity off-the-shelf (COTS) mobile devices. The key insight is to involve users’ natural rotation to formulate …


Understanding Inactive Yet Available Assignees In GitHub, Jing Jiang, David Lo, Xinyu Ma, Fuli Feng, Li Zhang Nov 2017

Research Collection School Of Computing and Information Systems

Context: In GitHub, an issue or a pull request can be assigned to a specific assignee who is responsible for working on this issue or pull request. Due to the principle of voluntary participation, available assignees may remain inactive in projects. If assignees ever participate in projects, they are active assignees; otherwise, they are inactive yet available assignees (inactive assignees for short). Objective: Our objective in this paper is to provide a comprehensive analysis of inactive yet available assignees in GitHub. Method: We collect 2,374,474 records of activities in 37 popular projects, and 797,756 records of activities in 687 projects …


File-Level Defect Prediction: Unsupervised Vs. Supervised Models, Meng Yan, Yicheng Fang, David Lo, Xin Xia, Xiaohong Zhang Nov 2017

Research Collection School Of Computing and Information Systems

Background: Software defect models can help software quality assurance teams to allocate testing or code review resources. A variety of techniques have been used to build defect prediction models, including supervised and unsupervised methods. Recently, Yang et al. [1] surprisingly find that unsupervised models can perform statistically significantly better than supervised models in effort-aware change-level defect prediction. However, little is known about relative performance of unsupervised and supervised models for effort-aware file-level defect prediction. Goal: Inspired by their work, we aim to investigate whether a similar finding holds in effort-aware file-level defect prediction. Method: We replicate Yang et al.'s study …


Temporal Understanding Of Human Mobility: A Multi-Time Scale Analysis, Tongtong Liu, Zheng Yang, Yi Zhao, Chenshu Wu, Zimu Zhou, Yunhao Liu Nov 2017

Research Collection School Of Computing and Information Systems

The recent availability of digital traces generated by cellphone calls has significantly increased the scientific understanding of human mobility. Until now, however, previous works based on low time resolution measurements have not studied human mobility under various time scales, because calls are sparse and irregular, particularly in the era of mobile Internet. In this paper, we introduce Mobile Flow Records, flow-level data access records of the online activity of smartphone users, to explore human mobility. Mobile Flow Records collect high-resolution information on large populations. By exploiting this kind of data, we show the models and statistics of human mobility at …


A Semantics Comparison Workbench For A Concurrent, Asynchronous, Distributed Programming Language, Claudio Corrodi, Alexander Heußner, Christopher M. Poskitt Nov 2017

Research Collection School Of Computing and Information Systems

A number of high-level languages and libraries have been proposed that offer novel and simple to use abstractions for concurrent, asynchronous, and distributed programming. The execution models that realise them, however, often change over time---whether to improve performance, or to extend them to new language features---potentially affecting behavioural and safety properties of existing programs. This is exemplified by SCOOP, a message-passing approach to concurrent object-oriented programming that has seen multiple changes proposed and implemented, with demonstrable consequences for an idiomatic usage of its core abstraction. We propose a semantics comparison workbench for SCOOP with fully and semi-automatic tools for analysing …


Language Inclusion Checking Of Timed Automata With Non-Zenoness, Xinyu Wang, Jun Sun, Ting Wang, Shengchao Qin Nov 2017

Research Collection School Of Computing and Information Systems

Given a timed automaton P modeling an implementation and a timed automaton S as a specification, the problem of language inclusion checking is to decide whether the language of P is a subset of that of S. It is known to be undecidable. The problem gets more complicated if non-Zenoness is taken into consideration. A run is Zeno if it permits infinitely many actions within finite time. Otherwise it is non-Zeno. Zeno runs might be present in both P and S. It is necessary to check whether a run is Zeno or not so as to avoid presenting Zeno runs as …


Design And Implementation Of A CSI-Based Ubiquitous Smoking Detection System, Xiaolong Zheng, Jilian Wang, Longfei Shangguan, Zimu Zhou, Yunhao Liu Oct 2017

Research Collection School Of Computing and Information Systems

Even though indoor smoking ban is being put into practice in civilized countries, existing vision or sensor-based smoking detection methods cannot provide ubiquitous detection service. In this paper, we take the first attempt to build a ubiquitous passive smoking detection system, Smokey, which leverages the patterns smoking leaves on WiFi signal to identify the smoking activity even in the non-line-of-sight and through-wall environments. We study the behaviors of smokers and leverage the common features to recognize the series of motions during smoking, avoiding the target-dependent training set to achieve the high accuracy. We design a foreground detection-based motion acquisition method …


TagScan: Simultaneous Target Imaging And Material Identification With Commodity RFID Devices, Ju Wang, Jie Xiong, Xiaojiang Chen, Hongbo Jiang, Rajesh Krishna Balan, Dingyi Fang Oct 2017

Research Collection School Of Computing and Information Systems

Target imaging and material identification play an important role in many real-life applications. This paper introduces TagScan, a system that can identify the material type and image the horizontal cut of a target simultaneously with cheap commercial off-the-shelf (COTS) RFID devices. The key intuition is that different materials and target sizes cause different amounts of phase and RSS (Received Signal Strength) changes when a radio frequency (RF) signal penetrates through the target. Multiple challenges need to be addressed before we can turn the idea into a functional system, including (i) indoor environments exhibit rich multipath which breaks the linear relationship between …


Which Packages Would Be Affected By This Bug Report?, Qiao Huang, David Lo, Xin Xia, Qingye Wang, Shanping Li Oct 2017

Research Collection School Of Computing and Information Systems

A large project (e.g., Ubuntu) usually contains a large number of software packages. Sometimes the same bug report in such project would affect multiple packages, and developers of different packages need to collaborate with one another to fix the bug. Unfortunately, the total number of packages involved in a project like Ubuntu is relatively large, which makes it time-consuming to manually identify packages that are affected by a bug report. In this paper, we propose an approach named PkgRec that consists of 2 components: a name matching component and an ensemble learning component. In the name matching component, we assign …


On Negative Results When Using Sentiment Analysis Tools For Software Engineering Research, Robbert Jongeling, Proshanta Sarkar, Subhajit Datta, Alexander Serebrenik Oct 2017

Research Collection School Of Computing and Information Systems

Recent years have seen increasing attention to social aspects of software engineering, including studies of emotions and sentiments experienced and expressed by software developers. Most of these studies reuse existing sentiment analysis tools such as SentiStrength and NLTK. However, these tools have been trained on product reviews and movie reviews and, therefore, their results might not be applicable in the software engineering domain. In this paper we study whether the sentiment analysis tools agree with the sentiment recognized by human evaluators (as reported in an earlier study) as well as with each other. Furthermore, we evaluate the impact …