Open Access. Powered by Scholars. Published by Universities.®

Software Engineering Commons

Articles 1 - 13 of 13

Full-Text Articles in Software Engineering

SEWordSim: Software-Specific Word Similarity Database, Yuan Tian, David Lo, Julia Lawall Jun 2014

David LO

Measuring the similarity of words is important in accurately representing and comparing documents, and thus improves the results of many natural language processing (NLP) tasks. The NLP community has proposed various measurements based on WordNet, a lexical database that contains relationships between many pairs of words. Recently, a number of techniques have been proposed to address software engineering issues such as code search and fault localization that require understanding natural language documents, and a measure of word similarity could improve their results. However, WordNet only contains information about word senses in general-purpose conversation, which often differ from word senses in …


Leveraging Machine Learning And Information Retrieval Techniques In Software Evolution Tasks: Summary Of The First MALIR-SE Workshop, At ASE 2013, - Lucia, David Lo, Giuseppe Scanniello, Alessandro Marchetto, Nasir Ali, Collin Mcmillan Jun 2014

David LO

The first International Workshop on MAchine Learning and Information Retrieval for Software Evolution (MALIR-SE) was held on the 11th of November 2013, in conjunction with the 28th IEEE/ACM International Conference on Automated Software Engineering (ASE) in Silicon Valley, California, USA. The workshop brought together researchers and practitioners who were interested in leveraging machine learning and information retrieval techniques to automate various software evolution tasks. During the workshop, papers on the application of machine learning and information retrieval techniques to bug fix time prediction and anti-pattern detection were presented. There were also discussions on the presented papers and …


Hierarchical Parallel Algorithm For Modularity-Based Community Detection Using GPUs, Chun Yew Cheong, Huynh Phung Huynh, David Lo, Rick Siow Mong Goh Jun 2014

David LO

This paper describes the design of a hierarchical parallel algorithm for accelerating community detection, which involves partitioning a network into communities of densely connected nodes. The algorithm is based on the Louvain method developed at the Université Catholique de Louvain, which uses modularity to measure community quality and has been successfully applied to many different types of networks. The proposed hierarchical parallel algorithm targets three levels of parallelism in the Louvain method and has been implemented on single-GPU and multi-GPU architectures. Benchmarking results on several large web-based networks and popular social networks show that on top of offering speedups …
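
As an illustration of the modularity measure mentioned in this abstract, the following is a minimal sketch of the standard modularity score for an undirected, unweighted graph, Q = Σ_c [ L_c / m − (d_c / 2m)² ], where m is the number of edges, L_c the number of edges inside community c, and d_c the total degree of community c. The function name `modularity` and the toy graph are assumptions made for illustration; this is not the paper's GPU implementation.

```python
def modularity(edges, community):
    """Modularity of a partition of an undirected, unweighted graph.

    edges: list of (u, v) pairs; community: dict mapping node -> label.
    Both names are hypothetical, chosen only for this sketch.
    """
    m = len(edges)
    internal = {}  # edges with both endpoints in the same community
    degree = {}    # sum of node degrees per community
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(internal.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree.items())


# Toy graph: two triangles joined by a single bridge edge (2, 3).
toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_group = {node: "all" for node in range(6)}

q_split = modularity(toy_edges, two_groups)
q_merged = modularity(toy_edges, one_group)
```

Splitting the toy graph along the bridge scores higher than lumping every node into one community, which is the signal the Louvain method greedily optimizes.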


Software Internationalization And Localization: An Industrial Experience, Xin Xia, David Lo, Feng Zhu, Xinyu Wang, Bo Zhou Jun 2014

David LO

Software internationalization and localization are important steps in distributing and deploying software to different regions of the world. Internationalization refers to the process of reengineering a system so that it can support various languages and regions without further modification. Localization refers to the process of adapting internationalized software for a specific language or region. For various reasons, many large legacy systems did not consider internationalization and localization at the early stage of development. In this paper, we present our experience with, and propose a process along with tool support for, software internationalization and localization. We reengineer a large …


Leveraging Web 2.0 For Software Evolution, Yuan Tian, David Lo Jun 2014

David LO

In this era of Web 2.0, much information is available on the Internet. Software forums, mailing lists, and question-and-answer sites contain lots of technical information. Blogs contain developers’ opinions, ideas, and descriptions of their day-to-day activities. Microblogs contain recent and popular software news. Software forges contain records of socio-technical interactions of developers. All these resources could potentially be leveraged to help developers in performing software evolution activities. In this chapter, we first present information that is available from these Web 2.0 resources. We then introduce empirical studies that investigate how developers contribute information to and use these resources. Next, we …


An Empirical Study Of Bugs In Build Process, Xiaoqiong Zhao, Xin Xia, Pavneet Singh Kochhar, David Lo, Shanping Li Jun 2014

David LO

The software build process translates source code into executable programs, packages the programs, generates documents, and distributes products. In this paper, we perform an empirical study to characterize build process bugs. We analyze build process bugs in 5 open-source Apache systems, namely CXF, Camel, Felix, Struts, and Tuscany. We compare build process bugs and other bugs across 3 dimensions: bug severity, bug fix time, and the number of files modified to fix a bug. Our results show that the fraction of build process bugs which are above major severity level is lower than that of other bugs. …


Build System Analysis With Link Prediction, Xin Xia, David Lo, Xinyu Wang, Bo Zhou Jun 2014

David LO

Compilation is an important step in building a working software system. To compile large systems, build systems such as make are typically used. In this paper, we investigate a new research problem for build configuration file (e.g., Makefile) analysis: how to predict missed dependencies in a build configuration file. We refer to this problem as dependency mining. Based on a Makefile, we build a dependency graph capturing various relationships defined in the Makefile. By representing a Makefile as a dependency graph, we map the dependency mining problem to a link prediction problem, and leverage 9 state-of-the-art link prediction algorithms to solve …
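
To make the mapping from dependency mining to link prediction concrete, here is a minimal sketch that scores unlinked node pairs in a toy dependency graph by their number of common neighbors. Common neighbors is a standard link prediction heuristic used here only as an example; the truncated abstract does not say which nine algorithms the paper evaluates. The function name `common_neighbor_scores` and the Makefile rules below are hypothetical.

```python
from itertools import combinations


def common_neighbor_scores(edges):
    """Score each unlinked node pair by its number of shared neighbors.

    edges: (target, prerequisite) pairs, treated as undirected links.
    """
    neighbors = {}
    for target, prerequisite in edges:
        neighbors.setdefault(target, set()).add(prerequisite)
        neighbors.setdefault(prerequisite, set()).add(target)
    scores = {}
    for a, b in combinations(sorted(neighbors), 2):
        if b in neighbors[a]:
            continue  # already linked, nothing to predict
        shared = len(neighbors[a] & neighbors[b])
        if shared:
            scores[(a, b)] = shared
    return scores


# Hypothetical Makefile rules written as (target, prerequisite) pairs.
makefile_edges = [
    ("app", "main.o"),
    ("app", "util.o"),
    ("main.o", "main.c"),
    ("main.o", "util.h"),
    ("util.o", "util.c"),
]

scores = common_neighbor_scores(makefile_edges)
# Candidate missing links, highest score first, ties broken by name.
ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
```

Each entry in `ranked` is a candidate missing dependency; a real study would compare such candidates against dependencies known to have been omitted.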


Collaboration Patterns In Software Developer Network, Didi Surian, David Lo, Ee Peng Lim Jun 2014

David LO

No abstract provided.


An Empirical Study Of Bug Report Field Reassignment, Xin Xia, David Lo, Ming Wen, Shihab Emad, Bo Zhou Jun 2014

David LO

A bug report contains many fields, such as product, component, severity, priority, fixer, operating system (OS), and platform, which provide important information for the bug triaging and fixing process. It is important to make sure that bug information is correct, since previous studies showed that the wrong assignment of bug report fields could increase the bug fixing time, and even delay the delivery of the software. In this paper, we perform an empirical study on bug report field reassignments in open-source software projects. To better understand why bug report fields are reassigned, we manually collect 99 recent bug reports that …


Automated Construction Of A Software-Specific Word Similarity Database, Yuan Tian, David Lo, Julia Lawall Jun 2014

David LO

Many automated software engineering approaches, including code search, bug report categorization, and duplicate bug report detection, measure similarities between two documents by analyzing natural language contents. Different words are often used to express the same meaning, so measuring similarities using exact matching of words is insufficient. To solve this problem, past studies have shown the need to measure the similarities between pairs of words. To meet this need, the natural language processing community has built WordNet, a manually constructed lexical database that records semantic relations among words and can be used to measure how similar two words …


Towards More Accurate Multi-Label Software Behavior Learning, Xin Xia, Feng Yang, David Lo, Zhenyu Chen, Xinyu Wang Jun 2014

David LO

In a modern software system, when a program fails, a crash report which contains an execution trace is sent to the software vendor for diagnosis. A crash report which corresponds to a failure could be caused by multiple types of faults simultaneously. Many large companies such as Baidu organize a team to analyze these failures and classify them into multiple labels (i.e., multiple types of faults). However, it is time-consuming and difficult for developers to manually analyze these failures and come up with appropriate fault labels. In this paper, we automatically classify a failure into multiple types of …


Proceedings Of The 2nd International Workshop On Software Mining, Ming Li, Hongyu Zhang, David Lo Jun 2014

David LO

No abstract provided.


BOAT: An Experimental Platform For Researchers To Comparatively And Reproducibly Evaluate Bug Localization Techniques, Xinyu Wang, David Lo, Xin Xia, Xingen Wang, Pavneet Singh Kochhar, Yuan Tian, Xiaohu Yang, Shanping Li, Jianling Sun, Bo Zhou Jun 2014

David LO

Bug localization refers to the process of identifying source code files that contain defects from descriptions of these defects which are typically contained in bug reports. There have been many bug localization techniques proposed in the literature. However, often it is hard to compare these techniques since different evaluation datasets are used. At times the datasets are not made publicly available and thus it is difficult to reproduce reported results. Furthermore, some techniques are only evaluated on small datasets and thus it is not clear whether the results are generalizable. Thus, there is a need for a platform that allows …