Physical Sciences and Mathematics | Open Access Articles

F-Trail: Finding Patterns In Taxi Trajectories, Yasuko Matsubara, Evangelos Papalexakis, Lei Li, David Lo, Yasushi Sakurai, Christos Faloutsos Apr 2013

David LO

Given a large number of taxi trajectories, we would like to find interesting and unexpected patterns in the data. How can we summarize the major trends, and how can we spot anomalies? The analysis of trajectories has attracted considerable interest, with applications such as tracking the trails of migrating animals and predicting the paths of hurricanes. Several recent works propose methods for clustering and indexing trajectory data. However, these approaches are not especially well suited to pattern discovery with respect to the dynamics of social and economic behavior. To further analyze a huge collection of taxi trajectories, …

Empirical Evaluation Of Bug Linking, Tegawendé F. Bissyande, Ferdian Thung, Shaowei Wang, David Lo, Lingxiao Jiang, Laurent Réveillère Apr 2013

To collect software bugs found by users, development teams often set up bug trackers using systems such as Bugzilla. Developers then fix some of the bugs and commit the corresponding code changes to version control systems such as svn or git. Unfortunately, the links between bug reports and code changes are missing for many software projects, as the bug tracking and version control systems are often maintained separately. Yet linking bug reports to fix commits is important, as it could shed light on the nature of bug-fixing processes and expose patterns in software management. Bug linking solutions, such as ReLink, …

Automatic Defect Categorization, Ferdian Thung, David Lo, Lingxiao Jiang Apr 2013

Defects are prevalent in software systems. To understand defects better, industry practitioners often categorize bugs into various types. One common categorization is IBM’s Orthogonal Defect Classification (ODC). ODC proposes various orthogonal classifications of defects based on information such as the symptoms and semantics of the defects, their root cause analysis, and more. With these category labels, developers can better perform post-mortem analysis to find the common characteristics of the defects that plague a particular software project. Despite the benefits of having these categories, for …

A Comparative Study Of Supervised Learning Algorithms For Re-Opened Bug Prediction, Xin Xia, David Lo, Xinyu Wang, Xiaohu Yang, Shanping Li, Jianling Sun Apr 2013

Bug fixing is a time-consuming and costly job that is performed throughout the whole life cycle of software development and maintenance. For many systems, bugs are managed in bug management systems such as Bugzilla. Generally, the status of a typical bug report in Bugzilla changes from new to assigned, verified, and closed. However, some bugs have to be reopened. Reopened bugs increase software development and maintenance costs, increase the workload of bug fixers, and might even delay the future delivery of the software. Only a few studies investigate the phenomenon of reopened bug reports. In this paper, we evaluate …

An Empirical Study On Developer Interactions In Stackoverflow, Shaowei Wang, David Lo, Lingxiao Jiang Apr 2013

No abstract provided.

Understanding Widespread Changes: A Taxonomic Study, Shaowei Wang, David Lo, Lingxiao Jiang Apr 2013

Many active research studies in software engineering, such as detection of recurring bug fixes, detection of copy-and-paste bugs, and automated program transformation tools, are motivated by the assumption that many code changes (e.g., changing an identifier name) in software systems are spread across many locations and are similar to one another. However, no study so far has actually analyzed widespread changes in software systems. Understanding the nature of widespread changes could empirically support the assumption and provide insight for improving these research studies and related tools. Our study in this paper addresses this need. We propose …

Diffusion Of Software Features: An Exploratory Study, Ferdian Thung, David Lo, Lingxiao Jiang Apr 2013

New features are frequently proposed in many software libraries. These features include new methods, classes, packages, etc., and they are used in many open source and commercial software systems. Some of these features are adopted very quickly, while others take a long time to be adopted. Each feature takes considerable resources to develop, test, and document. Library developers and managers need to decide which features to prioritize and what to develop next. As a first step to aid these stakeholders, we perform an exploratory study on the diffusion, or rate of adoption, of features in the Java Development Kit (JDK) library. …

Predicting Project Outcome Leveraging Socio-Technical Network Patterns, Didi Surian, Yuan Tian, David Lo, Hong Cheng, Ee Peng Lim Apr 2013

Many software projects are started daily; some are successful, while others are not. Successful projects get completed, are used by many people, and bring benefits to users. Failed projects do not bring similar benefits. In this work, we are interested in developing an effective machine learning solution that predicts project outcome (i.e., success or failure) from the developers' socio-technical network. To do so, we investigate successful and failed projects to find factors that differentiate the two. We analyze the socio-technical aspect of the software development process by focusing on the people that contribute to these projects and the interactions among …

Network Structure Of Social Coding In Github, Ferdian Thung, Tegawende F. Bissyande, David Lo, Lingxiao Jiang Apr 2013

Social coding enables a different experience of software development, as the activities and interests of one developer are easily advertised to other developers. Developers can thus track the activities relevant to various projects on one umbrella site. Such a major change in collaborative software development makes an investigation of networking on social coding sites valuable. Furthermore, project hosting platforms promoting this development paradigm have been thriving, among which GitHub has arguably gained the most momentum. In this paper, we contribute to the body of knowledge on social coding by investigating the network structure of social coding in GitHub. We collect …

Adoption Of Software Testing In Open Source Projects: A Preliminary Study On 50,000 Projects, Pavneet Singh Kochhar, Tegawende F. Bissyande, David Lo, Lingxiao Jiang Apr 2013

In software engineering, testing is a crucial activity designed to ensure the quality of program code. For this activity, development teams spend substantial resources constructing test cases to thoroughly assess the correctness of software functionality. What proportion of open source projects, however, include test cases? What kinds of projects are more likely to include test cases? In this study, we explore 50,000 projects and investigate the correlation between the presence of test cases and various project development characteristics, including the lines of code and the size of development teams.
