Open Access. Powered by Scholars. Published by Universities.®

Digital Commons Network

Physical Sciences and Mathematics

University of Nebraska - Lincoln

School of Computing: Faculty Publications

Mining software repositories

Full-Text Articles in Entire DC Network

Pitfalls And Guidelines For Using Time-Based Git Data, Samuel W. Flint, Jigyasa Chauhan, Robert Dyer Sep 2022

School of Computing: Faculty Publications

Many software engineering research papers rely on time-based data (e.g., commit timestamps, issue report creation/update/close dates, release dates). Like most real-world data, however, time-based data is often dirty. To date, there are no studies that quantify how frequently such data is used by the software engineering research community, or that investigate the sources of dirty data and quantify how often it occurs. Depending on the research task and method used, including such dirty data could affect the research results. This paper presents an extended survey of papers that utilize time-based data, published in the Mining Software Repositories (MSR) conference series. Out of …


Pitfalls And Guidelines For Using Time-Based Git Data, Samuel W. Flint, Jigyasa Chauhan, Robert Dyer Mar 2022

School of Computing: Faculty Publications

Many software engineering research papers rely on time-based data (e.g., commit timestamps, issue report creation/update/close dates, release dates). Like most real-world data, however, time-based data is often dirty. To date, there are no studies that quantify how frequently such data is used by the software engineering research community, or that investigate the sources of dirty data and quantify how often it occurs. Depending on the research task and method used, including such dirty data could affect the research results. This paper presents an extended survey of papers that utilize time-based data, published in the Mining Software Repositories (MSR) conference series. Out of …