Mining software repositories
The mining software repositories (MSR) field analyzes the rich data available in software repositories, such as version control repositories, mailing list archives, bug tracking systems, and issue tracking systems, to uncover interesting and actionable information about software systems, projects, and software engineering.
Definition
Herzig and Zeller define "mining software archives" as a process to "obtain lots of initial evidence" by extracting data from software repositories. They further define "data sources" as product-based artefacts such as source code, requirement artefacts, or version archives, and claim that these sources are unbiased but noisy and incomplete.[1]
Data Repositories
Metrics
- FLOSSmole
- MetricsGrimoire
Defect Prediction
- Promise Software Repository
Collection of Open Source Code
Techniques
Coupled Change Analysis
The idea behind coupled change analysis is that developers frequently change code entities (e.g. files) together when fixing defects or introducing new features. These couplings between entities are often not made explicit in the code or in other documents, so developers who are new to a project in particular may not know which entities need to be changed together. Coupled change analysis aims to extract these couplings from a project's version control system. From the commits and the timing of changes, it may be possible to identify which entities frequently change together. This information can then be presented to developers who are about to change one of the entities, to support them in making further changes.[2]
A controlled experiment on the usefulness of such coupled change information found that developers working on perfective maintenance tasks provided correct solutions significantly more often, while there was no significant difference in the time needed to complete the tasks.[3]
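The coupled change analysis described above can be illustrated with a minimal Python sketch. It is not the method of the cited papers. It assumes the commit history has already been extracted into lists of changed file paths (for example from the output of `git log --name-only`), and it uses only commit membership, ignoring the timing of changes. The function names, the `min_support` threshold, and the sample data are hypothetical and chosen for illustration.

```python
from collections import Counter
from itertools import combinations


def co_change_counts(commits):
    """Count how often each file, and each unordered file pair, appears in a commit."""
    file_counts = Counter()
    pair_counts = Counter()
    for files in commits:
        unique = sorted(set(files))  # ignore duplicates, fix the pair ordering
        file_counts.update(unique)
        pair_counts.update(combinations(unique, 2))
    return file_counts, pair_counts


def suggest_coupled(target, commits, min_support=2):
    """Rank files that changed together with `target` in at least `min_support` commits.

    Each suggestion is (file, co-change count, confidence), where confidence is
    the share of commits touching `target` that also touched the other file.
    """
    file_counts, pair_counts = co_change_counts(commits)
    suggestions = []
    for (a, b), together in pair_counts.items():
        if target not in (a, b) or together < min_support:
            continue
        other = b if a == target else a
        confidence = together / file_counts[target]
        suggestions.append((other, together, confidence))
    # Most frequently co-changed files first; ties broken by file name.
    return sorted(suggestions, key=lambda s: (-s[1], s[0]))


# Hypothetical commit history: each inner list is one commit's changed files.
commits = [
    ["parser.py", "lexer.py"],
    ["parser.py", "lexer.py", "README"],
    ["parser.py", "ast.py"],
    ["docs.md"],
]

for name, count, confidence in suggest_coupled("parser.py", commits):
    print(f"{name}: changed together {count} times (confidence {confidence:.2f})")
```

In this sample history, `lexer.py` changed together with `parser.py` in two of the three commits that touched `parser.py`, so it is the only file suggested to a developer about to edit `parser.py`. A real analysis would also need to decide how to handle very large commits and file renames, which this sketch does not address.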
Tools
Experimentation Tools
- TraceLab
Metric Extraction Tools
Mining Tools
Contradictory Findings
Software Metrics
See also
References
- Herzig, K. S.; Zeller, A. (2011). "Mining your own evidence". Making Software. Sebastopol, Calif., USA: O'Reilly. pp. 517–529.
- Gall, H.; Hajek, K.; Jazayeri, M. (November 1998). "Detection of logical coupling based on product release history". Proceedings of the International Conference on Software Maintenance (Cat. No. 98CB36272). pp. 190–198. doi:10.1109/icsm.1998.738508.
- Ramadani, Jasmin; Wagner, Stefan (2017-10-16). "Are suggestions from coupled file changes useful for perfective maintenance tasks?". PeerJ Computer Science. 3. doi:10.7717/peerj-cs.135. ISSN 2376-5992.
External links
- Working Conference on Mining Software Repositories, the main software engineering conference in the area.