Duplicate bug report detection using clustering
Reporting bugs and fixing them play a critical part in the development and maintenance of software systems. Software developers and end users can collaborate in this process to improve the reliability of software systems. End users report the defects they have found in the software and describe how these bugs affect them. However, the same defect may be reported independently by several users, leading to a significant number of duplicate bug reports. A number of methods exist for detecting duplicate bug reports, but the best results so far identify only 24% of actual duplicates. In this paper, we propose a new clustering-based method that identifies a larger proportion of duplicate bug reports while keeping the number of false positives (non-duplicates misidentified as duplicates) low. The proposed approach is experimentally evaluated on a large sample of bug reports from three public domain data sets. The results show that, compared to the existing methods, this approach achieves better performance in terms of a harmonic measure that combines the true positive and true negative rates.
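The abstract does not define the harmonic measure. The sketch below assumes it is the harmonic mean of the true positive rate and the true negative rate, which is one common way to combine two rates. The function name `harmonic_measure` and the counts in the usage line are hypothetical and are not taken from the paper.

```python
def harmonic_measure(tp: int, fn: int, tn: int, fp: int) -> float:
    """Harmonic mean of TPR and TNR (assumed form; the abstract gives no formula).

    tp: duplicates correctly flagged as duplicates
    fn: duplicates that were missed
    tn: non-duplicates correctly left unflagged
    fp: non-duplicates wrongly flagged as duplicates
    """
    # Rate of actual duplicates that were detected.
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    # Rate of actual non-duplicates that were left alone.
    tnr = tn / (tn + fp) if (tn + fp) else 0.0
    if tpr + tnr == 0:
        return 0.0
    return 2 * tpr * tnr / (tpr + tnr)


# Hypothetical counts: TPR = 0.6, TNR = 0.9
score = harmonic_measure(tp=60, fn=40, tn=90, fp=10)
```

Under this assumed definition, a detector scores well only if both rates are high: one that flags nothing as a duplicate has a true negative rate of 1 but a true positive rate of 0, so its score is 0.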
Copyright © 2014 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.