Duplicate bug report detection using clustering
Date
2014
Remarks
Copyright © 2014 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Abstract
Reporting bugs and fixing them play a critical part in the development and maintenance of software systems. Software developers and end users collaborate in this process to improve software reliability. End users report the defects they find in the software and describe how these bugs affect them. However, the same defect may be reported independently by several users, leading to a significant number of duplicate bug reports. A number of methods exist for detecting duplicate bug reports, but the best results so far identify only 24% of actual duplicates. In this paper, we propose a new clustering-based method that identifies a larger proportion of duplicate bug reports while keeping the number of false positives (non-duplicates misidentified as duplicates) low. The proposed approach is evaluated experimentally on a large sample of bug reports from three public-domain data sets. The results show that it outperforms existing methods on a harmonic measure that combines the true positive and true negative rates.
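The abstract does not give the exact formula for the harmonic measure. As a minimal sketch, assuming it is the plain harmonic mean of the true positive rate (TPR) and true negative rate (TNR), it could be computed as below. The function names and the confusion-matrix counts are hypothetical illustrations, not values from the paper.

```python
def rates(tp: int, fp: int, tn: int, fn: int) -> tuple[float, float]:
    """Return (TPR, TNR) from confusion-matrix counts.

    TPR = TP / (TP + FN): share of real duplicates that were detected.
    TNR = TN / (TN + FP): share of non-duplicates correctly left alone.
    """
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    tnr = tn / (tn + fp) if (tn + fp) else 0.0
    return tpr, tnr


def harmonic_measure(tpr: float, tnr: float) -> float:
    """Harmonic mean of TPR and TNR (assumed form, not confirmed by the source)."""
    if tpr + tnr == 0:
        return 0.0
    return 2 * tpr * tnr / (tpr + tnr)


# Hypothetical counts chosen only to illustrate the arithmetic.
tpr, tnr = rates(tp=24, fp=10, tn=90, fn=76)
score = harmonic_measure(tpr, tnr)
print(f"TPR={tpr:.2f} TNR={tnr:.2f} harmonic={score:.3f}")
```

A harmonic mean is pulled toward the smaller of the two rates, so under this assumed form a detector cannot score well by rejecting nearly every report (high TNR, low TPR) or by flagging nearly every report as a duplicate (high TPR, low TNR).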