A transparent and transportable methodology for evaluating Data Linkage software
There has been substantial growth in Data Linkage (DL) activities in recent years. This reflects growth in both the demand for, and the supply of, linked or linkable data. Increased utilisation of DL “services” has brought with it an increased need for impartial information about the suitability and performance capabilities of DL software programs and packages. Although evaluations of DL software exist, most have been restricted to the comparison of two or three packages. Evaluations of a large number of packages are rare because of the time and resource burden placed on the evaluators and the need for a suitable “gold standard” evaluation dataset. In this paper we present an evaluation methodology that overcomes a number of these difficulties. Our approach involves the generation and use of representative synthetic data; the execution of a series of linkages using a pre-defined linkage strategy; and the use of standard linkage quality metrics to assess performance. The methodology is both transparent and transportable, producing genuinely comparable results. The methodology was used by the Centre for Data Linkage (CDL) at Curtin University in an evaluation of ten DL software packages. It is also being used to evaluate larger linkage systems (not just packages). The methodology provides a unique opportunity to benchmark the quality of linkages in different operational environments.
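As an illustration of the final step, the sketch below scores a set of links produced by a package against the known true matches in a synthetic dataset. The abstract does not name the metrics used, so precision, recall and F-measure are assumed here as typical examples of standard linkage quality metrics; the function names and record identifiers are hypothetical and are not taken from the paper.

```python
from typing import Dict, Iterable, Set, Tuple

Pair = Tuple[str, str]


def normalise(pairs: Iterable[Pair]) -> Set[Pair]:
    """Treat (a, b) and (b, a) as the same link and drop duplicates."""
    return {tuple(sorted(pair)) for pair in pairs}


def linkage_quality(predicted: Iterable[Pair], truth: Iterable[Pair]) -> Dict[str, float]:
    """Compare predicted links with the true matches of a synthetic dataset.

    Assumed metrics (not specified in the abstract):
      precision = correct links / all predicted links
      recall    = correct links / all true matches
      f_measure = harmonic mean of precision and recall
    """
    pred = normalise(predicted)
    true = normalise(truth)
    correct = len(pred & true)
    precision = correct / len(pred) if pred else 0.0
    recall = correct / len(true) if true else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f_measure": f_measure}


# Hypothetical record identifiers, for illustration only.
true_matches = [("a1", "b1"), ("a2", "b2"), ("a4", "b4"), ("a5", "b5")]
package_links = [("a1", "b1"), ("b2", "a2"), ("a3", "b9")]
scores = linkage_quality(package_links, true_matches)
```

Because the true match status of every record pair is known in synthetic data, the same scoring can be applied unchanged to the output of each package, which is what makes the results comparable.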
NOTICE: this is the author’s version of a work that was accepted for publication in Journal of Biomedical Informatics. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Journal of Biomedical Informatics, 45 (1) 2012 http://dx.doi.org/10.1016/j.jbi.2011.10.006
Showing items related by title, author, creator and subject.
Randall, Sean; Ferrante, Anna; Boyd, James; Semmens, James (2013) Background: Within the field of record linkage, numerous data cleaning and standardisation techniques are employed to ensure the highest quality of links. While these facilities are common in record linkage software ...
Online Assessment System with Integrated Study (OASIS) to enhance the learning of Electrical Engineering students: an action research study. Smaill, Christopher Raymond (2006) World-wide, there has been a large increase in tertiary student numbers, not entirely matched by funding increases. Consequently, instructors are faced with large, diverse classes, and find themselves struggling to provide ...
Boyd, James; Randall, Sean; Ferrante, Anna; Bauer, J.; Brown, A.; Semmens, James (2014) Background: Record linkage techniques are widely used to enable health researchers to gain event based longitudinal information for entire populations. The task of record linkage is increasingly being undertaken by ...