Limited privacy protection and poor sensitivity: Is it time to move on from the statistical linkage key-581?
Date
2016
Abstract
Background: The statistical linkage key (SLK-581) is a common tool for record linkage in Australia, due to its ability to provide some privacy protection. However, newer privacy-preserving approaches may provide greater privacy protection while still allowing high-quality linkage. Objective: To evaluate the standard SLK-581, the encrypted SLK-581 and a newer privacy-preserving approach using Bloom filters, in terms of both privacy and linkage quality. Method: Linkage quality was compared by conducting linkages on Australian health datasets using each of the three techniques and examining the results. Privacy was compared qualitatively against a series of scenarios in which privacy breaches may occur. Results: The Bloom filter technique offered greater privacy protection and higher linkage quality than the SLK-based method commonly used in Australia. Conclusion: The adoption of new privacy-preserving methods would allow greater confidence in research results while significantly improving privacy protection.
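For readers unfamiliar with the two kinds of technique compared in the abstract, the following is a minimal sketch and not the study's implementation. The `slk581` function follows the commonly documented SLK-581 layout (2nd, 3rd and 5th letters of the family name, 2nd and 3rd letters of the given name, date of birth as DDMMYYYY, one-digit sex code, with `2` for a missing letter and `9`s for a missing name). The Bloom filter part is purely illustrative: the function names, the padded bigrams, the filter size of 1000 bits, the 20 keyed hashes and the use of HMAC-SHA256 are all assumptions chosen for the example.

```python
import hashlib
import hmac
from datetime import date


def slk581(family_name: str, given_name: str, dob: date, sex_code: int) -> str:
    """Build a 14-character SLK-581 style key from name, date of birth and sex."""

    def pick(name: str, positions: tuple, width: int) -> str:
        letters = [c for c in name.upper() if c.isalpha()]
        if not letters:
            return "9" * width  # whole name component missing
        # Positions are 1-based; a name that is too short is padded with '2'.
        return "".join(letters[p - 1] if p <= len(letters) else "2" for p in positions)

    return (
        pick(family_name, (2, 3, 5), 3)
        + pick(given_name, (2, 3), 2)
        + dob.strftime("%d%m%Y")
        + str(sex_code)
    )


def bigrams(value: str) -> set:
    """Split a string into overlapping two-character grams, padded at both ends."""
    padded = f"_{value.lower()}_"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def bloom_encode(value: str, secret: str, size: int = 1000, num_hashes: int = 20) -> set:
    """Return the set bit positions of a Bloom filter built from keyed bigram hashes.

    The parameters here are illustrative assumptions, not values from the study.
    """
    bits = set()
    for gram in bigrams(value):
        for k in range(num_hashes):
            digest = hmac.new(secret.encode(), f"{k}|{gram}".encode(), hashlib.sha256).digest()
            bits.add(int.from_bytes(digest[:8], "big") % size)
    return bits


def dice(a: set, b: set) -> float:
    """Dice coefficient between two sets of bit positions (1.0 means identical)."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


# A one-letter spelling difference changes the SLK, so an exact-match comparison fails,
# whereas the Bloom filter encodings can still be scored as similar.
key_a = slk581("Smith", "John", date(1980, 3, 15), 1)  # "MIHOH150319801"
key_b = slk581("Smyth", "John", date(1980, 3, 15), 1)
similarity = dice(bloom_encode("Smith", "shared-secret"), bloom_encode("Smyth", "shared-secret"))
```

In this sketch the SLK comparison is all-or-nothing, while the Bloom filter comparison yields a graded similarity score that a linkage process could threshold. The example says nothing about the relative privacy of the two encodings, which the abstract assesses qualitatively.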
Related items
Showing items related by title, author, creator and subject.
- Brown, Adrian; Ferrante, Anna; Randall, Sean; Boyd, James; Semmens, James (2017) © 2017 Brown, Ferrante, Randall, Boyd and Semmens. In an era where the volume of structured and unstructured digital data has exploded, there has been an enormous growth in the creation of data about individuals that can ...
- Brown, A.; Randall, Sean; Ferrante, A.; Semmens, J.; Boyd, J. (2017) Background: Probabilistic record linkage is a process used to bring together person-based records from within the same dataset (de-duplication) or from disparate datasets using pairwise comparisons and matching probabilities. ...
- Boyd, James; Randall, Sean; Ferrante, Anna (2015) Record linkage is the process of bringing together data relating to the same individual within and between different datasets. These integrated datasets provide diverse and rich resources for researchers without the cost ...