Randomized Response and Balanced Bloom Filters for Privacy Preserving Record Linkage
|dc.identifier.citation||Schnell, R. and Borgs, C. 2017. Randomized Response and Balanced Bloom Filters for Privacy Preserving Record Linkage, 16th IEEE International Conference on Data Mining Workshops (ICDMW 2016), pp. 218-224.|
© 2016 IEEE. In most European settings, record linkage across different institutions is based on encrypted personal identifiers (such as names, birthdays, or places of birth) to protect privacy. However, in practice up to 20% of the records may contain errors in identifiers. Thus, exact record linkage on encrypted identifiers usually results in the loss of large subsets of the data. Such losses usually imply biased statistical estimates, since the causes of errors might be correlated with the variables of interest in many applications. Over the past 10 years, the field of Privacy Preserving Record Linkage (PPRL) has developed different techniques to link data without revealing the identity of the described entity. However, only a few techniques are suitable for applied research with large databases that include millions of records, which is typical for administrative or medical databases. Bloom filters were found to be one successful technique for PPRL where large-scale applications are concerned. Yet Bloom filters have been subject to cryptographic attacks. Previous research has shown that the straightforward application of Bloom filters has a non-zero re-identification risk. We present new results on recently developed techniques defying all known attacks on PPRL Bloom filters. The computationally inexpensive algorithms modify personal identifiers by combining different cryptographic techniques. The paper presents these new algorithms and demonstrates their performance concerning precision, recall, and re-identification risk on large databases.
|dc.title||Randomized Response and Balanced Bloom Filters for Privacy Preserving Record Linkage|
|dcterms.source.title||Proceedings 16th IEEE International Conference on Data Mining Workshops|
|dcterms.source.series||Proceedings 16th IEEE International Conference on Data Mining Workshops|
|dcterms.source.conference||16th IEEE International Conference on Data Mining Workshops (ICDMW 2016)|
|curtin.department||Centre for Population Health Research|
|curtin.accessStatus||Fulltext not available|