How Protective Are Synthetic Data?
Author
Abowd, John
Vilhuber, Lars
Abstract
This short paper synthesizes the statistical disclosure limitation (SDL) and computer science data privacy approaches to measuring the confidentiality protection provided by fully synthetic data. Because every element of every record in a release file derived from fully synthetic data is sampled from an appropriate probability distribution, the records do not represent “real data,” but a disclosure risk remains. In SDL, this risk is summarized by the inferential disclosure probability. In privacy-protected database queries, it is measured by the differential privacy ratio. The two measures are closely related. This result, which is not new, is demonstrated, and examples are provided from recent work.
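The link between the two risk measures can be illustrated with a small numerical sketch. It is not the paper's own construction. It assumes a toy synthesizer that releases one synthetic record drawn from the posterior predictive distribution of a multinomial model with a symmetric Dirichlet prior, and all function names and the example counts are hypothetical. The largest ratio of release probabilities across two databases that differ in one record plays the role of the differential privacy ratio, and by Bayes' rule the same ratio bounds how far an intruder's odds about that record can move.

```python
import math

def predictive_probs(counts, alpha):
    """Probability that a single synthetic record falls in each category,
    under a multinomial model with a symmetric Dirichlet(alpha) prior."""
    total = sum(counts) + alpha * len(counts)
    return [(n + alpha) / total for n in counts]

def max_privacy_ratio(counts_a, counts_b, alpha):
    """Largest ratio, over all possible released categories, between the
    release probabilities under two databases (checked in both directions)."""
    probs_a = predictive_probs(counts_a, alpha)
    probs_b = predictive_probs(counts_b, alpha)
    return max(max(pa / pb, pb / pa) for pa, pb in zip(probs_a, probs_b))

def posterior_odds(prior_odds, likelihood_ratio):
    """Bayes' rule in odds form: posterior odds = prior odds * likelihood ratio."""
    return prior_odds * likelihood_ratio

# Hypothetical neighboring databases: one record moves from category 0 to 1.
counts_a = [1, 4, 5]
counts_b = [0, 5, 5]
alpha = 0.5  # assumed prior pseudo-count per category

ratio = max_privacy_ratio(counts_a, counts_b, alpha)
epsilon = math.log(ratio)

# An intruder with even prior odds about which database is the true one
# can end up with posterior odds no larger than the ratio itself.
worst_case_odds = posterior_odds(1.0, ratio)

print(f"privacy ratio = {ratio:.3f}, epsilon = {epsilon:.3f}")
print(f"worst-case posterior odds from even prior odds = {worst_case_odds:.3f}")
```

In this toy setting the ratio equals (1 + alpha) / alpha when the moved record is the only one in its category, so a larger prior pseudo-count gives a smaller ratio and a tighter bound on inferential disclosure.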
Date Issued
2008-01-01
Related Version
Published as: Abowd J.M., Vilhuber L. (2008) How Protective Are Synthetic Data? In: Domingo-Ferrer J., Saygın Y. (eds) Privacy in Statistical Databases. PSD 2008. Lecture Notes in Computer Science, vol 5262. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-87471-3_20