
dc.contributor.author: Haney, Samuel
dc.contributor.author: Machanavajjhala, Ashwin
dc.contributor.author: Abowd, John M
dc.contributor.author: Graham, Matthew
dc.contributor.author: Kutzbach, Mark
dc.contributor.author: Vilhuber, Lars
dc.date.accessioned: 2017-05-09T17:34:45Z
dc.date.available: 2017-05-09T17:34:45Z
dc.date.issued: 2017-05-14
dc.identifier.uri: https://hdl.handle.net/1813/49652
dc.description.abstract: National statistical agencies around the world publish tabular summaries based on combined employer-employee (ER-EE) data. The privacy of both individuals and business establishments that feature in these data is protected by law in most countries. These data are currently released using a variety of statistical disclosure limitation (SDL) techniques that do not reveal the exact characteristics of particular employers and employees, but lack provable privacy guarantees limiting inferential disclosures. In this work, we present novel algorithms for releasing tabular summaries of linked ER-EE data with formal, provable guarantees of privacy. We show that state-of-the-art differentially private algorithms add too much noise for the output to be useful. Instead, we identify the privacy requirements mandated by current interpretations of the relevant laws, and formalize them using the Pufferfish framework. We then develop new privacy definitions that are customized to ER-EE data and satisfy the statutory privacy requirements. We implement the experiments in this paper on production data gathered by the U.S. Census Bureau. An empirical evaluation of utility for these data shows that for reasonable values of the privacy-loss parameter ϵ≥1, the additive error introduced by our provably private algorithms is comparable to, and in some cases better than, the error introduced by existing SDL techniques that have no provable privacy guarantees. For some complex queries currently published, however, our algorithms do not have utility comparable to the existing traditional [en_US]
dc.description.sponsorship: Authors acknowledge support from NSF grants 1253327, 1408982, 1443014, BCS-0941226, TC-1012593 and SES-1131848, DARPA & SPAWAR N66001-15-C-4067, and Alfred P. Sloan Foundation.
dc.language.iso: en_US [en_US]
dc.relation.hasversion: Published as Samuel Haney, Ashwin Machanavajjhala, John M. Abowd, Matthew Graham, Mark Kutzbach, Lars Vilhuber (2017) "Utility Cost of Formal Privacy for Releasing National Employer-Employee Statistics", SIGMOD’17, May 14-19, 2017, Chicago, Illinois, USA.
dc.relation.uri: http://doi.org/10.1145/3035918.3035940
dc.title: Utility Cost of Formal Privacy for Releasing National Employer-Employee Statistics [en_US]
dc.type: article [en_US]
dc.description.legacydownloads: Downloads for this item at https://digitalcommons.ilr.cornell.edu/ldi/36/ as of 9/15/2020: 172

