ILR School

Using Partially Synthetic Microdata to Protect Sensitive Cells in Business Statistics

dc.contributor.author: Miranda, Javier
dc.contributor.author: Vilhuber, Lars
dc.description: Presented at World Statistical Congress 2015 and Joint Statistical Meetings 2015. Vilhuber acknowledges support through NSF Grants SES-1042181 and BCS-0941226. All authors were affiliated with the U.S. Census Bureau, Center for Economic Studies, when originally contributing to the contents of this paper. This document reports the results of research and analysis undertaken by U.S. Census Bureau staff. It has undergone a Census Bureau review more limited in scope than that given to official Census Bureau publications. This document is released to inform interested parties of ongoing research and to encourage discussion of work in progress. All results have been reviewed to ensure that no confidential information is disclosed. The views expressed herein are attributable only to the authors and do not represent the views of the U.S. Census Bureau. The data used in this paper is restricted-access, and can be accessed either through the Federal Statistical Research Data Centers (LBD) or through the Synthetic Data Server at Cornell University (Synthetic LBD). Data and code used for the final version of this paper will be archived at the U.S. Census Bureau and made available upon request.
dc.description.abstract: We describe and analyze a method that blends records from both observed and synthetic microdata into public-use tabulations on establishment statistics. The resulting tables use synthetic data only in potentially sensitive cells. We describe different algorithms, and present preliminary results when applied to the Census Bureau's Business Dynamics Statistics and Synthetic Longitudinal Business Database, highlighting accuracy and protection afforded by the method when compared to existing public-use tabulations (with suppressions).
dc.description.legacydownloads: synbds_noise_synthetic_SJIAOS2015.pdf: 167 downloads, before Oct. 1, 2020.
dc.subject: synthetic data
dc.subject: statistical disclosure limitation
dc.subject: local labor markets
dc.subject: gross job flows
dc.subject: confidentiality protection
dc.title: Using Partially Synthetic Microdata to Protect Sensitive Cells in Business Statistics
local.authorAffiliation: Miranda, Javier: Center for Economic Studies, US Census Bureau
local.authorAffiliation: Vilhuber, Lars: Cornell University


Original bundle
1.54 MB
Adobe Portable Document Format