Synthetic Longitudinal Business Data International User Seminar

In this seminar, we discuss with interested parties the conditions necessary to implement the SynLBD approach, with the goal of providing other statistical agencies with a straightforward toolkit to implement the same procedure on their own data. Our hope is that implementing similar procedures on comparable business microdata will enable new research both within and across countries. The ideal end result is a series of country-specific datasets on establishments and/or firms available within the same computing environment. We discuss the data and software requirements for the lowest-cost approach, the disclosure protection statistics already implemented that can be used to achieve release of the data in this way, the validation procedures that an agency should agree to, and the likely cost of maintaining such procedures. The seminar brings together academics working on cutting-edge methods for the protection of privacy in statistical databases and researchers and implementers at statistical agencies that have started, or are interested in starting, a similar project.

Funding for the workshop is provided by the National Science Foundation (CNS-1012593, SES-1131848) and the Alfred P. Sloan Foundation. The organizers thank the National Academies’ Committee on National Statistics for hosting the seminar.

    Proceedings from the Synthetic LBD International Seminar
    Vilhuber, Lars; Kinney, Saki; Schmutte, Ian M. (2017-09-22)
    On May 9, 2017, we hosted a seminar to discuss with interested parties the conditions necessary to implement the SynLBD approach, with the goal of providing a straightforward toolkit to implement the same procedure on other data. The proceedings summarize the discussions during the workshop.
    Excerpt: Usage and outcomes of the Synthetic Data Server
    Vilhuber, Lars; Abowd, John M. (2017-05-09)
    This is an excerpt from a prior presentation at the Society of Labor Economists (2016). The Synthetic Data Server (SDS) at Cornell University was set up to provide early access to new synthetic data products from the U.S. Census Bureau. These datasets are made available to interested researchers in a controlled environment, prior to a more generalized release. Over the past five years, four synthetic datasets were made available on the server, and over 100 users accessed the server during that period. This paper reports on interim outcomes of the activity: results of validation requests from a user perspective, the functioning of the feedback loop due to validation and user input, and the role of the SDS as an access gateway to, and educational tool for, other mechanisms of accessing detailed person, household, establishment, and firm statistics.
    Confidentiality of the SynLBD
    Vilhuber, Lars; Kinney, Saki (2017-05-09)
    We describe the confidentiality protection provided by the SynLBD. The presentation was originally prepared by Saki Kinney for the World Statistics Congress 2013.
    SynLBD Inputs: Structure, Example
    Vilhuber, Lars; Drechsler, Jörg (2017-05-09)
    We describe the structure of inputs for the SynLBD, and discuss challenges in preparing them.
    Overview: Synthetic Longitudinal Business Data International User Seminar
    Vilhuber, Lars; Kinney, Saki (2017-05-09)
    An overview of the content of the Synthetic Longitudinal Business Data International User Seminar, based in part on a presentation prepared by Saki Kinney for the 2013 World Statistics Congress (WSC2013).