Dynamically consistent noise infusion and partially synthetic data as confidentiality protection measures for related time-series
Abowd, John; Gittings, Kaj; McKinney, Kevin L.; Stevens, Bryce E.; Vilhuber, Lars; Woodcock, Simon
The Census Bureau's Quarterly Workforce Indicators (QWI) provide detailed quarterly statistics on employment measures such as worker and job flows, tabulated by detailed worker characteristics in various combinations. The data are released for detailed NAICS industries and for several levels of geography, the lowest aggregation of which are counties. OnTheMap, another Census Bureau product, provides a subset of these tabulations at the tract level. Disclosure avoidance methods are required to protect the information about individuals and businesses that contribute to the underlying data. The QWI disclosure avoidance mechanism we describe here relies heavily on the use of noise infusion through a permanent multiplicative noise distortion factor, used for magnitudes, counts, differences and ratios. There is minimal suppression and no complementary suppressions. To our knowledge, the release in 2003 of the QWI was the first large-scale use of noise infusion in any official statistical product. We show that the released statistics are analytically valid along several critical dimensions -- measures are unbiased and time series properties are preserved. We provide an analysis of the degree to which confidentiality is protected. Furthermore, we show how the judicious use of synthetic data, injected into the tabulation process, can completely eliminate suppressions, maintain analytical validity, and increase the protection of the underlying confidential data.
Presented at FCSM.
noise infusion; synthetic data; statistical disclosure limitation; time-series; local labor markets; gross job
Required Publisher Statement: Copyright held by authors.