Final Report to the Social Security Administration on the SIPP/SSA/IRS Public Use File Project
Abowd, John; Stinson, Martha; Benedetto, Gary
The creation of public use data that combine variables from the Census Bureau's Survey of Income and Program Participation (SIPP), the Internal Revenue Service's (IRS) individual lifetime earnings data, and the Social Security Administration's (SSA) individual benefit data began as part of ongoing collaborative research at the Census Bureau and SSA. The current project had its genesis with the formation of a joint committee of representatives from the Census Bureau, SSA, IRS, and the Congressional Budget Office (CBO) that designed a prospective public use file. Because the file was aimed at a user community primarily interested in national retirement and disability programs, the selection of variables for the proposed SIPP/SSA/IRS-PUF focused on the critical demographic data to be supplied from the SIPP, earnings histories from the IRS data maintained at SSA, and benefit data from SSA's master beneficiary records. After attempting to determine the feasibility of adding a limited number of variables from the SIPP directly to the linked earnings and benefit data, it was decided that the set of variables that could be added without compromising the confidentiality protection of the existing SIPP public use files was so limited that alternative methods had to be used to create a useful new public use file. The committee agreed to allow the Census Bureau to experiment with the confidentiality protection system known generically as "synthetic data." The actual technique adopted is called partially synthetic data with multiple imputation of missing items. As the term is used in this report, "partially synthetic data" means the release of person-level records in which some variables contain the actual responses and, for the other variables, the actual responses have been replaced by values sampled from the posterior predictive distribution for that record, conditional on all of the confidential data.
This final report accompanies the delivery of version 4.0 to SSA as part of the fiscal year 2006 Jointly Financed Cooperative Agreement between the Census Bureau and SSA.
This document was posted by the U.S. Census Bureau. No known curated archive exists.
The creation of the SIPP Synthetic Beta was funded by the U.S. Census Bureau and SSA, with additional funding from NSF grants #0427889 and #0339191. Archiving of these documents is funded through NSF grants SES-1042181 and BCS-0941226, and through a grant from the Alfred P. Sloan Foundation.
U.S. Census Bureau
SIPP; synthetic data; U.S. Census Bureau
CC0 1.0 Universal