eCommons


ESSAYS ON DATA PRIVACY CHALLENGES THAT FEDERAL STATISTICAL AGENCIES CONFRONT IN A DATA-RICH WORLD

dc.contributor.author: Sexton, William Nelson
dc.contributor.chair: Abowd, John
dc.contributor.committeeMember: Easley, David
dc.contributor.committeeMember: Shmatikov, Vitaly
dc.contributor.committeeMember: Schmutte, Ian
dc.date.accessioned: 2020-08-10T20:24:27Z
dc.date.available: 2020-08-10T20:24:27Z
dc.date.issued: 2020-05
dc.description: 133 pages
dc.description.abstract: With vast databases at their disposal, private tech companies can compete with public statistical agencies to provide population statistics. However, private companies face different incentives to provide high-quality statistics and to protect the privacy of the people whose data are used. When both privacy protection and statistical accuracy are public goods, private providers tend to produce at least one suboptimally, but it is not clear which. In the first paper, we model a firm that publishes statistics under a guarantee of differential privacy. We prove that, in this framework, provision by the private firm results in inefficiently low data quality.

When Google or the U.S. Census Bureau publishes detailed statistics on browsing habits or neighborhood characteristics, some privacy is lost for everybody while public information is supplied. In the second paper, we assert that, to date, economists have not focused on the privacy loss inherent in data publication. Instead, these issues have been advanced almost exclusively by computer scientists, who are primarily interested in the technical problems associated with protecting privacy. Economists should join the discussion, first, to determine where to balance privacy protection against data quality, a social choice problem. Furthermore, economists must ensure that new privacy models preserve the validity of public data for economic research.

Differential privacy is a mathematical tool for protecting the confidentiality of records belonging to individuals. One of its key premises is that any measurement based on the confidential data must be altered with carefully chosen random noise before publication. In the third paper, we consider a scenario where the deployment of differentially private disclosure limitation technologies by official statistical agencies may not always occur under ideal conditions. For instance, internal decisions or external requirements (e.g., legal or contractual obligations) may stipulate that certain statistics must be published exactly. Additionally, overlapping datasets may already have been published. In this paper, we explain (1) the semantics of algorithms that satisfy differential privacy, (2) how those semantics are affected by the release of exact statistics (computed directly from the confidential data), (3) how to attribute responsibility for any resulting information leakage, and (4) how to provide privacy semantics for the combined information leakage.
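As a concrete illustration of the noise-addition premise described in the abstract, below is a minimal Python sketch of the Laplace mechanism, the canonical differentially private primitive. This is not code from the dissertation; the function name, the toy counting query, and the epsilon value are illustrative assumptions.

import numpy as np

def laplace_mechanism(true_value: float, sensitivity: float, epsilon: float) -> float:
    # Standard calibration for the Laplace mechanism: add noise drawn
    # from Laplace(0, sensitivity/epsilon) so the release satisfies
    # epsilon-differential privacy for a query with the given sensitivity.
    scale = sensitivity / epsilon
    return true_value + np.random.laplace(loc=0.0, scale=scale)

# Toy example (hypothetical values): publish a population count.
# A counting query changes by at most 1 when one person's record is
# added or removed, so its sensitivity is 1.
true_count = 4821   # confidential statistic (illustrative)
epsilon = 0.5       # privacy-loss budget; smaller means more privacy, more noise
noisy_count = laplace_mechanism(true_count, sensitivity=1.0, epsilon=epsilon)
print(f"published count: {noisy_count:.1f}")

Note that the published value, not the confidential one, is what leaves the agency; releasing any statistic exactly, as the third paper discusses, bypasses this protection for that statistic.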
dc.identifier.doi: https://doi.org/10.7298/spwy-1a64
dc.identifier.other: Sexton_cornellgrad_0058_11907
dc.identifier.other: http://dissertations.umi.com/cornellgrad:11907
dc.identifier.uri: https://hdl.handle.net/1813/70447
dc.language.iso: en
dc.subject: Differential Privacy
dc.subject: Public Goods
dc.subject: Semantics
dc.subject: Social Choice
dc.title: ESSAYS ON DATA PRIVACY CHALLENGES THAT FEDERAL STATISTICAL AGENCIES CONFRONT IN A DATA-RICH WORLD
dc.type: dissertation or thesis
dcterms.license: https://hdl.handle.net/1813/59810
thesis.degree.discipline: Economics
thesis.degree.grantor: Cornell University
thesis.degree.level: Doctor of Philosophy
thesis.degree.name: Ph.D., Economics

Files

Original bundle
Name: Sexton_cornellgrad_0058_11907.pdf
Size: 708.12 KB
Format: Adobe Portable Document Format