Cornell University

Labor Dynamics Institute

275 Ives Faculty Building, 607-255-2744

News

July 29, 2014

Abowd will be a keynote speaker at the Synthetic Data Workshop sponsored by The MITRE Corporation

John M. Abowd will be one of the keynote speakers at the Synthetic Data Workshop sponsored by The MITRE Corporation on July 29th.  

MITRE is hosting a technical exchange meeting (TEM) to discuss the challenges in the generation and use of synthetic data, a technology with the potential to resolve some restrictions on data access. See program details here.

Synthetic Data Publication and Access to Confidential Data

John M. Abowd

Abstract

An essential aspect of the synthetic data program that emerged from the NSF-ITR grant (2004-2007 to Cornell University with partners at Census, IAB, Carnegie Mellon, Duke and Michigan) is the hierarchy of access modalities to confidential data: public-use, restricted-access, and supervised direct access. One important feature of this hierarchy is the feedback cycle between synthetic data (a public-use mode) and restricted access to the underlying confidential data. Good-quality synthetic data involves many modeling decisions even when the process appears to be automated. Synthetic data users represent a much broader community than any synthetic data development team can cover. When these users provide their results to the development team in exchange for validation on the underlying confidential data (a supervised direct-access mode), the quality of both the synthetic and confidential data is improved. We can now document this claim. This talk presents the current evidence and discusses the prospects for enhancing this feedback cycle.