CMStatistics 2020
Title: A latent class modeling approach for differentially private synthetic data for contingency tables
Authors: Michelle Nixon - The Pennsylvania State University (United States)
Andres Barrientos - Florida State University (United States) [presenting]
Jerry Reiter - Duke University (United States)
Aleksandra Slavkovic - Penn State University (United States)
Abstract: We present an approach to constructing differentially private synthetic data for contingency tables. The algorithm achieves privacy by adding noise to selected summary counts, e.g., two-way margins of the contingency table, via the Geometric mechanism. We posit an underlying latent class model for the counts, estimate the parameters of the model from the noisy counts, and generate synthetic data from the estimated model. Because the synthesis step uses only the noisy counts, the data-releasing agency can create multiple imputations of synthetic data with no additional privacy loss, thereby facilitating estimation of uncertainty in downstream analyses. We illustrate the approach using a subset of the 2016 American Community Survey Public Use Microdata Sets.
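The noise-addition step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names (`two_sided_geometric`, `noisy_two_way_margins`), the per-margin privacy budget `epsilon_per_margin`, and the `sensitivity` parameter are all assumptions introduced here. The abstract does not say how the privacy budget is split across margins or which neighboring-dataset definition is used, so both are left to the caller. The sketch uses the standard two-sided Geometric mechanism, in which integer noise Z satisfies P(Z = z) proportional to alpha^|z| with alpha = exp(-epsilon / sensitivity).

```python
import math
import random
from collections import Counter
from itertools import combinations


def two_sided_geometric(epsilon, sensitivity=1.0, rng=random):
    """Draw integer noise Z with P(Z = z) proportional to alpha**abs(z),
    where alpha = exp(-epsilon / sensitivity)."""
    alpha = math.exp(-epsilon / sensitivity)

    def one_sided():
        # Inverse-transform draw of a geometric variable on {0, 1, 2, ...}
        # with P(G >= k) = alpha**k.
        u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
        return math.floor(math.log(u) / math.log(alpha))

    # The difference of two i.i.d. one-sided geometric draws is two-sided geometric.
    return one_sided() - one_sided()


def noisy_two_way_margins(records, levels, epsilon_per_margin,
                          sensitivity=1.0, rng=random):
    """Return noisy counts for every two-way margin.

    records: list of tuples, one categorical value per variable (hypothetical layout).
    levels:  list of lists giving the full set of levels for each variable.
    """
    margins = {}
    for i, j in combinations(range(len(levels)), 2):
        counts = Counter((r[i], r[j]) for r in records)
        # Noise is added to every cell, including cells whose true count is zero.
        margins[(i, j)] = {
            (a, b): counts.get((a, b), 0)
            + two_sided_geometric(epsilon_per_margin, sensitivity, rng)
            for a in levels[i]
            for b in levels[j]
        }
    return margins
```

The noisy counts may be negative and need not be mutually consistent across margins; in the approach described above, they serve only as inputs to the model-fitting step. Everything computed from them afterwards, including any number of synthetic datasets, is post-processing and consumes no further privacy budget.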