postdoc, mit idss & serc
SERC Reading Group: Differential Privacy in the U.S. Census
During the Spring 2022 term, Kevin Mills and I ran a first-of-its-kind reading group with MIT’s Social and Ethical Responsibilities of Computing (SERC) Scholars. The readings and discussions centered on formal privacy in public data, and in particular on the decisions, designs, and implementations of differentially private algorithms in the 2020 Decennial Census. The group is preparing a document to synthesize and present the readings, research, and discussions.
Below are the reading list and discussion topics for the term. Where possible, I’ll include links to the readings and their authors.
- Differential Privacy and the 2020 U.S. Census by Simson Garfinkel. In MIT Case Studies in Social and Ethical Responsibilities of Computing, Winter 2022.
- What do we know about differential privacy and the decennial census at the outset?
- Based on the case study, what do we see as being sticking points in the implementation and deployment timeline that we should dig further into?
- What does “data privacy” mean (informally) in this day and age, and what is the value of privacy in data that is generated by the public and then collected and published by the government?
- Chapters 1 and 2 of The Algorithmic Foundations of Differential Privacy by Cynthia Dwork and Aaron Roth, 2014; or
- Lecture 3B from Gautam Kamath’s Fall 2020 Differential Privacy course, University of Waterloo.
- Technically, what exactly is differential privacy and what does it accomplish?
- How does differential privacy align with common or colloquial understandings of “privacy”?
- What might the privacy concerns be in publishing U.S. census data, specifically? Can we enumerate some potential misuses or harms that would result from insufficient privacy protection?
- What does the privacy parameter epsilon mean in this context? How can we think about translating from tuning epsilon to protecting individuals’ privacy in meaningful ways?
- Chapters 3 and 8.1 of The Algorithmic Foundations of Differential Privacy by Dwork and Roth
- Lectures 1A-4B of Kamath’s Differential Privacy course
- Strength in Numbers: A Guide to the 2020 Census Redistricting Data From the U.S. Census Bureau
- Census TopDown: Differentially Private Data, Incremental Schemas, and Consistency with Public Knowledge, draft paper by Abowd et al., July 2019 version.
- Why Privacy is Important by James Rachels, Philosophy & Public Affairs, 1975.
- Given that the data published from the decennial census is relatively mundane, to what extent should privacy be a first-order concern in this process? For example, someone’s age might be reasonably inferred (to some confidence) by an observer on the street, so what is the value in the Census Bureau actively obscuring that in the data publication process?
- What kinds of challenges does the Census Bureau anticipate contending with as evidenced by the algorithm design?
- Data in the census is a subset of richer and often much scarier datasets sold by data brokers and similar agents. What is the purpose of privacy in the census’ publications given that the same data and much more invasive details are readily available to anyone willing to pay the right price?
- What role do trust and transparency play in the data collection, analysis, and publication process here?
- Privacy as Contextual Integrity by Helen Nissenbaum, Washington Law Review, 2004.
- Privacy and Contextual Integrity: Framework and Applications by Barth, Datta, Mitchell, and Nissenbaum, IEEE Symposium on Security and Privacy, 2006.
- U.S. Code, Title 13
- Does contextual integrity provide a convincing framework for how to think about privacy?
- Where are the gaps between contextual integrity and the way that differential privacy asks us to imagine privacy loss?
- Do we think that the text of Title 13, and in particular its language about the Bureau not publishing reidentifiable data, demands the use of differential privacy?
- Amicus Brief of Data Privacy Experts in Alabama v U.S. Dept. of Commerce, 2021.
- Brief of Amici Curiae Seven Data Privacy and Urban Planning Experts in Support of Plaintiff-Appellant and Reversal in Sanchez v Los Angeles Department of Transportation; City of Los Angeles, 2021.
- Opinion of the Court in Sanchez 2021.
- Did you find any of the presentations in the LADOT amicus brief surprising?
- Was the Court’s dismissal in the LADOT case (Sanchez) surprising?
- How do the arguments of the privacy experts in the Alabama amicus brief match up against the concerns we’ve found in previous discussions?
- The seriousness of the reconstruction attack that can be carried out on the 2010 publications is a frequent point of contention in the discourse. Where do we land on that?
- The costs and benefits of privacy are not distributed equally across the population. Minorities and outliers with respect to general patterns in the data may pay more in terms of accuracy for privacy, but may simultaneously be exactly the people who benefit most from the protections of privacy-preserving techniques. How do we consider that tension, and how do we meaningfully engage with stakeholders on this topic?
- Understanding the 2020 Census Disclosure Avoidance System: Simulated Reconstruction-Abetted Re-identification Attack on the 2010 Census by Michael Hawes, U.S. Census Bureau presentation, 2021.
- What Is the Right to Privacy? by Andrei Marmor, USC Law and Legal Studies Papers, 2014.