Okay, the second week of redistricting summer camp is over and a lot has happened. Our first week of independent project work has highlighted many of the challenges we will face as researchers over the remaining weeks as well as demonstrated proof-of-concept for several promising new areas to look at. The research approach is very divide-and-conquer, so we all split into groups to work on various projects, but I tried to keep a proverbial thumb in each of the proverbial pies, so I can quickly recap what I worked on in each of these.
This was the group I was officially assigned to. If you imagine each congressional district as being built out of geographic cells (like towns, precincts, or counties), then you can extract a graph structure from each district by making a vertex for each cell and connecting two vertices with an edge when their cells are adjacent as geographies. Our goal was to explore some of the properties of these graphs, and we accomplished two major tasks. First, we actually built these graphs for each state and district in the US. Given the, umm, difficulty of working with the spatial data from the Census, this wasn’t as simple as we hoped it would be. Given that last week I promised you some pictures of redistricting in Maine, you can see this graph for the Evergreen State below. Second, we looked at embedding demographic and social data in these graphs and looking for similarities and clusters beyond spatial adjacency. Our thoughts are that if two nearby neighborhoods have extremely similar interests, it may make sense to try to put them into the same district, but a graph using only spatial adjacencies cannot capture this information.
A graph representation of Maine's two congressional districts
Markov Chain Monte Carlo Rebuild a.k.a. RunDMCMC
So there’s this big piece of code made by some researchers which basically takes in a districting plan, makes a bunch of small random changes to it, then spits out the new plan. If you want your districts to be optimized for something, such as compactness, you can ask the code to only give you new plans which are better than your original in terms of that something. Pretty cool, right? Well, the one problem is that the existing code is a single C++ file with 1300 lines of code. This group worked on rewriting this code in Python to be more readable and useable. Since the MCMC process requires running for a very large number of small random changes, code performance is incredibly critical, and this was a fun opportunity to review some graph algorithms and learn about some of the Python libraries’ implementations of them.
If we envision a state as a graph, then drawing districts is just a graph partitioning problem. Easy, right? Well, not really. There are a lot of really basic questions that we (nor anyone else, as far as we can tell) do not know how to answer. If I give you a graph, can you tell me how many ways there are to split it into two connected pieces? If I give you a graph, can you sample uniformly from the set of all cuts which separate it into two connected pieces? Are either of these problems in some class of computational hardness? If you know, please tell me! Otherwise, these are the big questions we’re thinking about and are hoping to answer soon.
Finally, what’s happening next week? I’m working with the Spectral Methods group, which is super exciting because learning spectal graph theory is one of my goals for this summer. In short, spectral approaches look at the eigenvalues of objects and functions, and next week I’ll hopefully feel comfortable enough to write a basic primer to explain how we can use them to work on redistricting. For now, here’s a little teaser: a picture of Florida, with the districts colored according to the largest eigenvalue of the graph’s adjacency matrix.
The largest eigenvalue of the adjacency matrix of each of Florida's districts
Share this post on social media:
Did you like that post?
You can suscribe to the
Zach is a PhD student at the University of Pennsylvania, Department of Computer and Information Science. His research interests include game theory and mechanism design, machine learning, and computing for the social sciences.