Friday, June 21, 2024

The problem is you?, part 3

This is a return to the analysis of the geographical origins of people involved in the Jan 6, 2021 assault on the Capitol. My original point was that if you're predicting the logarithm of the expected number of insurrectionists from a county, you should control for the logarithm of the population of the county, and you would expect the estimate to be near 1.0--that is, if the population of county B is X times as large as the population of county A, then the expected number of insurrectionists from county B will be X times as large as the number from county A. But on further reflection, it seems likely that the number will depend not just on the size of the population, but also on the mix of Trump voters, Biden voters, and everyone else. You'd expect that most of the people involved were Trump supporters, but there could also have been Trump sympathizers who were ineligible to vote, people who were generally against "the system" and voted for minor parties like the Libertarians, and people who were just looking for trouble or had come along with friends. If we had data on how the insurrectionists voted in 2020, we could do separate analyses for each group (e.g., Trump-voting insurrectionists), but we don't. However, we do have data on the votes in each county, so we can estimate a model for the total number of insurrectionists with the logs of the numbers of Trump voters, Biden voters, and others as predictors.
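To make the "estimate near 1.0" point concrete, here is a minimal sketch of what a log-log count model implies. The function names and the intercept value are made up for illustration; only the role of the coefficient on log(population) comes from the discussion above.

```python
import math

def expected_count(population, coef, intercept=-13.0):
    """Expected count under log(E[count]) = intercept + coef * log(population).

    The intercept is an arbitrary illustrative value, not an estimate from this post.
    """
    return math.exp(intercept + coef * math.log(population))

def count_ratio(pop_a, pop_b, coef):
    """Ratio of expected counts for county B relative to county A."""
    return expected_count(pop_b, coef) / expected_count(pop_a, coef)

# With a coefficient of 1.0, a county 5 times as large has 5 times the expected count.
proportional = count_ratio(100_000, 500_000, coef=1.0)

# With a coefficient below 1.0, the expected count grows more slowly than population.
sublinear = count_ratio(100_000, 500_000, coef=0.8)

print(proportional, sublinear)
```

The intercept cancels out of the ratio, so the comparison between counties depends only on the coefficient and on the ratio of the two populations.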

The estimates and standard errors from a negative binomial regression:

Variable         Estimate      (SE)
White decline      .016        (.020)
Mfg decline       -.007        (.005)
% NH white         .011*       (.004)
NCHS              -.161***     (.048)
Distance          -.052        (.065)
Drive             1.001***     (.220)
Drive*Dist       -1.436***     (.431)
log(Biden)        -.042        (.105)
log(Trump)         .632***     (.165)
log(Other)         .409*       (.186)

The first three variables (white population decline, manufacturing employment decline, and percent non-Hispanic white) were considered in the original analysis by Pape, Larson, and Ruby. NCHS is a 6-category classification scheme developed by the National Center for Health Statistics: large central metro, large fringe metro, medium metro, small metro, micropolitan, and non-core. Pape, Larson, and Ruby divided that into two groups, the first three categories vs. the last three, but I treated it as a numerical variable (more or less urban) since that generally produced a better fit. The next three variables are all related: "Drive" is a 0/1 variable for being within driving distance, which I defined as within 700 kilometers of Washington, DC. Distance is measured in hundreds of kilometers, so the estimates imply that distance reduces the number of insurrectionists until you get to about 700 kilometers from Washington, and makes no difference beyond 700 kilometers--that is, the rate is about the same if you're 700 kilometers or 3700 kilometers away. Finally, the number of Biden voters doesn't matter, the number of Trump voters does, and there's some evidence that the number of other people does as well. The fact that the number of Trump voters is an important predictor of the number of insurrectionists might seem like a matter of common sense, but it's contrary to the conclusions of Pape, Larson, and Ruby.
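Because the three vote variables enter as logs, each coefficient can be read as an elasticity. The sketch below is only a reading aid: it plugs the point estimates from the table into that interpretation, ignores the standard errors, and uses names (`COEFS`, `multiplier`) that are mine rather than anything from the original analysis.

```python
# Point estimates copied from the regression table above.
COEFS = {"log(Biden)": -0.042, "log(Trump)": 0.632, "log(Other)": 0.409}

def multiplier(coef, factor=2.0):
    """Multiplicative change in the expected count when the group behind a
    logged predictor is scaled by `factor`, holding everything else fixed."""
    return factor ** coef

# Effect of doubling one group at a time.
doubling = {name: multiplier(b) for name, b in COEFS.items()}
for name, m in doubling.items():
    print(f"doubling the group behind {name}: expected count x {m:.2f}")

# Effect of doubling all three groups at once, i.e. a county twice as
# large with the same mix of Trump voters, Biden voters, and others.
total_elasticity = sum(COEFS.values())
all_doubled = multiplier(total_elasticity)
print(f"sum of coefficients: {total_elasticity:.3f}, expected count x {all_doubled:.2f}")
```

On the point estimates, doubling the number of Trump voters multiplies the expected count by about 1.55, and doubling the "other" group by about 1.33. The three coefficients sum to .999, so doubling a county's size while keeping its mix fixed roughly doubles the expected count, which is in line with the expectation of an estimate near 1.0 for log population.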

Although the change in the method of controlling for population changes some conclusions, it leaves one point unchanged:  insurrectionists tended to come from more urban places (controlling for the other variables).  There's no clear difference in the overall rates--the average rate per million is:

Large central    2.51
Large fringe     3.54
Medium           2.76
Small            2.73
Micropolitan     2.81
Rural            2.53

However, the less urban areas tend to have more Trump voters, so when you adjust for that you would expect them to have a higher rate of insurrectionists.  I can think of a few ideas about why people in urban areas might be more likely to have participated, but don't have a way to test them, so I'll leave it at that.  


PS:  The estimates given above are from a negative binomial regression.  Results from a Poisson regression are almost the same.  I also tried ordinal probit, ordinal logit, and Cox (proportional hazards) regression.  With those, the standard errors were generally larger, but the relative values of the estimates were about the same:  the only notable difference was that in the Cox regression the estimate of log(Other) was near zero and non-significant.  











