Wednesday, June 3, 2020

Geography of police shootings: not much change

In 2016, I had several posts on fatal police shootings.  Unfortunately, the issue of killing by police has come up again, so I went back to the data, which are maintained by the Washington Post and available here.  Of course, they cover only fatal shootings, but there doesn't seem to be any systematic collection of data on other kinds of deaths at the hands of police, and presumably the great majority of deaths are by shooting.

There is no trend in the annual totals:

2015    994
2016    962
2017    986
2018    992
2019   1004
2020    400

When I downloaded the data a few days ago, almost exactly 40% of 2020 had passed, so that projects to just about 1000 in 2020.
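
The projection is just proportional scaling.  A minimal sketch, assuming shootings occur at a constant rate through the year (the function name is mine, not something from the Post's data):

```python
def project_annual_total(count_so_far, fraction_elapsed):
    """Scale a year-to-date count to a full year, assuming a constant rate."""
    if not 0 < fraction_elapsed <= 1:
        raise ValueError("fraction_elapsed must be in (0, 1]")
    return count_so_far / fraction_elapsed

# Annual totals from the table above (2020 is a partial year).
full_years = {2015: 994, 2016: 962, 2017: 986, 2018: 992, 2019: 1004}

projected_2020 = project_annual_total(400, 0.40)
mean_2015_2019 = sum(full_years.values()) / len(full_years)

print(round(projected_2020))     # 1000
print(round(mean_2015_2019, 1))  # 987.6
```

The projected figure is close to the 2015-2019 average, which is the sense in which there is no trend.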

In 2016, I computed the rates of fatal shootings for the 100 largest cities and noted that there were large differences among them.  That is still true--the highest rates are in St. Louis, Las Vegas, Kansas City, Miami, and Orlando.  The lowest, going from low to high:  Irvine, New York, Greensboro, Plano, Chula Vista.  If we restrict it to cities with a population of over 500,000, the lowest are New York, Nashville, San Diego, Philadelphia, and Boston.  Annual rates per 100,000 ranged from about 0.07 to 1.9.  New York City had a total of 37, while St. Louis, with about 1/25 of New York's population, had 30.  I give the whole list at the end of this post.
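
For readers who want to check the arithmetic, here is a rough sketch of how an annual rate per 100,000 can be computed.  The populations and the 5.4-year span (January 2015 to about the end of May 2020) are my own round-number assumptions, not the figures used for the list below, so the results only approximate the listed rates.

```python
def annual_rate_per_100k(deaths, population, years):
    """Deaths per 100,000 residents per year."""
    return deaths / (population * years) * 100_000

YEARS_COVERED = 5.4  # assumption: Jan 2015 through roughly May 2020

# Death counts are from the post; populations are rough assumptions.
new_york = annual_rate_per_100k(deaths=37, population=8_400_000, years=YEARS_COVERED)
st_louis = annual_rate_per_100k(deaths=30, population=300_000, years=YEARS_COVERED)

print(f"New York:  {new_york:.3f}")
print(f"St. Louis: {st_louis:.3f}")
print(f"Ratio:     {st_louis / new_york:.0f}x")
```

With these assumed inputs the two rates come out near the listed values of 0.080 and 1.877, and the ratio between them is a bit over 20.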

My initial idea was to look for changes in the rates, but there wasn't much to see there.  That was partly because the numbers are small in most cities, so it's hard to distinguish possible trends from random variation.  There are three cities in which there is moderately strong evidence of a decline--Indianapolis, San Francisco, and Oakland--and none in which there's clear evidence of an increase.  However, as mentioned above, the national totals have hardly changed.
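
To see why small counts make trends hard to detect, here is a sketch that treats yearly counts as Poisson.  That is my assumption for illustration, not something stated in the data.  Under it, the standard deviation of a count is the square root of its expected value, so the chance variation is large relative to the mean when the mean is small.

```python
import math

def poisson_relative_sd(expected_count):
    """Standard deviation of a Poisson count as a fraction of its mean."""
    return math.sqrt(expected_count) / expected_count

# Illustrative expected counts (hypothetical city, approximate national total).
for label, expected in [("city averaging 5 a year", 5), ("national total", 1000)]:
    sd = math.sqrt(expected)
    share = poisson_relative_sd(expected)
    print(f"{label}: sd about {sd:.1f} ({share:.0%} of the mean)")
```

A city averaging five shootings a year would swing by two or three from year to year by chance alone, while the national total would vary by only a few percent, which is about what the annual totals show.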

People often speak of states and cities as "laboratories of democracy"--the idea is that they can try out different approaches and adopt the ones that work best.  Organizational change is difficult, but it is possible:  according to a NY Times story, "In 1972, New York officers fired 2,510 bullets and killed 66 people. By 2014, there were 288 shots fired and eight people killed."  You might think that after police killings began to get more attention, public officials in places like St. Louis would start seeing if they could emulate places like New York or Nashville.  But that doesn't seem to have happened.  Why not?  I would guess that it's because people who are interested in politics have become increasingly focused on the national level.  Moreover, most people who are engaged in politics are middle class, so when they pay attention to local issues, it's to things that are of more concern to middle class people, like schools.

PS:  There are some data on police department policies involving the use of force in the 100 largest cities.  It would be interesting to see whether they are related to the rate of deaths.

Rate of police killings (annual, per 100,000), 100 largest cities:

1            Irvine       0.072
2          New York       0.080
3        Greensboro       0.130
4             Plano       0.131
5       Chula Vista       0.139
6       Jersey City       0.140
7           Lubbock       0.149
8        Chesapeake       0.157
9         Lexington       0.177
10        Nashville       0.198
11        San Diego       0.199
12     Philadelphia       0.201
13           Boston       0.222
14    Winston-Salem       0.230
15          Hialeah       0.234
16   Virginia Beach       0.245
17          Detroit       0.246
18          Raleigh       0.246
19           Dallas       0.256
20          Chicago       0.259

21       Fort Wayne       0.285
22   Corpus Christi       0.286
23          Buffalo       0.287
24   St. Petersburg       0.288
25           Laredo       0.290
26          Gilbert       0.299
27       Pittsburgh       0.304
28      Minneapolis       0.315
29    San Francisco       0.321
30           Toledo       0.331
31          Lincoln       0.334
32        Cleveland       0.334
33          Seattle       0.352
34         Chandler       0.355
35        Charlotte       0.358
36         San Jose       0.361
37     Indianapolis       0.369
38          Madison       0.372
39       Fort Worth       0.378
40          El Paso       0.381

41      Los Angeles       0.387
42  North Las Vegas       0.394
43          Oakland       0.398
44          Fremont       0.399
45        Riverside       0.402
46       Washington       0.413
47      New Orleans       0.428
48        Milwaukee       0.432
49       Cincinnati       0.434
50          Houston       0.452
51          Memphis       0.452
52       Sacramento       0.453
53            Omaha       0.459
54           Irving       0.470
55           Durham       0.503
56          Wichita       0.522
57          Norfolk       0.526
58         Portland       0.527
59          Anaheim       0.528
60      San Antonio       0.542

61          Garland       0.547
62       Scottsdale       0.547
63            Tampa       0.552
64         St. Paul       0.554
65        Anchorage       0.558
66        Baltimore       0.566
67           Austin       0.576
68        Henderson       0.583
69           Newark       0.591
70        Santa Ana       0.607
71           Fresno       0.641
72    Colo. Springs       0.649
73         Columbus       0.654
74            Boise       0.679
75         Honolulu       0.682
76       Louisville       0.692
77       Long Beach       0.703
78     Jacksonville       0.704
79        Arlington       0.716
80         Stockton       0.727

81          Atlanta       0.798
82           Denver       0.814
83             Mesa       0.824
84             Reno       0.844
85           Aurora       0.876
86    Oklahoma City       0.880
87          Phoenix       0.889
88      Baton Rouge       0.891
89         Richmond       0.925
90         Glendale       0.925
91           Tucson       1.010
92      Bakersfield       1.090
93            Tulsa       1.101
94   San Bernardino       1.114
95      Albuquerque       1.159
96          Orlando       1.230
97            Miami       1.260
98      Kansas City       1.286
99        Las Vegas       1.306
100       St. Louis       1.877

Saturday, May 30, 2020

Footnote 3

I located the questions on limiting executive salaries that I mentioned the other day:

March 2009 (Fox News/Opinion Dynamics) Do you think the federal government should ever be allowed to regulate the salaries of corporate executives at American companies?  38% yes, 56% no

March 2009 (Quinnipiac)  Do you think the government should limit the amount of money that companies not taking federal funds pay their executives?  30% yes, 64% no

Both of those were coupled with questions about pay at companies that were taking money from the government--for those, 64% favored regulation in the Fox poll and 81% favored limits in the Quinnipiac poll.  I think that "regulation" sounds weaker than "limit," so I'm not sure why support for "regulation" of salaries in "companies that take taxpayer bailouts" (their words) was lower in the Fox poll than support for a "limit" in companies that were "taking federal funds" was in the Quinnipiac poll.  I don't know about the order in which the questions were asked in either poll, but that might be the explanation--if people had said no to the general question, they might be less likely to say yes to the specific one.  However, support for the general principle of regulating/limiting salaries was comparable.

There is also:

(Gallup)  Do you favor or oppose the federal government taking steps to limit the pay of executives at major companies?

              Favor   Oppose
June 2009      59%     35%
March 2018     47%     48%

"Taking steps" sounds weaker than "limit" or even "regulation," so I'm not surprised that support was higher.  The drop in support between 2009 and 2018 is interesting. 

The general point is that even though large majorities think executive salaries are too high (I think I've had some posts documenting that), there's much less support for government action to do something about it. 

Thursday, May 28, 2020

Salary limits

In 1939, a Roper/Fortune survey asked:

"Do you think there should be a law limiting the amount of money any individual is allowed to earn in a year?"  24% said yes, 70% no

In 1943, the Office of Public Opinion Research asked "When the war is over, do you think it would be a good idea or a bad idea for us to have a top limit on the amount of money any one person can get in a year?"  40% said yes, and 51% said no.

In 1946, the Opinion Research Corporation asked "Do you think it would be a good thing for the country if the government put a top limit on the salary any man could make?"  They repeated that in 1953, twice in 1955, and 1961 with the following results:
            Yes      No
1946     32%    62%
1953     17%    78%
1955     17%    78%
1955     15%    81%
1961     21%    68%
[1962     21%    68%]*

In 1981, a survey by Civic Services (an organization I hadn't heard of before) asked people about this statement:  "There should be a top limit on incomes so that no one can earn more than $100,000 a year."  20% agreed and 75% disagreed.

In 1994, a survey by Reader's Digest and the Institute for Social Inquiry asked "Should there be a top limit on incomes so that no one can earn more than one million dollars a year?"  22% said yes, and 74% said no.

Oddly, no similar question seems to have been asked since then, despite growing concern with inequality.  But it seems that support was quite steady at about 20%, except for the increase during and right after the war.  I doubt that the exact numbers in the last two questions mattered much, but $100,000 in 1981 would be about $300,000 today, and $1,000,000 in 1994 would be about $1.75 million.
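
The conversion to today's dollars can be sketched with a price index.  The CPI values below are approximate annual averages that I am supplying for illustration (2019 being the last full year before this post); they are assumptions to be checked against the official series, and the function name is hypothetical.

```python
# Approximate CPI-U annual averages (1982-84 = 100).
# ASSUMED values for illustration; verify against the official BLS series.
CPI = {1981: 90.9, 1994: 148.2, 2019: 255.7}

def adjust_for_inflation(amount, from_year, to_year, cpi=CPI):
    """Convert a dollar amount between years using the ratio of price indexes."""
    return amount * cpi[to_year] / cpi[from_year]

limit_1981_today = adjust_for_inflation(100_000, 1981, 2019)
limit_1994_today = adjust_for_inflation(1_000_000, 1994, 2019)

print(f"$100,000 in 1981   -> ${limit_1981_today:,.0f}")
print(f"$1,000,000 in 1994 -> ${limit_1994_today:,.0f}")
```

With these index values the results are a little below $300,000 and a little above $1.7 million, in line with the round figures above.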

There have been a few questions about limiting salaries in particular occupations.  In 1991, a CBS News poll asked "Do you think there should be a limit on how much professional baseball players can earn in a year, or should baseball players be allowed to earn as much as team owners are willing to pay them?"  49% said yes, 47% no.

In 2009 a CBS News poll asked "Do you think the federal government should put a limit on the amount of money that senior executives can earn at financial institutions, or do you think this is something the federal government should not be involved in?"  46% said yes, 46% said shouldn't be involved.

I thought there was also one about CEOs in general, but I can't locate it now.

*I think this is just a duplicate listing of the 1961 survey.

[Data from the Roper Center for Public Opinion Research]

Thursday, May 21, 2020

Footnote 2

In a post about the claim of rising "despair" among less educated people, I noted that when asked "Some people say that people get ahead by their own hard work; others say that lucky breaks or help from other people are more important. Which do you think is most important?" people without a college degree had become more likely to say "hard work" while college graduates had become more likely to say "lucky breaks or help."  I interpreted the changes among less educated people as counting against the hypothesis of rising despair, on the grounds that "hard work" was the optimistic answer.  But Tom VanHeuvelen has suggested a different interpretation:  "If you attribute success and failure mostly to your own efforts, well, then what do you make of life when the massive structural factors out of your control . . . all hit you in a span of two decades? It certainly feels like a depressing and mortifying combination of factors."  The implication is that if people think that success is due to hard work, then when things get worse they will blame themselves.  This sounds plausible, and in fact I think some noted sociologist or political scientist said that this was one of the reasons for the relative lack of "class consciousness" in the United States.  On the other hand, my interpretation also seems plausible, at least to me.

To try to see which interpretation fits here, I looked at the correlation between views on the sources of success and happiness.  On the average, people who say that success is due to hard work are happier than those who say it's due to breaks or help.  It seems to me that VanHeuvelen's interpretation suggests that this relationship should differ by social position:  it will be reversed, or at least weaker, among people in lower social positions.  That is, if you think that getting ahead is the result of hard work, and you haven't gotten ahead, then you'll feel bad about yourself.  On the other side, people who have been successful and think that success is the result of hard work will enjoy not just the material benefits of success, but greater self-esteem. 

Breaking it down by education:

Not HS grad           .11
HS grad               .09
College grad          .09

By occupational prestige:

Low (1-30)            .09
Medium low (30-40)    .05
Medium (40-50)        .05
Medium high (50-60)   .09
High (60+)            .06

There is no apparent pattern in the group differences, and they are small enough to be ascribed to sampling variation.  That is, it seems like the relationship is pretty much the same at all social levels:  people who think that getting ahead is the result of hard work are happier, even if by conventional standards they have not gotten ahead themselves.  It could be that people who think that getting ahead is due to hard work believe that they will get ahead in the future.  Or people may be able to see themselves as having done pretty well by their own standards--compared to some other people they've known, or considering the problems they've faced. 

I restricted this analysis to whites, since the "despair" arguments focus on them.  But I thought a comparison of races might be interesting:

White    .09
Black    .03
Other    .07

The black-white difference is statistically significant.  I don't have an interpretation for it, but it seems worth thinking about.  As for the relationship between race and views on getting ahead, blacks are a bit more likely to say lucky breaks or help.
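
One common way to judge whether two correlations differ by more than sampling variation is Fisher's z transformation.  The sketch below is only an illustration of that general technique, not necessarily the test used here, and the sample sizes are hypothetical placeholders, since the actual group sizes are not given in the post.

```python
import math

def fisher_z_difference(r1, n1, r2, n2):
    """z statistic and two-sided p-value for the difference of two independent correlations."""
    z1, z2 = math.atanh(r1), math.atanh(r2)          # Fisher transformation
    se = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))      # standard error of z1 - z2
    z = (z1 - z2) / se
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))   # normal approximation
    return z, p_two_sided

# Correlations are from the post; the sample sizes are HYPOTHETICAL.
z, p = fisher_z_difference(r1=0.09, n1=20_000, r2=0.03, n2=3_000)
print(f"z = {z:.2f}, two-sided p = {p:.4f}")
```

Whether a gap of .06 counts as significant depends heavily on the group sizes, which is also why the differences among the education and prestige groups above, each a smaller subsample, can plausibly be ascribed to sampling variation.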

Monday, May 18, 2020


The New Yorker has an article called "How Greenwich Republicans Learned to Love Trump."  It deals with an important issue--how Trump has maintained a high level of support among Republicans, even the kind of Republicans who once had doubts about him--but there's another important issue that it just mentions in passing, which is that there aren't as many Greenwich Republicans as there used to be.  Greenwich was once solidly Republican--Lyndon Johnson won in 1964, but no other Democrat even broke 40% until Bill Clinton in 1996.  But Barack Obama won the town in 2008, and Hillary Clinton won it by a larger margin in 2016.  I calculated the difference between the Democratic margin in Greenwich and the Democratic margin in the national vote.  For example, in 1948 Truman got 29.8% of the vote in Greenwich and Dewey got 68.9%; in the nation, Truman got 49.6% and Dewey got 45.1%.  The difference is (29.8-68.9)-(49.6-45.1)= -43.6.  The difference from 1948 to 2016:

The shift towards the Democrats is pretty steady.  There are a few unusual elections.  One is 1964, when a lot of places broke from their traditional voting patterns.  Another is 2012, when the Democratic vote in Greenwich fell off sharply from 2008 (53.4% to 43.9%).  My guess is that was because of  financial regulation and other measures that Obama took to deal with the recession--although a lot of people on the left saw Obama as a "neoliberal" who was serving the interests of finance, the finance industry didn't see him that way.  Finally, there was 2016, when Clinton was ahead of the trend.  But those are all secondary--the big story is just the general movement towards the Democrats. 
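
For anyone who wants to reproduce the measure, the Greenwich-minus-national margin from the 1948 example can be written as a small function (the name is mine):

```python
def margin_gap(local_dem, local_rep, national_dem, national_rep):
    """Local Democratic margin minus national Democratic margin, in percentage points."""
    return (local_dem - local_rep) - (national_dem - national_rep)

# 1948: Truman vs. Dewey, using the percentages quoted in the post.
gap_1948 = margin_gap(local_dem=29.8, local_rep=68.9,
                      national_dem=49.6, national_rep=45.1)
print(round(gap_1948, 1))  # -43.6
```

A negative value means the town was more Republican than the nation; a shift towards the Democrats shows up as the value rising towards zero and beyond.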

I also remembered a post from last year, where I predicted the 2016 vote in Connecticut towns from two variables--population density and the ratio of mean to median household income.  I was primarily concerned with the urban/rural differences, and just found the mean/median relationship by experimentation.  But when I thought about it again, I decided I should control for racial composition as well.  That reduces the estimated effect of population density, but leaves the estimated effect of mean/median income almost unchanged.  In a post from earlier this year, I found that income inequality, particularly at the top end, is associated with more Democratic support at the county level.  That is the case among Connecticut towns as well.  Greenwich is an outlier here--it has the second-highest mean/median ratio among Connecticut towns, so it's predicted to be near the top in Democratic support, but is actually in the middle.

Saturday, May 16, 2020

What were they thinking?

On May 5, the Council of Economic Advisers sent out a tweet that got a lot of attention:

" To better visualize observed data, we also continually update a curve-fitting exercise to summarize COVID-19's observed trajectory. Particularly with irregular data, curve fitting can improve data visualization. As shown, IHME's mortality curves have matched the data fairly well."


The "cubic fit" was widely ridiculed at the time.  Jason Furman, who was chair of the Council under Obama, said that it "might be the lowest point in the 74 year history of the council,"  and Paul Krugman wrote a column suggesting that Kevin Hassett, who developed the cubic fit model, had a history "of both being wrong at every important juncture and refusing to admit or learn from mistakes."

The chair of the council, Tomas Philipson, replied "past CEA Chair Furman (and economist turned political hack Krugman) not understanding the difference between data smoothing and model-based forecasting. Furman only chair without peer-reviewed scientific work and academic appointments-it shows."  He wasn't done.  His next tweet was restrained:  "Kevin Hassett’s work comparing existing model-based forecasts with the emerging data should seem sensible to anyone interested in understanding the future course of the pandemic," but he was soon back on the attack: “comparing things to the data might have helped Furman when he advised the worst economic recovery in history.”

The cubic fit got renewed attention yesterday when a number of people pointed out that it suggested we should be down around zero deaths now, which obviously we aren't (1491 on May 15, according to data published in the Washington Post).  But suppose we give the Council the benefit of the doubt and assume that they never intended the cubic model to be a forecast.  What would be the point of doing it, then?  I think the clue is in the last sentence, "IHME's mortality curves have matched the data fairly well."  There are three different curves, one from March 27, one from April 5, and one from May 4.  The first two are similar, showing a peak in mid-April and then a rapid decline.  The second has a higher peak, but a more rapid decline.  It predicts about 150-200 deaths a day in mid-May.  The first one predicts about 500 in mid-May, and both predict near zero by late June.  The last IHME projection is quite different, and more pessimistic, predicting about 1,500 in mid-May, and about 500 in late June.

So my guess is that the Council was concerned about the May 4 projections--they wanted to believe that the March and April ones were more reliable.  And by this point, a pattern of sharp rises and falls had developed, with lower rates on Saturday and Sunday.  Just looking at different projections compared to the data, it wasn't obvious which one was better--for example, the May prediction was right on target for late April, but far above the actual numbers on May 4; the March and April projections were far below the actual numbers in late April, but right on target for May 4.   Therefore, it seemed reasonable to smooth the data and see which of the projections fit it best.  Looking at the figure in the tweet, it appears that the "smoothed" data matches the April 5 projection pretty closely.  So I think that the message of the tweet (and presumably what the CEA believed) was that we should accept the two earlier projections, not the latest one--that deaths would decline pretty rapidly.  

I used the Washington Post data on coronavirus deaths to estimate a cubic polynomial with powers of time as independent variables.*  The results are very similar to those in the CEA tweet:


But why a cubic polynomial?  There is no prior reason to assume that will provide the best fit, so you have to consider different orders of polynomials and pick the best one.  The usual practice is to start at the "bottom" and keep adding terms (squared, cubed, fourth, fifth, etc. powers) as long as they are statistically significant.  An alternative, and probably better, approach is to start with some high-degree polynomial and remove the highest powers one by one until the highest remaining one is statistically significant.  I took the conventional approach and added a fourth power, which was statistically significant (t-ratio of 5.9), and then a fifth power, which was not.  So the final ("quartic") model included fourth, third, squared, and linear terms.  The estimates:

The quartic fit is clearly better, and it's more like the May 4 projection in that it has a slower decline from the peak.  In case you're wondering what happens if the quartic model is projected to May 15:

That's why you're not supposed to use polynomial regressions for prediction.

To conclude, it was reasonable to do some smoothing of the data, especially since the daily counts probably represent when the paperwork was filed rather than the actual time of deaths (it's hard to see why actual death rates would be lower on weekends).  Not everyone thinks that polynomial regressions are a good way to do that, but they are a well established method.  But if you're going to use polynomial regressions, you have to make some effort to pick the right order--otherwise, why not just save trouble and fit a linear regression?  So even if the curve-fitting exercise was done for "data visualization" or smoothing rather than prediction, it was done badly. 
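
The bottom-up way of picking the order described above can be sketched as follows.  This is an illustration on synthetic data, not the estimates from this post:  the series, the noise level, and the |t| > 2 cutoff are all assumptions, and the helper names are mine.

```python
import numpy as np

def last_term_t_ratio(x, y, degree):
    """Fit y on 1, x, ..., x**degree by OLS; return the t-ratio of the highest power."""
    X = np.vander(x, degree + 1, increasing=True)     # columns 1, x, x^2, ...
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])    # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)             # coefficient covariance
    return beta[-1] / np.sqrt(cov[-1, -1])

def forward_select_degree(x, y, max_degree=6, t_crit=2.0):
    """Add powers one at a time; stop at the first one that is not significant."""
    chosen = 0
    for degree in range(1, max_degree + 1):
        if abs(last_term_t_ratio(x, y, degree)) < t_crit:
            break
        chosen = degree
    return chosen

# Synthetic series with a true quartic shape plus noise (NOT real data).
rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 75)   # time rescaled to [-1, 1] for numerical stability
y = 1 + 2*x - 3*x**2 + 2*x**3 - 2*x**4 + rng.normal(0, 0.1, x.size)

chosen = forward_select_degree(x, y)
print("selected degree:", chosen)
```

With a series like this the procedure should usually stop at degree 4, although a spuriously significant fifth power is always possible by chance.  Stopping at the cubic without testing the next term is the step that was skipped.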

*Since deaths can't be negative, I estimated the logarithm of deaths and then transformed the predicted values from the regressions.  I started the clock at March 1--it looks like the CEA started a few days before then.
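
A minimal sketch of the log-and-back-transform step in this footnote, again on made-up numbers rather than the Washington Post series.  It assumes every daily count is positive (the logarithm is undefined at zero), and the variable names are mine.

```python
import numpy as np

# Synthetic daily death counts (NOT real data): a rise and fall, always positive.
days = np.arange(1, 66)                           # day 1 = March 1
deaths = 5 + 2000 * np.exp(-((days - 45) / 15) ** 2)

t = days / days.max()                             # rescale time for numerical stability
coefs = np.polyfit(t, np.log(deaths), deg=3)      # cubic in time, fitted to log(deaths)

# Back-transform: exponentiating guarantees positive fitted values.
fitted = np.exp(np.polyval(coefs, t))

# Extrapolating 30 days past the sample is mechanically possible but unreliable.
future_t = np.arange(66, 96) / days.max()
projected = np.exp(np.polyval(coefs, future_t))

print(fitted.min() > 0, np.isfinite(projected).all())
```

Fitting in logs keeps the smoothed curve above zero, but it does nothing to make the extrapolated part trustworthy.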