Sometimes I start on a post but abandon it, usually because it no longer seems interesting or gets too complicated. One of these times was in August, about an article that was part of the New York Times 1619 project. I was reminded of it by a letter to the Times criticizing some aspects of the project, and after digging up my analysis, I decided it was worth writing about, although it is kind of complicated.
The article, by Linda Villarosa, was called "How false beliefs in physical racial differences still live in medicine today." Specifically, it was about the belief that blacks don't feel as much pain as whites. It started with an account of a 19th-century physician who believed that blacks had thicker skin and conducted brutal experiments to try to find evidence for his hypothesis. Then it moved to the present and cited a review of studies that concluded that "black and Hispanic people .... received inadequate pain management compared with white
Then came the part that caught my attention. It said that a 2016 survey found that "when asked to imagine how much pain white or black patients experienced in hypothetical situations, the medical students and residents insisted that black people felt less pain." I was curious about how big the differences were, so I read the paper. It described a study that gave each medical student a pair of hypothetical cases, one black and one white, and asked them to rate how much pain the patient was likely to be feeling and how it should be treated (opioids vs. something weaker). There were several scenarios, and they were rotated so that each respondent got a different one for his or her cases but the total number in each was the same for the black and white examples.
It didn't report the mean pain ratings for hypothetical black and white cases, but showed this figure:
Figure A shows the average pain rating for the black and white case by number of false beliefs about physical differences between the races. Medical students who had a high number of false beliefs rated the white cases as experiencing more pain; medical students who had a low number of false beliefs rated the black cases as experiencing more. High and low were defined relative to the mean, which implied that medical students with average numbers of false beliefs rated the black and white cases about the same.
The authors included their data as a supplement to the article, so I downloaded it and calculated the means. The average rating for the black cases was 7.622, on a scale of 1-10, while the average rating for the white cases was 7.626--that is, almost identical. The study also asked how the different cases should be treated--135 gave the same recommendation for both of their cases, 40 recommended stronger medication for their white case, and 28 for their black case. Since the total distribution of conditions was the same for the black and white cases, this means that in this sample, treatment recommendations differed between the black and white cases. However, the difference was not statistically significant at conventional levels (p is about .14)--that is, the sample difference could easily have come up by chance.
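The post doesn't say which test produced that p-value, so the following is only a sketch of one way to check it. It assumes the 135 respondents who gave the same recommendation for both cases carry no information about direction, and compares the 40 and 28 discordant respondents with a McNemar-style chi-square (no continuity correction) and with an exact two-sided sign test. The function names are made up for illustration.

```python
import math

def mcnemar_p(b, c):
    """Two-sided p-value from the McNemar chi-square (1 df, no continuity
    correction) for discordant pair counts b and c."""
    chi2 = (b - c) ** 2 / (b + c)
    # Survival function of a chi-square with 1 df: P(Z^2 > x) = erfc(sqrt(x / 2))
    return math.erfc(math.sqrt(chi2 / 2))

def exact_sign_test_p(b, c):
    """Exact two-sided binomial (sign) test of b vs. c under a 50/50 null."""
    n = b + c
    k = max(b, c)
    upper_tail = sum(math.comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * upper_tail)

# Counts quoted in the text; respondents with identical recommendations are dropped.
stronger_for_white = 40
stronger_for_black = 28

p_chi2 = mcnemar_p(stronger_for_white, stronger_for_black)
p_exact = exact_sign_test_p(stronger_for_white, stronger_for_black)

print(f"McNemar chi-square p = {p_chi2:.3f}")
print(f"Exact sign test p    = {p_exact:.3f}")
```

Under these assumptions the chi-square version comes out just under .15, close to the figure quoted above, and the exact test is somewhat larger. Either way the difference falls short of conventional significance.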
So you could conclude that, in this sample, there is no evidence that medical students rate the pain of blacks and whites differently, but perhaps some evidence that they treat white pain more aggressively. (If you just went by statistical significance, you would accept the hypothesis that they treat hypothetical black and white cases the same, but a more sensible conclusion would be that you should collect more data). The paper, however, didn't do this. They used pain ratings to predict treatment recommendations for black and white cases, and then looked at the predicted treatment differences for students with high and low numbers of false beliefs: "participants who endorsed more false beliefs (+1 SD) were less accurate in their treatment recommendations for the black target compared with the white target [β = 0.15, SE = 0.06, t(192) = 2.47, P = 0.014]. Conversely, participants who endorsed fewer false beliefs (−1 SD) did not differ in their treatment recommendation accuracy [β = −0.06, SE = 0.06, t(192) = −1.05, P > 0.250]."
The problem with this analysis is that, to quote the title of an article by Andrew Gelman and Hal Stern, "The difference between 'significant' and 'not significant' is not itself statistically significant." That is, if you tested the hypothesis that β had equal magnitude and opposite signs for black and white cases--that treatment recommendations were affected by ratings of pain but not by race--you would not be able to reject it.
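A back-of-the-envelope version of that test can be sketched from the two reported coefficients. The hypothesis of equal magnitude and opposite signs is the same as saying the two β values sum to zero. The big assumption here is that the two estimates are treated as independent, which is not strictly right because they come from the same model. A proper test would need their covariance from the authors' data, so this is a rough illustration rather than a reanalysis.

```python
import math

def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))

# Coefficients and standard errors as quoted from the paper.
beta_high, se_high = 0.15, 0.06    # more false beliefs (+1 SD)
beta_low, se_low = -0.06, 0.06     # fewer false beliefs (-1 SD)

# H0: equal magnitude, opposite signs  <=>  beta_high + beta_low = 0
beta_sum = beta_high + beta_low

# ASSUMPTION: the two estimates are independent, so their variances add.
# The real covariance is unknown here and could move this SE in either direction.
se_sum = math.sqrt(se_high ** 2 + se_low ** 2)

z = beta_sum / se_sum
p = two_sided_p(z)

print(f"sum = {beta_sum:.2f}, SE = {se_sum:.3f}, z = {z:.2f}, p = {p:.2f}")
```

With these numbers z is a little over 1 and p is well above .05, which is consistent with not being able to reject the hypothesis, subject to the independence caveat.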
So to summarize, the statement that "the medical students and residents insisted that black people felt less pain" is false: they rated black and white pain as virtually equal. I don't blame Villarosa for that--the way it was written, I could see how someone would interpret the results that way. I don't really blame the authors either--interaction effects can be confusing. I would blame the journal (PNAS) for (1) not asking the authors to show means for the black and white examples as standard procedure and (2) not getting reviewers who understand interaction effects.
On the more general 1619 project, my thoughts are:
1. Most of the articles I read weren't very convincing
2. but it's a perspective that deserves to be heard
3. and it was published in the NY Times Magazine, which doesn't claim to be a straight news section, but to give perspectives and interpretations