Thursday, May 5, 2016

Polarization at last

The Pew Research Center reports that polarization in opinions has increased, in the sense that there are more people who are uniformly liberal or uniformly conservative, and fewer who are liberal on some issues and conservative on others.  More exactly, they find that there was no change between 1994 and 2004, but an increase between 2004 and 2014.

The General Social Survey provides an alternative source for analysis of this issue, since it also has a lot of repeated questions.  An advantage of the GSS is that it was conducted every year until 1994 and every two years since then, making it possible to track changes more closely.  A disadvantage is that it doesn't repeat exactly the same questions as the surveys in the Pew analysis.  However, I did an analysis based on the following questions*:

1. should a woman be allowed to get a legal abortion if she is married and doesn't want more children
2. sex between two adults of the same sex is always wrong....not wrong at all
3. favor or oppose the death penalty for murder
4. favor or oppose requiring a permit for gun ownership
5. favor or oppose school prayer
6. vote on a (hypothetical) law requiring homeowners to sell to a person of any race
6a. government spending on improving the conditions of blacks
7. should marijuana be legal
8. should the government reduce income differences

Using a scale with items 1-5, 6, 7, and 8, the estimated polarization is:

Using a scale with 1-5, 6a, 7, and 8, it is:

In both, the estimates vary more from year to year before 1994, because the samples were smaller.  Either way, there is a definite increase over the period.  As for the timing, a model with a trend over the whole period fits slightly better than a model with an increase only after 2004.  However, the difference in fit is small; either one is compatible with the data.

The GSS questions go back to the 1970s, some as far back as 1972, so the analysis could be extended to cover a longer period, although they weren't all asked in the same year until 1988, which makes the analysis less straightforward.  In fact, I did an analysis of that about 15 years ago.  I didn't find evidence of a general trend; there were some changes in the association among opinions, but they were complicated, and I never figured out a satisfying interpretation, so eventually I gave up before publishing.  But it seems like reality finally caught up to my hypothesis.

*The numbers on the vertical axis are the eigenvalue of the first principal component.  
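The measure in the footnote can be illustrated with a small sketch.  The equal-correlation setup below is hypothetical (my assumption, not the actual GSS correlations), but the logic is the measure's: when opinions line up more strongly across issues, the correlations among the items rise, and so does the first eigenvalue.  For p items with a common pairwise correlation r, that eigenvalue works out to 1 + (p-1)r.

```python
# Sketch of the polarization measure: the largest eigenvalue of the
# correlation matrix of opinion items.  Pure stdlib; the equal pairwise
# correlation r among p items is a hypothetical stand-in for real data.

def top_eigenvalue(matrix, iters=500):
    """Largest eigenvalue of a symmetric matrix, via power iteration."""
    n = len(matrix)
    v = [1.0] * n
    for _ in range(iters):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = max(abs(x) for x in w)
        v = [x / norm for x in w]
    # Rayleigh quotient gives the eigenvalue for the converged vector
    mv = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
    return sum(mv[i] * v[i] for i in range(n)) / sum(x * x for x in v)

def equicorrelation(p, r):
    """p x p correlation matrix with every off-diagonal entry equal to r."""
    return [[1.0 if i == j else r for j in range(p)] for i in range(p)]

# With 8 items: weakly aligned opinions (r = 0.1) vs. more "polarized"
# alignment (r = 0.3).  The eigenvalue rises from 1.7 toward 3.1.
low = top_eigenvalue(equicorrelation(8, 0.1))    # ~ 1 + 7*0.1 = 1.7
high = top_eigenvalue(equicorrelation(8, 0.3))   # ~ 1 + 7*0.3 = 3.1
print(round(low, 3), round(high, 3))
```

A higher score on the vertical axis of the figures therefore means that knowing someone's answer on one issue tells you more about their answers on the others, which is what "uniformly liberal or uniformly conservative" amounts to.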

Saturday, April 30, 2016


Things have been busy for the last week or two and will be for about another week, but I don't want to disappoint my legions of loyal readers.  CBS News has sometimes asked "In general, who do you think has a better chance of getting ahead in today's society--white people, black people, or do white people and black people have about an equal chance of getting ahead?"  The percent saying black people has been consistently low, about 5%, with no clear changes.  The figure shows the percent saying that white people have a better chance of getting ahead:

It seems to fluctuate without much pattern except for a stretch from October 2008 until about 2010.  Presumably the cause was Barack Obama's election as president (by the second half of October, it seemed pretty clear that he was going to be elected).  Logically, it's not reasonable for the accomplishment of one person to have much effect on views about the chances of blacks and whites in general, but I can understand why it had an impact.  The interesting thing is that the impact doesn't seem to have lasted.

More educated people are more likely to say that whites have a better chance.  For example, in October 2008, 19% of people without a high school diploma and 38% of people with a college degree said that whites had a better chance.  Age (or generation), however, didn't make any clear difference.

[Data from the Roper Center for Public Opinion Research]

Thursday, April 21, 2016

Remembrance of posts future

In 2010, a PSRA/Newsweek poll asked, "Would you personally favor or oppose a law making hiring discrimination based on appearance illegal?"  46% said favor and 49% said oppose.

It also asked, "Some jobs require employees to be the 'face' of the company at retail stores or in sales. Do you think companies should or should not be allowed to hire people based on their looks for some jobs?"  39% said they should be allowed, and 55% said they should not.

I'm posting this to remind myself to include them in a longer post that should appear in the next week or two.

Tuesday, April 12, 2016

Ancient History

In the last couple of months, there has been a lot of discussion of the 1994 crime bill, mostly saying that it started or accelerated a trend to mass incarceration.  I was going to write about that, but several stories in the last few days have made the point I was going to make, which was that the rate of imprisonment started to rise in the mid-1970s, and the most rapid increase occurred before 1994 (see this article and more discussion and statistics here).

So instead I'll go back to the time when it actually started.  In 1973, the Gallup Poll asked:  "The governor of a state has proposed that all sellers of hard drugs such as heroin be given life imprisonment without the possibility of parole.  Do you approve or disapprove of this proposal?"  71% of the people with an opinion said they approved (only 4% had no opinion).

Support was strong among both Republicans (77%) and Democrats (68%).  It was a little lower among college graduates (62%) and younger people (59% among people aged 26 and under).  It was also somewhat lower in New England (57%) and the Middle Atlantic (67%).  The regional differences are interesting, because the governor in question was probably Nelson Rockefeller--New York soon did pass a law that was not quite as draconian as the proposal in the Gallup question, but pretty severe.  There were no clear differences between residents of urban and rural areas.  Finally, there were differences by race:  72% of whites, 66% of black women, and 55% of black men said that they approved (the interaction between race and gender is statistically significant).

Although there were some group differences, and they were all in the direction you'd expect, they were not that large, and a majority of every group I looked at was in favor of the proposal.

[Data from the Roper Center for Public Opinion Research]

Sunday, April 3, 2016

Who and when?

Recently a number of people have claimed that anti-black prejudice is a key factor in support for Donald Trump:  for example, Sean McElwee and Philip Cohen and Matthew Yglesias (thanks to my esteemed friend Robert Biggert for the references).  I'd been skeptical of this in the past, but after looking at the data in the 2016 American National Election Studies Pilot Survey, I have to agree that it's true.  For example, ratings of the relative laziness of blacks and whites predicts support for Trump almost as well as position on how many immigrants should be allowed into the country.

But why is Trump popular now?  Anti-black prejudice doesn't change much from year to year, but this race for the Republican nomination is a lot different from the ones that preceded it (remember how people used to say that the Republicans always chose the person who'd finished second last time?).  Another way to look at it is to consider only people who don't display anti-black prejudice.  In the ANES, Trump gets 49% of the preferences among people who rate blacks as lazier than whites, and only 30% among people who see no difference, but 30% still puts him in first place in that group, well ahead of the combined totals of the next two (Cruz and Rubio).*

Even setting Trump aside, there are some unusual things about this contest.  One is the strong showing of Ben Carson--although he eventually faded, he held up remarkably well considering his lack of experience, weak organization, and generally casual approach to campaigning.  Another is the uniform failure of the "establishment" candidates: Scott Walker, Jeb Bush, Chris Christie, Marco Rubio, and maybe a few others who I'm forgetting now.  Of course, you could say that they were just weak candidates, and I'd agree about some of them, but only a year ago a lot of people were talking about the "deep bench" that the Republicans had.

So although race helps to explain who supports Trump, I don't think it explains why there's a lot of support for Trump (or someone like him) now.  To explain that, I would turn to an issue I've written about before:  negative attitudes towards the government.  The ANES has a number of questions that can be used to form an index of confidence in the federal government since 1952, and 2012 was the lowest ever.  The previous lows were in the 1970s and 1992-6, when there were strong insurgent movements from the right (Reagan's challenge to Gerald Ford in 1976, and surprising showings by Pat Buchanan in 1992 and 1996).  That is, outsiders from the right tend to do well when confidence in "Washington" is low.

Although no one has asked the whole set of questions since 2012, one of them, "How much of the time do you think you can trust the government in Washington to do what is right," has been asked in several surveys, and the downward trend seems to be continuing.  Hopefully the final ANES survey will ask all of them, and my guess is that the overall score will match or surpass the 2012 low.

*This is based on two five-point scales, one asking people to rate blacks and another asking them to rate whites.

Monday, March 28, 2016

Unsolicited advice

This is inspired by a post by Ross Douthat on talk of running a third-party candidate if Donald Trump is the Republican nominee, with the goal of throwing the election into the House of Representatives. This would require winning some states that the Democrats would otherwise have won, which rules out a "movement conservative."  Some people have proposed trying to do it through the Libertarian Party, but Douthat points out that Libertarians have traditionally done better in the Mountain West, and that winning places like Wyoming or Idaho is going to take electoral votes away from Trump, not Clinton or Sanders.  So what's left?  Nobody asked me, but here's my advice.

Since 1980, the third-party candidates who have received the largest share of votes were John Anderson in 1980, Ross Perot in 1992 and 1996, and Ralph Nader in 2000.  Although they represented quite different ideologies, there was a positive correlation between the state-level share of the vote for each one--that is, if one third-party candidate did well in a state, all third-party candidates did well.  Here is a scatterplot of vote for Anderson in 1980 and Perot in 1996, which was the weakest of the correlations.  There is a definite relationship:  for example, Vermont (in the upper right) was a relatively good state for both Anderson and Perot, while Alabama (lower left) was a poor one.

I did a factor analysis to get a score which can be interpreted as disposition to support third-party candidates.  The top-scoring states were Maine, Alaska, Vermont, Rhode Island, Montana, Massachusetts, Minnesota, Connecticut, New Hampshire, and Oregon.  In 2012, those states had a combined total of 56 electoral votes, 50 of which went to Obama.  So if a candidate won these states, there would be a decent chance of preventing the Democrat from getting a majority of the electoral votes.
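The factor-analysis step can be approximated with a back-of-the-envelope version: since the candidates' state-level shares are positively correlated, a state's score on the first factor is close to the average of its standardized shares across elections.  The state labels and vote shares below are hypothetical, for illustration only (not the actual returns):

```python
# Crude version of the factor-score idea: standardize each election's
# state-level third-party shares, then average the z-scores per state
# to get a "disposition to support third-party candidates" score.
# All numbers here are HYPOTHETICAL, chosen only to mimic the pattern
# in the scatterplot (one consistently high state, one consistently low).

from statistics import mean, pstdev

shares = {
    "Vermont-like": [12.0, 14.0, 6.0],
    "Maine-like":   [11.0, 16.0, 5.5],
    "Median state": [7.0,   9.0, 3.0],
    "Alabama-like": [2.0,   4.0, 1.0],
}

def disposition_scores(shares):
    """Average of each state's standardized share across elections."""
    states = list(shares)
    n_elections = len(next(iter(shares.values())))
    z = {s: [] for s in states}
    for j in range(n_elections):
        col = [shares[s][j] for s in states]
        m, sd = mean(col), pstdev(col)
        for s in states:
            z[s].append((shares[s][j] - m) / sd)
    return {s: mean(z[s]) for s in states}

scores = disposition_scores(shares)
# Ordering recovers the scatterplot's pattern: the Vermont-like state
# scores high, the Alabama-like state low.
```

When the shares are all positively correlated, this average-of-z-scores is close to what the first factor of a factor analysis would give, up to scaling.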

But do those states have anything in common?  I think Alaska and Montana are distinctive, but that the New England states plus Minnesota and Oregon share "good government" traditions, the sort of thing that was associated with moderate Republicanism back when moderate Republicans roamed the earth.  So the most promising strategy for people who want to throw the election into the House of Representatives would be to run a moderate Republican, maybe someone like the former Senator from Maine, Olympia Snowe.  If supporting an avowed moderate was too much for them, maybe they could call the candidate a "Reform Conservative."

[Note:  it was surprisingly difficult to get data on vote shares by state in a convenient form.  I finally found spreadsheets going all the way back to 1828, compiled by Stephen Wolf.]

Wednesday, March 23, 2016

Five imaginary surveys

Carl Morris offered three examples of poll results (see this post by Andrew Gelman for a link and discussion). The numbers are those who say they favor the candidate from Party A in a two-person race (no one is undecided).  He adds that "Party A candidates have always gotten between 40% and 60% of the vote and have won about half of the elections."

     15 out of 20          (75% for A)
    115 out of 200         (57.5%)
  1,046 out of 2,000       (52.3%)

The p-value for the hypothesis that exactly half support candidate A is .021 for each example.  But Morris argues that they provide different levels of evidence for the proposition that A is ahead:  strongest for 1,046 out of 2,000, then 115 out of 200, with 15 out of 20 giving the weakest evidence.  For the explanation, read the original paper, but basically his point is that, given the experience of other elections, you should compare the pessimistic hypothesis ("I have just under 50%") to an alternative hypothesis that's consistent with that experience, like "I have more than 50% but less than 60%."

What if the poll showed support from 10,145 out of 20,000 (50.7%)?  The p-value would again be .021, and Morris's approach would show even stronger evidence for the proposition that A was ahead than in the 1,046 out of 2,000 example.  However, a candidate might reasonably find it less encouraging than 1,046 out of 2,000, and describe the results as indicating that the election was "too close to call."  Morris's analysis and the p-value are both based on the assumption that you had a random sample.  But in an election poll, you know that's not quite true--even if you contact a random sample (which is difficult), non-response is almost certainly not completely random.  So in addition to sampling error, there is some extra uncertainty.  It's hard to say exactly how much, but it's safe to say that it's at least 1-2%, even for high-profile races.

What if the poll showed support from 27 out of 30?  The p-value is about .000004, and with a prior distribution like that used by Morris, the posterior probability that candidate A is ahead is very near one.   That is, both agree that this provides stronger evidence than any of the other examples.  But I think that a reasonable candidate would suspect that there was something wrong with the poll:  that there was some kind of mistake or deception.
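The numbers in these examples can be checked directly.  The sketch below computes the exact one-sided binomial p-value for each poll and a Morris-style posterior probability that A is ahead; the uniform prior on [0.40, 0.60] is my stand-in assumption for the "always between 40% and 60%" history, not Morris's exact prior:

```python
# P-values and posterior probabilities for the poll examples.
# p_value: exact one-sided binomial tail under "exactly half support A".
# posterior_ahead: P(share > 1/2 | data) with a uniform prior on
# [0.40, 0.60] (an assumed stand-in for the historical range).

from math import lgamma, log, exp

def p_value(k, n):
    """One-sided exact binomial tail, P(X >= k of n), when the true share is 1/2.
    Computed in logs via lgamma so large n doesn't overflow."""
    log_denom = n * log(2.0)
    return sum(exp(lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1) - log_denom)
               for i in range(k, n + 1))

def posterior_ahead(k, n, lo=0.40, hi=0.60, steps=4000):
    """P(true share > 1/2 | k of n) under a uniform prior on [lo, hi],
    by midpoint integration of the binomial likelihood (log-shifted
    by its maximum to avoid underflow)."""
    grid = [lo + (hi - lo) * (i + 0.5) / steps for i in range(steps)]
    loglik = [k * log(p) + (n - k) * log(1.0 - p) for p in grid]
    m = max(loglik)
    weights = [exp(ll - m) for ll in loglik]
    above = sum(w for p, w in zip(grid, weights) if p > 0.5)
    return above / sum(weights)

for k, n in [(15, 20), (115, 200), (1046, 2000), (10145, 20000), (27, 30)]:
    print(k, n, round(p_value(k, n), 6), round(posterior_ahead(k, n), 3))
```

The first four results all give a p-value near .021 while the posterior separates 15 out of 20 from the larger samples; 27 out of 30 gets a tiny p-value and a posterior near one, which is exactly the combination treated with suspicion above.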

This is not to say that there's any mistake in Morris's analysis, just that things get more complicated as you get closer to the problems of interpreting actual results.  These examples are also relevant to the situation faced by someone asking "does x affect y, after controlling for other relevant factors?" (e.g., does income affect chances of voting for Donald Trump?).   You could divide the range of parameter estimates into four groups:
a.  too small to be of interest
b.  "normal" size
c.  surprisingly large
d.  ridiculously large

People often characterize (a) in terms of "substantive significance," but it can also be seen as parallel to the uncertainty in even well-conducted surveys.  In an observational study, the specification of "other relevant factors" is almost certainly wrong or incomplete, so if you have a very small parameter estimate, it's reasonable to suspect that it would be zero or the opposite sign under some reasonable alternative specifications--in effect, it's "too close to call."  The second, (b), is the common situation in which the variable makes some difference, but not all that much--often it's one of a large number of factors.  Establishing something like that may be an advance in knowledge, but usually isn't very exciting.  A sufficiently large value (c) is different:  it suggests we may have to fundamentally change the way we think about something (as I recall, people said things like that about the LaCour "study" of personal influence).  Then there's (d), which could be a result of mistakes in recording the data, or miscoding, or some gross error in model specification (politeness prevents me from offering examples).  The problem is that the values for (c) and (d) overlap--just as the 27 out of 30 example could indicate that the candidate is going to win by a historically unprecedented margin, or merely that there was some kind of mistake.