I was talking with some people the other day about bad regression discontinuity analyses (see this paper for some statistical background on the problems with these inferences), examples where the fitted model just makes no sense.
The people talking with me asked: OK, we agree that the published analysis was no good. What would I have done instead? My response was that I’d consider the problem as a natural experiment: a certain policy was implemented in some cities and not others, so compare the outcome (in this case, life expectancy) in exposed and unexposed cities, and then adjust for differences between the two groups. One complication here is the discontinuity—the policy was implemented north of the river but not south—but this sort of thing arises in many natural experiments. You have to model things in some way, make some assumptions; there’s no way around it. From this perspective, though, the key is that this “forcing variable” is just one of the many ways in which the exposed and unexposed cities can differ.
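To make the proposed analysis concrete, here is a minimal sketch of what “adjust for differences between the two groups” could look like: a regression of the outcome on an exposure indicator plus city-level pre-treatment covariates, with the forcing variable (signed distance from the river) included as just one covariate among others. All data below are simulated and all variable names are hypothetical; this is an illustration of the general strategy, not the actual study’s data or model.

```python
import numpy as np

# Simulated, hypothetical city-level data (nothing here comes from the
# study under discussion).
rng = np.random.default_rng(0)
n = 200                                    # hypothetical number of cities
dist = rng.normal(0, 1, n)                 # signed distance from the river
exposed = (dist > 0).astype(float)         # policy applied north of the river
covar = rng.normal(0, 1, n)                # another pre-treatment covariate
true_effect = 0.5                          # simulated policy effect, in years

# Simulated life expectancy: baseline + policy effect + covariate effects + noise.
life_exp = (70 + true_effect * exposed + 0.3 * dist + 0.8 * covar
            + rng.normal(0, 0.5, n))

# Adjusted comparison: regress the outcome on exposure while controlling
# for the forcing variable and the other covariate.
X = np.column_stack([np.ones(n), exposed, dist, covar])
beta, *_ = np.linalg.lstsq(X, life_exp, rcond=None)
print(f"adjusted exposure effect: {beta[1]:.2f}")
```

The point of the sketch is that the forcing variable enters the adjustment the same way any other between-city difference would; a fuller analysis would also worry about functional form, other confounders, and uncertainty in the estimate.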
After I described this possible plan of analysis, the people talking with me agreed that it was reasonable, but they argued that such an analysis could never have been published in a top journal. They argued that the apparently clean causal identification of the regression discontinuity analysis made the result publishable in a way that a straightforward observational study would not be.
Maybe they’re right.
If so, that’s really frustrating. We’ve talked a lot about researchers’ incentives to find statistical significance, to hype their claims, and to not back down from error, etc., as well as flat-out ignorance, as in the above example of researchers naively thinking that some statistical trick can solve their data problems. But this latest thing is worse: the idea that a better analysis would have a lower chance of being published in a top journal, for the very reasons that make it better. Talk about counterfactuals and perverse incentives. How horrible.