BMJ update: authors reply to our concerns (but I’m not persuaded)

Last week we discussed an article in the British Medical Journal that seemed seriously flawed to me, based on evidence such as the above graph.

At the suggestion of Elizabeth Loder, I submitted a comment to the paper on the BMJ website. Here’s what I wrote:

I am concerned that the model does not fit the data and also that the jumps in the fitted model do not make sense, considering the blurring in the underlying process. I was alerted to this by someone who pointed out the graph for Canada on page 84 of the supplementary material. The disconnect between the fit and the data gives me skepticism about the larger claims in the article.

I also have concerns about the data, similar to those raised by the peer reviewers regarding selection effects in who gets tested.

There were a couple of other comments, and the authors of the article wrote a response. Here’s the part where they reply to my comment:

Zadey and Gelman also raised concerns about the model fit and model parameters. Our interrupted time series model allows for both a change in slope (incidence rate) and a change in level at the time of intervention, the latter of which can therefore look like a “jump” in the fitted line on the incidence graphs (supplementary appendix). We agree with Zadey that change in level may also be relevant in other contexts, but here we chose to focus on change in the slope because we were most interested in the effect over the full post-intervention period examined, and did not anticipate that the intervention would have an immediate effect (i.e. a sudden jump in level) in most countries.

We are of course well aware that the model fits better in some countries than others. A different model may have fit the data better in Canada, for example, but not necessarily in other countries. Using different models in different countries would have precluded the ability to perform a meta-analysis of results across countries.
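To make "change in level" and "change in slope" concrete, here is a minimal sketch of a generic interrupted time series (segmented regression) fit. It is not the authors' code and probably not their exact model: it assumes plain least squares, and the data, the function name `fit_its`, and the coefficient labels are all made up for illustration. The first series has a kink but no jump; the second is a smooth curve with no break at all.

```python
import numpy as np

def fit_its(t, y, t0):
    """Least-squares segmented regression with a break at time t0.

    Assumed model (illustrative, not taken from the paper):
        y = b0 + b1*t + b2*post + b3*post*(t - t0) + error
    where post = 1 for t >= t0. Here b2 is the change in level
    (the "jump") and b3 is the change in slope.
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    post = (t >= t0).astype(float)
    X = np.column_stack([np.ones_like(t), t, post, post * (t - t0)])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    names = ["intercept", "pre_slope", "level_change", "slope_change"]
    return dict(zip(names, coef))

# Synthetic data only; none of this comes from the article.
t = np.arange(40)
t0 = 20  # hypothetical intervention time

# Series 1: the slope changes at t0 but there is no jump (noise-free).
y_kink = 1.0 + 0.5 * t + np.where(t >= t0, -0.3 * (t - t0), 0.0)
kink_fit = fit_its(t, y_kink, t0)

# Series 2: a smooth, gradually flattening curve with no break anywhere.
y_smooth = np.log1p(t)
smooth_fit = fit_its(t, y_smooth, t0)

print("kink:  ", kink_fit)
print("smooth:", smooth_fit)
```

On the first series the fit recovers a level change of essentially zero and the slope change that was built in. On the second, the same model reports a nonzero level change even though the underlying curve has no discontinuity: the jump comes from forcing two straight lines onto a smooth curve.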

I’m not convinced by this response. The model fits terribly for Canada. It makes no sense for Canada. Why should I believe it for other countries? And why should I believe any aggregate results that include Canada? If constructing models that made sense and fit the data would’ve precluded the ability to perform a meta-analysis, then maybe the meta-analysis wasn’t such a good idea!

The idea of taking a hundred analyses of uncertain quality and then throwing them together into a meta-analysis . . . I don’t think that makes sense.

In any case, the links are above, so you can read the article, the referee reports (which I think missed some key issues, in keeping with our theme about the problems with peer review), and all the external comments.