The second derivative of the time trend on the log scale

Peter Dorman writes:

Have you seen this set of projections? It appears to have gotten around a bit, with citations to match, and IHME Director Christopher Murray is a superstar. (WHO Global Burden of Disease) Anyway, I live in Oregon, and when you compare our forecast to New York State it gets weird: a resource use peak of April 24 for us and already April 8 for NY. This makes zero sense, IMO.

I looked briefly at the methodological appendix. This is a top-down, curve-fitting exercise, not a bottom-up epi model. They fit three parameters on a sigmoid curve, with the apparent result that NY, with its explosion of cases, simply appears to be further up its curve. Or, which amounts to the same thing, the estimate for the asymptotic limit is waaaay underinformed. These aren’t the sort of models I have worked with in the past, so I’m interested in how experienced hands would view it.

I have a few thoughts on this model.

1. Yeah, it’s curve-fitting, no more and no less.

2. If they’re gonna fit a model like this, I’d recommend they just fit it in Stan: the methodological appendix has all sorts of fragile nonlinear-least-squares stuff that we don’t really need any more. (A rough sketch of this sort of fit appears below.)

3. I guess there’s nothing wrong with doing this sort of analysis, as long as it’s clear what the assumptions are. What the method is really doing is using the second derivative of the time trend on the log scale to estimate where we are on the curve. Once that second derivative goes negative, so that the exponential growth is slowing, the model takes this as evidence that the rate of growth on the log scale will rapidly continue toward zero and then go negative.

4. Yeah, what Dorman says: you can’t take the model’s estimate of the asymptotic limit seriously. For example, in that methodological appendix, they say that they use the probit (“ERF”) rather than the logit curve because the probit fits the data better. That’s fine, but there’s no reason to think that the functional form at the beginning of the spread of a disease will match the functional form (or, for that matter, the parameters of the curve) at later stages. It really is the tail wagging the dog.
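To make point 2 concrete, here’s a minimal sketch of the kind of three-parameter ERF fit the appendix describes, done as a plain least-squares fit in Python rather than in Stan, and not the report’s actual procedure. The function name, the synthetic data, and the starting values are all mine, invented purely for illustration:

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.special import erf

def erf_sigmoid(t, p, alpha, beta):
    """Three-parameter ERF curve for cumulative deaths:
    p = asymptotic total, beta = inflection day, alpha = steepness."""
    return 0.5 * p * (1.0 + erf(alpha * (t - beta)))

# Synthetic data, for illustration only (not real counts from any location):
# an epidemic whose inflection point is still ten days past the last observation.
rng = np.random.default_rng(0)
t = np.arange(30.0)
truth = erf_sigmoid(t, 10_000, 0.08, 40.0)
obs = truth * np.exp(rng.normal(0.0, 0.1, t.size))   # multiplicative noise

popt, _ = curve_fit(erf_sigmoid, t, obs, p0=[2 * obs[-1], 0.1, 35.0], maxfev=20_000)
p_hat, alpha_hat, beta_hat = popt
print(f"fitted asymptote: {p_hat:,.0f} deaths, inflection at day {beta_hat:.1f}")
```

With data only from the early, exponential-looking stretch of the curve, the fitted asymptote p is weakly identified and leans heavily on the assumed ERF shape; refit the same data with a logistic curve and the projected total can move a great deal, which is the tail-wagging-the-dog problem in point 4.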

In summary: what’s relevant here is not the curve-fitting model but rather the data that show a negative second derivative on the log scale—that is, a decreasing rate of increase of deaths. That’s the graph that you want to focus on.
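Concretely, that diagnostic is just the second difference of log cumulative deaths, which anyone can compute from the published counts. A minimal version in Python, with made-up numbers standing in for the real series:

```python
import numpy as np

def log_second_difference(cum_deaths):
    """Second difference of log cumulative deaths: negative values mean the
    exponential growth rate is falling, even while the counts themselves rise."""
    log_d = np.log(np.asarray(cum_deaths, dtype=float))
    growth = np.diff(log_d)    # day-to-day growth rate on the log scale
    return np.diff(growth)     # change in that growth rate

# Made-up cumulative death counts, for illustration only.
example = [10, 18, 32, 55, 90, 140, 205, 285, 375, 470]
print(np.round(log_second_difference(example), 3))
# All negative here: still exponential-looking growth, but decelerating.
```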

Relatedly, Mark Tuttle points to this news article by Joe Mozingo that reports:

Michael Levitt, a Nobel laureate and Stanford biophysicist, began analyzing the number of COVID-19 cases worldwide in January and correctly calculated that China would get through the worst of its coronavirus outbreak long before many health experts had predicted. Now he foresees a similar outcome in the United States and the rest of the world. While many epidemiologists are warning of months, or even years, of massive social disruption and millions of deaths, Levitt says the data simply don’t support such a dire scenario — especially in areas where reasonable social distancing measures are in place. . . .

Here’s what Levitt noticed in China: On Jan. 31, the country had 46 new deaths due to the novel coronavirus, compared with 42 new deaths the day before. Although the number of daily deaths had increased, the rate of that increase had begun to ease off. In his view, the fact that new cases were being identified at a slower rate was more telling than the number of new cases itself. It was an early sign that the trajectory of the outbreak had shifted. . . .

Three weeks later, Levitt told the China Daily News that the virus’ rate of growth had peaked. He predicted that the total number of confirmed COVID-19 cases in China would end up around 80,000, with about 3,250 deaths. This forecast turned out to be remarkably accurate: As of March 16, China had counted a total of 80,298 cases and 3,245 deaths . . .

Now Levitt, who received the 2013 Nobel Prize in chemistry for developing complex models of chemical systems, is seeing similar turning points in other nations, even those that did not instill the draconian isolation measures that China did.

He analyzed data from 78 countries that reported more than 50 new cases of COVID-19 every day and sees “signs of recovery” in many of them. He’s not focusing on the total number of cases in a country, but on the number of new cases identified every day — and, especially, on the change in that number from one day to the next. . . .
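Spelling out the arithmetic in that comparison: it’s the ratio of each day’s new count to the previous day’s. A one-function sketch, using only the two China figures quoted above:

```python
def growth_factors(daily_new):
    """Day-over-day growth factors for a series of daily new counts."""
    return [today / yesterday for yesterday, today in zip(daily_new, daily_new[1:])]

# The only real numbers here are the two quoted above:
# 42 new deaths in China on Jan. 30, 46 on Jan. 31.
print(growth_factors([42, 46]))  # [1.095...]: new deaths still rising, roughly 10% a day
# Counts keep rising while the factor stays above 1; a factor shrinking steadily
# toward 1 is the early sign of the trajectory shifting, and below 1 the daily
# counts are actually falling.
```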

The news article emphasizes that trends depend on behavior, so they’re not suggesting that people stop with the preventive measures; rather, the argument is that if we continue on the current path, we’ll be ok.

Tuttle writes:

An important but subtle claim here is that the noise in the different sources of data cancels out. To be exact, here’s the relevant paragraph from the article:

Levitt acknowledges that his figures are messy and that the official case counts in many areas are too low because testing is spotty. But even with incomplete data, “a consistent decline means there’s some factor at work that is not just noise in the numbers,” he said. In other words, as long as the reasons for the inaccurate case counts remain the same, it’s still useful to compare them from one day to the next.
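A toy check of that argument, with made-up numbers: if the same (unknown) fraction of true cases is detected every day, the fraction cancels when you compare one day’s count to the next, so a constant undercount leaves the day-to-day trend intact.

```python
# If reported[t] = detection * true[t] with the same detection fraction each day,
# the fraction cancels in the ratio of successive reported counts.
true_new_cases = [100, 150, 210, 260, 290]        # made-up numbers
detection = 0.25                                  # only a quarter of cases get counted
reported = [detection * x for x in true_new_cases]

def ratios(xs):
    return [round(b / a, 3) for a, b in zip(xs, xs[1:])]

print(ratios(true_new_cases))  # [1.5, 1.4, 1.238, 1.115]
print(ratios(reported))        # same ratios: the trend survives the undercount
```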

OK, a few thoughts from me now:

1. I think Mozingo’s news article and Levitt’s analysis are much clearer than that official-looking report with the fancy trend curves. That said, sometimes official-looking reports and made-up curves get the attention, so I guess we need both approaches.

2. The news article overstates the success of Levitt’s method. It says that Levitt predicted 80,000 cases and 3,250 deaths, and what actually happened was 80,298 cases and 3,245 deaths. That’s too close. What I’m saying is, even if Levitt’s model is wonderful, he got lucky. Sports Illustrated predicted the Astros would go 93-69 this year. Forgetting questions about the shortened season etc., if the Astros actually went 97-65 or 89-73, we’d say that the SI prediction was pretty much on the mark. If the Astros actually went 93-69, we wouldn’t say that the SI team had some amazing model; we’d say they had a good model and they also got a bit lucky. (Some quick arithmetic on this appears below.)

3. What to do next? More measurement, at the very least, and also organization for what’s coming next.
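On point 2, here is the quick arithmetic promised above. The prediction-error percentages come straight from the numbers quoted in the article; the Astros part treats 162 games as independent coin flips, which is my simplification, not anything from SI.

```python
# How close Levitt's China numbers were, using the figures quoted in the post.
pred_cases, actual_cases = 80_000, 80_298
pred_deaths, actual_deaths = 3_250, 3_245
print(f"case error: {abs(pred_cases - actual_cases) / actual_cases:.2%}")      # ~0.37%
print(f"death error: {abs(pred_deaths - actual_deaths) / actual_deaths:.2%}")  # ~0.15%

# A team expected to go 93-69 has a win-total standard deviation of about six
# games under a coin-flip model, so hitting the exact record is mostly luck,
# not model quality.
p = 93 / 162
sd_wins = (162 * p * (1 - p)) ** 0.5
print(f"sd of wins: {sd_wins:.1f}")  # ~6.3
```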