This one’s important: Bayesian workflow for disease transmission modeling in Stan

Léo Grinsztajn, Elizaveta Semenova, Charles Margossian, and Julien Riou write:

This tutorial shows how to build, fit, and criticize disease transmission models in Stan, and should be useful to researchers interested in modeling the COVID-19 outbreak and doing Bayesian inference. Bayesian modeling provides a principled way to quantify uncertainty and incorporate prior knowledge into the model. What is more, Stan’s main inference engine, Hamiltonian Monte Carlo sampling, is amenable to diagnostics, which means we can verify whether our inference is reliable. Stan is an expressive probabilistic programming language that abstracts the inference and allows users to focus on the modeling. The resulting code is readable and easily extensible, which makes the modeler’s work more transparent and flexible. In this tutorial, we demonstrate with a simple Susceptible-Infected-Recovered (SIR) model how to formulate, fit, and diagnose a compartmental model in Stan. We also introduce more advanced topics which can help practitioners fit sophisticated models; notably, how to use simulations to probe our model and our priors, and computational techniques to scale ODE-based models.

Mathematical models of epidemic spread are affecting policy, and rightly so.

It’s fine to say that policy should be based only on data, not on models, but models are necessary to interpret data. To paraphrase Bill James, the alternative to a good model is not “no model,” it’s a bad model. Tell it to the A/Chairman.

But some influential models have had problems. And there are different models to choose from.

One useful step is to write these models in a common language. We have such a language: it’s called mathematics, and it’s really useful. But there are lots of steps between the mathematical model and the fit to data, and that’s where things often break down. The mathematical models have parameters, sometimes the parameters need constraints, and sometimes when you try to read a paper you get lost in the details of the fitting.
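To make the “parameters with constraints” point concrete, here is a minimal sketch of the basic SIR model written as a plain Python simulation. This is not the tutorial’s Stan code. The function names (`sir_rhs`, `rk4_step`, `simulate_sir`), the fixed-step Runge-Kutta integrator, and all the numbers in the usage note are my assumptions for illustration. It covers only the forward simulation, which is the piece that sits between the math and the fit; the positivity check on `beta` and `gamma` stands in for the kind of constraint you would declare on parameters in a fitting language.

```python
# Illustrative SIR simulation (hypothetical names and values; not the paper's code).
# Equations, with N = S + I + R:
#   dS/dt = -beta * S * I / N
#   dI/dt =  beta * S * I / N - gamma * I
#   dR/dt =  gamma * I


def sir_rhs(state, beta, gamma):
    """Return (dS/dt, dI/dt, dR/dt) for the current (S, I, R)."""
    s, i, r = state
    n = s + i + r
    new_infections = beta * s * i / n
    recoveries = gamma * i
    return (-new_infections, new_infections - recoveries, recoveries)


def rk4_step(state, dt, beta, gamma):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    def shifted(base, slope, h):
        return tuple(b + h * k for b, k in zip(base, slope))

    k1 = sir_rhs(state, beta, gamma)
    k2 = sir_rhs(shifted(state, k1, dt / 2), beta, gamma)
    k3 = sir_rhs(shifted(state, k2, dt / 2), beta, gamma)
    k4 = sir_rhs(shifted(state, k3, dt), beta, gamma)
    return tuple(
        x + dt / 6 * (a + 2 * b + 2 * c + d)
        for x, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate_sir(s0, i0, r0, beta, gamma, days, steps_per_day=10):
    """Return a list of (S, I, R) tuples, one per day, starting at day 0."""
    # Both rates must be positive: the constraint is part of the model.
    if beta <= 0 or gamma <= 0:
        raise ValueError("beta and gamma must be positive")
    dt = 1.0 / steps_per_day
    state = (float(s0), float(i0), float(r0))
    trajectory = [state]
    for _ in range(days):
        for _ in range(steps_per_day):
            state = rk4_step(state, dt, beta, gamma)
        trajectory.append(state)
    return trajectory
```

For example, `simulate_sir(999, 1, 0, beta=0.5, gamma=0.2, days=30)` returns 31 daily states for a made-up population of 1,000. Fitting would then mean comparing something like the simulated infections to observed counts through a likelihood, which is the step the tutorial does in Stan.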

Bayesian inference and Stan are not the only ways of fitting SIR models, but they give us a common language, and they also give flexibility: Once you’ve fit a model, it’s not hard to expand it. That’s important, because model expansion is often a good way to react to criticism.

The above-linked paper by Léo, Liza, Charles, and Julien should be useful for three audiences:

– People who want to fit SIR models and their extensions, and would like to focus on the science and the data analysis rather than have computing and programming be a limiting factor.

– People who have already fit SIR models and their extensions but not in Stan, and who’d like to communicate their models more easily and extend them by adding hierarchical components, etc.

– People who are unfamiliar with these models and would like to learn about them from scratch.