“Frequentism-as-model”

Christian Hennig writes:

Most statisticians are aware that probability models interpreted in a frequentist manner are not really true in objective reality, but only idealisations. I [Hennig] argue that this is often ignored when actually applying frequentist methods and interpreting the results, and that keeping up the awareness for the essential difference between reality and models can lead to a more appropriate use and interpretation of frequentist models and methods, called frequentism-as-model. This is elaborated showing connections to existing work, appreciating the special role of i.i.d. models and subject matter knowledge, giving an account of how and under what conditions models that are not true can be useful, giving detailed interpretations of tests and confidence intervals, confronting their implicit compatibility logic with the inverse probability logic of Bayesian inference, reinterpreting the role of model assumptions, appreciating robustness, and the role of “interpretative equivalence” of models. Epistemic (often referred to as Bayesian) probability shares the issue that its models are only idealisations and not really true for modelling reasoning about uncertainty, meaning that it does not have an essential advantage over frequentism, as is often claimed. Bayesian statistics can be combined with frequentism-as-model, leading to what Gelman and Hennig (2017) call “falsificationist Bayes”.

I’m interested in this topic (no surprise given the reference to our joint paper, “Beyond subjective and objective in statistics”).

I’ve long argued that Bayesian statistics is frequentist, in the sense that the prior distribution represents the distribution of parameter values among all problems for which you might apply a particular statistical model. Or, as I put it here, in the context of statistics being “the science of defaults”:

We can understand the true prior by thinking of the set of all problems to which your model might be fit. This is a frequentist interpretation and is based on the idea that statistics is the science of defaults. The true prior is the distribution of underlying parameter values, considering all possible problems for which your particular model (including this prior) will be fit.

Here we are thinking of the statistician as a sort of Turing machine that has assumptions built in, takes data, and performs inference. The only decision this statistician makes is which model to fit to which data (or, for any particular model, which data to fit it to).

We’ll never know what the true prior is in this world, but the point is that it exists, and we can think of any prior that we do use as an approximation to this true distribution of parameter values for the class of problems to which this model will be fit.
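To make the “true prior” idea concrete, here’s a minimal simulation sketch. The normal model, the spread of parameter values, and all the numbers below are my own toy choices, not anything from Christian’s paper or my earlier post. The point it illustrates: if the prior you use matches the distribution of parameter values across the problems the model gets applied to, the posterior intervals are calibrated in the ordinary frequentist sense, averaged over those problems; if the prior is too narrow, they’re not.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
n_problems = 100_000   # "all problems to which this model will be fit"
sigma = 1.0            # known data s.d. within each problem
true_prior_sd = 0.5    # spread of theta across the population of problems

# One parameter and one observation per problem.
theta = rng.normal(0.0, true_prior_sd, n_problems)
y = rng.normal(theta, sigma)

def coverage(prior_sd, level=0.50):
    """Average coverage of central posterior intervals under a N(0, prior_sd^2) prior."""
    post_var = 1.0 / (1.0 / prior_sd**2 + 1.0 / sigma**2)  # conjugate normal-normal posterior
    post_mean = post_var * y / sigma**2
    half = norm.ppf(0.5 + level / 2) * np.sqrt(post_var)
    return np.mean(np.abs(theta - post_mean) < half)

print(coverage(prior_sd=true_prior_sd))  # ~0.50: the prior matches the "true prior"
print(coverage(prior_sd=0.1))            # well below 0.50: an overconfident default prior
```

The coverage here is a frequency over the collection of problems, not over repeated data from a single fixed parameter, which is exactly the sense in which the prior is a frequentist object.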

I like what Christian has to say in his article. I’m not quite sure what to do with it right now, but I think it will be useful going forward when I next want to write about the philosophy of statistics.

Frequentist thinking is important in statistics, for at least four reasons:

1. Many classical frequentist methods continue to be used by practitioners.

2. Much of existing and new statistical theory is frequentist; this is important because new methods are often developed and understood in a frequentist context.

3. Bayesian methods are frequentist too; see above discussion.

4. Frequentist ideas of compatibility remain relevant in many examples. It can be useful to know that a certain simple model is compatible with the data; see the sketch after this list.
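Here’s a toy sketch of that compatibility reading (the die-roll counts and the fair-die model are made up for illustration): the p-value is taken as a measure of how compatible a simple model is with the data, not as the probability that the model is true.

```python
import numpy as np
from scipy.stats import chisquare

observed = np.array([18, 22, 16, 25, 20, 19])  # made-up counts of the six die faces
expected = np.full(6, observed.sum() / 6)      # the simple "fair die" model

stat, p = chisquare(observed, expected)
print(f"chi-square = {stat:.2f}, p = {p:.2f}")  # ~2.50, ~0.78
# The large p-value says counts like these would be unsurprising if the fair-die
# model held, i.e. the model is compatible with the data. It does not say the die
# is exactly fair, and a small p-value would not say how the model fails.
```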

So I’m sure we’ll be talking more about all this.