“Whether something is statistically significant is itself a very random feature of data, so in this case you’re essentially outsourcing your modeling decision to a random number”

I happened to come across a post of mine that’s not scheduled until next April, and I noticed the above line, which I really liked, so I’m sharing it with you here, right now.

The comment relates to a common procedure in statistics, where researchers decide to exclude potentially important interactions from their models, just because these interactions are not statistically significant.

As I wrote, whether something is statistically significant is itself a very random feature of data, so in this case you’re essentially outsourcing your modeling decision to a random number.

At some level, sure, we know that our decisions won’t be perfect, and any data-based decision can be wrong. But using statistical significance (or any other binary decision rule, whether it be based on a p-value or a Bayes factor or whatever) in this way . . . That’s just an unnecessary addition of noise into your procedure, and it can have real and malign consequences.
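
To make the "random number" point concrete, here is a minimal simulation sketch. Everything in it is an assumption chosen for illustration, not something from the original post: a 2x2 design with 25 observations per cell, a true interaction of 0.5, noise standard deviation 1, and a normal-approximation test at the 5% level. The function names are hypothetical too. The interaction is real in every simulated dataset; the only thing that varies is the noise.

```python
import random
import statistics


def interaction_is_significant(rng, true_interaction=0.5, n_per_cell=25, sigma=1.0):
    """Simulate one 2x2 experiment; return True if the interaction has |z| > 1.96."""
    cells = {}
    for a in (0, 1):
        for b in (0, 1):
            # Assumed data-generating model: two main effects plus an interaction.
            mean = 0.5 * a + 0.5 * b + true_interaction * a * b
            cells[(a, b)] = [rng.gauss(mean, sigma) for _ in range(n_per_cell)]

    m = {key: statistics.fmean(values) for key, values in cells.items()}
    # Interaction estimate: difference of differences of the four cell means.
    estimate = (m[(1, 1)] - m[(1, 0)]) - (m[(0, 1)] - m[(0, 0)])

    # Pooled within-cell variance; the estimate is a sum of four independent
    # cell means, so its variance is 4 * pooled_var / n_per_cell.
    pooled_var = statistics.fmean([statistics.variance(v) for v in cells.values()])
    standard_error = (4 * pooled_var / n_per_cell) ** 0.5

    # Normal approximation to the two-sided 5% test.
    return abs(estimate / standard_error) > 1.96


def fraction_significant(n_sims=2000, seed=0, **kwargs):
    """Fraction of simulated datasets in which the interaction is 'significant'."""
    rng = random.Random(seed)
    hits = sum(interaction_is_significant(rng, **kwargs) for _ in range(n_sims))
    return hits / n_sims


if __name__ == "__main__":
    print("true interaction 0.5:", fraction_significant())
    print("true interaction 0.0:", fraction_significant(true_interaction=0.0))
```

Under these assumed settings, the test flags the interaction in only a minority of the simulated datasets, even though it is present in all of them. A researcher who keeps the interaction only when it is significant would be dropping a real effect most of the time, and which datasets lead to that decision is determined by the noise alone.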