Some thoughts inspired by Lee Cronbach (1975), “Beyond the two disciplines of scientific psychology”

I happened to come across this article today. It’s hardly obscure—it has over 3000 citations, according to Google Scholar—but it was new to me.

It’s a wonderful article. You should read it right away.

OK, click on the above link and read the article.

Done? OK, then read on.

You know that saying, that every good idea in statistics was published fifty years earlier in psychometrics? That’s what’s happening here. Cronbach talks about the importance of interactions, the difficulty of estimating them from data, and the way in which researchers manage to find what they’re looking for, even in settings where the data are too weak to really show such patterns. He even talks about the piranha problem in the context of “Aptitude x Treatment interactions.”

In a world where researchers are babbling on about so-called moderators and mediators as if they know what they’re doing, Cronbach is a voice of sanity.

And this was all fifty years ago! All this sounds a lot like Meehl, and Meehl is great, but Cronbach adds value by giving lots of specific applied examples.

In the article, Cronbach makes a clear connection between interactions and the replication crisis arising from researcher degrees of freedom, a point that I rediscovered—40 years later—in my paper on the connection between varying treatment effects and the crisis of unreplicable research. Too bad I hadn’t been aware of this work earlier.

Hmmm . . . let me check the Simmons, Nelson, and Simonsohn (2011) article that introduced the world to the useful term “researcher degrees of freedom”: Do they cite Cronbach? No. Interesting that even psychology researchers were unaware of that important work in psychometrics. I’m not slamming Simmons et al.—I hadn’t known about Cronbach either!—I’m just noting that, even within psychology, his work was not so well known.

Going through the papers that referred to Cronbach (1975), I came across this book chapter from Denny Borsboom, Rogier A. Kievit, Daniel Cervone and S. Brian Hood, which begins:

Anybody who has some familiarity with the research literature in scientific psychology has probably thought, at one time or another, ‘Well, all these means and correlations are very interesting, but what do they have to do with me, as an individual person?’. The question, innocuous as it may seem, is a deep and complicated one. In contrast to the natural sciences, where researchers can safely assume that, say, all electrons are exchangeable save properties such as location and momentum, people differ from each other. . . .

The problem permeates virtually every subdiscipline of psychology, and in fact may be one of the reasons that progress in psychology has been limited.

They continue:

Given the magnitude of the problems involved in constructing person-specific theories and models, let alone in testing them, it is not surprising that scholars have sought to integrate inter-individual differences and intra-individual dynamics in a systematic way. . . .

The call for integration of research traditions dates back at least to Cronbach’s (1957) . . .:

Correlational psychology studies only variance among organisms; experimental psychology studies only variance among treatments. A united discipline will study both of these, but it will also be concerned with the otherwise neglected interactions between organismic and treatment variables . . .

Not much has changed in the basic divisions in scientific psychology since Cronbach (1957) wrote his presidential address. True, today we have mediation and moderation analyses, which attempt to integrate inter-individual differences and intra-individual process, and in addition are able to formulate random effects models that to some extent incorporate inter-individual differences in an experimental context; but by and large research designs are characterized by a primary focus on the effects of experimental manipulations or on the structure of associations of inter-individual differences, just as was the case in 1957. . . .

They continue:

In experimental research, the researcher typically hopes to demonstrate the existence of causal effects of experimental manipulations (which typically form the levels of the ‘independent variable’) on a set of properties which are treated as dependent on the manipulations (their levels form the ‘dependent variable’). . . .

One interesting and very general fact about experimental research is that such claims are never literally true. The literal reading of conclusions like Bargh et al., very prevalent among untrained readers of scientific work, is that all participants in the experimental condition were slower than all those in the control condition. But that, of course, is incorrect – otherwise there would be no need for the statistics. . . .

From a statistical perspective, it is commonplace to speak of an average treatment effect. But, when considered from the perspective of understanding human behavior, it’s a big deal that effects typically appear only in the aggregate and not in individuals.

The usual story we tell is that the average treatment effect (which we often simply call “the treatment effect”) is real—indeed, we often model it as constant across people and over time—and then we label deviations from this average as “noise.”

But I’ve increasingly come to the conclusion that we need to think of treatment effects as varying: thus, the difficulty in estimating treatment effects is not merely a problem of “finding a signal in noise” which can be solved by increasing our sample size; rather, it is a fundamental challenge.
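To make that concrete, here is a minimal simulation sketch of a varying treatment effect (the numbers are made up for illustration: a mean effect of 0.2 and a standard deviation of 0.5 across people). A larger sample pins down the average ever more precisely, yet a substantial share of individuals have effects of the opposite sign:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: each person has their own treatment effect,
# drawn from a normal distribution with mean 0.2 and sd 0.5.
n = 10_000
individual_effects = rng.normal(0.2, 0.5, size=n)

# The average effect can be estimated ever more precisely as n grows...
avg_effect = individual_effects.mean()

# ...but many individuals still have effects of the opposite sign.
share_negative = (individual_effects < 0).mean()

print(f"average effect: {avg_effect:.3f}")
print(f"share of people with a negative effect: {share_negative:.1%}")
```

No amount of additional data makes that opposite-sign share shrink toward zero; it is a feature of the population, not sampling noise, which is the sense in which the problem is not just “finding a signal in noise.”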

To use rural analogies, when we’re doing social and behavioral science, we’re not looking for a needle in a haystack; rather, we’re trying to catch a slippery fish that keeps moving.

All this is even harder in political science, economics, or sociology. An essential aspect of social science is that it understands people not in isolation but within groups. Thus, if psychology ultimately requires a different model for each person (or a model that accounts for differences between people), the social sciences require a different model for each configuration of people (or a model that accounts for dependence of outcomes on the configuration).

To put it another way, if any theory of psychology implies 7,700,000,000 theories (corresponding to the population of the world today, and for now ignoring models of people who are no longer alive), then political science, economics, etc. imply 2^7,700,000,000 – 1 theories (corresponding to all possible subsets of the population, excluding the empty set, for which no social science is necessary). That’s an extreme statement—obviously we work with much simpler theories that merely have implications for each individual or each subset of the population—but the point is that such theories are either explicit or implicit in any model of social science that is intended to have general application.
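For what it’s worth, that count is just the number of nonempty subsets of an n-person population, 2^n − 1. A toy calculation (the function name here is mine, purely for illustration) shows how fast it explodes even for tiny groups:

```python
def nonempty_subsets(n: int) -> int:
    """Number of nonempty subsets of an n-person population: 2**n - 1."""
    return 2 ** n - 1

# Even for small n the count explodes; for n = 7,700,000,000 it is
# astronomically large -- the point is the combinatorial explosion,
# not the exact figure.
for n in [1, 2, 3, 10, 50]:
    print(n, nonempty_subsets(n))
```

This is only a counting exercise, but it makes vivid why any social-science theory with general application carries, at least implicitly, commitments about every possible configuration of people.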