But maybe it’s all OK?
Most of this post is a pretty negative review of a recent book, about which I will apply the well-known saying, “Your manuscript is both good and original; but the part that is good is not original, and the part that is original is not good.”
That said, the part that is good but not original . . . maybe it’s still OK? This comes up with a lot of junk science. I wrote about this idea a while ago in the context of the “pizzagate” research, which was at best too noisy to be useful and at worst fraudulent, and involved dishonesty in any case—but still could arguably have been sending a useful message to people (Eat Small Portions) anyway. Just like if someone wrote a book called, ummmm, I dunno, Business Secrets from the Nation’s Top Astrologers, and all the supposed science was bogus, but it could still be filled with good tips about negotiation, working with your boss, etc etc. Also, as with pizzagate, power pose, and lots of other experiments that didn’t work, the underlying ideas could still be valid in many contexts.
Indeed, although the idea of a “critical positivity ratio” of 2.9013 or 3 or 4 or whatever has no basis whatsoever in either theory or experiment, the more general idea of a “positivity ratio”—the idea that we should have more positive interactions and fewer negative interactions—that seems reasonable enough. And if misrepresentation of science is what it takes to write a self-help book that people will be willing to read, then, hey, who knows what’s the right thing to do?
So that’s my big-picture take. Now let’s talk about the science and how it’s represented.
Now let’s get to it
The book is called The Power of Bad, by John Tierney and Roy Baumeister, and it has endorsements from several prominent psychology professors. What could possibly go wrong here?? It’s a followup to their earlier book, Willpower: Rediscovering the Greatest Human Strength, which I guess must be based on the shaky theory of ego depletion.
Sure, Baumeister has a track record of making bold claims that don’t replicate as well as dabbling in politically minded pseudoscience, Tierney has a track record of falling for such claims, and the sort of prominent psychology professors who would endorse this sort of book have a track record of looking the other way or trying to explain away failures in research methods when they concern bold claims that they want to believe.
And all these behaviors annoy me.
That said, just because people have made mistakes in the past, it doesn’t mean that whatever they’re doing now is wrong. Indeed, at some level you have to give Tierney and Baumeister credit for sticking their necks out and making bold claims, and you have to give the psychology profession credit for supporting this behavior.
I guess what the track record is telling me is, I wouldn’t be inclined to believe what Tierney and Baumeister have to say, given that their last book was based on work that did not replicate, and I wouldn’t be inclined to take the book blurbs so seriously, given the track record of the psychology establishment.
Here’s the first blurb on the website of The Power of Bad:
“The most important book at the borderland of psychology and politics that I have ever read.”—Martin E. P. Seligman, Zellerbach Family Professor of Psychology at the University of Pennsylvania and author of Learned Optimism
I got no beef with the Zellerbach family, but last I heard of Martin E. P. Seligman was a few years ago when Carol Nickerson informed us that he supported the ridiculous and discredited “critical positivity ratio” theory of Fredrickson and Losada and denigrated the efforts of people like Nick Brown who have pointed out problems with that work.
So this made me search The Power of Bad for *positivity ratio*. Bingo! Here it is in the index:
And here’s the first appearance I found in the book:
So far, so good. I have no problem with the above passage. It’s a cute story, an interesting observation, and, sure, it makes a lot of sense for therapists and counselors to keep track of positive and negative events.
Later on, they get more specific:
OK, then they step back a bit:
But no, that’s not enough of a disclaimer. Not at all. They’re regurgitating discredited research, smoothing out the edges (the notorious “2.9013” doesn’t seem to appear anywhere), but they ride the B.S. wave by saying “the positivity ratio should be at least 3 to 1, and preferably a little higher.”
They also have this disclaimer in the notes:
Good of them to cite Nick Brown et al. here, but it seems a bit of a contradiction to say, on one hand, that “it was nonsense based on . . . elementary mathematical errors” and, on the other, that “the complex mathematical analysis was flawed.” It’s a bad sign for this entire subfield how long and tenaciously Fredrickson and others have held on to this “ridiculous . . . nonsense” theory. For example, from a 2014 report:
Dazzled by the ratio’s scientific-seeming implications, Fredrickson, whom Martin Seligman, former president of the American Psychological Association, earlier dubbed “the genius of the positive psychology movement” and who has occasionally blogged for Psychology Today, took her and Losada’s research and turned it into a mass-audience book. The result, Positivity, carried a subtitle that doubtless now makes her cringe: “Top-Notch Research Reveals the 3-to-1 Ratio That Will Change Your Life.”
Kind of a bummer when the genius of the movement writes a whole book relying on nonsense based on elementary mathematical errors!
And, just by the way, I don’t know that it’s accurate that Fredrickson “conceded that the complex mathematical analysis was flawed.” In her article, “Updated thinking about positivity ratios,” written in response to Brown et al., Fredrickson wrote:
I’ve come to see sufficient reason to question the particular mathematical framework Losada and I adopted . . . Whether the Lorenz equations—the nonlinear dynamic model we’d adopted—and the model estimation technique that Losada utilized can be fruitfully applied to understanding the impact of particular positivity ratios merits renewed and rigorous inquiry . . . the nonlinearity evident in human emotion systems may not be best modeled by the specific set of differential equations that Losada proposed . . .
This seems very guarded to me. She’s certainly not saying that her model with Losada is “ridiculous nonsense” or even that it’s “flawed”; all she’s saying is that there’s reason to question the model, or that Losada’s model “may not be best.”
Freeze that frame for a moment: “may not be best.” So, even after the discrediting of the model, Fredrickson is saying, not just that it might be correct, but that it might even be the “best” model for the nonlinearity evident in human emotion systems.
[T]he ratios obtained in each of the two samples closely flank the critical positivity ratio pinpointed by Losada’s mathematical work, to the extent that Losada’s mathematical work may have been flawed, inappropriately applied, or both, the apparent empirical support for Losada’s critical “tipping point” ratio offered by these data may have reflected chance, albeit chance striking twice.
Again, she’s not admitting Losada’s work (which, again, Tierney and Baumeister characterize as “ridiculous . . . nonsense based on . . . elementary mathematical errors”) is flawed. All she’ll say is it “may have been flawed.” And then she hedges her bets by saying that the data “closely flank the critical positivity ratio pinpointed by Losada’s mathematical work.” She’s not letting go of that ratio of 3.
Does this sort of close reading have value in this discussion? Maybe. After all, if a leading paper—perhaps the leading paper—in the positivity-ratio field was “ridiculous . . . nonsense,” then it’s not a good sign that the leading researcher in the field still couldn’t bring herself to accept that it was flawed. This suggests a Tenacious D approach to academic disputes which won’t necessarily advance science, and it makes me wonder what I can trust from this crowd, given their resistance to admitting their own errors. (Tierney and Baumeister admit that Losada made an error, but Losada’s not part of their club. They’re representing joint work of Fredrickson and Losada—work that Fredrickson refuses to let go of—as if it’s Losada’s alone.)
This new book on The Power of Bad doesn’t seem so different from the discredited work that came before. It’s not a 3-to-1 ratio, it’s a 4-to-1 ratio, and Losada is out of the picture. But why should we believe this? Fredrickson seems to believe strongly that 3-to-1 (sorry, 2.9013 to 1) was based on empirical evidence. If Tierney and Baumeister are switching to 4-to-1, did the evidence change too? What happened to Fredrickson’s two samples that “closely flank the critical positivity ratio pinpointed by Losada’s mathematical work”? Or did the evidence always point to 4-to-1? Or maybe it was 2-to-1? 8-to-1?
They’re just making that up! Why not the Rule of 2, or the Rule of 8, or the Rule of 2.9013? The Rule of pi, maybe? I guess that pseudo-certainty goes down smoothly for many readers.
Uh oh. Some of the evidence comes from John Gottman:
This isn’t Gottman’s notorious “94 percent accuracy” research, but I still have to be concerned here.
Oh. Here’s another bit from that Fredrickson article (again, written after the discrediting of Fredrickson and Losada’s 2.9013):
Waugh and Fredrickson (2006) reported that the most potent predictor of accumulating relational resources was whether or not students’ positivity ratios, measured over 28 days of nightly reports, exceeded the critical ratio put forth in Fredrickson and Losada (2005). Strikingly, for students with ratios below 2.9:1, absolutely no evidence emerged to suggest growth in relational resources . . .
From a statistical perspective, I’ll warn you off of claims such as “most potent predictor.” The real point is . . . she was not letting go of that 2.9. Not at all!
And again, another study:
A sample of 239 adults . . . completed daily reports of emotions for 30 days . . . Chi-square tests showed that (across ages) participants with positivity ratios lower than 2.9:1 were disproportionately languishing, whereas those with positivity ratios above 2.9:1 were disproportionately flourishing. . . .
I wonder if Tierney and Baumeister believe the 2.9 thing too, and they’re just rounding up to 4, to be on the safe side.
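A side note on why a comparison like the chi-square one quoted above can't single out 2.9: if flourishing simply becomes more likely as the ratio goes up, with no threshold at all, then splitting people at any cutoff will show the upper group "disproportionately flourishing." Here is a minimal sketch of that point. Everything in it is hypothetical: the straight-line curve in `p_flourish`, the evenly spaced population in `ratios`, and the helper `split_rates` are made up for illustration and are not taken from any of the studies discussed here.

```python
def p_flourish(ratio):
    """Hypothetical smooth curve: the chance of 'flourishing' rises
    in a straight line with the positivity ratio, with no jump anywhere."""
    return ratio / 6.5

# Hypothetical population: positivity ratios spread evenly from 0.5 to 6.0.
ratios = [0.5 + 0.01 * i for i in range(551)]

def split_rates(cutoff):
    """Average flourishing probability below, and at or above, a cutoff."""
    below = [p_flourish(r) for r in ratios if r < cutoff]
    above = [p_flourish(r) for r in ratios if r >= cutoff]
    return sum(below) / len(below), sum(above) / len(above)

# Try the "critical" 2.9 alongside two arbitrary alternatives.
for cutoff in (2.0, 2.9, 4.0):
    low, high = split_rates(cutoff)
    print(f"cutoff {cutoff}: below {low:.2f}, above {high:.2f}")
```

Under these made-up assumptions, every cutoff gives a higher flourishing rate above the line than below it, so a difference between the two sides of 2.9 is what a smooth trend would produce anyway and says nothing special about that number.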
Here’s what Brown et al. wrote about the purported evidence for the positivity ratio:
The fact that “flourishing” college students exhibited higher average positivity ratios (3.2) than those who were “languishing” (2.3) should not come as a surprise; there is nothing inherently implausible about the idea that people with a higher ratio of positive to negative emotions might experience better outcomes than those with a lower ratio. However, the suggestion that people with a positivity ratio of 2.91 are in some discontinuous way significantly better off than those with a ratio of 2.90, simply because this number has crossed some magic line, is not supported by any evidence. . . .
Fredrickson and Losada (2005) in effect claimed—on the basis of an analysis of verbal statements made in a series of one-hour meetings held in a laboratory setting by business teams of exactly eight people, combined with some solemn invocations of the Lorenz equations—to have discovered a universal truth about human emotions, valid for individuals, couples, and groups of arbitrary size and capable of being expressed numerically to five significant digits. This claim—which was presented with no qualification or discussion of possible limits to its validity—would, if verified, surely require much of contemporary psychology and neuroscience to be rewritten . . .
Indeed. The problem with the positive psychology researchers is not, as Tierney and Baumeister would have you believe, that “another researcher [Losada] took the results from . . . research to a ridiculous extreme” and that this sat there gathering acclaim and citations because “the math was beyond most psychologists.” The problem was their eagerness to believe. Look. If someone tells me he built a tabletop machine to turn lead into gold, I’d be skeptical. I wouldn’t trumpet the claim to the world and then hide behind the statement that the chemistry was beyond me. If the math is beyond you, maybe you should first run it by someone who it’s not beyond.
Brown et al. continue:
We do not here call into question the idea that positive emotions are more likely to build resilience than negative emotions, or that a higher positivity ratio is ordinarily more desirable than a lower one.
I agree. Positive things are better than negative things, and in my opinion, or if I were a funding agency, I’d say this is a topic worthy of study by psychologists, worthy of qualitative study, worthy of quantitative study, and worthy of a popular book. So I’m almost entirely in agreement with the authors of The Power of Bad, and I’m 100% cool with them sharing anecdotes and giving advice along those lines. I give lots of advice about teaching, just based on my qualitative experience, with no studies at all of any kind. Anecdotes are great. My problem is with pseudo-quantitative claims that are purportedly backed by science—but aren’t.
Baby and bathwater
In her article, Fredrickson expressed concern about throwing out the baby—the study of positive and negative emotions and interactions—with the bathwater—the junk math of Losada. This is an important concern.
I have only three proposed amendments to Fredrickson’s position.
First, the “bathwater” is not just Losada’s math; it’s also the whole idea of a “critical ratio.”
Second, when you talk about throwing out the bathwater, really throw out the bathwater! Don’t try to pretend it has some value (“Strikingly, for students with ratios below 2.9:1”). Give it up. Let it go.
Third, I think that it is super-easy to hold onto the baby when discarding the bathwater. The “baby” here is qualitative and quantitative studies, interviews with people, memoirs, etc., also diary studies, counts of marital arguments, etc. Indeed, the “baby” can thrive much better without having to swim in the dirty bathwater. Once you stop looking for (nonexistent) critical ratios, you can look at the data with a clear eye and learn soooooo much more.
I agree with Seligman, Fredrickson, Tierney, Baumeister, Brown, Sokal, Friedman, and lots of other people that the quality and quantity of positive and negative experiences are worth studying and understanding. But I don’t think we all agree on the value of looking for or focusing on a threshold.
A new science book came out by two authors who have a track record of hyping work that did not replicate. The book was endorsed by someone who defended work that even these two authors describe as “ridiculous” and “nonsense.” That said, the qualitative advice in the book could have value, and some of the quantitative claims could be correct, even if they’re not as strongly supported by data as a casual reading of the book might have you believe. The authors are admirably careful in some of their statements, but from time to time they drift into questionable pronouncements where they lend the authority of science to speculations based on weak analysis of data that are sometimes of questionable quality. I recommend that reviewers of the book not by default believe the book’s claims about the positivity ratio of 4, etc.
P.S. Thanks to Zad for the above picture to which he gives the caption, “When researchers spot noise disguised as a signal.”