Hey, I think something’s wrong with this graph!


Paul Alper points us to this column by Dana Milbank discussing the above graph from Georgia’s Department of Public Health:

OK, the comb-style bar graph is, as always, a bad idea, as it multiplexes two dimensions (county and time) on a single x-axis. The graph should be a line plot, with one line per county, and the lines labeled directly with county names.

But, wait . . . the graph has another problem. The ordering of the counties changes with each date!

But, wait . . . that’s not the biggest problem. As Milbank writes:

But on closer inspection, the dates on the chart showed a curious ordering: April 30 was followed by May 4; May 5 was followed by May 2, which was followed by May 7 — which in turn was followed by April 26. The dates had been re-sorted to create the illusion of a decline. The five counties were likewise re-sorted on each day to enhance the illusion.

Or maybe the software did it automatically? Fortunately, the Georgia Department of Public Health has all their data and code on Github, so you can go run the program yourself and see . . . just kidding!

Milbank continues:

The governor’s office apologized for what state Rep. Scott Holcomb, an Atlanta Democrat, properly called a “cuckoo” presentation of data. But as the Atlanta Journal-Constitution noted, it was the third such “error” in as many weeks involving sloppy counting of cases, deaths and other measures tracking covid-19. . . .

I have no idea if this was a software default (in which case, I assume the people who made the graph screwed up under pressure) or if some extra effort had to be made to sort the numbers in this way. Maybe someone just thought the graph looked prettier this way?
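Either way, it takes very little to get this effect. Here is a minimal Python sketch, using counts I made up for a single hypothetical county. These are not Georgia's numbers; I picked them so that sorting by bar height reproduces the date order Milbank describes. The function names are mine as well, and I have no idea what software or code the department actually used.

```python
from datetime import date

# Invented counts for one hypothetical county (NOT Georgia's data).
# The values are chosen only so that sorting by height gives the
# date order quoted in Milbank's column.
counts = {
    date(2020, 4, 26): 40,
    date(2020, 4, 30): 95,
    date(2020, 5, 2): 62,
    date(2020, 5, 4): 88,
    date(2020, 5, 5): 70,
    date(2020, 5, 7): 55,
}


def order_by_value(series):
    """Order dates by descending count, so the x-axis is no longer time."""
    return sorted(series, key=lambda d: series[d], reverse=True)


def order_by_date(series):
    """Order dates chronologically, as a time axis should be."""
    return sorted(series)


def is_chronological(dates):
    """Return True if each date is strictly later than the one before it."""
    return all(a < b for a, b in zip(dates, dates[1:]))


by_value = order_by_value(counts)
by_date = order_by_date(counts)

for label, axis in [("sorted by value", by_value), ("sorted by date", by_date)]:
    bars = ", ".join(f"{d:%b %d}: {counts[d]}" for d in axis)
    print(f"{label} (chronological={is_chronological(axis)}): {bars}")
```

Sorted by value, the bars fall steadily from left to right no matter what actually happened over time, which is the illusion. A check along the lines of `is_chronological` on the x-axis before publishing would flag it, and the same idea applies to the counties: fix their order once and reuse it for every date.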

As a statistician who specializes in graphical communication, I’m usually happy to see statistical graphics in the news—but, after seeing this example and this one from a few days ago, I’m not so happy.