Controversy regarding the effectiveness of Remdesivir

Steven Wood writes:

There is now some controversy regarding the effectiveness of Remdesivir for treatment of Covid, following the inadvertent posting of results on the WHO website.

One of the pillars of hope for this treatment is the monkey treatment trial (the paper is here).

As an experienced clinical trialist I was immediately skeptical of the results. On the face of it the N’s (3 arms in the trial, 6 monkeys in each group) seemed too small for the low p-values they were reporting. I am sure you are familiar with the common practice in bench studies of using as N the number of tests/assays etc. performed and not the number of animals/subjects.

When I looked at figure 3, I was convinced that that is what they had done. For the viral load data between the treatment group and the controls, only one of the 6 pairwise comparisons across the 6 groups of lung lobe results was even of borderline statistical significance. Yet in panel B of figure 3 the p-value for the difference in viral loads is presented as

I can’t say anything about Remdesivir (I can’t even pronounce the word!), but, yes, it does seem like if you analyze different lung lobes from the same monkey as independent data points, you’ll be overstating your certainty, and setting yourself up for future unpleasant surprises in the form of failed replications.
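To get a feel for how much certainty can be overstated this way, here is a minimal simulation sketch. It is not the paper's analysis, and everything in it is an assumption made up for illustration: two groups of 6 monkeys with 6 lobes each, no true treatment effect, a between-monkey standard deviation of 1.0, a within-monkey (lobe-to-lobe) standard deviation of 0.5, and a simple pooled two-sample t-test. The function names are hypothetical. It compares how often "p < 0.05" turns up when lobes are treated as independent data points versus when each monkey is reduced to one average.

```python
import random
import statistics

# Two-sided 5% critical values of Student's t from standard tables,
# keyed by degrees of freedom (2n - 2 for two groups of size n).
T_CRIT = {10: 2.228, 70: 1.994}


def pooled_t(a, b):
    """Two-sample t statistic with pooled variance; assumes len(a) == len(b)."""
    n = len(a)
    pooled_var = (statistics.variance(a) + statistics.variance(b)) / 2
    return (statistics.mean(a) - statistics.mean(b)) / (2 * pooled_var / n) ** 0.5


def simulate_group(rng, n_monkeys=6, n_lobes=6, sd_monkey=1.0, sd_lobe=0.5):
    """One list of lobe measurements per monkey, with NO treatment effect.

    The variance components are illustrative assumptions, not estimates
    from any real study.
    """
    group = []
    for _ in range(n_monkeys):
        monkey_effect = rng.gauss(0, sd_monkey)  # shared by all lobes of a monkey
        group.append([monkey_effect + rng.gauss(0, sd_lobe) for _ in range(n_lobes)])
    return group


def false_positive_rates(n_sims=2000, seed=1):
    """Return (rate treating lobes as independent, rate using monkey means)."""
    rng = random.Random(seed)
    naive_hits = 0
    monkey_hits = 0
    for _ in range(n_sims):
        treated = simulate_group(rng)
        control = simulate_group(rng)

        # Naive analysis: pretend all 36 lobes per group are independent (df = 70).
        lobes_t = [x for monkey in treated for x in monkey]
        lobes_c = [x for monkey in control for x in monkey]
        if abs(pooled_t(lobes_t, lobes_c)) > T_CRIT[70]:
            naive_hits += 1

        # Monkey-level analysis: one average per animal, so N = 6 per group (df = 10).
        means_t = [statistics.mean(monkey) for monkey in treated]
        means_c = [statistics.mean(monkey) for monkey in control]
        if abs(pooled_t(means_t, means_c)) > T_CRIT[10]:
            monkey_hits += 1

    return naive_hits / n_sims, monkey_hits / n_sims


if __name__ == "__main__":
    naive, per_monkey = false_positive_rates()
    print(f"lobes as independent points: {naive:.3f}")
    print(f"one average per monkey:      {per_monkey:.3f}")
```

Under these made-up settings there is nothing to find, so a well-calibrated test should reject about 5% of the time. The monkey-level analysis should land near that, while the lobe-level analysis should reject far more often. How much more depends entirely on the assumed ratio of between-monkey to within-monkey variation, which this sketch simply guesses.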

Or maybe not. Maybe the analysis is just fine. It might be better in this sort of paper to focus more on presenting the raw data in as many ways as possible, rather than on this sort of thing:

