This one’s important: Designing clinical trials for coronavirus treatments and vaccines

I’ve had various thoughts regarding clinical trials for coronavirus treatments and vaccines, and then I came across thoughtful posts by Thomas Lumley and Joseph Delaney on vaccines.

So let’s talk, first about treatments, then about vaccines.

Clinical trials for treatments

The first thing I want to say is that designing clinical trials is not just about power calculations and all that. It’s also about what you’re gonna do with the results once they come in. The usual ideas of design (including in our own books, unfortunately) focus on what can be learned from a single study. But that’s not what we have here.

Hospitals have lots of coronavirus patients right now, and they can try out whatever treatments are on the agenda, starting with the patients who are at the highest risk of dying. This should be done in a coordinated fashion, by which I don’t mean a bunch of randomized trials, each aiming for that statistical-significance jackpot, followed by a series of headlines and maybe an eventual meta-analysis. When I say “coordinated,” I mean that all the studies should put their patient-level information into an open repository using some shared format, with everything registered: all the treatments, all the background variables, all the outcomes. This shouldn’t be a burden on experimenters. Indeed, a shared, open-source spreadsheet should be easier to use than the default approach of each group doing their own thing.

Ok, now that I wrote that paragraph, I wish I’d written it a couple months ago. Not that it would’ve made any difference. It would take a lot to change the medical-industrial complex. Sander Greenland et al. have been screaming for years, and the changes have been incremental at best.

Let me tell you a story. A doctor was designing a trial for an existing drug that he thought could be effective for high-risk coronavirus patients. He contacted me to check his sample size calculation: under the assumption that the drug increased survival rate by 25 percentage points, a sample size of N = 126 would assure 80% power. (With 126 people divided evenly in two groups, the standard error of the difference in proportions is bounded above by √(0.5*0.5/63 + 0.5*0.5/63) = 0.089, so an effect of 0.25 is at least 2.8 standard errors from zero, which is the condition for 80% power for the z-test.) When I asked the doctor how confident he was in his guessed effect size, he replied that he thought the effect on these patients would be higher and that 25 percentage points was a conservative estimate. At the same time, he recognized that the drug might not work. I asked the doctor if he would be interested in increasing his sample size so he could detect a 10 percentage point increase in survival, for example, but he said that this would not be necessary.
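The arithmetic in that sample-size check can be reproduced in a few lines. This is only a rough sketch using the same conservative bound (both survival rates set to 0.5, where the standard error is largest) and a two-sided z-test at the 5% level; the function names are made up for illustration and don't come from any package.

```python
from math import sqrt, ceil
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, sd 1


def se_upper_bound(n_per_arm):
    """Upper bound on the SE of a difference in proportions (attained at p = 0.5 in both arms)."""
    return sqrt(0.5 * 0.5 / n_per_arm + 0.5 * 0.5 / n_per_arm)


def approx_power(effect, se, alpha=0.05):
    """Approximate power of a two-sided z-test, ignoring the negligible opposite tail."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    return STD_NORMAL.cdf(effect / se - z_crit)


def n_total_for_power(effect, power=0.80, alpha=0.05):
    """Total sample size (two equal arms) needed under the conservative SE bound."""
    z = STD_NORMAL.inv_cdf(1 - alpha / 2) + STD_NORMAL.inv_cdf(power)  # about 2.8
    # Solve effect / sqrt(0.5 / n_per_arm) = z for n_per_arm.
    n_per_arm = 0.5 * (z / effect) ** 2
    return 2 * ceil(n_per_arm)


se = se_upper_bound(63)
print(round(se, 3))                      # 0.089
print(round(0.25 / se, 1))               # 2.8 standard errors from zero
print(round(approx_power(0.25, se), 2))  # 0.8
print(n_total_for_power(0.25))           # 126
print(n_total_for_power(0.10))           # 786
```

Under these same assumptions, designing for a 10 percentage point effect instead of 25 pushes the total sample size from 126 to roughly 800, which gives a sense of what the doctor was declining.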

It might seem reasonable to suppose that a drug might not be effective but would have a large effect if it did happen to work. But this vision of uncertainty has problems. Suppose, for example, that the survival rate was 30% among the patients who do not receive this new drug and 55% among the treatment group. Then in a population of 1000 people, it could be that the drug has no effect on the 300 people who would live either way, no effect on the 450 who would die either way, and it would save the lives of the remaining 250 patients. There are other possibilities consistent with a 25 percentage point benefit (for example, the drug could save 350 people while killing 100), but I’ll stick with the simple scenario for now. In any case, the point is that the posited benefit of the drug is not “a 25 percentage point benefit” for each patient; rather, it’s a benefit on 25% of the patients. And, from that perspective, of course the drug could work but only on 10% of the patients. Once we’ve accepted the idea that the drug works on some people and not others (or in some comorbidity scenarios and not others), we realize that “the treatment effect” in any given study will depend entirely on the patient mix. There is no underlying number representing “the effect of the drug.” Ideally one would like to know what sorts of patients the treatment would help, but in a clinical trial it is enough to show that there is some clear average effect. My point is that if we consider the treatment effect in the context of variation between patients, this can be the first step in a more grounded understanding of effect size.
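Here is a small sketch of that bookkeeping, in case it helps to see the counts laid out. The class and its field names are invented for illustration. The first two patient mixes are the ones described above; the third is a hypothetical hospital where only 10% of patients are the kind the drug can save.

```python
from dataclasses import dataclass


@dataclass
class PatientMix:
    """Counts of patients by what would happen to them with and without the drug."""
    live_either_way: int
    die_either_way: int
    saved_by_drug: int       # would die without the drug, live with it
    killed_by_drug: int = 0  # would live without the drug, die with it

    @property
    def total(self):
        return (self.live_either_way + self.die_either_way
                + self.saved_by_drug + self.killed_by_drug)

    def control_survival(self):
        # Without the drug, the survivors are those who live either way
        # plus those the drug would have killed.
        return (self.live_either_way + self.killed_by_drug) / self.total

    def treated_survival(self):
        # With the drug, the survivors are those who live either way
        # plus those the drug saves.
        return (self.live_either_way + self.saved_by_drug) / self.total

    def average_effect(self):
        return (self.saved_by_drug - self.killed_by_drug) / self.total


simple = PatientMix(live_either_way=300, die_either_way=450, saved_by_drug=250)
mixed = PatientMix(live_either_way=200, die_either_way=350,
                   saved_by_drug=350, killed_by_drug=100)
# Hypothetical: same drug, different patient mix.
other_hospital = PatientMix(live_either_way=300, die_either_way=600, saved_by_drug=100)

for mix in (simple, mixed, other_hospital):
    print(mix.control_survival(), mix.treated_survival(), mix.average_effect())
```

The first two mixes give the same 30% and 55% survival rates and the same average effect of 0.25, even though the drug is doing very different things to individual patients. The third gives an average effect of 0.10 with no change to the drug at all.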

I also shared some thoughts last month on costs and benefits, in particular:

When considering design for a clinical trial I’d recommend assigning cost and benefits and balancing the following:

– Benefit (or cost) of possible reduced (or increased) mortality and morbidity from COVID in the trial itself.
– Cost of toxicity or side effects in the trial itself.
– Public health benefits of learning that the therapy works, as soon as possible.
– Economic / public confidence benefits of learning that the therapy works, as soon as possible.
– Benefits of learning that the therapy doesn’t work, as soon as possible, if it really doesn’t work.
– Scientific insights gained from intermediate measurements or secondary data analysis.
– $ cost of the study itself, as well as opportunity cost if it reduces your effort to test something else.

This may look like a mess—but if you’re not addressing these issues explicitly, you’re addressing them implicitly. . . .

Whatever therapies are being tried should be monitored. Doctors should have some freedom to experiment, and they should be recording what happens. To put it another way, they’re trying different therapies anyway, so let’s try to get something useful out of all that.

It’s also not just about “what works” or “does a particular drug work,” but how to do it. . . . You want to get something like optimal dosing, which could depend on individuals. But you’re not gonna get good discrimination on this from a standard clinical trial or set of clinical trials. So we have to go beyond the learning-from-clinical-trial paradigm, designing large studies that mix experiment and observation to get insight into dosing etc.

Also, lots of the relevant decisions will be made at the system level, not the individual level. . . . These sorts of issues are super important and go beyond the standard clinical-trial paradigm.

Clinical trials for vaccines

I haven’t thought about this at all so I’ll outsource the discussion to others.

Lumley:


There are over 100 potential vaccines being developed, and several are already in preliminary testing in humans. There are three steps to testing a vaccine: showing that it doesn’t have any common, nasty side effects; showing that it raises antibodies; showing that vaccinated people don’t get COVID-19.

The last step is the big one, especially if you want it fast. . . . We don’t expect perfection, and if a vaccine truly reduces the infection rate by 50% it would be a serious mistake to discard it as useless. But if the control-group infection rate over a couple of months is a high-but-maybe-plausible 0.2% that means 600,000 people in the trial — one of the largest clinical trials in history.

How can that be reduced? If the trial was done somewhere with out-of-control disease transmission, the rate of infection in controls might be 5% and a moderately large trial would be sufficient. But doing a randomised trial in a setting like that is hard, and ethically dubious if it’s a developing-world population that won’t be getting a successful vaccine any time soon. If the trial took a couple of years, rather than a couple of months, the infection rate could be 3-4 times lower, but we can’t afford to wait a couple of years.

The other possibility is deliberate infection. If you deliberately exposed trial participants to the coronavirus, you could run a trial with only hundreds of participants, and no more COVID deaths, in total, than a larger trial. But signing people up for deliberate exposure to a potentially deadly infection when half of them are getting placebo is something you don’t want to do without very careful consideration and widespread consultation. . . .

Delaney:


One major barrier is manufacturing the doses, especially since we decided to off-shore a lot of our biomedical capacity in the name of efficiency (at the cost of robustness). . . . We want an effective vaccine and it may be the case that candidates vary in their effectiveness. There are successful vaccines that do not grant 100% immunity. The original polio vaccines were only 60-70% effective versus one of the strains, but that still led to a vast decrease in the number of infections in the United States once vaccination became standard.

So, clearly we want trials. . . . Now we get to the point about medical ethics. A phase III trial takes a long time to conduct and there is some political pressure for a fast solution. . . . if the virus is mostly under control, you need a lot of people and a long time to evaluate the effectiveness of a vaccine. People are rarely exposed so it takes a long time for differences in cases between the arms to show up. . . .

Another option is the challenge trial. Likely only taking a few hundred participants, it would have no more deaths than a regular trial. But it would involve infecting people, treated with a placebo(!!), with a potentially fatal infectious disease. There are greater-good arguments here, but the longer I think about them the more dubious they seem to me. Informed consent for things that are so dangerous really does suggest coercion. . . .

Combining these ideas

Organizing clinical trials for treatments . . . I just don’t think this is gonna happen.

But organizing clinical trials for vaccines? Maybe this is possible. Based on the above discussion, it seems likely that we’ll soon be seeing vaccine trials based on infecting healthy people with the virus and then seeing if they fight it off. If so, I have a few thoughts:

1. I don’t see why you need to give anyone placebos. If we have several legitimate vaccine ideas, let’s give everyone some vaccine or another. If they all work, and nobody gets sick, that’s great. If we’re testing 100 vaccine ideas, then we can guess that most of them won’t be so effective, so we’ll get placebos automatically.

2. As discussed above, coordinate all of these. Certainly no need for 100 different placebo groups.

3. Multilevel modeling all the way. Bayesian inference. Decision making based on costs and benefits, not statistical significance.
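
To give a flavor of what partial pooling across many vaccine arms looks like, here is a deliberately crude sketch. It is not a full Bayesian multilevel model (for that you'd fit a hierarchical model in something like Stan); it is a method-of-moments, empirical-Bayes shortcut on the raw infection proportions, and the counts in the example are made up purely for illustration.

```python
from statistics import mean, variance


def partial_pool(infections, n_per_arm):
    """Shrink each arm's raw infection rate toward the pooled rate.

    Crude normal approximation: the between-arm variance is estimated as the
    observed variance of the raw rates minus the average sampling variance.
    """
    raw = [y / n for y, n in zip(infections, n_per_arm)]
    pooled = sum(infections) / sum(n_per_arm)
    # Sampling variance of each arm's rate, evaluated at the pooled rate
    # so that arms with zero infections don't get a variance of zero.
    sampling_var = [pooled * (1 - pooled) / n for n in n_per_arm]
    between_var = max(0.0, variance(raw) - mean(sampling_var))
    if between_var == 0.0:
        # No detectable variation between arms: complete pooling.
        return [pooled] * len(raw)
    return [pooled + between_var / (between_var + v) * (r - pooled)
            for r, v in zip(raw, sampling_var)]


# Made-up challenge-trial data: five vaccine arms, 100 volunteers each.
infections = [2, 10, 9, 11, 1]
n_per_arm = [100] * 5
for y, n, est in zip(infections, n_per_arm, partial_pool(infections, n_per_arm)):
    print(f"raw {y / n:.3f} -> partially pooled {est:.3f}")
```

Each arm's estimate gets pulled toward the overall rate, more so when the arms look similar to each other or when an arm is small. The arms that turn out not to work play the role of the placebo group, and no single arm has to clear a significance threshold on its own.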

Can we make this happen?

P.S. Zad informs us that the above cat is exhausted from quarantine and wants a vaccine immediately if not sooner.