“For the cost of running 96 wells you can test 960 people and accurately assess the prevalence in the population to within about 1%. Do this at 100 locations around the country and you’d have a spatial map of the extent of this epidemic today. . . and have this data by Monday.”

Daniel Lakeland writes:

COVID-19 is tested for using real-time reverse-transcription PCR (rRT-PCR). This is basically just a fancy way of saying they are detecting the presence of the RNA by converting it to DNA and amplifying it. Researchers in Israel have already shown that you can combine material from at least 64 swabs and still reliably detect the presence of the RNA.

No one has the slightest clue how widespread SARS-CoV-2 infections really are in the population. We’re wasting all our tests testing sick people, where the Bayesian prior is basically that they have it already, and the outcome of the test mostly doesn’t change the treatment anyway. It’s stupid.

To make decisions about how much physical isolation and shutdown we need, we NEED real-time monitoring of the prevalence in the population.

Here’s my proposal:

Mobilize military medical personnel around the country to 100 locations chosen at random, proportional to population. (Military personnel are already drawing salaries, so the marginal cost is basically zero.)

In each location set up outside a grocery store.

Swab 960 people as they enter the grocery store. Shuffle the swab vials into a random order.

From each vial, extract RNA into a tube, and combine the first 10 tubes into well 1, the second 10 tubes into well 2, and so on, filling a 96-well PCR plate (a standard-sized PCR tray used in every bio lab in the country).

Run the machines and get back a count of positive wells for each tray…

Use a Beta(2, 95) prior for the frequency of SARS-CoV-2 infection; this has a high-probability-density region extending from 0 to about 10% prevalence, with the highest density between around 0.5% and 5%, an appropriate prior for this application.
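As a quick check on the shape of that prior, here is a minimal R sketch (the particular quantiles reported are my choice, not part of Lakeland’s note):

```r
# Inspect the Beta(2, 95) prior for the prevalence f
curve(dbeta(x, 2, 95), from = 0, to = 0.10,
      xlab = "prevalence f", ylab = "prior density")
qbeta(c(0.025, 0.5, 0.975), 2, 95)  # median and central 95% prior interval
```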

Let f be the frequency in the population, and let ff = 1 – dbinom(0, 10, f); then ff is the probability that a randomly selected well containing 10 samples has *one or more* positive swabs. The likelihood for N wells coming up positive is then dbinom(N, 96, ff).
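Here is a minimal R sketch of that model, computing the posterior for f on a grid; the observed count N = 30 is purely illustrative, not data:

```r
# Posterior for prevalence f given N positive wells out of 96,
# each well pooling 10 swabs. N = 30 is an illustrative value.
N <- 30
f_grid <- seq(0, 0.25, by = 0.0005)

prior      <- dbeta(f_grid, 2, 95)
ff         <- 1 - dbinom(0, 10, f_grid)   # P(a 10-swab well is positive)
likelihood <- dbinom(N, 96, ff)
posterior  <- prior * likelihood
posterior  <- posterior / sum(posterior)  # normalize on the grid

# Posterior mean and a rough central 95% interval
post_mean <- sum(f_grid * posterior)
cdf <- cumsum(posterior)
ci  <- f_grid[c(which(cdf >= 0.025)[1], which(cdf >= 0.975)[1])]
c(mean = post_mean, lower = ci[1], upper = ci[2])
```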

Doing a couple lines of simulation, for the cost of running 96 wells you can test 960 people and accurately assess the prevalence in the population to within about 1%. Do this at 100 locations around the country and you’d have a spatial map of the extent of this epidemic today.
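Lakeland doesn’t include his simulation code, but here is one way to reproduce the check in R, assuming a perfectly sensitive assay and a hypothetical true prevalence of 3%:

```r
# Simulate the whole design: 960 swabs pooled into 96 wells of 10,
# repeated many times, to see how tightly the estimate concentrates
# around the true prevalence. Assumes the assay never misses a positive pool.
set.seed(1)
true_f <- 0.03                        # assumed true prevalence, for illustration
f_grid <- seq(0, 0.25, by = 0.0005)
prior  <- dbeta(f_grid, 2, 95)

sim_once <- function(true_f) {
  swabs <- rbinom(960, 1, true_f)               # 1 = infected swab
  wells <- matrix(swabs, nrow = 96, ncol = 10)  # 10 swabs per well
  N     <- sum(rowSums(wells) > 0)              # number of positive wells
  ff    <- 1 - dbinom(0, 10, f_grid)
  post  <- prior * dbinom(N, 96, ff)
  post  <- post / sum(post)
  sum(f_grid * post)                            # posterior mean estimate
}

estimates <- replicate(1000, sim_once(true_f))
c(mean_estimate = mean(estimates), sd_estimate = sd(estimates))
```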

There is NO reason you couldn’t mobilize military resources later today to do this swabbing, and have this data by Monday.

This kind of pooled sampling is a well-known design, so I assume the planners at the CDC have already thought of this. On the other hand, if they were really on top of things, they’d have had a testing plan back in January, so really I have no idea.

The innovation of Lakeland’s plan is that you can use a statistical model to estimate prevalence from this pooled data. When I’ve seen the pooled-testing design in textbooks, it’s been framed as a problem of identifying the people who have the disease, not as a method for estimating prevalence rates.