Can someone build a Bayesian tool that takes into account your symptoms and where you live to estimate your probability of having coronavirus?

Carl Mears writes:

I’m married to a doctor who does primary care with a mostly disadvantaged patient base.

The problem her patients face is that if they get tested for COVID, they are supposed to self-quarantine until they get their test results, which currently takes something like a week. Also, their *family* is supposed to stop going to work, etc.

Seems like a reasonable approach, until you consider that there are often a lot of people in their house, none of whom can afford to lose their job and no extra rooms to quarantine in. And living with someone who has been tested but not yet “positive” is not a valid sick leave excuse. So this is a big ask that will likely lead to people not wanting to get tested.

What would really help would be a statistical tool that takes into account where the patient lives (and maybe where they work) to generate a Bayesian prior, and then integrates reported symptoms (cough, fever, fatigue, loss of taste, etc.) to find a statistically defensible probability of infection that could guide the request to quarantine.

For an extreme example, a persistent cough and loss of smell would mean something very different in someone who works in a warehouse in the Bronx vs. a ranch owner in rural Montana.

It seems like the P(symptom | COVID) probabilities are around. I am not sure about the total-population P(symptom), over COVID and NOT_COVID together, which one would also need, right?

What do you think? Have you heard of anyone making a tool like this?

My reply:

Yes, I guess you could do a formal (or informal) decision analysis.

A starting point is that people are already balancing their estimated risks and benefits in some way to decide whether to be going to work, how much to go out, etc.

You can roughly say that a person is in one of 3 x 2 possible states:
– You have no symptoms or you have symptoms
– You have not taken a test, you have a negative test result, or you have a positive test result.

Most people are in the “no symptoms, no test” state and they’ve already decided how to live their lives. If you are tested and the result is positive or negative, that gives you more information. The difficult case, as you note, is if you have symptoms but your test result isn’t in. Having symptoms increases the probability of having COVID, so it should make some people not go to work who were otherwise planning to go to work. But I’m not sure how helpful it would be to have some sort of Bayesian calculator for this, partly because these probabilities are not known very well and partly because it’s not so clear how to pipe this into a decision.

In any case, I agree with you that it’s not right to tell people to self-quarantine before the test results are in. If they want to tell people to self-quarantine if they have symptoms, that could make sense, but then the decision is based on the symptoms, not on the fact of having taken a test.

Carl’s response:

As far as COVID, I am thinking of a tool that takes into account symptoms and where the person lives to get a better estimate.

Simple tools are widespread in the MD community for making all kinds of decisions. You can play around with a huge collection of them here. (If you don’t like clicking links (you shouldn’t) just search for mdcalc.)

The innovation would be automatically using home address to inform the prior. This information could be automatically harvested from the web in an ongoing way (e.g., from the NYT site or Johns Hopkins). Even at the county-by-county level, this would be useful.

What would be specifically needed to make such a tool?

My reply:

Yes, I guess this would be possible. Maybe it’s a good idea. The basic model would go like this: You’d be trying to estimate your status (never exposed, currently contagious, or exposed and no longer contagious). Your geographic information and exposure (some measure of how many at-risk people you are in close contact with) would give you a prior probability, or base rate, for each of these three states. Then you’d also need the probability of each possible cluster of symptoms given each of the three states. This model implicitly assumes that these conditional probabilities don’t depend on your location and exposure. All these probabilities would have a lot of uncertainty, but it still seems better than nothing.
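
To make the basic model concrete, here is a minimal sketch in Python of the three-state calculation. Every number in it is a made-up placeholder, not an estimate, and the names (`posterior`, `prior_high`, `prior_low`, `likelihood`) are hypothetical. It ignores the uncertainty in the inputs and just applies Bayes' rule to point values, using the same symptom probabilities for both locations, as the model assumes.

```python
# Three-state Bayes' rule sketch. All probabilities are illustrative placeholders.
STATES = ("never_exposed", "currently_contagious", "no_longer_contagious")


def posterior(prior, likelihood):
    """Return P(state | symptoms) for each state.

    prior:      dict state -> P(state), set by location and exposure
    likelihood: dict state -> P(observed symptom cluster | state)
    """
    unnormalized = {s: prior[s] * likelihood[s] for s in STATES}
    # The normalizing constant is the overall P(symptom cluster) in this population.
    total = sum(unnormalized.values())
    if total <= 0:
        raise ValueError("symptom cluster has zero probability under this prior")
    return {s: unnormalized[s] / total for s in STATES}


# Hypothetical priors for a high-prevalence and a low-prevalence location.
prior_high = {
    "never_exposed": 0.80,
    "currently_contagious": 0.05,
    "no_longer_contagious": 0.15,
}
prior_low = {
    "never_exposed": 0.98,
    "currently_contagious": 0.002,
    "no_longer_contagious": 0.018,
}

# Hypothetical P(persistent cough and loss of smell | state),
# assumed not to depend on location or exposure.
likelihood = {
    "never_exposed": 0.01,
    "currently_contagious": 0.30,
    "no_longer_contagious": 0.02,
}

for label, prior in (("high prevalence", prior_high), ("low prevalence", prior_low)):
    post = posterior(prior, likelihood)
    print(f"{label}: P(currently contagious | symptoms) = "
          f"{post['currently_contagious']:.3f}")
```

Note that the overall probability of the symptom cluster does not have to be supplied separately: it falls out as the normalizing constant once you have the prior and the symptom probabilities for all three states. A real tool would also need to carry the uncertainty in these inputs through to the answer, which this sketch does not attempt.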

So maybe someone wants to build this? Or has already done so?