Uncertainty in the Time of COVID-19, Part 2

Part 2: How Do We Know What We Know?

When a new pathogen first shows up to threaten human lives, ignorance dominates knowledge. The faster we retire our ignorance and maximize our knowledge, the better our response to any novel threat. The good news is that knowledge of what is happening during the current COVID-19 pandemic is accumulating more rapidly than it did during the SARS outbreak, in part because we have new tools available, and in part because Chinese clinicians and scientists are publishing more, and faster, than in 2003. And yet there is still a great deal of ignorance about this pathogen, and that ignorance breeds uncertainty. While it is true that the virus we are now calling SARS-CoV-2 is relatively closely related genetically to the SARS-CoV that emerged in 2002, the resulting disease we call COVID-19 is notably different from SARS. This post will dig into what methods and tools are being used today in diagnosis and tracking, what epidemiological knowledge is accumulating, and which error bars and assumptions are absent, misunderstood, or errant.

First, in all of these posts I will keep a running update of good sources of information. The Atlantic continues its excellent reporting into lack of testing in the US by digging into the decision-making process, or lack thereof, that resulted in our current predicament. I am finding it useful to read the China CDC Weekly Reports, which constitute source data and anecdotes used in many other articles and reports.

Before diving in any further, I would observe that it is now clear that extreme social distancing works to halt the spread of the virus, at least temporarily, as demonstrated in China. It is also clear that, with widespread testing, the spread can also be controlled with less severe restrictions — but only if you assay the population adequately, which means running tests on as many people as possible, not just those who are obviously sick and in hospital.

Why does any of this matter?

In what follows, I get down into the weeds of sources of error and of sampling strategies. I suggest that the way we are using tests is obscuring, rather than helping, our ability to understand what is happening. You might look at this, if you are an epidemiologist or public health person, and say that these details are irrelevant because all we really care about are actions that work to limit or slow the spread. Ultimately, as the goal is to save lives and reduce suffering, and since China has demonstrated that extreme social distancing can work to limit the spread of COVID-19, the argument might be that we should just implement the same measures and be done with it. I am certainly sympathetic to this view, and we should definitely implement measures to restrict the spread of the virus.

But it isn’t that simple. First, because the population infection data is still so poor, even in China (though perhaps not in South Korea, as I explore below), every statement about successful control is in actuality still a hypothesis, yet to be tested. Those tests will come in the form of 1) additional exposure data, such as population serology studies that identify the full extent of viral spread by looking for antibodies to the virus, which persist long after an infection is resolved, and 2) carefully tracking what happens when social distancing and quarantine measures are lifted. Prior pandemics, in particular the 1918 influenza episode, showed waves of infections that recurred for years after the initial outbreak. Some of those waves are clearly attributable to premature reduction in social distancing, and different interpretations of data may have contributed to those decisions. (Have a look at this post by Tomas Pueyo, which is generally quite good, for the section with the heading “Learnings from the 1918 Flu Pandemic”.) Consequently, we need to carefully consider exactly what our current data sets are teaching us about SARS-CoV-2 and COVID-19, and, indeed, whether current data sets are teaching us anything helpful at all.

What is COVID-19?

Leading off the discussion of uncertainty are differences in the most basic description of the disease known as COVID-19. The list of observed symptoms (that is, visible impacts on the human body) from the CDC includes only fever, cough, and shortness of breath, while the WHO website list is more expansive, with fever, tiredness, dry cough, aches and pains, nasal congestion, runny nose, sore throat, and diarrhea. The WHO-China Joint Mission report from last month (PDF) is more quantitative: fever (87.9%), dry cough (67.7%), fatigue (38.1%), sputum production (33.4%), shortness of breath (18.6%), sore throat (13.9%), headache (13.6%), myalgia or arthralgia (14.8%), chills (11.4%), nausea or vomiting (5.0%), nasal congestion (4.8%), diarrhea (3.7%), hemoptysis (0.9%), and conjunctival congestion (0.8%). Note that the preceding list, while quantitative in the sense that it reports the frequency of symptoms, is ultimately a list of qualitative judgements by humans.

The Joint Mission report continues with a slightly more quantitative set of statements:

Most people infected with COVID-19 virus have mild disease and recover. Approximately 80% of laboratory confirmed patients have had mild to moderate disease, which includes non-pneumonia and pneumonia cases, 13.8% have severe disease (dyspnea, respiratory frequency ≥30/minute, blood oxygen saturation ≤93%, PaO2/FiO2 ratio <300, and/or lung infiltrates >50% of the lung field within 24-48 hours) and 6.1% are critical (respiratory failure, septic shock, and/or multiple organ dysfunction/failure).

The rate of hospitalization, seriousness of symptoms, and ultimately the fatality rate depend strongly on age and, in a source of more uncertainty, perhaps on geography, points I will return to below.

What is the fatality rate, and why does it vary so much?

The Economist has a nice article exploring the wide variation in reported and estimated fatality rates, which I encourage you to read (also this means I don’t have to write it). One conclusion from that article is that we are probably misestimating fatalities due to measurement error. The total rate of infection is probably higher than is being reported, and the absolute number of fatalities is probably higher than generally understood. To this miscalculation I would add an additional layer of obfuscation, which I happened upon in my earlier work on SARS and the flu.

It turns out that we are probably significantly undercounting deaths due to influenza. This hypothesis is driven by a set of observations of anticorrelations between flu vaccination and deaths ascribed to stroke, myocardial infarction (“heart attack”), and “sudden cardiac death”, where the latter is the largest cause of “natural” death in the United States. Influenza immunization reduces the rate of those causes of death by 50-75%. The authors conclude that the actual number of people who die from influenza infections could be 2.5-5X higher than the oft cited 20,000-40,000.

How could the standard estimate be so far off? Consider these two situations: First, if a patient is at the doctor or in the hospital due to symptoms of the flu, they are likely to undergo a test to rule in, or out, the flu. But if a patient comes into the ER in distress and then passes away, or if they die before getting to the hospital, then that molecular diagnostic is much less likely to be used. And if the patient is elderly and already suffering from an obvious likely cause of death, for example congestive heart failure, kidney failure, or cancer, then that is likely to be what goes on the death certificate. Consequently, particularly among older people with obvious preexisting conditions, case fatality rate for influenza is likely to be underestimated, and that is for a pathogen that is relatively well understood for which there is unlikely to be a shortage of diagnostic kits.

Having set that stage, it is no leap at all to hypothesize that the fatality rate for COVID-19 is likely to be significantly underestimated. And then if you add in insufficient testing, and thus insufficient diagnostics, as I explore below, it seems likely that many fatalities caused by COVID-19 will be attributed to something else, particularly among the elderly. The disease is already quite serious among those diagnosed who are older than 70. I expect that the final toll will be greater in communities that do not get the disease under control.
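To make the two measurement problems concrete, here is a minimal sketch of how a reported fatality rate shifts when only a fraction of infections and only a fraction of deaths are counted. The function name, the two "ascertainment" parameters, and every number below are hypothetical and chosen only for illustration; none of them are estimates from the sources cited in this post.

```python
def adjusted_fatality_rate(reported_deaths, reported_cases,
                           death_ascertainment, case_ascertainment):
    """Rescale a naive deaths/cases ratio for undercounting.

    death_ascertainment: assumed fraction of true deaths that are attributed
        to the disease (hypothetical parameter).
    case_ascertainment: assumed fraction of true infections that are
        diagnosed (hypothetical parameter).
    """
    for name, value in (("death_ascertainment", death_ascertainment),
                        ("case_ascertainment", case_ascertainment)):
        if not 0 < value <= 1:
            raise ValueError(f"{name} must be in (0, 1]")
    true_deaths = reported_deaths / death_ascertainment
    true_cases = reported_cases / case_ascertainment
    return true_deaths / true_cases


# Hypothetical counts: 30 attributed deaths among 1,000 diagnosed cases.
naive = adjusted_fatality_rate(30, 1000, 1.0, 1.0)
missed_cases = adjusted_fatality_rate(30, 1000, 1.0, 0.5)   # half of infections missed
missed_deaths = adjusted_fatality_rate(30, 1000, 0.5, 1.0)  # half of deaths misattributed

print(f"naive: {naive:.1%}")
print(f"half of infections missed: {missed_cases:.1%}")
print(f"half of deaths misattributed: {missed_deaths:.1%}")
```

The point of the sketch is only that the two undercounts push the ratio in opposite directions, so the reported rate cannot be interpreted without some estimate of both.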

Fatality rate in China as reported by China CDC.


How is COVID-19 diagnosed?

For most of history, medical diagnoses have been determined by comparing patient symptoms (again, these are human-observable impacts on a patient, usually constituting natural language nouns and adjectives) with lists that doctors together agree define a particular condition. Recently, this qualitative methodology has been slowly amended with quantitative measures as they have become available: e.g., pulse, blood pressure, EEG and EKG, blood oxygen content, “five part diff” (which quantifies different kinds of blood cells), CT, MRI, blood sugar levels, liver enzyme activity, lung and heart pumping volume, viral load, and now DNA and RNA sequencing of tissues and pathogens. These latter tools have become particularly important in genetically tracking the spread of SARS-CoV-2, because by following the sequence around the world you can sort out at the individual case level where it came from. And then simply being able to specifically detect viral RNA to provide a diagnosis is important because COVID-19 symptoms (other than fatality rate) are quite similar to those of the seasonal flu. Beyond differentiating COVID-19 from “influenza like illness”, new tools are being brought to bear that enable near real time quantification of viral RNA, which enables estimating viral load (number of viruses per sample volume), and which in turn facilitates understanding 1) how the disease progresses and 2) how infectious patients are over time. These molecular assays are the result of decades of technology improvement, which has resulted in highly automated systems that take in raw clinical samples, process them, and deliver results electronically, at least in those labs that can afford such devices. Beyond these achievements, novel diagnostic methods based on the relatively recent development of CRISPR as a tool are already in the queue to be approved for use amidst the current pandemic. The pandemic is serving as a shock to the system to move diagnostic technology faster. We are watching in real time a momentous transition in the history of medicine, which is giving us a glimpse of the future. How are all these tools being applied today?

(Note: My original intention with this post was to look at the error rates of all the steps for each diagnostic method. I will explain why I think this is important, but other matters are more pressing at present, so the detailed error analysis will get short shrift for now.)

Recapitulating an explanation of relevant diagnostics from Part 1 of this series (with a slight change in organization):

There are three primary means of diagnosis:

1. The first is by display of symptoms, which can span a long list of cold-like runny nose, fever, sore throat, upper respiratory features, to much less pleasant, and in some cases deadly, lower respiratory impairment. (I recently heard an expert on the virus say that there are two primary ways that SARS-like viruses can kill you: “Either your lungs fill up with fluid, limiting your access to oxygen, and you drown, or all the epithelial cells in your lungs slough off, limiting your access to oxygen, and you suffocate.” Secondary infections are also more lethal for people experiencing COVID-19 symptoms.)

2. The second method of diagnosis is imaging of lungs, which includes x-ray and CT scans; SARS-CoV-2 causes particular pathologies in the lungs that can be identified on images and that distinguish it from other respiratory viruses.

3. Thirdly, the virus can be diagnosed via two molecular assays, the first of which uses antibodies to directly look for viral proteins in tissue or fluid samples, while the other looks for whether genetic material is present; sophisticated versions can quantify how many copies of viral RNA are present in a sample.

Imaging of lungs via x-ray and CT scan appears to be an excellent means to diagnose COVID-19 due to a distinct set of morphological features that appear throughout infected tissue, though those features also appear to change during the course of the disease. This study also examined diagnosis via PCR assays, and found a surprisingly high rate of false negatives. It is not clear from the text whether all patients had two independent swabs and accompanying tests, so either 10 or 12 total tests were done. If 10 were done, there are two clear false negatives, for a 20% failure rate; if 12 were done, there are up to four false negatives, for a 33% failure rate. The authors observe that “the false negative rate of oropharyngeal swabs seems high.” Note that this study directly compares the molecular assay with imaging, and the swab/PCR combo definitely comes up short. This is important because for us to definitively diagnose even the number of serious cases, let alone start sampling the larger population to track and try to get ahead of the outbreak, imaging is low throughput and expensive; we need rapid, accurate molecular assays. We need to have confidence in testing.
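The arithmetic behind those two failure rates, and how little such small counts pin down, can be sketched as follows. The counts (2 of 10, or up to 4 of 12) come from the paragraph above; the Wilson score interval is my own addition as one standard way to put rough error bars on a small proportion, not something the study reports.

```python
import math


def false_negative_rate(false_negatives, tests_on_infected):
    """Fraction of tests on known-infected patients that came back negative."""
    if tests_on_infected <= 0 or not 0 <= false_negatives <= tests_on_infected:
        raise ValueError("need 0 <= false_negatives <= tests_on_infected, with tests > 0")
    return false_negatives / tests_on_infected


def wilson_interval(events, trials, z=1.96):
    """Approximate 95% Wilson score interval for a binomial proportion."""
    if trials <= 0 or not 0 <= events <= trials:
        raise ValueError("need 0 <= events <= trials, with trials > 0")
    p = events / trials
    denom = 1 + z**2 / trials
    center = (p + z**2 / (2 * trials)) / denom
    half_width = z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    return center - half_width, center + half_width


# The two readings of the study discussed above.
for fn, n in [(2, 10), (4, 12)]:
    rate = false_negative_rate(fn, n)
    low, high = wilson_interval(fn, n)
    print(f"{fn}/{n}: rate {rate:.0%}, approx. 95% interval {low:.0%} to {high:.0%}")
```

With so few tests the intervals are very wide (for 2 of 10, roughly 6% to 51%), which is exactly why a single small study cannot settle how reliable the swab/PCR combination is.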

How does “testing” work? First, testing is not some science fiction process that involves pointing a semi-magical instrument like a Tricorder at a patient and instantly getting a diagnosis. In reality, testing involves multiple process steps implemented by humans — humans who sometimes are inadequately trained or who make mistakes. And then each of those process steps has an associated error or failure rate. You almost never hear about the rate of mistakes, errors, or failures in reporting on “testing”, and that is a problem.
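As a rough illustration of why per-step error rates matter, here is a minimal sketch that multiplies the success rates of each step to get an end-to-end success rate. The step names and every rate below are hypothetical placeholders, not measured values, and the calculation assumes that failures at different steps are independent, which may not hold in a real lab.

```python
import math


def end_to_end_success(step_success_rates):
    """Probability that every step succeeds, assuming independent steps."""
    rates = list(step_success_rates)
    for rate in rates:
        if not 0 <= rate <= 1:
            raise ValueError("each success rate must be between 0 and 1")
    return math.prod(rates)  # math.prod requires Python 3.8+


# Hypothetical per-step success rates, for illustration only.
hypothetical_steps = {
    "sample collection": 0.90,
    "transport and storage": 0.98,
    "RNA extraction": 0.95,
    "RT-PCR and readout": 0.97,
}

overall = end_to_end_success(hypothetical_steps.values())
print(f"end-to-end success: {overall:.1%}")
print(f"end-to-end failure: {1 - overall:.1%}")
```

Even when each individual step looks respectable, the combined failure rate in this made-up example approaches one in five, and the weakest step (here, sample collection) dominates the result.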

Let’s take the testing process in order. For sample collection the CDC Recommendations include nasopharyngeal and oropharyngeal (i.e., nose and throat) swabs. Here is the Wikipedia page on RT-PCR, which is a pretty good place to start if you are new to these concepts.

The Seattle Flu Study and the UW Virology COVID-19 program often rely on home sample collection from nasal and throat swabs. My initial concern about this testing method was motivated in part by the fact that it was quite difficult to develop a swab-PCR for SARS-CoV that delivered consistent results, where part of the difficulty was simply in collecting a good patient sample. I have a nagging fear that not everyone who is collecting these samples today is adequately trained to get a good result, or tested to ensure they are good at this skill. The number of sample takers has clearly expanded significantly around the world in the last couple of weeks, with more expansion to come. So I leave this topic with a question: is there a clinical study that examines the success rate of sample collection by people who are not trained to do this every day?

On to the assays themselves: I am primarily concerned at the moment with the error bars on the detection assays. The RT-PCR assay data in China are not reported with errors (or even variance, which would be an improvement). Imaging is claimed to be 90-95% accurate (against what standard is unclear), and the molecular assays worse than that by some amount. Anecdotal reports are that they have only been 50-70% accurate, with assertions of as low as 10% in some cases. This suggests that, in addition to large probable variation in the detectable viral load, and possible quality variations in the kits themselves, human sample handling and lab error are quite likely the dominant factors in accuracy. There was a report of an automated high throughput testing lab getting set up in a hurry in Wuhan a couple of weeks ago, which might be great if the reagent quality is sorted, but I haven’t seen any reports of whether that worked out. So the idea that the “confirmed” case counts are representative of reality even in hospitals or care facilities is tenuous at best. South Korea has certainly done a better job of adequate testing, but even there questions remain about the accuracy of the testing, as reported by the Financial Times:

Hong Ki-ho, a doctor at Seoul Medical Centre, believed the accuracy of the country’s coronavirus tests was “99 per cent — the highest in the world”. He pointed to the rapid commercial development and deployment of new test kits enabled by a fast-tracked regulatory process. “We have allowed test kits based on WHO protocols and never followed China’s test methods,” Dr Hong said.

However, Choi Jae-wook, a medical professor of preventive medicine at Korea University, remained “worried”. “Many of the kits used at the beginning stage of the outbreak were the same as those in China where the accuracy was questioned . . . We have been hesitating to voice our concern because this could worry the public even more,” Mr Choi said.

At some point (hopefully soon) we will see antibody-based tests being deployed that will enable serology studies of who has been previously infected. The US CDC is developing these serologic tests now, and we should all hope the results are better than the initial round of CDC-produced PCR tests. We may also be fortunate and find that these assays could be useful for diagnosis, as lateral flow assays (like pregnancy tests) can be much faster than PCR assays. Eventually something will work, because this antibody detection is tried and true technology.

To sum up: I had been quite concerned about reports of problems (high error rates) with the PCR assay in China and in South Korea. Fortunately, it appears that more recent PCR data is more trustworthy (as I will discuss below), and that automated infrastructure being deployed in the US and Europe may improve matters further. The automated testing instruments being rolled out in the US should — should — have lower error rates and higher accuracy. I still worry about the error rate on the sample collection. However, detection of the virus may be facilitated because the upper respiratory viral load for SARS-CoV-2 appears to be much higher than for SARS-CoV, a finding with further implications that I will explore below.

How is the virus spread?

(Note: the reporting on asymptomatic spread has changed a great deal just in the last 24 hours. Not all of what appears below is updated to reflect this yet.)

The standard line, if there can be one at this point, has been that the virus is spread by close contact with symptomatic patients. This view is bolstered by claims in the WHO Joint Mission report: “Asymptomatic infection has been reported, but the majority of the relatively rare cases who are asymptomatic on the date of identification/report went on to develop disease. The proportion of truly asymptomatic infections is unclear but appears to be relatively rare and does not appear to be a major driver of transmission.”(p.12) These claims are not consistent with a growing body of clinical observations. Pinning down the rate of asymptomatic, or presymptomatic, infections is important for understanding how the disease spreads. Combining that rate with evidence that patients are infectious while asymptomatic, or presymptomatic, is critical for planning response and for understanding the impact of social distancing.

Two sentences in the Science news piece describing the Joint Mission report undermine all the quantitative claims about impact and control: “A critical unknown is how many mild or asymptomatic cases occur. If large numbers of infections are below the radar, that complicates attempts to isolate infectious people and slow spread of the virus.” Nature picked up this question earlier this week: “How much is coronavirus spreading under the radar?” The answer: probably quite a lot.

A study of cases apparently contracted in a shopping mall in Wenzhou concluded that the most likely explanation for the pattern of spread is “that indirect transmission of the causative virus occurred, perhaps resulting from virus contamination of common objects, virus aerosolization in a confined space, or spread from asymptomatic infected persons.”

Another recent paper, in which the authors built an epidemiological transmission model of all the documented cases in Wuhan, found that, at best, only 41% of the total infections were “ascertained” by diagnosis, while the most likely ascertainment rate was a mere 21%. That is, the model best fits the documented case statistics when 79% of the total infections were unaccounted for by direct diagnosis.
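The relationship between an ascertainment rate and the implied number of total infections is simple arithmetic, sketched below. The 21% and 41% rates are the figures quoted above; the count of 1,000 documented cases is a hypothetical round number used only to show the scaling, and the function names are my own.

```python
def unascertained_fraction(ascertainment_rate):
    """Fraction of infections never confirmed by diagnosis."""
    if not 0 < ascertainment_rate <= 1:
        raise ValueError("ascertainment_rate must be in (0, 1]")
    return 1 - ascertainment_rate


def estimated_total_infections(documented_cases, ascertainment_rate):
    """Total infections implied by a documented count and an ascertainment rate."""
    if not 0 < ascertainment_rate <= 1:
        raise ValueError("ascertainment_rate must be in (0, 1]")
    return documented_cases / ascertainment_rate


hypothetical_documented = 1000  # illustrative round number, not real data
for rate in (0.41, 0.21):
    total = estimated_total_infections(hypothetical_documented, rate)
    missed = unascertained_fraction(rate)
    print(f"ascertainment {rate:.0%}: about {total:,.0f} total infections, "
          f"{missed:.0%} undiagnosed")
```

At a 21% ascertainment rate, every documented case stands in for nearly five actual infections.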

Finally, a recent study of patients early after infection clearly shows “that COVID-19 can often present as a common cold-like illness. SARS-CoV-2 can actively replicate in the upper respiratory tract, and is shed for a prolonged time after symptoms end, including in stool.” The comprehensive virological study demonstrates “active [infectious] virus replication in upper respiratory tract tissues”, which leads to a hypothesis that people can present with cold-like symptoms and be infectious. I will quote more extensively from the abstract, as this bit is crucially important:

Pharyngeal virus shedding was very high during the first week of symptoms (peak at 7.11 X 10^8 RNA copies per throat swab, day 4). Infectious virus was readily isolated from throat- and lung-derived samples, but not from stool samples in spite of high virus RNA concentration. Blood and urine never yielded virus. Active replication in the throat was confirmed by viral replicative RNA intermediates in throat samples. Sequence-distinct virus populations were consistently detected in throat- and lung samples of one same patient. Shedding of viral RNA from sputum outlasted the end of symptoms. Seroconversion occurred after 6-12 days, but was not followed by a rapid decline of viral loads.

That is, you can be sick for a week with minimal to mild symptoms, shedding infectious virus, before antibodies to the virus are detectable. (This study also found that “Diagnostic testing suggests that simple throat swabs will provide sufficient sensitivity at this stage of infection. This is in stark contrast to SARS.” Thus my comments above about reduced concern about sampling methodology.)

So the virus is easy to detect because it is plentiful in the throat, which unfortunately also means that it is easy to spread. And then even after you begin to have a specific immune response, detectable as the presence of antibodies in blood, viral loads stay high.

The authors conclude, rather dryly, with an observation that “These findings suggest adjustments of current case definitions and re-evaluation of the prospects of outbreak containment.” Indeed.

One last observation from this paper is eye opening, and needs much more study: “Striking additional evidence for independent replication in the throat is provided by sequence findings in one patient who consistently showed a distinct virus in her throat as opposed to the lung.” I am not sure we have seen something like this before. Given the high rate of recombination between strains in this family of betacoronaviruses (see Part 1), I want to flag the infection of different tissues by different strains as a possibly worrying route to more viral innovation, that is, evolution.

STAT+ News summarizes the above study as follows:

The researchers found very high levels of virus emitted from the throat of patients from the earliest point in their illness —when people are generally still going about their daily routines. Viral shedding dropped after day 5 in all but two of the patients, who had more serious illness. The two, who developed early signs of pneumonia, continued to shed high levels of virus from the throat until about day 10 or 11.

This pattern of virus shedding is a marked departure from what was seen with the SARS coronavirus, which ignited an outbreak in 2002-2003. With that disease, peak shedding of virus occurred later, when the virus had moved into the deep lungs.

Shedding from the upper airways early in infection makes for a virus that is much harder to contain. The scientists said at peak shedding, people with Covid-19 are emitting more than 1,000 times more virus than was emitted during peak shedding of SARS infection, a fact that likely explains the rapid spread of the virus. 

Yesterday, CNN joined the chorus of reporting on the role of asymptomatic spread. It is a nice summary, and makes clear that not only is “presymptomatic transmission commonplace”, it is a demonstrably significant driver of infection. Michael Osterholm, director of the Center for Infectious Disease Research and Policy (CIDRAP) at the University of Minnesota, and always ready with a good quote, was given the opportunity to put the nail in the coffin on the denial of asymptomatic spread:

"At the very beginning of the outbreak, we had many questions about how transmission of this virus occurred. And unfortunately, we saw a number of people taking very firm stances about it was happening this way or it wasn't happening this way. And as we have continued to learn how transmission occurs with this outbreak, it is clear that many of those early statements were not correct," he said. 

"This is time for straight talk," he said. "This is time to tell the public what we know and don't know."

There is one final piece of the puzzle that we need to examine to get a better understanding of how the virus is spreading. You may have read about characterizing the infection rate by the basic reproduction number, R0, which is a statistical measure that captures the average dynamics of transmission. There is another metric, the “secondary attack rate”, or SAR, which is a measurement of the rate of transmission in specific cases in which a transmission event is known to have occurred. The Joint Mission report cites an SAR in the range of 5-10% in family settings, which is already concerning. But there is another study (that, to be fair, came out after the Joint Mission report) of nine instances in Wuhan that calculates that the secondary attack rate in specific community settings is 35%. That is, assuming one initially infected person per room attended an event in which spread is known to have happened, on average 35% of those present were infected. In my mind, this is the primary justification for limiting social contacts: this virus appears to spread extremely well when people are in enclosed spaces together for a couple of hours, possibly handling and sharing food.
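For readers who want to see how a secondary attack rate is computed, here is a minimal sketch. The three events and their counts are invented for illustration and are not the nine Wuhan instances from the study; the sketch also assumes, as the text above does, a single index case per event, so that "exposed" counts everyone present other than that person.

```python
def secondary_attack_rate(newly_infected, exposed_contacts):
    """Fraction of exposed contacts who became infected at one event."""
    if exposed_contacts <= 0 or not 0 <= newly_infected <= exposed_contacts:
        raise ValueError("need 0 <= newly_infected <= exposed_contacts, with exposed > 0")
    return newly_infected / exposed_contacts


# Hypothetical events: (exposed contacts excluding the index case, newly infected).
hypothetical_events = [(20, 7), (10, 3), (40, 15)]

per_event = [secondary_attack_rate(infected, exposed)
             for exposed, infected in hypothetical_events]
mean_of_rates = sum(per_event) / len(per_event)

total_exposed = sum(exposed for exposed, _ in hypothetical_events)
total_infected = sum(infected for _, infected in hypothetical_events)
pooled = secondary_attack_rate(total_infected, total_exposed)

print("per-event SAR:", [f"{rate:.0%}" for rate in per_event])
print(f"mean of per-event rates: {mean_of_rates:.1%}")
print(f"pooled SAR: {pooled:.1%}")
```

Averaging the per-event rates and pooling all attendees give slightly different answers when events differ in size, so it is worth checking which convention a reported SAR uses.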

Many missing pieces must be filled in to understand whether the high reported SAR above is representative globally. For instance, what were the environmental conditions (humidity, temperature) and ventilation like at those events? Was the source of the virus a food handler, or otherwise a focus of attention and close contact, or were they just another person in the room? Social distancing and eliminating public events was clearly important in disrupting the initial outbreak in Wuhan, but without more specific information about how community spread occurs we are just hanging on, hoping old fashioned public health measures will slow the thing down until countermeasures (drugs and vaccines) are rolled out. And when the social control measures are lifted, the whole thing could blow up again. Here is Osterholm again, from the Science news article covering the Joint Mission report:

“There’s also uncertainty about what the virus, dubbed SARS-CoV-2, will do in China after the country inevitably lifts some of its strictest control measures and restarts its economy. COVID-19 cases may well increase again.”

“There’s no question they suppressed the outbreak,” says Mike Osterholm, head of the Center for Infectious Disease Research and Policy at the University of Minnesota, Twin Cities. “That’s like suppressing a forest fire, but not putting it out. It’ll come roaring right back.”

What is the age distribution of infections?

The short answer here is that everyone can get infected. The severity of one’s response appears to depend strongly on age, as does the final outcome of the disease (the “endpoint”, as it is somewhat ominously referred to). Here we run smack into another measurement problem, because in order to truly understand who is infected, we would need to be testing broadly across the population, including a generous sample of those who are not displaying symptoms. Because only South Korea has been sampling so widely, only South Korea appears to have a data set that gives some sense of the age distribution of infections across the whole population. Beyond the sampling problem, I found it difficult to find this sort of demographic data published anywhere on the web.

Below is the only age data I have been able to come up with, admirably cobbled together by Andreas Backhaus from screenshots of data out of South Korea and Italy.

Why would you care about this? Because, in many countries, policy makers have not yet closed schools, restaurants, or pubs that younger and healthier members of the population tend to frequent. If this population is either asymptomatic or mildly symptomatic, but still infectious — as indicated above — then they are almost certainly spreading virus not only amongst themselves, but also to members of their families who may be more likely to experience severe symptoms. Moreover, I am led to speculate by the different course of disease in different communities that the structure of social contacts could be playing a significant role in the spread of the virus. Countries that have a relatively high rate of multi-generational households, in which elderly relatives live under the same roof as young people, could be in for a rough ride with COVID-19. If young people are out in the community, exposed to the virus, then their elderly relatives at home have a much higher chance of contracting the virus. Here is the distribution of multigenerational households by region, according to the UN:

Distribution of multigenerational households by region, according to the UN.

The end result of all this is that we — humanity at large, and in particular North America and Europe — need to do a much better job of containment in our own communities in order to reduce morbidity and mortality caused by SARS-CoV-2.

How did we get off track with our response?

It is important to understand how the WHO got the conclusion about the modes of infection wrong. By communicating so clearly that they believed there was a minimal role for asymptomatic spread, the WHO sent a mixed message that, while extreme social distancing works, perhaps it was not so necessary. Some policy makers clearly latched onto the idea that the disease only spreads from very sick people, and that if you aren’t sick then you should continue to head out to the local pub and contribute to the economy. The US CDC seems to have been slow to understand the error (see the CNN story cited above), and the White House just ran with the version of events that seemed like it would be politically most favorable, and least inconvenient economically.

The Joint Mission based the assertion that asymptomatic and presymptomatic infection is “rare” on a study in Guangdong Province. Here is Science again: “To get at this question, the report notes that so-called fever clinics in Guangdong province screened approximately 320,000 people for COVID-19 and only found 0.14% of them to be positive.” Caitlin Rivers, from Johns Hopkins, hit the nail on the head by observing that “Guangdong province was not a heavily affected area, so it is not clear whether [results from there hold] in Hubei province, which was the hardest hit.”

I am quite concerned (and, frankly, disappointed) that the WHO team took at face value that the large scale screening effort in Guangdong that found a very low “asymptomatic count” is somehow representative of anywhere else. Guangdong has a ~50X lower “case count” than Hubei, and a ~400X lower fatality rate, according to the Johns Hopkins Dashboard on 15 March — the disparity was probably even larger when the study was performed. The course of the disease was clearly quite different in Guangdong than in Hubei.

Travel restrictions and social distancing measures appear to have had a significant impact on spread from Hubei to Guangdong, and within Guangdong, which means that we can’t really know how many infected individuals were in Guangdong, or how many of those were really out in the community. A recent study computed the probability of spread from Wuhan to other cities given both population of the city and number of inbound trips from Wuhan; for Guangzhou, in Guangdong, the number of infections was anomalously low given its very large population. That is, compared with other transmission chains in China, Guangdong wound up with many fewer cases than you would expect, and the case count there is therefore not representative. Consequently, the detected infection rate in Guangdong is not a useful metric for understanding anything but Guangdong. The number relevant for epidemiological modeling is the rate of asymptomatic infection in the *absence* of control measures, because that tells us how the virus behaves without draconian social distancing, and any return to normalcy in the world will not have that sort of control measure in place.
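To make the scale of the problem concrete, here is a back-of-the-envelope sketch in Python. The first calculation uses only the figures quoted above (roughly 320,000 people screened, 0.14% positive). The second is purely hypothetical: it assumes, for illustration only, that the positive fraction scales linearly with the ~50X difference in case counts between Guangdong and Hubei. That linear scaling is my assumption, not an epidemiological model and not anything the Joint Mission report claims, and the function names are made up for this example.

```python
# Figures quoted in the text (approximate).
SCREENED = 320_000           # people screened at Guangdong fever clinics
POSITIVE_FRACTION = 0.0014   # 0.14% tested positive
CASE_COUNT_RATIO = 50        # Hubei vs. Guangdong case counts, ~50X


def implied_positives(screened: int, fraction: float) -> int:
    """Number of positive results implied by a screening fraction."""
    return round(screened * fraction)


def naive_extrapolation(fraction: float, case_ratio: float) -> float:
    """HYPOTHETICAL: scale a positive fraction linearly by a case-count ratio.

    This is an illustrative assumption only. Real prevalence does not
    have to scale this way; the point is that it need not stay the same.
    """
    return fraction * case_ratio


guangdong_positives = implied_positives(SCREENED, POSITIVE_FRACTION)
hubei_like_fraction = naive_extrapolation(POSITIVE_FRACTION, CASE_COUNT_RATIO)

print(f"Positives implied in Guangdong screening: {guangdong_positives}")
print(f"Hypothetical fraction at 50X the case count: {hubei_like_fraction:.1%}")
```

The exact numbers matter less than the gap between them: a fraction measured in a lightly affected province under tight controls says little about what the same screening would find somewhere harder hit.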

Now, if I am being charitable, it may have been that the only large scale screening data set available to the Joint Mission at the time was from Guangdong. The team needed to publish a report, and saying something about asymptomatic transmission was critically important to telling a comprehensive story, so perhaps they went with the only data they had. But the conclusions smelled wrong to me as soon as they were announced. I wrote as much to several reporters and on Twitter, observing that the WHO report was problematic because it assumed the official case counts approximated the actual number of infections, but I couldn’t put my finger on exactly what bugged me until I could put together the rest of the story above. Nevertheless, the WHO has a lot of smart people working for it; why did the organization so quickly embrace and promulgate a narrative that was so obviously problematic to anyone who knows about epidemiology and statistics?

What went wrong at the WHO?

There are some very strong opinions out there regarding the relationship between China and the WHO, and how that relationship impacts the decisions made by Director-General Dr. Tedros Adhanom. I have not met Dr. Tedros and only know what I read about him. However, I do have personal experience with several individuals now higher up in the chain of command for the WHO coronavirus response, and I have no confidence in them whatsoever. Here is my backstory.

I have wandered around the edges of the WHO for quite a while, and have spent most of my time in Geneva at the UN proper and working with the Biological Weapons Convention Implementation Support Unit. Then, several years ago, I was asked to serve on a committee at WHO HQ. I wasn’t particularly enthusiastic about saying yes, but several current and former high ranking US officials convinced me it was for the common good. So I went. It doesn’t matter which committee at the moment. What does matter is that, when it came time to write the committee report, I found that the first draft embraced a political narrative that was entirely counter to my understanding of the relevant facts, science, and history. I lodged my objections to the draft in a long minority report that pointed out the specific ways in which the text diverged from reality. And then something interesting happened.

I received a letter informing me that my appointment to the committee had been a mistake, and that I was actually supposed to be just a technical advisor. Now, the invitation said “member”, and all the documents that I signed beforehand said “member”, with particular rights and responsibilities, including a say in the text of the report. I inquired with the various officials who had encouraged me to serve, as well as with a diplomat or two, and the unanimous opinion was that I had been retroactively demoted so that the report could be written without addressing my concerns. All of those very experienced people were quite surprised by this turn of events. In other words, someone in the WHO went to surprising lengths to try to ensure that the report reflected a particular political perspective rather than facts, history, and science. Why? I do not know what the political calculations were. But I do know this: the administrative leadership in charge of the WHO committee I served on is now high up in the chain of command for the coronavirus response.

Coda: as it turns out, the final report hewed closely to reality as I understood it, and embraced most of the points I wanted it to make. I infer, but do not know for certain, that one or more other members of the committee, who presumably could not be shunted aside so easily and who presumably had far more political heft than I do, picked up and implemented my recommended changes. So all’s well that ends well? But the episode definitely contributed to my education (and cynicism) about how the WHO balances politics and science, and I am ill disposed to trust the organization. Posting my account may mean that I am not invited to hang out at the WHO again. This is just fine.

How much bearing does my experience have on what is happening now in the WHO coronavirus response? I don’t know. You have to make up your own mind about this. But having seen the sausage being made, I am all too aware that the organization can be steered by political considerations. And that definitely increases uncertainty about what is happening on the ground. I won’t be writing or saying anything more specific about that particular episode at this time.