Bayesian Analysis of SARS-CoV-2 Origin
Steven C. Quay, MD, PhD
29 January 2021
©2021. Steven C. Quay, MD, PhD
Wuhan Institute of Virology analysis of lavage specimens from ICU patients at Wuhan Jinyintan Hospital in December 2019 contain both SARS-CoV-2 and adenovirus vaccine sequences consistent with a vaccine challenge trial
This work is dedicated to the men, women, and children who were infected with SARS-CoV-2 over the last year. It is my hope that this work becomes part of the body of evidence to help inform the public about gain-of-function pathogen research and that a renewed debate can be had about the benefits and risks of this research in the context of world health.
Despite having collaborated over many decades on numerous scientific projects, research during 2020 into COVID-19, SARS-CoV-2, and therapeutic approaches has been a unique experience. With lockdowns and international travel bans, all collaborative work has been virtual.
With an apparent bias surrounding investigation into the origin of SARS-CoV-2, ad hoc groups of Citizen-Scientists, often anonymous, have worked together via email, videoconference, micro-blogging, and social messaging networks to advance our understanding of this horrific pandemic.
I want to thank a Twitter group called #DRASTIC for many useful discussions that found their way into this document. Dr. Martin Lee, Ph.D., Adjunct Professor of Statistics at UCLA, provided statistical support throughout this work. H. Lawrence Remmel provided input on the adenovirus vaccine as a dual target vaccine. I want to thank D.A. for originally suggesting performing a Bayesian analysis on the work I had done on SARS-CoV-2 and for his facilitation of the review of this work by a diverse group of scientists and policy makers.
In all cases, however, this is my own work product.
A Bayesian analysis concludes beyond a reasonable doubt that SARS-CoV-2 is not a natural zoonosis but instead is laboratory derived
Wuhan Institute of Virology analysis of lavage specimens from ICU patients at Wuhan Jinyintan Hospital in December 2019 contain both SARS-CoV-2 and adenovirus vaccine sequences consistent with a vaccine challenge trial
Executive Summary. The one-year anniversary of the COVID-19 pandemic records 2.1 million deaths, over 100 million confirmed cases, and trillions of dollars of economic damage. Although there is universal agreement that a coronavirus identified as Severe Acute Respiratory Syndrome Coronavirus 2 or SARS-CoV-2 (abbreviated CoV-2 henceforth) causes the disease COVID-19, there is no understanding or consensus on the origin of the disease.
The Chinese government, WHO, media, and many academic virologists have stated with strong conviction that the coronavirus came from nature, either directly from bats or indirectly from bats through another species. Transmission of a virus from animals to humans is called a zoonosis.
A small but growing number of scientists have considered another hypothesis: that an ancestral bat coronavirus was collected in the wild, genetically manipulated in a laboratory to make it more infectious, training it to infect human cells, and ultimately released, probably by accident, in Wuhan, China. For most of 2020 this hypothesis was considered a crackpot idea, but in the last few weeks more media attention has been given to the possibility that the Wuhan Institute of Virology, located near the center of Wuhan, a city with a population of over 11 million, may have been the source of the field specimen collection effort, laboratory genetic manipulation, and subsequent leak. On January 15, 2021, the U.S. Department of State issued a statement requesting that the WHO investigation of the origin of COVID-19 include specific assertions related to a laboratory origin of the pandemic.
Given the strong sentiment in the scientific community in favor of a zoonosis and the massive effort undertaken by China to find the source, one can assume that any evidence in favor of a natural origin, no matter how trivial, would become widely disseminated and known. This provides a potential evidence bias within the scientific community in favor of a natural origin which isn’t quantifiable but should be kept in mind.
This becomes especially important background when evidence that could support a laboratory origin has been directly provided by leading Chinese scientists themselves, like Dr. Zhengli Shi, head of coronavirus research at the Wuhan Institute of Virology, and Gao Fu (George Fu Gao), Director of the Chinese CDC; by the Chinese government; as well as by powerful and vocal pro-natural origin scientists, like Dr. Peter Daszak of the NYC-based NGO EcoHealth Alliance.
This report uses Bayesian inference, a common statistical tool in which Bayes' theorem, a well-known statistical equation, is used to update the likelihood for a particular hypothesis as more evidence or information becomes available. It is widely used in the sciences and medicine and has begun to be used in the law.
The starting probability for the origin of SARS-CoV-2 was set with the zoonotic or natural hypothesis at 98.8% likelihood and the laboratory origin hypothesis at 1.2%. The initial state was biased as much as possible towards a zoonotic origin, with the starting point selected as the upper bound of the 95% confidence interval for the mean and standard deviation of three independent estimates, including one by Daszak and colleagues. Each piece of new evidence for or against each hypothesis was then used to adjust the probabilities. If evidence favored a natural origin, the math adjusts the probability of a natural origin upward, and so on.
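The adjustment step described above can be sketched in a few lines of Python. This is a minimal illustration, assuming two mutually exclusive and exhaustive hypotheses and evidence summarized as a single likelihood ratio; the function name `update` and the example ratio of 2.0 are hypothetical and are not taken from this report. Only the 1.2% starting probability comes from the text.

```python
def update(p_lab: float, likelihood_ratio: float) -> float:
    """Return the posterior probability of the laboratory hypothesis.

    likelihood_ratio = P(evidence | laboratory) / P(evidence | zoonotic).
    A ratio above 1 shifts probability toward the laboratory hypothesis;
    a ratio below 1 shifts it toward the zoonotic hypothesis.
    """
    prior_odds = p_lab / (1.0 - p_lab)          # convert probability to odds
    posterior_odds = prior_odds * likelihood_ratio  # Bayes' theorem in odds form
    return posterior_odds / (1.0 + posterior_odds)  # convert back to probability


p_start = 0.012  # starting laboratory-origin probability stated in the text
posterior = update(p_start, 2.0)  # 2.0 is a hypothetical likelihood ratio
print(f"laboratory: {posterior:.4f}, zoonotic: {1.0 - posterior:.4f}")
```

Because there are only two hypotheses, the zoonotic probability is always one minus the laboratory probability, so a single number is enough to track the state.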
The most significant evidence provided herein is the finding from RNA-Seq performed by the Wuhan Institute of Virology (WIV) of lavage patient samples collected on December 30, 2019. These ICU patients were the subject of the seminal paper, entitled “A pneumonia outbreak associated with a new coronavirus of probable bat origin,” from Dr. Zhengli Shi and colleagues that first characterized SARS-CoV-2. This author has confirmed that the RNA-Seq of all five patients contained SARS-CoV-2 sequences.
Surprisingly, the specimens also contained the adenovirus “pShuttle” vector, developed by Chinese scientists in 2005 for SARS-CoV-1. Two immunogens were identified, the Spike gene of SARS-CoV-2 and the synthetic construct H7N9 HA gene. Hundreds of perfectly homologous (150/150) raw reads suggest this is not an artefact. Reads that cross the vector-immunogen junction are identified. An example of the read contigs for CoV-2 is shown in this figure:
The detailed evidence for the adenovirus vaccine sequences is given at the end of this document.
While adenovirus is a common infection, the wildtype viruses have low homology to the vaccine vector sequence, by design, to avoid rejection of the vaccine due to prior exposure to wildtype adenoviruses.
Two patients from the same hospital who had bronchial lavage on the same day but had
their specimens sent to the Hubei CDC did not have adenovirus vaccine sequences.
Three explanations come to mind from this evidence:
1. These represent sample preparation artifacts at the WIV, such as sample spillover on the sequencer.
2. These patients were admitted with an unknown infection, were not responding to the treatment protocols for an infection of unknown origin, and were vaccinated with an experimental vaccine in a desperate but compassionate therapeutic “Hail Mary.”
3. A clinical trial of a combination influenza/SARS-CoV-2 vaccine was being conducted and an accidental release into Wuhan occurred.
Only WIV scientists and Chinese authorities can answer these questions. Until the evidence of the adenovirus sequences has been confirmed by other scientists, this author will not include this evidence in the Bayesian analysis.
Obviously, if a vaccine containing the Spike Protein of SARS-CoV-2 was being administered to patients in Wuhan in December 2019, the question of laboratory origin is a settled matter.
The remaining analysis is being conducted without the adenovirus vaccine evidence unless and until it is corroborated. The outcome of this report is the conclusion that the probability of a laboratory origin for CoV-2 is 99.8%, with a corresponding probability of a zoonotic origin of 0.2%. This exceeds most academic law school discussions of how to quantify ‘beyond a reasonable doubt,’ the threshold for finding guilt in a criminal case. The report contains the detailed analysis and quantitative basis for the statistics and conclusion. It should be noted that because of the commutative property of the collected adjustments to the probabilities, the order in which they are used in the overall calculation is immaterial, and the same end likelihoods will be reached regardless of the order of input.
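The order-independence noted above can be checked numerically. The sketch below assumes each piece of evidence is expressed as a likelihood ratio and that the pieces are treated as independent, so that the update is a product of ratios applied to the prior odds; the four ratios are hypothetical placeholders, not values from this report, and the helper names are illustrative.

```python
from itertools import permutations
from math import isclose


def update(p_lab: float, likelihood_ratio: float) -> float:
    """One Bayesian update of the laboratory-origin probability (odds form)."""
    odds = p_lab / (1.0 - p_lab) * likelihood_ratio
    return odds / (1.0 + odds)


def apply_all(p_start: float, ratios) -> float:
    """Apply a sequence of likelihood ratios one after another."""
    p = p_start
    for ratio in ratios:
        p = update(p, ratio)
    return p


# Hypothetical likelihood ratios, some favoring each hypothesis.
HYPOTHETICAL_RATIOS = [2.0, 0.5, 19.0, 3.0]

# Run the chain of updates in every possible order (4! = 24 orderings).
results = [apply_all(0.012, order) for order in permutations(HYPOTHETICAL_RATIOS)]
same = all(isclose(r, results[0], rel_tol=1e-9) for r in results)
print("all orderings agree:", same)
```

The agreement follows from the fact that multiplication of the ratios is commutative; any differences between orderings are limited to floating-point rounding.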
The following Text-Table summarizes the evidence examined and the changes in probabilities.
The summary which follows will simply be a review and discussion of the evidence in the context of the two hypotheses.
A viral zoonosis has at least three elements: a host, a virus, and the human population. With some viruses there are two hosts. One is a ‘reservoir host’ where the virus can live for years or even decades in a relatively stable relationship. The reservoir host is never decimated by the virus, and the virus is never burned out by the reservoir host, disappearing. For coronaviruses the reservoir host is always one or more bat species.
For some viruses that cannot jump directly from the reservoir host into the human population, there is a need for a second host, an intermediate host. In this case the virus spends time jumping into the intermediate host, ‘practicing’ adaptation through random mutation and Darwinian selection for fitness to reproduce, infect, and transmit in the intermediate host. This process is then repeated between the intermediate host and the human population. Alternatively, the virus can jump directly between the bat reservoir and humans, without the need for an intermediate host.
Zoonotic Origin
Laboratory Origin
Initial State
International committees to determine CoV-2 origin may not be impartial
Three key zoonotic papers: pros and cons
SARS-like infections among employees of the Wuhan Institute of Virology in the fall of 2019 reported by US
Location of first cases near Wuhan Institute of Virology
Lack of evidence of seroconversion in Wuhan and Shanghai
Lack of posterior diversity
The Wuhan Institute of Virology has publicly disclosed that by 2017 it had developed the techniques to collect novel coronaviruses, systematically modify the receptor binding domain to improve binding or alter zoonotic tropism and transmission, insert a furin site to permit human cell infection, make chimera and synthetic viruses, perform experiments in humanized mice, and optimize the ORF8 gene to increase human cell death.
Lack of furin cleavage sites in any other sarbecovirus
Rare usage of -CGG- single codons & no CGG-CGG pairs
Routine use of CGG in laboratory codon optimization, including Daszak & Shi
Spike Protein receptor binding region (200 amino acids) optimized for humans
Whole genome analysis shows pre-adaption of CoV-2
The finding of CoV-2 in Barcelona wastewater in early 2019 was an artifact
Shi and the WHO comment early on that CoV-2 seemed to begin with a single patient
Mammalian biodiversity between Yunnan and Hubei is significantly different, limiting a potential common intermediate
The ancestor of CoV-2 can only obtain a furin site from other subgenera viruses but recombination is limited/non-existent between subgenera
Canvass of 410 animals shows humans and primates are the best, bats are the worst, for ACE2-Spike Protein interaction
A government requested review of samples collected from a mineshaft may have caused the COVID-19 pandemic
The Huanan Seafood Market and farmed animals in Hubei province are not the source of CoV-2
Line 2 of the Wuhan Metro System is the likely conduit of the pandemic and is the closest subway line to the WIV
Feral and domestic cats are not the intermediate host
Extraordinary pre-adaption for the use of human tRNA is observed
Evidence of lax operations and disregard of laboratory safety protocols and regulations in China
Previous SARS-CoV-1 laboratory accidents
Shi and Daszak use Wuhan residents as negative control for zoonotic coronavirus exposure
RaTG13 could be CoV-2 precursor using the synthetic biology 'No See 'Em' technique
Location, location, location: Based on the distance between known SARS-CoV-1 laboratory-acquired infections and the hospital of admission of the infected personnel, the WIV is within the expected hospital catchment for a CoV-2 LAI
For two prior human coronavirus epidemics, an intermediate or proximate host was identified. For SARS-CoV-1 in 2003-4 it was the civet cat, while for Middle Eastern Respiratory Syndrome (MERS) in 2012-4 it was the camel. In both of these human epidemics, the intermediate host was identified within four to ten months of the first clinically identified human infection. With CoV-2 we are at 12 months since the pandemic began and still waiting for evidence of an intermediate host, despite a larger effort inside China to find one. For both of these earlier epidemics, a bat species reservoir host was also identified, but not in the case of SARS-CoV-2.
Based on the genome sequence of CoV-2, Drs. Shi and Daszak have proposed that the reservoir host for CoV-2 is the intermediate horseshoe bat (Rhinolophus affinis), which is found in Yunnan Province. Yunnan Province is in southern, rural China and about 1900 km from the north central province of Hubei, where the 11 million people of Wuhan live. In the US this would be equivalent in distance, climate change, and human population density difference to going from the Everglades in Florida to Manhattan, in New York City. The intermediate horseshoe bat isn’t found at all in Hubei province, making a direct bat-human transmission unlikely.
Experiments in three independent laboratories also demonstrate that CoV-2 has changed genetically so much that it can no longer infect any bat species cell culture tested. So, while the leading US coronavirus expert, Dr. Ralph Baric of The University of North Carolina, suggested in early 2020 that CoV-2 may have jumped into the human population directly from bats without an intermediate host, this hypothesis no longer seems viable.
For the zoonosis hypothesis to be advanced, it is now necessary to find an intermediate host. In January 2020 a theory was proposed that CoV-2 arose in the Huanan Seafood Market, a traditional Chinese “wet market” where live animals are butchered and sold for food. The market theory was based on the observation that about 40% of early patients worked or shopped there. This was reminiscent of the wet market sources for civet cats infected with SARS-CoV-1 or the camel markets for the MERS coronavirus. The Chinese closed the market on December 31, 2019 after performing extensive environmental sampling and sanitation.
But by May 2020 Dr. Gao Fu, Director of the Chinese CDC, announced that the market was not the source of CoV-2, as all of the animal specimens tested negative for CoV-2. And while SARS-CoV-1 was found in 100% of local farmed civets when tested, CoV-2 was different. In July 2020 Dr. Shi reported that extensive testing of farmed animals throughout Hubei Province failed to find CoV-2 in any animals.
For about six months, the pangolin, a scaly anteater, was suspected to be the intermediate host, but finally Dr. Daszak reported that CoV-2 was not found in pangolins in the wild or from the (illegal) market trade. Domestic and feral cats also were ruled out as a possible source. A comprehensive computer-based screen of 410 different animals reported the remarkable finding that the best ACE2 receptor matches to CoV-2 were human and other primates (or primate cells in the laboratory), including the favorite laboratory coronavirus host, the VERO monkey cell, and that all bat species were the worst hosts. At the time of this writing, there is not even a working hypothesis for the species of an intermediate host.
I am distinguishing here between SARS-CoV-2 being a descendant of a bat coronavirus (with 3.8% or 1100 nucleotide (nt) differences between them) and the finding of the immediate precursor of SARS-CoV-2 in a bat colony population somewhere in the wild, which usually has <100 nt differences.
“We have done bat virus surveillance in Hubei Province for many years but have not found that bats in Wuhan or even the wider Hubei Province carry any coronaviruses that are closely related to SARS-CoV-2. I don't think the spillover from bats to humans occurred in Wuhan or in Hubei Province,” said Dr. Shi. Science, July 2020
A typical zoonosis has a number of characteristic properties that can allow identification of a zoonotic infection, even in the absence of identifying an intermediate host. None of these properties are found for CoV-2.
All zoonotic infections have in common the principle that when a virus in nature uses evolution to move from, for example, a bat host to a camel host and then to a human host, it is a hit-and-miss, slow process. After all, evolution is the result of random genetic changes, mutations, and then enrichment of the ones that are helpful by amplification during reproduction. With both SARS-CoV-1 and MERS, the coronavirus spent months and years jumping from the intermediate host into humans, not having all of the necessary mutations needed to be aggressive, grow, and then spread, but spending enough time in humans to cause an infection and leaving behind a corresponding immune response.
The hallmark evidence of this ‘practice’ in abortive host jumping is in stored, archived human blood specimens taken from before the epidemic, where one can find evidence of pre-epidemic, usually sub-clinical, community spread from the antibodies to the epidemic virus. For SARS-CoV-1 and MERS, about 0.6% of people in the region where the epidemic began show signs of an infection in archived blood. With CoV-2, this seroconversion, as it is called, has never been observed, including in 540 specimens collected from ‘fever clinics’ in Wuhan in 2019 and January 2020, reported by the WHO. Because this is such a potent signal of a zoonosis, and because I believe that China has over 100,000 stored specimens from Wuhan taken in the fall of 2019, the lack of reports of seroconversion, the silence from China on this evidence, speaks volumes.
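To give a sense of scale for the seroconversion argument, the following sketch estimates how likely it would be to see no positives at all among 540 specimens if the 0.6% background rate quoted above applied. It assumes each specimen is an independent draw with the same prevalence, which is a simplification introduced here for illustration and is not a calculation taken from this report; the function name is hypothetical.

```python
def prob_no_positives(prevalence: float, n_specimens: int) -> float:
    """Probability that none of n independent specimens is seropositive."""
    return (1.0 - prevalence) ** n_specimens


# 0.6% prevalence and 540 specimens are the figures quoted in the text;
# independence between specimens is an assumption made for this sketch.
p_zero = prob_no_positives(0.006, 540)
print(f"P(0 positives in 540 specimens) = {p_zero:.3f}")
```

Under these assumptions, a larger archive would make a complete absence of positives even less likely, since the probability shrinks geometrically with the number of specimens tested.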
Another hallmark of a slow, natural zoonosis can be found in the virus. In SARS-CoV-1 and MERS, the coronavirus spent years in the intermediate host, passing back and forth between populations of hosts, the civets or camels, that were living in close proximity. During this time, they would accumulate a background of genetic mistakes, i.e., mutations, usually about one every two weeks. When the final chip falls, and a mutation happens allowing the jump into humans, the virus with that new mutation also jumps around within the intermediate host population.
The consequence of this latter behavior for a true zoonosis is that the genome sequences found in humans don’t all descend from a single jump into a single human but show jumps from viruses that are only cousins of each other, not direct lineal descendants. In a true zoonosis, the family tree of virus genome sequences doesn’t pass back through the first patient but instead tracks all the way back to an ancestor months or years earlier. This is called posterior diversity, and it is an easy genetic test to perform. With CoV-2, every one of the more than 294,000 virus genomes sequenced can be traced back to the first genomic cluster and to the first patient in that cluster, a 39-year-old man who was seen at the People’s Liberation Army