pangolin lineage covid


This produced non-recombining alignment NRA3, which included 63 of the 68 genomes. Unlike other viruses that have emerged in the past two decades, coronaviruses are highly recombinogenic (refs. 14,15,16). With horseshoe bats currently the most plausible origin of SARS-CoV-2, it is important to consider that sarbecoviruses circulate in a variety of horseshoe bat species with widely overlapping species ranges (ref. 57). As illustrated by the dashed arrows, these two posteriors motivate our specification of prior distributions with standard deviations inflated 10-fold (light color). SARS-CoV-2 genetic lineages in the United States are routinely monitored through epidemiological investigations, virus genetic sequence-based surveillance, and laboratory studies. These are in general agreement with estimates using NRR2 and NRA3, which result in divergence times of 1982 (1948–2009) and 1948 (1879–1999), respectively, for SARS-CoV-2, and estimates of 1952 (1906–1989) and 1970 (1932–1996), respectively, for the divergence time of SARS-CoV from its closest known bat relative. In this approach, we considered a breakpoint as supported only if it had three types of statistical support: from (1) mosaic signals identified by 3SEQ, (2) PI signals identified by building trees around 3SEQ's breakpoints and (3) the GARD algorithm (ref. 35), which identifies breakpoints by identifying PI signals across proposed breakpoints. While such models have recently been made available, we lack the information to calibrate the rate decline over time (for example, through internal node calibrations; ref. 44). Concatenated region A′BC is NRR1.
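The three-way support rule for breakpoints described in the paragraph above can be illustrated with a short Python sketch. This is only an illustration under stated assumptions, not the authors' pipeline: the function name, the `tolerance` parameter and the example coordinates are hypothetical, and the actual analysis relies on 3SEQ, tree-based PI tests and GARD rather than on simple coordinate matching.

```python
def supported_breakpoints(mosaic, pi_signal, gard, tolerance=0):
    """Keep breakpoints backed by all three kinds of evidence.

    mosaic    -- candidate positions from a mosaic-signal test (e.g. 3SEQ)
    pi_signal -- positions with a phylogenetic-incongruence (PI) signal
    gard      -- positions proposed by a GARD-style analysis
    tolerance -- hypothetical slack (in nt) when matching positions
    """
    def near(pos, calls):
        # True if any call lies within `tolerance` nt of `pos`.
        return any(abs(pos - c) <= tolerance for c in calls)

    return sorted(
        p for p in set(mosaic)
        if near(p, pi_signal) and near(p, gard)
    )


# Hypothetical coordinates, chosen only to exercise the function.
example = supported_breakpoints(
    mosaic=[100, 500, 900],
    pi_signal=[100, 505],
    gard=[98, 500, 900],
    tolerance=5,
)
print(example)
```

Requiring agreement from every evidence source makes the rule conservative about calling breakpoints: a candidate with only one or two kinds of support is dropped.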
PI signals were identified (with bootstrap support >80%) for seven of these eight breakpoints: positions 1,684, 3,046, 9,237, 11,885, 21,753, 22,773 and 24,628. Based on the identified breakpoints in each genome, only the major non-recombinant region is kept in each genome while other regions are masked. performed recombination and phylogenetic analysis and annotated virus names with geographical and sampling dates. This provides compelling support for the SARS-CoV-2 lineage being the consequence of a direct or nearly-direct zoonotic jump from bats, because the key ACE2-binding residues were present in viruses circulating in bats. Pangolin relies on a novel algorithm called pangoLEARN. The origins we present in Fig. In such cases, even moderate rate variation among long, deep phylogenetic branches will substantially impact expected root-to-tip divergences over a sampling time range that represents only a small fraction of the evolutionary history (ref. 40). Another similarity between SARS-CoV and SARS-CoV-2 is their divergence time (40–70 years ago) from currently known extant bat virus lineages (Fig. Instead, similarity in codon usage metrics between SARS-CoV-2 and the eukaryotes analyzed was correlated with coding sequence GC content of the eukaryote, with more similar codon usage being identified in eukaryotes with low GC content similar to that of the coronavirus (b). 11,12,13,22,28), a signal that suggests recombination, the divergence patterns in the S protein do not show evidence of recombination between the lineage leading to SARS-CoV-2 and known sarbecoviruses.
For the DRAGEN COVID Lineage tool, the minimum accepted alignment score was set to 22 and results with scores <22 were discarded. When the genomic data included both coding and non-coding regions we used a single GTR+Γ substitution model; for concatenated coding genes we partitioned the alignment by codon position and specified an independent GTR+Γ model for each partition with a separate gamma model to accommodate inter-site rate variation. We aimed to analyze 3 naso-oropharyngeal swab samples collected between August and December 2021 to describe the amino acid changes present in the sequence reads that may have a role in the emergence of new . 4), that region and shorter BFRs were not included in combined putative non-recombinant regions. Combining regions A′, B and C and removing the five named sequences gives us putative NRR1, as an alignment of 63 sequences. Pink, green and orange bars show BFRs, with region A (nt 13,291–19,628) showing two trimmed segments yielding region A′ (nt 13,291–14,932, 15,405–17,162, 18,009–19,628). Using the most conservative approach (NRR1), the divergence time estimate for SARS-CoV-2 and RaTG13 is 1969 (95% HPD: 1930–2000), while that between SARS-CoV and its most closely related bat sequence is 1962 (95% HPD: 1932–1988); see Fig. These shy, quirky but cute mammals are one of the most heavily trafficked yet least understood animals in the world. Below, we report divergence time estimates based on the HCoV-OC43-centred rate prior for NRR1, NRR2 and NRA3 and summarize corresponding estimates for the MERS-CoV-centred rate priors in Extended Data Fig. The most parsimonious explanation for these shared ACE2-specific residues is that they were present in the common ancestors of SARS-CoV-2, RaTG13 and Pangolin Guangdong 2019, and were lost through recombination in the lineage leading to RaTG13.
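The codon-position partitioning mentioned in the paragraph above (one substitution model per codon position) begins by splitting the columns of a coding alignment by position. The following Python sketch shows only that splitting step. It assumes an in-frame sequence whose first column is codon position 1; the function name is hypothetical, and fitting the per-partition models is left to the phylogenetic software.

```python
def partition_by_codon_position(coding_sequence):
    """Split an in-frame coding sequence into its three codon positions.

    Returns a tuple (pos1, pos2, pos3), where pos1 holds every first
    codon position, pos2 every second and pos3 every third.
    """
    if len(coding_sequence) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return (
        coding_sequence[0::3],  # first codon positions
        coding_sequence[1::3],  # second codon positions
        coding_sequence[2::3],  # third codon positions
    )


pos1, pos2, pos3 = partition_by_codon_position("ATGGCC")
print(pos1, pos2, pos3)
```

Applying the same slicing to every row of an alignment yields three sub-alignments, each of which can then be given its own substitution model.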
The ongoing pandemic spread of a new human coronavirus, SARS-CoV-2, which is associated with severe pneumonia/disease (COVID-19), has resulted in the generation of tens of thousands of virus genome sequences. It allows a user to assign the most likely lineage (Pango lineage) to SARS-CoV-2 query sequences. 27) receptors and its RBD being genetically closer to a pangolin virus than to RaTG13 (refs. We considered (1) the possibility that BFRs could be combined into larger non-recombinant regions and (2) the possibility of further recombination within each BFR. As informative rate priors for the analysis of the sarbecovirus datasets, we used two different normal prior distributions: one with a mean of 0.00078 and s.d. In our analyses of the sarbecovirus datasets, we incorporated the uncertainty of the sampling dates when exact dates were not available. Further information on research design is available in the Nature Research Reporting Summary linked to this article. This dataset comprises an updated version of that used in Hon et al. (ref. 15) and includes a cluster of genomes sampled in late 2003 and early 2004, but the evolutionary rate estimate without this cluster (0.00175 substitutions per site per year (0.00117, 0.00229)) is consistent with the complete dataset (0.00169 substitutions per site per year (0.00131, 0.00205)). In December 2019, a cluster of pneumonia cases epidemiologically linked to an open-air live animal market in the city of Wuhan (Hubei Province), China (refs. 1,2) led local health officials to issue an epidemiological alert to the Chinese Center for Disease Control and Prevention and the World Health Organization's (WHO) China Country Office.
When viewing the last 7 kb of the genome, a clade of viruses from northern China appears to cluster with sequences from southern Chinese provinces but, when inspecting trees from different parts of ORF1ab, the N. China clade is phylogenetically separated from the S. China clade. The latter was reconstructed using IQ-TREE v.2.0 (ref. 66) under a general time-reversible (GTR) model with a discrete gamma distribution to model inter-site rate variation. However, on closer inspection, the relative divergences in the phylogenetic tree (Fig. Humans' selfish, speciesist treatment of these animals could be the very reason why the novel coronavirus exists. These residues are also in the Pangolin Guangdong 2019 sequence. Originally, PANGOLIN used a maximum-likelihood-based assignment algorithm to assign query SARS-CoV-2 sequences the most likely lineage. The relatively fast evolutionary rate means that it is most appropriate to estimate shallow nodes in the sarbecovirus evolutionary history. G066215N, G0D5117N and G0B9317N)) and by the European Union's Horizon 2020 project MOOD (no. This is evidence for numerous recombination events occurring in the evolutionary history of the sarbecoviruses (refs. 22,33); specifying all past events in their correct temporal order (ref. 34) is challenging and not shown here. Regions B and C span nt 3,625–9,150 and 9,261–11,795, respectively. Evolutionary origins of the SARS-CoV-2 sarbecovirus lineage responsible for the COVID-19 pandemic. These rate priors are subsequently used in the Bayesian inference of posterior rates for NRR1, NRR2, and NRA3 as indicated by the solid arrows. We thank A. Chan and A. Irving for helpful comments on the manuscript.
The research leading to these results received funding (to A.R. is funded by The National Natural Science Foundation of China Excellent Young Scientists Fund (Hong Kong and Macau; no. In early January, the aetiological agent of the pneumonia cases was found to be a coronavirus (ref. 3), subsequently named SARS-CoV-2 by an International Committee on Taxonomy of Viruses (ICTV) Study Group (ref. 4) and also named hCoV-19 by Wu et al. (ref. 5). Phylogenies of subregions of NRR1 depict an appreciable degree of spatial structuring of the bat sarbecovirus population across different regions (Fig. The estimated divergence times for the pangolin virus most closely related to the SARS-CoV-2/RaTG13 lineage range from 1851 (1730–1958) to 1877 (1746–1986), indicating that these pangolin lineages were acquired from bat viruses divergent to those that gave rise to SARS-CoV-2. performed codon usage analysis. 3) clusters with viruses from provinces in the centre, east and northeast of China. We focused on these three non-recombining regions/alignments for divergence time estimation; this avoids inappropriate modelling of evolutionary processes with recombination on strictly bifurcating trees, which can result in different artefacts such as homoplasies that inflate branch lengths and lead to apparently longer evolutionary divergence times. Software package for assigning SARS-CoV-2 genome sequences to global lineages. Decimal years are shown on the x axis for the 1.2 years of SARS sampling in c.
d, Mean evolutionary rate estimates plotted against sampling time range for the same three datasets (represented by the same colour as the data points in their respective RtT divergence plots), as well as for the comparable NRA3 using the two different priors for the rate in the Bayesian inference (red points). The command line tool is open source software available under the GNU General Public License v3.0. Our approach resulted in similar posterior rates using two different prior means, implying that the sarbecovirus data do inform the rate estimate even though a root-to-tip temporal signal was not apparent. Because coronaviruses are known to be highly recombinant, we used three different approaches to identify non-recombinant regions for use in our Bayesian time-calibrated phylogenetic inference. To estimate non-synonymous over synonymous rate ratios for the concatenated coding genes, we used the empirical Bayes 'Renaissance counting' procedure (ref. 67). The consistency of the posterior rates for the different prior means also implies that the data do contribute to the evolutionary rate estimate, despite the fact that a temporal signal was visually not apparent (Extended Data Fig. A single 3SEQ run on the genome alignment resulted in 67 out of 68 sequences supporting some recombination in the past, with multiple candidate breakpoint ranges listed for each putative recombinant. = 0.00075 and one with a mean of 0.00024 and s.d. This boundary appears to be rarely crossed. stand-alone pangolin workflows or Illumina DRAGEN COVID Lineage App (v3.5.5) following the default parameters.
On first examination this would suggest that SARS-CoV-2 is a recombinant of an ancestor of Pangolin-2019 and RaTG13, as proposed by others (refs. 11,22). This is not surprising for diverse viral populations with relatively deep evolutionary histories. The first available sequence data (ref. 6) placed this novel human pathogen in the Sarbecovirus subgenus of Coronaviridae (ref. 7), the same subgenus as the SARS virus that caused a global outbreak of >8,000 cases in 2002–2003. Posterior means with 95% HPDs are shown in Supplementary Information Table 2. Because the estimated rates and divergence dates were highly similar in the three datasets analysed, we conclude that our estimates are robust to the method of identifying a genome's NRRs. Despite the high frequency of recombination among bat viruses, the block-like nature of the recombination patterns across the genome permits retrieval of a clean subalignment for phylogenetic analysis. Region C showed no PI signals within it. wrote the first draft of the manuscript, and all authors contributed to manuscript editing. These means are based on the mean rates estimated for MERS-CoV and HCoV-OC43, respectively, while the standard deviations are set ten times higher than empirical values to allow greater prior uncertainty and avoid strong bias (Extended Data Fig. 5 (NRR1) are conservative in the sense that NRR1 is more likely to be non-recombinant than NRR2 or NRA3.
The boxplots show divergence time estimates (posterior medians) for SARS-CoV-2 (red) and the 2002–2003 SARS-CoV virus (blue) from their most closely related bat virus. Background & objectives: Several phylogenetic classification systems have been devised to trace the viral lineages of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). It is available as a command line tool and a web application. Divergence time estimates based on the HCoV-OC43-centred rate prior for the separate BFRs (Supplementary Table 3) show consistency in TMRCA estimates across the genome. One geographic clade includes viruses from provinces in southern China (Guangxi, Yunnan, Guizhou and Guangdong), with its major sister clade consisting of viruses from provinces in northern China (Shanxi, Henan, Hebei and Jilin) as well as Hubei Province in central China and Shaanxi Province in northwestern China. We showed that severe acute respiratory syndrome coronavirus 2 is probably a novel recombinant virus.
b, Similarity plot between SARS-CoV-2 and several selected sequences including RaTG13 (black), SARS-CoV (pink) and two pangolin sequences (orange). The fact that these estimates lie between the rates for MERS-CoV and HCoV-OC43 is consistent with the intermediate sampling time range of about 18 years (Fig. BEAST inferences made use of the BEAGLE v.3 library (ref. 68) for efficient likelihood computations. DRAGEN COVID Lineage App: this app aligns reads to a SARS-CoV-2 reference genome and reports coverage of targeted regions. These authors contributed equally: Maciej F. Boni, Philippe Lemey. The ARTIC Network receives funding from the Wellcome Trust through project no. Sequences were aligned by MAFFT v.7.310 (ref. 58), with a final alignment length of 30,927, and used in the analyses below. SARS-CoV-2 itself is not a recombinant of any sarbecoviruses detected to date, and its receptor-binding motif, important for specificity to human ACE2 receptors, appears to be an ancestral trait shared with bat viruses and not one acquired recently via recombination. The red and blue boxplots represent the divergence time estimates for SARS-CoV-2 (red) and the 2002–2003 SARS-CoV (blue) from their most closely related bat virus, with the light- and dark-colored versions based on the HCoV-OC43 and MERS-CoV centered priors, respectively. Genomic surveillance has been a hallmark of the COVID-19 pandemic that, in contrast to other pandemics, achieves tracking of the virus evolution and spread worldwide almost in real time (4). The construction of NRR1 is the most conservative as it is least likely to contain any remaining recombination signals. Open reading frames are shown above the breakpoint plot, with the variable-loop region indicated in the S protein.
In light of these time-dependent evolutionary rate dynamics, a slower rate is appropriate for calibration of the sarbecovirus evolutionary history. Nucleotide positions for phylogenetic inference are 147–695, 962–1,686 (first tree), 3,625–9,150 (second tree, also BFR B), 9,261–11,795 (third tree, also BFR C), 12,443–19,638 (fourth tree) and 23,631–24,633, 24,795–25,847, 27,702–28,843 and 29,574–30,650 (fifth tree). Smuggled pangolins were carrying viruses closely related to the one sweeping the world, say scientists. Of importance for future spillover events is the appreciation that SARS-CoV-2 has emerged from the same horseshoe bat subgenus that harbours SARS-like coronaviruses. Evolutionary origins of the SARS-CoV-2 sarbecovirus lineage responsible for the COVID-19 pandemic, https://doi.org/10.1038/s41564-020-0771-4. 4 TMRCAs for SARS-CoV and SARS-CoV-2. Nevertheless, the viral population is largely spatially structured according to provinces in the south and southeast on one lineage, and provinces in the centre, east and northeast on another (Fig. Several of the recombinant sequences in these trees show that recombination events do occur across geographically divergent clades. Using a third consensus-based approach for identifying recombinant regions in individual sequences, with six different recombination detection methods in RDP5 (ref. A third approach attempted to minimize the number of regions removed while also minimizing signals of mosaicism and homoplasy. In this study, we report the case of a child with severe combined immu presenting a prolonged severe acute respiratory syndrome coronavirus 2 infection. SARS-CoV-2 and RaTG13 are also exceptions because they were sampled from Hubei and Yunnan, respectively.
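The nucleotide ranges listed above for each tree are 1-based and inclusive, so extracting and concatenating them from an aligned sequence needs a small index shift. A minimal Python sketch follows, using the fifth-tree coordinates quoted in the text. The function names are hypothetical and the toy sequence in the usage line is illustrative only.

```python
# Fifth-tree coordinates as quoted in the text (1-based, inclusive).
FIFTH_TREE_REGIONS = [
    (23631, 24633),
    (24795, 25847),
    (27702, 28843),
    (29574, 30650),
]


def extract_regions(sequence, regions):
    """Concatenate the 1-based inclusive (start, end) ranges of `sequence`."""
    # Python slices are 0-based and end-exclusive, hence start - 1.
    return "".join(sequence[start - 1:end] for start, end in regions)


def concatenated_length(regions):
    """Total number of alignment columns covered by the ranges."""
    return sum(end - start + 1 for start, end in regions)


print(extract_regions("ABCDEFGHIJ", [(2, 4), (8, 9)]))
print(concatenated_length(FIFTH_TREE_REGIONS))
```

Running the same extraction on every row of the alignment gives the sub-alignment from which a given tree would be inferred.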
Sorting these breakpoint-free regions (BFRs) by length results in two segments >5 kb: an ORF1a subregion spanning nucleotides (nt) 3,625–9,150 and the first half of ORF1b spanning nt 13,291–19,628 (sequence numbering given in Source Data, https://github.com/plemey/SARSCoV2origins). An initial genomic sequence analysis found that the reemergence of COVID-19 in New Zealand was caused by a SARS-CoV-2 from the (now ancestral) lineage B.1.1.1 of the pangolin nomenclature (17). GitHub - cov-lineages/pangolin: Software package for assigning SARS-CoV-2 genome sequences to global lineages. Transparent bands of interquartile range width and with the same colours are superimposed to highlight the overlap between estimates. Pangolin-CoV is 91.02% and 90.55% identical to SARS-CoV-2 and BatCoV RaTG13, respectively, at the whole-genome level. It compares the new genome against the large, diverse population of sequenced strains using a In other words, a true breakpoint is less likely to be called as such (this is breakpoint-conservative), and thus the construction of a non-recombining region may contain true recombination breakpoints (with insufficient evidence to call them as such). In region A, we removed subregion A1 (nt positions 3,872–4,716 within region A) and subregion A4 (nt 1,642–2,113) because both showed PI signals with other subregions of region A. Visual exploration using TempEst (ref. 39) indicates that there is no evidence for temporal signal in these datasets (Extended Data Fig. Green boxplots show the TMRCA estimate for the RaTG13/SARS-CoV-2 lineage and its most closely related pangolin lineage (Guangdong 2019).
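The length-based sorting of breakpoint-free regions described at the start of the paragraph above can be checked with a few lines of Python. The sketch below uses only region coordinates quoted in the text (1-based, inclusive). The function names and the 5,000-nt default threshold parameter are illustrative assumptions, and the full set of BFRs in the source data is larger than these three.

```python
# Region coordinates quoted in the text (1-based, inclusive).
REGIONS = [
    (3625, 9150),    # ORF1a subregion
    (9261, 11795),   # region C
    (13291, 19628),  # first half of ORF1b
]


def region_length(start, end):
    """Length in nucleotides of an inclusive 1-based range."""
    return end - start + 1


def long_regions(regions, min_length=5000):
    """Return regions longer than `min_length`, longest first."""
    kept = [r for r in regions if region_length(*r) > min_length]
    return sorted(kept, key=lambda r: region_length(*r), reverse=True)


for start, end in long_regions(REGIONS):
    print(start, end, region_length(start, end))
```

With these inputs the two regions longer than 5 kb are the ORF1b and ORF1a segments, matching the two segments named in the text.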
Complete genome sequence data were downloaded from GenBank and ViPR; accession numbers of all 68 sequences are available in Supplementary Table 4. This is notable because the variable-loop region contains the six key contact residues in the RBD that give SARS-CoV-2 its ACE2-binding specificity (refs. 27,37). Coronavirus: Pangolins found to carry related strains. In the variable-loop region, RaTG13 diverges considerably with the TMRCA, now outside that of SARS-CoV-2 and the Pangolin Guangdong 2019 ancestor, suggesting that RaTG13 has acquired this region from a more divergent and undetected bat lineage. If stopping an outbreak in its early stages is not possible, as was the case for the COVID-19 epidemic in Hubei, identification of origins and point sources is nevertheless important for containment purposes in other provinces and prevention of future outbreaks. Regions A–C were further examined for mosaic signals by 3SEQ, and all showed signs of mosaicism.
