Pangolin study uncovers novel strains of coronaviruses and raises concerns about origin

In the last century, different regions of the world have experienced viral outbreaks caused by human pathogenic coronaviruses (CoVs) that have resulted in epidemics and pandemics. More recently, the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) caused the ongoing coronavirus disease 2019 (COVID-19) pandemic, whereas the Middle East respiratory syndrome coronavirus (MERS-CoV) was the causal agent of the MERS outbreak.

In many instances, zoonotic spillovers are responsible for the emergence of novel viruses, as demonstrated by MERS-CoV, which was transmitted to humans from camels. Bats are considered the natural reservoir hosts of both SARS-CoVs and MERS-CoVs.

Study: Identification of a novel HKU4-related coronavirus in single-cell datasets and clade viral host analysis.

*Important notice: bioRxiv publishes preliminary scientific reports that are not peer-reviewed and, therefore, should not be regarded as conclusive, guide clinical practice/health-related behavior, or treated as established information.

A recent study posted to the bioRxiv* preprint server identifies novel HKU4-related CoVs in a pangolin single-cell sequencing dataset.


There are two hypotheses regarding the origin of SARS-CoV-2. The first states that the virus spilled over from animals at a seafood market in Wuhan, China. The second proposes that the virus was accidentally released from a laboratory in Wuhan that was researching SARS-related CoVs.

HKU4-related CoVs belong to the Merbecovirus subgenus of the Betacoronavirus genus. MERS-CoV and HKU4-CoV use the dipeptidyl peptidase 4 (DPP4) receptor to enter host cells.

A SARS-CoV-2-related partial genome was identified in pangolin tissue sequencing datasets in late 2019. Two SARS-CoV-2-related strains, Guangdong (GD) and Guangxi (GX), were identified in pangolin tissue samples after the COVID-19 outbreak.

The GD pangolin coronavirus strain exhibited 97% similarity to SARS-CoV-2 in the receptor-binding domain (RBD). In addition, the GD pangolin-related CoV (PCoV) bound the human angiotensin-converting enzyme 2 (ACE2) receptor with ten-fold higher affinity than it bound pangolin ACE2.

GX pangolin coronaviruses were first identified in Malayan pangolin (Manis javanica) tissue samples. Both GD and GX appeared to be rare, with no identified infection in the wild. 

HKU4-related CoVs, including MjHKU4r-CoV-1 and its variants, were identified in four of eighty-six pangolins seized by Guangxi customs. Although the pangolins were determined to be poorly susceptible to SARS-related CoVs, their kidney proximal tubule cells were inferred to be susceptible.

About the study

The current study identified a novel CoV, HKU4r-BGI-2020, in seven pangolin single-cell ribonucleic acid sequencing (RNA-Seq) datasets. Phylogenetically, HKU4r-BGI-2020 belongs to clade b, which also contains the pangolin coronaviruses MjHKU4r-CoV-1 and PCoV HKU4-P251T and the Tylonycteris robustula bat CoV 162275. This placement holds at the full-genome level as well as for the RNA-dependent RNA polymerase (RdRp), spike (S), and nucleocapsid (N) genes.

HKU4r-BGI-2020 exhibited 97.34% nucleotide similarity to MjHKU4r-CoV-1, a recently identified pangolin coronavirus, 97.23% similarity to HKU4-P251T, and 92.86% similarity to the bat Tylonycteris robustula CoV isolate. MjHKU4r-CoV-1, MjHKU4r-CoV-2, MjHKU4r-CoV-3, and MjHKU4r-CoV-4 were detected in four next-generation sequencing (NGS) datasets generated from Manis javanica pangolins seized by Guangxi customs.
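For readers unfamiliar with how such percentages arise, the following is a minimal sketch of a pairwise nucleotide identity calculation over two pre-aligned sequences. It is an illustration only: the article does not state which alignment tool or identity definition the authors used, the function name `percent_identity` is hypothetical, and the toy sequences are invented rather than taken from any viral genome.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of ungapped alignment columns in which both bases match.

    Assumes the two sequences were already aligned (same length, '-' for gaps).
    Skipping gapped columns is one common convention, chosen here as an assumption.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")

    compared = 0
    matches = 0
    for base_a, base_b in zip(aligned_a.upper(), aligned_b.upper()):
        if base_a == "-" or base_b == "-":
            continue  # ignore columns that contain an alignment gap
        compared += 1
        if base_a == base_b:
            matches += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment, not real viral sequence.
toy_a = "ATGCTAGC-TAGGCTA"
toy_b = "ATGCTAGCATAGCCTA"
identity = percent_identity(toy_a, toy_b)
print(f"{identity:.2f}% identity over ungapped columns")
```

Different conventions (for example, counting gaps as mismatches) give slightly different percentages, which is one reason reported similarity values should be read alongside the method used to produce them.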

Interestingly, a partial HKU4-related CoV with 97.48% similarity to PCoV HKU4-P251T was supported by only 25 matching reads in the NGS dataset. Given the low read count, it is not clear whether this CoV stems from accidental contamination.

The NGS datasets generated from samples HKU4-GX, PPeV-GX, and GX19-89 in BioProject PRJNA901878 consisted almost entirely of Sus scrofa (pig) genomic content rather than that of M. javanica or Rhizomys pruinosus.

Given these findings, it is implausible that the identified CoV was related to M. javanica samples.

In forensic genomics, the ratio of viral reads to animal reads can be used to assess whether a particular animal is a likely host. The presence of Pangolin hunnivirus isolate GX/HKU4-GX/2020 in the S. scrofa NGS dataset, together with the similarity of these sequences to pangolin CoV MjHKU4r-CoV-1, suggests that they are related to laboratory research.
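The read-ratio reasoning can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the authors' pipeline: the function name `summarize_dataset` is hypothetical, the per-species read counts are invented placeholders, and a real analysis would first need to classify reads by species with a dedicated tool.

```python
def summarize_dataset(viral_reads: int, host_reads: dict[str, int]) -> dict:
    """Report the dominant species in a dataset and the viral-to-animal read ratio."""
    if viral_reads < 0 or any(count < 0 for count in host_reads.values()):
        raise ValueError("read counts must be non-negative")

    total_host = sum(host_reads.values())
    if total_host == 0:
        raise ValueError("dataset contains no animal reads")

    # The species contributing the most reads is the most plausible sample source.
    dominant = max(host_reads, key=host_reads.get)
    return {
        "dominant_species": dominant,
        "dominant_fraction": host_reads[dominant] / total_host,
        "viral_to_host_ratio": viral_reads / total_host,
    }


# Hypothetical placeholder counts, not values from the study.
toy_host_reads = {
    "Sus scrofa": 980_000,
    "Manis javanica": 1_500,
    "Rhizomys pruinosus": 500,
}
summary = summarize_dataset(viral_reads=40, host_reads=toy_host_reads)
print(summary["dominant_species"], f"{summary['viral_to_host_ratio']:.2e}")
```

In a dataset like this toy one, where nearly all animal reads come from a species other than the claimed host and the viral reads are very few, a natural infection of the claimed host becomes hard to support.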

Around 98% of HKU4-related reads were found in a large intestine single-cell nuclei RNA-Seq dataset. An overall moderate correlation between bacterial content and viral read content was found. Stomach, lung, liver, duodenum, and heart sample datasets also contained the virus in trace amounts.
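As a rough illustration of what a correlation between bacterial and viral read content means, the sketch below computes a Pearson coefficient across per-dataset values. The choice of Pearson is an assumption, since the article does not say which statistic the authors used, and the numbers are invented toy values rather than measurements from the study.

```python
import math


def pearson_r(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")

    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return covariance / math.sqrt(spread_x * spread_y)


# Hypothetical per-dataset values (arbitrary units), one entry per tissue dataset.
toy_bacterial_content = [1.0, 2.0, 3.0, 4.0, 5.0]
toy_viral_reads = [2.0, 1.0, 4.0, 3.0, 5.0]
r = pearson_r(toy_bacterial_content, toy_viral_reads)
print(f"Pearson r = {r:.2f}")
```

A coefficient near 1 would mean that datasets with more bacterial content consistently carry more viral reads, while a moderate value indicates a looser association.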

No HKU4-related CoV reads were detected in the spleen or kidney datasets, suggesting that the animals were not infected naturally and that the sequences were instead acquired in the laboratory during the processing of organ samples. The current study infers that pangolins are unlikely to be intermediate hosts of betacoronaviruses.


The novel CoV was identified using single-cell RNA-Seq datasets obtained from a single pangolin that died of natural causes.

HKU4r-BGI-2020 represents the fourth HKU4-related CoV to be assigned to the newly identified “clade b.” This clade is phylogenetically distinct from the known HKU4-related CoVs. MjHKU4r-CoV-2 and MjHKU4r-CoV-4 were determined to be close variants of this isolate.


Journal reference:
  • Preliminary scientific report. Jones, A., Massey, S. E., Nemzer, L. R., et al. (2023) Identification of a novel HKU4-related coronavirus in single-cell datasets and clade viral host analysis. bioRxiv. doi:10.1101/2023.06.18.545480


Written by

Dr. Priyom Bose

