
SARS-CoV-2 genome sequencing effort

COVID-19 response by scientists at DBT-inStem and NCBS-TIFR, Bengaluru

COVID-19, the disease caused by the SARS-CoV-2 virus, is an ongoing pandemic affecting millions of people worldwide, with lakhs of cases in India. Sequencing viral genomes from infected individuals across the country can help us understand transmission chains, the dynamics of the global pandemic and viral genome evolution. To facilitate this effort, DBT-inStem and NCBS-TIFR are part of a multi-institutional consortium, the PAN-India 1000 SARS-CoV-2 RNA genome consortium, anchored by the Department of Biotechnology, India, which aims to sequence more than 1000 SARS-CoV-2 viral genomes from around the country.

Viral genomic RNA isolated from patient samples is sequenced to identify changes, or mutations, in order to study the evolution of the virus and better understand how it has changed over the course of the pandemic. Tracing the transmission chains of particular viral genomes might clarify the dynamics of the virus, including whether certain viral sequences are spreading more than others. These data could in turn enable public health agencies to manage the epidemic more efficiently. Because viruses mutate so fast, identifying novel mutations in Indian viral genomes will help researchers in the country design efficient and sensitive diagnostic tests, assess drugs and develop vaccines that take local variation into account.
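To illustrate what identifying mutations means in practice, the minimal Python sketch below compares a sample sequence against a reference and lists single-base substitutions. It is only an illustration: the function name find_substitutions and the toy sequences are hypothetical, the two sequences are assumed to be already aligned and of equal length, and real analyses rely on dedicated alignment and variant-calling software rather than a direct comparison like this.

```python
def find_substitutions(reference, sample):
    """List single-base substitutions between two pre-aligned sequences.

    Assumes both sequences have the same length and are already aligned.
    Positions where either base is not A, C, G or T (for example 'N' for
    an unknown base, or '-' for a gap) are skipped.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")

    substitutions = []
    for index, (ref_base, sample_base) in enumerate(
        zip(reference.upper(), sample.upper())
    ):
        if ref_base == sample_base:
            continue
        if ref_base in "ACGT" and sample_base in "ACGT":
            # Report 1-based positions, e.g. "A5T" means A changed to T at position 5.
            substitutions.append(f"{ref_base}{index + 1}{sample_base}")
    return substitutions


if __name__ == "__main__":
    # Toy sequences, not real viral data.
    print(find_substitutions("ACGTACGT", "ACGTTCGA"))
```

Skipping ambiguous bases keeps low-quality positions from being reported as mutations, which matters for samples that were sequenced at low depth.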

Viral genome data can provide information about a virus's relatedness to other viruses, its mode and rate of evolution, its geographical spread and its adaptation to human hosts. This information can be used to assist in epidemiological investigations, particularly when combined with other types of data, e.g. case counts. The study can also help lay the foundation for predicting similar viral outbreaks in the near future. Multiple studies in the past have shown that members of the coronavirus family evolve by constantly mutating their genomes, and multiple variations in the SARS-CoV-2 genome have been observed (for example, Phan T., 2020). As the number of cases in India increases, a focus on capturing the viral genome and its diversity (if any) in the country is timely and important. Representative sequences from different regions can reveal the different lineages of the virus prevalent in India, their origins, the rate at which mutations arise, and whether mutations are accumulating at different rates. This information can find immediate application in public health interventions and, in the longer term, in the development of therapies, vaccines and ways of preventing transmission and spread.

As a first step, the consortium is coordinating efforts to sequence and analyze the viral genome to understand the trajectory of SARS-CoV-2 in the country. The SARS-CoV-2 genome is encoded by RNA, which is converted to complementary DNA (cDNA) for sequencing. Genome sequencing requires many copies of this DNA, so that each part of the genome is read many times. The number of times a particular area of the genome has been sequenced is known as the ‘coverage’. Viral RNA extracted from a patient’s swab is amplified to identify the genes unique to coronavirus.
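The idea of coverage can be made concrete with a small Python sketch. It assumes, purely for illustration, that each sequencing read has already been mapped to the genome and is described by a half-open, 0-based (start, end) interval; the function names per_base_coverage and mean_coverage and the toy numbers are hypothetical and are not taken from the consortium's pipeline.

```python
def per_base_coverage(genome_length, reads):
    """Return how many reads cover each position of the genome.

    reads is a list of (start, end) pairs, 0-based and half-open,
    so a read (0, 5) covers positions 0, 1, 2, 3 and 4.
    """
    # Difference array: +1 where a read starts, -1 just past where it ends.
    changes = [0] * (genome_length + 1)
    for start, end in reads:
        start = max(0, start)
        end = min(genome_length, end)
        if start < end:
            changes[start] += 1
            changes[end] -= 1

    coverage = []
    depth = 0
    for position in range(genome_length):
        depth += changes[position]
        coverage.append(depth)
    return coverage


def mean_coverage(coverage):
    """Average depth across all positions (0.0 for an empty genome)."""
    return sum(coverage) / len(coverage) if coverage else 0.0


if __name__ == "__main__":
    # Toy example: a 10-base genome and three mapped reads.
    depths = per_base_coverage(10, [(0, 5), (3, 8), (4, 10)])
    print(depths)
    print(mean_coverage(depths))
```

Positions with low depth are the ones where a base call is least reliable, which is why coverage is usually inspected before mutations are reported.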

Scientists at inStem and NCBS have completed viral genome sequencing of close to 100 referral samples that tested positive for the virus at the joint COVID-19 testing centre on campus. The sequencing approaches used include shotgun and targeted (the ARTIC protocol) viral genome sequencing. The shotgun sequencing approach requires minimal sample processing and can be performed quickly, but works poorly for samples with low viral loads. The ARTIC sequencing approach selectively amplifies regions of the viral genome; this can enrich viral sequences even from samples with low viral load, but requires complex sample processing. Genome assembly and analysis, when performed on a large scale, require computing servers with high processing power and memory. Also required are software pipelines that balance speed and accuracy while retaining the flexibility to set that balance at a level appropriate to the goals of the analysis. Scientists on campus have taken the first steps towards developing such a pipeline, named CoVa, which is under continuous development.

We propose to analyse genome sequence data, in light of publicly available data generated from around the world, to track the genetic evolution of SARS-CoV-2 in the Indian populace. Effective analysis will require continuous monitoring of emerging strains of the virus by sequencing. No individual effort can achieve the scale of sequencing required, and it is essential that data be shared openly. Further, genomic analysis of pandemic agents would benefit from data sourced from diverse geographies, necessitating global collaboration.


Dasaradhi Palakodeti is a member of the faculty at the Institute for Stem Cell Science & Regenerative Medicine (DBT-inStem) at Bangalore. A major focus of his laboratory’s research is regulatory networks involving coordination between miRNAs, several RNA-binding proteins and various other RNA species in cell fate transitions. Das is a recipient of the DST Swarnajayanti Fellowship (Biology) and was, till recently, a Wellcome Trust DBT India Alliance Intermediate Fellowship awardee.

Uma Ramakrishnan is a member of the faculty at the National Centre for Biological Sciences (NCBS-TIFR) at Bangalore. Her research focuses on revealing the processes that drive patterns of mammalian genetic variation (in the present and the past). Her research applies molecular methods in combination with computational techniques developed by her group for the analysis of modern and archival DNA. Uma is a Wellcome Trust DBT India Alliance Senior Research Fellow.

Aswin Sai Narain Seshasayee is a member of the faculty at the National Centre for Biological Sciences (NCBS-TIFR) at Bangalore. His laboratory investigates bacterial gene regulation and adaptation on a genomic scale, using experimental and computational tools. Aswin is a Wellcome Trust DBT India Alliance Intermediate Fellow and a DST Ramanujan Fellowship awardee till recently.