Vinod PK's Lab, Center for Computational Natural Sciences & Bioinformatics (CCNSB),
IIIT Hyderabad
13th August '20
Latest update using 1813 samples; clusters observed in Karnataka
6th July '20
We analyzed the spike protein sequences of Indian samples and observed 4 distinct clusters determined by the D614G mutation.
6th June '20
Based on the clustering pattern, we observed that C2 (clade I/A3i) is not a dominant cluster, and that C1 shows more diversity than C2. We also observed different clusters emerging from C2 in Gujarat and Maharashtra.
31st May '20
A Nextstrain phylogenetic tree analysis by CSIR labs reported a new cluster, designated clade I/A3i.
26th May '20
We analyzed the genomic sequences of SARS-CoV-2 in India using an alignment-free method based on Chaos Game Representation. We found two predominant clusters (C1 and C2) in India, with one cluster (C2) containing samples mostly from Telangana.
Progression of COVID19 in India from May to August
Multiple sequence alignment was performed over 872 Indian spike protein samples. The entropy (Hx) plot for all samples indicates that the clustering pattern is driven by position D614, which shows the maximum variability.
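For readers unfamiliar with entropy plots, the following is a minimal sketch of how a per-position Shannon entropy (Hx) can be computed from a multiple sequence alignment. It is not the lab's actual pipeline. The function names and the toy alignment are hypothetical, and gap characters are assumed to be counted as ordinary symbols.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of the residues in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def alignment_entropy(aligned_seqs):
    """Return one entropy value per column of an alignment.

    Assumes every sequence has the same length (already aligned).
    Gap characters such as '-' are treated like any other symbol.
    """
    if len({len(s) for s in aligned_seqs}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    return [column_entropy(col) for col in zip(*aligned_seqs)]


# Toy alignment with made-up residues (not real spike protein data).
toy_alignment = ["ADK", "AGK", "ADK", "AGK"]
entropies = alignment_entropy(toy_alignment)

# Index of the most variable column (0-based).
most_variable = max(range(len(entropies)), key=entropies.__getitem__)
```

In this toy case only the middle column varies, so it has the highest entropy. A position such as D614 would stand out in the same way in a real entropy plot.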
Diversity in the Genomic Sequences
Data Distributions
Chaos Game Representation
Chaos Game Representation is a 2D graphical representation of long 1D sequences based on oligonucleotide frequencies
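As an illustration, here is a minimal sketch of Chaos Game Representation and of the k-mer frequency grid that can be derived from it. It is not the exact method used for the figures on this page. The corner assignment for A, C, G and T, the choice k=3, and the function names are assumptions made for this example.

```python
# Assumed corner layout of the unit square; other conventions exist.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq):
    """Map a nucleotide sequence to a list of (x, y) points in the unit square.

    Starting from the centre, each base moves the current point halfway
    towards that base's corner. Symbols other than A/C/G/T are skipped.
    """
    x, y = 0.5, 0.5
    points = []
    for base in seq.upper():
        if base not in CORNERS:
            continue  # simplification: ignore ambiguous bases such as N
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points


def fcgr(seq, k=3):
    """Count CGR points on a 2^k x 2^k grid (a frequency CGR).

    Each cell corresponds to one k-mer, so the grid holds oligonucleotide
    counts. grid[row][col] uses row for y and col for x.
    """
    size = 2 ** k
    grid = [[0] * size for _ in range(size)]
    # The first k-1 points do not yet encode a full k-mer, so skip them.
    for x, y in cgr_points(seq)[k - 1:]:
        col = min(int(x * size), size - 1)  # clamp guards float round-off
        row = min(int(y * size), size - 1)
        grid[row][col] += 1
    return grid


# Flatten the grid into a fixed-length feature vector for a toy sequence.
features = [count for row in fcgr("ACGTACGTTAGC", k=3) for count in row]
```

Because every sequence maps to a vector of the same length (64 values for k=3) regardless of its own length, such vectors can be compared without alignment and passed to a dimensionality-reduction method such as t-SNE.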
Indian SARS-CoV-2 Sequences Visualized through t-SNE - Statewise Diversity
[Based on Data Accessed on 3rd June 2020]
SARS-CoV-2 Sequences Visualized through t-SNE - Countrywise Diversity
[Based on Data Accessed on 3rd June 2020]
Sample Distributions for Indian SARS-CoV-2 sequences
Phylogenetic tree of Indian Samples
Generated using NextStrain: Hadfield et al., Nextstrain: real-time tracking of pathogen evolution, Bioinformatics (2018)
CSIR labs describe a distinct phylogenetic cluster
The clade I/A3i accounts for '41% of all genomes sequenced and deposited in the public domain from multiple states in India'
[Banu, Sofia, et al. "A distinct phylogenetic cluster of Indian SARS-CoV-2 isolates." bioRxiv (2020)]
Interesting clusters observed in the Indian SARS-CoV-2 Sequences
The Indian samples, analyzed using an alignment-free approach based on Chaos Game Representation, show patterns in the t-SNE plot that conform to regional variations. One cluster from the state of Telangana is found to be diverging from the rest of the country.
The presence of clades observed in Indian samples by CSIR Lab
[updated: 14 May '20]
[Image Credits: Vinod Scaria Lab at CSIR Institute of Genomics & Integrative Biology]
Analysis performed by Ms. Ruchi Chauhan
Based on data from the Global Initiative on Sharing All Influenza Data (GISAID)
Details about Materials & Methods can be found here