Nita Parekh

Assistant Professor

Center for Computational Natural Sciences and Bioinformatics                         

International Institute of Information Technology

Gachibowli, Hyderabad – 500 032

Email: nita@iiit.ac.in

Ph: 66531183

 

 

Research interests: Bioinformatics and Systems Biology

 

·        Genome Analysis - Pattern recognition, Comparative genomics, Data mining, Developing Algorithms and Specialized Databases:

o       An Integrated Data Mining Tool for Function Analysis of SNPs  - ComPreSNPdb (In collaboration with CCMB, Hyderabad)

o       Identifying Genomic Islands and Pathogenicity Islands – developed a web-based tool - IGIPT

o       Development of Comprehensive Gene Database - CHGD (In collaboration with CCMB, Hyderabad)

o       Comparative genomics approach to identify cis-regulatory elements in plant genomes

 

·        Proteomics - Pattern recognition, graph theory approaches to protein structure analysis:

o       Identifying Peptide Periodic Repeats in Protein Sequences

§         Web-based Tool – PEPPER

§         Web-based Database - DRiPS

o       Graph Theory Approach for Analyzing Protein Structures

§         Identifying tandem structural repeats

§         Identifying domains

o       Computational analysis of protein conformational changes

 

·        Systems Biology

o       Dynamical Systems Modeling of Biological Systems

§         Analysis of Dynamical networks

 

 

Teaching:

·        PG courses: Biostatistics, Introduction to Bioinformatics, Advanced Bioinformatics, Projects in Computational Biology (core courses)

·        UG courses: Elements of Bioinformatics (open elective)

 

 

Details about Ongoing Research Activities

 

I         Identifying Repeats in Protein Sequences

 

Introduction

Identifying tandem repeats in the proteome of any organism is important not only for understanding the structure and function of the proteins but also for analyzing the association of abnormal expansion of repeat regions with disorders. We have developed an efficient tool for identifying Peptide Periodic Repeats (PEPPER) in protein sequences. The tool identifies tandem peptide repeats (TPRs), contiguous single amino acid repeats (SAARs), and periodic occurrences of single amino acids (POSAAs). For each repeat it reports the consensus repeat pattern, the complete repeat region, the score, and the alignment of the consensus with the repeat region, along with the percentage of mismatches and insertions/deletions. A database of periodic peptide repeats (DRiPS) has also been developed. We are currently developing an algorithm for identifying repeated motifs in protein sequences. PEPPER is available at: http://ccnsb.iiit.ac.in/nita/PEPPER.
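As a rough illustration of two of the repeat classes described above, the following Python sketch finds contiguous single amino acid repeats (SAARs) and exact tandem peptide repeats in a sequence. It is not the PEPPER algorithm: the function names, the default thresholds and the restriction to exact (mismatch-free, indel-free) repeats of a fixed unit length are assumptions made purely for illustration.

```python
import re


def find_saars(seq, min_len=4):
    """Return (start, residue, length) for each run of a single residue
    that is at least min_len long. min_len is an illustrative default."""
    pattern = re.compile(r"(.)\1{%d,}" % (min_len - 1))
    return [(m.start(), m.group(1), len(m.group(0)))
            for m in pattern.finditer(seq)]


def find_exact_tandem_repeats(seq, unit_len, min_copies=2):
    """Return (start, unit, copies) for exact tandem repeats whose repeat
    unit has length unit_len. Units made of one residue are skipped,
    since those are reported as SAARs instead."""
    hits = []
    i = 0
    n = len(seq)
    while i + unit_len * min_copies <= n:
        unit = seq[i:i + unit_len]
        copies = 1
        # Count how many times the unit is repeated back to back.
        while seq[i + copies * unit_len:i + (copies + 1) * unit_len] == unit:
            copies += 1
        if copies >= min_copies and len(set(unit)) > 1:
            hits.append((i, unit, copies))
            i += copies * unit_len  # jump past the whole repeat region
        else:
            i += 1
    return hits


if __name__ == "__main__":
    example = "MKQQQQQLAGPTGPTGPTAA"  # made-up sequence, not real data
    print(find_saars(example))
    print(find_exact_tandem_repeats(example, unit_len=3))
```

A tool that tolerates mismatches and insertions/deletions would replace the exact string comparison with an alignment score against a consensus pattern.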

 

Related Publications:

1.      PEPPER – A Tool for Identifying Peptide Periodic Repeats, poster presentation in the International Conference in Bioinformatics, Dec 18-20, 2006, New Delhi. Report no: IIIT/TR/2008/38.

 

2.      Protein Tandem Repeat DataBase (PTRDB), Krishna Manjari, V. Kiran Kumar, Rima Kumari and Nita Parekh, poster presentation at the International Conference on Bioinformatics & Drug Discovery, Hyderabad Central University, Dec 20-22, 2007. (Awarded 5th best poster prize). Report no: IIIT/TR/2008/69.

 

3.      Database of Repeats in Protein Sequences (DRiPS), Rima Kumari and Nita Parekh, poster presentation at the 19th International Conference on Genome Informatics (GIW2008), held at Gold Coast, Australia, 1-3 Dec 2008.

 

Associated people:

K. Kasturi Kiran (M.Tech 2004), Radhika B. (M.Tech 2004), Krishna Manjari (MTech 2006-08), V. Kiran Kumar (MTech 2006-08), Rima Kumari (MS by Research 2006), B. Sreekanth (MTech 2008), Nita Parekh

 

 

II        An Integrated Tool for SNP Function Analysis

 

Introduction

Single nucleotide polymorphisms (SNPs) are commonly used in association studies to find genes responsible for complex genetic diseases. Complex diseases may involve many genes and hundreds of alleles, but only a small portion of these are functional polymorphisms that contribute to disease phenotypes. Assessing the risk requires access to a variety of heterogeneous biological databases and analytical tools. We have developed a web server that facilitates the functional analysis of SNPs by mining data from various resources and providing a detailed report for the query.

 

Related Publications:

1.      CompreSNPdb: Comprehensive data-mining workflow for SNPs, Genes, Diseases and Pathways, Anshu Bhardwaj, Ajeet Pandey, P. Krishna Manjari, V. Kiran Kumar, Nita Parekh and Shrish Tiwari (in preparation)

 

Associated people:

Kasturi Nadella (MSIT, 2002), Ajeet Pandey (MTech 2005), Anshu Bharadwaj (PhD student, CCMB), Shrish Tiwari (Sct., CCMB), Krishna Manjari (MTech 2006-08), V. Kiran Kumar (MTech 2006-08), Nita Parekh

 

 

III       Identifying Genomic Islands and Pathogenicity Islands

 

Introduction

In recent years many different genomic islands have been discovered in a variety of pathogenic as well as non-pathogenic bacteria. Because they promote genetic variability, genomic islands (GIs) play an important role in microbial evolution. We have developed a web-based integrated platform for the identification of genomic islands, in which various measures that capture bias in nucleotide composition have been implemented, viz., GC content (both at the whole-genome level and at the three codon positions in genes), genomic signature, k-mer distribution (k=2–6), codon usage bias and amino acid usage bias. The tool carries out the analysis in sliding windows (default size 10 kb) and compares each measure with the genomic average to identify probable genomic islands. The output is displayed in tabular form for each window and can be filtered to show the windows whose values differ from the genomic average by 1.5σ (standard deviations). The tool also provides an option to extract the flanking regions of predicted GIs for further analysis. Availability: http://ccnsb.iiit.ac.in/IGIPT/.
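The sliding-window comparison described above can be sketched in Python for the simplest of the measures, GC content. This is an illustrative sketch, not the IGIPT implementation: the function names, the use of non-overlapping windows by default, and taking the mean and standard deviation over the window values (as a stand-in for the genomic average) are all assumptions.

```python
from statistics import mean, pstdev


def gc_content(seq):
    """Fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def flag_atypical_windows(genome, window=10000, step=None, threshold=1.5):
    """Return (start, gc, z) for every window whose GC content lies more
    than `threshold` standard deviations from the mean over all windows.

    window=10000 mirrors the 10 kb default mentioned in the text; step
    defaults to the window size (non-overlapping windows), an assumption.
    """
    step = step or window
    starts = range(0, len(genome) - window + 1, step)
    values = [gc_content(genome[s:s + window]) for s in starts]
    if not values:
        return []
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # perfectly uniform composition: nothing stands out
    flagged = []
    for start, value in zip(starts, values):
        z = (value - mu) / sigma
        if abs(z) > threshold:
            flagged.append((start, value, z))
    return flagged


if __name__ == "__main__":
    # Toy "genome": AT-rich background with one GC-rich block at the end.
    toy = "AT" * 450 + "GC" * 50
    for start, gc, z in flag_atypical_windows(toy, window=100):
        print(start, gc, round(z, 2))
```

The same loop structure would apply to the other measures (k-mer distribution, codon usage bias and so on) by swapping `gc_content` for a different per-window statistic.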

 

Related Publications:

1.      Integrated Genomic Island Prediction Tool (IGIPT), Ruchi Jain, Sandeep Ramineni and Nita Parekh, International Conference on Information Technology (ICIT), 2008, pp. 131-132.

 

 2.  Genomic Islands Identification in Prokaryotic Genomes (GIIPro), Ruchi Jain and Nita Parekh, accepted in International Conference on Bioinformatics & Drug Discovery, Dec 20 – 22, 2007, Hyderabad. (Awarded 2nd best poster prize) Report no: IIIT/TR/2008/28.

 

Associated people:

Senthil Kumar (PhD student), Rishi Arvind (MTech 2004), Hemanth Sanna Reddy (MTech 2005), Ruchi Jain (MS by Research, 2007-08), Sandeep Ramineni (Project student), Tulasi and Keerthija (MTech 2008), Nita Parekh

 

 

IV      Analysis of Protein Structures Using Graph Theory Approach

 

Introduction

We have been exploring the use of graph theory methods to identify clusters in protein structures, e.g., tandem structural repeats, domains, etc. Graph theory is a branch of discrete mathematics applied to the study of various real-world networks and their properties. Protein molecules, viewed as sets of atoms or groups of atoms (vertices) connected by covalent bonds (edges), have also been extensively investigated using graph theory. The structure of proteins is governed to a large extent by non-covalent interactions, and graph theory captures this 3-dimensional topology, providing insights into the structures of proteins. Analysis of the topological details of proteins with known structures, such as the clustering of specific types of amino acids important for structure, folding and function, is of great value now that a large number of protein structures are available. We have been focusing on the identification of tandem structural repeats and domains by graph spectral analysis.
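To give a flavour of graph spectral analysis, the Python sketch below splits a graph into two clusters using the sign of the Fiedler vector, i.e., the eigenvector belonging to the second-smallest eigenvalue of the graph Laplacian. It is a generic textbook technique shown under stated assumptions, not the method of the publication listed below: the graph is assumed to be an unweighted residue contact graph given as an edge list, the function names are hypothetical, and the eigenvector is approximated by a plain power iteration so that the example needs no external libraries.

```python
def fiedler_vector(n, edges, iterations=500):
    """Approximate the Fiedler vector of a connected undirected graph
    with n >= 2 nodes, using power iteration on (shift*I - L)."""
    neighbours = [set() for _ in range(n)]
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    degree = [len(nb) for nb in neighbours]
    # The largest Laplacian eigenvalue is at most 2 * max degree, so this
    # shift makes every eigenvalue of (shift*I - L) positive.
    shift = 2 * max(degree) + 1

    # Start from a ramp that is orthogonal to the constant vector.
    v = [i - (n - 1) / 2 for i in range(n)]
    for _ in range(iterations):
        # w = (shift*I - L) v with L = D - A
        w = [(shift - degree[i]) * v[i] + sum(v[j] for j in neighbours[i])
             for i in range(n)]
        avg = sum(w) / n
        w = [x - avg for x in w]  # project out the constant eigenvector
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v


def spectral_bipartition(n, edges):
    """Split nodes into two clusters by the sign of the Fiedler vector."""
    v = fiedler_vector(n, edges)
    part_a = [i for i in range(n) if v[i] < 0]
    part_b = [i for i in range(n) if v[i] >= 0]
    return part_a, part_b


if __name__ == "__main__":
    # Toy contact graph: two densely connected groups of four "residues"
    # joined by a single contact between nodes 3 and 4.
    toy_edges = [(i, j) for i in range(4) for j in range(i + 1, 4)]
    toy_edges += [(i, j) for i in range(4, 8) for j in range(i + 1, 8)]
    toy_edges.append((3, 4))
    print(spectral_bipartition(8, toy_edges))
```

In practice a numerical library's eigenvalue solver would replace the power iteration, and the partition could be applied recursively to obtain more than two clusters.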

 

Related Publications:

1.   Graph Spectral Approach for Identifying Protein Domains, Hari Krishna Yalamanchili and Nita Parekh, 1st International Conference on Bioinformatics and Computational Biology (BICoB) 2009, 8 – 10 April, 2009, New Orleans, USA (paper accepted for publication in Springer Verlag Conference Proceedings)

 

Associated people:

Ramesh Nerella (MTech 2005), Ruchi Jain (MS by Research, 2006-08), Y. Hari Krishna (MS by Research, 2008), Nita Parekh

 

 

V       Development of Comprehensive Gene Database

 

Introduction:

An important pattern recognition problem in biological sequences is gene prediction, i.e., identifying the regions that code for proteins. Important questions in gene prediction include the following: what are the conserved patterns or motifs in the exonic and intronic regions of eukaryotic genes, how can splice sites be recognized, and which promoters and regulatory sequences are found in the vicinity of genic regions? Developing a specialized database of genes would greatly facilitate this analysis. We have developed a Comprehensive Gene Database (CGD) of mammals by integrating information from various NCBI resources.

 

Related Publication:

1.   Gene Prediction in silico, at the National Seminar on Bioinformatics and Functional Genomics, conducted by Bioinformatics Centre, Pondicherry University, Feb 15 – 17, 2005.

2.   Computational Issues in Gene Prediction, at 40th National Convention of Computer Society India, hosted by CSI Hyderabad Chapter, Nov 9 - 12, 2005.

3.   Tool to find Absolute Location of Genes in Human Genome, presented at the National Seminar on Systems Approach to Bioinformatics, conducted by Bioinformatics Centre, Pondicherry University, Feb 18 - 20, 2004. (Report no: IIIT/TR/2004/31)

 

Associated people:

G. Madhukar Reddy (MSIT 2002), Ch. Jagan Mohan Reddy (MSIT 2002), Sai Deepthi (MSIT 2002), Kasturi Nadella (MSIT 2002), B. Subramanyam Sarath (M.Tech 2004), Ramesh Narella (M.Tech 2005), Shrish Tiwari (Sct., CCMB), Nita Parekh

 

 

VI      Dynamical Systems Modeling of Biological Systems

 

Introduction

Networks of coupled dynamical systems have been used to model biological oscillators, excitable media, neural networks, genetic control networks and many other self-organizing systems. In general, the connection topology is assumed to be either completely regular (e.g., a diffusively coupled system) or completely random. However, most biological networks lie somewhere between these two extremes. We would like to explore some simple models of networks that can be tuned through this middle ground: regular networks re-wired to introduce increasing amounts of disorder. These systems, called small-world networks, can be highly clustered, like regular lattices, yet have small characteristic path lengths, like random graphs. From the perspective of nonlinear dynamics, it would be interesting to understand how a network of interacting dynamical systems (be they neurons, chemical concentrations, or species populations) behaves collectively, given their individual dynamics and coupling architecture. Currently we are analyzing the dynamics of a simple dynamical system, the logistic map, embedded on various networks (regular, small-world, scale-free and random) and studying the synchronizability of the network.
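The kind of numerical experiment described above can be sketched in a few lines of Python. The text does not specify the coupling scheme, so the update rule below, x_i(t+1) = (1 - eps) f(x_i(t)) + eps * (mean of f over the neighbours of i) with f the logistic map, is an assumption (a commonly used coupled-map form), as are the function names and the use of the spread max(x) - min(x) as a crude synchronization measure. Only regular ring and fully connected networks are built here; small-world, scale-free or random networks could be passed in as neighbour lists of the same shape.

```python
def logistic(x, r=4.0):
    """Logistic map f(x) = r * x * (1 - x)."""
    return r * x * (1.0 - x)


def ring_network(n, k=1):
    """Regular ring: node i is linked to its k nearest neighbours on
    each side. Returns a list where entry i lists the neighbours of i."""
    return [sorted({(i + d) % n for d in range(-k, k + 1)} - {i})
            for i in range(n)]


def complete_network(n):
    """Fully connected network on n >= 2 nodes."""
    return [[j for j in range(n) if j != i] for i in range(n)]


def step(state, neighbours, eps, r=4.0):
    """One synchronous update of all nodes (assumed coupling form).
    Every node must have at least one neighbour."""
    fx = [logistic(x, r) for x in state]
    return [(1.0 - eps) * fx[i] + eps * sum(fx[j] for j in nbrs) / len(nbrs)
            for i, nbrs in enumerate(neighbours)]


def spread_after(state, neighbours, eps, steps, r=4.0):
    """Iterate the network and return max(x) - min(x) at the end;
    a value near zero indicates a synchronized state."""
    for _ in range(steps):
        state = step(state, neighbours, eps, r)
    return max(state) - min(state)


if __name__ == "__main__":
    x0 = [0.1, 0.3, 0.6, 0.8]  # arbitrary initial conditions
    for eps in (0.0, 0.2, 0.75):
        print("ring    ", eps, spread_after(x0, ring_network(4), eps, 200))
        print("complete", eps, spread_after(x0, complete_network(4), eps, 200))
```

Scanning the coupling strength eps for each topology and recording where the spread collapses gives a simple, if rough, picture of how synchronizability depends on the network architecture.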

 

Related Publication:

1.      Controllability of Spatially Extended Systems Using the Pinning Approach, Nita Parekh and S. Sinha, Physica A 318, 200-212 (2003).

2.      Controlling Dynamics in Spatially Extended Systems, Nita Parekh and S. Sinha, Phys. Rev. E. 65, 036227-1 to 9 (2002).

3.      Global and Local Control of Spatiotemporal Chaos in Coupled Map Lattices, Nita Parekh, S. Parthasarathy and S. Sinha, Phys. Rev. Lett. 81, 1401 (1998).

 

Associated people:

Sunaina K. (MSIT 2002), Rishi Arvind (M.Tech 2004), Snehansu Ghosh (M.Tech 2004), Raina (post-BSc), Nita Parekh.