Nita Parekh
Assistant Professor
Center for Computational
Natural Sciences and Bioinformatics
International Institute of
Information Technology
Gachibowli,
Email: nita@iiit.ac.in
Ph: 66531183
Research interests: Bioinformatics and Systems
Biology
· Genome Analysis - Pattern recognition, Comparative genomics, Data mining, Developing Algorithms and Specialized Databases:
  o An Integrated Data Mining Tool for Function Analysis of SNPs - ComPreSNPdb (In collaboration with CCMB,
  o Identifying
  o Development of Comprehensive Gene Database - CHGD (In collaboration with CCMB,
  o Comparative genomics approach to identify cis-regulatory elements in plant genomes
· Proteomics - Pattern recognition, graph theory approaches to protein structure analysis:
  o Identifying Peptide Periodic Repeats in Protein Sequences
    § Web-based Tool - PEPPER
    § Web-based Database - DRiPS
  o Graph Theory Approach for Analyzing Protein Structures
    § Identifying tandem structural repeats
    § Identifying domains
  o Computational analysis of protein conformational changes
· Systems Biology
  o Dynamical Systems Modeling of Biological Systems
    § Analysis of Dynamical networks
Teaching:
· PG courses: Biostatistics, Introduction to Bioinformatics, Advanced Bioinformatics, Projects in Computational Biology (core courses)
· UG courses: Elements of Bioinformatics (open elective)
Details about Ongoing Research Activities
I Identifying Repeats in Protein Sequences
Introduction
Identifying tandem repeats in the proteome of any organism is important not
only for understanding the structure and function of proteins but also for
analyzing the association of abnormal expansion of repeat regions with
disorders. We have developed an efficient tool for identifying Peptide Periodic
Repeats (PEPPER) in protein sequences. The tool identifies tandem peptide
repeats (TPRs), contiguous single amino acid repeats (SAARs), and periodic
occurrences of single amino acids (POSAAs). For each repeat it reports the
consensus repeat pattern, the complete repeat region, the score, and the
alignment of the consensus with the repeat region, along with the percentage of
mismatches and insertions/deletions. A database of periodic peptide repeats
(DRiPS) has also been developed. We are currently developing an algorithm for
identifying repeated motifs in protein sequences. The tool is available at:
http://ccnsb.iiit.ac.in/nita/PEPPER.
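To make the repeat types above concrete, here is a minimal Python sketch that finds contiguous single amino acid repeats and exact tandem peptide repeats in a sequence. It is an illustration only and not the PEPPER algorithm: it handles exact copies only (no mismatches, insertions or deletions, no scoring), and the function names, the `min_len` and `min_copies` parameters, and the example sequence are all assumptions made for this sketch.

```python
import re


def find_saars(seq, min_len=4):
    """Return (start, residue, length) for each run of a single residue
    that is at least min_len long (hypothetical helper)."""
    # (.)\1{k,} matches one character followed by at least k copies of itself.
    pattern = re.compile(r"(.)\1{%d,}" % (min_len - 1))
    return [(m.start(), m.group(1), len(m.group(0)))
            for m in pattern.finditer(seq)]


def find_exact_tandem_repeats(seq, unit_len, min_copies=2):
    """Return (start, unit, copies) for exact back-to-back copies of a
    peptide unit of length unit_len (hypothetical helper)."""
    hits = []
    i = 0
    n = len(seq)
    while i + unit_len * min_copies <= n:
        unit = seq[i:i + unit_len]
        copies = 1
        # Count how many times the unit is repeated immediately after itself.
        while seq[i + copies * unit_len:i + (copies + 1) * unit_len] == unit:
            copies += 1
        if copies >= min_copies:
            hits.append((i, unit, copies))
            i += copies * unit_len  # skip past the repeat region
        else:
            i += 1
    return hits


if __name__ == "__main__":
    example = "MKQQQQQAPGAPGAPGLT"  # made-up sequence for illustration
    print(find_saars(example))
    print(find_exact_tandem_repeats(example, unit_len=3))
```

A tool that tolerates mismatches and indels would replace the exact string comparison with an alignment step against a consensus, which is beyond this sketch.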
Related Publications:
1. PEPPER: A Tool for Identifying Peptide Periodic Repeats, poster presentation in the International Conference in Bioinformatics,
2. Protein Tandem Repeat DataBase (PTRDB), Krishna Manjari, V. Kiran Kumar, Rima Kumari and Nita Parekh, poster presentation at the International Conference on Bioinformatics & Drug Discovery, Hyderabad Central University, Dec 20-22, 2007. (Awarded 5th best poster prize). Report no: IIIT/TR/2008/69.
3. Database of Repeats in Protein Sequences (DRiPS), Rima Kumari and Nita Parekh, poster presentation in 19th International Conference on Genome Informatics (GIW2008), held at Gold Coast Australia, 1-3 Dec 2008.
Associated people:
K.
II An Integrated Tool for SNP Function Analysis
Introduction
Single nucleotide polymorphisms (SNPs) are commonly used in association studies
to find genes responsible for complex genetic diseases. Complex diseases may
involve many genes and hundreds of alleles, but only a small portion of these
are functional polymorphisms that contribute to disease phenotypes. Assessing
the risk requires access to a variety of heterogeneous biological databases and
analytical tools. We have developed a web server that facilitates the
functional analysis of SNPs by mining data from various resources and providing
a detailed report for the query.
Related Publications:
1. CompreSNPdb: Comprehensive data-mining workflow for SNPs, Genes, Diseases and Pathways, Anshu Bhardwaj, Ajeet Pandey, P. Krishna Manjari, V. Kiran Kumar, Nita Parekh and Shrish Tiwari (in preparation)
III Identifying Genomic
Introduction
In recent years many different genomic islands have been discovered in a
variety of pathogenic as well as non-pathogenic bacteria. Because they promote
genetic variability, genomic islands (GIs) play an important role in microbial
evolution. We have developed a web-based integrated platform for the
identification of genomic islands that implements several measures of bias in
nucleotide composition, viz., GC content (both at the whole-genome level and at
the three codon positions in genes), genomic signature, k-mer distribution
(k = 2-6), codon usage bias and amino acid usage bias. The tool carries out the
analysis in sliding windows (default size 10 Kb) and compares each window with
the genomic average for each measure to identify probable genomic islands. The
output is displayed in tabular format for each window and may be filtered to
show only windows whose values differ from the genomic average by 1.5σ
(standard deviations). The tool also provides an option to extract the flanking
regions of predicted GIs for further analysis. Availability: http://ccnsb.iiit.ac.in/IGIPT/.
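The window-based screening described above can be sketched in a few lines of Python for one measure, GC content. This is a minimal illustration under stated assumptions, not the IGIPT implementation: the function names and the `step` parameter are hypothetical, the standard deviation is assumed to be taken over the per-window values, and a window is flagged when it lies more than `threshold` standard deviations from the whole-genome GC fraction.

```python
import statistics


def gc_fraction(seq):
    """Fraction of G and C bases in a sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def flag_atypical_windows(genome, window=10_000, step=10_000, threshold=1.5):
    """Return (start, gc, z) for windows whose GC fraction deviates from the
    genome-wide GC fraction by more than `threshold` standard deviations.

    Assumption: sigma is the population standard deviation of the
    per-window GC values.
    """
    starts = range(0, len(genome) - window + 1, step)
    values = [gc_fraction(genome[s:s + window]) for s in starts]
    if len(values) < 2:
        return []
    genome_gc = gc_fraction(genome)
    sigma = statistics.pstdev(values)
    if sigma == 0:
        return []  # every window has the same composition
    flagged = []
    for start, value in zip(starts, values):
        z = (value - genome_gc) / sigma
        if abs(z) > threshold:
            flagged.append((start, value, z))
    return flagged
```

The same loop applies to any other per-window measure (codon usage, k-mer frequencies): only the function that scores a window changes, while the comparison against the genomic average stays the same.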
Related Publications:
1. Ruchi Jain, Sandeep Ramineni, Nita Parekh, "Integrated Genomic Island Prediction Tool (IGIPT)," icit, pp.131-132, International Conference on Information Technology, 2008.
2. Genomic
Associated people:
Senthil Kumar (PhD student), Rishi Arvind (MTech 2004), Hemanth Sanna Reddy
(MTech 2005), Ruchi Jain (MS by Research, 2007-08), Sandeep Ramineni (Project
student), Tulasi and Keerthija (MTech 2008), Nita Parekh
IV Analysis of Protein Structures Using Graph Theory Approach
Introduction
We have been exploring the use of graph theory methods to identify clusters in
protein structures, e.g., tandem structural repeats, domains, etc. Graph theory
is a branch of discrete mathematics applied to the study of various real-world
networks and their properties. Protein molecules, viewed as sets of atoms or
groups of atoms (vertices) connected by covalent bonds (edges), have also been
extensively investigated using graph theory. The structure of proteins is
governed to a large extent by non-covalent interactions, and a graph
representation captures this 3-dimensional topology, providing insights into
protein structure. Analysis of the topological details of proteins with known
structures, such as the clustering of specific types of amino acids important
for structure, folding and function, is of great value now that a large number
of protein structures are available. We have been focusing on the
identification of tandem structural repeats and domains by graph spectral
analysis.
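As a rough illustration of what graph spectral analysis of a structure can look like, the Python sketch below builds a residue contact graph from 3-D coordinates and splits it into two clusters using the sign of the Fiedler vector (the eigenvector of the graph Laplacian with the second-smallest eigenvalue). This is a generic spectral bisection under assumptions of our own, not the method of the publication listed below: the distance cutoff, the function names, and the use of shifted power iteration in place of a full eigensolver are all choices made for this sketch.

```python
def contact_graph(coords, cutoff=7.0):
    """Adjacency sets linking points (e.g. C-alpha atoms) that lie within
    `cutoff` of each other. The cutoff value is an assumption."""
    n = len(coords)
    adj = [set() for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d2 = sum((a - b) ** 2 for a, b in zip(coords[i], coords[j]))
            if d2 <= cutoff ** 2:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def fiedler_vector(adj, iterations=500):
    """Approximate the Fiedler vector of the Laplacian L = D - A by power
    iteration on (shift*I - L), restricted to vectors with zero mean."""
    n = len(adj)
    if n < 2:
        return [0.0] * n
    # All Laplacian eigenvalues are <= 2 * max degree, so this shift makes
    # (shift*I - L) positive definite.
    shift = 2 * max(len(nb) for nb in adj) + 1
    # Deterministic zero-mean start vector.
    x = [i - (n - 1) / 2 for i in range(n)]
    for _ in range(iterations):
        # (L x)_i = deg(i) * x_i - sum of x_j over the neighbours j of i
        y = [(shift - len(adj[i])) * x[i] + sum(x[j] for j in adj[i])
             for i in range(n)]
        mean = sum(y) / n
        y = [v - mean for v in y]  # project out the constant eigenvector
        norm = sum(v * v for v in y) ** 0.5
        if norm == 0:
            break
        x = [v / norm for v in y]
    return x


def split_by_fiedler(adj):
    """Partition vertices by the sign of their Fiedler vector component."""
    v = fiedler_vector(adj)
    first = [i for i, val in enumerate(v) if val < 0]
    second = [i for i, val in enumerate(v) if val >= 0]
    return first, second
```

For real structures one would typically use a numerical eigensolver and apply the split recursively; the sketch only shows the basic idea that densely connected groups of residues end up on the same side of the partition.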
Related Publications:
1. Graph Spectral Approach for Identifying Protein Domains, Hari Krishna Yalamanchili and Nita Parekh, 1st International Conference on Bioinformatics and Computational Biology (BICoB) 2009, 8-10 April, 2009,
Associated people:
Ramesh Nerella (MTech 2005), Ruchi Jain (MS by Research, 2006-08), Y. Hari
Krishna (MS by Research, 2008), Nita Parekh
V Development of Comprehensive Gene Database
Introduction:
An important pattern recognition problem in biological sequences is gene
prediction: identifying the regions that code for proteins. Important questions
in gene prediction include what the conserved patterns or motifs in the exonic
and intronic regions of eukaryotic genes are, how splice sites can be
recognized, and which promoters and regulatory sequences are found in the
vicinity of genic regions. A specialized database of genes would greatly
facilitate this analysis. We have developed a Comprehensive Gene Database (CGD)
of mammals by integrating information from various NCBI resources.
Related Publications:
1. Gene Prediction in silico, at National Seminar on Bioinformatics and
Functional Genomics, conducted by Bioinformatics Centre, Pondicherry
University, Feb 15-17, 2005.
2. Computational Issues in Gene Prediction, at 40th National
Convention of Computer Society
3. Tool to find Absolute Location of Genes in Human Genome, presented
at the National Seminar on Systems Approach to Bioinformatics,
conducted by Bioinformatics Centre,
Associated people:
G. Madhukar Reddy (MSIT 2002), Ch.
Jagan Mohan Reddy (MSIT 2002), Sai Deepthi (MSIT 2002),
VI Dynamical Systems Modeling of Biological Systems
Introduction
Networks of coupled dynamical systems have been used to model biological
oscillators, excitable media, neural networks, genetic control networks and
many other self-organizing systems. In general, the connection topology is
assumed to be either completely regular (e.g., a diffusively-coupled system) or
completely random. However, most biological networks lie somewhere between
these two extremes. We would like to explore some simple models of networks
that can be tuned through this middle ground: regular networks rewired to
introduce increasing amounts of disorder. These systems, called small-world
networks, can be highly clustered, like regular lattices, yet have small
characteristic path lengths, like random graphs. From the perspective of
nonlinear dynamics, it would be interesting to understand how a network of
interacting dynamical systems (be they neurons, chemical concentrations, or
species populations) behaves collectively, given their individual dynamics and
coupling architecture. Currently we are analyzing the dynamics of a simple
dynamical system, the logistic map, embedded on various networks (regular,
small-world, scale-free and random) and studying the synchronizability of the
network.
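The setup described above can be sketched in Python as logistic maps placed on the nodes of a network and coupled to their neighbours. This is a minimal illustration under assumptions of our own, not the model used in the work described here: the coupling form x_i(t+1) = (1 - eps) f(x_i) + (eps / k_i) * sum over neighbours j of f(x_j), the rewiring rule (a Watts-Strogatz-style procedure), the use of the standard deviation across nodes as a synchronization measure, and all function and parameter names are chosen for this sketch.

```python
import random


def ring_lattice(n, k):
    """Regular ring in which each node is linked to its k nearest
    neighbours on each side."""
    adj = [set() for _ in range(n)]
    for i in range(n):
        for d in range(1, k + 1):
            j = (i + d) % n
            adj[i].add(j)
            adj[j].add(i)
    return adj


def rewire(adj, p, rng):
    """Move each edge (i, j), i < j, with probability p so that it joins i
    to a randomly chosen node it is not yet linked to."""
    n = len(adj)
    new = [set(nb) for nb in adj]
    for i in range(n):
        for j in sorted(adj[i]):
            if j > i and rng.random() < p:
                candidates = [m for m in range(n)
                              if m != i and m not in new[i]]
                if candidates:
                    m = rng.choice(candidates)
                    new[i].discard(j)
                    new[j].discard(i)
                    new[i].add(m)
                    new[m].add(i)
    return new


def logistic(x, r=4.0):
    return r * x * (1.0 - x)


def step(x, adj, eps, r=4.0):
    """One update of every node; isolated nodes evolve uncoupled."""
    fx = [logistic(v, r) for v in x]
    out = []
    for i, nb in enumerate(adj):
        if nb:
            coupling = sum(fx[j] for j in nb) / len(nb)
            out.append((1 - eps) * fx[i] + eps * coupling)
        else:
            out.append(fx[i])
    return out


def sync_error(x):
    """Standard deviation of the node states; zero when fully synchronized."""
    mean = sum(x) / len(x)
    return (sum((v - mean) ** 2 for v in x) / len(x)) ** 0.5


def simulate(adj, eps, steps, rng, r=4.0):
    """Iterate from random initial states and return the final sync error."""
    x = [rng.random() for _ in adj]
    for _ in range(steps):
        x = step(x, adj, eps, r)
    return sync_error(x)
```

Comparing `simulate(ring_lattice(n, k), eps, steps, rng)` with `simulate(rewire(ring_lattice(n, k), p, rng), eps, steps, rng)` for increasing `p` is one simple way to see how the topology affects synchronization.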
Related Publications:
1. Controllability of Spatially Extended Systems Using the Pinning Approach, Nita Parekh and S. Sinha, Physica A 318, 200-212 (2003).
2. Controlling Dynamics in Spatially Extended Systems, Nita Parekh and S. Sinha, Phys. Rev. E 65, 036227-1 to 9 (2002).
3. Global and Local Control of Spatiotemporal Chaos in Coupled Map Lattices, Nita Parekh, S. Parthasarthy and S. Sinha, Phys. Rev. Lett. 81, 1401 (1998).
Associated people:
Sunaina K. (MSIT 2002), Rishi Arvind
(M.Tech 2004), Snehansu Ghosh (M.Tech 2004), Raina (post-BSc), Nita Parekh.