SANDEEP KUMAR, Ph.D.

Contact Information

Laboratory of Experimental and Computational Biology, National Cancer Institute (NCI) - Frederick, Building 469, Room 151, Frederick, MD 21702. Phone: 301-846-6542; Fax: 301-846-5598; Email: kumarsan@ncifcrf.gov.

Curriculum vitae

Publication List

Papers in PubMed

Present Position

I am a postdoctoral visiting fellow with Drs. Ruth Nussinov and Jacob V. Maizel at LECB in NCI-Frederick.

Education

Ph.D. Computational Molecular Biophysics

I obtained my Ph.D. degree from the Molecular Biophysics Unit, Indian Institute of Science, Bangalore, India, in 1998. My doctoral research focused on sequence-structure relationships in alpha helices.

Ph.D. thesis title: Geometry and sequence correlation studies on alpha helices in globular proteins.

M.Sc. Biotechnology and Molecular Biology

In 1992, I obtained my Master's degree in Biotechnology and Molecular Biology from G. B. Pant University. My M.Sc. research focused on glutamine synthetase production by Bacillus brevis BbG1 under different physiological conditions.

M.Sc. thesis title: Computerized mathematical approach for kinetics and production of Glutamine synthetase by B. brevis BbG1.

B. Sc. (Hons.) Physics

I completed my undergraduate studies in Physics (B. Sc. (Honours) Physics) at the University of Delhi in 1989.

Research Interests

Computational molecular biophysics; sequence-structure-function relationships in biological macromolecules, especially proteins. Protein folding, binding and stability. Electrostatic interactions in protein stability. Relationship between "macroscopic" thermodynamic properties and "microscopic" protein structural properties. De novo protein design, bioinformatics and computational biology.

Description of present research

My present research explores the physico-chemical rules behind protein folding, binding and stability, as well as the relationships among sequence, structural and thermodynamic data on proteins.

Protein thermostability

Recent years have witnessed an explosion in the available structural information on biological macromolecules, especially proteins. Such information can be usefully exploited to study a number of biochemically relevant problems, such as the molecular adaptation of proteins found in organisms living under extreme environmental conditions. I have compared various structural and sequence properties, namely hydrophobicity, packing, oligomerization, insertions/deletions, hydrogen bonds, salt bridges, amino acid compositions and residue substitutions, for eighteen non-redundant families of thermophilic and mesophilic proteins. The correlation between electrostatic interactions and protein thermostability was the most consistent among the properties compared in this study. Further, I have compared the electrostatic contributions of salt bridges towards the stability of hyperthermophilic Pyrococcus furiosus glutamate dehydrogenase and mesophilic Clostridium symbiosum glutamate dehydrogenase monomers. Salt bridges in Pyrococcus furiosus glutamate dehydrogenase form extensive networks and, as a result, have greater electrostatic stabilities than those in Clostridium symbiosum glutamate dehydrogenase. The increased number of salt bridges and their networks contribute towards the thermostability of Pyrococcus furiosus glutamate dehydrogenase.

Role of electrostatic interactions in protein stability and flexibility

Electrostatic interactions have important roles in protein folding, binding and stability. The electrostatic contribution of a salt bridge towards protein structure can be stabilizing, insignificant or destabilizing, depending upon its geometry, location and interactions with other charges in the protein globule. Recently, I have investigated the role of salt bridges in protein stability by computing their electrostatic contributions in several monomeric proteins. Salt bridges with 'good' geometry are usually stabilizing towards the proteins.

There are two types of protein flexibility, viz., segmental and systemic. Segmental flexibility refers to the rigid body movement of two or more subparts (domains, subunits, etc.) of a protein with respect to one another, e.g. hinge-bending motion. Segmental flexibility usually occurs on a slow time scale. It can be characterized in structural detail by comparing the "open" and "closed" conformations of enzymes. Close range electrostatic interactions such as salt bridges and hydrogen bonds are usually avoided across the moving parts of the protein. In contrast, systemic flexibility refers to fast movements of protein backbone and side chain atoms about their mean positions. Unlike segmental flexibility, systemic flexibility is distributed all over the protein. Systemic flexibility can be analyzed by using multiple low energy conformations around the native state of the protein, such as those obtained from NMR experiments or by performing molecular dynamics simulations. I have analyzed the electrostatic contributions of six intra- and inter-helical salt bridges/ion pairs and an ion pair network in 40 NMR conformers of the c-Myc-Max leucine zipper. The electrostatic contribution of each ion pair and of the ion pair network fluctuates in a conformer-dependent manner. Each ion pair and the ion pair network inter-converts between being stabilizing and being destabilizing at least once in the 40 conformers of the NMR ensemble. The origin of the fluctuations in the electrostatic contributions of the ion pairs can be traced to movements of the charged residues in proteins. This study indicates that the overall contribution of salt bridges/ion pairs in solution may vary in a conformer-population dependent manner. A large scale analysis of 22 ion pairs in 14 NMR conformer ensembles (with at least 40 conformers), average energy minimized structures and multiple crystal structures of 11 non-homologous proteins has confirmed the above observations.
These investigations have resulted in an improved understanding of systemic protein flexibility. I have used the data from this analysis to characterize the relationship between the geometries and electrostatic strengths of the ion pairs. It appears that oppositely charged residue pairs are mostly stabilizing towards proteins if their side chain charged group centroids are within five angstroms of each other. These observations are useful for identifying stabilizing electrostatic interactions in protein structures and in de novo protein design.
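The geometric part of the observation above (charged group centroids within five angstroms) can be sketched in a few lines of Python. This is only a minimal illustration of the distance check, not the electrostatic calculation itself. The atom sets chosen for each residue type, the function names and all coordinates are assumptions made for this example.

```python
import math

# Side-chain charged-group atoms by residue type (PDB atom names).
# Which atoms to include per residue is an assumption made for this sketch.
CHARGED_GROUP_ATOMS = {
    "ASP": ("OD1", "OD2"),
    "GLU": ("OE1", "OE2"),
    "LYS": ("NZ",),
    "ARG": ("NE", "NH1", "NH2"),
}


def charged_group_centroid(res_name, atoms):
    """Return the centroid of a residue's charged-group atoms.

    atoms maps atom name -> (x, y, z) coordinates in angstroms.
    """
    coords = [atoms[name] for name in CHARGED_GROUP_ATOMS[res_name]]
    n = len(coords)
    return tuple(sum(c[i] for c in coords) / n for i in range(3))


def centroids_within_cutoff(res_a, res_b, cutoff=5.0):
    """True if the two charged-group centroids lie within cutoff angstroms."""
    centroid_a = charged_group_centroid(*res_a)
    centroid_b = charged_group_centroid(*res_b)
    return math.dist(centroid_a, centroid_b) <= cutoff


# Hypothetical coordinates, not taken from any real structure.
glu = ("GLU", {"OE1": (0.0, 0.0, 0.0), "OE2": (2.2, 0.0, 0.0)})
lys_near = ("LYS", {"NZ": (1.1, 3.0, 0.0)})
lys_far = ("LYS", {"NZ": (1.1, 9.0, 0.0)})

print(centroids_within_cutoff(glu, lys_near))
print(centroids_within_cutoff(glu, lys_far))
```

A pair that passes this check is only a candidate for a stabilizing interaction. As described above, the actual contribution depends on the conformer and on the other charges in the protein.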

Statistical analysis of protein thermodynamics data collected from experimental literature

There are two kinds of information available on proteins, based on their "macroscopic" and "microscopic" properties. Thermodynamic information such as the enthalpy, entropy, heat capacity and free energy differences between the folded and unfolded states of a given protein describes the macroscopic properties. Such information is usually gathered by spectroscopic (UV/VIS, CD, NMR), microcalorimetry (DSC) and hydrogen-deuterium exchange experiments. These experiments typically involve protein denaturation using physical (temperature) and chemical (urea, guanidine hydrochloride) agents. The data obtained from these experiments can be used to plot protein stability curves described by the Gibbs-Helmholtz equation. A protein stability curve describes the variation in the Gibbs free energy change between the folded and unfolded states of the protein as a function of temperature. The information on the microscopic properties of a protein is obtained from its atomic coordinates. The atomic coordinates can be obtained by solving the protein structure using crystallography or NMR. The challenge is to relate the two. One way to approach this problem is to analyze proteins for which both the thermodynamic data and the structural information are available. Using this approach, I am trying to learn more about protein thermostability. I have compared the protein stability curves for families containing homologous thermophilic and mesophilic proteins. The observed differences are interpreted in terms of the sequence and structural differences in these homologues. Recently, I have written a review article on protein thermostability. This article tries to relate macroscopic thermodynamic differences in proteins with the differences in their sequence and structural properties. The central question asked in this article is: How do thermophilic proteins deal with heat?
The observations on the thermodynamic differences among the homologous thermophilic and mesophilic proteins indicate that the thermophilic proteins achieve greater temperature resistance via increased thermodynamic stability. The protein stability curves of the thermophilic proteins are up-shifted and broader than those of their mesophilic homologues, as shown by this study. Thermophilic proteins have a greater enthalpy change at the melting temperature, a smaller heat capacity change and a greater maximal thermodynamic stability as compared to their mesophilic homologues. Formation of specific interactions, such as electrostatic interactions, may be responsible for this observation. This is consistent with our previous studies. Another interesting aspect of this analysis is that the stabilities of the homologous thermophilic and mesophilic proteins at the respective living temperatures of the source organisms are similar.
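For readers who want to see what a protein stability curve looks like numerically, here is a minimal Python sketch of the Gibbs-Helmholtz equation in its commonly used form, assuming a temperature-independent heat capacity change. The function names and both parameter sets are hypothetical and chosen only for illustration; they are not values from the datasets discussed on this page.

```python
import math


def delta_g(temp, t_m, dh_m, dcp):
    """Unfolding free energy (kJ/mol) at temperature temp (K).

    t_m:  melting temperature (K)
    dh_m: enthalpy change at t_m (kJ/mol)
    dcp:  heat capacity change (kJ/mol/K), assumed temperature independent
    """
    return (dh_m * (1.0 - temp / t_m)
            + dcp * ((temp - t_m) - temp * math.log(temp / t_m)))


def temp_of_max_stability(t_m, dh_m, dcp):
    """Temperature (K) where delta_g peaks, i.e. where the entropy change is zero."""
    return t_m * math.exp(-dh_m / (t_m * dcp))


# Hypothetical parameter sets, for illustration only.
mesophile = dict(t_m=330.0, dh_m=400.0, dcp=8.0)
thermophile = dict(t_m=370.0, dh_m=500.0, dcp=7.0)

for label, params in (("mesophile", mesophile), ("thermophile", thermophile)):
    t_s = temp_of_max_stability(**params)
    dg_max = delta_g(t_s, **params)
    print(f"{label}: T_s = {t_s:.1f} K, maximal dG = {dg_max:.1f} kJ/mol")
```

In this form, a larger enthalpy change at the melting temperature raises the curve and a smaller heat capacity change broadens it, which matches the up-shifted and broader curves described above.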

Most proteins are maximally stable around room temperature!

The hydrophobic effect is the major force driving protein folding. It also explains clathrate formation by small organic solutes in water. The hydrophobic effect is strongest around room temperature. Consistently, small organic solutes and apolar amino acids show minimal solubilities in water around room temperature. A recent analysis of proteins that show a reversible two-state folding/unfolding transition around neutral pH has revealed an interesting observation. Such proteins with a sufficiently large hydrophobic core are maximally stable around room temperature, irrespective of their amino acid sequences, structural folds and the living temperatures of the source organisms.

Critical building blocks, protein folding/misfolding and protein folding pathways

Proteins are made up of building blocks at several hierarchical levels. Different building blocks make different contributions towards protein structure and stability. One or more of these building blocks may be critical for correct protein folding. In the absence of such critical building blocks, the proteins may misfold into non-native conformations. I have developed an algorithm to identify such critical building blocks. So far, I have studied the folded structure of adenylate kinase using this concept. We find that the critical building blocks also contain functionally important residues. This indicates that protein folding and function may be coupled. The occurrence of residues important for both protein folding and function in the same segment may be evolutionarily advantageous. In such proteins, nature may need to guard against mutations in only a single segment to protect both protein folding and function. Using a non-redundant data set of 930 protein chains with non-homologous structures, we have identified such critical building blocks in 225 proteins. Most of these proteins fold in a complex, non-sequential manner. Presently, we are studying structure-function conservation in these proteins.

Implications of energy landscape theory for protein folding and binding

A "new view" of protein folding is emerging from lattice modeling studies. This view is based on statistical mechanical concepts and describes protein folding as a multiple-pathway process. Different protein molecules are thought to take different paths down their energy landscapes to reach the native conformation. Some of these routes may be more frequently traveled than others. I am participating in discussions aimed at exploring the implications of this new view for the folding and binding of large proteins. Using this theory, the "Lock and Key" and "Induced fit" models for protein-protein, protein-ligand and enzyme-substrate binding can be understood in terms of "conformer selection". This theory is also useful in understanding and interpreting protein flexibility. These discussions are summarized in about half a dozen articles on this subject from our group. These articles are listed in publications.

Description of research for Ph.D. degree

The alpha helix is a major secondary structural element in proteins. My Ph.D. research focused on the statistical analysis of sequence and structural characteristics of the alpha helices in globular proteins. It has been known for quite some time that alpha helices in globular proteins often curve, bend and kink. I developed HELANAL, a program to characterize alpha-helix geometries in proteins. This program classifies the geometry of an alpha helix as being linear, curved or kinked. My studies on alpha helices in globular proteins showed a wide diversity in length, geometry and location for these motifs. The origin of these variations can be traced back to the amino acid sequences of the alpha helices. A detailed description of my observations is presented in this publication.

Different positions in alpha helices, e.g. the helix N- and C-termini and the middle, prefer and avoid different amino acids. For example, the Ncap position of protein alpha helices is very exclusive. It prefers six amino acids and avoids eleven others. I have carried out a comprehensive analysis of amino acid preferences and avoidances at 15 different positions in and around alpha helices. Each of these positions shows unique preferences and avoidances for different amino acids. Furthermore, there are several different structural motifs found at the alpha helix termini. The presence of these motifs is correlated with the occurrence of particular amino acids at the helix termini.
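One simple way to quantify a preference or avoidance at a given helix position is a frequency ratio against a background set. The Python sketch below illustrates this common statistic. It is an assumption made for illustration and not necessarily the measure used in my analysis; the function name and the residue lists are hypothetical.

```python
from collections import Counter


def positional_propensity(residues_at_position, background_residues):
    """Frequency of each amino acid at one position divided by its background frequency.

    Values above 1 suggest a preference and values below 1 suggest an avoidance.
    Only amino acids present in the background set are reported.
    """
    position_counts = Counter(residues_at_position)
    background_counts = Counter(background_residues)
    n_position = len(residues_at_position)
    n_background = len(background_residues)
    return {
        aa: (position_counts[aa] / n_position) / (background_counts[aa] / n_background)
        for aa in background_counts
    }


# Hypothetical one-letter residue lists, not real statistics.
ncap = list("SSTDSNTG")
background = list("SSTDNGAALLKKEEVV")

prop = positional_propensity(ncap, background)
for aa in sorted(prop, key=prop.get, reverse=True):
    print(aa, round(prop[aa], 2))
```

With real data, the counts at each of the 15 positions would come from a non-redundant set of helices, and a statistical test would be needed before calling a ratio significant.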

Something you may like to download from here:

My seminar on protein stability and electrostatic interactions

This talk presents an interesting aspect of my work. It highlights the fundamental nature of electrostatic interactions and their role in protein stability as well as in molecular adaptation. This talk is for educational / informational purposes only and should not be construed as the final word on this extremely complex issue. I expect you not to copy this talk (completely or in parts) and pass it on as your own or somebody else's (other than mine) intellectual property. It is a PowerPoint presentation.

A high quality dataset of reversible two-state proteins

Often your results are only as good as the data you collect. This is especially true in protein thermodynamics and in the kinetics of protein folding / binding. The experimental thermodynamic data on proteins reported by different labs across the world are very heterogeneous. This situation is compounded by the observation that the same protein may show two-state or three-state folding-unfolding transitions under different conditions, such as pH, buffer, salt concentration, and the presence / absence of ligands, cofactors, metal ions, substrates or denaturants. Hence, one of the challenges that I faced was to devise a gold standard for what should be called a reversible two-state protein. I arrived at this standard after reading thousands of papers that report calorimetric and spectroscopic experiments aimed at determining protein thermodynamic parameters. The above manuscript defines this gold standard and the dataset that I obtained from the literature for the proteins that show a reversible two-state folding-unfolding transition at or near neutral pH under the stated conditions. I would suggest that experimentalists and theorists use these criteria and the dataset in their work.

Reprint (PDF) of a review on protein thermostability

My poster at 2001 Gordon research conference on Proteins at Holderness School in New Hampshire

Data on salt bridges and their electrostatic strengths in protein crystal structures

This is also the supplementary material for my 1999 JMB paper on salt bridge stability in proteins.

A few useful links

Last updated on November 18, 2002.