Sequence-Similar, Structure-Dissimilar Protein Pairs in the Protein Data Bank Caused by Protein Flexibility

Saturday, October 29, 2011
Hall 1-2 (San Jose Convention Center)
Glorimar Castro-Noriega, Computational Mathematics, University of Puerto Rico at Humacao, Humacao, PR
Andreas Prlić, University of California, San Diego, San Diego, CA
Philip Bourne, University of California, San Diego, San Diego, CA
It is often assumed that two proteins with similar sequences will also have similar structures. This assumption has been used to predict protein structure through homology modeling, to trace evolutionary relationships, and in structure-based drug discovery. However, it has been put to the test by the discovery, in the Protein Data Bank (PDB), of protein pairs with high sequence similarity but dissimilar structures. Here we study the possibility that these structural differences between proteins with high sequence similarity are related to flexibility in protein structures. Using BioJava to analyze how much flexibility is contained within the PDB, we developed a measure for the automatic detection of flexibility in protein pairs that share high sequence similarity. Using this method, we expect to find a large number of sequence-similar, structure-dissimilar protein pairs in the PDB that exhibit flexibility. The results can be integrated into the PDB, facilitating the analysis of protein structure flexibility and the tracing of evolutionary relationships. At the same time, the method and the measure proposed here could be made available in the PDB, facilitating individual studies of protein flexibility.
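The abstract does not define the flexibility measure, so the following is only a minimal sketch of how sequence-similar, structure-dissimilar pairs might be flagged, not the authors' method. It is written in plain Python rather than BioJava. As a stand-in for the structural comparison it assumes a distance-matrix RMSD over Cα coordinates, which needs no superposition. The function names and the thresholds `min_identity` and `min_drmsd` are hypothetical, and the inputs are assumed to be already aligned residue by residue.

```python
import math


def sequence_identity(seq_a, seq_b):
    """Fraction of identical residues over the ungapped columns of two
    pre-aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be aligned to the same non-zero length")
    # Skip alignment columns where either sequence has a gap.
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    return sum(a == b for a, b in pairs) / len(pairs)


def distance_rmsd(coords_a, coords_b):
    """Root-mean-square difference between the intra-chain distance
    matrices of two equally long lists of (x, y, z) coordinates."""
    n = len(coords_a)
    if n != len(coords_b) or n < 2:
        raise ValueError("need two coordinate lists of equal length >= 2")
    total = 0.0
    count = 0
    for i in range(n):
        for j in range(i + 1, n):
            d_a = math.dist(coords_a[i], coords_a[j])
            d_b = math.dist(coords_b[i], coords_b[j])
            total += (d_a - d_b) ** 2
            count += 1
    return math.sqrt(total / count)


def is_candidate_pair(seq_a, seq_b, coords_a, coords_b,
                      min_identity=0.9, min_drmsd=3.0):
    """True if the pair is sequence-similar but structure-dissimilar
    under the (hypothetical) thresholds given."""
    return (sequence_identity(seq_a, seq_b) >= min_identity
            and distance_rmsd(coords_a, coords_b) >= min_drmsd)


# Toy example: the same four-residue sequence in a straight and a bent chain.
straight = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
bent = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0), (3.8, 7.6, 0.0)]
flagged = is_candidate_pair("ACDE", "ACDE", straight, bent, min_drmsd=1.0)
```

A pair flagged this way is only a candidate: deciding whether the structural difference stems from flexibility, rather than from some other cause, would require further analysis that this sketch does not attempt.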