Yaoyu E Wang (email@example.com)
Charles DeLisi (firstname.lastname@example.org)
Graduate Program in Bioinformatics, Boston University, Boston, MA 02215, USA
Department of Biomedical Engineering, Boston University, Boston, MA 02215, USA
We introduce a new and potentially valuable method for delineating the repertoire of protein complexes in highly mutable organisms and, in conjunction with other methods, for specifying the structural details of those complexes. In the first instance the method provides a guide to selecting proteins for co-crystallization; in the second it augments the collection of structures determined by crystallography and other methods, including the discovery of possible alternative binding sites in known complexes. The key to the method is the availability of multiple sequence variants of an organism, arrived at either naturally or by directed mutagenesis in appropriate laboratory facilities. Amino acids that are important for the structural stability of a protein or complex tend to be conserved, generally mutating only when compensatory changes occur. Consequently, a significant correlation in the variation of two conserved amino acids in the same protein suggests that they interact with one another, either directly or indirectly. Similarly, correlated mutations between conserved amino acids in different proteins suggest that those residues may lie at a site of physical interaction. We have identified all highly conserved segments of 9 to 11 amino acids in HIV proteins and then identified pairs of segments from different proteins that show highly significant co-variation. Using the HIV reverse transcriptase and integrase proteins as an example, we demonstrate how the interface and combining sites can be inferred by co-variation analysis and rigid-body docking. The potential significance for antiviral drug and vaccine design is briefly discussed.
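To make the idea of co-variation between two conserved positions concrete, the following minimal sketch scores a pair of alignment columns by their mutual information. The abstract does not state which co-variation statistic is used, so mutual information is an assumption chosen here for illustration. The function name `mutual_information` and the toy columns are hypothetical and are not real HIV data.

```python
from collections import Counter
from math import log2


def mutual_information(col_a, col_b):
    """Mutual information (in bits) between two alignment columns.

    col_a and col_b hold the residue observed at each of two positions,
    one entry per sequence variant, in the same variant order.
    Assumption: mutual information stands in for the co-variation
    statistic, which the source text does not specify.
    """
    if len(col_a) != len(col_b):
        raise ValueError("columns must cover the same set of variants")
    n = len(col_a)
    if n == 0:
        raise ValueError("columns must not be empty")

    count_a = Counter(col_a)                # residue counts at position A
    count_b = Counter(col_b)                # residue counts at position B
    count_ab = Counter(zip(col_a, col_b))   # joint residue-pair counts

    mi = 0.0
    for (a, b), c in count_ab.items():
        p_ab = c / n
        # p_ab / (p_a * p_b), written with counts to limit rounding error
        ratio = (c * n) / (count_a[a] * count_b[b])
        mi += p_ab * log2(ratio)
    return mi


# Toy columns (hypothetical): position B changes whenever position A does.
coupled = mutual_information("AAVV", "LLMM")
# Toy columns (hypothetical): position B varies independently of position A.
independent = mutual_information("AAVV", "LMLM")
```

A high score for a pair of positions in different proteins would mark them as candidates for an interaction site. In practice the raw score would still need a significance assessment, for example against shuffled columns, before a pair is treated as meaningfully correlated.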