Heather E. Burden (firstname.lastname@example.org)
Zhiping Weng (email@example.com)
Bioinformatics Program, Boston University, Boston, MA 02215, USA
Department of Biomedical Engineering, Boston University, Boston, MA 02215, USA
Many positions within transcription factor binding sites are not conserved in sequence and appear to be degenerate. We hypothesize that some of these positions contain essential structural codes that are recognized by the transcription factors that bind to them. These structural codes can be defined by base-pair step parameters, which describe the relative displacement and orientation of two adjacent base pairs in a nucleic acid structure. We have developed a method, Identification of Conserved Structural Features (ICSF), which uses base-pair step parameters obtained from a collection of high-resolution DNA crystal structures to discover structural conservation in the sequentially degenerate regions of a binding site and to produce profiles of the structural features along the entire site. We focused our study on the transcription factor binding sites in the JASPAR database and found that one-third (P-value ≤ 0.05) of the binding sites contain sequentially degenerate locations with highly conserved structural features, as described by the base-pair step parameters. These results will help us better understand how transcription factors recognize their binding sites and may improve our ability to find these sites in genomic sequences.
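The idea of a structural profile along a binding site can be illustrated with a minimal Python sketch. It is not the ICSF implementation and does not reproduce its statistical test: it only maps each dinucleotide step of a set of aligned sites to a single step parameter and reports the mean and standard deviation at each step. The names `TOY_ROLL`, `step_value`, and `step_profile`, the example sites, and all numeric values are hypothetical placeholders, not parameters derived from crystal structures. The sketch also assumes a parameter whose value is the same for a step and its reverse complement.

```python
from statistics import mean, pstdev

# Placeholder numbers for illustration only, NOT measured step parameters.
# One entry per unique dinucleotide step; the other six steps are covered
# by reverse-complement lookup in step_value().
TOY_ROLL = {
    "AA": 1.0, "AC": 2.0, "AG": 3.0, "AT": 0.0, "CA": 5.0,
    "CC": 4.0, "CG": 6.0, "GA": 2.0, "GC": 0.0, "TA": 7.0,
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def step_value(step, table):
    """Look up a dinucleotide step, falling back to its reverse complement."""
    if step in table:
        return table[step]
    return table[step.translate(_COMPLEMENT)[::-1]]


def step_profile(sites, table):
    """Return (mean, standard deviation) of the parameter at each step.

    `sites` is a list of aligned binding-site sequences of equal length.
    """
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("sites must be aligned to equal length")
    profile = []
    for i in range(length - 1):
        values = [step_value(s[i:i + 2], table) for s in sites]
        profile.append((mean(values), pstdev(values)))
    return profile


# Hypothetical aligned sites; the second position varies in sequence.
sites = ["ACGT", "AAGT", "ATGT"]
profile = step_profile(sites, TOY_ROLL)
for i, (avg, sd) in enumerate(profile):
    print(f"step {i}: mean={avg:.2f} sd={sd:.2f}")
```

In a sketch like this, a step that varies in sequence but shows a small standard deviation would be a candidate for structural conservation; deciding whether the deviation is significantly small requires a proper statistical test, which is omitted here.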
Availability: ICSF is freely available to academic users at http://zlab.bu.edu/ICSF
Supplementary information: http://zlab.bu.edu/ICSF