Nicholas R. Steffen[1] (nsteffen@uci.edu)
Scott D. Murphy[1] (sdmurphy@uci.edu)
Richard H. Lathrop[1] (rickl@uci.edu)
Michael L. Opel[2] (mopel@uci.edu)
Lorenzo Tolleri[2] (Lorenzo_Tolleri@chiron.it)
G. Wesley Hatfield[2] (gwhatfie@uci.edu)
[1]Department of Information and Computer Science,
University of California, Irvine, CA 92697, USA
[2]Department of Microbiology and Molecular Genetics, College of Medicine,
University of California, Irvine, CA 92697, USA
We examine the use of deformation propensity at individual base steps for identifying DNA-protein binding sites. We have previously demonstrated that estimates of the total energy required to bend DNA into its bound conformation can partially explain indirect DNA-protein interactions. We now show that the deformation propensities at individual base steps are not equally informative for classifying a sequence as a binding site, and that weighting each base step's contribution to the aggregate deformation propensity non-uniformly can greatly improve classification accuracy. Finally, we show that a perceptron can be trained on the deformation propensity at each step in a sequence to generate such weights.
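As a rough illustration of the idea of learning per-step weights, the sketch below trains a classic perceptron on vectors of per-step deformation propensities. It is a minimal sketch under stated assumptions, not the procedure used in this work: the function names (train_perceptron, classify), the learning rate, the epoch limit, and the propensity values are all hypothetical, and the toy data are invented so that only the second base step is informative.

```python
def train_perceptron(samples, labels, epochs=100, lr=0.1):
    """Learn one weight per base step with the classic perceptron rule.

    samples: equal-length lists of per-step deformation propensities
             (hypothetical values, one entry per base step)
    labels:  +1 for a binding site, -1 for a non-site
    """
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, y in zip(samples, labels):
            score = sum(w * xi for w, xi in zip(weights, x)) + bias
            predicted = 1 if score > 0 else -1
            if predicted != y:
                # Move the weights toward the misclassified example.
                weights = [w + lr * y * xi for w, xi in zip(weights, x)]
                bias += lr * y
                mistakes += 1
        if mistakes == 0:
            break  # every training example is classified correctly
    return weights, bias


def classify(x, weights, bias):
    """Return +1 (site) or -1 (non-site) from the weighted per-step sum."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score > 0 else -1


# Invented toy data: four base steps per sequence; only step index 1
# differs systematically between sites and non-sites.
samples = [
    [0.5, 0.9, 0.4, 0.6],
    [0.3, 0.8, 0.7, 0.2],
    [0.6, 0.9, 0.1, 0.5],
    [0.5, 0.1, 0.4, 0.6],
    [0.3, 0.2, 0.7, 0.2],
    [0.6, 0.1, 0.1, 0.5],
]
labels = [1, 1, 1, -1, -1, -1]

weights, bias = train_perceptron(samples, labels)
print("per-step weights:", weights)
print("predictions:", [classify(x, weights, bias) for x in samples])
```

On this toy data the learned weight of largest magnitude falls on the informative step, which is the kind of non-uniform weighting described above. Real propensity values and sequences would come from the deformation model and binding-site data, which are not reproduced here.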