A New Statistical Model to Select Target Sequences Bound by Transcription Factors

Utz J. Pape[1],[2] (utz.pape@molgen.mpg.de)
Steffen Grossmann[1] (grossman@molgen.mpg.de)
Stefanie Hammer[3] (hammer@molgen.mpg.de)
Silke Sperling[3] (sperling@molgen.mpg.de)
Martin Vingron[1] (martin.vingron@molgen.mpg.de)

[1]Computational Biology, Max Planck Institute for Molecular Genetics, Berlin, Germany
[2]Mathematics and Computer Science, Free University of Berlin, Berlin, Germany
[3]Vertebrate Genomics, Max Planck Institute for Molecular Genetics, Berlin, Germany


Transcription factors (TFs) play a key role in gene regulation by binding to target sequences. In silico prediction of whether a TF binds a given sequence is a central task in computational biology. Although many methods have been proposed to tackle this problem, assessing the statistical significance of their predictions remains unsolved. We propose an approach that yields a good approximation of the potential of a sequence to be bound by a TF. Instead of assessing individual binding sites, we argue for focusing on the number of binding sites in a sequence. Based on a suitable statistical model, we approximate the probabilities used to score the binding of a TF to a sequence. Two examples demonstrate the need for such a model and the superiority of the proposed method over standard approaches.


Japanese Society for Bioinformatics