Kazuhito Shida (firstname.lastname@example.org)
TUBERO (Tohoku University Biomedical Engineering Research Organization), Sendai 980-8575, Japan
The difficulties of computational discovery of transcription factor binding sites (TFBS) are well represented by (l, d) planted motif challenge problems. Problems with large d are difficult, particularly for profile-based motif discovery algorithms, whose local search in profile space is apparently incompatible with subtle motifs and large mutational distances between motif occurrences.
Herein, an improved profile-based method called GibbsDST is described and tested on the (15,4), (12,3), and (18,6) challenge problems. For the first time for a profile-based method, its performance on motif challenge problems is comparable to that of Random Projection. It is noteworthy that GibbsDST outperforms a pattern-based algorithm, WINNOWER, in some cases. The effectiveness of GibbsDST is also demonstrated on a biological dataset, and its possible extension to more realistic evolution models is discussed.
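For readers unfamiliar with (l, d) planted motif problems, the following minimal Python sketch generates one such instance. It is an illustration only, not part of GibbsDST. It assumes the common formulation in which each sequence receives one motif occurrence carrying exactly d substitutions; the sequence count (20), sequence length (600), uniform background, and the function names `mutate`, `planted_instance`, and `hamming` are assumptions introduced here, not details taken from this abstract.

```python
import random

ALPHABET = "ACGT"


def mutate(motif, d, rng):
    """Return a copy of motif with exactly d positions substituted."""
    chars = list(motif)
    for p in rng.sample(range(len(motif)), d):
        # Pick a replacement that differs from the current base.
        chars[p] = rng.choice([c for c in ALPHABET if c != chars[p]])
    return "".join(chars)


def planted_instance(l, d, n_seqs=20, seq_len=600, seed=0):
    """Build a hypothetical (l, d) instance: random background sequences,
    each with one mutated copy of a hidden motif at a random position.
    n_seqs and seq_len are assumed defaults, not values from the text."""
    rng = random.Random(seed)
    motif = "".join(rng.choice(ALPHABET) for _ in range(l))
    seqs, positions = [], []
    for _ in range(n_seqs):
        background = [rng.choice(ALPHABET) for _ in range(seq_len)]
        start = rng.randrange(seq_len - l + 1)
        background[start:start + l] = mutate(motif, d, rng)
        seqs.append("".join(background))
        positions.append(start)
    return motif, seqs, positions


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


if __name__ == "__main__":
    motif, seqs, positions = planted_instance(15, 4)
    occurrences = [s[p:p + 15] for s, p in zip(seqs, positions)]
    print("hidden motif:", motif)
    print("distance of each occurrence to motif:",
          {hamming(o, motif) for o in occurrences})
    print("largest distance between two occurrences:",
          max(hamming(a, b) for a in occurrences for b in occurrences))
```

Under this formulation, every occurrence lies at distance d from the hidden motif, but two occurrences can differ from each other in up to 2d positions. This gap is one way to see why large d makes the problem hard for methods that compare occurrences with one another.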