An Extension and Novel Solution to the (l,d)-Motif Challenge Problem

Mark P Styczynski[1] (marksty@mit.edu)
Kyle L Jensen[1] (kljensen@mit.edu)
Isidore Rigoutsos[2] (rigoutso@us.ibm.com)
Gregory N Stephanopoulos[1] (gregstep@mit.edu)

[1]Department of Chemical Engineering, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139, USA
[2]Bioinformatics and Pattern Discovery Group, IBM Thomas J. Watson Research Center, PO Box 218, Yorktown Heights, NY 10598, USA


Abstract

The (l,d)-motif challenge problem, introduced by Pevzner and Sze [12], is a mathematical abstraction of the DNA functional site discovery task. Here we extend the (l,d)-motif problem to model this task more accurately and present a novel algorithm that solves the extended problem. The algorithm is guaranteed to find all (l,d)-motifs, with unbounded support and length, in a set of input sequences. We demonstrate its performance on publicly available datasets and show that it deterministically enumerates the optimal motifs.
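
For orientation, the basic (l,d)-motif problem in its common formulation asks for every string of length l that occurs in each input sequence with at most d mismatches. The Python sketch below enumerates such strings by brute force. It is an illustration only: it is not the algorithm presented in this paper, it does not cover the extension to unbounded support and length, and the function names and the DNA alphabet default are assumptions made for the example.

```python
from itertools import product


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def occurs_within(motif, seq, d):
    """True if some window of seq is within Hamming distance d of motif."""
    l = len(motif)
    return any(hamming(motif, seq[i:i + l]) <= d
               for i in range(len(seq) - l + 1))


def brute_force_ld_motifs(seqs, l, d, alphabet="ACGT"):
    """Enumerate all length-l strings found in every sequence with <= d mismatches.

    Hypothetical helper: exhaustive over |alphabet|**l candidates, so it is
    only practical for very small l.
    """
    motifs = []
    for chars in product(alphabet, repeat=l):
        candidate = "".join(chars)
        if all(occurs_within(candidate, s, d) for s in seqs):
            motifs.append(candidate)
    return motifs
```

Calling `brute_force_ld_motifs(["ACGTT", "CCGTA", "GCGTG"], 3, 0)` returns the exact 3-mers shared by all three sequences; raising d to 1 admits their one-mismatch neighbours as well. The exhaustive search grows exponentially in l, which is why practical methods avoid enumerating every candidate.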



Japanese Society for Bioinformatics