National Institute for Basic Biology
One of the most serious problems in predicting mature mRNA sequences from their precursor form is the large number of false-positive matches to the consensus sequence patterns of exon/intron and intron/exon boundaries. Is there additional sequence information that is recognized by spliceosomes but that we have overlooked? To investigate this, we constructed an aberrant splicing database. From that database, several interesting observations were made: (1) Most mutations either destroyed or created consensus patterns. (2) Mutations were observed much more frequently at 5' boundaries than at 3' boundaries. (3) Exon skipping was the most commonly observed outcome. (4) The selection of cryptic sites seems to be determined by the consensus score and perhaps by exon length. (5) Newly created consensus sequences seem to be used only if they are 'appropriately' located. We hope these observations can serve as rules for constructing a more effective system for predicting exon sequences.
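To illustrate the false-positive problem in the simplest terms, the following minimal sketch lists every position in a sequence that matches the bare GT (5' boundary) and AG (3' boundary) dinucleotides found at the ends of most introns. This is an assumption made purely for illustration: the function name `candidate_sites` is hypothetical, and the dinucleotide match is a deliberate simplification, not the consensus scoring used in this work.

```python
import re


def candidate_sites(pre_mrna):
    """Return 0-based start positions of GT and AG dinucleotides.

    Assumption: a bare dinucleotide match stands in for a full
    consensus pattern, so nearly all hits are false positives.
    """
    # Accept RNA or DNA notation by mapping U to T.
    seq = pre_mrna.upper().replace("U", "T")
    donors = [m.start() for m in re.finditer("GT", seq)]      # 5' boundary candidates
    acceptors = [m.start() for m in re.finditer("AG", seq)]   # 3' boundary candidates
    return donors, acceptors


if __name__ == "__main__":
    donors, acceptors = candidate_sites("CAGGTAAGTTTTCAGG")
    print("5' candidates:", donors)
    print("3' candidates:", acceptors)
```

Even this 16-base toy sequence yields two 5' candidates and three 3' candidates, which suggests why pattern matching alone may be insufficient and why additional cues such as consensus score, exon length, and site location are worth examining.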