Bioinformatics Issues for Automating the Annotation of Genomic Sequences

Kim Carter [1]
Akira Oka [2]
Gen Tamiya [2]
Matthew I. Bellgard [1]

[1] Centre for Bioinformatics and Biological Computing, Murdoch University, Murdoch WA 6150, Australia
[2] Division of Molecular Life Science, Tokai University School of Medicine, Bohseidai, Isehara, Kanagawa, 259-11, Japan


The rapid growth in the amount of biological data being generated worldwide is outpacing efforts to manage its analysis. As part of an ongoing project to automate and manage bioinformatics analysis, we have designed and implemented a simple automated annotation system, which is described in this paper. The system is applied to existing GenBank/DDBJ/EMBL entries, and its output is compared with the existing annotations to show not only that those annotations contain potential errors but also that they are generally out of date, as a result of new versions of analysis tools and updates to genomic repositories. We highlight the important bioinformatics issues of storing and managing information so that data and results are kept up to date as new information becomes available. Surprisingly, a significant number of new features were found in just four database entries. We describe these results and identify important issues that must be addressed in order to automate the re-analysis and re-annotation of genomic sequences within a reasonable timeframe.


Japanese Society for Bioinformatics