Using Protein Motif Combinations to Update KEGG Pathway Maps and Orthologue Tables

Frédéric Nikitin[1],[4] (frederic.nikitin@genebio.com)
Bastien Rance[2] (rance@lri.fr)
Masumi Itoh[3] (itoh@kuicr.kyoto-u.ac.jp)
Minoru Kanehisa[3] (kanehisa@kuicr.kyoto-u.ac.jp)
Frédérique Lisacek[1],[4],[5] (frederique.lisacek@genebio.com)

[1]Geneva Bioinformatics, 25 avenue de Champel, 1206 Geneva, Switzerland
[2]LRI, Bat. 490, University Paris XI, 91405 Orsay cedex, France
[3]Bioinformatics Center, Institute for Chemical Research, University of Kyoto, Uji 611-0011, Japan
[4]Swiss Institute of Bioinformatics, 1 rue Michel Servet, 1211 Geneva, Switzerland
[5]Génome & Informatique, Tour Evry 2, 91034 Evry Cedex, France


Abstract

We have studied the projection of protein family data onto a single translated bacterial genome as a way to visualise relationships between families, restricted to bacterial sequences. Any member of any type of family defined in the Pfam database (domains, signatures, etc.) is considered a protein module. Our first goal is to discover rules correlating the occurrence of modules with biochemical properties. To achieve this goal, we have developed a platform that quantifies information found in protein databases and supports the analysis of the nature of modules, their positions and their frequencies of occurrence (in isolation or in combination), in association with pathway knowledge as found in KEGG. This paper focuses on two pathways that are only partially documented: the two-component system and aminophosphonate metabolism. Proteins involved in those pathways were listed separately for each organism to analyse their module composition, and rules constraining pathway interactions were identified. We show how these results can be used to update KEGG pathways and orthologue tables.
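As an illustration of what counting module occurrences "in isolation or in combination" could look like, the following is a minimal Python sketch. It is not the platform described in the abstract: the protein identifiers, the module assignments and the function name `module_frequencies` are all hypothetical, and the counting scheme (number of proteins containing each module or each unordered pair of modules) is an assumption chosen for simplicity.

```python
from collections import Counter
from itertools import combinations

# Hypothetical input: protein identifier -> list of Pfam module names.
# These assignments are illustrative only, not data from the paper.
proteins = {
    "protA": ["HisKA", "HATPase_c"],
    "protB": ["Response_reg", "Trans_reg_C"],
    "protC": ["HAMP", "HisKA", "HATPase_c"],
}

def module_frequencies(proteins):
    """Count how many proteins contain each module and each module pair."""
    single = Counter()
    pairs = Counter()
    for modules in proteins.values():
        # Deduplicate and sort so each pair has one canonical ordering.
        unique = sorted(set(modules))
        single.update(unique)
        pairs.update(combinations(unique, 2))
    return single, pairs

single, pairs = module_frequencies(proteins)
```

Under these assumptions, `single` gives per-module counts and `pairs` gives co-occurrence counts; restricting `proteins` to the members of one pathway in one organism would give the kind of per-pathway module composition the abstract refers to.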



Japanese Society for Bioinformatics