Åke Västermark (firstname.lastname@example.org)
Yasumasa Shigemoto (email@example.com)
Takashi Abe (firstname.lastname@example.org)
Hideaki Sugawara (email@example.com)
Center for Information Biology and DNA Data Bank of Japan, National Institute of Genetics and the Graduate University for Advanced Studies, Mishima, Shizuoka 411-8540, Japan
Department of Statistics, University of Oxford, 1 South Parks Road, Oxford OX1 3TG, UK
Life Science Systems Division, Fujitsu Limited, Shinkamata, Tokyo-to 144-8588, Japan
In one scenario of gene evolution, exon shuffling plays a fundamental role in increasing gene diversity. This paper appraises the biological relevance of categorising proteins by their splicing profiles (exon-intron structures). The central question is whether protein function correlates more strongly with splicing profile than with sequence similarity. To approach this question, a splicing profile similarity (SPS) index, which measures relative exon length discrepancy, was devised. Arbitrarily chosen human proteins were compared, in terms of SPS and amino acid sequence similarity, to 1) their mouse orthologues and 2) their human paralogues, which represent functional equivalence and non-equivalence, respectively, in order to examine the global relationship between a) biological function, b) splicing profile similarity, and c) sequence similarity. Protein function correlates more strongly with splicing profile similarity than with sequence similarity: human-mouse orthologues (HMOs) display significantly higher splicing profile similarity than human-human paralogues (HHPs), even though the two categories show comparable sequence similarity. This finding indicates that splicing profile-based protein categorisation is biologically meaningful.
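The abstract describes the SPS index only as a measure of relative exon length discrepancy and does not give its formula. As an illustration of the idea, the sketch below shows one way such an index could be computed. Everything in it is an assumption rather than the authors' definition: the function names `relative_lengths` and `sps_index` are hypothetical, exons are compared position by position, missing exons are treated as having zero length, and the score is scaled so that 1.0 means identical relative exon lengths and 0.0 means no overlap at all.

```python
from itertools import zip_longest


def relative_lengths(exon_lengths):
    """Convert absolute exon lengths into fractions of the total length."""
    total = sum(exon_lengths)
    if total <= 0:
        raise ValueError("exon lengths must sum to a positive value")
    return [length / total for length in exon_lengths]


def sps_index(exons_a, exons_b):
    """Hypothetical splicing profile similarity score in the range [0, 1].

    Assumption: exons are paired by position, and a profile with fewer
    exons is padded with zero-length exons. This is an illustrative
    choice, not the index defined in the paper.
    """
    rel_a = relative_lengths(exons_a)
    rel_b = relative_lengths(exons_b)
    # Sum of absolute differences between paired relative exon lengths.
    discrepancy = sum(
        abs(a - b) for a, b in zip_longest(rel_a, rel_b, fillvalue=0.0)
    )
    # Both profiles sum to 1, so the discrepancy lies between 0 and 2.
    return 1.0 - discrepancy / 2.0
```

Because the comparison uses relative rather than absolute lengths, two genes whose exons differ in size by a constant factor receive the same score as two identical genes, which is the property the phrase "relative exon length discrepancy" suggests.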