Extraction of Organism Groups from Phylogenetic Profiles Using Independent Component Analysis

Yoshihiro Yamanishi (yoshi@kuicr.kyoto-u.ac.jp)
Masumi Itoh (itoh@kuicr.kyoto-u.ac.jp)
Minoru Kanehisa (kanehisa@kuicr.kyoto-u.ac.jp)

Bioinformatics Center, Institute for Chemical Research, Kyoto University, Uji, Kyoto 611-0011, Japan


In recent years, the analysis of orthologous genes based on phylogenetic profiles has gained popularity in bioinformatics. We propose a new method to extract organism groups and their hierarchy from phylogenetic profiles using independent component analysis (ICA). The method first finds independent axes in the projected space of the multivariate data matrix representing phylogenetic profiles for a number of orthologous genes. The extracted axes are then correlated with major organism groups, according to the extent to which the axis scores for all the genes are affiliated with specific organisms. ICA was applied to the phylogenetic profiles created for 2875 orthologs in 77 organisms using the KEGG/GENES database. Of the 18 predefined components, 9 represented well the organism groups as categorized in KEGG. Furthermore, we performed a cluster analysis and obtained the hierarchy of organism groups.
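
The axis-extraction step described above can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say which ICA algorithm was used, so a FastICA-style fixed-point update with a tanh nonlinearity is assumed here, and the orientation of the data (organisms as variables, orthologs as observations) is likewise an assumption. The profile matrix is synthetic (12 hypothetical organisms in 3 groups, 300 hypothetical orthologs), and the names `fast_ica`, `sym_decorrelate`, and `association` are introduced only for illustration.

```python
import numpy as np

def sym_decorrelate(W):
    # Nearest orthogonal matrix to W, i.e. (W W^T)^(-1/2) W, computed via SVD.
    U, _, Vt = np.linalg.svd(W)
    return U @ Vt

def fast_ica(X, n_components, n_iter=200, tol=1e-6, seed=0):
    """FastICA-style sketch. X has shape (n_variables, n_observations).

    Returns the component scores (n_components, n_observations) and the
    unmixing matrix (n_components, n_variables).
    """
    rng = np.random.default_rng(seed)
    Xc = X - X.mean(axis=1, keepdims=True)            # centre each variable
    cov = Xc @ Xc.T / Xc.shape[1]
    eigvals, eigvecs = np.linalg.eigh(cov)
    idx = np.argsort(eigvals)[::-1][:n_components]    # leading eigenpairs
    K = (eigvecs[:, idx] / np.sqrt(eigvals[idx])).T   # whitening matrix
    Z = K @ Xc                                        # whitened data, cov = I
    W = sym_decorrelate(rng.standard_normal((n_components, n_components)))
    for _ in range(n_iter):
        G = np.tanh(W @ Z)
        # Fixed-point update: E[g(s) z^T] - E[g'(s)] w, for every row at once.
        W_new = G @ Z.T / Z.shape[1] - np.diag((1.0 - G**2).mean(axis=1)) @ W
        W_new = sym_decorrelate(W_new)
        # Converged when every row is unchanged up to sign.
        delta = np.max(np.abs(np.abs(np.einsum("ij,ij->i", W_new, W)) - 1.0))
        W = W_new
        if delta < tol:
            break
    return W @ Z, W @ K

# Synthetic binary phylogenetic profiles (assumed toy data, not KEGG data):
# each ortholog is present or absent in whole organism groups, plus 10% noise.
rng = np.random.default_rng(0)
n_groups, per_group, n_genes = 3, 4, 300
group_present = rng.random((n_groups, n_genes)) < 0.5
profiles = np.repeat(group_present, per_group, axis=0)   # (12, 300)
noise = rng.random(profiles.shape) < 0.1
X = np.logical_xor(profiles, noise).astype(float)

S, unmixing = fast_ica(X, n_components=n_groups)

# Covariance between each component's gene scores and each organism's profile;
# organisms with large absolute values are the ones an axis is affiliated with.
Xc = X - X.mean(axis=1, keepdims=True)
association = S @ Xc.T / n_genes                          # (3, 12)
for k, row in enumerate(association):
    top = np.argsort(-np.abs(row))[:per_group]
    print(f"component {k}: most associated organisms {sorted(top.tolist())}")
```

In this sketch, organisms 0 to 3, 4 to 7, and 8 to 11 form the three synthetic groups, so an axis that captures a group should list mostly indices from a single block. A hierarchy of organism groups could then be obtained by clustering the rows or columns of `association`, though the clustering method the authors used is not stated in the abstract.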

Japanese Society for Bioinformatics