Systematization of Species-Specific Diversity of Genes in Codon Usage:
Comparison of the Diversity Among Bacteria and Prediction of the Protein
Production Levels in Cells
Shigehiko Kanaya [1] (kanaya@eie.yz.yamagata-u.ac.jp)
Yoshihiro Kudo [1] (ykudo@eie.yz.yamagata-u.ac.jp)
Shinya Suzuki [1] (a93619@eie.yz.yamagata-u.ac.jp)
Toshimichi Ikemura [2] (tikemura@ddbj.nig.ac.jp)
[1] Department of Electric and Information Engineering, Faculty of Engineering,
Yamagata University,
Yonezawa, Yamagata-ken 992, Japan
[2] Department of Evolutionary Genetics, National Institute of Genetics,
and the Graduate University for Advanced Studies,
Mishima, Shizuoka-ken 411, Japan
Abstract
In the present study, we developed a procedure for
estimating the species-specific heterogeneity of codon usage among the genes
of a single species, which we call diversity in codon usage, and
for systematizing species by this species-specific diversity on the basis of principal component analysis.
We attempted to quantify differences in this diversity
among five species: Escherichia coli (Ec), Salmonella typhimurium (St), Haemophilus
influenzae (Hi), Bacillus subtilis (Bs), and Synechocystis sp. (Ss).
In all five species, many of the genes involved in the translation process
and energy metabolism had positive values (Z1 > 0) on the first principal component (PC1).
In Ss, many of the genes involved in the photosynthetic system
also had positive Z1 values.
These genes are thought to be highly expressed.
On the basis of the direction of PC1, the five species were roughly
classified into three categories: [Ec, St, Hi], [Ss], and [Bs]. The dendrogram constructed was roughly consistent with the rRNA-based phylogeny, but interesting differences were also
observed between the two phylogenetic trees.
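
As a rough illustration of the kind of analysis summarized above, the following minimal sketch computes a Z1 score (the coordinate on PC1) for each gene from its codon usage. It is not the authors' procedure. Several things are assumptions made only for illustration: plain relative frequencies over a short, hand-picked codon list are used as the input variables, PC1 is obtained by power iteration rather than a full eigendecomposition, the sign of PC1 is fixed by a hypothetical set of reference genes presumed to be highly expressed, and the helper names (codon_frequencies, first_principal_component, orient) and the toy sequences are invented.

```python
from collections import Counter
from math import sqrt


def codon_frequencies(seq, codons):
    """Relative frequency of each listed codon within one coding sequence."""
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    counts = Counter(seq[i:i + 3] for i in range(0, usable, 3))
    total = sum(counts[c] for c in codons)
    return [counts[c] / total if total else 0.0 for c in codons]


def first_principal_component(rows, iterations=200):
    """Return (axis, scores) for PC1 via power iteration on the covariance matrix."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    # Start from a non-constant vector: when every gene's frequencies sum to 1,
    # the all-ones vector lies in the null space of the covariance matrix.
    axis = [float(j + 1) for j in range(d)]
    for _ in range(iterations):
        nxt = [sum(cov[a][b] * axis[b] for b in range(d)) for a in range(d)]
        norm = sqrt(sum(v * v for v in nxt))
        if norm == 0.0:
            raise ValueError("no variance along the starting direction")
        axis = [v / norm for v in nxt]
    scores = [sum(c[j] * axis[j] for j in range(d)) for c in centered]
    return axis, scores


def orient(axis, scores, reference):
    """Flip the sign of PC1 so the reference genes have a positive summed score."""
    if sum(scores[i] for i in reference) < 0:
        return [-v for v in axis], [-s for s in scores]
    return axis, scores


# Toy data (invented): genes 0 and 1 prefer AAA/GAA, genes 2 and 3 prefer AAG/GAG.
codons = ["AAA", "AAG", "GAA", "GAG"]
genes = [
    "AAA" * 8 + "AAG" * 2 + "GAA" * 8 + "GAG" * 2,
    "AAA" * 7 + "AAG" * 3 + "GAA" * 7 + "GAG" * 3,
    "AAA" * 3 + "AAG" * 7 + "GAA" * 3 + "GAG" * 7,
    "AAA" * 2 + "AAG" * 8 + "GAA" * 2 + "GAG" * 8,
]
rows = [codon_frequencies(g, codons) for g in genes]
# Gene 0 is treated as the (hypothetical) highly expressed reference.
axis, z1 = orient(*first_principal_component(rows), reference=[0])
print([round(z, 3) for z in z1])
```

In this toy example the two genes that share the codon preference of the reference gene receive positive Z1 values and the other two receive negative values. Because the sign of a principal component is arbitrary, some orientation convention such as the reference set used here is needed before a statement like Z1 > 0 can be interpreted.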