Naoki Sato (firstname.lastname@example.org)
Masayuki Ishikawa (Ishimasa@bio.c.u-tokyo.ac.jp)
Makoto Fujiwara (email@example.com)
Kintake Sonoike (firstname.lastname@example.org)
Department of Life Sciences, Graduate School of Arts and Sciences, University of Tokyo, 3-8-1 Komaba, Meguro-ku, Tokyo 153-8902, Japan
Department of Integrated Biosciences, Graduate School of Frontier Sciences, University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa-shi, Chiba, 277-8562, Japan
Chloroplasts originate from an ancient cyanobacterium-like endosymbiont. In plants and algae, several tens of chloroplast proteins are encoded by the chloroplast genome, while at least hundreds are encoded by the nuclear genome, but the exact number and identity of the nuclear-encoded chloroplast proteins are still unknown. We describe here attempts to identify a large number of as yet unidentified chloroplast proteins of endosymbiont origin (CPRENDOs). Our strategy consists of whole-genome protein clustering by the homolog group method, which is optimized for the number of organisms, followed by phylogenetic profiling that extracts groups conserved in cyanobacteria and photosynthetic eukaryotes. An initial minimal set of CPRENDOs was predicted without relying on targeting prediction and was validated experimentally.
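The phylogenetic profiling step can be illustrated with a minimal sketch. It assumes that each homolog group is already available as a set of the organisms in which it has a member, and it keeps groups found in every cyanobacterium and every photosynthetic eukaryote of the panel. The organism panel, the function name `select_candidate_groups`, and the additional requirement that a group be absent from non-photosynthetic organisms are illustrative assumptions, not details taken from this work.

```python
from typing import Dict, List, Set

# Hypothetical organism panel, chosen only for illustration.
CYANOBACTERIA: Set[str] = {"Synechocystis", "Anabaena"}
PHOTO_EUKARYOTES: Set[str] = {"Arabidopsis", "Cyanidioschyzon"}
NON_PHOTOSYNTHETIC: Set[str] = {"Escherichia", "Saccharomyces"}


def select_candidate_groups(groups: Dict[str, Set[str]]) -> List[str]:
    """Return the sorted IDs of homolog groups whose phylogenetic profile
    covers all cyanobacteria and all photosynthetic eukaryotes in the panel.

    Assumption: groups with a member in any non-photosynthetic organism
    are excluded; the source does not state this criterion.
    """
    selected = []
    for group_id, organisms in groups.items():
        in_all_cyanobacteria = CYANOBACTERIA <= organisms
        in_all_photo_eukaryotes = PHOTO_EUKARYOTES <= organisms
        in_non_photosynthetic = bool(organisms & NON_PHOTOSYNTHETIC)
        if in_all_cyanobacteria and in_all_photo_eukaryotes and not in_non_photosynthetic:
            selected.append(group_id)
    return sorted(selected)
```

Because selection here depends only on presence and absence across genomes, no prediction of targeting sequences enters the filter, which mirrors the idea of predicting candidates without targeting prediction. The thresholds are deliberately strict in this sketch; a real analysis would probably tolerate a group missing from a few genomes.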