Jie Wu (firstname.lastname@example.org)
Joseph C. Mellor (email@example.com)
Charles DeLisi (firstname.lastname@example.org)
Department of Biomedical Engineering, Boston University, 24 Cummington St., Boston, MA 02215, USA
Graduate program of Bioinformatics, Boston University, 24 Cummington St., Boston, MA 02215, USA
Phylogenetic profiling is an established computational method for detecting functional associations between proteins. The method links two proteins according to the similarity of their phyletic distributions across a set of genomes. While pairwise linkage is useful, it misses correlations in higher-order groups: triplets, quadruplets, and so on. Here we assess the probability of observing co-occurrence patterns of three binary profiles by chance and show that this probability is asymptotically the same as the mutual information among the three profiles. We demonstrate the utility of the probability and mutual information metrics in detecting overrepresented triplets of orthologous proteins that could not be detected using pairwise profiles. These triplets serve as small building blocks, i.e., motifs, in protein networks; they allow us to infer the function of uncharacterized members and facilitate analysis of the local structure and global organization of the protein network. Our method is extensible to N-component clusters and therefore serves as a general tool for higher-order protein function annotation.
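To illustrate why a three-profile measure can reveal associations that pairwise comparison misses, the following minimal sketch computes pairwise mutual information and a three-way quantity for toy binary profiles. It rests on several assumptions: the three-way measure used here is the total correlation H(X) + H(Y) + H(Z) - H(X, Y, Z), which may differ from the exact definition used in the paper; the function names `entropy`, `pairwise_mi`, and `total_correlation` are hypothetical; and the eight-genome profiles are invented for illustration, not taken from any data set.

```python
from collections import Counter
from math import log2


def entropy(*profiles):
    """Shannon entropy (in bits) of the joint distribution of one or more
    equal-length binary profiles, estimated from observed frequencies."""
    n = len(profiles[0])
    counts = Counter(zip(*profiles))
    return -sum((c / n) * log2(c / n) for c in counts.values())


def pairwise_mi(x, y):
    """Mutual information I(X; Y) = H(X) + H(Y) - H(X, Y)."""
    return entropy(x) + entropy(y) - entropy(x, y)


def total_correlation(x, y, z):
    """Assumed three-way measure: H(X) + H(Y) + H(Z) - H(X, Y, Z)."""
    return entropy(x) + entropy(y) + entropy(z) - entropy(x, y, z)


# Toy presence/absence profiles across 8 hypothetical genomes.
x = [0, 0, 1, 1, 0, 0, 1, 1]
y = [0, 1, 0, 1, 0, 1, 0, 1]
# z is present exactly when one, but not both, of x and y is present.
z = [a ^ b for a, b in zip(x, y)]

pair_scores = {
    "x,y": pairwise_mi(x, y),
    "x,z": pairwise_mi(x, z),
    "y,z": pairwise_mi(y, z),
}
triplet_score = total_correlation(x, y, z)

print(pair_scores)
print(triplet_score)
```

In this constructed case every pair of profiles is statistically independent, so a pairwise screen would report no association, while the triplet score is nonzero because any two profiles jointly determine the third. Real profiles are noisier, and a significance estimate such as the chance probability described above would be needed before calling a triplet overrepresented.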