Refining Markov Clustering for Protein Complex Prediction by Incorporating Core-Attachment Structure
Sriganesh Srihari (firstname.lastname@example.org)
Kang Ning (email@example.com)
HonWai Leong (firstname.lastname@example.org)
 School of Computing, National University of Singapore, Singapore 117590
 Department of Pathology, University of Michigan, Ann Arbor, MI
Protein complexes are responsible for most of the vital biological processes within the cell. Understanding the
machinery behind these biological processes requires detection and analysis of complexes and their
constituent proteins. Many computational approaches to complex detection rely on
clustering of protein-protein interaction (PPI) networks. Among these clustering approaches, the Markov Clustering (MCL)
algorithm has proved to be reasonably successful, mainly due to its scalability and robustness. However, MCL
produces many noisy clusters, which either do not represent any known complexes or contain additional proteins (noise)
that reduce the accuracy of otherwise correctly predicted complexes. Consequently, these clusters match
known complexes with quite low accuracy. Refining these clusters to improve the accuracy requires a deeper understanding of
the organization of complexes. Recently, experiments on yeast by Gavin et al. (2006) revealed that proteins within a complex
are organized in two parts: core and attachment.
Based on these insights, we propose MCL-CA, a method that applies core-attachment based refinement steps
to the clusters produced by MCL.
We evaluated the effectiveness of our approach on two different
datasets and compared the quality of our predicted complexes with those produced by MCL. The results show that
our approach significantly improves the accuracies of predicted complexes when matched with known complexes.
A direct result of this is that MCL-CA is able to cover a larger number of known complexes than MCL.
Further, we compare our method with two recently proposed methods, CORE and COACH, which also capitalize
on the core-attachment structure. We also discuss several instances to show that our predicted complexes
clearly adhere to the core-attachment structure as revealed by Gavin et al.
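
For readers unfamiliar with MCL, the clustering step whose output MCL-CA refines can be sketched roughly as follows. This is a minimal illustration of the standard MCL iteration (alternating expansion and inflation on a column-stochastic matrix). It is not the authors' implementation, and it does not include the core-attachment refinement. The function name `mcl`, its parameters (`inflation`, `iterations`, `tol`, `threshold`) and the toy network are assumptions made for this example.

```python
def mcl(adjacency, inflation=2.0, iterations=50, tol=1e-6, threshold=1e-4):
    """Cluster an undirected graph given as a square 0/1 adjacency matrix.

    Hypothetical helper for illustration only; parameter defaults are assumed.
    """
    n = len(adjacency)

    def normalize(matrix):
        # Rescale every column so that it sums to 1 (column-stochastic).
        sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
        return [[matrix[i][j] / sums[j] for j in range(n)] for i in range(n)]

    # Add self-loops, then turn the adjacency matrix into transition probabilities.
    m = normalize([[float(adjacency[i][j]) + (1.0 if i == j else 0.0)
                    for j in range(n)] for i in range(n)])

    for _ in range(iterations):
        # Expansion: square the matrix (two steps of the random walk).
        expanded = [[sum(m[i][k] * m[k][j] for k in range(n))
                     for j in range(n)] for i in range(n)]
        # Inflation: raise entries to a power and renormalize, which
        # strengthens strong flows and weakens weak ones.
        inflated = normalize([[x ** inflation for x in row] for row in expanded])
        change = max(abs(inflated[i][j] - m[i][j])
                     for i in range(n) for j in range(n))
        m = inflated
        if change < tol:
            break

    # Each row with non-negligible entries lists the members of one cluster.
    clusters = set()
    for i in range(n):
        members = frozenset(j for j in range(n) if m[i][j] > threshold)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)


# Toy PPI network (assumed): two disconnected triangles of proteins.
toy = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
print(mcl(toy))  # [[0, 1, 2], [3, 4, 5]]
```

On real PPI networks the resulting clusters may contain the noisy proteins described above; a refinement such as MCL-CA would operate on this output.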
Japanese Society for Bioinformatics