MetaClustering: Discovery of The Different Sample Clusterings in Gene Expression Data

David Venet[1]
Hugues Bersini[2]
Hitoshi Iba[1]

[1]IBA LAB, Post Box: 704, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwanoha 5-1-5, Kashiwa-shi, Chiba 277-8561, Japan
[2]IRIDIA, Universite Libre de Bruxelles, Avenue Franklin Roosevelt 50, B-1050 Brussels, Belgium


Clustering of the samples is a standard procedure in the analysis of gene expression data, for instance to discover cancer subtypes. However, more than one biologically meaningful clustering can exist, depending on the genes chosen. We propose grouping the genes according to the clustering of the samples that they fit. This makes it possible to determine directly the different clusterings of the samples present in the data. Because a clustering is a structure, genes belonging to the same group are functions of the same structure. Hence, the determination of groups of genes that support the same clustering can also be viewed as the detection of non-linearly linked genes. MetaClustering was applied successfully to simulated data. It also recovered the known clustering of real cancer data, which was impossible using the complete set of genes. Finally, it clustered together cell-cycle genes, showing its ability to group genes related in a non-linear way.
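
The idea of grouping genes by the sample clustering they support can be illustrated with a small sketch. This is not the MetaClustering algorithm itself, whose details are not given in the abstract. It is a toy version under stated assumptions: each gene induces a two-way split of the samples at the largest gap in its expression values, two splits are compared with the Rand index, and genes are grouped greedily against a threshold. The function names, the threshold, and the example data are all hypothetical.

```python
from itertools import combinations


def two_way_split(values):
    """Assumed per-gene clustering: cut the sorted values at their largest gap."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    cut = max(range(1, len(order)),
              key=lambda k: values[order[k]] - values[order[k - 1]])
    labels = [0] * len(values)
    for i in order[cut:]:
        labels[i] = 1
    return labels


def rand_index(a, b):
    """Fraction of sample pairs that both partitions treat alike
    (together in both, or apart in both). Invariant to label names."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


def group_genes(expr, threshold=0.9):
    """Greedy grouping (an assumption): a gene joins the first group whose
    representative sample partition it matches, else it starts a new group."""
    groups = []  # list of (representative_labels, [gene names])
    for gene, values in expr.items():
        labels = two_way_split(values)
        for rep, members in groups:
            if rand_index(rep, labels) >= threshold:
                members.append(gene)
                break
        else:
            groups.append((labels, [gene]))
    return [members for _, members in groups]


# Hypothetical data: 4 genes measured on 6 samples.
expr = {
    "g1": [1.0, 1.2, 0.9, 5.0, 5.1, 4.8],  # separates samples 0-2 from 3-5
    "g2": [9.0, 8.7, 9.2, 2.0, 2.3, 1.9],  # opposite direction, same split
    "g3": [0.1, 3.0, 0.2, 3.1, 0.0, 2.9],  # separates even from odd samples
    "g4": [7.0, 1.0, 7.2, 1.1, 6.9, 0.8],  # opposite direction, same split
}

print(group_genes(expr))
```

In this toy data, g1 and g2 end up in one group and g3 and g4 in another, so the two gene groups correspond to two different clusterings of the same six samples. Because genes are compared through the partitions they induce rather than through their raw values, genes that differ in direction or scale can still fall into the same group.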


Japanese Society for Bioinformatics