Breast Cancer Stratification from Analysis of Micro-Array Data of Micro-Dissected Specimens

Gabriela Alexe[1][2]*
Gul S. Dalgin[3]*
Daniel Scanfeld[1]
Pablo Tamayo[1]
Jill P Mesirov[1]
Shridar Ganesan[4]
Charles Delisi[3]
Gyan Bhanot[2][4][5]

[1]The Broad Institute of MIT and Harvard, Cambridge MA, 02142, USA
[2]Institute for Advanced Study, Princeton, NJ, 08540, USA
[3]Boston University, Boston, MA, 02215, USA
[4]Cancer Institute of New Jersey, New Brunswick, NJ, 08903, USA
[5]Rutgers University, Piscataway, NJ 08854, USA
*Joint first authors


We describe a new method based on principal component analysis and robust consensus ensemble clustering to identify and elucidate the subtypes of breast cancer. The method was applied to microarray gene expression data from micro-dissected samples of 36 breast cancer patients with at least two of three pathological stages of disease. Controls were normal breast epithelial cells from 3 disease-free patients. Our method identified an optimum set of genes and strong, stable clusters that correlated well with clinical classification into Luminal, Basal and Her2+ subtypes based on ER, PR and Her2 status. It also revealed a hierarchical portrait of disease progression through various grades and stages, and identified genes and functional pathways for each stage, grade and disease subtype. We found that gene expression heterogeneity across subtypes is much greater than the heterogeneity of progression from DCIS to IDC within a subtype, suggesting that the disease subtypes are distinct disease processes. Averaging over data perturbations and clustering methods is critical to the robust identification of subtypes and of gene markers for grade and progression.
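The idea of averaging cluster assignments over data perturbations can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`kmeans`, `consensus_matrix`), the parameters (`n_runs`, `frac`, `noise`) and the toy data are assumptions made for illustration, plain k-means stands in for the ensemble of clustering methods, and the principal component analysis step is omitted. The sketch repeatedly subsamples the data, adds small Gaussian noise, clusters each perturbed copy, and records how often each pair of samples lands in the same cluster.

```python
import random


def kmeans(points, k, rng, iters=20):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Move each non-empty center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def consensus_matrix(points, k, n_runs=50, frac=0.8, noise=0.1, seed=0):
    """Fraction of perturbed runs in which each pair of samples co-clusters.

    Each run draws a random subsample (a fraction `frac` of the data),
    adds Gaussian noise with standard deviation `noise`, and clusters it.
    """
    rng = random.Random(seed)
    n = len(points)
    together = [[0] * n for _ in range(n)]  # times i and j shared a cluster
    sampled = [[0] * n for _ in range(n)]   # times i and j were both drawn
    for _ in range(n_runs):
        idx = rng.sample(range(n), max(k, int(frac * n)))
        perturbed = [[x + rng.gauss(0, noise) for x in points[i]] for i in idx]
        labels = kmeans(perturbed, k, rng)
        for a, i in enumerate(idx):
            for b, j in enumerate(idx):
                sampled[i][j] += 1
                if labels[a] == labels[b]:
                    together[i][j] += 1
    return [
        [together[i][j] / sampled[i][j] if sampled[i][j] else 0.0
         for j in range(n)]
        for i in range(n)
    ]


# Toy data (hypothetical): two well-separated groups of four samples each.
points = [
    [0.0, 0.1], [0.2, -0.1], [-0.1, 0.0], [0.1, 0.2],
    [10.0, 10.1], [9.8, 10.0], [10.2, 9.9], [10.1, 10.2],
]
cm = consensus_matrix(points, k=2)
```

Entries of `cm` close to 1 mark pairs of samples that cluster together under almost every perturbation, and entries close to 0 mark pairs that almost never do. In this kind of scheme, stable clusters show up as blocks of high values in the matrix.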


Japanese Society for Bioinformatics