Juan Carlos Oliveros (email@example.com)
Joaquín Dopazo (firstname.lastname@example.org)
Protein Design Group, Centro Nacional de Biotecnología (CNB-CSIC),
Campus de Cantoblanco, 28049 Madrid, Spain
Bioinformatics Unit, Centro Nacional de Investigaciones Oncológicas (CNIO),
Ctra Majadahonda-Pozuelo Km 2, 28220 Majadahonda, Madrid, Spain
Expression arrays facilitate the monitoring of changes in the expression patterns of large collections of genes. It is generally expected that genes with similar expression patterns correspond to proteins of common biological function. We assess this common assumption by comparing the level of similarity of expression patterns with the statistical significance of the biological terms that describe the corresponding protein functions. Terms are obtained automatically by mining large collections of Medline abstracts. We propose that the combined use of tools for clustering expression profiles and for automatic function retrieval can be useful for detecting biologically relevant associations between genes in complex gene expression experiments. The results obtained using publicly available experimental data show that, in general, an increase in the similarity of the expression patterns is accompanied by an increase in the amount of specific functional information; in other words, the selected terms become more specific as the expression patterns become more specific. Particularly interesting are the departures from this general trend, i.e. groups of genes with similar expression patterns but very little in common at the functional level. In these cases the similarity of their expression profiles provides the first link between previously unrelated genes.
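
The comparison outlined above, expression similarity within a group of genes set against the significance of the terms describing them, can be illustrated with a minimal sketch. It is not the method used in this work: the choice of Pearson correlation as the similarity measure, the binomial z-score as the significance measure, and all function names and toy data are assumptions made only for illustration.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two expression profiles of equal length.
    Assumes neither profile is constant (non-zero variance)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def mean_pairwise_similarity(profiles):
    """Average Pearson correlation over all gene pairs in a group.
    `profiles` maps a gene name to its list of expression values."""
    genes = sorted(profiles)
    pairs = [(g, h) for i, g in enumerate(genes) for h in genes[i + 1:]]
    return sum(pearson(profiles[g], profiles[h]) for g, h in pairs) / len(pairs)

def term_z_score(term, cluster_docs, background_docs):
    """z-score of how often `term` occurs in the abstracts of a gene group,
    relative to its frequency in a background collection.
    Each document is a set of words (hypothetical representation)."""
    p = sum(term in d for d in background_docs) / len(background_docs)
    n = len(cluster_docs)
    k = sum(term in d for d in cluster_docs)
    sd = sqrt(n * p * (1 - p))  # binomial standard deviation
    return (k - n * p) / sd if sd > 0 else 0.0

# Toy data: three genes, two of them tightly co-expressed.
profiles = {"geneA": [1.0, 2.0, 3.0],
            "geneB": [2.0, 4.0, 6.0],
            "geneC": [3.0, 2.0, 1.0]}
tight = {g: profiles[g] for g in ("geneA", "geneB")}

# Toy abstracts: "kinase" appears in 2 of 10 background documents
# but in all 4 documents linked to the group.
background = [{"kinase"}] * 2 + [{"membrane"}] * 8
group_docs = [{"kinase", "cell"}] * 4

print(mean_pairwise_similarity(tight))
print(term_z_score("kinase", group_docs, background))
```

Under these assumptions, a group with high mean pairwise similarity and high-scoring terms follows the general trend, whereas a group with high similarity but no high-scoring terms would be one of the departures discussed above.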