A Novel Graph-Based Similarity Measure for 2D Chemical Structures

Si Quang Le[1] (quang@jaist.ac.jp)
Tu Bao Ho[1] (bao@jaist.ac.jp)
T.T Hang Phan[2] (s0244@st.ube-k.ac.jp)

[1]Japan Advanced Institute of Science and Technology, Ishikawa 923-1292, Japan
[2]Ube National College of Technology, Yamaguchi 755-8555, Japan


In this paper, we propose a graph-based method for measuring the similarity between chemical compounds described in 2D form. Our main idea is to measure the similarity between two compounds based on the edges, nodes, and connectivity of their common subgraphs. We applied the proposed similarity measure, in combination with a clustering method, to more than eleven thousand compounds in the chemical compound database KEGG/LIGAND and found that clusters of compounds with highly similar structures share common names, take part in the same pathways, and require the same enzymes in reactions. Furthermore, we found a surprising agreement between the pathway modules identified by clusters of structurally similar compounds and those identified by genomic context, namely, the operon structures of enzyme genes.
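
As an illustration only, the general idea of scoring two compounds by the nodes and edges of a common subgraph can be sketched as follows. This is not the measure defined in the paper: the Tanimoto-style ratio, the brute-force search, and all names (`mol`, `best_common_size`, `similarity`) are assumptions made for this sketch, and the connectivity term mentioned in the abstract is not modeled. Atoms are heavy atoms only, and the search is exponential, so it is suitable for toy inputs only.

```python
def mol(atoms, bond_list):
    """Hypothetical helper: a molecule is (atom labels, {frozenset({i, j}): bond order})."""
    return atoms, {frozenset((a, b)): order for a, b, order in bond_list}


def best_common_size(atoms1, bonds1, atoms2, bonds2):
    """Largest (matched atoms + matched bonds) over label-preserving partial mappings."""
    best = 0

    def extend(i, mapping, used):
        nonlocal best
        if i == len(atoms1):
            # Count bonds whose endpoints are both mapped and whose order agrees.
            edges = sum(
                1
                for pair, order in bonds1.items()
                if all(a in mapping for a in pair)
                and bonds2.get(frozenset(mapping[a] for a in pair)) == order
            )
            best = max(best, len(mapping) + edges)
            return
        extend(i + 1, mapping, used)  # option 1: leave atom i unmatched
        for j, label in enumerate(atoms2):
            if j not in used and label == atoms1[i]:  # option 2: match equal labels
                mapping[i] = j
                used.add(j)
                extend(i + 1, mapping, used)
                del mapping[i]
                used.remove(j)

    extend(0, {}, set())
    return best


def similarity(m1, m2):
    """Assumed Tanimoto-style score: common size / (size1 + size2 - common size)."""
    (a1, b1), (a2, b2) = m1, m2
    common = best_common_size(a1, b1, a2, b2)
    size1, size2 = len(a1) + len(b1), len(a2) + len(b2)
    denom = size1 + size2 - common
    return 1.0 if denom == 0 else common / denom


# Toy inputs (heavy atoms only): ethanol C-C-O and ethylamine C-C-N.
ethanol = mol(["C", "C", "O"], [(0, 1, 1), (1, 2, 1)])
ethylamine = mol(["C", "C", "N"], [(0, 1, 1), (1, 2, 1)])
score = similarity(ethanol, ethylamine)
```

In this toy case the two molecules share only the C-C fragment (two atoms and one bond), so the score is driven by that fragment relative to the combined size of both graphs. A pairwise score of this kind is the sort of input a clustering method would consume.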


Japanese Society for Bioinformatics