Automated Discovery of Protein Functional Units from Amino-Acid Sequences Using Rough-Sets-Based Comparative Analysis

S. Tsumoto[1]
H. Tanaka[1]
K. Tsumoto[2]
I. Kumagai[2]

[1] Department of Information Medicine, Medical Research Institute, Tokyo Medical and Dental University
1-5-45 Yushima, Bunkyo-ku, Tokyo 113 Japan
[2] Department of Biochemistry and Engineering, Graduate School of Engineering, Tohoku University


Protein structure analysis from DNA sequences is an important and fast-growing area in both computer science and biochemistry [3]. One of the most important problems is that two proteins with similar three-dimensional structures, such as lysozyme and alpha-lactalbumin, can have different functions. In such cases, comparative analysis of the two amino-acid sequences is an effective way to detect the functional and structural differences. In this paper, we introduce a system called MW1 (Molecular biologists' Workbench version 1.0), which extracts differential knowledge from amino-acid sequences by using rough-set-based classification, statistical analysis, and change of representation. The method is applied to the following two domains: comparative analysis of lysozyme and alpha-lactalbumin, and analysis of immunoglobulin structure. The results show that the system obtains several interesting findings from amino-acid sequences that have not been reported before.