Jian Huang (email@example.com)
Wataru Honda (firstname.lastname@example.org)
Minoru Kanehisa (email@example.com)
Bioinformatics Center, Institute for Chemical Research, Kyoto University, Gokasho, Uji, Kyoto 611-0011, Japan
School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China
Human Genome Center, Institute of Medical Science, University of Tokyo, 4-6-1 Shirokanedai, Minato-ku, Tokyo 108-8639, Japan
We evaluate the performance of six amino acid indices in B cell epitope residue prediction, using the classical sliding window method on five data sets. Four of the indices (relative connectivity, clustering coefficient, closeness and betweenness) are newly derived from the topological parameters of residue networks. The other two, Parker's hydrophilicity and Levitt's index, are known as the best indices so far for B cell epitope prediction. On four of the data sets, all six indices performed comparably and, in general, poorly. On the one well-annotated data set, performance improved, and the four network-based indices outperformed Parker's hydrophilicity and Levitt's index. With the relative connectivity index on this data set, prediction accuracy, sensitivity and specificity reached 73.6%, 73.0% and 75.0% respectively, with an area under the curve of about 0.796. We therefore suggest that this index is a good choice for B cell epitope prediction. These results also indicate that the low performance of B cell epitope prediction is due not only to the methods and amino acid indices used but also to the data sets. Interestingly, on the well-annotated data set, the performance of B cell epitope residue prediction is very similar to that of protein surface residue prediction, especially at the 10 and 20 Å² cutoffs. This suggests that the performance of surface residue prediction might form a theoretical upper limit for the performance of B cell epitope residue prediction methods.
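For readers unfamiliar with the sliding window method, the following is a minimal Python sketch of how a per-residue score can be obtained from an amino acid index and then evaluated for accuracy, sensitivity and specificity. It is an illustration only and not the authors' implementation. The function names, the default window size of 7, the truncation of windows at the sequence termini, the default value for unknown residues and the rule "score at or above the threshold means epitope" are all assumptions made here. For some indices the opposite direction may be appropriate. The index passed in is any mapping from one-letter residue codes to numeric values.

```python
from typing import Dict, List, Sequence, Tuple


def sliding_window_scores(
    sequence: str, index: Dict[str, float], window: int = 7
) -> List[float]:
    """Average an amino acid index over a window centred on each residue.

    Assumptions (not taken from the paper): the window is odd, windows are
    truncated at the sequence termini, and residues missing from the index
    contribute 0.0.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    values = [index.get(residue, 0.0) for residue in sequence]
    scores = []
    for i in range(len(values)):
        segment = values[max(0, i - half): i + half + 1]
        scores.append(sum(segment) / len(segment))
    return scores


def evaluate(
    scores: Sequence[float], labels: Sequence[int], threshold: float
) -> Tuple[float, float, float]:
    """Return (accuracy, sensitivity, specificity) for a given threshold.

    A residue is predicted as an epitope residue when its score is at or
    above the threshold (assumed direction). Labels are 1 for epitope
    residues and 0 otherwise.
    """
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        predicted = score >= threshold
        if predicted and label == 1:
            tp += 1
        elif predicted and label == 0:
            fp += 1
        elif not predicted and label == 0:
            tn += 1
        else:
            fn += 1
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return accuracy, sensitivity, specificity
```

Sweeping the threshold over the range of observed scores and recording sensitivity against one minus specificity at each step gives the points of a ROC curve, from which an area under the curve can be estimated.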