FragQA: Predicting Local Fragment Quality of a Sequence-Structure Alignment

Xin Gao[1] (x4gao@cs.uwaterloo.ca)
Dongbo Bu[1],[3] (dbu@cs.uwaterloo.ca)
Shuai Cheng Li[1] (scli@cs.uwaterloo.ca)
Jinbo Xu[2] (j3xu@tti-c.org)
Ming Li[1] (mli@cs.uwaterloo.ca)

[1]David R. Cheriton School of Computer Science, University of Waterloo, Waterloo, ON, Canada, N2L 3G1
[2]Toyota Technological Institute at Chicago, Chicago, IL, USA, 60637
[3]Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China, 100080


Abstract

Motivation.
Although protein structure prediction has made great progress in recent years, a protein model derived from automated prediction methods is subject to various errors. As methods for structure prediction develop, a continuing problem is how to evaluate the quality of a protein model, and especially how to identify the well-predicted regions of the model, so that the structural biology community can benefit from automated structure prediction. It is also important to identify badly predicted regions in a model so that refinement measures can be applied to them.
Results.
We present FragQA, a novel technique for accurately predicting the local quality of a sequence-structure (i.e., sequence-template) alignment generated by comparative modeling (i.e., homology modeling and threading). Unlike previous local quality assessment methods, FragQA directly predicts the cRMSD between a continuously aligned fragment determined by an alignment and the corresponding fragment in the native structure. FragQA uses SVM (support vector machine) regression to make this prediction from information extracted from a single given alignment. Experimental results demonstrate that FragQA performs well in predicting local quality. More specifically, FragQA achieves better prediction accuracy than ProQres, a top performer. Our results indicate that (1) local quality can be predicted well; (2) local sequence evolutionary information (i.e., sequence similarity) is the major factor in predicting local quality; and (3) structure information such as solvent accessibility and secondary structure helps improve prediction performance.
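
For readers unfamiliar with the quantity being predicted, the following is a minimal sketch of how cRMSD (coordinate root-mean-square deviation after optimal superposition) between two equal-length fragments can be computed. It is an illustration only and not FragQA's implementation: the function name crmsd, the use of NumPy, the Kabsch superposition, and the choice of one 3D point per residue (for example, C-alpha atoms) are all assumptions made here.

```python
import numpy as np

def crmsd(frag_a, frag_b):
    """cRMSD between two (n, 3) coordinate arrays after optimal superposition.

    Hypothetical helper: assumes one 3D point per residue (e.g. C-alpha)
    and that row i of frag_a corresponds to row i of frag_b.
    """
    a = np.asarray(frag_a, dtype=float)
    b = np.asarray(frag_b, dtype=float)
    if a.shape != b.shape or a.ndim != 2 or a.shape[1] != 3:
        raise ValueError("fragments must both have shape (n, 3)")

    # Remove the translation by centering both fragments on their centroids.
    a_c = a - a.mean(axis=0)
    b_c = b - b.mean(axis=0)

    # Kabsch algorithm: optimal rotation from the SVD of the covariance matrix.
    u, _, vt = np.linalg.svd(a_c.T @ b_c)

    # Force a proper rotation (determinant +1) so mirror images do not match.
    sign = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0
    rot = vt.T @ np.diag([1.0, 1.0, sign]) @ u.T

    # Rotate fragment A onto fragment B and measure the residual deviation.
    diff = a_c @ rot.T - b_c
    return float(np.sqrt((diff ** 2).sum() / len(a)))
```

A regression model such as the SVM described above would take features of an aligned fragment as input and be trained against values of this kind, computed between the template-derived fragment and the native fragment.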



Japanese Society for Bioinformatics