Predicting Disordered Regions from Amino Acid Sequence: Common Themes Despite Differing Structural Characterization

Ethan Garner[1] (egarner@wsunix.wsu.edu)
Paul Cannon[2] (pcannon@eecs.wsu.edu)
Pedro Romero[2] (promero@eecs.wsu.edu)
Zoran Obradovic[2] (zoran@eecs.wsu.edu)
A. Keith Dunker[3] (dunker@mail.wsu.edu)

[1] Department of Biochemistry and Biophysics, Washington State University
Pullman, WA 99164-4660
[2] School of Electrical Engineering and Computer Science, Washington State University
Pullman, WA 99164-4660
[3] Author to whom all correspondence should be addressed.
Department of Biochemistry and Biophysics, Washington State University
Pullman, WA 99164-4660
Telephone: 509 335-5322, Fax: 509 335-9688


Abstract

Using ordered and disordered regions identified either by X-ray crystallography or by NMR spectroscopy, we trained neural networks to predict order and disorder from amino acid sequence. Although the NMR-based predictor initially appeared to be much better than the one based on the X-ray data, both predictors yielded similar overall accuracies when tested on each other's training sets and indicated similar regions of disorder in each sequence. The predictors trained with X-ray data showed similar results for a five-fold cross-validation experiment and for the out-of-sample predictions on the NMR-characterized data. In contrast, the predictor trained with NMR data gave substantially worse accuracies on the out-of-sample X-ray data than those displayed by the five-fold cross-validation during network training. Overall, the results from the two predictors suggest that disordered regions comprise a sequence-dependent category distinct from that of ordered protein structure.
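
The two kinds of evaluation compared in the abstract, five-fold cross-validation within one data set and out-of-sample testing on the other, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the function names are hypothetical, each residue is reduced to a single numeric score, and a simple threshold classifier stands in for the neural networks.

```python
from typing import Callable, List, Sequence

# A predictor maps one per-residue score to a label: 1 = disordered, 0 = ordered.
Predictor = Callable[[float], int]


def k_fold_indices(n: int, k: int) -> List[List[int]]:
    """Split indices 0..n-1 into k contiguous folds of near-equal size."""
    folds: List[List[int]] = []
    start = 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def train_threshold(xs: Sequence[float], ys: Sequence[int]) -> Predictor:
    """Toy stand-in for a neural network (an assumption, not the paper's model).

    The cut is the midpoint between the class means; scores above it are
    labelled disordered. Both classes must be present in the training data.
    """
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    cut = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda x: 1 if x > cut else 0


def accuracy(predict: Predictor, xs: Sequence[float], ys: Sequence[int]) -> float:
    """Fraction of examples whose predicted label matches the true label."""
    correct = sum(1 for x, y in zip(xs, ys) if predict(x) == y)
    return correct / len(xs)


def cross_validated_accuracy(xs: Sequence[float], ys: Sequence[int], k: int = 5) -> float:
    """Mean held-out accuracy over k folds drawn from a single data set."""
    scores = []
    for fold in k_fold_indices(len(xs), k):
        held_out = set(fold)
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        model = train_threshold(train_x, train_y)
        scores.append(accuracy(model, [xs[i] for i in fold], [ys[i] for i in fold]))
    return sum(scores) / len(scores)


def out_of_sample_accuracy(train_x, train_y, test_x, test_y) -> float:
    """Train on one data set (e.g. X-ray) and test on the other (e.g. NMR)."""
    return accuracy(train_threshold(train_x, train_y), test_x, test_y)
```

Under these assumptions, a cross-validated accuracy well above the out-of-sample accuracy corresponds to the pattern the abstract reports for the NMR-trained predictor, while similar values for both correspond to the X-ray-trained case.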



Japanese Society for Bioinformatics