The Sequence Attribute Method for Determining Relationships Between Sequence and Protein Disorder

Qian Xie[1] (qian@grover.chem.wsu.edu)
Gregory E. Arnold[3][2] (ge_arnold_bits@yahoo.com)
Pedro Romero[2] (promero@eecs.wsu.edu)
Zoran Obradovic[2] (zoran@eecs.wsu.edu)
Ethan Garner[1] (egarner@wsunix.wsu.edu)
A. Keith Dunker[1] (dunker@mail.wsu.edu)

[1] Department of Biochemistry and Biophysics
[2] School of Electrical Engineering and Computer Science
Washington State University
Pullman, WA 99164-4660, USA
[3] Biological Information Technologies
P.O. Box 1403, Richland, WA 99352, USA
[4] Present address: Amgen, Mail Stop 14-1-D, One Amgen Dr., Thousand Oaks, CA 91320, USA


Abstract

The conditional probability, P(s|x), is the probability that the event s will occur given prior knowledge of the value of x. If x is given and if s is randomly distributed, then an empirical approximation of the true conditional probability can be computed by applying Bayes' Theorem. Here s represents one of two structural classes, either ordered, s_o, or disordered, s_d, and x represents an attribute value calculated over a window of 21 amino acids. Plots of P(s|x) versus x provide information about the correlation between the given sequence attribute and disorder or order. These conditional probability plots allow quantitative comparisons between individual attributes for their ability to discriminate between ordered and disordered states. Using such quantitative comparisons, 38 different sequence attributes have been rank-ordered. Attributes based on cysteine, the aromatics, flexible tendencies, and charge were found to be the best for distinguishing order from disorder among those tested so far.
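
The Bayes' Theorem calculation described above can be illustrated with a minimal Python sketch. The abstract does not give the attribute definitions, the binning scheme, or the class priors, so everything specific here is an assumption made for illustration only: the attribute is taken to be the fraction of selected residues in each 21-residue window, x is grouped into equal-width bins on [0, 1], and the prior P(s_d) defaults to 0.5. The function names `window_attribute` and `conditional_probability` are hypothetical and do not come from the paper.

```python
from collections import Counter

WINDOW = 21  # window length stated in the abstract


def window_attribute(sequence, residues, window=WINDOW):
    """Return, for each full window, the fraction of residues found in `residues`.

    Assumption: a simple composition attribute; the paper's own attribute
    definitions are not given in the abstract.
    """
    values = []
    for start in range(len(sequence) - window + 1):
        segment = sequence[start:start + window]
        values.append(sum(ch in residues for ch in segment) / window)
    return values


def conditional_probability(ordered_values, disordered_values,
                            n_bins=5, prior_disordered=0.5):
    """Estimate P(s_d | x) for each bin of x using Bayes' Theorem:

        P(s_d | x) = P(x | s_d) P(s_d)
                     / (P(x | s_d) P(s_d) + P(x | s_o) P(s_o))

    Assumptions: x lies in [0, 1], equal-width bins, and a user-supplied prior.
    Bins that contain no windows from either class are omitted from the result.
    """
    if not ordered_values or not disordered_values:
        raise ValueError("both classes need at least one attribute value")

    def bin_index(x):
        # Clamp x == 1.0 into the last bin.
        return min(int(x * n_bins), n_bins - 1)

    ordered_counts = Counter(bin_index(x) for x in ordered_values)
    disordered_counts = Counter(bin_index(x) for x in disordered_values)
    prior_ordered = 1.0 - prior_disordered

    posterior = {}
    for b in range(n_bins):
        # Empirical likelihoods P(x | s_o) and P(x | s_d) for this bin.
        p_x_given_o = ordered_counts[b] / len(ordered_values)
        p_x_given_d = disordered_counts[b] / len(disordered_values)
        evidence = p_x_given_d * prior_disordered + p_x_given_o * prior_ordered
        if evidence > 0:
            posterior[b] = p_x_given_d * prior_disordered / evidence
    return posterior
```

Plotting the returned values against the bin centres would give one conditional probability curve per attribute. Because s is one of only two classes, P(s_o|x) in each populated bin is simply 1 - P(s_d|x).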



Japanese Society for Bioinformatics