Henning Riedesel (firstname.lastname@example.org)
Björn Kolbeck (email@example.com)
Oliver Schmetzer (firstname.lastname@example.org)
Ernst-Walter Knapp (email@example.com)
Institute of Chemistry, Free University of Berlin, Takustrasse 6, Berlin 14195, Germany
We explore two methods for predicting whether nonapeptides bind to the class I major histocompatibility complex, using a general linear scoring function that defines a separating hyperplane in the feature space of sequences. In the absence of suitable data on non-binding nonapeptides, we generated sequences randomly from a selected set of proteins from the Protein Data Bank. The parameters of the scoring function were determined by a generalized least-squares optimization (LSM) and, alternatively, by a support vector machine (SVM). With the generalized LSM, impaired learning data consisting of a small set of binding peptides and a large set of non-binding peptides can be treated in a balanced way, rendering LSM more successful than SVM, whereas for symmetric data sets SVM has a slight advantage over LSM.
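
As an illustration of the idea summarized above, the following minimal sketch fits a linear scoring function to one-hot encoded nonapeptides by class-weighted least squares. Several choices here are assumptions and not taken from the text: the one-hot encoding with a bias term, the per-class weights of 1/n (so that few binders and many non-binders contribute equally), the small ridge term, and all function names. The toy binders carry a made-up motif (L at position 2, V at position 9), and the non-binders are drawn uniformly at random as a stand-in for sequences sampled from Protein Data Bank proteins.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def encode(peptide):
    """One-hot encode a nonapeptide as a vector of 9*20 features plus a bias entry."""
    if len(peptide) != 9:
        raise ValueError("expected a nonapeptide (9 residues)")
    x = np.zeros(9 * 20 + 1)
    for pos, aa in enumerate(peptide):
        x[pos * 20 + AA_INDEX[aa]] = 1.0
    x[-1] = 1.0  # bias term, i.e. the offset of the hyperplane
    return x


def fit_weighted_lsq(binders, non_binders, ridge=1e-3):
    """Class-weighted least squares with targets +1 (binder) and -1 (non-binder).

    Assumption: each class gets total weight 1, so an unbalanced training
    set is treated in a balanced way. The ridge term only keeps the
    (rank-deficient) normal equations solvable.
    """
    X = np.array([encode(p) for p in binders + non_binders])
    y = np.concatenate([np.ones(len(binders)), -np.ones(len(non_binders))])
    w = np.concatenate([
        np.full(len(binders), 1.0 / len(binders)),
        np.full(len(non_binders), 1.0 / len(non_binders)),
    ])
    A = X.T @ (w[:, None] * X) + ridge * np.eye(X.shape[1])
    b = X.T @ (w * y)
    return np.linalg.solve(A, b)


def score(theta, peptide):
    """Linear score; its sign gives the side of the separating hyperplane."""
    return float(encode(peptide) @ theta)


def random_peptide(rng, motif=False):
    """Toy data generator (hypothetical, for illustration only)."""
    residues = rng.choice(list(AMINO_ACIDS), size=9)
    if motif:
        residues[1] = "L"  # made-up anchor at position 2
        residues[8] = "V"  # made-up anchor at position 9
    return "".join(residues)


rng = np.random.default_rng(0)
binders = [random_peptide(rng, motif=True) for _ in range(5)]
non_binders = [random_peptide(rng) for _ in range(200)]

theta = fit_weighted_lsq(binders, non_binders)
mean_binder_score = np.mean([score(theta, p) for p in binders])
mean_non_binder_score = np.mean([score(theta, p) for p in non_binders])
```

A new peptide would be classified by the sign of `score(theta, peptide)`. Setting all weights to the same value instead would recover ordinary least squares, in which the 200 non-binders dominate the fit.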