Multi-Class Support Vector Machines for Protein Secondary Structure Prediction

Minh N. Nguyen (minhnguyen@pmail.ntu.edu.sg)
Jagath C. Rajapakse (asjagath@ntu.edu.sg)

School of Computer Engineering, Nanyang Technological University, Singapore


Abstract

The solution of binary classification problems using the Support Vector Machine (SVM) method has been well developed. Though multi-class classification is typically solved by combining several binary classifiers, recently, several multi-class methods that consider all classes at once have been proposed. However, these methods require resolving a much larger optimization problem and are applicable to small datasets. Three methods based on binary classifications: one-against-all (OAA), one-against-one (OAO), and directed acyclic graph (DAG), and two approaches for multi-class problem by solving one single optimization problem, are implemented to predict protein secondary structure. Our experiments indicate that multi-class SVM methods are more suitable for protein secondary structure (PSS) prediction than the other methods, including binary SVMs, because their capacity to solve an optimization problem in one step. Furthermore, in this paper, we argue that it is feasible to extend the prediction accuracy by adding a second-stage multi-class SVM to capture the contextual information among secondary structural elements and thereby further improving the accuracies. We demonstrate that two-stage SVMs perform better than single-stage SVM techniques for PSS prediction using two datasets and report a maximum accuracy of 79.5%.

[ Full-text PDF | Table of Contents ]


Japanese Society for Bioinformatics