Combining Protein Secondary Structure Prediction Models with Ensemble Methods of Optimal Complexity

Yann Guermeur 1, Gianluca Pollastri, André Elisseeff, Dominique Zelus, Hélène Paugam-Moisy, Pierre Baldi
1 MODBIO - Computational models in molecular biology
INRIA Lorraine, LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
Abstract: Many sophisticated methods are currently available to perform protein secondary structure prediction. Since they are frequently based on different principles and different knowledge sources, significant benefits can be expected from combining them. However, the choice of an appropriate combiner is an issue in its own right. The first difficulty to overcome when combining prediction methods is overfitting, which is why we investigate the use of Support Vector Machines for the task. A family of multi-class SVMs is introduced. Two of these machines are used to combine some of the current best protein secondary structure prediction methods. Their performance is consistently superior to that of the ensemble methods traditionally used in the field. They also outperform the decomposition approaches based on bi-class SVMs. Furthermore, initial experimental evidence suggests that their outputs could be processed by the biologist to perform higher-level treatments.
Document type:
Journal article
Neurocomputing, Elsevier, 2003, 56, pp.305-327
Contributor: Publications Loria
Submitted on: Tuesday, September 26, 2006 - 09:38:03
Last modified on: Thursday, January 11, 2018 - 06:19:51


  • HAL Id: inria-00099518, version 1



Yann Guermeur, Gianluca Pollastri, André Elisseeff, Dominique Zelus, Hélène Paugam-Moisy, et al. Combining Protein Secondary Structure Prediction Models with Ensemble Methods of Optimal Complexity. Neurocomputing, Elsevier, 2003, 56, pp.305-327. 〈inria-00099518〉


