Combining Protein Secondary Structure Prediction Models with Ensemble Methods of Optimal Complexity

Yann Guermeur 1, Gianluca Pollastri, André Elisseeff, Dominique Zelus, Hélène Paugam-Moisy, Pierre Baldi
1 MODBIO - Computational models in molecular biology
INRIA Lorraine, LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
Abstract : Many sophisticated methods are currently available for protein secondary structure prediction. Because they are frequently based on different principles and different knowledge sources, significant benefits can be expected from combining them. However, the choice of an appropriate combiner is an issue in its own right. The first difficulty to overcome when combining prediction methods is overfitting, which is why we investigate the use of Support Vector Machines (SVMs) for this task. A family of multi-class SVMs is introduced. Two of these machines are used to combine some of the current best protein secondary structure prediction methods. Their performance is consistently superior to that of the ensemble methods traditionally used in the field. They also outperform decomposition approaches based on bi-class SVMs. Furthermore, initial experimental evidence suggests that their outputs could be processed by the biologist to perform higher-level treatments.
Document type :
Journal articles

https://hal.inria.fr/inria-00099518
Contributor : Publications Loria
Submitted on : Tuesday, September 26, 2006 - 9:38:03 AM
Last modification on : Thursday, January 11, 2018 - 6:19:51 AM

Identifiers

  • HAL Id : inria-00099518, version 1

Citation

Yann Guermeur, Gianluca Pollastri, André Elisseeff, Dominique Zelus, Hélène Paugam-Moisy, et al. Combining Protein Secondary Structure Prediction Models with Ensemble Methods of Optimal Complexity. Neurocomputing, Elsevier, 2003, 56, pp.305-327. ⟨inria-00099518⟩
