
inria-00101092, version 1

Combining Protein Secondary Structure Prediction Models with Ensemble Methods of Optimal Complexity

Yann Guermeur a1, Dominique Zelus b

Journées Ouvertes Biologie Informatique Mathématiques - JOBIM'2001 (2001) 97-104

Abstract: The idea of combining models instead of simply selecting the "best" one, in order to improve performance, has a long theoretical background in statistics. However, theoretical results are ordinarily based on strong hypotheses that are seldom satisfied in practice. When dealing with real-world problems, overfitting is often the main limitation, and it can be overcome only through strict control of the complexity of the selected combiner. SVMs should thus be well suited to these difficult situations. Investigating this idea, we introduce a new family of multi-class SVMs and assess them as ensemble methods for protein secondary structure prediction. Experimental evidence highlights the gain in prediction accuracy that results from combining some of the current best prediction methods with our SVMs rather than with the combiners traditionally used in the field.

  • a –  UNIVERSITE HENRI POINCARE
  • b –  UNIVERSITE DE ROSARIO, ARGENTINE
  • 1 :  MODBIO (INRIA Lorraine - LORIA)
  • INRIA – CNRS : UMR7503 – Université Henri Poincaré - Nancy I – Université Nancy II – Institut National Polytechnique de Lorraine (INPL)
  • Domain: Computer Science/Other
  • Keywords: protein secondary structure prediction – multi-class SVM – ensemble methods
  • Internal reference: A01-R-041 || guermeur01a
  • Comment: National conference with proceedings and peer review.
 
  • oai:hal.inria.fr:inria-00101092
  • Contributor: 
  • Submitted on: Tuesday, 26 September 2006, 14:56:27
  • Last modified on: Thursday, 28 September 2006, 15:22:48