Margin Error and Generalization Capabilities of Multi-Class Discriminant Systems

André Elisseeff, Yann Guermeur (1), Hélène Paugam-Moisy
(1) MODBIO - Computational models in molecular biology
INRIA Lorraine, LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
Abstract: The theory and practice of discriminant analysis have been mainly developed for two-class problems (computation of dichotomies). This phenomenon can easily be explained, since there is an obvious way to perform multi-category discrimination tasks using solely models computing dichotomies. It consists in dividing the problem at hand into several one-against-all ones and applying a simple rule to construct the global discriminant function from the partial ones. Adopting a direct approach, however, should make it possible to improve the results, let them be theoretical (bounds on the expected risk) or practical (values of the empirical risk and the expected risk). Although multi-category extensions of the main models computing dichotomies can often be conceived simply, as in the case of multi-layer perceptrons, in other cases, this cannot be done readily but at the expense of the loss of part of the theoretical foundations. This is for instance the main shortcoming of the multi-category support vector machines developed so far. One of the major difficulties of multi-category discriminant analysis rests in the fact that it requires specific uniform convergence results. Indeed, the uniform strong laws of large numbers established for dichotomies do not extend nicely to multi-category problems. They become significantly looser. This is problematical indeed, since the question of the quality of bounds is of central importance if one wants to implement with confidence the structural risk minimization inductive principle, which is precisely grounding the support vector method. In this paper, building upon the notions of margin used in the context of statistical learning theory and boosting theory, and the corresponding generalization error bounds, we derive sharper bounds on the expected risk (generalization error) of multi-class vector-valued discriminant models. The main result is an extension of a lemma by Bartlett.
After a discussion about the notion of margin and its use for two-class discriminant analysis, we derive the main theorem and its corollaries. We then show how to bound the capacity measure for sets of functions of interest and study specifically the case of the multivariate linear regression model, which is of particular importance, since it is directly related to multi-category support vector machines. Finally, the bound is assessed on a real-world biocomputing problem. This work, which aims at establishing foundations for the statistical analysis of multi-class models, based on a new notion of margin, paves the way for the theoretical study of the existing multi-class support vector machines and the design of new ones.
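As an illustration of the one-against-all decomposition mentioned in the abstract, the following minimal sketch combines several two-class scorers into a single decision by picking the class with the highest output. The function names, the toy linear scorers, and the margin definition used here (score of the true class minus the best competing score) are assumptions made for illustration only; the report introduces its own notion of margin, which may differ.

```python
from typing import Callable, List, Sequence

Vector = Sequence[float]
Scorer = Callable[[Vector], float]


def linear_scorer(w: Vector, b: float) -> Scorer:
    """Build a toy linear two-class scorer x -> <w, x> + b (hypothetical helper)."""
    return lambda x: sum(wi * xi for wi, xi in zip(w, x)) + b


def one_against_all_predict(scorers: List[Scorer], x: Vector) -> int:
    """Return the index of the class whose one-against-all scorer is largest."""
    scores = [f(x) for f in scorers]
    return max(range(len(scores)), key=scores.__getitem__)


def multiclass_margin(scores: Sequence[float], true_class: int) -> float:
    """True-class score minus the best competing score.

    This is one commonly used definition, assumed here; it is not
    taken from the report. A negative value means a misclassification.
    """
    best_other = max(s for k, s in enumerate(scores) if k != true_class)
    return scores[true_class] - best_other


# Three toy scorers, one per class (weights chosen arbitrarily).
scorers = [
    linear_scorer([1.0, 0.0], 0.0),
    linear_scorer([0.0, 1.0], 0.0),
    linear_scorer([-1.0, -1.0], 0.5),
]
x = [2.0, 0.5]
scores = [f(x) for f in scorers]
predicted = one_against_all_predict(scorers, x)
margin = multiclass_margin(scores, true_class=0)
```

A positive margin indicates that the example is classified correctly with some slack, which is the kind of quantity that margin-based generalization bounds typically depend on.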
Document type:
[Intern report] A01-R-033 || elisseeff01a, 2001, 29 p
Contributor: Publications Loria
Submitted on: Tuesday, September 26, 2006 - 14:49:54
Last modified on: Thursday, January 11, 2018 - 06:19:51


  • HAL Id : inria-00100708, version 1



André Elisseeff, Yann Guermeur, Hélène Paugam-Moisy. Margin Error and Generalization Capabilities of Multi-Class Discriminant Systems. [Intern report] A01-R-033 || elisseeff01a, 2001, 29 p. 〈inria-00100708〉


