Leveraging k-NN for generic classification boosting

Paolo Piro; Richard Nock; Frank Nielsen; Michel Barlaud

Journal article, Neurocomputing, Year: 2012

Paolo Piro

Role: Author
PersonId : 860220

Laboratoire d'Informatique, Signaux, et Systèmes de Sophia-Antipolis (I3S) / Equipe IMAGES-CREATIVE

Richard Nock

Role: Author
PersonId : 838976

Centre de Recherche en Economie, Gestion, Modélisation et Informatique Appliquée

Frank Nielsen

Role: Author

Sony Corporation

Michel Barlaud

Role: Author
PersonId : 4232
IdHAL : michel-barlaud
ORCID : 0000-0001-9093-033X
IdRef : 033877572

Laboratoire d'Informatique, Signaux, et Systèmes de Sophia-Antipolis (I3S) / Equipe IMAGES-CREATIVE

Abstract

Voting rules relying on k-nearest neighbors (k-NN) are an effective tool in countless machine learning techniques. Thanks to its simplicity, k-NN classification is very attractive to practitioners, as it achieves very good performance in several practical applications. However, it suffers from various drawbacks, such as sensitivity to "noisy" instances and poor generalization properties when dealing with sparse high-dimensional data. In this paper, we tackle the k-NN classification problem at its core by providing a novel k-NN boosting approach. Namely, we propose a supervised learning algorithm, called Universal Nearest Neighbors (UNN), that induces a leveraged k-NN rule by globally minimizing a surrogate risk upper bounding the empirical misclassification rate over training data. Interestingly, this surrogate risk can be arbitrarily chosen from a class of Bregman loss functions, including the familiar exponential, logistic and squared losses. Furthermore, we show that UNN makes it possible to efficiently filter a dataset of instances by keeping only a small fraction of data. Experimental results on the synthetic Ripley's dataset show that such a filtering strategy is able to reject "noisy" examples, and yields a classification error close to the optimal Bayes error. Experiments on standard UCI datasets show significant improvements over the current state of the art.
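
The idea of a leveraged k-NN rule can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes binary labels in {-1, +1}, the exponential surrogate loss, Euclidean distance, and an AdaBoost-style update applied to one training example at a time in cyclic order. All function names and the smoothing constant eps are hypothetical choices made for this illustration.

```python
import math

def sq_dist(a, b):
    # Squared Euclidean distance between two points given as sequences.
    return sum((u - v) ** 2 for u, v in zip(a, b))

def k_nearest(X, x, k, skip=None):
    # Indices of the k training points closest to x; `skip` leaves one index out.
    order = sorted((j for j in range(len(X)) if j != skip),
                   key=lambda j: sq_dist(X[j], x))
    return order[:k]

def fit_leveraged_knn(X, y, k=3, rounds=50, eps=1e-6):
    # Learn one leveraging coefficient per training example (labels in {-1, +1}).
    n = len(X)
    # neighbors[i]: k-NN of training point i among the other training points.
    neighbors = [set(k_nearest(X, X[i], k, skip=i)) for i in range(n)]
    alpha = [0.0] * n   # leveraging coefficients
    w = [1.0] * n       # w[i] = exp(-y_i * h(x_i)), the exponential loss of example i
    for t in range(rounds):
        j = t % n       # assumption: visit examples cyclically
        # Examples that count j among their neighbors, split by label agreement.
        voters = [i for i in range(n) if j in neighbors[i]]
        w_plus = sum(w[i] for i in voters if y[i] == y[j])
        w_minus = sum(w[i] for i in voters if y[i] != y[j])
        if w_plus + w_minus == 0.0:
            continue
        # Smoothed AdaBoost-style step; eps keeps the logarithm finite.
        delta = 0.5 * math.log((w_plus + eps) / (w_minus + eps))
        alpha[j] += delta
        for i in voters:
            w[i] *= math.exp(-delta * y[i] * y[j])
    return alpha

def predict(X, y, alpha, x, k=3):
    # Leveraged vote: each of the k neighbors votes for its label with weight alpha.
    score = sum(alpha[j] * y[j] for j in k_nearest(X, x, k))
    return 1 if score >= 0 else -1

def exp_risk(X, y, alpha, k=3):
    # Empirical exponential surrogate risk of the leveraged rule on the training set.
    total = 0.0
    for i in range(len(X)):
        h = sum(alpha[j] * y[j] for j in k_nearest(X, X[i], k, skip=i))
        total += math.exp(-y[i] * h)
    return total / len(X)
```

In this sketch, a training example whose coefficient stays at zero contributes nothing to any vote, which gives an intuition for how a leveraged rule could be used to discard examples. The filtering strategy described in the abstract may differ in its details.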

Keywords

kNN; Boosting; Machine Learning; classification

Domains

Image processing [eess.IV]; Machine Learning [stat.ML]

https://inria.hal.science/hal-00664462

Submitted on: Monday, January 30, 2012, 16:17:37

Last modified on: Monday, February 26, 2024, 11:22:07

Dates and versions

hal-00664462, version 1 (30-01-2012)

Identifiers

HAL Id: hal-00664462, version 1

Cite

Paolo Piro, Richard Nock, Frank Nielsen, Michel Barlaud. Leveraging k-NN for generic classification boosting. Neurocomputing, 2012, 80, pp.3-9. ⟨hal-00664462⟩

Collections

UNIV-AG CNRS I3S UNIV-COTEDAZUR
