A Bayesian reassessment of nearest-neighbour classification - Inria - Institut national de recherche en sciences et technologies du numérique
Report (Research Report) Year: 2008

A Bayesian reassessment of nearest-neighbour classification

Abstract

The k-nearest-neighbour procedure is a well-known deterministic method used in supervised classification. This paper proposes a reassessment of this approach as a statistical technique derived from a proper probabilistic model; in particular, we modify the assessment made in a previous analysis of this method undertaken by Holmes and Adams (2002, 2003), and evaluated by Manocha and Girolami (2007), where the underlying probabilistic model is not completely well-defined. Once a clear probabilistic basis for the k-nearest-neighbour procedure is established, we derive computational tools for conducting Bayesian inference on the parameters of the corresponding model. In particular, we assess the difficulties inherent to pseudo-likelihood and to path sampling approximations of an intractable normalising constant, and propose a perfect sampling strategy to implement a correct MCMC sampler associated with our model. If perfect sampling is not available, we suggest using a Gibbs sampling approximation. Illustrations of the performance of the corresponding Bayesian classifier are provided for several benchmark datasets, demonstrating in particular the limitations of the pseudo-likelihood approximation in this set-up.
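For reference, the deterministic procedure that the abstract takes as its starting point can be sketched as plain majority voting among the k closest training points. This is only an illustration of the classical rule, not the probabilistic model or the samplers developed in the report; the function name `knn_predict`, the Euclidean metric, and the toy data are assumptions made for this example.

```python
from collections import Counter
from math import dist


def knn_predict(train_x, train_y, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    # Rank training indices by Euclidean distance to the query point
    # (the metric is an assumption of this sketch).
    order = sorted(range(len(train_x)), key=lambda i: dist(train_x[i], query))
    # Count the class labels of the k nearest neighbours.
    votes = Counter(train_y[i] for i in order[:k])
    # Pick the most frequent label; ties go to the label seen first,
    # i.e. the one whose neighbour is closest.
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two points near the origin, three near (1, 1).
train_x = [(0.0, 0.0), (0.1, 0.2), (1.0, 1.0), (0.9, 1.1), (1.2, 0.8)]
train_y = ["a", "a", "b", "b", "b"]

label_near_origin = knn_predict(train_x, train_y, (0.2, 0.1), k=3)
label_near_one = knn_predict(train_x, train_y, (1.0, 0.9), k=3)
```

The rule returns a single label with no measure of uncertainty and treats k as fixed, which is the gap a model-based treatment of the kind described in the abstract is meant to address.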
Main file
RR-6173.pdf (1.99 MB)
Origin: Files produced by the author(s)

Dates and versions

inria-00143783, version 1 (26-04-2007)
inria-00143783, version 2 (08-05-2007)
inria-00143783, version 3 (03-03-2008)
inria-00143783, version 4 (03-03-2008)

Identifiers

  • HAL Id: inria-00143783, version 3

Cite

Lionel Cucala, Jean-Michel Marin, Christian Robert, Mike Titterington. A Bayesian reassessment of nearest-neighbour classification. [Research Report] RR-6173, 2008. ⟨inria-00143783v3⟩

Collections

INRIA-RRRT
349 views
847 downloads
