A Sequential Nonparametric Two-Sample Test

Alix Lhéritier (1), Frédéric Cazals (1)
(1) ABS - Algorithms, Biology, Structure
CRISAM - Inria Sophia Antipolis - Méditerranée
Abstract: Given samples from two distributions, a nonparametric two-sample test aims to determine whether the two distributions are equal, based on a test statistic. This statistic may be computed on the whole dataset, or on a subset of the dataset by a function trained on its complement. We propose a third tier: functions that exploit a sequential framework to learn the differences while processing the data incrementally. Sequential processing naturally allows optional stopping, which makes our test the first truly sequential nonparametric two-sample test. We show that any sequential predictor can be turned into a sequential two-sample test for which a valid $p$-value can be computed, yielding controlled type I error. We also show that pointwise universal predictors yield consistent tests, and that such predictors can be built, in particular, from a nonparametric regressor based on $k$-nearest neighbors. Finally, we show that mixtures and switch distributions can be used to increase power while preserving consistency.
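The idea of turning a sequential predictor into a sequential two-sample test can be illustrated with a minimal sketch. It is not the report's exact construction: it assumes the two samples arrive as a single stream of (point, label) pairs, that under the null hypothesis the labels are i.i.d. Bernoulli with a known prior `theta0` and independent of the points, and that the p-value is obtained from a likelihood-ratio martingale via Ville's inequality. The names `knn_predict`, `sequential_two_sample_test`, and `make_stream` are hypothetical, the data are one-dimensional, and `k` is kept fixed for simplicity.

```python
import math
import random


def knn_predict(past, z, k):
    """Smoothed k-NN estimate of P(label = 1 | z) from past (point, label) pairs."""
    if not past:
        return 0.5  # no data yet: uninformative prediction
    nearest = sorted(past, key=lambda pl: abs(pl[0] - z))[:k]
    ones = sum(label for _, label in nearest)
    # Laplace smoothing keeps the prediction strictly inside (0, 1).
    return (ones + 1.0) / (len(nearest) + 2.0)


def sequential_two_sample_test(stream, alpha=0.05, k=5, theta0=0.5):
    """Process (z, label) pairs one at a time; stop as soon as p <= alpha.

    Assumption: under H0 the labels are i.i.d. Bernoulli(theta0) and
    independent of z, so the running likelihood ratio is a nonnegative
    martingale with mean 1 and 1 / (running maximum) is an anytime-valid
    p-value by Ville's inequality.
    Returns (rejected, samples_used, p_value).
    """
    log_ratio = 0.0      # log of predictor likelihood / null likelihood
    max_log_ratio = 0.0  # running maximum, starts at log(1)
    past = []
    p_value = 1.0
    n = 0
    for n, (z, label) in enumerate(stream, start=1):
        q1 = knn_predict(past, z, k)  # predict before seeing the label
        q = q1 if label == 1 else 1.0 - q1
        p0 = theta0 if label == 1 else 1.0 - theta0
        log_ratio += math.log(q) - math.log(p0)
        max_log_ratio = max(max_log_ratio, log_ratio)
        p_value = math.exp(-max_log_ratio)
        past.append((z, label))  # then learn from the revealed label
        if p_value <= alpha:
            return True, n, p_value  # optional stopping: reject H0 early
    return False, n, p_value


def make_stream(n, shift, rng):
    """Toy data: label ~ Bernoulli(0.5), z ~ N(shift * label, 1)."""
    for _ in range(n):
        label = 1 if rng.random() < 0.5 else 0
        yield rng.gauss(shift * label, 1.0), label
```

For instance, `sequential_two_sample_test(make_stream(2000, 3.0, random.Random(0)))` feeds the test two well-separated Gaussians, while a `shift` of `0.0` simulates the null hypothesis. Any other sequential predictor could replace `knn_predict`, provided it outputs a probability before the label is revealed.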
Document type:
Report
[Research Report] RR-8704, Inria. 2015, pp.18
Cited literature [24 references]

https://hal.inria.fr/hal-01135608
Contributor: Frederic Cazals
Submitted on: Tuesday, June 2, 2015 - 19:23:03
Last modified on: Thursday, January 11, 2018 - 16:48:47
Document(s) archived on: Tuesday, April 25, 2017 - 00:41:17

File

RR-8704-v2.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-01135608, version 2

Citation

Alix Lhéritier, Frédéric Cazals. A Sequential Nonparametric Two-Sample Test. [Research Report] RR-8704, Inria. 2015, pp.18. 〈hal-01135608v2〉
