A Sequential Nonparametric Two-Sample Test

Alix Lhéritier; Frédéric Cazals

Report (Research Report), Year: 2015

Un Test Non-paramétrique d'Homogénéité Séquentiel

Alix Lhéritier

Role: Author
PersonId: 748804
IdHAL: alherit
ORCID: 0000-0002-6056-1470
IdRef: 189016280

Algorithms, Biology, Structure

Frédéric Cazals

Role: Author
PersonId: 1189617
ORCID: 0000-0003-2735-6755
IdRef: 094973881

Algorithms, Biology, Structure

Abstract

Given samples from two distributions, a nonparametric two-sample test aims to determine whether the two distributions are equal, based on a test statistic. This statistic may be computed on the whole dataset, or on a subset of the dataset by a function trained on its complement. We propose a third tier, consisting of functions that exploit a sequential framework to learn the differences while incrementally processing the data. Sequential processing naturally allows optional stopping, which makes our test the first truly sequential nonparametric two-sample test. We show that any sequential predictor can be turned into a sequential two-sample test for which a valid $p$-value can be computed, yielding controlled type I error. We also show that pointwise universal predictors yield consistent tests, which can be built in particular with a nonparametric regressor based on $k$-nearest neighbors. Finally, we show that mixtures and switch distributions can be used to increase power while preserving consistency.
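
To make the idea of turning a sequential predictor into a sequential test concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the construction or code from the report: observations arrive as pairs `(z, label)`, where `label` (0 or 1) says which sample `z` came from, and under the null hypothesis labels are assumed to be independent fair coin flips (`theta0 = 0.5`). A smoothed $k$-nearest-neighbor rule on scalar data plays the role of the sequential predictor. The names `knn_predict`, `sequential_two_sample_test`, and `shifted_stream` are hypothetical. Under these assumptions the likelihood ratio between the predictor and the null is a nonnegative martingale starting at 1, so by Ville's inequality the inverse of its running maximum can serve as a $p$-value that stays valid under optional stopping.

```python
import math
import random


def knn_predict(history, z, k):
    """Smoothed estimate of P(label = 1 | z) from the k nearest past points."""
    nearest = sorted(history, key=lambda p: abs(p[0] - z))[:k]
    ones = sum(label for _, label in nearest)
    # Add-1/2 smoothing keeps the estimate strictly inside (0, 1).
    return (ones + 0.5) / (len(nearest) + 1.0)


def sequential_two_sample_test(stream, alpha=0.05, k=5, theta0=0.5):
    """Process (z, label) pairs one at a time; return (rejected, n_used, p_value)."""
    history = []
    threshold = math.log(1.0 / alpha)
    log_ratio = 0.0      # log of (predictor likelihood / null likelihood)
    max_log_ratio = 0.0  # running maximum; the empty product equals 1
    n = 0
    for z, label in stream:
        n += 1
        q1 = knn_predict(history, z, k)  # prediction uses past data only
        q = q1 if label == 1 else 1.0 - q1
        p0 = theta0 if label == 1 else 1.0 - theta0
        log_ratio += math.log(q) - math.log(p0)
        max_log_ratio = max(max_log_ratio, log_ratio)
        history.append((z, label))
        if log_ratio >= threshold:
            break  # optional stopping: enough evidence against the null
    p_value = min(1.0, math.exp(-max_log_ratio))
    return max_log_ratio >= threshold, n, p_value


def shifted_stream(n, shift, seed):
    """Hypothetical data source: label-0 points ~ N(0, 1), label-1 ~ N(shift, 1)."""
    rng = random.Random(seed)
    for _ in range(n):
        label = 1 if rng.random() < 0.5 else 0
        yield rng.gauss(shift * label, 1.0), label


if __name__ == "__main__":
    print(sequential_two_sample_test(shifted_stream(500, 3.0, seed=0)))
    print(sequential_two_sample_test(shifted_stream(500, 0.0, seed=0)))
```

The sort inside `knn_predict` costs O(n log n) per observation, which is acceptable for a sketch. A spatial index would be the natural replacement for larger streams or multivariate data.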

Keywords

Hypothesis testing; Sequential prediction; Bayes factor; Nonparametric two-sample test; Regression; Bayesian mixtures; Switch distributions

Domains

Computational Geometry [cs.CG]; Statistics [stat]; Machine Learning [stat.ML]

Main file

RR-8704-v2.pdf (768.06 KB)

Origin: Files produced by the author(s)

https://inria.hal.science/hal-01135608

Submitted on: Tuesday, June 2, 2015, 19:23:03

Last modified on: Thursday, February 15, 2024, 15:28:00

Long-term archiving on: Tuesday, April 25, 2017, 00:41:17

Dates and versions

hal-01135608 , version 1 (25-03-2015)

hal-01135608 , version 2 (02-06-2015)

Identifiers

HAL Id : hal-01135608 , version 2

Cite

Alix Lhéritier, Frédéric Cazals. A Sequential Nonparametric Two-Sample Test. [Research Report] RR-8704, Inria. 2015, pp.18. ⟨hal-01135608v2⟩

Collections

INRIA INRIA-RRRT INRIA2 LARA UNIV-COTEDAZUR

307 views

801 downloads
