A Sequential Nonparametric Two-Sample Test

Alix Lhéritier 1 Frédéric Cazals 1
1 ABS - Algorithms, Biology, Structure
CRISAM - Inria Sophia Antipolis - Méditerranée
Abstract : Given samples from two distributions, a nonparametric two-sample test aims to determine whether the two distributions are equal, based on a test statistic. This statistic may be computed on the whole dataset, or on a subset of the dataset by a function trained on its complement. We propose a third tier: functions that exploit a sequential framework to learn the differences while processing the data incrementally. Sequential processing naturally allows optional stopping, which makes our test the first truly sequential nonparametric two-sample test. We show that any sequential predictor can be turned into a sequential two-sample test for which a valid $p$-value can be computed, yielding controlled type I error. We also show that pointwise universal predictors yield consistent tests, and that such predictors can be built, in particular, with a nonparametric regressor based on $k$-nearest neighbors. Finally, we show that mixtures and switch distributions can be used to increase power while preserving consistency.
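
The idea of turning a sequential predictor into a sequential two-sample test can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes one-dimensional observations, a sample label drawn as a fair coin flip for each observation, and a simple smoothed $k$-nearest-neighbor predictor with $k \approx \sqrt{n}$. All function names (`knn_predict`, `sequential_two_sample_test`, `labeled_stream`) are hypothetical. Under the null hypothesis the labels are independent of the observations, so the ratio between the predictor's likelihood of the labels and the fair-coin likelihood is a nonnegative martingale with mean 1; by Ville's inequality, the running minimum of its inverse can then be used as a $p$-value that remains valid under optional stopping.

```python
import math
import random


def knn_predict(history, x, k, eps=0.05):
    """Estimate P(label = 1 | x) from the k nearest past points (assumed 1-D)."""
    if not history:
        return 0.5
    nearest = sorted(history, key=lambda h: abs(h[0] - x))[:k]
    # Add-one smoothing, then clipping, keeps the estimate strictly inside (0, 1).
    p = (sum(label for _, label in nearest) + 1.0) / (len(nearest) + 2.0)
    return min(max(p, eps), 1.0 - eps)


def sequential_two_sample_test(stream, alpha=0.05):
    """Consume (x, label) pairs one at a time; stop as soon as p_value <= alpha.

    Assumes each label is a fair coin flip under the sampling design.
    Returns (rejected, n_used, p_value).
    """
    history = []
    log_ratio = 0.0  # log(predictor likelihood / null likelihood) of the labels
    p_value = 1.0
    n = 0
    for n, (x, label) in enumerate(stream, start=1):
        k = max(1, int(math.sqrt(len(history))))
        # The prediction uses past data only, which keeps the ratio a martingale.
        p1 = knn_predict(history, x, k)
        q = p1 if label == 1 else 1.0 - p1
        log_ratio += math.log(q) - math.log(0.5)
        # Running minimum of the inverse likelihood ratio, capped at 1.
        p_value = min(p_value, math.exp(min(0.0, -log_ratio)))
        if p_value <= alpha:
            return True, n, p_value
        history.append((x, label))
    return False, n, p_value


def labeled_stream(n, shift, seed):
    """Hypothetical data source: label ~ Bernoulli(1/2), x ~ N(shift * label, 1)."""
    rng = random.Random(seed)
    for _ in range(n):
        label = rng.randint(0, 1)
        yield rng.gauss(shift * label, 1.0), label


if __name__ == "__main__":
    print(sequential_two_sample_test(labeled_stream(500, shift=4.0, seed=0)))
    print(sequential_two_sample_test(labeled_stream(500, shift=0.0, seed=0)))
```

With well-separated distributions the test is expected to stop early, since the predictor quickly beats the fair-coin baseline; with equal distributions the rejection probability is bounded by `alpha` regardless of when the data stream ends.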

https://hal.inria.fr/hal-01135608
Contributor : Frederic Cazals
Submitted on : Tuesday, June 2, 2015 - 7:23:03 PM
Last modification on : Thursday, January 11, 2018 - 4:48:47 PM
Long-term archiving on : Tuesday, April 25, 2017 - 12:41:17 AM

File

RR-8704-v2.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01135608, version 2

Citation

Alix Lhéritier, Frédéric Cazals. A Sequential Nonparametric Two-Sample Test. [Research Report] RR-8704, Inria. 2015, pp.18. ⟨hal-01135608v2⟩
