Conference papers

Order statistics and estimating cardinalities of massive data sets

Frédéric Giroire 1 
1 ALGORITHMS - Algorithms
Inria Paris-Rocquencourt
Abstract: We introduce a new class of algorithms to estimate the cardinality of very large multisets using constant memory and a single pass over the data. It is based on order statistics rather than on bit patterns in binary representations of numbers. We analyse three families of estimators. They attain a standard error of $\frac{1}{\sqrt{M}}$ using $M$ units of storage, which places them in the same class as the best algorithms known so far. They have a very simple internal loop, which gives them an advantage in terms of processing speed. The algorithms are validated on internet traffic traces.
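The abstract does not spell out the three estimator families, so the following is only a minimal sketch of the general idea of estimating cardinality from an order statistic, not the paper's exact estimators. It assumes each item is hashed to a pseudo-uniform value in (0, 1], keeps the k smallest distinct hash values, and uses the classic estimate (k - 1) / (k-th smallest value). The names `hash01` and `estimate_cardinality`, the choice of BLAKE2b as the hash function, and the default `k = 256` are all illustrative assumptions.

```python
import hashlib
import heapq


def hash01(item):
    """Map an item to a pseudo-uniform value in (0, 1] (assumed hash: BLAKE2b, 64 bits)."""
    digest = hashlib.blake2b(str(item).encode("utf-8"), digest_size=8).digest()
    return (int.from_bytes(digest, "big") + 1) / 2**64


def estimate_cardinality(stream, k=256):
    """One-pass estimate of the number of distinct items, using O(k) memory.

    Illustrative k-th-minimum estimator; not necessarily one of the
    three families analysed in the paper.
    """
    heap = []     # negated hash values, so heap[0] is minus the largest kept value
    kept = set()  # the same values, for constant-time duplicate checks
    for item in stream:
        h = hash01(item)
        if h in kept:
            continue  # repeated item (or hash collision): nothing to update
        if len(heap) < k:
            heapq.heappush(heap, -h)
            kept.add(h)
        elif h < -heap[0]:
            # h is smaller than the current k-th minimum: swap it in.
            evicted = -heapq.heappushpop(heap, -h)
            kept.discard(evicted)
            kept.add(h)
    if len(heap) < k:
        return float(len(heap))  # fewer than k distinct hashes seen: count is exact
    kth_min = -heap[0]
    return (k - 1) / kth_min
```

Duplicates hash to the same value, so they never change the stored minima; this is what makes the estimate depend on the number of distinct items rather than on the length of the stream. For this particular estimator the relative error shrinks roughly like $1/\sqrt{k}$, which matches the order of magnitude quoted in the abstract.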

Contributor: Coordination Episciences Iam
Submitted on: Wednesday, August 12, 2015 - 3:51:21 PM
Last modification on: Thursday, February 3, 2022 - 11:18:43 AM
Long-term archiving on: Friday, November 13, 2015 - 11:39:58 AM


Publisher files allowed on an open archive




Frédéric Giroire. Order statistics and estimating cardinalities of massive data sets. 2005 International Conference on Analysis of Algorithms, 2005, Barcelona, Spain. pp.157-166, ⟨10.46298/dmtcs.3353⟩. ⟨hal-01184025⟩


