# HyperLogLog: the analysis of a near-optimal cardinality estimation algorithm

1 ALGO - Algorithms
Inria Paris-Rocquencourt
Abstract : This extended abstract describes and analyses a near-optimal probabilistic algorithm, HYPERLOGLOG, dedicated to estimating the number of \emphdistinct elements (the cardinality) of very large data ensembles. Using an auxiliary memory of m units (typically, "short bytes''), HYPERLOGLOG performs a single pass over the data and produces an estimate of the cardinality such that the relative accuracy (the standard error) is typically about $1.04/\sqrt{m}$. This improves on the best previously known cardinality estimator, LOGLOG, whose accuracy can be matched by consuming only 64% of the original memory. For instance, the new algorithm makes it possible to estimate cardinalities well beyond $10^9$ with a typical accuracy of 2% while using a memory of only 1.5 kilobytes. The algorithm parallelizes optimally and adapts to the sliding window model.
Keywords :
Type de document :
Communication dans un congrès
Jacquet, Philippe. AofA: Analysis of Algorithms, Jun 2007, Juan les Pins, France. Discrete Mathematics and Theoretical Computer Science, DMTCS Proceedings vol. AH, 2007 Conference on Analysis of Algorithms (AofA 07), pp.137-156, 2007, DMTCS Proceedings
Domaine :

Littérature citée [14 références]

https://hal.inria.fr/hal-00406166
Contributeur : Coordination Episciences Iam <>
Soumis le : lundi 17 août 2015 - 17:00:04
Dernière modification le : vendredi 25 mai 2018 - 12:02:02
Document(s) archivé(s) le : mercredi 18 novembre 2015 - 12:17:58

### Fichier

dmAH0110.pdf
Fichiers éditeurs autorisés sur une archive ouverte

### Identifiants

• HAL Id : hal-00406166, version 2

### Citation

Philippe Flajolet, Éric Fusy, Olivier Gandouet, Frédéric Meunier. HyperLogLog: the analysis of a near-optimal cardinality estimation algorithm. Jacquet, Philippe. AofA: Analysis of Algorithms, Jun 2007, Juan les Pins, France. Discrete Mathematics and Theoretical Computer Science, DMTCS Proceedings vol. AH, 2007 Conference on Analysis of Algorithms (AofA 07), pp.137-156, 2007, DMTCS Proceedings. 〈hal-00406166v2〉

### Métriques

Consultations de la notice

## 7092

Téléchargements de fichiers