Conference papers

HyperLogLog: the analysis of a near-optimal cardinality estimation algorithm

Abstract : This extended abstract describes and analyses a near-optimal probabilistic algorithm, HYPERLOGLOG, dedicated to estimating the number of distinct elements (the cardinality) of very large data ensembles. Using an auxiliary memory of $m$ units (typically, "short bytes"), HYPERLOGLOG performs a single pass over the data and produces an estimate of the cardinality such that the relative accuracy (the standard error) is typically about $1.04/\sqrt{m}$. This improves on the best previously known cardinality estimator, LOGLOG, whose accuracy can be matched by consuming only 64% of the original memory. For instance, the new algorithm makes it possible to estimate cardinalities well beyond $10^9$ with a typical accuracy of 2% while using a memory of only 1.5 kilobytes. The algorithm parallelizes optimally and adapts to the sliding window model.
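
The accuracy figures quoted in the abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not part of the paper: the function names are hypothetical, the constant 1.30 for LOGLOG is an assumption taken from the commonly cited standard error of about $1.30/\sqrt{m}$ for that algorithm, and the register count $m = 2048$ is an assumed example value that the abstract does not state.

```python
import math

HLL_CONSTANT = 1.04     # standard-error constant quoted in the abstract
LOGLOG_CONSTANT = 1.30  # assumption: commonly cited constant for LOGLOG


def standard_error(m, constant=HLL_CONSTANT):
    """Typical relative error constant / sqrt(m) for m memory units."""
    return constant / math.sqrt(m)


def registers_for_error(target, constant=HLL_CONSTANT):
    """Smallest m whose typical relative error is at most `target`."""
    return math.ceil((constant / target) ** 2)


# Memory HYPERLOGLOG needs to match LOGLOG's accuracy, as a fraction of
# LOGLOG's memory: equal errors give m_hll / m_loglog = (1.04 / 1.30)^2.
memory_ratio = (HLL_CONSTANT / LOGLOG_CONSTANT) ** 2

if __name__ == "__main__":
    m = 2048  # assumed example register count, not given in the abstract
    print(f"standard error for m={m}: {standard_error(m):.2%}")
    print(f"registers for a 2% target: {registers_for_error(0.02)}")
    print(f"memory ratio vs LOGLOG: {memory_ratio:.0%}")
```

Under these assumptions the ratio $(1.04/1.30)^2$ reproduces the 64% memory figure, and a register count on the order of two thousand gives a typical error close to the 2% quoted above.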

Cited literature [14 references]
Contributor : Coordination Episciences Iam
Submitted on : Monday, August 17, 2015 - 5:00:04 PM
Last modification on : Friday, August 5, 2022 - 10:32:20 AM
Long-term archiving on : Wednesday, November 18, 2015 - 12:17:58 PM


Publisher files allowed on an open archive



Philippe Flajolet, Éric Fusy, Olivier Gandouet, Frédéric Meunier. HyperLogLog: the analysis of a near-optimal cardinality estimation algorithm. AofA: Analysis of Algorithms, Jun 2007, Juan les Pins, France. pp.137-156, ⟨10.46298/dmtcs.3545⟩. ⟨hal-00406166v2⟩


