A Simple, Yet Effective and Efficient, Sliding Window Sampling Algorithm

Xuesong Lu 1, * Tok Wee Hyong 2 Chedy Raïssi 3 Stéphane Bressan 1
* Auteur correspondant
3 ORPAILLEUR - Knowledge representation, reasonning
INRIA Lorraine, LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
Abstract : Sampling streams of continuous data with limited memory, or reservoir sampling, is a utility algorithm. Standard reservoir sampling maintains a random sample of the entire stream as it has arrived so far. This restriction does not meet the requirement of many applications that need to give preference to recent data. The simplest algorithm for maintaining a random sample of a sliding window reproduces periodically the same sample design. This is also undesirable for many applications. Other existing algorithms are using variable size memory, variable size samples or maintain biased samples and allow expired data in the sample. We propose an effective algorithm, which is very simple and therefore efficient, for maintaining a near random fixed size sample of a sliding window. Indeed our algorithm maintains a biased sample that may contain expired data. Yet it is a good approximation of a random sample with expired data being present with low probability. We analytically explain why and under which parameter settings the algorithm is effective. We empirically evaluate its performance (effectiveness) and compare it with the performance of existing representatives of random sampling over sliding windows and biased sampling algorithm.
Type de document :
Communication dans un congrès
Hiroyuki Kitagawa and Yoshiharu Ishikawa and Qing Li and ChiemiWatanabe. 15th International Conference on Database Systems for Advanced Applications - DASFAA 2010, Apr 2010, Tsukuba, Japan. Springer, 5981, pp.337-351, 2010, Lecture Notes in Computer Science. 〈10.1007/978-3-642-12026-8_27〉
Liste complète des métadonnées

https://hal.inria.fr/inria-00610941
Contributeur : Chedy Raïssi <>
Soumis le : lundi 25 juillet 2011 - 12:23:31
Dernière modification le : jeudi 11 janvier 2018 - 06:19:53

Identifiants

Collections

Citation

Xuesong Lu, Tok Wee Hyong, Chedy Raïssi, Stéphane Bressan. A Simple, Yet Effective and Efficient, Sliding Window Sampling Algorithm. Hiroyuki Kitagawa and Yoshiharu Ishikawa and Qing Li and ChiemiWatanabe. 15th International Conference on Database Systems for Advanced Applications - DASFAA 2010, Apr 2010, Tsukuba, Japan. Springer, 5981, pp.337-351, 2010, Lecture Notes in Computer Science. 〈10.1007/978-3-642-12026-8_27〉. 〈inria-00610941〉

Partager

Métriques

Consultations de la notice

323