Skip to Main content Skip to Navigation
Preprints, Working Papers, ...

Distribution of Tail Symbols in DST for Markov Sources

Abstract : Lempel-Ziv'78 is one of the most popular data compression algorithm on words. Over the last decades we uncover its fascinating behavior and understand better many of its beautiful properties. Among others, in 1995 by settling the Ziv conjecture we proved that for memoryless source (i.e., when a sequence is generated by a source without memory) the number of LZ'78 phrases satisfies the Central Limit Theorem (CLT). Since then the quest commenced to extend it to Markov sources, however, despite several attempts this problem is still open. In this conference paper, we revisit the issue and focus on a much simpler, but not trivial problem that may lead to the resolution of the LZ'78 dilemma. We consider the associated Digital Search Tree (DST) version of the problem in which the DST is built over a fixed number of Markov generated sequences. In such a model we shall count the number of of the so called "tail symbol", that is, the symbol that follows the last inserted symbol. Our goal here is to analyze this new quantity under Markovian assumption since it plays crucial role in the analysis of the original LZ'78 problem. We establish the mean, the variance, and the central limit theorem for the number of tail symbols. We accomplish it by applying techniques of analytic combinatorics on words also known as analytic pattern matching.
Document type :
Preprints, Working Papers, ...
Complete list of metadatas

Cited literature [20 references]  Display  Hide  Download

https://hal.inria.fr/hal-01966158
Contributor : Philippe Jacquet <>
Submitted on : Thursday, December 27, 2018 - 4:30:02 PM
Last modification on : Friday, April 19, 2019 - 4:55:03 PM
Long-term archiving on: : Thursday, March 28, 2019 - 1:41:16 PM

File

dst-tail.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01966158, version 1

Collections

Citation

Philippe Jacquet, Wojciech Szpankowski. Distribution of Tail Symbols in DST for Markov Sources. 2018. ⟨hal-01966158⟩

Share

Metrics

Record views

69

Files downloads

273