Profile of Tries - Inria - Institut national de recherche en sciences et technologies du numérique Accéder directement au contenu
Article Dans Une Revue SIAM Journal on Computing Année : 2009

Profile of Tries

Gayun Park
  • Fonction : Auteur
  • PersonId : 936027
Hsien-Kuei Hwang
  • Fonction : Auteur
  • PersonId : 936028
Wojciech Szpankowski
  • Fonction : Auteur
  • PersonId : 936029

Résumé

Tries (from retrieval) are one of the most popular data structures on words. They are pertinent to the (internal) structure of stored words and several splitting procedures used in diverse contexts. The profile of a trie is a parameter that represents the number of nodes (either internal or external) with the same distance from the root. It is a function of the number of strings stored in a trie and the distance from the root. Several, if not all, trie parameters such as height, size, depth, shortest path, and fill-up level can be uniformly analyzed through the (external and internal) profiles. Although profiles represent one of the most fundamental parameters of tries, they have hardly been studied in the past. The analysis of profiles is surprisingly arduous, but once it is carried out it reveals unusually intriguing and interesting behavior. We present a detailed study of the distribution of the profiles in a trie built over random strings generated by a memoryless source. We first derive recurrences satisfied by the expected profiles and solve them asymptotically for all possible ranges of the distance from the root. It appears that profiles of tries exhibit several fascinating phenomena. When moving from the root to the leaves of a trie, the growth of the expected profiles varies. Near the root, the external profiles tend to zero at an exponential rate, and then the rate gradually rises to being logarithmic; the external profiles then abruptly tend to infinity, first logarithmically and then polynomially; they then tend polynomially to zero again. Furthermore, the expected profiles of asymmetric tries are oscillating in a range where profiles grow polynomially, while symmetric tries are non-oscillating, in contrast to most shape parameters of random tries studied previously. Such a periodic behavior for asymmetric tries implies that the depth satisfies a central limit theorem but not a local limit theorem of the usual form. Also the widest levels in symmetric tries contain a linear number of nodes, differing from the order n=plog n for asymmetric tries, n being the size of the trees. Finally, it is observed that profiles satisfy central limit theorems when the variance goes unbounded, while near the height they are distributed according to Poisson laws. As a consequence of these results we find typical behaviors of the height, shortest path, fill-up level, and depth. These results are derived here by methods of analytic algorithmics such as generating functions, Mellin transform, Poissonization and de-Poissonization, the saddle-point method, singularity analysis, and uniform asymptotic analysis.
Les tries sont une des strucutres de données les plus utilisées pour les mots. Ce papier preesente une étude détaillée de la distribution des profils des tries générés par des sources sans mémoire. La croissance des profils attendus dépend de la distance à la racine. Les résultats sont obtenus en utilisant les outils de l'algorithmique analytique, notamment les fonctions génératrices, la transformation de Mellin, la Poissonisation et la dé-Poissonisation, la méthode du col, l'analyse de singularités, l'analyse asymptotique.
Fichier non déposé

Dates et versions

hal-00781400 , version 1 (26-01-2013)

Identifiants

Citer

Gayun Park, Hsien-Kuei Hwang, Pierre Nicodème, Wojciech Szpankowski. Profile of Tries. SIAM Journal on Computing, 2009, 38 (5), pp.1821-1880. ⟨10.1137/070685531⟩. ⟨hal-00781400⟩
140 Consultations
0 Téléchargements

Altmetric

Partager

Gmail Facebook X LinkedIn More