Lossy compression of unordered rooted trees

Romain Azaïs 1 Jean-Baptiste Durand 2, 3 Christophe Godin 3
1 BIGS - Biology, genetics and statistics
Inria Nancy - Grand Est, IECL - Institut Élie Cartan de Lorraine
2 MISTIS - Modelling and Inference of Complex and Structured Stochastic Systems
Inria Grenoble - Rhône-Alpes, LJK - Laboratoire Jean Kuntzmann, INPG - Institut National Polytechnique de Grenoble
3 VIRTUAL PLANTS - Modeling plant morphogenesis at different scales, from genes to phenotype
CRISAM - Inria Sophia Antipolis - Méditerranée , INRA - Institut National de la Recherche Agronomique, Centre de coopération internationale en recherche agronomique pour le développement [CIRAD] : UMR51
Abstract : A classical compression method for trees is to exploit subtree repeats in the structure by representing them by directed acyclic graphs. We propose a lossy compression method that consists in computing a structure with high redundancy that approximates the initial data. Trees are commonly used to represent hierarchical data appearing in computer science or in biology. Compression methods often take advantage of repeated substructures appearing in the tree (see the survey [1]). Directed Acyclic Graph (DAG) compression is a classical approach that exploits subtree repeats in the structure. However, it should be noted that trees without a high level of redundancy are often insufficiently compressed by this procedure. Self-nested trees are such that all their complete subtrees of a given height are isomorphic. The systematic repetition of subtrees gives them remarkable compression properties by this approach. We address lossy compression for unordered trees. Loss can be acceptable for visual representation of scenes composed of plants, for example. Our method consists in computing the DAG version of a self-nested tree that closely approximates the tree to compress. A first approximation has been proposed in [2] in which the authors compute in polynomial time the Nearest Embedding Self-nested Tree (NEST) of the initial structure, namely the self-nested tree that minimizes the edit distance to the initial tree and that embeds it. We focus on the presentation of two new algorithms to find a self-nested structure that approximates the initial tree better than the NEST. These solutions rely on a technique to find the centroid of a forest of small height and may be computed in polynomial time for trees with bounded degree. We prove on a simulated dataset that the error rates of these lossy compression methods are always better than the loss involved in the previous algorithm (on average, we observe a substantial gain of around 20%), while the compression rates are equivalent.
Type de document :
Poster
DCC 2016 - Data Compression Conference , Mar 2016, Snowbird, Utah, United States. 〈10.1109/DCC.2016.73〉
Liste complète des métadonnées

https://hal.inria.fr/hal-01394707
Contributeur : Jean-Baptiste Durand <>
Soumis le : mercredi 4 janvier 2017 - 14:05:13
Dernière modification le : mercredi 11 avril 2018 - 01:58:39
Document(s) archivé(s) le : mercredi 5 avril 2017 - 12:10:20

Fichiers

dcc2016poster.pdf
Fichiers produits par l'(les) auteur(s)

Identifiants

Citation

Romain Azaïs, Jean-Baptiste Durand, Christophe Godin. Lossy compression of unordered rooted trees. DCC 2016 - Data Compression Conference , Mar 2016, Snowbird, Utah, United States. 〈10.1109/DCC.2016.73〉. 〈hal-01394707〉

Partager

Métriques

Consultations de la notice

896

Téléchargements de fichiers

101