Fast, Stable and Efficient Approximation of Multi-parameter Persistence Modules with MMA

David Loiseaux; Mathieu Carriere; Andrew Blumberg

Pré-Publication, Document De Travail Année : 2023

Fast, Stable and Efficient Approximation of Multi-parameter Persistence Modules with MMA

(1) , (1) , (2)

1
2

David Loiseaux

Fonction : Auteur

Understanding the Shape of Data

Mathieu Carriere

Fonction : Auteur
PersonId : 737916
IdHAL : mathieu-carriere

Understanding the Shape of Data

Andrew Blumberg

Fonction : Auteur

Columbia University [New York]

Résumé

Topological data analysis (TDA) is a rapidly growing area of data science which uses the geometry and topology of data sets to produce qualitative multi-scale shape descriptors for subsequent statistical and machine learning tasks. The most common descriptor in TDA is persistent homology, which tracks the topological changes in growing families of subsets of the data set itself, called filtrations, and encodes them in an algebraic object, called a persistence module. The algorithmic and theoretical properties of persistence modules are now well understood in the single-parameter case, that is, when there is only one filtration (e.g., feature scale) to study. In contrast, much less is known in the multi-parameter case, where several filtrations (e.g., scale and density) are used simultaneously. Indeed, the resulting multi-parameter persistence modules are much more complicated and intricate, which dramatically impedes the study of their theoretical properties. However, they usually encode information that is invisible to their singleparameter counterparts, and are thus much more useful descriptors for applications in data science. As a consequence, a lot of attention has been devoted to the construction of tractable and stable proxies for multi-parameter persistence modules. However, most of the proposed approaches in the literature are still prohibitively expensive to compute on large-scale data, many are limited to at most two filtrations, and the most tractable sacrifice much of the richness of the multi-parameter setting. In this article, we introduce a new parameterized family of topological invariants, taking the form of candidate decompositions, for multi-parameter persistence modules. We prove that our candidate decompositions are controllable approximations: when restricting to modules that can be decomposed into interval summands, we establish theoretical results about the approximation error between our candidate decompositions and the true underlying module in terms of the standard interleaving and bottleneck distances. Moreover, even when the underlying module does not admit such a decomposition, our candidate decompositions are nonetheless stable invariants; small perturbations in the underlying module lead to small perturbations in the candidate decomposition. Then, we introduce MMA (Multipersistence Module Approximation): an algorithm for computing stable instances of such invariants, which is based on fibered barcodes and exact matchings, two constructions that stem from the theory of single-parameter persistence. By design, MMA can handle an arbitrary number of filtrations, and has bounded complexity and running time. Finally, we present empirical evidence validating the generalization capabilities and running time speed-ups of MMA on several data sets.

Domaines

Géométrie algorithmique [cs.CG] Topologie algébrique [math.AT]

Fichier principal

2206.02026.pdf (1.66 Mo)

Origine : Fichiers produits par l'(les) auteur(s)

Mathieu Carrière : Connectez-vous pour contacter le contributeur

https://inria.hal.science/hal-03689199

Soumis le : mercredi 21 juin 2023-11:14:33

Dernière modification le : mardi 16 avril 2024-17:14:25

Dates et versions

hal-03689199 , version 1 (07-06-2022)

hal-03689199 , version 2 (21-06-2023)

Licence

Paternité

Identifiants

HAL Id : hal-03689199 , version 2
ARXIV : 2206.02026

Citer

David Loiseaux, Mathieu Carriere, Andrew Blumberg. Fast, Stable and Efficient Approximation of Multi-parameter Persistence Modules with MMA. 2023. ⟨hal-03689199v2⟩

Exporter

BibTeX XML-TEI Dublin Core DC Terms EndNote DataCite

Collections

INRIA INRIA2 UNIV-COTEDAZUR 3IA-COTEDAZUR ANR GS-COMPUTER-SCIENCE

94 Consultations

55 Téléchargements

Fast, Stable and Efficient Approximation of Multi-parameter Persistence Modules with MMA

Résumé

Domaines

Dates et versions

Licence

Identifiants

Citer

Exporter

Collections

Altmetric

Partager