Discovery data topology with the closure structure. Theoretical and practical aspects - Inria (French National Institute for Research in Digital Science and Technology)
Preprint, Working Paper. Year: 2020

Discovery data topology with the closure structure. Theoretical and practical aspects

Abstract

In this paper, we revisit pattern mining, and especially itemset mining, which allows one to analyze binary datasets by searching for interesting and meaningful association rules and their respective itemsets in an unsupervised way. While a summarization of a dataset based on a set of patterns does not provide a general and satisfying view of the dataset, we introduce a concise representation, the closure structure, based on closed itemsets and their minimum generators, for capturing the intrinsic content of a dataset. The closure structure allows one to understand the topology of the dataset as a whole and the inherent complexity of the data. We propose a formalization of the closure structure in terms of Formal Concept Analysis, which is well adapted to studying this data topology. We present and demonstrate theoretical results, as well as practical results obtained using the GDPM algorithm. GDPM is rather unique in its functionality, as it returns a characterization of the topology of a dataset in terms of complexity levels, highlighting the diversity and the distribution of the itemsets. Finally, a series of experiments shows how GDPM can be used in practice and what can be expected from its output.
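
To make the notions of closed itemset and minimum generator more concrete, here is a minimal brute-force sketch in Python. It is not the GDPM algorithm and it does not scale beyond toy data. It assumes that the closure of an itemset is the set of items common to all transactions containing it, that a minimum generator is a smallest-cardinality itemset with a given closure, and that the level of a closed itemset is the size of such a generator. The dataset and the names `transactions`, `closure`, and `closure_levels` are hypothetical and chosen only for illustration.

```python
from itertools import combinations

# Hypothetical toy binary dataset: each transaction is a set of items.
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c", "d"},
]


def closure(itemset, transactions):
    """Return the items shared by every transaction that contains `itemset`."""
    covering = [t for t in transactions if itemset <= t]
    if not covering:
        # Assumed convention: an unsupported itemset closes to the set of all items.
        return frozenset().union(*transactions)
    return frozenset(set.intersection(*covering))


def closure_levels(transactions):
    """Map each closed itemset to the size of its smallest generator.

    Brute force: candidates are enumerated by increasing size, so the first
    candidate that produces a given closed itemset is a minimum generator.
    """
    items = sorted(set().union(*transactions))
    levels = {}
    for k in range(len(items) + 1):
        for candidate in combinations(items, k):
            closed = closure(frozenset(candidate), transactions)
            levels.setdefault(closed, k)
    return levels


levels = closure_levels(transactions)
for closed, level in sorted(levels.items(), key=lambda kv: (kv[1], sorted(kv[0]))):
    print(level, sorted(closed))
```

In this toy dataset, the closed itemset {a, b, c, d} sits at level 1 because the single item d already generates it, whereas {a, b, c} sits at level 3 because no smaller itemset has it as closure. Under the assumptions above, grouping closed itemsets by level gives a rough picture of how the itemsets are distributed across levels.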

Dates and versions

hal-03079892, version 1 (17-12-2020)

Identifiers

Cite

Tatiana Makhalova, Sergei O. Kuznetsov, Amedeo Napoli. Discovery data topology with the closure structure. Theoretical and practical aspects. 2020. ⟨hal-03079892⟩