Mining Dominant Patterns in the Sky

Arnaud Soulet 1, * Chedy Raïssi 2 Marc Plantevit 3 Bruno Crémilleux 4
* Corresponding author
2 ORPAILLEUR - Knowledge representation, reasoning
INRIA Lorraine, LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
3 DM2L - Data Mining and Machine Learning
LIRIS - Laboratoire d'InfoRmatique en Image et Systèmes d'information
4 Equipe CODAG - Laboratoire GREYC - UMR6072
GREYC - Groupe de Recherche en Informatique, Image, Automatique et Instrumentation de Caen
Abstract: Pattern discovery is at the core of numerous data mining tasks. Although many methods focus on efficiency in pattern mining, they still suffer from the problem of choosing a threshold that influences the final extraction result. The goal of our study is to make the results of pattern mining useful from a user-preference point of view. To this end, we integrate the idea of skyline queries into the pattern discovery process in order to mine skyline patterns in a threshold-free manner. Because the skyline patterns satisfy a formal property of domination, they not only have a global interest but also have semantics that are easily understood by the user. In this work, we first establish theoretical relationships between pattern condensed representations and skyline pattern mining. We also show that it is possible to automatically compute a subset of the measures involved in the user query that allows the patterns to be condensed and thus facilitates the computation of the skyline patterns. This forms the basis for a novel approach to mining skyline patterns. We illustrate the efficiency of our approach over several data sets and show that small sets of dominant patterns are produced under various measures.
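To make the notion of a skyline pattern in the abstract concrete, the following is a minimal brute-force sketch, not the authors' method (which relies on condensed representations to avoid enumerating every pattern). The toy transactions, the choice of measures (frequency and pattern length), and all function names are assumptions made for illustration only.

```python
from itertools import combinations

# Toy transaction database (hypothetical data, for illustration only).
TRANSACTIONS = [
    {"a", "b", "c"},
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "b"},
    {"a"},
    {"c"},
]


def all_patterns(transactions):
    """Enumerate every non-empty itemset over the items present in the data."""
    items = sorted(set().union(*transactions))
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def score(pattern, transactions):
    """Return the (frequency, length) pair; both measures are assumed choices."""
    freq = sum(1 for t in transactions if pattern <= t)
    return (freq, len(pattern))


def dominates(u, v):
    """u dominates v if it is at least as good everywhere and better somewhere."""
    return all(x >= y for x, y in zip(u, v)) and any(x > y for x, y in zip(u, v))


def skyline_patterns(transactions):
    """Keep the patterns that occur in the data and are dominated by no other."""
    scored = {p: score(p, transactions) for p in all_patterns(transactions)}
    scored = {p: s for p, s in scored.items() if s[0] > 0}
    return {
        p: s
        for p, s in scored.items()
        if not any(dominates(other, s) for other in scored.values())
    }


SKYLINE = skyline_patterns(TRANSACTIONS)
for pattern, measures in sorted(SKYLINE.items(), key=lambda kv: -kv[1][0]):
    print(sorted(pattern), measures)
```

On this toy data, no frequency threshold is needed: only {a}, {a, b} and {a, b, c} survive, because every other itemset is matched or beaten on both measures by one of them. The exhaustive enumeration is exponential in the number of items, which is the cost that a condensed-representation approach is meant to avoid.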
Keywords: Pattern mining, Skylines
Document type:
Conference paper
IEEE. The 11th IEEE International Conference on Data Mining - ICDM 2011, Dec 2011, Vancouver, B.C, Canada. 2011

https://hal.inria.fr/inria-00623566
Contributor: Chedy Raïssi
Submitted on: Wednesday, September 14, 2011 - 15:57:42
Last modified on: Tuesday, October 9, 2018 - 11:46:07

Identifiers

  • HAL Id: inria-00623566, version 1

Citation

Arnaud Soulet, Chedy Raïssi, Marc Plantevit, Bruno Crémilleux. Mining Dominant Patterns in the Sky. IEEE. The 11th IEEE International Conference on Data Mining - ICDM 2011, Dec 2011, Vancouver, B.C, Canada. 2011. 〈inria-00623566〉
