3532 articles – 5253 Notices  [english version]

hal-00578830, version 1

Computing iceberg concept lattices with Titanic

Gerd Stumme 12, Rafik Taouil 3, Yves Bastide 4, Nicolas Pasquier () 5, Lotfi Lakhal 6

Data and Knowledge Engineering 42, 2 (2002) 189-222

Résumé : We introduce the notion of iceberg concept lattices and show their use in knowledge discovery in databases. Iceberg lattices are a conceptual clustering method, which is well suited for analyzing very large databases. They also serve as a condensed representation of frequent itemsets, as starting point for computing bases of association rules, and as a visualization method for association rules. Iceberg concept lattices are based on the theory of Formal Concept Analysis, a mathematical theory with applications in data analysis, information retrieval, and knowledge discovery. We present a new algorithm called TITANIC for computing (iceberg) concept lattices. It is based on data mining techniques with a level-wise approach. In fact, TITANIC can be used for a more general problem: Computing arbitrary closure systems when the closure operator comes along with a so-called weight function. The use of weight functions for computing closure systems has not been discussed in the literature up to now. Applications providing such a weight function include association rule mining, functional dependencies in databases, conceptual clustering, and ontology engineering. The algorithm is experimentally evaluated and compared with Ganter's Next-Closure algorithm. The evaluation shows an important gain in efficiency, especially for weakly correlated data.

  • 1 :  Institute AIFB, Universität Karlsruhe
  • AIFB, Universit¨at Karlsruhe
  • 2 :  University of Karlsruhe (TH)
  • University of Karlsruhe – Universität of Karlsruhe
  • 3 :  ORPAILLEUR (INRIA Lorraine - LORIA)
  • INRIA – CNRS : UMR7503 – Université Henri Poincaré - Nancy I – Université Nancy II – Institut National Polytechnique de Lorraine (INPL)
  • 4 :  Laboratoire d'Informatique, de Modélisation et d'optimisation des Systèmes (LIMOS)
  • CNRS : UMR6158 – Université d'Auvergne - Clermont-Ferrand I – Université Blaise Pascal - Clermont-Ferrand II – Institut Français de Mécanique Avancée
  • 5 :  Laboratoire d'Informatique, Signaux, et Systèmes de Sophia-Antipolis (I3S)
  • Université Nice Sophia Antipolis [UNS] – CNRS : UMR7271
  • 6 :  Laboratoire d'Informatique de Marseille (LIM)
  • CNRS : FR2246 – Université de la Méditerranée - Aix-Marseille II
  • Domaine : Informatique/Algorithme et structure de données
  • Mots-clés : Data mining – Galois closure – Concept lattice – Association rules – Algorithms
 
  • hal-00578830, version 1
  • oai:hal.archives-ouvertes.fr:hal-00578830
  • Contributeur : 
  • Soumis le : Mardi 22 Mars 2011, 14:15:37
  • Dernière modification le : Vendredi 14 Septembre 2012, 10:41:59