A Randomization Test for extracting Robust Association Rules

Martine Cadot 1, *
* Corresponding author
1 PAROLE - Analysis, perception and recognition of speech
INRIA Lorraine, LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
Abstract: An association rule "if A then B" is a link between two sets of database properties, A and B. Because rules of this kind are not deduced from hypotheses but discovered by exploring the data, association rule extraction belongs to Data Mining techniques (Han et al., 2001). More than fifty different measures are currently used to assess the quality of association rules, according to their different semantics. This shows the great variety of links between properties that these rules can express, but also the difficulty of being sure that they are meaningful. To test whether an association rule is robust, that is, whether the link it brings out is not due to chance, a Randomization Test (Edgington, 1995) is developed. To this end, simulations are defined that generate numerous artificial databases identical to the original database except for the links between properties. Only the links that are found in the original database and in less than 5% of the artificial databases are judged statistically significant, with a type I error risk of less than 5% (Snedecor et al., 1967), and only these produce significant association rules. This simulation technique is far more efficient than the acceptance-rejection method and allows the associated randomization test to be used on a variety of databases.
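The testing idea in the abstract can be illustrated with a minimal Python sketch. It is not the author's simulation procedure, which the abstract does not detail. As a simplifying assumption, each artificial database here is built by reassigning every property to random rows while keeping its frequency, so only the links between properties are destroyed; the paper's generator may preserve more of the original structure. The names `support`, `shuffle_columns` and `randomization_p_value` are hypothetical, and the strength of a link is measured simply by the number of rows containing both A and B.

```python
import random


def support(db, items):
    """Count the rows of `db` that contain every property in `items`."""
    return sum(1 for row in db if items <= row)


def shuffle_columns(db, all_items, rng):
    """Build one artificial database (assumed scheme, not the paper's).

    Each property keeps its original frequency but is assigned to
    randomly chosen rows, which breaks the links between properties.
    """
    n = len(db)
    new_rows = [set() for _ in range(n)]
    for item in all_items:
        count = sum(1 for row in db if item in row)
        for i in rng.sample(range(n), count):
            new_rows[i].add(item)
    return [frozenset(r) for r in new_rows]


def randomization_p_value(db, a, b, n_sim=1000, seed=0):
    """Proportion of artificial databases in which the link between
    property sets `a` and `b` is at least as strong as in `db`."""
    rng = random.Random(seed)
    all_items = sorted(set().union(*db))
    observed = support(db, a | b)
    hits = 0
    for _ in range(n_sim):
        fake = shuffle_columns(db, all_items, rng)
        if support(fake, a | b) >= observed:
            hits += 1
    return hits / n_sim
```

Under these assumptions, a rule "if A then B" would be retained when `randomization_p_value(db, a, b)` is below 0.05, where `db` is a list of `frozenset` rows and `a`, `b` are `frozenset` property sets. This mirrors the 5% threshold stated in the abstract.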
Document type:
Conference paper
3rd world conference on Computational Statistics & Data Analysis - CSDA 2005, Oct 2005, Limassol, Cyprus. 2005

https://hal.inria.fr/inria-00337069
Contributor: Martine Cadot
Submitted on: Wednesday, November 5, 2008 - 23:26:57
Last modified on: Thursday, January 11, 2018 - 06:19:56
Document(s) archived on: Tuesday, June 28, 2011 - 17:37:55

File

Cadot_CSDA05_T10_Robust_Data_M...
Files produced by the author(s)

Identifiers

  • HAL Id: inria-00337069, version 1

Citation

Martine Cadot. A Randomization Test for extracting Robust Association Rules. 3rd world conference on Computational Statistics & Data Analysis - CSDA 2005, Oct 2005, Limassol, Cyprus. 2005. 〈inria-00337069〉

Metrics

Record views: 336
File downloads: 191