A Randomization Test for extracting Robust Association Rules

Martine Cadot 1, *
* Corresponding author
1 PAROLE - Analysis, perception and recognition of speech
INRIA Lorraine, LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
Abstract: An association rule "if A then B" is a link between two sets of database properties, A and B. Because such rules are not deduced from hypotheses but discovered by exploring data, association rule extraction belongs to Data Mining techniques (Han et al., 2001). More than fifty different measures are currently used to assess the quality of association rules, each reflecting a different semantics. This illustrates the great variety of links between properties that these rules can express, but also the difficulty of being sure that they are meaningful. To test whether an association rule is robust, that is, whether the link it reveals is not due to chance, a Randomization Test (Edgington, 1995) is developed. To this end, simulations are defined that generate numerous artificial databases identical to the original database except for the links between properties. Only the links that are found in the original database and in fewer than 5% of the artificial databases are judged statistically significant, with a type I error risk of less than 5% (Snedecor et al., 1967), and only these produce significant association rules. This simulation technique is far more efficient than the acceptance-rejection method and allows the associated randomization test to be applied to a variety of databases.
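The decision rule described in the abstract can be illustrated with a minimal Python sketch. It is not the simulation technique of the paper, which the abstract does not detail. As a simplifying assumption, each artificial database is built by re-placing every property at random among the rows, so that each property keeps its frequency while links between properties are broken. The function names (`support`, `shuffle_columns`, `fraction_as_strong`) and the choice of joint support as the link statistic are hypothetical and serve only for illustration.

```python
import random


def support(rows, items):
    """Count the rows that contain every property in `items`."""
    return sum(1 for row in rows if items <= row)


def shuffle_columns(rows, all_items, rng):
    """Build one artificial database (assumed null model, not the paper's).

    Each property keeps its original frequency, but the rows that carry it
    are drawn at random, independently of the other properties.
    """
    n = len(rows)
    new_rows = [set() for _ in range(n)]
    for item in sorted(all_items):
        count = sum(1 for row in rows if item in row)
        for idx in rng.sample(range(n), count):
            new_rows[idx].add(item)
    return [frozenset(r) for r in new_rows]


def fraction_as_strong(rows, a, b, n_sim=1000, seed=0):
    """Fraction of artificial databases in which the link between A and B
    (measured here by joint support) is at least as strong as observed."""
    rng = random.Random(seed)
    all_items = set().union(*rows)
    observed = support(rows, a | b)
    hits = 0
    for _ in range(n_sim):
        fake = shuffle_columns(rows, all_items, rng)
        if support(fake, a | b) >= observed:
            hits += 1
    return hits / n_sim


# Toy database: A and B always occur together, C occurs alone.
rows = [frozenset({"A", "B"})] * 10 + [frozenset({"C"})] * 10
frac = fraction_as_strong(rows, {"A"}, {"B"}, n_sim=500, seed=1)
# The rule "if A then B" is kept only if the link shows up in fewer
# than 5% of the artificial databases.
print("significant" if frac < 0.05 else "not significant")
```

Under this sketch a rule is retained when the returned fraction is below 0.05, which mirrors the 5% threshold stated in the abstract. A null model that also preserves the number of properties per row would be stricter; independent column shuffling is used here only because it is short to write.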

https://hal.inria.fr/inria-00337069
Contributor: Martine Cadot
Submitted on: Wednesday, November 5, 2008 - 11:26:57 PM
Last modification on: Thursday, January 11, 2018 - 6:19:56 AM
Long-term archiving on: Tuesday, June 28, 2011 - 5:37:55 PM

File

Cadot_CSDA05_T10_Robust_Data_M...
Files produced by the author(s)

Identifiers

  • HAL Id: inria-00337069, version 1

Citation

Martine Cadot. A Randomization Test for extracting Robust Association Rules. 3rd world conference on Computational Statistics & Data Analysis - CSDA 2005, Oct 2005, Limassol, Cyprus. ⟨inria-00337069⟩
