Discovering Closed Frequent Itemsets on Multicore: Parallelizing Computations and Optimizing Memory Accesses

Benjamin Negrevergne 1, Alexandre Termier 2, Jean-François Mehaut 1, Takeaki Uno 3
1 MESCAL - Middleware efficiently scalable
Inria Grenoble - Rhône-Alpes, LIG - Laboratoire d'Informatique de Grenoble
2 LIG Laboratoire d'Informatique de Grenoble - HADAS
LIG - Laboratoire d'Informatique de Grenoble
Abstract: Closed frequent itemset discovery is a fundamental data mining problem with applications in numerous domains. It is therefore important to have efficient parallel algorithms for this problem that can harness the multicore processors found in today's computers (notebooks as well as desktops). In this paper we present PLCMQS, a parallel algorithm based on LCM, which is recognized as the most efficient sequential algorithm for discovering closed frequent itemsets. We also present a simple yet powerful parallelism interface based on the concept of Tuple Space, which allows efficient dynamic sharing of the work. Through a detailed experimental study, we show that PLCMQS is efficient on both sparse and dense databases.
Document type:
Conference paper
Proceedings of HPCS (Intl. Conference on High Performance Computing and Simulation), Special Session on High Performance Parallel and Distributed Data Mining, 2010, Caen, France. IEEE, pp.521-528, 2010, 〈10.1109/HPCS.2010.5547082〉

https://hal.inria.fr/hal-00788879
Contributor: Arnaud Legrand
Submitted on: Friday, 15 February 2013 - 13:11:07
Last modified on: Monday, 30 April 2018 - 11:33:58

Citation

Benjamin Negrevergne, Alexandre Termier, Jean-François Mehaut, Takeaki Uno. Discovering Closed Frequent Itemsets on Multicore: Parallelizing Computations and Optimizing Memory Accesses. Proceedings of HPCS (Intl. Conference on High Performance Computing and Simulation), Special Session on High Performance Parallel and Distributed Data Mining, 2010, Caen, France. IEEE, pp.521-528, 2010, 〈10.1109/HPCS.2010.5547082〉. 〈hal-00788879〉
