Grid-based approach for distributed frequent itemsets mining using dynamic workload management

Abstract: Discovering frequent itemsets is a crucial task in data mining. This study presents a grid-based approach to frequent itemset generation that takes the nature of the underlying platform into account and exploits an inherent property of candidate set generation in the Apriori algorithm. Grid-based implementations introduce constraints related to communication and synchronization overheads, platform heterogeneity, and the underlying middleware and tools. Other constraints relate to the properties and distribution of the datasets. The proposed approach uses only a local pruning strategy, which greatly reduces communication and synchronization costs. A block-based approach is introduced to handle memory constraints and dynamic workload management. This paper describes the approach and evaluates its performance on large-scale datasets on a widely distributed grid testbed. The performance study shows that the approach greatly improves performance and achieves high scalability compared to the grid implementation of a distributed Apriori-based algorithm, namely the FDM approach.
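The abstract names three ingredients: Apriori candidate generation, local pruning, and block-based processing. The sketch below is a minimal single-node illustration of how they can fit together. It is not the authors' implementation. The function names (`generate_candidates`, `count_in_blocks`, `local_apriori`), the `block_size` parameter, and the use of an absolute support count are all assumptions made for illustration, and grid communication and workload balancing are left out entirely.

```python
from itertools import combinations


def generate_candidates(frequent_prev, k):
    """Join frequent (k-1)-itemsets into k-itemset candidates and drop any
    candidate with an infrequent (k-1)-subset (the Apriori property)."""
    prev = set(frequent_prev)
    candidates = set()
    for a in prev:
        for b in prev:
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in prev for sub in combinations(union, k - 1)
            ):
                candidates.add(union)
    return candidates


def count_in_blocks(transactions, candidates, block_size):
    """Count candidate support one block of transactions at a time
    (hypothetical stand-in for the block-based processing in the abstract)."""
    counts = {c: 0 for c in candidates}
    for start in range(0, len(transactions), block_size):
        block = transactions[start:start + block_size]
        for t in block:
            for c in candidates:
                if c <= t:  # candidate is a subset of the transaction
                    counts[c] += 1
    return counts


def local_apriori(transactions, min_support, block_size=2):
    """Mine itemsets that are frequent in this node's local data only,
    pruning with local counts and no global synchronization."""
    transactions = [frozenset(t) for t in transactions]
    candidates = {frozenset([i]) for t in transactions for i in t}
    frequent = {}
    k = 1
    while candidates:
        counts = count_in_blocks(transactions, candidates, block_size)
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        k += 1
        candidates = generate_candidates(level.keys(), k)
    return frequent


# Example usage on a toy local dataset (absolute support threshold of 3).
data = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a", "b", "c"]]
result = local_apriori(data, min_support=3)
```

In this toy run every single item and every pair meets the threshold, while the triple {a, b, c} is generated as a candidate but rejected on its local count. In a distributed setting each node would run such a pass on its own partition; how the local results are combined is outside the scope of this sketch.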
Document type:
Conference paper
IAIT'07 : the 2nd International Conference on Advances in Information Technology, Nov 2007, Bangkok, Thailand. 2007

https://hal.inria.fr/hal-00697459
Contributor: Ist Rennes
Submitted on: Tuesday, 15 May 2012 - 13:59:13
Last modified on: Monday, 20 June 2016 - 14:10:32

Identifiers

  • HAL Id : hal-00697459, version 1

Citation

Lamine M. Aouad, Nhien-An Le-Khac, Tahar Kechadi. Grid-based approach for distributed frequent itemsets mining using dynamic workload management. IAIT'07 : the 2nd International Conference on Advances in Information Technology, Nov 2007, Bangkok, Thailand. 2007. 〈hal-00697459〉

Metrics

Record views

78