Abstract: Inducing rules from very large datasets is one of the most challenging areas in data mining. Several approaches exist for scaling up classification rule induction to large datasets, namely data reduction and the parallelisation of classification rule induction algorithms. In the area of parallelisation, most work has concentrated on the Top Down Induction of Decision Trees (TDIDT), also known as the 'divide and conquer' approach. However, powerful alternative algorithms exist that induce modular rules. Most of these alternatives follow the 'separate and conquer' approach to inducing rules, but very little work has been done to make that approach scale better on large training data. This paper examines the potential of the recently developed blackboard-based J-PMCRI methodology for parallelising modular classification rule induction algorithms that follow the 'separate and conquer' approach. A concrete implementation of the methodology is evaluated empirically on very large datasets.
https://hal.inria.fr/hal-01054584
Contributor: Hal Ifip
Submitted on: Thursday, August 7, 2014 - 3:29:02 PM
Last modification on: Thursday, March 5, 2020 - 5:43:02 PM
Long-term archiving on: Wednesday, November 26, 2014 - 1:46:15 AM
Frederic Stahl, Max Bramer, Mo Adda. J-PMCRI: A Methodology for Inducing Pre-Pruned Modular Classification Rules. Third IFIP TC12 International Conference on Artificial Intelligence (AI) / Held as Part of World Computer Congress (WCC), Sep 2010, Brisbane, Australia. pp.47-56, ⟨10.1007/978-3-642-15286-3_5⟩. ⟨hal-01054584⟩