
Parallel learning of local SVM algorithms for classifying large datasets

Abstract : We propose new parallel learning algorithms for local support vector machines (local SVMs) that perform effective non-linear classification of large datasets. Local SVM algorithms train on a large dataset in two main steps: first, the full dataset is partitioned into k subsets; second, non-linear SVMs are learned from the k subsets in parallel on multi-core computers, each classifying its own subset locally. The k local SVMs algorithm (kSVM) uses the k-means clustering algorithm to partition the data into k clusters, then constructs non-linear SVM models in parallel to classify the data clusters locally. The decision tree with labeling support vector machines algorithm (tSVM) uses the C4.5 decision tree algorithm to split the full dataset into terminal nodes, then learns local SVM models in parallel to classify the impure terminal nodes, which contain a mixture of labels. The krSVM algorithm trains a random ensemble of kSVM models. Numerical test results on 4 datasets from the UCI repository, 3 benchmarks of handwritten letter recognition, and a color image collection of one thousand small objects show that the proposed local SVM algorithms (kSVM, tSVM, krSVM) are efficient compared to the standard SVM (LibSVM) in terms of training time and accuracy when dealing with large datasets.
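The two-step kSVM scheme described in the abstract (k-means partitioning, then one non-linear SVM per cluster trained in parallel) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes scikit-learn and joblib in place of their LibSVM-based code, an RBF kernel, integer class labels, and prediction by routing each point to the model of its nearest cluster center. The names fit_ksvm, predict_ksvm, and ConstantModel are hypothetical and chosen only for illustration.

```python
import numpy as np
from joblib import Parallel, delayed
from sklearn.cluster import KMeans
from sklearn.svm import SVC


class ConstantModel:
    """Hypothetical fallback for a cluster that contains a single class."""

    def __init__(self, label):
        self.label = label

    def predict(self, X):
        return np.full(len(X), self.label)


def _fit_local(X, y, C, gamma):
    # SVC requires at least two classes; a pure cluster gets a constant predictor.
    if len(np.unique(y)) < 2:
        return ConstantModel(y[0])
    return SVC(kernel="rbf", C=C, gamma=gamma).fit(X, y)


def fit_ksvm(X, y, k=4, C=1.0, gamma="scale", n_jobs=-1, seed=0):
    # Step 1: partition the full dataset into k clusters with k-means.
    km = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(X)
    parts = [np.flatnonzero(km.labels_ == c) for c in range(k)]
    # Step 2: train one local non-linear SVM per cluster, in parallel.
    models = Parallel(n_jobs=n_jobs)(
        delayed(_fit_local)(X[idx], y[idx], C, gamma) for idx in parts
    )
    return km, models


def predict_ksvm(km, models, X):
    # Assumption: each point is classified by the local model of its nearest cluster.
    nearest = km.predict(X)
    pred = np.empty(len(X), dtype=int)  # assumes integer class labels
    for c, model in enumerate(models):
        mask = nearest == c
        if mask.any():
            pred[mask] = model.predict(X[mask])
    return pred
```

Calling fit_ksvm(X, y, k=...) returns the fitted k-means model together with the list of local models, and predict_ksvm(km, models, X_new) applies them. Because each local SVM sees only its own cluster, training cost drops relative to a single SVM over the whole dataset, and the k local problems are independent, which is what makes the parallel step straightforward.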
Document type :
Journal articles
Contributor : François Poulet
Submitted on : Tuesday, November 22, 2016 - 2:40:09 PM
Last modification on : Wednesday, November 3, 2021 - 6:03:38 AM


  • HAL Id : hal-01400786, version 1


Thanh Nghi Do, François Poulet. Parallel learning of local SVM algorithms for classifying large datasets. Transactions on Large-Scale Data- and Knowledge-Centered Systems, Springer Berlin / Heidelberg, 2017, XXXI, pp.67-93. ⟨hal-01400786⟩


