Parallel learning of local SVM algorithms for classifying large datasets

Abstract: We propose new parallel learning algorithms of local support vector machines (local SVMs) for the effective non-linear classification of large datasets. Local SVM algorithms train on a large dataset in two main steps. The first step partitions the full dataset into k subsets. The second step learns non-linear SVMs from the k subsets in parallel on multi-core computers, so that each subset is classified locally. The k local SVMs algorithm (kSVM) uses the k-means clustering algorithm to partition the data into k clusters, then constructs non-linear SVM models in parallel to classify the data clusters locally. The decision tree with labeling support vector machines algorithm (tSVM) uses the C4.5 decision tree algorithm to split the full dataset into terminal nodes, then learns local SVM models in parallel to classify the impure terminal nodes, which contain a mixture of labels. The krSVM algorithm trains a random ensemble of kSVM models. Numerical test results on 4 datasets from the UCI repository, 3 benchmarks of handwritten letter recognition, and a color image collection of one thousand small objects show that the proposed local SVM algorithms (kSVM, tSVM, krSVM) are efficient compared to the standard SVM (LibSVM) in terms of training time and accuracy when dealing with large datasets.
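The two-step structure of kSVM described in the abstract (partition with k-means, then train one non-linear SVM per cluster in parallel) can be illustrated with a minimal sketch. This is not the authors' implementation, which builds on LibSVM. Everything here is an assumption made for illustration: the function names, the toy data, the hyperparameters, the bias-free RBF SVM trained by dual coordinate ascent, the shortcut of labeling single-class clusters directly, and the rule of routing a point to the model of its nearest cluster center. Threads are used only to show the structure; in pure Python they give no real speedup.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor

def sq_dist(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))

def nearest(centers, x):
    # Index of the cluster center closest to x.
    return min(range(len(centers)), key=lambda c: sq_dist(centers[c], x))

def assign(X, centers):
    groups = [[] for _ in centers]
    for i, x in enumerate(X):
        groups[nearest(centers, x)].append(i)
    return groups

def kmeans(X, k, iters=20, seed=0):
    # Step 1: plain Lloyd iterations; returns centers and index groups.
    rng = random.Random(seed)
    centers = [list(x) for x in rng.sample(X, k)]
    for _ in range(iters):
        for c, idx in enumerate(assign(X, centers)):
            if idx:
                centers[c] = [sum(X[i][d] for i in idx) / len(idx)
                              for d in range(len(X[0]))]
    return centers, assign(X, centers)

def rbf(a, b, gamma):
    return math.exp(-gamma * sq_dist(a, b))

def train_local_svm(X, y, C=10.0, gamma=1.0, epochs=50):
    # Assumption: a single-class cluster is labeled directly, no SVM needed.
    if len(set(y)) == 1:
        return ("const", y[0])
    n = len(X)
    K = [[rbf(X[i], X[j], gamma) for j in range(n)] for i in range(n)]
    alpha = [0.0] * n
    # Dual coordinate ascent for a bias-free SVM, labels in {-1, +1}.
    for _ in range(epochs):
        for i in range(n):
            f_i = sum(alpha[j] * y[j] * K[i][j] for j in range(n))
            step = (1.0 - y[i] * f_i) / K[i][i]
            alpha[i] = min(C, max(0.0, alpha[i] + step))
    sv = [(alpha[i] * y[i], X[i]) for i in range(n) if alpha[i] > 0]
    return ("svm", sv, gamma)

def predict_local(model, x):
    if model[0] == "const":
        return model[1]
    _, sv, gamma = model
    score = sum(coef * rbf(s, x, gamma) for coef, s in sv)
    return 1 if score >= 0 else -1

def train_ksvm(X, y, k, workers=4):
    centers, groups = kmeans(X, k)
    kept = [(c, g) for c, g in zip(centers, groups) if g]  # drop empty clusters
    def fit(item):
        _, idx = item
        return train_local_svm([X[i] for i in idx], [y[i] for i in idx])
    # Step 2: the local models are independent, so they can be fit concurrently.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        models = list(pool.map(fit, kept))
    return [c for c, _ in kept], models

def predict_ksvm(centers, models, x):
    return predict_local(models[nearest(centers, x)], x)

def make_xor_blobs():
    # Toy XOR-style data: four small blobs, not linearly separable.
    X, y = [], []
    for cx, cy, label in [(0, 0, -1), (3, 3, -1), (0, 3, 1), (3, 0, 1)]:
        for dx in (-0.1, 0.0, 0.1):
            for dy in (-0.1, 0.0, 0.1):
                X.append([cx + dx, cy + dy])
                y.append(label)
    return X, y

X, y = make_xor_blobs()
centers, models = train_ksvm(X, y, k=2)
predictions = [predict_ksvm(centers, models, x) for x in X]
accuracy = sum(p == t for p, t in zip(predictions, y)) / len(y)
print(f"training accuracy: {accuracy:.2f}")
```

Because each local model only sees the points of its own cluster, the kernel matrices stay small, which is where the training-time advantage over a single global SVM would be expected to come from. A practical version would replace the thread pool with processes or a native SVM library to use several cores.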
Document type:
Journal article
Transactions on Large-Scale Data- and Knowledge-Centered Systems, Springer Berlin / Heidelberg, 2017, XXXI, pp.67-93

https://hal.inria.fr/hal-01400786
Contributor: François Poulet
Submitted on: Tuesday, 22 November 2016 - 14:40:09
Last modified on: Tuesday, 16 January 2018 - 15:54:26

Identifiers

  • HAL Id : hal-01400786, version 1

Citation

Thanh Nghi Do, François Poulet. Parallel learning of local SVM algorithms for classifying large datasets. Transactions on Large-Scale Data- and Knowledge-Centered Systems, Springer Berlin / Heidelberg, 2017, XXXI, pp.67-93. 〈hal-01400786〉
