Speedup Critical Stage of Machine Learning with Batch Scheduling in GPU

Abstract: As a powerful data analysis method, machine learning has for many years suffered from the bottleneck of limited computing capability. With the advent of massively parallel computing hardware, the modern GPU has become a promising platform for machine learning tasks. In this paper, we propose an efficient GPU execution framework to speed up the forward propagation stage of convolutional neural networks. By extending the convolution unrolling method to fit this batch mode, we obtain a significant increase in throughput at very little overhead.
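(For context: the "convolution unrolling" the abstract refers to is commonly known as im2col. It rewrites convolution as matrix multiplication, an operation GPUs execute very efficiently, and the batch mode applies it to a whole mini-batch at once. Below is a minimal NumPy sketch of that general idea, not the authors' GPU implementation; the function names and the stride-1, no-padding setup are assumptions for illustration.)

import numpy as np

def im2col(batch, kh, kw):
    # batch: (N, C, H, W). Unroll every kh-by-kw window into a column so that
    # convolution becomes a single matrix multiply per image.
    n, c, h, w = batch.shape
    out_h, out_w = h - kh + 1, w - kw + 1          # stride 1, no padding
    cols = np.empty((n, c * kh * kw, out_h * out_w), dtype=batch.dtype)
    for ci in range(c):
        for i in range(kh):
            for j in range(kw):
                # Row index matches the C-order flattening of (C, kh, kw) filters.
                row = (ci * kh + i) * kw + j
                cols[:, row, :] = batch[:, ci, i:i + out_h, j:j + out_w].reshape(n, -1)
    return cols

def conv_forward_batched(batch, weights):
    # weights: (K, C, kh, kw) filters, flattened to (K, C*kh*kw) so the whole
    # mini-batch is handled by one broadcasted matmul instead of N separate ones.
    n, _, h, w = batch.shape
    k, _, kh, kw = weights.shape
    cols = im2col(batch, kh, kw)                   # (N, C*kh*kw, out_h*out_w)
    out = weights.reshape(k, -1) @ cols            # broadcasts to (N, K, out_h*out_w)
    return out.reshape(n, k, h - kh + 1, w - kw + 1)

# Example: a mini-batch of 8 RGB 32x32 images through 16 filters of size 5x5.
x = np.random.rand(8, 3, 32, 32).astype(np.float32)
f = np.random.rand(16, 3, 5, 5).astype(np.float32)
print(conv_forward_batched(x, f).shape)            # (8, 16, 28, 28)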
Document type:
Conference paper

https://hal.inria.fr/hal-01403124
Contributor: Hal Ifip <>
Submitted on: Friday, November 25, 2016 - 14:39:47
Last modified on: Sunday, April 8, 2018 - 15:58:01
Long-term archiving on: Tuesday, March 21, 2017 - 01:52:33

File

978-3-662-44917-2_43_Chapter.p...
Files produced by the author(s)

License

Distributed under a Creative Commons Attribution 4.0 International License

Identifiers

hal-01403124
DOI: 10.1007/978-3-662-44917-2_43

Citation

Yuan Gao, Rui Wang, Ning An, Yanjiang Wei, Depei Qian. Speedup Critical Stage of Machine Learning with Batch Scheduling in GPU. In: Ching-Hsien Hsu, Xuanhua Shi, Valentina Salapura (eds.), 11th IFIP International Conference on Network and Parallel Computing (NPC), Sep 2014, Ilan, Taiwan. Lecture Notes in Computer Science, LNCS-8707, Springer, pp. 522-525, 2014. DOI: 10.1007/978-3-662-44917-2_43. ⟨hal-01403124⟩


Metrics

Record views: 334
File downloads: 15