High Performance Parallel Algorithms for the Tucker Decomposition of Sparse Tensors

Oguz Kaya 1, 2, * Bora Uçar 1, 2
* Auteur correspondant
2 ROMA - Optimisation des ressources : modèles, algorithmes et ordonnancement
Inria Grenoble - Rhône-Alpes, LIP - Laboratoire de l'Informatique du Parallélisme
Abstract : —We investigate an efficient parallelization of a class of algorithms for the well-known Tucker decomposition of general N-dimensional sparse tensors. The targeted algorithms are iterative and use the alternating least squares method. At each iteration, for each dimension of an N-dimensional input tensor, the following operations are performed: (i) the tensor is multiplied with (N − 1) matrices (TTMc step); (ii) the product is then converted to a matrix; and (iii) a few leading left singular vectors of the resulting matrix are computed (TRSVD step) to update one of the matrices for the next TTMc step. We propose an efficient parallelization of these algorithms for the current parallel platforms with multicore nodes. We discuss a set of preprocessing steps which takes all computational decisions out of the main iteration of the algorithm and provides an intuitive shared-memory parallelism for the TTM and TRSVD steps. We propose a coarse and a fine-grain parallel algorithm in a distributed memory environment, investigate data dependencies, and identify efficient communication schemes. We demonstrate how the computation of singular vectors in the TRSVD step can be carried out efficiently following the TTMc step. Finally, we develop a hybrid MPI-OpenMP implementation of the overall algorithm and report scalability results on up to 4096 cores on 256 nodes of an IBM BlueGene/Q supercomputer.
Type de document :
Communication dans un congrès
International Conference on Parallel Processing (ICPP), Aug 2016, 2016-08-19, United States. ICPP 2016, 2016
Liste complète des métadonnées

Littérature citée [34 références]  Voir  Masquer  Télécharger

https://hal.inria.fr/hal-01354894
Contributeur : Equipe Roma <>
Soumis le : vendredi 19 août 2016 - 21:17:27
Dernière modification le : mardi 16 janvier 2018 - 15:34:26
Document(s) archivé(s) le : dimanche 20 novembre 2016 - 10:29:05

Fichier

PID4261953.pdf
Fichiers produits par l'(les) auteur(s)

Identifiants

  • HAL Id : hal-01354894, version 1

Collections

Citation

Oguz Kaya, Bora Uçar. High Performance Parallel Algorithms for the Tucker Decomposition of Sparse Tensors. International Conference on Parallel Processing (ICPP), Aug 2016, 2016-08-19, United States. ICPP 2016, 2016. 〈hal-01354894〉

Partager

Métriques

Consultations de la notice

161

Téléchargements de fichiers

166