Scalable sparse tensor decompositions in distributed memory systems

Oguz Kaya, Bora Uçar
Abstract: We investigate an efficient parallelization of the most common iterative sparse tensor decomposition algorithms on distributed memory systems. A key operation in each iteration of these algorithms is the matricized tensor times Khatri-Rao product (MTTKRP). This operation amounts to element-wise vector multiplication and reduction, depending on the sparsity of the tensor. We investigate a fine-grain and a coarse-grain task definition for this operation, and propose hypergraph partitioning-based methods for these task definitions to achieve load balance as well as to reduce communication requirements. We also design a distributed memory sparse tensor library, HyperTensor, which implements a well-known algorithm for the CANDECOMP-PARAFAC (CP) tensor decomposition using the task definitions and the associated partitioning methods. We use this library to test the proposed implementation of MTTKRP in the context of CP decomposition, and report scalability results up to 1024 MPI ranks. We demonstrate up to 194-fold speedups using 512 MPI processes on a well-known real-world dataset, and significantly better performance than a state-of-the-art implementation.
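To make the abstract's description of MTTKRP concrete, here is a minimal single-process sketch in plain Python. It is not taken from HyperTensor and says nothing about the paper's partitioning or communication scheme. It assumes a 3-mode tensor stored as a dictionary of nonzeros, and the names `mttkrp_mode0`, `nonzeros`, `B`, `C`, and `num_rows` are hypothetical. For each nonzero, it multiplies the matching rows of the two factor matrices element-wise, scales by the nonzero value, and accumulates the result into one output row.

```python
from typing import Dict, List, Tuple

Coord = Tuple[int, int, int]


def mttkrp_mode0(
    nonzeros: Dict[Coord, float],
    B: List[List[float]],
    C: List[List[float]],
    num_rows: int,
) -> List[List[float]]:
    """Mode-0 MTTKRP for a 3-mode sparse tensor stored as {(i, j, k): value}.

    Illustrative only: a sequential sketch, not the distributed algorithm.
    """
    rank = len(B[0])
    M = [[0.0] * rank for _ in range(num_rows)]
    for (i, j, k), value in nonzeros.items():
        row_b, row_c = B[j], C[k]
        for r in range(rank):
            # Element-wise product of the two factor rows, scaled by the
            # nonzero, then reduced (accumulated) into row i of the result.
            M[i][r] += value * row_b[r] * row_c[r]
    return M


# Toy 2 x 2 x 2 tensor with three nonzeros and rank-2 factor matrices.
X = {(0, 0, 0): 1.0, (0, 1, 1): 2.0, (1, 0, 1): 3.0}
B = [[1.0, 2.0], [3.0, 4.0]]
C = [[5.0, 6.0], [7.0, 8.0]]
M = mttkrp_mode0(X, B, C, num_rows=2)
print(M)  # [[47.0, 76.0], [21.0, 48.0]]
```

The work is proportional to the number of nonzeros times the rank, which is why the cost of the operation depends on the sparsity of the tensor.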
Document type:
Conference paper
International Conference for High Performance Computing, Networking, Storage and Analysis (SC15), Nov 2015, Austin, TX, United States. 2015, 〈10.1145/2807591.2807624〉

https://hal.inria.fr/hal-01148202
Contributor: Equipe Roma
Submitted on: Monday, December 14, 2015 - 15:51:21
Last modified on: Friday, April 20, 2018 - 15:44:27
Document(s) archived on: Saturday, April 29, 2017 - 11:48:45

File

als_distmem.pdf
Files produced by the author(s)

Citation

Oguz Kaya, Bora Uçar. Scalable sparse tensor decompositions in distributed memory systems. International Conference for High Performance Computing, Networking, Storage and Analysis (SC15), Nov 2015, Austin, TX, United States. 2015, 〈10.1145/2807591.2807624〉. 〈hal-01148202v2〉
