A GPU-based Branch-and-Bound algorithm using Integer-Vector-Matrix data structure

Abstract: Branch-and-Bound (B&B) algorithms are tree-based exploratory methods for solving combinatorial optimization problems to optimality. These problems are often large and known to be NP-hard. The B&B tree is constructed and explored using four operators: branching, bounding, selection and pruning. Such algorithms are irregular, which makes their parallel design and implementation on GPU challenging. Existing GPU-accelerated B&B algorithms perform only part of the algorithm on the GPU and rely on transferring pools of subproblems across the PCI Express bus to the device. To the best of our knowledge, the algorithm presented in this paper is the first GPU-based B&B algorithm that performs all four operators on the device and thereby avoids the data transfer bottleneck between CPU and GPU. The GPU implementation is based on the Integer-Vector-Matrix (IVM) data structure, which is used instead of a conventional linked list to store and manage the pool of subproblems. This paper revisits the IVM-based B&B algorithm on the GPU, addressing the irregularity of the algorithm in terms of workload, memory access patterns and control flow. In particular, the focus is on reducing thread divergence through a judicious mapping of threads onto the data. Compared to a GPU-accelerated B&B based on a linked list, the algorithm presented in this paper solves a set of standard flowshop instances on average 3.3 times faster.
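
To make the four operators named in the abstract concrete, here is a minimal sequential sketch in Python for a tiny permutation flowshop instance. It is not the paper's method: it runs on the CPU, keeps the pool of subproblems in a plain Python list rather than an IVM structure, and uses a deliberately simple lower bound chosen for illustration. All function names and the instance data are hypothetical.

```python
from itertools import permutations


def machine_fronts(p, seq):
    """Completion time on each machine after scheduling the jobs in seq, in order."""
    fronts = [0] * len(p[0])
    for j in seq:
        fronts[0] += p[j][0]
        for k in range(1, len(fronts)):
            # Job j starts on machine k once machine k is free and j left machine k-1.
            fronts[k] = max(fronts[k], fronts[k - 1]) + p[j][k]
    return fronts


def lower_bound(p, seq, remaining):
    """Bounding (illustrative assumption): the last machine must still run every remaining job."""
    return machine_fronts(p, seq)[-1] + sum(p[j][-1] for j in remaining)


def branch_and_bound(p):
    """Depth-first B&B over job permutations; p[j][k] is the time of job j on machine k."""
    best_cost, best_seq = float("inf"), None
    pool = [((), frozenset(range(len(p))))]  # pool of subproblems (plain list, not IVM)
    while pool:
        seq, remaining = pool.pop()  # selection: last in, first out (depth-first)
        if not remaining:  # leaf: a complete schedule
            cost = machine_fronts(p, seq)[-1]
            if cost < best_cost:
                best_cost, best_seq = cost, seq
            continue
        for j in sorted(remaining):  # branching: one child per unscheduled job
            child, rest = seq + (j,), remaining - {j}
            if lower_bound(p, child, rest) < best_cost:  # pruning: drop hopeless children
                pool.append((child, rest))
    return best_cost, best_seq


def brute_force(p):
    """Reference answer by full enumeration, only usable for very small instances."""
    return min(machine_fronts(p, s)[-1] for s in permutations(range(len(p))))


if __name__ == "__main__":
    times = [[3, 2, 4], [1, 5, 2], [4, 1, 3], [2, 3, 1]]  # 4 jobs, 3 machines (made up)
    cost, order = branch_and_bound(times)
    print(cost, order, cost == brute_force(times))
```

The irregularity mentioned in the abstract is visible even in this small sketch: how many children survive pruning, and therefore how much work each subproblem generates, depends on the data and on the best solution found so far.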


Contributor: Jan Gmys
Submitted on: Friday, 28 October 2016 - 14:42:59
Last modified on: Friday, 22 March 2019 - 01:34:02






Jan Gmys, Mohand Mezmaz, Nouredine Melab, Daniel Tuyttens. A GPU-based Branch-and-Bound algorithm using Integer-Vector-Matrix data structure. Parallel Computing, Elsevier, 2016, 59, pp.119-139. http://www.sciencedirect.com/science/article/pii/S0167819116000387. DOI: 10.1016/j.parco.2016.01.008. HAL Id: hal-01389471


