FTH-B&B: A Fault-Tolerant Hierarchical Branch and Bound for Large Scale Unreliable Environments

Abstract: Solving large instances of combinatorial optimization problems to optimality using Branch and Bound (B&B) algorithms requires a huge amount of computing resources. In this paper, we investigate the design and implementation of such algorithms on computational grids. Most existing grid-based B&B algorithms are based on the Master-Worker paradigm, so their scalability is limited. In addition, although the volatility of resources is a major issue in grids, fault tolerance is rarely addressed in these works. We therefore propose FTH-B&B, a fault-tolerant hierarchical B&B. FTH-B&B is based on new mechanisms that make it possible to efficiently build the hierarchy and keep it balanced, and to store and recover work units (sub-problems). FTH-B&B has been implemented on top of the ProActive grid middleware and programming environment and applied to the Flow-Shop scheduling problem. Very often, existing grid-based B&B works are validated either through simulation or on a very small real grid. In this paper, we evaluated FTH-B&B on Grid’5000, the French nation-wide computational grid, using up to 1,900 processor cores distributed over six sites. The reported results show that the overhead induced by the proposed mechanisms is very low and that an efficiency close to 100 percent can be achieved on some of Taillard's benchmarks of the Flow-Shop problem. In addition, the results demonstrate the robustness of the proposed mechanisms even in extreme failure situations.
Document type:
Journal article
IEEE Transactions on Computers, Institute of Electrical and Electronics Engineers, 2014, 63 (09), pp.2302 - 2315. 〈10.1109/TC.2013.40〉

https://hal.inria.fr/hal-01107787
Contributor: Nouredine Melab
Submitted on: Wednesday, January 21, 2015 - 15:28:55
Last modified on: Thursday, January 11, 2018 - 06:22:13


Citation

Ahcène Bendjoudi, Nouredine Melab, El-Ghazali Talbi. FTH-B&B: A Fault-Tolerant Hierarchical Branch and Bound for Large Scale Unreliable Environments. IEEE Transactions on Computers, Institute of Electrical and Electronics Engineers, 2014, 63 (09), pp.2302 - 2315. 〈10.1109/TC.2013.40〉. 〈hal-01107787〉

