R. Allen, L. Cinque, S. Tanimoto, L. Shapiro, and D. Yasuda, A parallel algorithm for graph matching and its MasPar implementation, IEEE Transactions on Parallel and Distributed Systems, vol.8, issue.5, 1997.

L. G. Casadoa, J. A. Martneza, I. Garcaa, and E. M. Hendrixb, Branch-and-Bound interval global optimization on shared memory multiprocessors. Optimization Methods and Software, pp.689-701, 2008.

I. Chakroun, A. Bendjoudi, and N. Melab, Reducing Thread Divergence in GPU-Based B&B Applied to the Flow-Shop Problem, 9th International Conference on Parallel Processing and Applied Mathematics PPAM'11, 2011.
DOI : 10.1007/978-3-642-31464-3_57

URL : https://hal.archives-ouvertes.fr/hal-00640805

I. Chakroun and N. Melab, An Adaptative Multi-GPU Based Branch-and-Bound. A Case Study: The Flow-Shop Scheduling Problem, 2012 IEEE 14th International Conference on High Performance Computing and Communication & 2012 IEEE 9th International Conference on Embedded Software and Systems
DOI : 10.1109/HPCC.2012.59

URL : https://hal.archives-ouvertes.fr/hal-00705868

J. J. Dongarra, D. A. Bader, and J. Kurzak, Scientific computing with multi-core and accelerators, pp.978-1439825365, 2010.

W. Fung, I. Sham, G. Yuan, and T. Aamodt, Dynamic Warp Formation and Scheduling for Efficient GPU Control Flow, 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007), pp.407-420, 2007.
DOI : 10.1109/MICRO.2007.30

M. R. Garey, D. S. Johnson, and R. Sethi, The Complexity of Flowshop and Jobshop Scheduling, Mathematics of Operations Research, vol.1, issue.2, pp.117-129, 1976.
DOI : 10.1287/moor.1.2.117

E. Alerstam, W. C. , Y. Lo, T. David-han, and J. Rose, Next-generation acceleration and code optimization for light transport in turbid media using GPUs, Biomedical Optics Express, vol.1, issue.2, pp.658-675, 2010.
DOI : 10.1364/BOE.1.000658

T. Han and T. S. Abdelrahman, Reducing branch divergence in GPU programs, Proceedings of the Fourth Workshop on General Purpose Processing on Graphics Processing Units, GPGPU-4, 2011.
DOI : 10.1145/1964179.1964184

S. M. Johnson, Optimal two- and three-stage production schedules with setup times included, Naval Research Logistics Quarterly, vol.1, issue.1, pp.61-68, 1954.
DOI : 10.1002/nav.3800010110

J. K. Lenstra, B. J. Lageweg, and A. H. Kan, A General bounding scheme for the permutation flow-shop problem, Operations Research, vol.26, issue.1, pp.53-67, 1978.

T. V. Luong, N. Melab, and E. Talbi, GPU computing for parallel local search metaheuristic algorithms, press, preprint available in IEEE computer Society Digital Library, 2011.

N. Melab, ContributionsàContributions`Contributionsà la résolution deprobì emes d'optimisation combinatoire sur grilles de calcul, 2005.

N. Melab, I. Chakroun, and A. Bendjoudi, GPU-accelerated Bounding for Branch-and-Bound applied to a Permutation Problem using Data Access Optimization. Under submission in Concurrency and Computation: Practice and Experience -Manuscript

J. Meng, D. Tarjan, and K. Skadron, Dynamic warp subdivision for integrated branch and memory divergence tolerance, Proc. of ISCA, Pages 235246, 2010.

M. Mezmaz, N. Melab, and E. Talbi, A Grid-enabled Branch and Bound Algorithm for Solving Challenging Combinatorial Optimization Problems, 2007 IEEE International Parallel and Distributed Processing Symposium, 2007.
DOI : 10.1109/IPDPS.2007.370217

URL : https://hal.archives-ouvertes.fr/inria-00083814

M. J. Quinn, Analysis and implementation of branch-and-bound algorithms on a hypercube multicomputer, IEEE Transactions on Computers, vol.39, issue.3, pp.384-387, 1990.
DOI : 10.1109/12.48868

S. Ryoo, C. I. Rodrigues, S. S. Stone, J. A. Stratton, S. Ueng et al., Program optimization carving for GPU computing, Journal of Parallel and Distributed Computing, vol.68, issue.10, pp.1389-1401, 2008.
DOI : 10.1016/j.jpdc.2008.05.011

E. Taillard, Benchmarks for basic scheduling problems, European Journal of Operational Research, vol.64, issue.2, pp.278-285, 1993.
DOI : 10.1016/0377-2217(93)90182-M

E. Z. Zhang, Y. Jiang, Z. Guo, and X. Shen, Streamlining GPU applications on the fly, Proceedings of the 24th ACM International Conference on Supercomputing, ICS '10
DOI : 10.1145/1810085.1810104

C. Nvidia and . Guide, http://www.Nvidia.com/docs/IO/43395/NV DS Tesla C2050 C2070 jul10 lores.pdf 24. http://en.wikipedia.org/wiki/Comparison of Nvidia graphics processing units 25, Nvidia CUDA BestPracticesGuide 2.3.pdf. 23Intel-Core-i7-970-Processor-%2812M-Cache-3 20-GHz-4 80-GTs-Intel- QPI%29