, AMD. Southern Islands Series Instruction Set Architecture, 2012.
Reconvergence de contrôle implicite pour les architectures SIMT, Revue des Sciences et Technologies de l'Information -Série TSI : Technique et Science Informatiques, vol.32, pp.153-178, 2013. ,
,
Simultaneous Branch and Warp Interweaving for Sustained GPU Performance, 39th Annual International Symposium on Computer Architecture (ISCA), pp.49-60, 2012. ,
URL : https://hal.archives-ouvertes.fr/ensl-00649650
Executing subroutines in a multi-threaded processing system, US Patent, vol.9, p.721, 2016. ,
Stack-less SIMT reconvergence at low cost, 2011. ,
URL : https://hal.archives-ouvertes.fr/hal-00622654
Simty: a synthesizable general-purpose SIMT processor, 2016. ,
URL : https://hal.archives-ouvertes.fr/hal-01351689
Étude comparée et simulation d'algorithmes de branchements pour le GPGPU, SYMPosium en Architectures nouvelles de machines (SYMPA), 2009. ,
Execution of divergent threads using a convergence barrier, vol.265, 2015. ,
MIMD synchronization on SIMT architectures, 49th Annual IEEE/ACM International Symposium on Microarchitecture, 2016. ,
A scalable multi-path microarchitecture for efficient GPU control flow, International Symposium on High Performance Computer Architecture (HPCA), 2014. ,
Dynamic warp formation: Efficient MIMD control flow on SIMD graphics hardware, ACM Transactions on Architecture and Code Optimization (TACO), vol.6, issue.2, p.7, 2009. ,
CUDA 9 Features Revealed: Volta, Cooperative Groups and More. NVIDIA Parallel ForAll, 2017. ,
Scheduling program instructions with a runner-up execution position, US Patent, vol.9, p.473, 2016. ,
Heterogeneous System Architecture: A new compute platform infrastructure, 2015. ,
HARP: Harnessing inactive threads in many-core processors, ACM TECS, vol.13, issue.3s, 2014. ,
Exploring the tradeoffs between programmability and efficiency in data-parallel accelerators, In ACM SIGARCH Computer Architecture News, vol.39, pp.129-140, 2011. ,
Chap -a SIMD graphics processor, Proceedings of the 11th annual conference on Computer graphics and interactive techniques, SIGGRAPH '84, pp.77-82, 1984. ,
Dynamic warp subdivision for integrated branch and memory divergence tolerance, SIGARCH Comput. Archit. News, vol.38, issue.3, pp.235-246, 2010. ,
The dual-path execution model for efficient GPU control flow, International Symposium on High Performance Computer Architecture (HPCA2013), pp.591-602, 2013. ,