Guppy: A GPU-like soft-core processor, Field-Programmable Technology (FPT), 2012 International Conference on. IEEE, pp.57-60, 2012.
FlexGrip: A soft GPGPU for FPGAs, Field-Programmable Technology (FPT), 2013 International Conference on. IEEE, pp.230-237, 2013.
Analyzing CUDA Workloads Using a Detailed GPU Simulator, Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), pp.163-174, 2009.
Enabling GPGPU low-level hardware explorations with MIAOW: an open-source RTL implementation of a GPGPU, ACM Transactions on Architecture and Code Optimization (TACO), vol.12, p.21, 2015.
Simultaneous Branch and Warp Interweaving for Sustained GPU Performance, 39th Annual International Symposium on Computer Architecture (ISCA), pp.49-60, 2012.
URL : https://hal.archives-ouvertes.fr/ensl-00649650
Nyami: a synthesizable GPU architectural model for general-purpose and graphics-specific workloads, Performance Analysis of Systems and Software (ISPASS), pp.173-182, 2015.
Intel® Xeon Phi™ Coprocessor - the Architecture, Intel Whitepaper, 2014.
Stack-less SIMT reconvergence at low cost, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00622654
Path list traversal: a new class of SIMT flow tracking mechanisms, Inria Rennes - Bretagne Atlantique, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01533085
Barra: a Parallel Functional Simulator for GPGPU, IEEE International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS), pp.351-360, 2010.
MIMD Synchronization on SIMT Architectures, 49th Annual IEEE/ACM International Symposium on Microarchitecture, 2016.
Multi2Sim Kepler: A detailed architectural GPU simulator, 2017 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), pp.269-278, 2017.
Computer architecture: a quantitative approach, 2011.
Dynamic Inter-Thread Vectorization Architecture: extracting DLP from TLP, IEEE International Symposium on Computer Architecture and High-Performance Computing, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01356202
Macsim: A CPU-GPU heterogeneous simulation framework user guide, 2012.
A GPU-inspired soft processor for high-throughput acceleration, Parallel & Distributed Processing, Workshops and PhD Forum (IPDPSW), pp.1-8, 2010.
Overcoming the limitations of conventional vector processors, ACM SIGARCH Computer Architecture News, vol.31, pp.399-409, 2003.
Exploring the Tradeoffs between Programmability and Efficiency in Data-Parallel Accelerators, ACM Transactions on Computer Systems (TOCS), vol.31, p.6, 2013.
A 45nm 1.3 GHz 16.7 double-precision GFLOPS/W RISC-V processor with vector accelerators, European Solid State Circuits Conference (ESSCIRC), pp.199-202, 2014.
NVIDIA Tesla: A Unified Graphics and Computing Architecture, IEEE Micro, vol.28, pp.39-55, 2008.
A reconfigurable simulator for large-scale heterogeneous multicore architectures, IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), pp.119-120, 2011.
Dynamic warp subdivision for integrated branch and memory divergence tolerance, ACM SIGARCH Computer Architecture News, vol.38, pp.235-246, 2010.
The GPU Computing Era, IEEE Micro, vol.30, pp.56-69, 2010.
Soft vector processors with streaming pipelines, Proceedings of the 2014 ACM/SIGDA international symposium on Field-programmable gate arrays, pp.117-126, 2014.
The RISC-V Instruction Set Manual, vol.1, 2014.