S. Sirowy and A. Forin, Where's the beef? Why FPGAs are so fast, 2008.

R. McMillan, Microsoft supercharges Bing search with programmable chips, 2014.

S. Parsons, D. E. Taylor, D. V. Schuehler, M. A. Franklin, and R. D. Chamberlain, High speed processing of financial information using FPGA devices, 2011.

R. Woods, J. McAllister, Y. Yi, and G. Lightbody, FPGA-based Implementation of Signal Processing Systems, 2008.
DOI : 10.1002/9781119079231

J. Arram, W. Luk, and P. Jiang, Ramethy, Proceedings of the 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, FPGA '15, 2015.
DOI : 10.1145/2684746.2689066

Intel, Acquisition of Altera, Intel Investor Conference Call Deck, 2015.

T. Graepel, J. Q. Candela, T. Borchert, and R. Herbrich, Web-scale Bayesian click-through rate prediction for sponsored search advertising in Microsoft's Bing search engine, Proc. ICML, 2010.

J. G. Coutinho, O. Pell, E. O'Neill, P. Sanders, J. McGlone et al., HARNESS Project: Managing Heterogeneous Computing Resources for a Cloud Platform, Reconfigurable Computing: Architectures, Tools, and Applications, 2014.
DOI : 10.1007/978-3-319-05960-0_36

D. F. Bacon, R. Rabbah, and S. Shukla, FPGA programming for the masses, Communications of the ACM, vol.56, issue.4, 2013.
DOI : 10.1145/2436256.2436271

D. W. Page, Dynamic data re-programmable PLA, 1985.

J. M. Cardoso and P. C. Diniz, Compilation Techniques for Reconfigurable Architectures, 2009.
DOI : 10.1007/978-0-387-09671-1

P. Grigoraş, X. Niu, J. G. Coutinho, W. Luk et al., Aspect driven compilation for dataflow designs, 2013 IEEE 24th International Conference on Application-Specific Systems, Architectures and Processors, 2013.
DOI : 10.1109/ASAP.2013.6567545

Maxeler Technologies, Maxeler AppGallery

Xilinx Inc., Applications

J. P. Walters, A. J. Younge, D. Kang, K. Yao, M. Kang et al., GPU Passthrough Performance: A Comparison of KVM, Xen, VMWare ESXi, and LXC for CUDA and OpenCL Applications, 2014 IEEE 7th International Conference on Cloud Computing, 2014.
DOI : 10.1109/CLOUD.2014.90

Amazon Web Services, EC2: Elastic Compute Cloud

Y. Suzuki, S. Kato, H. Yamada, and K. Kono, GPUvm: why not virtualizing GPUs at the hypervisor, Proc. USENIX ATC, 2014.

W. Wang, M. Bolic, and J. Parri, pvFPGA: Accessing an FPGA-based hardware accelerator in a paravirtualized environment, 2013 International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS), 2013.
DOI : 10.1109/CODES-ISSS.2013.6658997

L. Shi, H. Chen, J. Sun, and K. Li, vCUDA: GPU-accelerated high-performance computing in virtual machines, IEEE Transactions on Computers, vol.61, issue.6, 2012.

G. Giunta, R. Montella, G. Agrillo, and G. Coviello, A GPGPU Transparent Virtualization Component for High Performance Computing Clouds, Proc. Euro-Par, 2010.
DOI : 10.1007/978-3-642-15277-1_37

M. Gottschlag, M. Hillenbrand, J. Kehne, J. Stoess, and F. Bellosa, LoGV: Low-Overhead GPGPU Virtualization, 2013 IEEE 10th International Conference on High Performance Computing and Communications & 2013 IEEE International Conference on Embedded and Ubiquitous Computing, 2013.
DOI : 10.1109/HPCC.and.EUC.2013.245

J. Duato, A. J. Peña, F. Silla, R. Mayo, and E. S. Quintana-Ortí, rCUDA: Reducing the number of GPU-based accelerators in high performance clusters, 2010 International Conference on High Performance Computing & Simulation, 2010.
DOI : 10.1109/HPCS.2010.5547126

M. Oikawa, A. Kawai, K. Nomura, K. Yasuoka, K. Yoshikawa et al., DS-CUDA: A Middleware to Use Many GPUs in the Cloud Environment, 2012 SC Companion: High Performance Computing, Networking Storage and Analysis, 2012.
DOI : 10.1109/SC.Companion.2012.146

P. Kegel, M. Steuwer, and S. Gorlatch, dOpenCL: Towards a Uniform Programming Approach for Distributed Heterogeneous Multi-/Many-Core Systems, 2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum, 2012.
DOI : 10.1109/IPDPSW.2012.16

A. Barak and A. Shiloh, The VirtualCL (VCL) cluster platform

C. Reaño, R. Mayo, E. S. Quintana-Ortí, F. Silla, J. Duato et al., Influence of InfiniBand FDR on the performance of remote GPU virtualization, 2013 IEEE International Conference on Cluster Computing (CLUSTER), 2013.
DOI : 10.1109/CLUSTER.2013.6702662

A. Kawai, K. Yasuoka, K. Yoshikawa, and T. Narumi, Distributed-shared CUDA: Virtualization of large-scale GPU systems for programmability and reliability, Proc. FCTA, 2012.

S. Byma, J. G. Steffan, H. Bannazadeh, A. L. Garcia, and P. Chow, FPGAs in the Cloud: Booting Virtualized Hardware Accelerators with OpenStack, 2014 IEEE 22nd Annual International Symposium on Field-Programmable Custom Computing Machines, 2014.
DOI : 10.1109/FCCM.2014.42

F. Chen, Y. Shan, Y. Zhang, Y. Wang, H. Franke et al., Enabling FPGAs in the cloud, Proceedings of the 11th ACM Conference on Computing Frontiers, CF '14, 2014.
DOI : 10.1145/2597917.2597929

Maxeler Technologies, New Maxeler MPC-X series: Maximum performance computing for big data applications, 2012.

S. Potluri, K. Hamidouche, A. Venkatesh, D. Bureddy, and D. Panda, Efficient Inter-node MPI Communication Using GPUDirect RDMA for InfiniBand Clusters with NVIDIA GPUs, 2013 42nd International Conference on Parallel Processing, 2013.
DOI : 10.1109/ICPP.2013.17

Amazon Web Services, Amazon Machine Learning