An FPGA implementation of the two-dimensional finite-difference time-domain (FDTD) algorithm, Proceeding of the 2004 ACM/SIGDA 12th international symposium on Field programmable gate arrays , FPGA '04, pp.213-222, 2004. ,
DOI : 10.1145/968280.968311
Time domain numerical simulation for transient waves on reconfigurable coprocessor platform, Proceedings of the 13th Annual IEEE Symposium on Field-Programmable Custom Computing Machines, pp.127-136, 2005. ,
Accelerating Fluid Registration Algorithm on Multi-FPGA Platforms, 2011 21st International Conference on Field Programmable Logic and Applications, pp.50-57, 2011. ,
DOI : 10.1109/FPL.2011.20
URL : http://ballade.cs.ucla.edu/%7Econg/papers/fpl11.pdf
Automatic mapping of C to FPGAs with the DEFACTO compilation and synthesis system, Microprocessors and Microsystems, vol.29, issue.2-3, pp.51-62, 2005. ,
DOI : 10.1016/j.micpro.2004.06.007
An FPGA-optimized architecture of horn and schunck optical flow algorithm for real-time applications, 2014 24th International Conference on Field Programmable Logic and Applications (FPL), pp.1-4, 2014. ,
DOI : 10.1109/FPL.2014.6927406
Domain-Specific Language and Compiler for Stencil Computation on FPGA-Based Systolic Computational-Memory Array, Proceedings of the 8th International Symposium on Applied Reconfigurable Computing, pp.26-39, 2012. ,
DOI : 10.1109/71.97902
A high-level synthesis flow for the implementation of iterative stencil loop algorithms on FPGA devices, Proceedings of the 50th Annual Design Automation Conference on, DAC '13, pp.521-52, 2013. ,
DOI : 10.1145/2463209.2488797
Code generation from a domain-specific language for C-based HLS of hardware accelerators, Proceedings of the 2014 International Conference on Hardware/Software Codesign and System Synthesis, CODES '14, 2014. ,
DOI : 10.1145/2656075.2656081
An optimal microarchitecture for stencil computation acceleration based on non-uniform partitioning of data reuse buffers, Proceedings of the 51st Annual Design Automation Conference, pp.771-77, 2014. ,
A polyhedral model-based framework for dataflow implementation on FPGA devices of iterative stencil loops, Proceedings of the 35th International Conference on Computer-Aided Design, ICCAD '16, pp.771-778, 2016. ,
DOI : 10.1145/2435264.2435271
isl: An Integer Set Library for the Polyhedral Model, Proceedings of the 3rd International Congress on Mathematical Software, pp.299-302, 2010. ,
DOI : 10.1007/978-3-642-15582-6_49
Supernode partitioning, Proceedings of the 15th ACM SIGPLAN-SIGACT symposium on Principles of programming languages , POPL '88, pp.319-329, 1988. ,
DOI : 10.1145/73560.73588
A data locality optimizing algorithm, Proceedings of the 12th Conference on Programming Language Design and Implementation, pp.30-44, 1991. ,
DOI : 10.1145/113445.113449
A practical automatic polyhedral parallelizer and locality optimizer, Proceedings of the 29th Conference on Programming Language Design and Implementation, pp.101-113, 2008. ,
DOI : 10.1145/1379022.1375595
URL : http://www.cse.ohio-state.edu/~bondhugu/publications/uday-pldi08.pdf
Hybrid Hexagonal/Classical Tiling for GPUs, Proceedings of Annual IEEE/ACM International Symposium on Code Generation and Optimization, CGO '14, pp.6666-6675, 2014. ,
DOI : 10.1145/2581122.2544160
URL : https://hal.archives-ouvertes.fr/hal-00911177
Effective automatic parallelization of stencil computations, Proceedings of the 28th Conference on Programming Language Design and Implementation, pp.235-244, 2007. ,
DOI : 10.1145/1250734.1250761
URL : http://www.cse.ohio-state.edu/~bondhugu/publications/pldi196-krishnamoorthy.ps
PATUS: A Code Generation and Autotuning Framework for Parallel Iterative Stencil Computations on Modern Microarchitectures, 2011 IEEE International Parallel & Distributed Processing Symposium, pp.676-687, 2011. ,
DOI : 10.1109/IPDPS.2011.70
The pochoir stencil compiler, Proceedings of the 23rd ACM symposium on Parallelism in algorithms and architectures, SPAA '11, pp.117-128, 2011. ,
DOI : 10.1145/1989493.1989508
A stencil compiler for short-vector SIMD architectures, Proceedings of the 27th international ACM conference on International conference on supercomputing, ICS '13, pp.13-24, 2013. ,
DOI : 10.1145/2464996.2467268
URL : http://www.cs.ucla.edu/%7Epouchet/doc/ics-article.13.pdf
Eliminating the memory bottleneck, Proceedings of the 19th ACM/SIGDA international symposium on Field programmable gate arrays, FPGA '11, pp.65-74, 2011. ,
DOI : 10.1145/1950413.1950429
Darkroom, Proceedings of the 41st International Conference on Computer Graphics and Interactive Techniques, 2014. ,
DOI : 10.1145/2228360.2228472