The Design and Implementation of FFTW3, Proceedings of the IEEE, vol.93, issue.2, pp.216-231, 2005. ,
The Best of the 20th Century: Editors Name Top 10 Algorithms, SIAM News, vol.33, issue.4, 2000. ,
Challenges of Computing the Fast Fourier Transform, DARPA Conference, 1997. ,
An algorithm for the machine calculation of complex Fourier series, Mathematics of Computation, vol.19, issue.90, pp.297-301, 1965. ,
DOI : 10.1090/S0025-5718-1965-0178586-1
Floating-Point Fused Multiply-Add Architectures, 2007 Conference Record of the Forty-First Asilomar Conference on Signals, Systems and Computers, pp.331-337, 2007. ,
DOI : 10.1109/ACSSC.2007.4487224
Implementation of Efficient FFT Algorithms on Fused Multiply-Add Architectures, IEEE Transactions on Signal Processing, vol.41, issue.1, p.93, 1993. ,
DOI : 10.1109/TSP.1993.193130
Modified FFTs for fused multiply-add architectures, Mathematics of Computation, vol.60, issue.201, pp.347-361, 1993. ,
DOI : 10.1090/S0025-5718-1993-1159169-0
Fast Radix 2, 3, 4, and 5 Kernels for Fast Fourier Transformations on Computers with Overlapping Multiply-Add Instructions, SIAM Journal on Scientific Computing, vol.18, issue.6, pp.1605-1611, 1997. ,
DOI : 10.1137/S1064827595281940
The optimizations of signal processing algorithms of modern parallel and embedded architectures. Theses, 2009. ,
URL : https://hal.archives-ouvertes.fr/tel-00610865
Scaling Performance of FFT Computation on an Industrial Integrated GPU Co-processor: Experiments with Algorithm Adaptation, 2014. ,
Computational Frameworks for the Fast Fourier Transform, Frontiers in Applied Mathematics, Society for Industrial and Applied Mathematics, 1992. ,
DOI : 10.1137/1.9781611970999
What is the fast Fourier transform?, Proceedings of the IEEE, vol.55, issue.10, pp.1664-1674, 1967. ,
DOI : 10.1109/PROC.1967.5957
An algorithm for computing the mixed radix fast Fourier transform, IEEE Transactions on Audio and Electroacoustics, vol.17, issue.2, pp.93-103, 1969. ,
DOI : 10.1109/TAU.1969.1162042
A Modified Split-Radix FFT With Fewer Arithmetic Operations, IEEE Transactions on Signal Processing, vol.55, issue.1, pp.111-119, 2007. ,
DOI : 10.1109/TSP.2006.882087
The Tangent FFT, Applied Algebra, Algebraic Algorithms and Error-Correcting Codes, Proceedings of the 17th International Conference on Applied Algebra, pp.291-300, 2007. ,
The interaction algorithm and practical Fourier analysis, Journal of the Royal Statistical Society. Series B, 1960. ,
Fast Fourier Transforms, Signal Processing, vol.19, pp.259-299, 1990. ,
DOI : 10.1201/9781420046076-c7
On computing the Discrete Fourier Transform, Proceedings of the National Academy of Sciences, vol.73, issue.4, p.175, 1978. ,
DOI : 10.1073/pnas.73.4.1005
On the multiplicative complexity of the Discrete Fourier Transform, Advances in Mathematics, vol.32, issue.2, pp.83-117, 1979. ,
DOI : 10.1016/0001-8708(79)90037-9
Investigating the use of GPU-accelerated nodes for SAR image formation, 2009 IEEE International Conference on Cluster Computing and Workshops, pp.1-8, 2009. ,
DOI : 10.1109/CLUSTR.2009.5289125
Realization of interpolation-free fast SAR range-Doppler algorithm using parallel processing on GPU, Progress In Electromagnetics Research Symposium Proceedings, pp.998-1002, 2013. ,
Combining register allocation and instruction scheduling, tech. rep, 1995. ,
Hyperthreading technology in the NetBurst microarchitecture, IEEE Micro, vol.23, issue.2, pp.56-65, 2003. ,
DOI : 10.1109/MM.2003.1196115
Computer Architecture: A Quantitative Approach, Fifth Edition, 2011. ,
Microarchitectural Mechanisms to Exploit Value Structure in SIMT Architectures, Proceedings of the 40th Annual International Symposium on Computer Architecture, ISCA '13, pp.130-141, 2013. ,
Encyclopedia of Parallel Computing, ch. Fast Fourier Transform, 2011. ,
Self-Sorting In-Place Fast Fourier Transforms, SIAM Journal on Scientific and Statistical Computing, vol.12, issue.4, pp.808-823, 1991. ,
DOI : 10.1137/0912043
FFTW: an adaptive software architecture for the FFT, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181), pp.1381-1384, 1998. ,
DOI : 10.1109/ICASSP.1998.681704
Automatic performance tuning in the UHFFT library, Computational Science - ICCS 2001, pp.71-80, 2001. ,
A short introduction to the art of programming, Technische Hogeschool Eindhoven Eindhoven, vol.4, 1971. ,
Computing the fast Fourier transform on SIMD microprocessors. Thesis, 2012. ,
Optimization of conjugate-pair split-radix FFT algorithm for SIMD platforms, 2014 IEEE International Conference on Consumer Electronics (ICCE), pp.373-374, 2014. ,
DOI : 10.1109/ICCE.2014.6776047
A high performance FFT library with single instruction multiple data (SIMD) architecture, 2011 International Conference on Electronics, Communications and Control (ICECC), pp.630-633, 2011. ,
DOI : 10.1109/ICECC.2011.6066463
Accelerating the data shuffle operations for FFT algorithms on SIMD DSPs, 2011 9th IEEE International Conference on ASIC, pp.683-686, 2011. ,
DOI : 10.1109/ASICON.2011.6157297
An efficient technique for corner-turn in SAR image reconstruction by improving cache access, Proceedings 16th International Parallel and Distributed Processing Symposium, p.67, 2002. ,
DOI : 10.1109/IPDPS.2002.1015471
High performance discrete Fourier transforms on graphics processors, 2008 SC, International Conference for High Performance Computing, Networking, Storage and Analysis, pp.1-2, 2008. ,
DOI : 10.1109/SC.2008.5213922
An efficient, model-based CPU-GPU heterogeneous FFT library, Parallel and Distributed Processing IEEE International Symposium on, pp.1-10, 2008. ,
Fitting FFT onto the G80 architecture, 2008. ,
On the Efficacy of a Fused CPU+GPU Processor (or APU) for Parallel Computing, 2011 Symposium on Application Accelerators in High-Performance Computing, pp.141-149, 2011. ,
DOI : 10.1109/SAAHPC.2011.29
A Methodology for Speeding Up Fast Fourier Transform Focusing on Memory Architecture Utilization, IEEE Transactions on Signal Processing, vol.59, issue.12, pp.6217-6226, 2011. ,
DOI : 10.1109/TSP.2011.2168525
PENCIL: Towards a platform-neutral compute intermediate language for DSLs, 2nd Workshop on Domain-Specific Languages and High-Level Frameworks for High Performance Computing (WOLFHPC, associated with SC), 2012. ,
URL : https://hal.archives-ouvertes.fr/hal-00786828
StarPU: a unified platform for task scheduling on heterogeneous multicore architectures, Concurrency and Computation: Practice and Experience, vol.23, issue.4, pp.187-198, 2011. ,
DOI : 10.1002/cpe.1631
URL : https://hal.archives-ouvertes.fr/inria-00384363
XKaapi: A Runtime System for Data-Flow Task Programming on Heterogeneous Architectures, 2013 IEEE 27th International Symposium on Parallel and Distributed Processing, pp.1299-1308, 2013. ,
DOI : 10.1109/IPDPS.2013.66
URL : https://hal.archives-ouvertes.fr/hal-00799904