B. Bramas, O. Coulaud, and G. Sylvand, Time-Domain BEM for the Wave Equation: Optimization and Hybrid Parallelization, Euro-Par 2014, p.511
DOI : 10.1007/978-3-319-09873-9_43
URL : https://hal.archives-ouvertes.fr/hal-01063427

Y. J. Liu, S. Mukherjee, N. Nishimura, M. Schanz, W. Ye et al., Recent Advances and Emerging Applications of the Boundary Element Method, Applied Mechanics Reviews, vol.64, issue.3, pp.1-38, 2011.
DOI : 10.1115/1.4005491
URL : https://hal.archives-ouvertes.fr/hal-01401752

I. Terrasse, Résolution mathématique et numérique des équations de Maxwell instationnaires par une méthode de potentiels retardés, 1993.

T. Abboud, M. Pallud, and C. Teissedre, SONATE: a Parallel Code for Acoustics.

F. Q. Hu, An efficient solution of time domain boundary integral equations for acoustic scattering and its acceleration by Graphics Processing Units, AIAA conference paper, doi:10.2514/…, 2013.

S. Langer and M. Schanz, Time Domain Boundary Element Method, in Computational Acoustics of Noise Propagation in Fluids - Finite and Boundary Element Methods, pp.495-516, 2008.

T. Takahashi, A Time-domain BIEM for Wave Equation accelerated by Fast Multipole Method using Interpolation, pp.191-192, doi:10.1115/…400549.

M. M. Baskaran and R. Bordawekar, Optimizing Sparse Matrix-Vector Multiplication on GPUs, IBM Research Report, pp.812-859, 2008.

M. Garland, Sparse matrix computations on manycore GPU's, Proceedings of the 45th annual conference on Design automation, DAC '08, pp.2-6, 2008.
DOI : 10.1145/1391469.1391473

N. Bell and M. Garland, Implementing sparse matrix-vector multiplication on throughput-oriented processors, Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis (SC '09), 2009.
DOI : 10.1145/1654059.1654078
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.174.1350

A. Monakov, A. Lokhmotov, and A. Avetisyan, Automatically Tuning Sparse Matrix-Vector Multiplication for GPU Architectures, High Performance Embedded Architectures and Compilers, Lecture Notes in Computer Science, pp.111-125, 2010.

C. Jhurani and P. Mullowney, A GEMM interface and implementation on NVIDIA GPUs for multiple small matrices, arXiv:1304.7053, 2013.

K. Goto and R. van de Geijn, High-Performance Implementation of the Level-3 BLAS, pp.1-17, 2006.

G. M. Morton, A Computer Oriented Geodetic Data Base and a New Technique in File Sequencing, International Business Machines Company, 1966.

D. Hilbert, Ueber die stetige Abbildung einer Linie auf ein Flächenstück, Mathematische Annalen, vol.38, issue.3, pp.459-460, 1891.
DOI : 10.1007/BF01199431

A. Pinar and M. T. Heath, Improving performance of sparse matrix-vector multiplication, Proceedings of the 1999 ACM/IEEE conference on Supercomputing (CDROM), Supercomputing '99, 1999.
DOI : 10.1145/331532.331562

M. Snir, S. Otto et al., MPI: The Complete Reference, Volume 1, The MPI Core, 1998.