V. Adve, G. Jin, J. Mellor-Crummey, and Q. Yi, High Performance Fortran Compilation Techniques for Parallelizing Scientific Codes, Proceedings of the IEEE/ACM SC98 Conference, 1998.
DOI : 10.1109/SC.1998.10034

J. Anderson and M. Lam, Global optimizations for parallelism and locality on scalable parallel machines, The ACM SIGPLAN '93 Conference on Programming Language Design and Implementation, pp.112-125, 1993.

R. Andonov, S. Balev, S. Rajopadhye, and N. Yanev, Optimal semi-oblique tiling, IEEE Transactions on Parallel and Distributed Systems, vol.14, issue.9, pp.944-960, 2003.
DOI : 10.1109/TPDS.2003.1233716

V. Bandishti, I. Pananilath, and U. Bondhugula, Tiling Stencil Computations to Maximize Parallelism, Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, SC '12, pp.1-40, 2012.

C. Bastoul, Code generation in the polyhedral model is easier than you think, Proceedings of the 13th International Conference on Parallel Architecture and Compilation Techniques, PACT 2004, pp.7-16, 2004.
DOI : 10.1109/PACT.2004.1342537

URL : https://hal.archives-ouvertes.fr/hal-00017260

I. J. Bertolacci, C. Olschanowsky, B. Harshbarger, B. L. Chamberlain, D. G. Wonnacott et al., Parameterized Diamond Tiling for Stencil Computations with Chapel parallel iterators, Proceedings of the 29th ACM on International Conference on Supercomputing, ICS '15, pp.197-206, 2015.
DOI : 10.1145/2751205.2751226

U. Bondhugula, A. Hartono, J. Ramanujam, and P. Sadayappan, Pluto: A practical and fully automatic polyhedral program optimization system, ACM Conference on Programming Language Design and Implementation, pp.101-113, 2008.

D. Callahan, J. Cocke, and K. Kennedy, Estimating interlock and improving balance for pipelined architectures, Journal of Parallel and Distributed Computing, vol.5, issue.4, pp.334-358, 1988.
DOI : 10.1016/0743-7315(88)90002-0

A. Darte and Y. Robert, Constructive methods for scheduling uniform loop nests, IEEE Transactions on Parallel and Distributed Systems, vol.5, issue.8, pp.814-822, 1994.
DOI : 10.1109/71.298207

URL : https://hal.archives-ouvertes.fr/hal-00857083

A. Darte and Y. Robert, Affine-by-Statement Scheduling of Uniform and Affine Loop Nests over Parametric Domains, Journal of Parallel and Distributed Computing, vol.29, issue.1, pp.43-59, 1995.
DOI : 10.1006/jpdc.1995.1105

URL : https://hal.archives-ouvertes.fr/hal-00857091

A. Lake, D. Wonnacott, and T. Jin, Automatic tiling of "mostly-tileable" loop nests, IMPACT 2015: 5th International Workshop on Polyhedral Compilation Techniques, 2015.

H. Dursun, K. Nomura, W. Wang, M. Kunaseth, L. Peng et al., In-Core Optimization of High-Order Stencil Computations, 2009.

P. Feautrier, Parametric integer programming, RAIRO - Operations Research, vol.22, issue.3, pp.243-268, 1988.
DOI : 10.1051/ro/1988220302431

P. Feautrier, Dataflow analysis of array and scalar references, International Journal of Parallel Programming, vol.20, issue.1, pp.23-53, 1991.
DOI : 10.1007/BF01407931

P. Feautrier, Some efficient solutions to the affine scheduling problem. I. One-dimensional time, International Journal of Parallel Programming, vol.21, issue.5, pp.313-347, 1992.
DOI : 10.1007/BF01407835

P. Feautrier, Some efficient solutions to the affine scheduling problem. Part II. Multidimensional time, International Journal of Parallel Programming, vol.21, issue.6, pp.389-420, 1992.
DOI : 10.1007/BF01379404

G. Goumas, M. Athanasaki, and N. Koziris, An efficient code generation technique for tiled iteration spaces, IEEE Transactions on Parallel and Distributed Systems, vol.14, issue.10, pp.1021-1034, 2003.

T. Grosser, A. Cohen, J. Holewinski, P. Sadayappan, and S. Verdoolaege, Hybrid Hexagonal/Classical Tiling for GPUs, Proceedings of Annual IEEE/ACM International Symposium on Code Generation and Optimization, CGO '14, p.66, 2014.
DOI : 10.1145/2581122.2544160

URL : https://hal.archives-ouvertes.fr/hal-00911177

T. Grosser, S. Verdoolaege, A. Cohen, and P. Sadayappan, The Relation Between Diamond Tiling and Hexagonal Tiling, Parallel Processing Letters, vol.24, issue.03, 2014.
DOI : 10.1142/S0129626414410023

URL : https://hal.archives-ouvertes.fr/hal-01257249

A. Hartono, M. M. Baskaran, C. Bastoul, A. Cohen, S. Krishnamoorthy et al., Parametric multi-level tiling of imperfectly nested loops, Proceedings of the 23rd International Conference on Supercomputing, ICS '09, pp.147-157, 2009.
DOI : 10.1145/1542275.1542301

URL : https://hal.archives-ouvertes.fr/hal-00645328

A. Hartono, M. M. Baskaran, J. Ramanujam, and P. Sadayappan, DynTile: Parametric tiled loop generation for parallel execution on multicore processors, 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS), pp.1-12, 2010.
DOI : 10.1109/IPDPS.2010.5470459

T. Henretty, R. Veras, F. Franchetti, L. Pouchet, J. Ramanujam et al., A stencil compiler for short-vector SIMD architectures, Proceedings of the 27th International ACM Conference on Supercomputing, ICS '13, pp.13-24, 2013.
DOI : 10.1145/2464996.2467268

F. Irigoin and R. Triolet, Supernode partitioning, Proceedings of the 15th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL '88, pp.319-328, 1988.
DOI : 10.1145/73560.73588

D. Kim, Parameterized and Multi-Level Tiled Loop Generation, 2010.

T. Kisuki, P. M. W. Knijnenburg, M. F. P. O'Boyle, F. Bodin, and H. A. G. Wijshoff, A feasibility study in iterative compilation, Proceedings of the Second International Symposium on High Performance Computing, ISHPC '99, pp.121-132, 1999.
DOI : 10.1007/BFb0094916

T. Kisuki, P. M. W. Knijnenburg, M. F. P. O'Boyle, and F. Bodin, Combined selection of tile sizes and unroll factors using iterative compilation, Proceedings of the 2000 International Conference on Parallel Architectures and Compilation Techniques, pp.237-246, 2000.
DOI : 10.1109/PACT.2000.888348

A. Konstantinidis, P. H. J. Kelly, J. Ramanujam, and P. Sadayappan, Parametric GPU Code Generation for Affine Loop Programs, Languages and Compilers for Parallel Computing - 26th International Workshop, pp.136-151, 2013.
DOI : 10.1007/978-3-319-09967-5_8

S. Krishnamoorthy, M. Baskaran, U. Bondhugula, J. Ramanujam, A. Rountev et al., Effective Automatic Parallelization of Stencil Computations, Proceedings of the 28th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI '07, pp.235-244, 2007.

H. Le Verge, Un environnement de transformations de programmes pour la synthèse d'architectures régulières, 1992.

C. Mauras, ALPHA: un langage équationnel pour la conception et la programmation d'architectures parallèles synchrones, 1989.

F. Quilleré and S. Rajopadhye, Optimizing memory usage in the polyhedral model, ACM Transactions on Programming Languages and Systems, vol.22, issue.5, pp.773-815, 2000.

P. Quinton and V. Van Dongen, The mapping of linear recurrence equations on regular arrays, Journal of VLSI signal processing systems for signal, image and video technology, vol.1, issue.2, pp.95-113, 1989.
DOI : 10.1007/BF02477176

URL : https://hal.archives-ouvertes.fr/inria-00075466

S. V. Rajopadhye, S. Purushothaman, and R. M. Fujimoto, On synthesizing systolic arrays from Recurrence Equations with Linear Dependencies, Proceedings, Sixth Conference on Foundations of Software Technology and Theoretical Computer Science, pp.488-503, 1986.
DOI : 10.1007/3-540-17179-7_30

J. Ramanujam and P. Sadayappan, Tiling multidimensional iteration spaces for multicomputers, Journal of Parallel and Distributed Computing, vol.16, issue.2, pp.108-120, 1992.
DOI : 10.1016/0743-7315(92)90027-K

X. Redon and P. Feautrier, Detection of scans in the polytope model, Parallel Algorithms and Applications, pp.229-263, 2000.

L. Renganarayanan, D. Kim, S. V. Rajopadhye, and M. M. Strout, Parameterized tiled loops for free, PLDI, pp.405-414, 2007.

L. Renganarayanan, D. Kim, M. M. Strout, and S. Rajopadhye, Parameterized loop tiling, ACM Transactions on Programming Languages and Systems (TOPLAS), vol.34, issue.1, p.3, 2012.

R. Schreiber and J. Dongarra, Automatic blocking of nested loops, 1990.

R. Strzodka, M. Shaheen, D. Pajak, and H. Seidel, Cache Accurate Time Skewing in Iterative Stencil Computations, 2011 International Conference on Parallel Processing, 2011.
DOI : 10.1109/ICPP.2011.47

S. Williams, A. Waterman, and D. Patterson, Roofline: an insightful visual performance model for multicore architectures, Communications of the ACM, vol.52, issue.4, pp.65-76, 2009.
DOI : 10.1145/1498765.1498785

M. Wolf and M. Lam, A loop transformation theory and an algorithm to maximize parallelism, IEEE Transactions on Parallel and Distributed Systems, vol.2, issue.4, pp.452-471, 1991.
DOI : 10.1109/71.97902

M. E. Wolf and M. Lam, A data locality optimizing algorithm, ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), 1991.

M. J. Wolfe, Iteration space tiling for memory hierarchies, pp.357-361, 1987.

D. Wonnacott, Time skewing for parallel computers, LCPC 1999: 12th International Workshop on Languages and Compilers for Parallel Computing, 1999.

D. Wonnacott, Achieving scalable locality with time skewing, International Journal of Parallel Programming, vol.30, issue.3, pp.181-221, 2002.
DOI : 10.1023/A:1015460304860

J. Xue, Loop Tiling for Parallelism, 2000.
DOI : 10.1007/978-1-4615-4337-4

Y. Zou and S. Rajopadhye, Automatic Energy Efficient Parallelization of Uniform Dependence Computations, Proceedings of the 29th ACM on International Conference on Supercomputing, ICS '15, 2015.
DOI : 10.1145/2751205.2751245