N. Ahmed, N. Mateev, and K. Pingali, Tiling Imperfectly-nested Loop Nests, ACM/IEEE SC 2000 Conference (SC'00), 2000.
DOI : 10.1109/SC.2000.10018
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.22.491

N. Ahmed, N. Mateev, and K. Pingali, Synthesizing transformations for locality enhancement of imperfectly-nested loop nests, International Journal of Parallel Programming, vol.29, issue.5, 2001.

C. Ancourt and F. Irigoin, Scanning polyhedra with do loops, PPoPP'91, pp.39-50, 1991.
URL : https://hal.archives-ouvertes.fr/hal-00752774

C. Bastoul, Efficient code generation for automatic parallelization and optimization, Proceedings of the Second International Symposium on Parallel and Distributed Computing, p.23, 2003.
DOI : 10.1109/ISPDC.2003.1267639

C. Bastoul, Code generation in the polyhedral model is easier than you think, Proceedings of the 13th International Conference on Parallel Architecture and Compilation Techniques (PACT 2004), pp.7-16, 2004.
DOI : 10.1109/PACT.2004.1342537
URL : https://hal.archives-ouvertes.fr/hal-00017260

U. Bondhugula, A. Hartono, J. Ramanujam, and P. Sadayappan, A practical automatic polyhedral program optimization system, PLDI'08, 2008.

P. Boulet, A. Darte, T. Risset, and Y. Robert, (Pen)-ultimate tiling?, Integration, the VLSI Journal, pp.33-51, 1994.

S. Carr and K. Kennedy, Compiler blockability of numerical algorithms, Proceedings Supercomputing '92, pp.114-124, 1992.
DOI : 10.1109/SUPERC.1992.236704

C. Chen, J. Chame, and M. Hall, Combining Models and Guided Empirical Search to Optimize for Multiple Levels of the Memory Hierarchy, International Symposium on Code Generation and Optimization, 2005.
DOI : 10.1109/CGO.2005.10

C. Chen, J. Chame, and M. Hall, CHiLL: A framework for composing high-level loop transformations, 2008.

CLooG, The Chunky Loop Generator

S. Coleman and K. McKinley, Tile Size Selection Using Cache Organization and Data Layout, PLDI'95, pp.279-290, 1995.

P. Feautrier, Dataflow analysis of array and scalar references, International Journal of Parallel Programming, vol.20, issue.1, pp.23-53, 1991.
DOI : 10.1007/BF01407931

G. Goumas, M. Athanasaki, and N. Koziris, An efficient code generation technique for tiled iteration spaces, IEEE Transactions on Parallel and Distributed Systems, vol.14, issue.10, pp.1021-1034, 2003.
DOI : 10.1109/TPDS.2003.1239870

A. Hartono, M. Baskaran, C. Bastoul, A. Cohen, S. Krishnamoorthy et al., PrimeTile: A parametric multi-level tiler for imperfect loop nests, 2009.

HiTLOG, Hierarchical Tiled Loop Generator. Available at http://www.cs.colostate

K. Hogstedt, L. Carter, and J. Ferrante, Selecting tile shape for minimal execution time, Proceedings of the eleventh annual ACM symposium on Parallel algorithms and architectures, SPAA '99, pp.201-211, 1999.
DOI : 10.1145/305619.305641

F. Irigoin and R. Triolet, Supernode partitioning, Proceedings of the 15th ACM SIGPLAN-SIGACT symposium on Principles of programming languages, POPL '88, pp.319-329, 1988.
DOI : 10.1145/73560.73588

M. Jiménez, J. Llabería, and A. Fernández, Register tiling in nonrectangular iteration spaces, ACM Transactions on Programming Languages and Systems, vol.24, issue.4, pp.409-453, 2002.
DOI : 10.1145/567097.567101

M. Jiménez, J. Llabería, and A. Fernández, A cost-effective implementation of multilevel tiling, IEEE Transactions on Parallel and Distributed Systems, vol.14, issue.10, pp.1006-1020, 2003.
DOI : 10.1109/TPDS.2003.1239869

D. Kim and S. Rajopadhye, Parameterized tiling for imperfectly nested loops, 2009.

D. Kim, L. Renganarayanan, M. Strout, and S. Rajopadhye, Multi-level tiling, Proceedings of the 2007 ACM/IEEE conference on Supercomputing, SC '07, 2007.
DOI : 10.1145/1362622.1362691

A. Lim, G. Cheong, and M. Lam, An affine partitioning algorithm to maximize parallelism and minimize communication, Proceedings of the 13th international conference on Supercomputing, ICS '99, pp.228-237, 1999.
DOI : 10.1145/305138.305197

A. Lim and M. Lam, Maximizing parallelism and minimizing synchronization with affine partitions, Parallel Computing, vol.24, issue.3-4, pp.445-475, 1998.
DOI : 10.1016/S0167-8191(98)00021-0

A. Lim, S. Liao, and M. Lam, Blocking and array contraction across arbitrarily nested loops using affine partitioning, PPoPP'01, 2001.

PLUTO, A polyhedral automatic parallelizer and locality optimizer for multicores

W. Pugh, The Omega test: a fast and practical integer programming algorithm for dependence analysis, Proceedings of the 1991 ACM/IEEE conference on Supercomputing, Supercomputing '91, pp.102-114, 1992.
DOI : 10.1145/125826.125848

F. Quilleré, S. Rajopadhye, and D. Wilde, Generation of efficient nested loops from polyhedra, International Journal of Parallel Programming, vol.28, issue.5, pp.469-498, 2000.

J. Ramanujam and P. Sadayappan, Tiling multidimensional iteration spaces for multicomputers, Journal of Parallel and Distributed Computing, vol.16, issue.2, pp.108-120, 1992.
DOI : 10.1016/0743-7315(92)90027-K

L. Renganarayana, D. Kim, S. Rajopadhye, and M. Strout, Parameterized tiled loops for free, PLDI'07, pp.405-414, 2007.

L. Renganarayana and S. Rajopadhye, A Geometric Programming Framework for Optimal Multi-Level Tiling, Proceedings of the ACM/IEEE SC2004 Conference, 2004.
DOI : 10.1109/SC.2004.3

G. Rivera and C. Tseng, Locality optimizations for multi-level caches, Proceedings of the 1999 ACM/IEEE conference on Supercomputing (CDROM), Supercomputing '99, 1999.
DOI : 10.1145/331532.331534

R. Schreiber and J. Dongarra, Automatic blocking of nested loops, 1990.

Y. Song and Z. Li, New tiling techniques to improve cache temporal locality, PLDI, pp.215-228, 1999.
DOI : 10.1145/301631.301668
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.37.3631

A. Tiwari, C. Chen, J. Chame, M. Hall, and J. Hollingsworth, Scalable autotuning framework for compiler optimization, IPDPS '09, 2009.

TLoG, A Parametrized Tiled Loop Generator. Available at http://www.cs.colostate

R. Whaley and J. Dongarra, Automatically Tuned Linear Algebra Software (ATLAS), Proc. Supercomputing '98, 1998.

R. Whaley, A. Petitet, and J. Dongarra, Automated Empirical Optimizations of Software and the ATLAS Project, Parallel Computing Journal, 2000.

R. Whaley and D. Whalley, Tuning High Performance Kernels through Empirical Compilation, 2005 International Conference on Parallel Processing (ICPP'05), pp.89-98, 2005.
DOI : 10.1109/ICPP.2005.77
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.297.6378

J. Xue, Loop tiling for parallelism, 2000.
DOI : 10.1007/978-1-4615-4337-4

Q. Yi, K. Kennedy, and V. Adve, Transforming Complex Loop Nests for Locality, The Journal of Supercomputing, vol.27, issue.3, pp.219-264, 2004.
DOI : 10.1023/B:SUPE.0000011386.69245.f5