P. Wu, A. E. Eichenberger, A. Wang, and P. Zhao, An integrated simdization framework using virtual vectors, Proceedings of the 19th annual international conference on Supercomputing, ICS '05, 2005.
DOI : 10.1145/1088149.1088172

A. J. Bik, The Software Vectorization Handbook. Applying Multimedia Extensions for Maximum Performance, 2004.

A. J. Bik, M. Girkar, P. M. Grey, and X. Tian, Automatic intra-register vectorization for the Intel architecture, IJPP, vol.30, issue.2, pp.65-98, 2002.

D. Nuzman and A. Zaks, Autovectorization in GCC - two years later, GCC Developers' Summit, 2006.

J. Shin, M. Hall, and J. Chame, Superword-level parallelism in the presence of control flow, CGO, 2005.

D. Nuzman and A. Zaks, Outer-loop vectorization - revisited for short SIMD architectures, PACT, 2008.
DOI : 10.1145/1454115.1454119

R. Allen and K. Kennedy, Optimizing Compilers for Modern Architectures, 2001.

U. Bondhugula, A. Hartono, J. Ramanujam, and P. Sadayappan, A practical automatic polyhedral parallelization and locality optimization system, PLDI, 2008.

L. Pouchet, C. Bastoul, A. Cohen, and J. Cavazos, Iterative optimization in the polyhedral model: Part II, multidimensional time, PLDI, 2008.
URL : https://hal.archives-ouvertes.fr/hal-01257273

R. Allen and K. Kennedy, Automatic translation of FORTRAN programs to vector form, ACM Transactions on Programming Languages and Systems, vol.9, issue.4, pp.491-542, 1987.
DOI : 10.1145/29873.29875

M. Wolfe, High Performance Compilers for Parallel Computing, 1996.

J. Shin, J. Chame, and M. W. Hall, Compiler-controlled caching in superword register files for multimedia extension architectures, PACT, 2002.

D. Nuzman, I. Rosen, and A. Zaks, Auto-vectorization of interleaved data for SIMD, PLDI, 2006.

C. Cascaval, L. Derose, D. A. Padua, and D. A. Reed, Compile-Time Based Performance Prediction, LCPC, 1999.
DOI : 10.1007/3-540-44905-1_23
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.46.9836

B. B. Fraguela, R. Doallo, and E. L. Zapata, Probabilistic miss equations: evaluating memory hierarchy performance, IEEE Transactions on Computers, vol.52, issue.3, pp.321-336, 2003.
DOI : 10.1109/TC.2003.1183947

P. Feautrier, Array expansion, ICS, 1988.
DOI : 10.1145/2591635.2667159
URL : https://hal.archives-ouvertes.fr/hal-01099746

S. Girbal, N. Vasilache, C. Bastoul, A. Cohen, D. Parello et al., Semi-Automatic Composition of Loop Transformations for Deep Parallelism and Memory Hierarchies, International Journal of Parallel Programming, vol.34, issue.3, pp.261-317, 2006.
DOI : 10.1007/s10766-006-0012-3
URL : https://hal.archives-ouvertes.fr/hal-01257288

P. Feautrier, Some efficient solutions to the affine scheduling problem. Part II. Multidimensional time, International Journal of Parallel Programming, vol.21, issue.6, pp.389-420, 1992.
DOI : 10.1007/BF01379404

W. Kelly and W. Pugh, A framework for unifying reordering transformations, 1993.

A. Lim and M. Lam, Maximizing parallelism and minimizing synchronization with affine transforms, Proceedings of the 24th ACM SIGPLAN-SIGACT symposium on Principles of programming languages, POPL '97, pp.201-214, 1997.
DOI : 10.1145/263699.263719

C. Bastoul, Code generation in the polyhedral model is easier than you think, Proceedings of the 13th International Conference on Parallel Architecture and Compilation Techniques, PACT 2004, 2004.
DOI : 10.1109/PACT.2004.1342537
URL : https://hal.archives-ouvertes.fr/hal-00017260