R. Allen and K. Kennedy, Optimizing Compilers for Modern Architectures, 2002.

J. Buck, S. Ha, E. A. Lee, and D. G. Messerschmitt, Ptolemy: A Framework for Simulating and Prototyping Heterogeneous Systems, Int. J. in Computer Simulation, vol.4, issue.2, pp.155-182, 1994.
DOI : 10.1016/B978-155860702-6/50048-X

S. Carr, C. Ding, and P. Sweany, Improving software pipelining with unroll-and-jam, Proceedings of HICSS-29: 29th Hawaii International Conference on System Sciences, 1996.
DOI : 10.1109/HICSS.1996.495462

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.1.9319

P. Carribault and A. Cohen, Applications of storage mapping optimization to register promotion, Proceedings of the 18th annual international conference on Supercomputing, ICS '04, pp.247-256, 2004.
DOI : 10.1145/1006209.1006244

F. Catthoor, S. Wuytack, E. De Greef, F. Balasa, L. Nachtergaele et al., Custom memory management methodology, 1998.
DOI : 10.1007/978-1-4757-2849-1

A. Darte, G. Silber, and F. Vivien, Combining Retiming and Scheduling Techniques for Loop Parallelization and Loop Tiling, Parallel Processing Letters, vol.07, issue.04, pp.379-392, 1997.
DOI : 10.1142/S0129626497000383

URL : https://hal.archives-ouvertes.fr/hal-00856890

J. C. Dehnert, P. Y. Hsu, and J. P. Bratt, Overlapped loop support in the Cydra 5, Intl. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS'89), pp.26-38, 1989.

A. Douillet and G. R. Gao, Software-Pipelining on Multi-Core Architectures, 16th International Conference on Parallel Architecture and Compilation Techniques (PACT 2007), 2007.
DOI : 10.1109/PACT.2007.4336198

C. Dulong, R. Krishnaiyer, D. Kulkarni, D. Lavery, W. Li et al., An overview of the Intel IA-64 compiler, Intel Technical Journal, p.4, 1999.

E. A. Lee and D. G. Messerschmitt, Static Scheduling of Synchronous Data Flow Programs for Digital Signal Processing, IEEE Transactions on Computers, vol.36, issue.1, pp.24-35, 1987.
DOI : 10.1109/TC.1987.5009446

P. Feautrier, Array expansion, Intl. Conf. on Supercomputing (ICS'88), pp.429-441, 1988.
URL : https://hal.archives-ouvertes.fr/hal-01099746

G. R. Gao, R. Olsen, V. Sarkar, and R. Thekkath, Collective loop fusion for array contraction, Fifth Workshop on Languages and Compilers for Parallel Computing (LCPC'92), pp.281-295, 1992.
DOI : 10.1007/3-540-57502-2_53

S. Girbal, N. Vasilache, C. Bastoul, A. Cohen, D. Parello et al., Semi-Automatic Composition of Loop Transformations for Deep Parallelism and Memory Hierarchies, International Journal of Parallel Programming, vol.20, issue.1, 2006.
DOI : 10.1007/s10766-006-0012-3

URL : https://hal.archives-ouvertes.fr/hal-01257288

M. I. Gordon, W. Thies, and S. Amarasinghe, Exploiting coarse-grained task, data, and pipeline parallelism in stream programs, Intl. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS'06), 2006.

M. Karczmarek, W. Thies, and S. Amarasinghe, Phased scheduling of stream programs, LCTES'03, 2003.

C. E. Leiserson and J. B. Saxe, Retiming synchronous circuitry, Algorithmica, vol.9, issue.1, pp.5-35, 1991.
DOI : 10.1007/BF01759032

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.368.3222

D. E. Maydan, S. P. Amarasinghe, and M. S. Lam, Array dataflow analysis and its use in array privatization, Principles of Programming Languages (PoPL'93), pp.2-15, 1993.

K. McKinley, S. Carr, and C. Tseng, Improving data locality with loop transformations, ACM Transactions on Programming Languages and Systems, vol.18, issue.4, pp.424-453, 1996.
DOI : 10.1145/233561.233564

C. McNairy and D. Soltis, Itanium 2 processor microarchitecture, IEEE Micro, vol.23, issue.2, pp.44-55, 2003.
DOI : 10.1109/MM.2003.1196114

A. Moonen, M. Bekooij, and J. van Meerbergen, Timing analysis model for network based multiprocessor systems, Proc. of ProRISC, 15th annual Workshop of Circuits, System and Signal Processing, pp.91-99, 2004.

R. Parra-Hernandez and N. J. Dimopoulos, A new heuristic for solving the multichoice multidimensional knapsack problem, IEEE Transactions on Systems, Man, and Cybernetics - Part A: Systems and Humans, vol.35, issue.5, 2005.

J. Puchinger, G. R. Raidl, and U. Pferschy, The Multidimensional Knapsack Problem: Structure and Algorithms, INFORMS Journal on Computing, vol.22, issue.2, 2007.
DOI : 10.1287/ijoc.1090.0344

URL : https://hal.archives-ouvertes.fr/hal-01224914

B. R. Rau, Iterative modulo scheduling, Proceedings of the 27th annual international symposium on Microarchitecture, MICRO 27, pp.63-74, 1994.
DOI : 10.1145/192724.192731

H. Rong, Z. Tang, R. Govindarajan, A. Douillet, and G. R. Gao, Single-dimension software pipelining for multi-dimensional loops, Proceedings of the International Symposium on Code generation and Optimization, 2004.

M. M. Strout, L. Carter, J. Ferrante, and B. Simon, Schedule-independent storage mapping for loops, Intl. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS'98), 1998.

S. Touati and C. Eisenbeis, Early Control of Register Pressure for Software Pipelined Loops, Proceedings of the International Conference on Compiler Construction (CC).

P. Tu and D. Padua, Automatic array privatization, Languages and Compilers for Parallel Computers (LCPC'93), number 768 in LNCS, pp.500-521, 1993.
DOI : 10.1007/3-540-45403-9_8

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.3.5746

N. Vasilache, A. Cohen, and L. Pouchet, Automatic Correction of Loop Transformations, 16th International Conference on Parallel Architecture and Compilation Techniques (PACT 2007), 2007.
DOI : 10.1109/PACT.2007.4336220

URL : https://hal.archives-ouvertes.fr/hal-01257283

S. Verdoolaege, M. Bruynooghe, G. Janssens, and F. Catthoor, Multidimensional incremental loop fusion for data locality, ASAP, pp.17-27, 2003.