High Performance Fortran Compilation Techniques for Parallelizing Scientific Codes, Proceedings of the IEEE/ACM SC98 Conference, 1998. ,
DOI : 10.1109/SC.1998.10034
Global optimizations for parallelism and locality on scalable parallel machines, The ACM SIGPLAN '93 Conference on Programming Language Design and Implementation, pp.112-125, 1993. ,
Optimal semi-oblique tiling, IEEE Transactions on Parallel and Distributed Systems, vol.14, issue.9, pp.944-960, 2003. ,
DOI : 10.1109/TPDS.2003.1233716
Tiling Stencil Computations to Maximize Parallelism, International Conference on High Performance Computing, Networking, Storage and Analysis, SC '12, pp.1-40 ,
Code generation in the polyhedral model is easier than you think, Proceedings of the 13th International Conference on Parallel Architecture and Compilation Techniques, PACT 2004, pp.7-16, 2004. ,
DOI : 10.1109/PACT.2004.1342537
URL : https://hal.archives-ouvertes.fr/hal-00017260
Parameterized Diamond Tiling for Stencil Computations with Chapel parallel iterators, Proceedings of the 29th ACM on International Conference on Supercomputing, ICS '15, pp.197-206, 2015. ,
DOI : 10.1145/2751205.2751226
Pluto: A practical and fully automatic polyhedral program optimization system, ACM Conference on Programming Language Design and Implementation, pp.101-113, 2008. ,
Estimating interlock and improving balance for pipelined architectures, Journal of Parallel and Distributed Computing, vol.5, issue.4, pp.334-358, 1988. ,
DOI : 10.1016/0743-7315(88)90002-0
Constructive methods for scheduling uniform loop nests, IEEE Transactions on Parallel and Distributed Systems, vol.5, issue.8, pp.814-822, 1994. ,
DOI : 10.1109/71.298207
URL : https://hal.archives-ouvertes.fr/hal-00857083
Affine-by-Statement Scheduling of Uniform and Affine Loop Nests over Parametric Domains, Journal of Parallel and Distributed Computing, vol.29, issue.1, pp.43-59, 1995. ,
DOI : 10.1006/jpdc.1995.1105
URL : https://hal.archives-ouvertes.fr/hal-00857091
Automatic tiling of "mostly-tileable" loop nests, IMPACT 2015: 4th International Workshop on Polyhedral Compilation Techniques, 2015. ,
In-Core Optimization of High-Order Stencil Computations, 2009. ,
Parametric integer programming, RAIRO - Operations Research, vol.22, issue.3, pp.243-268, 1988. ,
DOI : 10.1051/ro/1988220302431
Dataflow analysis of array and scalar references, International Journal of Parallel Programming, vol.24, issue.4, pp.23-53, 1991. ,
DOI : 10.1007/BF01407931
Some efficient solutions to the affine scheduling problem. Part I. One-dimensional time, International Journal of Parallel Programming, vol.21, issue.5, pp.313-347, 1992. ,
DOI : 10.1007/BF01407835
Some efficient solutions to the affine scheduling problem. Part II. Multidimensional time, International Journal of Parallel Programming, vol.21, issue.6, pp.389-420, 1992. ,
DOI : 10.1007/BF01379404
An efficient code generation technique for tiled iteration spaces, IEEE Transactions on Parallel and Distributed Systems, vol.14, issue.10, pp.1021-1034, 2003. ,
Hybrid Hexagonal/Classical Tiling for GPUs, Proceedings of Annual IEEE/ACM International Symposium on Code Generation and Optimization, CGO '14, p.66, 2014. ,
DOI : 10.1145/2581122.2544160
URL : https://hal.archives-ouvertes.fr/hal-00911177
The Relation Between Diamond Tiling and Hexagonal Tiling, Parallel Processing Letters, vol.24, issue.03, 2014. ,
DOI : 10.1142/S0129626414410023
URL : https://hal.archives-ouvertes.fr/hal-01257249
Parametric multi-level tiling of imperfectly nested loops, Proceedings of the 23rd International Conference on Supercomputing, ICS '09, pp.147-157, 2009. ,
DOI : 10.1145/1542275.1542301
URL : https://hal.archives-ouvertes.fr/hal-00645328
DynTile: Parametric tiled loop generation for parallel execution on multicore processors, 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS), pp.1-12, 2010. ,
DOI : 10.1109/IPDPS.2010.5470459
A stencil compiler for short-vector SIMD architectures, Proceedings of the 27th International ACM Conference on Supercomputing, ICS '13, pp.13-24, 2013. ,
DOI : 10.1145/2464996.2467268
Supernode partitioning, Proceedings of the 15th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL '88, pp.319-328, 1988. ,
DOI : 10.1145/73560.73588
Parameterized and Multi-Level Tiled Loop Generation, 2010. ,
A feasibility study in iterative compilation, Proceedings of the Second International Symposium on High Performance Computing, ISHPC '99, pp.121-132, 1999. ,
DOI : 10.1007/BFb0094916
Combined selection of tile sizes and unroll factors using iterative compilation, Proceedings of the 2000 International Conference on Parallel Architectures and Compilation Techniques, pp.237-246, 2000. ,
DOI : 10.1109/PACT.2000.888348
Parametric GPU Code Generation for Affine Loop Programs, Languages and Compilers for Parallel Computing - 26th International Workshop, pp.136-151, 2013. ,
DOI : 10.1007/978-3-319-09967-5_8
Effective Automatic Parallelization of Stencil Computations, Proceedings of the 28th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI '07, pp.235-244, 2007. ,
Un environnement de transformations de programmes pour la synthèse d'architectures régulières, 1992. ,
ALPHA: un langage équationnel pour la conception et la programmation d'architectures parallèles synchrones, 1989. ,
Optimizing memory usage in the polyhedral model, ACM Transactions on Programming Languages and Systems, vol.22, issue.5, pp.773-815, 2000. ,
The mapping of linear recurrence equations on regular arrays, Journal of VLSI signal processing systems for signal, image and video technology, vol.88, issue.1, pp.95-113, 1989. ,
DOI : 10.1007/BF02477176
URL : https://hal.archives-ouvertes.fr/inria-00075466
On synthesizing systolic arrays from Recurrence Equations with Linear Dependencies, Proceedings, Sixth Conference on Foundations of Software Technology and Theoretical Computer Science, pp.488-503, 1986. ,
DOI : 10.1007/3-540-17179-7_30
Tiling multidimensional iteration spaces for multicomputers, Journal of Parallel and Distributed Computing, vol.16, issue.2, pp.108-120, 1992. ,
DOI : 10.1016/0743-7315(92)90027-K
Detection of scans in the polytope model, Parallel Algorithms and Applications, pp.229-263, 2000. ,
Parameterized tiled loops for free, PLDI, pp.405-414, 2007. ,
Michelle Mills Strout, and Sanjay Rajopadhye. Parameterized loop tiling, ACM Transactions on Programming Languages and Systems (TOPLAS), vol.34, issue.1, p.3, 2012. ,
Automatic blocking of nested loops, 1990. ,
Cache Accurate Time Skewing in Iterative Stencil Computations, 2011 International Conference on Parallel Processing, 2011. ,
DOI : 10.1109/ICPP.2011.47
Roofline, Communications of the ACM, vol.52, issue.4, pp.65-76, 2009. ,
DOI : 10.1145/1498765.1498785
A loop transformation theory and an algorithm to maximize parallelism, IEEE Transactions on Parallel and Distributed Systems, vol.2, issue.4, pp.452-471, 1991. ,
DOI : 10.1109/71.97902
A data locality optimizing algorithm, ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), 1991. ,
Iteration space tiling for memory hierarchies, pp.357-361, 1987. ,
Time skewing for parallel computers, LCPC 1999: 12th International Workshop on Languages and Compilers for Parallel Computing, 1999. ,
Achieving scalable locality with time skewing, International Journal of Parallel Programming, vol.30, issue.3, pp.181-221, 2002. ,
DOI : 10.1023/A:1015460304860
Loop Tiling for Parallelism, 2000. ,
DOI : 10.1007/978-1-4615-4337-4
Automatic Energy Efficient Parallelization of Uniform Dependence Computations, Proceedings of the 29th ACM on International Conference on Supercomputing, ICS '15, 2015. ,
DOI : 10.1145/2751205.2751245