R. Ao, G. Tan, C. , and M. , Parainsight: An assistant for quantitatively analyzing multi-granularity parallel region, High Performance Computing and Communications & 2013 IEEE International Conference on Embedded and Ubiquitous Computing (HPCC_EUC), p.698-707, 2013.

C. Bastoul, Generating loops for scanning polyhedra: CLooG user's guide, Polyhedron, vol.2, p.10, 2004.

F. Bellard, QEMU, a fast and portable dynamic translator, Proceedings of the Annual Conference on USENIX Annual Technical Conference, 2005.

E. Berg and E. Hagersten, Fast data-locality profiling of native execution, ACM SIGMETRICS Performance Evaluation Review, vol.33, p.169-180, 2005.

K. Beyls and E. Hollander, Discovery of locality-improving refactorings by reuse path analysis, p.220-229, 2006.

U. Bondhugula, A. Hartono, J. Ramanujam, and P. Sadayappan, A practical automatic polyhedral program optimization system, PLDI, 2008.

P. Boulet, A. Darte, G. Silber, V. , and F. , Loop parallelization algorithms: From parallelism extraction to code generation, Parallel Comput, vol.24, p.444, 1998.
URL : https://hal.archives-ouvertes.fr/inria-00565000

K. Butt, A. Qadeer, G. Mustafa, W. , and A. , Runtime analysis of application binaries for function level parallelism potential using QEMU, Open Source Systems and Technologies (ICOSST), 2012 International Conference on, p.33-39, 2012.

S. Che, M. Boyer, J. Meng, D. Tarjan, J. W. Sheaffer et al., Rodinia: A benchmark suite for heterogeneous computing, IEEE International Symposium on, 2009.

S. Che, J. W. Sheaffer, M. Boyer, L. G. Szafaryn, L. Wang et al., A characterization of the Rodinia benchmark suite with comparison to contemporary CMP workloads, Proceedings of the IEEE International Symposium on Workload Characterization (IISWC'10), p.1-11, 2010.

J. Collard, D. Barthou, and P. Feautrier, Fuzzy array dataflow analysis, Proceedings of the Fifth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, p.92-101, 1995.

K. Faxén, K. Popov, S. Jansson, A. , and L. , Embla-data dependence profiling for parallel programming, Proceedings of the 2008 International Conference on Complex, Intelligent and Software Intensive Systems, p.780-785, 2008.

P. Feautrier, Parametric integer programming, RAIRO-Operations Research, vol.22, p.243-268, 1988.

P. Feautrier and C. Lengauer, Polyhedron model, Encyclopedia of Parallel Computing, p.1581-1592, 2011.

T. Grosser, A. Groesslinger, and C. Lengauer, Polly-performing polyhedral optimizations on a low-level intermediate representation, Parallel Processing Letters, vol.22, p.1250010, 2012.

F. Gruber, M. Selva, D. Sampaio, C. Guillon, A. Moynault et al., Data-flow/dependence profiling for structured transformations, Submitted to the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP'19), 2019.

C. Guillon, Program instrumentation with QEMU, Proceedings of the International QEMU User's Forum, p.11, 2011.

J. Holewinski, R. Ramamurthi, M. Ravishankar, N. Fauzia, L. N. Pouchet et al., Dynamic trace-based analysis of vectorization potential of applications, ACM SIGPLAN Notices, vol.47, p.371-382, 2012.

A. Ketterlin, C. , and P. , Prediction and trace compression of data access addresses through nested loop recognition, Proceedings of the 6th Annual IEEE/ACM International Symposium on Code Generation and Optimization, p.94-103, 2008.
URL : https://hal.archives-ouvertes.fr/inria-00504597

A. Ketterlin, C. , and P. , Profiling data-dependence to assist parallelization: Framework, scope, and optimization, Proceedings of the 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture, p.437-448, 2012.

M. Kim, H. Kim, and C. Luk, Prospector: A dynamic data-dependence profiler to help parallel programming, HotPar'10: Proceedings of the USENIX workshop on Hot Topics in parallelism, 2010.

Z. Li, R. Atre, Z. Ul-Huda, A. Jannesari, W. et al., DiscoPoP: A profiling tool to identify parallelization opportunities, Tools for High Performance Computing, p.37-54, 2014.

X. Liu and J. Mellor-Crummey, Pinpointing data locality problems using data-centric analysis, Code Generation and Optimization (CGO), 2011 9th Annual IEEE/ACM International Symposium on, p.171-180, 2011.

G. Marin, J. Dongarra, and D. Terpstra, MIAMI: A framework for application performance diagnosis, 2014 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), 2014.

J. M. Martinez-Caamaño, M. Selva, P. Clauss, A. Baloian, and W. Wolff, Full runtime polyhedral optimizing loop transformations with the generation, instantiation, and scheduling of code-bones, Concurrency and Computation: Practice and Experience, vol.29, 2017.

N. Nethercote and A. Mycroft, Redux: A dynamic dataflow tracer, Electronic Notes in Theoretical Computer Science, vol.89, p.149-170, 2003.

S. Pop, A. Cohen, and G. Silber, Induction variable analysis with delayed abstractions, Proceedings of the First International Conference on High Performance Embedded Architectures and Compilers, 2005.
URL : https://hal.archives-ouvertes.fr/hal-01257294

G. Rodríguez, J. M. Andión, M. T. Kandemir, and J. Touriño, Trace-based affine reconstruction of codes, Proceedings of the 2016 International Symposium on Code Generation and Optimization, p.139-149, 2016.

A. Simbürger, S. Apel, A. Größlinger, and C. Lengauer, PolyJIT: Polyhedral optimization just in time, International Journal of Parallel Programming, 2018.

A. Sukumaran-Rajam, C. , and P. , The polyhedral model of nonlinear loops, ACM Trans. Archit. Code Optim., vol.12, p.27, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01244464

G. Tournavitis, F. , and B. , Semi-automatic extraction and exploitation of hierarchical pipeline parallelism using profiling information, Proceedings of the 19th International Conference on Parallel Architectures and Compilation Techniques, p.377-388, 2010.

K. Trifunovic, A. Cohen, D. Edelsohn, F. Li, T. Grosser et al., GRAPHITE Two Years After: First Lessons Learned From Real-World Polyhedral Compilation, GCC Research Opportunities Workshop (GROW'10), 2010.
URL : https://hal.archives-ouvertes.fr/inria-00551516

R. A. Van Engelen, Efficient symbolic analysis for optimizing compilers, International Conference on Compiler Construction, 2001.

H. Vandierendonck, S. Rul, and K. De Bosschere, The Paralax infrastructure: automatic parallelization with a helping hand, Parallel Architectures and Compilation Techniques (PACT), 2010 19th International Conference on, p.389-399, 2010.

Z. Wang, G. Tournavitis, B. Franke, and M. F. O'Boyle, Integrating profile-driven parallelism detection and machine-learning-based mapping, ACM Transactions on Architecture and Code Optimization (TACO), vol.11, 2014.