B. Roux, M. Gautier, O. Sentieys, and S. Derrien, Communication-Based Power Modelling for Heterogeneous Multiprocessor Architectures, IEEE 10th International Symposium on Embedded Multicore/Many-core Systems-on-Chip, 2016.

B. Roux, M. Gautier, O. Sentieys, and J.-P. Delahaye, Fast and Energy-Driven Design Space Exploration for Heterogeneous Architectures, IEEE 25th International Symposium on Field-Programmable Custom Computing Machines (FCCM), Poster, 2017.

A. Bogliolo, L. Benini, and G. De Micheli, Regression-based RTL power modeling, ACM Transactions on Design Automation of Electronic Systems, 2000.

G. M. Amdahl, Validity of the single processor approach to achieving large scale computing capabilities, Proceedings of the April 18-20, 1967, Spring Joint Computer Conference, AFIPS '67 (Spring), 1967.
DOI : 10.1145/1465482.1465560

M. Ammar, M. Baklouti, M. Pelcat, K. Desnos, and M. Abid, On Exploiting Energy-Aware Scheduling Algorithms for MDE-Based Design Space Exploration of MP2SoC, 2016 24th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP), 2016.
DOI : 10.1109/PDP.2016.110

URL : https://hal.archives-ouvertes.fr/hal-01305971

C. André, J. Deantoni, F. Mallet, and R. de Simone, The Time Model of Logical Clocks available in the OMG MARTE profile, In: Synthesis of Embedded Software, 2010.

Manikandan, S. Aiswarya et al., Design of Efficient Linear Feedback Shift Register for BCH Encoder, Journal of Electrical & Electronic Systems, 2015.

K. Asanovic, R. Bodik, B. C. Catanzaro, J. J. Gebis, P. Husbands et al., The landscape of parallel computing research: A view from Berkeley, 2006.

An Efficient Framework for Power-Aware Design of Heterogeneous MPSoC, IEEE Transactions on Industrial Informatics.

G. Booch, I. Jacobson, and J. Rumbaugh, The unified modeling language, Unix Review, 1996.

F. Balarin, Hardware-software co-design of embedded systems: the POLIS approach, 1997.
DOI : 10.1007/978-1-4615-6127-9

C. Bastoul, Code generation in the polyhedral model is easier than you think, Proceedings of the 13th International Conference on Parallel Architecture and Compilation Techniques (PACT 2004), 2004.
DOI : 10.1109/PACT.2004.1342537

URL : https://hal.archives-ouvertes.fr/hal-00017260

F. Bellard and C., Official QEMU git repository.

F. Bellard, QEMU, a Fast and Portable Dynamic Translator, USENIX Annual Technical Conference, 2005.

The Spice page.

D. F. Bacon, S. L. Graham, and O. J. Sharp, Compiler transformations for high-performance computing, ACM Computing Surveys (CSUR), 1994.

C. Bienia, S. Kumar, J. P. Singh, and K. Li, The PARSEC benchmark suite, Proceedings of the 17th International Conference on Parallel Architectures and Compilation Techniques, PACT '08, 2008.
DOI : 10.1145/1454115.1454128

D. C. Black, J. Donovan, B. Bunton, and A. Keist, SystemC: From the Ground Up, Second Edition.

H. Blume et al., Hybrid functional- and instruction-level power modeling for embedded and heterogeneous processor architectures, Journal of Systems Architecture, 2007.

B. Bougard, B. De Sutter, D. Verkest, L. Van der Perre, and R. Lauwereins, A Coarse-Grained Array Accelerator for Software-Defined Radio Baseband Processing, 2008.

U. Bondhugula, A. Hartono, J. Ramanujam, and P. Sadayappan, A Practical Automatic Polyhedral Parallelizer and Locality Optimizer, 29th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), 2008.
DOI : 10.1145/1375581.1375595

URL : http://www.cse.ohio-state.edu/~bondhugu/publications/uday-pldi08.pdf

C. Brandolese, A Codesign Approach to Software Power Estimation for Embedded Systems, 2000.

M. Schneider, H. Blume, and T. G. Noll, Power estimation on a functional level for programmable processors, TI Developer Conference.

K. M. Bresniker, S. Singhal, and R. S. Williams, Adapting to Thrive in a New Economy of Memory Abundance, 2015.

D. Brooks, V. Tiwari, and M. Martonosi, Wattch: a Framework for Architectural-level Power Analysis and Optimizations, International Symposium on Computer Architecture (ISCA), 2000.
DOI : 10.1109/isca.2000.854380

S. L. Campbell, J. Chancelier, and R. Nikoukhah, Modeling and Simulation in SCILAB, 2006.

J. Ceng et al., MAPS: An Integrated Framework for MPSoC Application Parallelization, 45th Annual Design Automation Conference, 2008.

E. S. Chung, P. A. Milder, J. C. Hoe, and K. Mai, Single-Chip Heterogeneous Computing: Does the Future Include Custom Logic, FPGAs, and GPGPUs?, 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture, 2010.
DOI : 10.1109/MICRO.2010.36

URL : http://www.ece.cmu.edu/%7Eechung/MICRO43-ucores.pdf

D. Cordes and P. Marwedel, Multi-objective aware extraction of task-level parallelism using genetic algorithms, 2012 Design, Automation & Test in Europe Conference & Exhibition (DATE), 2012.
DOI : 10.1109/DATE.2012.6176503

D. Cordes and P. Marwedel, PaxES: PArallelism eXtraction for Embedded Systems: Three approaches one tool, Research Poster at The Designing for Embedded Parallel Computing Platforms (DEPCP).

D. Cordes et al., Automatic parallelization of embedded software using hierarchical task graphs and integer linear programming, 8th International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS), 2010.

J. Cong, Z. Fang, M. Gill et al., PARADE: A cycle-accurate full-system simulation Platform for Accelerator-Rich Architectural Design and Exploration, IEEE/ACM International Conference on Computer-Aided Design (ICCAD), Nov. 2015.

D. Cordes et al., Automatic Extraction of Pipeline Parallelism for Embedded Software Using Linear Programming, IEEE 17th International Conference on Parallel and Distributed Systems (ICPADS), 2011.

M. Engel, P. Marwedel, and O. Neugebauer, Automatic extraction of multi-objective aware pipeline parallelism using genetic algorithms, IFIP international conference on Hardware/software codesign and system synthesis (CODES+ISSS), p.8, 2011.

D. Cordes, M. Engel, O. Neugebauer, and P. Marwedel, Automatic Extraction of pipeline parallelism for embedded heterogeneous multi-core platforms, 2013 International Conference on Compilers, Architecture and Synthesis for Embedded Systems (CASES), 2013.
DOI : 10.1109/CASES.2013.6662508

D. Cordes, M. Engel, O. Neugebauer, and P. Marwedel, Automatic Extraction of Task-Level Parallelism for Heterogeneous MPSoCs, 2013 42nd International Conference on Parallel Processing, 2013.
DOI : 10.1109/ICPP.2013.113

D. Cordes, Automatic Parallelization for Embedded Multi-Core Systems using High-Level Cost Models.

T.-L. Chou and K. Roy, Accurate estimation of power dissipation in CMOS sequential circuits, IEEE Transactions on VLSI Systems, 1996.

G. Delbergue, M. Burton, F. Konrad, B. L. Gal, and C. Jego, QBox: an industrial solution for virtual platform simulation using QEMU and SystemC TLM-2.0, 8th European Congress on Embedded Real Time Software and Systems (ERTS), 2016.
URL : https://hal.archives-ouvertes.fr/hal-01292317

R. H. Dennard, F. H. Gaensslen, H.-N. Yu, V. L. Rideout et al., Design of ion-implanted MOSFET's with very small physical dimensions, IEEE Journal of Solid-State Circuits, 1974.

B. D. de Dinechin, R. Ayrignac, P.-E. Beaucamps, P. Couvert et al., A clustered manycore processor architecture for embedded and accelerated applications, 2013.

B. Dupont de Dinechin, Dataflow language compilation for a single chip massively parallel processor, IEEE 6th International Workshop on Multi-/Many-core Computing Systems (MuCoCoS), 2013.

J. Deantoni and F. Mallet, TimeSquare: Treat Your Models with Logical Time, 50th International Conference on Objects, Models, Components, Patterns (TOOLS), May 2012.
DOI : 10.1007/978-3-642-30561-0_4

URL : https://hal.archives-ouvertes.fr/hal-00688590

Philips Electronic Design and Philips Research Tools Group, DIESEL, 2001.

C. Erbas, System-level modelling and design space exploration for multiprocessor embedded system-on-chip architectures, 2007.
DOI : 10.5117/9789056294557

H. Esmaeilzadeh, E. Blem, R. St. Amant, K. Sankaralingam, and D. Burger, Dark Silicon and the End of Multicore Scaling, 38th Annual International Symposium on Computer Architecture (ISCA), pp.365-376, 2011.

A. Floch, T. Yuki, A. El-moussawi, A. Morvan, K. Martin et al., GeCoS: A framework for prototyping custom hardware design flows, 2013 IEEE 13th International Working Conference on Source Code Analysis and Manipulation (SCAM), 2013.
DOI : 10.1109/SCAM.2013.6648190

URL : https://hal.archives-ouvertes.fr/hal-00921370

J. Ferrante, K. J. Ottenstein, and J. D. Warren, The program dependence graph and its use in optimization, ACM Transactions on Programming Languages and Systems (TOPLAS), 1987.

R. Gallager, Low-density parity-check codes, IEEE Transactions on Information Theory, vol.8, issue.1, 1962.
DOI : 10.1109/TIT.1962.1057683

K. Goossens, B. Akesson, M. Koedam, A. B. Nejad, A. Nelson et al., The CompSOC Design Flow for Virtual Execution Platforms, 10th FPGAworld Conference.

ΣC: A Programming Model and Language for Embedded Manycores, 11th International Conference on Algorithms and Architectures for Parallel Processing, 2011.

M. Girkar and C. D. Polychronopoulos, The hierarchical task graph as a universal intermediate representation, International Journal of Parallel Programming, vol.25, issue.7, 1994.
DOI : 10.1016/S0022-0000(74)80049-8

Mentor Graphics, Lsim Power Analyst: Transistor-level simulation.

S. Gérard and B. Selic, The UML - MARTE Standardized Profile, IFAC Proceedings Volumes, 2008.

J. Gustafsson, A. Betts, A. Ermedahl, and B. Lisper, The Mälardalen WCET benchmarks: Past, present and future, OASIcs-OpenAccess Series in Informatics, 2010.

A. Hansson, K. Goossens, M. Bekooij, and J. Huisken, CoMPSoC, ACM Transactions on Design Automation of Electronic Systems, vol.14, issue.1, 2009.
DOI : 10.1145/1455229.1455231

Y. Hara, H. Tomiyama, S. Honda, H. Takada, and K. Ishii, CHStone: A benchmark program suite for practical C-based high-level synthesis, IEEE International Symposium on Circuits and Systems (ISCAS), 2008.

J. Henkel, A Low Power Hardware/Software Partitioning Approach for Core-based Embedded Systems, 36th Annual ACM/IEEE Design Automation Conference (DAC), 1999.
DOI : 10.1145/309847.309896

URL : http://www.cs.ucr.edu/~vahid/courses/269_s01/dac99_henkel_hwswpartpower.pdf

C. X. Huang, B. Zhang, A. Deng, and B. Swirski, The design and implementation of PowerMill, Proceedings of the 1995 International Symposium on Low Power Design, ISLPED '95, 1995.
DOI : 10.1145/224081.224100

E. E. Iglesias and C., Transaction Level eMulator (TLMu) git repository. https://github

ParMiBench - an open-source benchmark for embedded multiprocessor systems, IEEE Computer Architecture Letters, 2010.

Xilinx Inc., Vivado Design Suite - HLx Edition, 2015.

Xilinx Inc., Zynq-7000 All Programmable SoC: Technical Reference Manual, 2016.

Cairn, Inria, GeCoS: Generic Compiler Suite. http://gecos.gforge.inira.fr.

F. Irigoin et al., PIPS: Automatic Parallelizer and Code Transformation Framework, 1991.

G. Jin, J. Mellor-Crummey, and R. Fowler, Increasing Temporal Locality with Skewing and Recursive Blocking.

J. Aynsley, OSCI TLM-2.0 Language reference manual, Open SystemC Initiative (OSCI), 2009.

G. Kahn, The Semantics of a Simple Language for Parallel Programming, International Federation for Information Processing (IFIP), Aug. 1974.

A. Khecharem, C. Gomez, J. Deantoni, F. Mallet et al., Execution of heterogeneous models for thermal analysis with a multi-view approach, Proceedings of the 2014 Forum on Specification and Design Languages (FDL), 2014.

G. Kahn and D. B. MacQueen, Coroutines and networks of parallel processes, International Federation for Information Processing (IFIP), Aug. 1977.

J. Laurent, N. Julien, E. Senn et al., Functional Level Power Analysis: An Efficient Approach for Modeling the Power Consumption of Complex Processors, Design, Automation and Test in Europe Conference and Exhibition (DATE), Feb. 2004.

J. von Livonius, H. Blume, and T. G. Noll, FLPA-based power modeling and power aware code optimization for a Trimedia DSP, 2005.

R. Leupers and J. Castrillon, MPSoC programming using the MAPS compiler, 2010 15th Asia and South Pacific Design Automation Conference (ASP-DAC), 2010.
DOI : 10.1109/ASPDAC.2010.5419677

Y. Li and J. Henkel, A framework for estimating and minimizing energy dissipation of embedded HW/SW systems, 35th Annual ACM/IEEE Design Automation Conference (DAC), 1998.

S. Li, J. H. Ahn, R. D. Strong et al., McPAT: An integrated power, area, and timing modeling framework for multicore and manycore architectures, 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), 2009.

M. Montón, J. Carrabina, and M. Burton, Mixed simulation kernels for high performance virtual platforms, Forum on Specification and Design Languages (FDL).

J. Mitola, Software radios: Survey, critical evaluation and future directions, IEEE Aerospace and Electronic Systems Magazine, 1993.
DOI : 10.1109/ntc.1992.267870

G. E. Moore, Cramming more components onto integrated circuits, 1965.

M. Moudgill, J. Glossner, S. Agrawal, and G. Nacer, The Sandblaster 2.0 architecture and SB3500 implementation, Software Defined Radio Technical Forum, 2008.

Gurobi Optimizer Reference Manual, www.gurobi.com.

G. Ottoni, R. Rangan, A. Stoler, and D. I. August, Automatic Thread Extraction with Decoupled Software Pipelining, 38th Annual IEEE/ACM International Symposium on Microarchitecture, 2005.
DOI : 10.1109/micro.2005.13

M. Palkovic, P. Raghavan, M. Li, A. Dejonghe, L. Van der Perre, and F. Catthoor, Future Software-Defined Radio Platforms and Mapping Flows, IEEE Signal Processing Magazine, 2010.
DOI : 10.1109/msp.2009.935386

A. D. Pimentel, C. Erbas, and S. Polstra, A systematic approach to exploring embedded system architectures at multiple abstraction levels, IEEE Transactions on Computers, vol.55, issue.2, 2006.
DOI : 10.1109/TC.2006.16

J. Pallister, S. Hollis, and J. Bennett, BEEBS: Open benchmarks for energy measurements on embedded platforms, 2013.

A. D. Pimentel, L. O. Hertzberger, P. Lieverse, P. van der Wolf, and E. F. Deprettere, Exploring embedded-systems architectures with Artemis, Computer, vol.34, issue.11, 2001.
DOI : 10.1109/2.963445

N. R. Potlapally, A. Raghunathan, G. Lakshminarayana, M. S. Hsiao, and S. T. Chakradhar, Accurate Power Macro-modeling Techniques for Complex RTL Circuits, Int. Conf. VLSI Design, 2001.

T. M. Parks, J. L. Pino, and E. A. Lee, A Comparison of Synchronous and Cyclo-Static Dataflow, Asilomar Conference on Signals, Systems and Computers, 1995.

G. Qu, N. Kawabe, K. Usami, and M. Potkonjak, Function-level Power Estimation Methodology for Microprocessors, 37th Annual Design Automation Conference (DAC), 2000.

E. Raman, G. Ottoni, A. Raman, M. J. Bridges, and D. I. August, Parallel-stage Decoupled Software Pipelining, 6th Annual IEEE/ACM International Symposium on Code Generation and Optimization (CGO).

Software-Defined Radio Prospects for Multistandard Mobile Phones, Computer, 2007.

B. Reagen, R. Adolf, Y. S. Shao, G.-Y. Wei, and D. Brooks, MachSuite: Benchmarks for accelerator design and customized architectures, IEEE International Symposium on Workload Characterization (IISWC), 2014.

S. K. Rethinagiri, O. Palomar, A. Cristal, O. Unsal et al., DESSERT: DESign Space ExploRation Tool based on power and energy at System-Level, 27th IEEE International System-on-Chip Conference (SOCC), 2014.

S. K. Rethinagiri, O. Palomar, J. Arias Moreno, O. Unsal, and A. Cristal, VPPET: Virtual platform power and energy estimation tool for heterogeneous MPSoC based FPGA platforms, 2014 24th International Workshop on Power and Timing Modeling, Optimization and Simulation (PATMOS), pp.1-8, 2014.
DOI : 10.1109/PATMOS.2014.6951910

V. Sarkar, A Concurrent Execution Semantics for Parallel Program Graphs and Program Dependence Graphs, Proceedings of the 5th International Workshop on Languages and Compilers for Parallel Computing, 1993.
DOI : 10.1007/3-540-57502-2_37

JouleTrack - a Web based tool for software energy profiling, Proceedings of the 38th Design Automation Conference, 2001.

Scilab Enterprises, Scilab: Le logiciel open source gratuit de calcul numérique, 2012.

E. Senn, J. Laurent, N. Julien, and E. Martin, SoftExplorer: Estimating and Optimizing the Power and Energy Consumption of a C Program for DSP Applications.

Y. S. Shao, B. Reagen, G.-Y. Wei, and D. Brooks, The Aladdin Approach to Accelerator Design and Modeling, IEEE Micro, vol.35, issue.3, 2015.
DOI : 10.1109/MM.2015.50

T. Stripf, O. Oey, T. Bruckschloegl, J. Becker, G. Rauwerda et al., Compiling Scilab to high performance embedded multicore systems, Microprocessors and Microsystems, 2013.
DOI : 10.1016/j.micpro.2013.07.004

URL : https://hal.archives-ouvertes.fr/hal-00921437

V. Tiwari, S. Malik, A. Wolfe et al., Instruction level power analysis and optimization of software, 9th International Conference on VLSI Design (VLSI), 1996.

M. Torquati, M. Vanneschi et al., An innovative compilation tool-chain for embedded multi-core architectures, Embedded World Conference, 2012.

G. Tournavitis, Profile-driven parallelisation of sequential programs, 2011.

N. Vachharajani et al., Speculative Decoupled Software Pipelining, 16th International Conference on Parallel Architecture and Compilation Techniques, 2007.

S. Verdoolaege, isl: An Integer Set Library for the Polyhedral Model, Mathematical Software - ICMS 2010: Third International Congress on Mathematical Software, Proceedings, pp.299-302, 2010.
DOI : 10.1007/978-3-642-15582-6_49

M. E. Wolf and M. S. Lam, A loop transformation theory and an algorithm to maximize parallelism, IEEE Transactions on Parallel and Distributed Systems, vol.2, issue.4, 1991.
DOI : 10.1109/71.97902

M. Woh, Y. Lin, S. Seo, S. Mahlke, T. Mudge et al., From SODA to scotch: The evolution of a wireless baseband processor, 2008 41st IEEE/ACM International Symposium on Microarchitecture, 2008.
DOI : 10.1109/MICRO.2008.4771787

L. Wang and K. Skadron, Lumos+: Rapid, pre-RTL design space exploration on accelerator-rich heterogeneous architectures with reconfigurable logic, 2016 IEEE 34th International Conference on Computer Design (ICCD), 2016.
DOI : 10.1109/ICCD.2016.7753297

Xilinx, MicroBlaze processor reference guide, reference manual, 2006.

W. Ye, N. Vijaykrishnan, M. Kandemir, and M. J. Irwin, The design and use of SimplePower: a cycle-accurate energy estimation tool, 37th Design Automation Conference (DAC), pp.340-345, 2000.

Z. Zong, A. Manzanares, X. Ruan, and X. Qin, EAD and PEBD: Two Energy-Aware Duplication Scheduling Algorithms for Parallel Tasks on Homogeneous Clusters, IEEE Transactions on Computers, 2011.
DOI : 10.1109/TC.2010.216

URL : http://www.eng.auburn.edu/users/xzq0001/pubs/ead-pebd-tc11.pdf