M. Becchi and P. Crowley, Dynamic thread assignment on heterogeneous multiprocessor architectures, Proceedings of the 3rd conference on Computing frontiers , CF '06, pp.29-40, 2006.
DOI : 10.1145/1128022.1128029

E. Borch, S. Manne, J. Emer, and E. Tune, Loose loops sink chips, Proceedings Eighth International Symposium on High Performance Computer Architecture, 2002.
DOI : 10.1109/HPCA.2002.995719

L. Ceze, K. Strauss, J. Tuck, J. Torrellas, and J. Renau, CAVA, ACM Transactions on Architecture and Code Optimization, vol.3, issue.2, pp.182-208, 2006.
DOI : 10.1145/1138035.1138038

Y. Chou, B. Fahs, and S. Abraham, Microarchitecture Optimizations for Exploiting Memory-Level Parallelism, ISCA '04: Proceedings of the 31st annual international symposium on Computer architecture, p.76, 2004.
DOI : 10.1145/1028176.1006708

J. D. Collins, D. M. Tullsen, H. Wang, and J. P. Shen, Dynamic speculative precomputation, Proceedings. 34th ACM/IEEE International Symposium on Microarchitecture. MICRO-34, pp.306-317, 2001.
DOI : 10.1109/MICRO.2001.991128

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.23.763

A. Cristal, O. J. Santana, M. Valero, and J. F. Martínez, Toward kilo-instruction processors, ACM Transactions on Architecture and Code Optimization, vol.1, issue.4, pp.389-417, 2004.
DOI : 10.1145/1044823.1044825

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.86.3606

J. Dundas and T. Mudge, Improving data cache performance by pre-executing instructions under a cache miss, ICS '97: Proceedings of the 11th international conference on Supercomputing, pp.68-75, 1997.

I. Ganusov and M. Burtscher, Future execution, ACM Transactions on Architecture and Code Optimization, vol.3, issue.4, pp.424-449, 2006.
DOI : 10.1145/1187976.1187979

A. Garg and M. C. Huang, A performance-correctness explicitly-decoupled architecture, 2008 41st IEEE/ACM International Symposium on Microarchitecture, pp.306-317, 2008.
DOI : 10.1109/MICRO.2008.4771800

A. Hartstein and T. R. Puzak, The optimum pipeline depth for a microprocessor, Proceedings of the 29th annual international symposium on Computer architecture, p.7, 2002.

N. Kirman, M. Kirman, M. Chaudhuri, and J. F. Martinez, Checkpointed early load retirement, 11th International Symposium on High-Performance Computer Architecture, pp.16-27, 2005.
DOI : 10.1109/HPCA.2005.9

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.84.1759

R. Kumar, K. I. Farkas, N. P. Jouppi, P. Ranganathan, and D. M. Tullsen, Single-isa heterogeneous multicore architectures: The potential for processor power reduction, MICRO 36: Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture, p.81, 2003.

R. Kumar, D. M. Tullsen, and N. P. Jouppi, Core architecture optimization for heterogeneous chip multiprocessors, Proceedings of the 15th international conference on Parallel architectures and compilation techniques , PACT '06, pp.23-32, 2006.
DOI : 10.1145/1152154.1152162

R. Kumar, D. M. Tullsen, P. Ranganathan, N. P. Jouppi, and K. I. Farkas, Single-isa heterogeneous multicore architectures for multithreaded workload performance, ISCA '04: Proceedings of the 31st annual international symposium on Computer architecture, p.64, 2004.

A. R. Lebeck, J. Koppanalil, T. Li, J. Patwardhan, and E. Rotenberg, A large, fast instruction window for tolerating cache misses, ISCA '02: Proceedings of the 29th annual international symposium on Computer architecture, pp.59-70, 2002.

A. Moshovos, D. N. Pnevmatikatos, and A. Baniasadi, Slice-processors, Proceedings of the 15th international conference on Supercomputing , ICS '01, pp.321-334, 2001.
DOI : 10.1145/377792.377856

O. Mutlu, H. Kim, and Y. N. Patt, Efficient Runahead Execution: Power-Efficient Memory Latency Tolerance, IEEE Micro, vol.26, issue.1, pp.10-20, 2006.
DOI : 10.1109/MM.2006.10

H. Hashem, E. Najaf-abadi, and . Rotenberg, Architectural contesting: exposing and exploiting temperamental behavior, SIGARCH Comput. Archit. News, vol.35, issue.3, pp.28-35, 2007.

S. Palacharla, N. P. Jouppi, and J. E. Smith, Complexity-effective superscalar processors, ISCA '97: Proceedings of the 24th annual international symposium on Computer architecture, pp.206-218, 1997.
DOI : 10.1145/384286.264201

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.527.5571

M. Pericas, A. Cristal, F. J. Cazorla, R. Gonzalez, D. A. Jimenez et al., A Flexible Heterogeneous Multi-Core Architecture, 16th International Conference on Parallel Architecture and Compilation Techniques (PACT 2007), pp.13-24, 2007.
DOI : 10.1109/PACT.2007.4336196

Z. Purser, K. Sundaramoorthy, and E. Rotenberg, A study of slipstream processors, MICRO 33: Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture, pp.269-280, 2000.

A. Roth, Pre-execution via speculative data-driven multithreading, 2001.
DOI : 10.1109/hpca.2001.903250

URL : https://minds.wisconsin.edu/bitstream/handle/1793/60236/TR1414.pdf?sequence=1

J. C. Saez, M. Prieto, A. Fedorova, and S. Blagodurov, A comprehensive scheduler for asymmetric multicore systems, Proceedings of the 5th European conference on Computer systems, EuroSys '10, pp.139-152, 2010.
DOI : 10.1145/1755913.1755929

T. Srikanth, R. Srinivasan, H. Rajwar, A. Akkary, M. Gandhi et al., Continual flow pipelines, ASPLOS-XI: Proceedings of the 11th international conference on Architectural support for programming languages and operating systems, pp.107-119, 2004.

J. E. Stine, I. Castellanos, M. Wood, J. Henson, F. Love et al., FreePDK: An Open-Source Variation-Aware Design Kit, 2007 IEEE International Conference on Microelectronic Systems Education (MSE'07), 2007.
DOI : 10.1109/MSE.2007.44

M. A. Suleman, O. Mutlu, M. K. Qureshi, and Y. N. Patt, Accelerating Critical Section Execution with Asymmetric Multicore Architectures, ASPLOS '09: Proceeding of the 14th international conference on Architectural support for programming languages and operating systems, pp.253-264, 2009.
DOI : 10.1109/MM.2010.7

H. Zhou, Dual-core execution: Building a highly scalable single-thread instruction window, PACT '05: Proceedings of the 14th International Conference on Parallel Architectures and Compilation Techniques, pp.231-242, 2005.

H. Zhou and T. M. Conte, Enhancing memory level parallelism via recovery-free value prediction, Proceedings of the 17th annual international conference on Supercomputing , ICS '03, pp.326-335, 2003.
DOI : 10.1145/782814.782859

C. Zilles and G. Sohi, Execution-based prediction using speculative slices, ISCA '01: Proceedings of the 28th annual international symposium on Computer architecture, pp.2-13, 2001.