R. Balasubramonian, H. Dwarkadas, and D. H. Albonesi, Reducing the complexity of the register file in dynamic superscalar processors, Proceedings. 34th ACM/IEEE International Symposium on Microarchitecture. MICRO-34, pp.237-248, 2001.
DOI : 10.1109/MICRO.2001.991122

M. Bekerman, S. Jourdan, R. Ronen, G. Kirshenboim, L. Rappoport et al., Correlated load-address predictors, Proceedings of the International Symposium on Computer Architecture, pp.54-63, 1999.
DOI : 10.1145/307338.300984

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=

N. Binkert, B. Beckmann, G. Black, S. K. Reinhardt, A. Saidi et al., The gem5 simulator, ACM SIGARCH Computer Architecture News, vol.39, issue.2, pp.1-7, 2011.
DOI : 10.1145/2024716.2024718

E. Borch, E. Tune, S. Manne, and J. Emer, Loose loops sink chips, Proceedings Eighth International Symposium on High Performance Computer Architecture, pp.299-310, 2002.
DOI : 10.1109/HPCA.2002.995719

G. Z. Chrysos and J. S. Emer, Memory dependence prediction using store sets, Proceedings of the International Symposium on Computer Architecture, pp.142-153, 1998.

D. Ernst, A. Hamel, and T. Austin, Cyclone: a broadcast-free dynamic instruction scheduler with selective replay, Proceedings of the International Symposium on Computer Architecture, pp.253-262, 2003.

B. Fields, S. Rubin, and R. Bodík, Focusing processor policies via critical-path prediction, Proceedings of the International Symposium on Computer Architecture, pp.74-85, 2001.
DOI : 10.1145/384285.379253

URL : http://cadal.cse.nsysu.edu.tw/seminar/seminar_file/2002/10/Focusing processor policies via critical-path prediction.pdf

B. R. Fisk and R. I. Bahar, The non-critical buffer: using load latency tolerance to improve data cache efficiency, Proceedings 1999 IEEE International Conference on Computer Design: VLSI in Computers and Processors (Cat. No.99CB37040), pp.538-545, 1999.
DOI : 10.1109/ICCD.1999.808593

G. Hinton, D. Sager, M. Upton, D. Boggs, D. Carmean et al., The microarchitecture of the Pentium 4 processor, Intel Technology Journal, vol.1, 2001.

]. R. Kessler, E. J. Mclellan, and D. A. Webb, Available: http://www.intel.com/content/dam/doc/manual/ 64-ia-32-architectures-optimization-manual.pdf The Alpha 21264 microprocessor architecture, Intel, Intel 64 and IA-32 Architectures Optimization Reference Manual Proceedings of the International Conference on Computer Design, pp.90-95, 1998.

I. Kim and M. H. Lipasti, Understanding scheduling replay schemes, Proceedings of the International Symposium on High Performance Computer Architecture, p.198, 2004.

Y. Liu, A. Shayesteh, G. Memik, and G. Reinman, Scaling the issue window with look-ahead latency prediction, Proceedings of the 18th annual international conference on Supercomputing , ICS '04, pp.217-226, 2004.
DOI : 10.1145/1006209.1006240

G. Memik, G. Reinman, and W. H. Mangione-smith, Precise instruction scheduling, Journal of Instruction-Level Parallelism, vol.7, pp.1-29, 2005.

A. Merchant, D. Boggs, and D. Sager, Processor with a replay system that includes a replay queue for improved throughput, p.737, 0200.

A. Merchant and D. Sager, Computer processor having a checker, p.626, 2001.

A. Merchant, D. Sager, D. Boggs, and M. Upton, Computer processor with a replay system having a plurality of checkers, p.94717, 2000.

P. Michaud and A. Seznec, Data-flow prescheduling for large instruction windows in out-of-order processors, Proceedings HPCA Seventh International Symposium on High-Performance Computer Architecture, pp.27-36, 2001.
DOI : 10.1109/HPCA.2001.903249

E. Morancho, J. M. Llabería, and À. Olivé, Recovery mechanism for latency misprediction, Proceedings 2001 International Conference on Parallel Architectures and Compilation Techniques, pp.118-128, 2001.
DOI : 10.1109/PACT.2001.953293

S. Palacharla, N. Jouppi, and J. Smith, Complexity-effective superscalar processors, Proceedings of the International Symposium on Computer Architecture, pp.206-218, 1997.

E. Perelman, G. Hamerly, and B. Calder, Picking statistically valid and early simulation points, Oceans 2002 Conference and Exhibition. Conference Proceedings (Cat. No.02CH37362), p.244, 2003.
DOI : 10.1109/PACT.2003.1238020

J. A. Rivers, G. S. Tyson, E. S. Davidson, and T. M. Austin, On high-bandwidth data cache design for multi-issue processors, Proceedings of 30th Annual International Symposium on Microarchitecture, pp.46-56, 1997.
DOI : 10.1109/MICRO.1997.645796

A. Seznec and P. Michaud, A case for (partially) TAgged GEometric history length branch prediction, Journal of Instruction Level Parallelism, vol.8, pp.1-23, 2006.

J. Stark, M. D. Brown, and Y. N. Patt, On pipelining dynamic instruction scheduling logic, Proceedings of the International Symposium on Microarchitecture, pp.57-66, 2000.

J. H. Tseng and K. Asanovi´casanovi´c, Banked multiported register files for high-frequency superscalar microprocessors, Proceedings of the International Symposium on Computer Architecture, pp.62-71, 2003.

E. S. Tune, D. M. Tullsen, and B. Calder, Quantifying instruction criticality, Proceedings.International Conference on Parallel Architectures and Compilation Techniques, pp.104-113, 2002.
DOI : 10.1109/PACT.2002.1106008

A. Yoaz, R. Erez, M. Ronen, and S. Jourdan, Speculation techniques for improving load related instruction scheduling, Proceedings of the 26th International Symposium on Computer Architecture (Cat. No.99CB36367), pp.42-53, 1999.
DOI : 10.1109/ISCA.1999.765938

V. Zyuban and P. Kogge, The energy complexity of register files, Proceedings of the 1998 international symposium on Low power electronics and design , ISLPED '98, pp.305-310, 1998.
DOI : 10.1145/280756.280943