A Large, Fast Instruction Window for Tolerating Cache Misses, Proc. International Symposium on Computer Architecture (ISCA), 2002. ,
Cyclone: A Broadcast-free Dynamic Instruction Scheduler with Selective Replay, Proc. International Symposium on Computer Architecture (ISCA), 2003. ,
On reducing energy-consumption by late-inserting instructions into the issue queue, Proceedings of the 2007 international symposium on Low power electronics and design, ISLPED '07, 2007. ,
DOI : 10.1145/1283780.1283861
MLP-aware dynamic instruction window resizing for adaptively exploiting both ILP and MLP, Proceedings of the 46th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO-46, 2013. ,
DOI : 10.1145/2540708.2540713
The gem5 simulator, SIGARCH Comput. Archit. News, 2011. ,
DOI : 10.1145/2024716.2024718
SPEC CPU2006 benchmark descriptions, ACM SIGARCH Computer Architecture News, vol.34, issue.4, 2006. ,
DOI : 10.1145/1186736.1186737
The Load Slice Core Microarchitecture, Proc. International Symposium on Computer Architecture (ISCA), 2015. ,
Complexity-effective Superscalar Processors, Proc. International Symposium on Computer Architecture (ISCA), 1997. ,
DOI : 10.1145/384286.264201
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.527.5571
Power considerations in the design of the Alpha 21264 microprocessor, Proceedings of the 35th annual conference on Design automation conference , DAC '98, 1998. ,
DOI : 10.1145/277044.277226
McPAT, Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture, Micro-42, 2009. ,
DOI : 10.1145/1669112.1669172
CACTI 6.0: A Tool to Model Large Caches, tech. rep, 2009. ,
Improving data cache performance by pre-executing instructions under a cache miss, Proceedings of the 11th international conference on Supercomputing , ICS '97, 1997. ,
DOI : 10.1145/263580.263597
Runahead execution: an alternative to very large instruction windows for out-of-order processors, The Ninth International Symposium on High-Performance Computer Architecture, 2003. HPCA-9 2003. Proceedings., 2003. ,
DOI : 10.1109/HPCA.2003.1183532
Dual-core Execution: Building a Highly Scalable Single-thread Instruction Window, Proc. International Conference on Parallel Architectures and Compilation Techniques (PACT), 2005. ,
Instruction Fetch Deferral Using Static Slack Recovery Mechanism for Latency Misprediction, Proc. International Symposium on Microarchitecture (MICRO) Proc. International Conference on Parallel Architectures and Compilation Techniques (PACT), 2001. ,
A scalable register file architecture for dynamically scheduled processors, Proceedings of the 1996 Conference on Parallel Architectures and Compilation Technique, 1996. ,
DOI : 10.1109/PACT.1996.552666
Delaying physical register allocation through virtual-physical registers, MICRO-32. Proceedings of the 32nd Annual ACM/IEEE International Symposium on Microarchitecture, 1999. ,
DOI : 10.1109/MICRO.1999.809456
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.161.5303
Register renaming and dynamic speculation: an alternative approach, Proceedings of the 26th Annual International Symposium on Microarchitecture, 1993. ,
DOI : 10.1109/MICRO.1993.282756
Toward kilo-instruction processors, ACM Transactions on Architecture and Code Optimization, vol.1, issue.4, pp.368-396, 2004. ,
DOI : 10.1145/1044823.1044825
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.86.3606
Continual Flow Pipelines, Proc. Internationl Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2004. ,
DOI : 10.1145/1024393.1024407
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.102.5771
BOLT: Energy-efficient Out-of-Order Latency-Tolerant execution, HPCA, 16 2010 The Sixteenth International Symposium on High-Performance Computer Architecture, 2010. ,
DOI : 10.1109/HPCA.2010.5416634
Scalable Load and Store Processing in Latency Tolerant Processors, Proc. International Symposium on Computer Architecture (ISCA), 2005. ,
Decoupled Store Completion/Silent Deterministic Replay: Enabling Scalable Data Memory for CPR/CFP Processors, Proc. International Symposium on Computer Architecture (ISCA), 2009. ,
DOI : 10.1145/1555815.1555786
Late-Binding: Enabling Unordered Load-Store Queues, Proc. International Symposium on Computer Architecture (ISCA), 2007. ,
DOI : 10.1145/1250662.1250705
The non-critical buffer: using load latency tolerance to improve data cache efficiency, Proceedings 1999 IEEE International Conference on Computer Design: VLSI in Computers and Processors (Cat. No.99CB37040), 1999. ,
DOI : 10.1109/ICCD.1999.808593
Dynamic prediction of critical path instructions, Proceedings HPCA Seventh International Symposium on High-Performance Computer Architecture, 2001. ,
DOI : 10.1109/HPCA.2001.903262
Focusing Processor Policies via Critical-path Prediction, Proc. International Symposium on Computer Architecture (ISCA), 2001. ,
DOI : 10.1145/384285.379253
URL : http://cadal.cse.nsysu.edu.tw/seminar/seminar_file/2002/10/Focusing processor policies via critical-path prediction.pdf
Predictive sequential associative cache, Proceedings. Second International Symposium on High-Performance Computer Architecture, 1996. ,
DOI : 10.1109/HPCA.1996.501190
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.62.4303
A High-speed Dynamic Instruction Scheduling Scheme for Superscalar Processors, Proc. International Symposium on Microarchitecture (MICRO), 2001. ,