J. D. Balfour and W. J. Dally, Design tradeoffs for tiled CMP on-chip networks, International Conference on Supercomputing, pp.187-198, 2006.

R. Bitirgen, E. Ipek, and J. F. Martinez, Coordinated management of multiple interacting resources in chip multiprocessors: A machine learning approach, 2008 41st IEEE/ACM International Symposium on Microarchitecture, pp.318-329, 2008.
DOI : 10.1109/MICRO.2008.4771801

W. J. Dally and B. Towles, Principles and Practices of Interconnection Networks, 2004.

R. Das, O. Mutlu, T. Moscibroda, and C. R. Das, Application-aware prioritization mechanisms for on-chip networks, Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture, Micro-42, pp.280-291, 2009.
DOI : 10.1145/1669112.1669150

A. Demers, S. Keshav, and S. Shenker, Analysis and simulation of a fair queueing algorithm, Symposium Proceedings on Communications Architectures and Protocols (SIGCOMM), pp.1-12, 1989.

E. Ebrahimi, C. J. Lee, O. Mutlu, and Y. N. Patt, Fairness via source throttling: a configurable and high-performance fairness substrate for multi-core memory systems, Proceedings of the 15th International Conference on Architectural Support for Programming Languages and Operating Systems, pp.335-346, 2010.

S. Golestani, Congestion-free communication in high-speed packet networks, IEEE Transactions on Communications, vol.39, issue.12, pp.1802-1812, 1991.
DOI : 10.1109/26.120166

B. Grot, J. Hestness, S. W. Keckler, and O. Mutlu, Express Cube Topologies for on-Chip Interconnects, 2009 IEEE 15th International Symposium on High Performance Computer Architecture, pp.163-174, 2009.
DOI : 10.1109/HPCA.2009.4798251

B. Grot, S. W. Keckler, and O. Mutlu, Preemptive virtual clock, Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture, Micro-42, pp.268-279, 2009.
DOI : 10.1145/1669112.1669149

R. Iyer, CQoS, Proceedings of the 18th annual international conference on Supercomputing, ICS '04, pp.257-266, 2004.
DOI : 10.1145/1006209.1006246

A. Kahng, B. Li, L. Peh, and K. Samadi, ORION 2.0: A fast and accurate NoC power and area model for early-stage design space exploration, 2009 Design, Automation & Test in Europe Conference & Exhibition, pp.423-428, 2009.
DOI : 10.1109/DATE.2009.5090700

P. Kermani and L. Kleinrock, Virtual cut-through: a new computer communication switching technique, Computer Networks, vol.3, pp.267-286, 1979.
DOI : 10.1016/0376-5075(79)90032-1
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.460.3486

J. Kim, J. Balfour, and W. Dally, Flattened butterfly topology for on-chip networks, Proceedings of the 40th Annual IEEE/ACM International Symposium on Microarchitecture, pp.172-182, 2007.

J. H. Kim and A. A. Chien, Rotating combined queueing (RCQ): bandwidth and latency guarantees in low-cost, high-performance networks, Proceedings of the 23rd Annual International Symposium on Computer Architecture, pp.226-236, 1996.

J. W. Lee, M. C. Ng, and K. Asanovic, Globally-synchronized frames for guaranteed quality-of-service in on-chip networks, Proceedings of the 35th Annual International Symposium on Computer Architecture, pp.89-100, 2008.

M. R. Marty and M. D. Hill, Virtual hierarchies to support server consolidation, Proceedings of the 34th Annual International Symposium on Computer Architecture, 2007.

N. Muralimanohar, R. Balasubramonian, and N. Jouppi, Optimizing NUCA Organizations and Wiring Alternatives for Large Caches with CACTI 6.0, 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007), pp.3-14, 2007.
DOI : 10.1109/MICRO.2007.33

O. Mutlu and T. Moscibroda, Stall-Time Fair Memory Access Scheduling for Chip Multiprocessors, 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007), pp.146-160, 2007.
DOI : 10.1109/MICRO.2007.21
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.112.4661

O. Mutlu and T. Moscibroda, Parallelism-Aware Batch Scheduling, Proceedings of the 35th Annual International Symposium on Computer Architecture, pp.63-74, 2008.
DOI : 10.1145/1394608.1382128

K. J. Nesbit, J. Laudon, and J. E. Smith, Virtual private caches, Proceedings of the 34th Annual International Symposium on Computer Architecture, pp.57-68, 2007.
DOI : 10.1145/1273440.1250671

T. Ristenpart, E. Tromer, H. Shacham, and S. Savage, Hey, you, get off of my cloud, Proceedings of the 16th ACM conference on Computer and communications security, CCS '09, 2009.
DOI : 10.1145/1653662.1653687

J. Shin, K. Tam, D. Huang, B. Petrick, H. Pham et al., A 40nm 16-core 128-thread CMT SPARC SoC processor, 2010 IEEE International Solid-State Circuits Conference, (ISSCC), pp.98-99, 2010.
DOI : 10.1109/ISSCC.2010.5434030

G. E. Suh, S. Devadas, and L. Rudolph, A new memory monitoring scheme for memory-aware scheduling and partitioning, Proceedings Eighth International Symposium on High Performance Computer Architecture, pp.117-128, 2002.
DOI : 10.1109/HPCA.2002.995703

D. Wendel, R. Kalla, R. Cargoni, J. Clabes, J. Friedrich et al., The implementation of POWER7: A highly parallel and scalable multicore high-end server processor, Proceedings of the IEEE International Solid-State Circuits Conference, pp.102-103, 2010.

L. Zhang, Virtual clock: a new traffic control algorithm for packet switching networks, ACM SIGCOMM Computer Communication Review, vol.20, issue.4, pp.19-29, 1990.
DOI : 10.1145/99517.99525