D. Abadi, D. Carney, U. Çetintemel, M. Cherniack, C. Convey et al., Aurora: a new model and architecture for data stream management, The VLDB Journal, vol.12, issue.2, pp.120-139, 2003.
DOI : 10.1007/s00778-003-0095-z

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.6.1187

S. Agarwala, F. Alegre, K. Schwan, and J. Mehalingham, E2EProf: Automated End-to-End Performance Management for Enterprise Systems, 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN'07), pp.749-758, 2007.
DOI : 10.1109/DSN.2007.38

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.116.4546

M. K. Aguilera, J. C. Mogul, J. L. Wiener, P. Reynolds, and A. Muthitacharoen, Performance debugging for distributed systems of black boxes, Proceedings of the 19th ACM symposium on Operating systems principles, p.3, 2003.

Apache, HBase log

S. Bhatia, A. Kumar, M. E. Fiuczynski, and L. Peterson, Lightweight, high-resolution monitoring for troubleshooting production systems, Proceedings of the 8th USENIX conference on Operating systems design and implementation, OSDI'08, pp.103-116, 2008.

P. Bodik, M. Goldszmidt, A. Fox, D. B. Woodard, and H. Andersen, Fingerprinting the datacenter, Proceedings of the 5th European conference on Computer systems, EuroSys '10, pp.111-124, 2010.
DOI : 10.1145/1755913.1755926

G. Candea, S. Kawamoto, Y. Fujiki, G. Friedman, and A. Fox, Microreboot - a technique for cheap recovery, Proceedings of the 6th Conference on Symposium on Operating Systems Design & Implementation, OSDI'04, 2004.

M. Y. Chen, E. Kiciman, E. Fratkin, A. Fox, and E. Brewer, Pinpoint: problem determination in large, dynamic Internet services, Proceedings International Conference on Dependable Systems and Networks, pp.595-604, 2002.
DOI : 10.1109/DSN.2002.1029005

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.137.8693

T. Condie, N. Conway, P. Alvaro, J. M. Hellerstein, K. Elmeleegy et al., Mapreduce online, Proceedings of the 7th USENIX conference on Networked systems design and implementation, 2010.

J. Dai, J. Huang, S. Huang, B. Huang, and Y. Liu, Hitune: dataflow-based performance analysis for big data cloud, Proceedings of the 2011 USENIX conference on USENIX annual technical conference, 2011.

M. de Hoon, S. Imoto, J. Nolan, and S. Miyano, Open source clustering software, Bioinformatics, vol.20, issue.9, pp.1453-1454, 2004.
DOI : 10.1093/bioinformatics/bth078

U. Erlingsson, M. Peinado, S. Peter, and M. Budiu, Fay: extensible distributed tracing from kernels to clusters, Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles, pp.311-326, 2011.

B. Gedik, H. Andrade, K. Wu, P. S. Yu, and M. Doo, SPADE, Proceedings of the 2008 ACM SIGMOD international conference on Management of data, SIGMOD '08, pp.1123-1134, 2008.
DOI : 10.1145/1376616.1376729

Z. Guo, D. Zhou, H. Lin, M. Yang, F. Long et al., G2: a graph processing system for diagnosing distributed systems, Proceedings of the 2011 USENIX annual technical conference, 2011.

Hewlett-Packard, WorldCup98 logs

L. Hu, K. Schwan, A. Gulati, J. Zhang, and C. Wang, Net-cohort, Proceedings of the 9th international conference on Autonomic computing, ICAC '12, p.12, 2012.
DOI : 10.1145/2371536.2371540

S. Y. Ko, P. Yalagandula, I. Gupta, V. Talwar, D. Milojicic et al., Moara: Flexible and Scalable Group-Based Querying System, Proceedings of the 9th ACM/IFIP/USENIX International Conference on Middleware, Middleware '08, pp.408-428, 2008.
DOI : 10.1109/SASO.2007.55

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.186.5467

S. Kumar, V. Talwar, V. Kumar, P. Ranganathan, and K. Schwan, vManage, Proceedings of the 6th international conference on Autonomic computing, ICAC '09, pp.127-136, 2009.
DOI : 10.1145/1555228.1555262

M. Lee, A. S. Krishnakumar, P. Krishnan, N. Singh, and S. Yajnik, Supporting soft real-time tasks in the xen hypervisor, Proceedings of the 6th ACM SIGPLAN/SIGOPS international conference on Virtual execution environments, VEE '10, pp.97-108, 2010.
DOI : 10.1145/1735997.1736012

M. Mansour and K. Schwan, I-RMI: Performance Isolation in Information Flow Applications, Proceedings of the ACM/IFIP/USENIX 2005 International Conference on Middleware, Middleware '05, pp.375-389, 2005.
DOI : 10.1109/SC.2002.10003

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.67.4513

N. Marz, Twitter's storm. https

M. L. Massie, B. N. Chun, and D. E. Culler, The ganglia distributed monitoring system: design, implementation, and experience, Parallel Computing, vol.30, issue.7, 2004.
DOI : 10.1016/j.parco.2004.04.001

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.160.2889

L. Neumeyer, B. Robbins, A. Nair, and A. Kesari, S4: Distributed Stream Computing Platform, 2010 IEEE International Conference on Data Mining Workshops, pp.170-177, 2010.
DOI : 10.1109/ICDMW.2010.172

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.304.3588

A. Rabkin and R. Katz, Chukwa: a system for reliable large-scale log collection, Proceedings of the 24th international conference on Large installation system administration, LISA'10, pp.1-15, 2010.

G. Ren, E. Tune, T. Moseley, Y. Shi, S. Rus et al., Google-Wide Profiling: A Continuous Profiling Infrastructure for Data Centers, IEEE Micro, vol.30, issue.4, 2010.
DOI : 10.1109/MM.2010.68

B. H. Sigelman, L. A. Barroso, M. Burrows, P. Stephenson, M. Plakal et al., Dapper, a large-scale distributed systems tracing infrastructure, 2010.

V. Soundararajan and J. M. Anderson, The impact of management operations on the virtualized datacenter, Proceedings of the 37th annual international symposium on Computer architecture, ISCA '10, pp.326-337, 2010.

R. van Renesse, K. P. Birman, and W. Vogels, Astrolabe, ACM Transactions on Computer Systems, vol.21, issue.2, pp.164-206, 2003.
DOI : 10.1145/762483.762485

K. Viswanathan, L. Choudur, V. Talwar, C. Wang, G. Macdonald et al., Ranking anomalies in data centers, 2012 IEEE Network Operations and Management Symposium, pp.79-87, 2012.
DOI : 10.1109/NOMS.2012.6211885

C. Wang, EbAT, Proceedings of the 6th Middleware Doctoral Symposium, MDS '09, 2009.
DOI : 10.1145/1659753.1659757

C. Wang, K. Schwan, V. Talwar, G. Eisenhauer, L. Hu et al., A flexible architecture integrating monitoring and analytics for managing large-scale data centers, Proceedings of the 8th ACM international conference on Autonomic computing, ICAC '11, pp.141-150, 2011.
DOI : 10.1145/1998582.1998605

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.420.9402

C. Wang, V. Talwar, K. Schwan, and P. Ranganathan, Online detection of utility cloud anomalies using metric distributions, the 12th IEEE/IFIP Network Operations and Management Symposium, NOMS'10, pp.96-103, 2010.

C. Wang, K. Viswanathan, L. Choudur, V. Talwar, W. Satterfield et al., Statistical techniques for online anomaly detection in data centers, 12th IFIP/IEEE International Symposium on Integrated Network Management (IM 2011) and Workshops, pp.385-392, 2011.
DOI : 10.1109/INM.2011.5990537

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.421.1896

P. Yalagandula and M. Dahlin, A scalable distributed information management system, Proceedings of the 2004 conference on Applications, technologies, architectures, and protocols for computer communications, SIGCOMM '04, pp.379-390, 2004.
DOI : 10.1145/1030194.1015509

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.4.817