R. K. Sahoo, A. Sivasubramaniam, and M. S. Squillante, Failure data analysis of a large-scale heterogeneous server environment, International Conference on Dependable Systems and Networks, 2004.
DOI : 10.1109/DSN.2004.1311948

B. Tierney and W. Johnston, The NetLogger methodology for high performance distributed systems performance analysis, Proc. of HPDC, 1998.

R. K. Sahoo and A. J. Oliner, Critical event prediction for proactive management in large-scale computer clusters, Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining, KDD '03, 2003.
DOI : 10.1145/956750.956799

S. Fu and C. Xu, Exploring Event Correlation for Event prediction in Coalitions of Clusters, Proc. of ICS, 2007.

S. Fu and C. Xu, Quantifying Temporal and Spatial Correlation of Failure Events for Proactive Management, 2007 26th IEEE International Symposium on Reliable Distributed Systems (SRDS 2007), 2007.
DOI : 10.1109/SRDS.2007.18

P. Gujrati, Y. Li, and Z. Lan, A Meta-Learning Failure Predictor for Blue Gene/L Systems, 2007 International Conference on Parallel Processing (ICPP 2007), 2007.
DOI : 10.1109/ICPP.2007.9

J. C. Knight, An introduction to computing system dependability, Proceedings. 26th International Conference on Software Engineering, 2004.
DOI : 10.1109/ICSE.2004.1317509

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.132.3270

D. Tang and R. K. Iyer, Analysis and modeling of correlated failures in multicomputer systems, IEEE Transactions on Computers, vol.41, issue.5, pp.567-577, 1992.
DOI : 10.1109/12.142683

E. Koskinen and J. Jannotti, BorderPatrol: Isolating Events for Precise Black-box Tracing, Proc. of EuroSys, 2008.

Y. Liang and Y. Zhang, BlueGene/L Failure Analysis and Prediction Models, Proc. of DSN, 2006.

T. J. Hacker, F. Romero, and C. D. Carothers, An analysis of clustered failures on large supercomputing systems, Journal of Parallel and Distributed Computing, vol.69, issue.7, pp.652-665, 2009.
DOI : 10.1016/j.jpdc.2009.03.007

A. J. Oliner, A. Aiken, and J. Stearley, Alert Detection in Logs, Proc. of ICDM, 2008.

W. Zhou, J. Zhan, D. Meng, D. Xu, and Z. Zhang, LogMaster: Mining Event Correlations in Logs of Large-scale Cluster Systems, arXiv:1003.0951, 2010.

N. Jiang and L. Gruenwald, Research issues in data stream association rule mining, ACM SIGMOD Record, vol.35, issue.1, 2006.
DOI : 10.1145/1121995.1121998

F. Salfner and S. Tschirpke, Error Log Processing for Accurate Event prediction, USENIX Workshop on the Analysis of System Logs (WASL), 2008.

J. G. Lou, Q. Fu, Y. Wang, and J. Li, Mining dependency in distributed systems through unstructured logs analysis, ACM SIGOPS Operating Systems Review, vol.44, issue.1, 2009.
DOI : 10.1145/1740390.1740411

R. Zhang, E. Cope, L. Heusler, and F. Cheng, A Bayesian Network Approach to Modeling IT Service Availability using System Logs, 2009.

A. Oliner and J. Stearley, What Supercomputers Say: A Study of Five System Logs, 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN'07), 2007.
DOI : 10.1109/DSN.2007.103

J. P. Rouillard, Real-time log file analysis using the Simple Event Correlator (SEC), Proc. of LISA, 2004.

Z. Zhang and J. Zhan, Precise request tracing and performance debugging for multi-tier services of black boxes, 2009 IEEE/IFIP International Conference on Dependable Systems & Networks, 2009.
DOI : 10.1109/DSN.2009.5270321

W. Zhou and J. Zhan, Multidimensional Analysis of System Logs in Large-scale Cluster Systems, Proc. of DSN (Fast Abstract), 2008.