Conference papers

Sequential fault monitoring

Dawei Feng 1 Cecile Germain-Renaud 1 Julien Nauroy 2
2 TAO - Machine Learning and Optimisation
CNRS - Centre National de la Recherche Scientifique : UMR8623, Inria Saclay - Ile de France, UP11 - Université Paris-Sud - Paris 11, LRI - Laboratoire de Recherche en Informatique
Abstract : For large-scale distributed systems, the knowledge component at the core of the MAPE-K loop remains elusive. In the context of end-to-end probing, fault monitoring can be recast as an inference problem in the space-time domain. We propose and evaluate Sequential Matrix Factorization (SMF), a fully spatio-temporal method that exploits both recent advances in matrix factorization for the spatial information and a new heuristic based on historical information. Adaptivity operates at two levels: algorithmically, as the exploration/exploitation tradeoff is controlled by a self-calibrating parameter; and at the policy level, as active learning is required for the most challenging cases of a real-world dataset.

Contributor : Cecile Germain
Submitted on : Monday, September 15, 2014 - 8:05:36 PM
Last modification on : Thursday, July 8, 2021 - 3:48:48 AM
Long-term archiving on : Tuesday, December 16, 2014 - 11:30:33 AM




  • HAL Id : hal-01064161, version 1



Dawei Feng, Cecile Germain-Renaud, Julien Nauroy. Sequential fault monitoring. Cloud and Autonomic Computing, Sep 2014, London, United Kingdom. ⟨hal-01064161⟩


