Sequential fault monitoring

Dawei Feng 1, Cecile Germain-Renaud 1, Julien Nauroy 2
2 TAO - Machine Learning and Optimisation
CNRS - Centre National de la Recherche Scientifique : UMR8623, Inria Saclay - Ile de France, UP11 - Université Paris-Sud - Paris 11, LRI - Laboratoire de Recherche en Informatique
Abstract : For large-scale distributed systems, the knowledge component at the core of the MAPE-K loop remains elusive. In the context of end-to-end probing, fault monitoring can be recast as an inference problem in the space-time domain. We propose and evaluate Sequential Matrix Factorization (SMF), a fully spatio-temporal method that exploits both recent advances in matrix factorization for the spatial information and a new heuristic based on historical information. Adaptivity operates at two levels: algorithmically, as the exploration/exploitation tradeoff is controlled by a self-calibrating parameter; and at the policy level, as active learning is required for the most challenging cases of a real-world dataset.
Cited literature [34 references]

https://hal.inria.fr/hal-01064161
Contributor : Cecile Germain
Submitted on : Monday, September 15, 2014 - 8:05:36 PM
Last modification on : Thursday, April 5, 2018 - 12:30:12 PM
Long-term archiving on : Tuesday, December 16, 2014 - 11:30:33 AM

File

seqmCACFinal.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01064161, version 1

Citation

Dawei Feng, Cecile Germain-Renaud, Julien Nauroy. Sequential fault monitoring. Cloud and Autonomic Computing, Sep 2014, London, United Kingdom. ⟨hal-01064161⟩

Metrics

Record views : 323
File downloads : 585