Conference papers

Implementing a flexible failure detector that expresses the confidence in the system

Abstract : Traditional unreliable failure detectors are per-process oracles that provide a list of nodes suspected of having failed. Previously, we introduced the Impact failure detector (Impact FD), which outputs a trust level value that expresses the degree of confidence in the system. An impact factor is assigned to each node, and the trust level is equal to the sum of the impact factors of the nodes not suspected of having failed. An input threshold parameter defines an impact factor limit value, over which confidence in the system is ensured. The impact factor indicates the relative importance of a process in the set S, while the threshold offers a degree of flexibility with respect to failures and false suspicions. In this article, we propose two different algorithms, based on query-response message rounds, that implement the Impact FD and whose designs were tailored to the Impact FD’s flexibility. The first one exploits the time-free message pattern approach, while the second one considers a set of bounded timely responses. We also introduce the concept that a process can be PS-accessible (or ♦PS-accessible), which guarantees that the system S will always (or eventually always) be trusted by this process, as well as two properties, PR(IT) and PR(♦IT), that characterize the minimum stability condition of S necessary to ensure confidence (or eventual confidence) in it. In both implementations, if the process that monitors S is PS-accessible or ♦PS-accessible, then at every query round it only waits (or eventually only waits) for a set of responses that satisfies the threshold. A crucial feature of this set of processes is that it is not fixed, i.e., it can change at each round, which is in accordance with the flexibility of the Impact FD.
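The trust level computation and the idea of waiting only for a set of responses that satisfies the threshold can be illustrated with a minimal sketch. It is not the algorithms of the paper: the function names, the integer impact factors, the `>=` comparison against the threshold, and the rule that nodes which have not answered by the end of a round are treated as suspected are all assumptions made for illustration. Message passing and timers are replaced by a plain list of responder names in arrival order.

```python
from typing import Dict, Iterable, Set, Tuple


def trust_level(impact: Dict[str, int], suspected: Set[str]) -> int:
    """Sum of the impact factors of the nodes that are not suspected."""
    return sum(factor for node, factor in impact.items() if node not in suspected)


def is_trusted(impact: Dict[str, int], suspected: Set[str], threshold: int) -> bool:
    """Assumption: the system is trusted when the trust level reaches the threshold."""
    return trust_level(impact, suspected) >= threshold


def query_round(
    impact: Dict[str, int], responses: Iterable[str], threshold: int
) -> Tuple[Set[str], Set[str], bool]:
    """Simulate one query round.

    `responses` lists responder names in arrival order (hypothetical input).
    The round stops as soon as the responders' summed impact factors reach
    the threshold. Returns (responders, suspected, threshold_satisfied).
    """
    responders: Set[str] = set()
    total = 0
    for node in responses:
        # Ignore unknown nodes and duplicate responses.
        if node not in impact or node in responders:
            continue
        responders.add(node)
        total += impact[node]
        if total >= threshold:
            break  # enough impact collected; stop waiting
    # Assumption: nodes that did not answer within the round are suspected.
    suspected = set(impact) - responders
    return responders, suspected, total >= threshold


if __name__ == "__main__":
    impact = {"p1": 3, "p2": 2, "p3": 1}  # hypothetical impact factors
    print(trust_level(impact, {"p3"}))
    print(query_round(impact, ["p2", "p1", "p3"], threshold=4))
```

Because the set of responders is rebuilt at every call to `query_round`, a different subset of S can satisfy the threshold in each round, which mirrors the non-fixed response set described in the abstract.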

Contributor : Pierre Sens
Submitted on : Thursday, October 27, 2016 - 11:27:38 AM
Last modification on : Friday, January 8, 2021 - 5:46:03 PM




  • HAL Id : hal-01352162, version 1


Anubis Graciela de Moraes Rossetto, Claudio R. Geyer, Luciana Arantes, Pierre Sens. Implementing a flexible failure detector that expresses the confidence in the system. LADC 2016 - 7th Latin-American Symposium on Dependable Computing, Oct 2016, Cali, Colombia. ⟨hal-01352162⟩