
Which Verification for Soft Error Detection?

Abstract : Many methods are available to detect silent errors in high-performance computing (HPC) applications. Each comes with a given cost and recall (fraction of all errors that are actually detected). The main contribution of this paper is to show which detector(s) to use, and to characterize the optimal computational pattern for the application: how many detectors of each type to use, together with the length of the work segment that precedes each of them. We conduct a comprehensive complexity analysis of this optimization problem, showing NP-completeness and designing an FPTAS (Fully Polynomial-Time Approximation Scheme). On the practical side, we provide a greedy algorithm whose performance is shown to be close to the optimal for a realistic set of evaluation scenarios.
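To make the notion of recall in the abstract concrete, here is a minimal sketch in Python. The function names `recall` and `combined_recall` are hypothetical and do not come from the report. The combination rule additionally assumes that detectors miss errors independently of one another, which is an illustrative assumption rather than a statement of the report's model.

```python
from math import prod
from typing import Iterable


def recall(detected: int, total: int) -> float:
    """Recall as defined in the abstract: fraction of all errors that are detected."""
    if total <= 0:
        raise ValueError("total number of errors must be positive")
    if not 0 <= detected <= total:
        raise ValueError("detected must lie between 0 and total")
    return detected / total


def combined_recall(recalls: Iterable[float]) -> float:
    """Probability that at least one detector in a sequence catches an error.

    ASSUMPTION (not from the report): each detector misses an error
    independently, so the miss probabilities multiply.
    """
    return 1.0 - prod(1.0 - r for r in recalls)


if __name__ == "__main__":
    print(recall(80, 100))
    print(combined_recall([0.5, 0.5]))
```

Under the independence assumption, chaining several cheap low-recall detectors raises the chance of catching an error, at the price of paying each detector's cost. This is the kind of trade-off the optimization problem in the abstract weighs.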
Contributor : Equipe Roma
Submitted on : Monday, October 5, 2015 - 6:55:26 PM
Last modification on : Friday, September 30, 2022 - 4:12:10 AM
Long-term archiving on : Wednesday, April 26, 2017 - 10:21:54 PM


Files produced by the author(s)


  • HAL Id : hal-01164445, version 2



Leonardo Bautista-Gomez, Anne Benoit, Aurélien Cavelan, Saurabh K. Raina, Yves Robert, et al. Which Verification for Soft Error Detection? [Research Report] RR-8741, INRIA Grenoble; ENS Lyon; Jaypee Institute of Information Technology, India; Argonne National Laboratory; University of Tennessee Knoxville, USA; INRIA. 2015, pp.20. ⟨hal-01164445v2⟩


