Conference papers

Which Verification for Soft Error Detection?

Abstract : Many methods are available to detect silent errors in high-performance computing (HPC) applications. Each comes with a given cost and a given recall (the fraction of all errors that it actually detects). The main contribution of this paper is to characterize the optimal computational pattern for an application: which detector(s) to use, how many detectors of each type to use, and the length of the work segment that precedes each of them. We conduct a comprehensive complexity analysis of this optimization problem, showing NP-completeness and designing an FPTAS (Fully Polynomial-Time Approximation Scheme). On the practical side, we provide a greedy algorithm whose performance is shown to be close to optimal for a realistic set of evaluation scenarios.
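
The cost/recall trade-off described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's model or its greedy algorithm: the `Detector` class, the independence assumption in `combined_recall`, and the recall-per-cost ranking are all hypothetical choices made here for illustration only.

```python
from dataclasses import dataclass
from math import prod


@dataclass(frozen=True)
class Detector:
    """A hypothetical silent-error detector (names and fields are assumptions)."""
    name: str
    cost: float    # time to run the detector once, in arbitrary units
    recall: float  # fraction of errors it detects, in [0, 1]


def combined_recall(detectors):
    """Probability that at least one detector catches an error.

    Assumes detections are independent, which is an assumption of this
    sketch and not a claim taken from the paper.
    """
    return 1.0 - prod(1.0 - d.recall for d in detectors)


def total_cost(detectors):
    """Total time spent running every detector once."""
    return sum(d.cost for d in detectors)


def rank_by_recall_per_cost(detectors):
    """Toy heuristic: order detectors by recall per unit of cost, best first."""
    return sorted(detectors, key=lambda d: d.recall / d.cost, reverse=True)
```

With two hypothetical detectors of recall 0.5 each, `combined_recall` returns 0.75 under the independence assumption, which shows why chaining several cheap, imperfect detectors can be competitive with one expensive detector.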
Contributor : Equipe Roma
Submitted on : Thursday, January 7, 2016 - 3:04:16 PM
Last modification on : Friday, September 30, 2022 - 4:12:11 AM
Long-term archiving on : Friday, April 8, 2016 - 1:27:34 PM


Files produced by the author(s)


  • HAL Id : hal-01252382, version 1



Leonardo Bautista-Gomez, Anne Benoit, Aurélien Cavelan, Saurabh K. Raina, Yves Robert, et al. Which Verification for Soft Error Detection?. High Performance Computing 2015, Dec 2015, Bangalore, India. ⟨hal-01252382⟩


