Characterizing a Neutron-Induced Fault Model for Deep Neural Networks

Evaluating the reliability of Deep Neural Networks (DNNs) executed on Graphics Processing Units (GPUs) is challenging because the hardware architecture is highly complex and the software frameworks are composed of many layers of abstraction. While software-level fault injection is a common and fast way to evaluate the reliability of complex applications, it may produce unrealistic results since it has limited access to the hardware resources and the adopted fault models may be too naive (i.e., single and double bit flips). In contrast, physical fault injection with a neutron beam provides realistic error rates but lacks fault propagation visibility. This paper proposes a characterization of the DNN fault model that combines neutron beam experiments with software-level fault injection. We exposed GPUs running General Matrix Multiplication (GEMM) and DNNs to a neutron beam to measure their error rates. On DNNs, we observe that the percentage of critical errors can be up to 61%, and we show that ECC is ineffective in reducing critical errors. We then performed a complementary software-level fault injection, using fault models derived from RTL simulations. Our results show that, when complex fault models are injected, the YOLOv3 misdetection rate is very close to the rate measured with beam experiments, which is 8.66× higher than the rate measured with fault injection using only single-bit flips.
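To make the "naive" fault model mentioned in the abstract concrete, the following is a minimal sketch of a software-level single or double bit flip applied to one output element of a GEMM. It is an illustration only, not the authors' injector: the function names (`flip_bits`, `gemm_with_fault`), the use of NumPy on the CPU, and the choice of corrupting an output element are all assumptions made for this example. The RTL-derived fault models described in the paper are not reproduced here.

```python
import numpy as np


def flip_bits(value, bits):
    """Return a float32 copy of `value` with the given bit positions flipped.

    Bit 0 is the least significant mantissa bit; bit 31 is the sign bit.
    Passing one position models a single-bit flip, two a double-bit flip.
    (Hypothetical helper, not taken from the paper.)
    """
    mask = 0
    for bit in bits:
        mask ^= 1 << bit
    raw = np.array([value], dtype=np.float32).view(np.uint32)
    raw ^= np.uint32(mask)
    return raw.view(np.float32)[0]


def gemm_with_fault(a, b, row, col, bits):
    """Compute a @ b, then corrupt one output element (assumed injection site)."""
    c = (a @ b).astype(np.float32)
    c[row, col] = flip_bits(c[row, col], bits)
    return c


rng = np.random.default_rng(0)
a = rng.standard_normal((4, 4), dtype=np.float32)
b = rng.standard_normal((4, 4), dtype=np.float32)

golden = (a @ b).astype(np.float32)           # fault-free reference
faulty = gemm_with_fault(a, b, 1, 2, [30])    # flip the top exponent bit

# Count the output elements that differ from the fault-free result.
mismatches = int(np.count_nonzero(golden != faulty))
print("corrupted elements:", mismatches)
```

A fault injected this way corrupts exactly one element, which is part of why such models can be too simple: a fault in the hardware may affect several outputs at once, something this sketch does not model.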

Domains

Hardware Architecture [cs.AR]

Main file

tns_2023.pdf (760.61 KB)

Origin: Files produced by the author(s)

Contributor: Fernando Fernandes dos Santos

https://inria.hal.science/hal-03865253

Submitted on: Tuesday, November 29, 2022, 15:51:52

Last modified on: Tuesday, March 19, 2024, 09:34:06

Dates and versions

hal-03865253, version 1 (22-11-2022)

hal-03865253, version 2 (29-11-2022)

License

Attribution

Identifiers

HAL Id: hal-03865253, version 2
arXiv: 2211.13094

Cite

Fernando Fernandes dos Santos, Angeliki Kritikakou, Josie Esteban Rodriguez Condia, Juan David Guerrero Balaguera, Matteo Sonza Reorda, et al. Characterizing a Neutron-Induced Fault Model for Deep Neural Networks. IEEE Transactions on Nuclear Science, In press. ⟨hal-03865253v2⟩


Collections

UNIV-RENNES1 CNRS INRIA INSA-RENNES IRISA CENTRALESUPELEC INRIA2 UR1-MATH-STIC UR1-UFR-ISTIC UNIV-RENNES ANR UR1-MATH-NUM
