Faithful and Robust Local Interpretability for Textual Predictions - Inria - Institut national de recherche en sciences et technologies du numérique
Preprint, Working Paper. Year: 2023

Faithful and Robust Local Interpretability for Textual Predictions

Abstract

Interpretability is essential for machine learning models to be trusted and deployed in critical domains. However, existing methods for interpreting text models are often complex, lack solid mathematical foundations, and their performance is not guaranteed. In this paper, we propose FRED (Faithful and Robust Explainer for textual Documents), a novel method for interpreting predictions over text. FRED identifies key words in a document that significantly impact the prediction when removed. We establish the reliability of FRED through formal definitions and theoretical analyses on interpretable classifiers. Additionally, our empirical evaluation against state-of-the-art methods demonstrates the effectiveness of FRED in providing insights into text models.
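
The idea of finding words whose removal significantly changes a prediction can be illustrated with a minimal sketch. This is not the FRED algorithm, whose sampling scheme and guarantees are not described in the abstract; it is a plain leave-one-word-out ablation. The classifier `toy_sentiment_score`, its cue-word lists, and the helper `word_removal_importance` are all hypothetical names introduced only for illustration.

```python
import math
from typing import Callable, List, Tuple

# Hypothetical toy classifier (an assumption, not from the paper):
# counts positive and negative cue words and squashes the difference into (0, 1).
POSITIVE = {"great", "good", "excellent"}
NEGATIVE = {"bad", "boring", "awful"}


def toy_sentiment_score(text: str) -> float:
    words = text.lower().split()
    raw = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return 1.0 / (1.0 + math.exp(-raw))


def word_removal_importance(
    text: str, predict: Callable[[str], float]
) -> List[Tuple[str, float]]:
    """Rank words by how much the score changes when each is removed.

    Simple leave-one-out ablation; a stand-in for illustration,
    not the method proposed in the paper.
    """
    words = text.split()
    base = predict(text)
    drops = []
    for i, word in enumerate(words):
        # Rebuild the document without the i-th word and re-score it.
        reduced = " ".join(words[:i] + words[i + 1:])
        drops.append((word, base - predict(reduced)))
    # Largest absolute change first.
    return sorted(drops, key=lambda pair: abs(pair[1]), reverse=True)


ranking = word_removal_importance("the film was great", toy_sentiment_score)
for word, drop in ranking:
    print(f"{word}: {drop:+.3f}")
```

Removing one word at a time keeps the sketch short, but it cannot detect words that matter only in combination; handling such cases would require removing subsets of words rather than single words.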

Dates and versions

hal-04394149, version 1 (15-01-2024)

License

Attribution

Cite

Gianluigi Lopardo, Frederic Precioso, Damien Garreau. Faithful and Robust Local Interpretability for Textual Predictions. 2024. ⟨hal-04394149⟩