Conference paper, 2020

Assessing Unintended Memorization in Neural Discriminative Sequence Models

Abstract

Despite their success in a multitude of tasks, neural models trained on natural language have been shown to memorize the intricacies of their training data, posing a potential privacy threat. In this work, we propose a metric to quantify unintended memorization in neural discriminative sequence models. The proposed metric, named d-exposure (discriminative exposure), utilizes language ambiguity and classification confidence to elicit the model's propensity to memorization. Through experimental work on a named entity recognition task, we show the validity of d-exposure to measure memorization. In addition, we show that d-exposure is not a measure of overfitting, as it does not increase when the model overfits.
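The abstract describes d-exposure only at a high level; its precise definition is given in the paper itself. As a rough illustration of the general rank-based "exposure" idea that such metrics build on (the function name, scoring setup, and candidate pool here are hypothetical assumptions for illustration, not the authors' definition), a model's confidence in a target can be compared against its confidence over a pool of alternative candidates:

```python
import math

def exposure(target_score, candidate_scores):
    """Rank-based exposure sketch (hypothetical, not the paper's d-exposure):
    how unusually confident the model is in `target_score` relative to a
    pool of alternative candidate scores. Returns log2(N) - log2(rank),
    where rank 1 means the target outscores every candidate, giving the
    highest possible exposure; a target ranked last gives exposure 0."""
    n = len(candidate_scores) + 1  # pool size includes the target itself
    rank = 1 + sum(1 for s in candidate_scores if s > target_score)
    return math.log2(n) - math.log2(rank)

# Target outscores all 7 alternatives: rank 1 in a pool of 8 -> log2(8) = 3.0
print(exposure(0.9, [0.1, 0.2, 0.3, 0.15, 0.05, 0.4, 0.25]))  # 3.0
```

A high value indicates the model is disproportionately confident in one specific candidate, which is the kind of signal memorization metrics look for.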
Main file
HelaliM+20.pdf (255.94 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-02880581, version 1 (25-06-2020)

Identifiers

  • HAL Id: hal-02880581, version 1

Cite

Mossad Helali, Thomas Kleinbauer, Dietrich Klakow. Assessing Unintended Memorization in Neural Discriminative Sequence Models. 23rd International Conference on Text, Speech and Dialogue, Sep 2020, Brno, Czech Republic. ⟨hal-02880581⟩