Assessing Unintended Memorization in Neural Discriminative Sequence Models - Archive ouverte HAL
Conference Paper, Year: 2020

Assessing Unintended Memorization in Neural Discriminative Sequence Models


Abstract

Despite their success in a multitude of tasks, neural models trained on natural language have been shown to memorize the intricacies of their training data, posing a potential privacy threat. In this work, we propose a metric to quantify unintended memorization in neural discriminative sequence models. The proposed metric, named d-exposure (discriminative exposure), utilizes language ambiguity and classification confidence to elicit the model's propensity for memorization. Through experimental work on a named entity recognition task, we show the validity of d-exposure as a measure of memorization. In addition, we show that d-exposure is not a measure of overfitting, as it does not increase when the model overfits.
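The abstract does not define d-exposure itself; the metric's name suggests it adapts the rank-based exposure measure introduced for generative language models by Carlini et al. ("The Secret Sharer", 2019), where a secret's exposure is the log of the candidate-space size minus the log of the secret's rank under the model. As background only, a minimal sketch of that rank-based computation is given below; the scores, candidate set, and `exposure` function are illustrative assumptions, not the paper's actual d-exposure formulation.

```python
import math

def exposure(canary_score: float, reference_scores: list[float]) -> float:
    """Rank-based exposure in the style of Carlini et al. (2019):
    log2 of the candidate-space size minus log2 of the canary's rank.
    Lower scores are assumed better (e.g. negative log-likelihood).
    NOTE: illustrative background, not the paper's d-exposure metric."""
    all_scores = sorted(reference_scores + [canary_score])
    rank = all_scores.index(canary_score) + 1  # 1-based rank of the canary
    return math.log2(len(all_scores)) - math.log2(rank)

# Hypothetical candidate scores: 1023 references plus the canary = 1024 total.
scores = [float(i) for i in range(1, 1024)]
# If the model scores the canary best of 1024 candidates, exposure is
# maximal: log2(1024) - log2(1) = 10.
print(exposure(0.5, scores))  # -> 10.0
```

A fully memorized canary ranks first and attains the maximum exposure, log2 of the candidate-space size; a canary the model treats like any other candidate has an expected exposure near 1.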
Main file: HelaliM+20.pdf (255.94 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-02880581 , version 1 (25-06-2020)

Identifiers

  • HAL Id : hal-02880581 , version 1

Cite

Mossad Helali, Thomas Kleinbauer, Dietrich Klakow. Assessing Unintended Memorization in Neural Discriminative Sequence Models. 23rd International Conference on Text, Speech and Dialogue, Sep 2020, Brno, Czech Republic. ⟨hal-02880581⟩