
Preventing author profiling through zero-shot multilingual back-translation

Abstract : Documents as short as a single sentence may inadvertently reveal sensitive information about their authors, including, e.g., their gender or ethnicity. Style transfer is an effective way of transforming texts in order to remove any information that enables author profiling. However, for a number of current state-of-the-art approaches, the improved privacy is accompanied by an undesirable drop in the downstream utility of the transformed data. In this paper, we propose a simple, zero-shot way to effectively lower the risk of author profiling through multilingual back-translation using off-the-shelf translation models. We compare our models with five representative text style transfer models on three datasets across different domains. Results from both an automatic and a human evaluation show that our approach achieves the best overall performance while requiring no training data. We are able to lower the adversarial prediction of gender and race by up to 22% while retaining 95% of the original utility on downstream tasks.
Document type :
Conference papers
Contributor : Emmanuel Vincent
Submitted on : Tuesday, September 21, 2021 - 4:47:28 PM
Last modification on : Wednesday, September 22, 2021 - 3:08:40 AM
Long-term archiving on : Wednesday, December 22, 2021 - 7:16:09 PM


Files produced by the author(s)


  • HAL Id : hal-03350906, version 1



David Ifeoluwa Adelani, Miaoran Zhang, Xiaoyu Shen, Ali Davody, Thomas Kleinbauer, et al. Preventing author profiling through zero-shot multilingual back-translation. 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP), Nov 2021, Punta Cana, Dominican Republic. ⟨hal-03350906⟩


