Conference papers

First Align, then Predict: Understanding the Cross-Lingual Ability of Multilingual BERT

Abstract : Multilingual pretrained language models have demonstrated remarkable zero-shot cross-lingual transfer capabilities. Such transfer emerges by fine-tuning on a task of interest in one language and evaluating on a distinct language, not seen during fine-tuning. Despite promising results, we still lack a proper understanding of the source of this transfer. Using a novel layer ablation technique and analyses of the model's internal representations, we show that multilingual BERT, a popular multilingual language model, can be viewed as the stacking of two sub-networks: a multilingual encoder followed by a task-specific language-agnostic predictor. While the encoder is crucial for cross-lingual transfer and remains mostly unchanged during fine-tuning, the task predictor has little importance for the transfer and can be reinitialized during fine-tuning. We present extensive experiments with three distinct tasks, seventeen typologically diverse languages and multiple domains to support our hypothesis.
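
The layer ablation idea mentioned in the abstract can be illustrated with a minimal sketch: redraw the weights of a chosen block of layers at random while leaving the remaining layers at their pretrained values. This is a toy illustration only, not the authors' implementation. The model here is a plain list of weight lists, and the names `make_model` and `reinitialize_layers`, the 12-layer depth, the split at layer 8, and the 0.02 standard deviation are all assumptions made for the example.

```python
import copy
import random


def make_model(num_layers, width, seed):
    """Build a toy 'pretrained' model: one list of float weights per layer."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 0.02) for _ in range(width)]
            for _ in range(num_layers)]


def reinitialize_layers(model, layer_indices, seed=0, std=0.02):
    """Return a copy of `model` with the selected layers redrawn at random.

    Layers not listed in `layer_indices` keep their original weights.
    """
    rng = random.Random(seed)
    ablated = copy.deepcopy(model)
    for i in layer_indices:
        ablated[i] = [rng.gauss(0.0, std) for _ in ablated[i]]
    return ablated


# Hypothetical setup: 12 layers, with the top 4 treated as the "predictor".
pretrained = make_model(num_layers=12, width=4, seed=1)
upper_layers = range(8, 12)
ablated = reinitialize_layers(pretrained, upper_layers, seed=2)

# Indices of the layers whose weights differ from the pretrained model.
changed = [i for i in range(len(pretrained)) if ablated[i] != pretrained[i]]
print(changed)
```

In an actual experiment, the ablated model would then be fine-tuned on the source language and evaluated on a target language, and the resulting scores compared with those of an unablated model. The sketch covers only the reinitialization step.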
Document type :
Conference papers
https://hal.inria.fr/hal-03239087
Contributor : Benoît Sagot
Submitted on : Thursday, May 27, 2021 - 3:10:46 PM
Last modification on : Thursday, February 3, 2022 - 11:17:53 AM
Long-term archiving on : Saturday, August 28, 2021 - 7:06:01 PM

File

2021.eacl-main.189.pdf
Publisher files allowed on an open archive

Identifiers

  • HAL Id : hal-03239087, version 1

Citation

Benjamin Muller, Yanai Elazar, Benoît Sagot, Djamé Seddah. First Align, then Predict: Understanding the Cross-Lingual Ability of Multilingual BERT. EACL 2021 - The 16th Conference of the European Chapter of the Association for Computational Linguistics, Apr 2021, Kyiv / Virtual, Ukraine. ⟨hal-03239087⟩
