What does BERT learn about the structure of language?

Abstract: BERT is a recent language representation model that has performed surprisingly well on diverse language understanding benchmarks. This result suggests that BERT networks capture structural information about language. In this work, we provide novel support for this claim by performing a series of experiments to unpack the elements of English language structure learned by BERT. We first show that BERT's phrasal representation captures phrase-level information in the lower layers. We also show that BERT's intermediate layers encode a rich hierarchy of linguistic information, with surface features at the bottom, syntactic features in the middle and semantic features at the top. BERT turns out to require deeper layers when long-distance dependency information is needed, e.g. to track subject-verb agreement. Finally, we show that BERT representations capture linguistic information in a compositional way that mimics classical, tree-like structures.
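
The paper's own probing setup is described in the PDF below; purely as a rough illustration of the layer-wise analysis the abstract refers to, here is a minimal sketch of extracting per-layer BERT representations for probing. It assumes the Hugging Face transformers library (which postdates the paper's experiments) and the bert-base-uncased checkpoint; it is not the authors' code.

import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
# output_hidden_states=True makes the model return every layer's activations.
model = BertModel.from_pretrained("bert-base-uncased", output_hidden_states=True)
model.eval()

# Example sentence with a long-distance subject-verb agreement dependency
# ("keys ... are"), the kind of phenomenon the paper probes.
sentence = "The keys to the cabinet are on the table."
inputs = tokenizer(sentence, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# outputs.hidden_states is a tuple of 13 tensors: the embedding layer plus
# one per transformer layer, each of shape (batch, seq_len, 768).
for layer_idx, layer_output in enumerate(outputs.hidden_states):
    # Mean-pool over tokens to get one fixed-size vector per layer; a separate
    # probing classifier trained on each layer's vectors would reveal which
    # linguistic properties (surface, syntactic, semantic) that layer encodes.
    sentence_vector = layer_output.mean(dim=1)
    print(f"layer {layer_idx}: {tuple(sentence_vector.shape)}")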
Document type: Conference papers

Cited literature: 24 references

https://hal.inria.fr/hal-02131630
Contributor: Benoît Sagot
Submitted on: Tuesday, June 4, 2019 - 6:32:48 PM
Last modification on: Wednesday, December 9, 2020 - 3:45:50 AM

File

intbert_acl19paper-3.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-02131630, version 1

Citation

Ganesh Jawahar, Benoît Sagot, Djamé Seddah. What does BERT learn about the structure of language? ACL 2019 - 57th Annual Meeting of the Association for Computational Linguistics, Jul 2019, Florence, Italy. ⟨hal-02131630⟩

Metrics

Record views: 2936
File downloads: 15555