Information-Restricted Neural Language Models Reveal Different Brain Regions' Sensitivity to Semantics, Syntax and Context

Alexandre Pasquiou; Yair Lakretz; Bertrand Thirion; Christophe Pallier

doi:10.1162/nol_a_00125

Journal article, Neurobiology of Language, Year: 2023


Alexandre Pasquiou

Role: Author
PersonId: 1144780
IdHAL: alexandre-pasquiou
ORCID: 0000-0002-7966-8083

Modèles et inférence pour les données de Neuroimagerie

Neuroimagerie cognitive - Psychologie cognitive expérimentale

Yair Lakretz

Role: Author

Neuroimagerie cognitive - Psychologie cognitive expérimentale

Bertrand Thirion

Role: Author

Modèles et inférence pour les données de Neuroimagerie

Christophe Pallier

Role: Author

Neuroimagerie cognitive - Psychologie cognitive expérimentale

Abstract

A fundamental question in neurolinguistics concerns the brain regions involved in syntactic and semantic processing during speech comprehension, both at the lexical (word processing) and supra-lexical levels (sentence and discourse processing). To what extent are these regions separated or intertwined? To address this question, we introduce a novel approach exploiting neural language models to generate high-dimensional feature sets that separately encode semantic and syntactic information. More precisely, we train a lexical language model, GloVe, and a supra-lexical language model, GPT-2, on a text corpus from which we selectively removed either syntactic or semantic information. We then assess to what extent the features derived from these information-restricted models are still able to predict the fMRI time-courses of humans listening to naturalistic text. Furthermore, to determine the windows of integration of brain regions involved in supra-lexical processing, we manipulate the size of contextual information provided to GPT-2. The analyses show that, while most brain regions involved in language comprehension are sensitive to both syntactic and semantic features, the relative magnitudes of these effects vary across these regions. Moreover, regions that are best fitted by semantic or syntactic features are more spatially dissociated in the left hemisphere than in the right one, and the right hemisphere shows sensitivity to longer contexts than the left. The novelty of our approach lies in the ability to control for the information encoded in the models' embeddings by manipulating the training set. These "information-restricted" models complement previous studies that used language models to probe the neural bases of language, and shed new light on its spatial organization.
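
The context manipulation mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the helper names `context_windows` and `pearson_r` are hypothetical, whitespace tokenization stands in for a real tokenizer, and scoring a predicted time-course against an observed one by Pearson correlation is an assumption, since the abstract does not name the metric used.

```python
from math import sqrt
from typing import List, Sequence


def context_windows(tokens: Sequence[str], max_context: int) -> List[List[str]]:
    """For each position, return the current token preceded by at most
    `max_context` earlier tokens (hypothetical helper, for illustration)."""
    if max_context < 0:
        raise ValueError("max_context must be non-negative")
    windows = []
    for i in range(len(tokens)):
        start = max(0, i - max_context)  # clip the window at the text start
        windows.append(list(tokens[start:i + 1]))
    return windows


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equal-length series, e.g. a predicted
    and an observed time-course (assumed metric, not stated in the abstract)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Toy input: each window is what a model would be allowed to see at that step.
tokens = "the cat sat on the mat".split()
windows = context_windows(tokens, max_context=2)
```

Under this sketch, a larger `max_context` lets the model condition on more preceding tokens, so comparing prediction scores across several values would give a rough view of how much context a given region's signal reflects.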

Keywords

fMRI; encoding models; syntax; context; semantics; LLM

Subject areas

Artificial intelligence [cs.AI]; Linguistics; Neuroscience

Main file

final_version.pdf (26.77 MB)

Origin: Files produced by the author(s)


https://inria.hal.science/hal-04353489

Submitted on: Tuesday, 19 December 2023, 14:47:55

Last modified on: Wednesday, 3 April 2024, 10:20:13

Dates and versions

hal-04353489, version 1 (19-12-2023)

License

Attribution

Identifiers

HAL Id: hal-04353489, version 1
DOI: 10.1162/nol_a_00125

Cite

Alexandre Pasquiou, Yair Lakretz, Bertrand Thirion, Christophe Pallier. Information-Restricted Neural Language Models Reveal Different Brain Regions' Sensitivity to Semantics, Syntax and Context: LANGUAGE MODELS SHOW BRAIN SENSITIVITY TO SEMANTICS, SYNTAX AND CONTEXT. Neurobiology of Language, 2023, 4 (4), pp.611-636. ⟨10.1162/nol_a_00125⟩. ⟨hal-04353489⟩


Collections

CEA INRIA INRIA2 CEA-UPSAY UNIV-PARIS-SACLAY JOLIOT CEA-DRF NEUROSPIN GS-COMPUTER-SCIENCE GS-LIFE-SCIENCES-HEALTH
