Poster communications

Recurrent Neural Networks Models for Developmental Language Acquisition: Reservoirs Outperform LSTMs

Xavier Hinaut 1, Alexandre Variengien 2,1
1 Mnemosyne - Mnemonic Synergy
LaBRI - Laboratoire Bordelais de Recherche en Informatique, Inria Bordeaux - Sud-Ouest, IMN - Institut des Maladies Neurodégénératives [Bordeaux]
Abstract : We previously developed cortico-striatal models for sentence comprehension (Hinaut & Dominey 2013) and sentence production (Hinaut et al. 2015). The sentence comprehension model is based on the reservoir computing principle: a random recurrent neural network (a reservoir) provides a rich recombination of sequential word inputs (e.g. a piece of prefrontal cortex), and an output layer (e.g. striatum) learns to "read out" the roles of words in the sentence from the internal recurrent dynamics. The model has several interesting properties, such as the ability to predict the semantic roles of words during online processing. Additionally, we demonstrated its robustness to various corpus complexities, in different languages, and even its ability to work with bilingual inputs. In this study, we propose to (1) use the model in a new task related to developmental language acquisition (i.e. Cross-Situational Learning), (2) provide a quantitative comparison with one of the best performing neural networks for sequential tasks (an LSTM), and (3) provide a qualitative analysis of the way reservoirs and LSTMs solve the task. This new Cross-Situational Task is as follows: for a given sentence, the target output provided often contains more detailed features than what is available in the sentence. Thus, the models must not only learn to parse sentences to extract useful information, but also statistically infer which word is associated with which feature. While reservoir units are modelled as leaky average firing rate neurons, LSTM units are engineered to gate information using a costly and biologically implausible learning algorithm (Back-Propagation Through Time). We found that both models were able to successfully learn the task: the LSTM reached slightly better performance on the basic corpus, but the reservoir significantly outperformed the LSTM on more challenging corpora with increasing vocabulary sizes (for a given set of hyperparameters).
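The reservoir computing principle summarized above (fixed random recurrent weights, leaky-integrator units, and a trained linear readout) can be illustrated with a minimal echo state network sketch. All sizes, data, and hyperparameter values below are toy placeholders for illustration, not the authors' actual model or corpus:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (hypothetical): one-hot word inputs -> semantic-role outputs.
n_words, n_reservoir, n_roles = 10, 100, 3
leak_rate, spectral_radius = 0.3, 0.9

# Fixed random weights: only the readout W_out is learned.
W_in = rng.uniform(-1.0, 1.0, (n_reservoir, n_words))
W = rng.uniform(-0.5, 0.5, (n_reservoir, n_reservoir))
W *= spectral_radius / max(abs(np.linalg.eigvals(W)))  # scale recurrent dynamics

def run_reservoir(inputs):
    """Collect leaky-integrator states for a sequence of one-hot word vectors."""
    x = np.zeros(n_reservoir)
    states = []
    for u in inputs:
        pre = np.tanh(W_in @ u + W @ x)
        x = (1 - leak_rate) * x + leak_rate * pre  # leaky average firing rate unit
        states.append(x.copy())
    return np.array(states)

# Toy data: random word-id sequences with random role targets (placeholders).
sentences = [rng.integers(0, n_words, size=5) for _ in range(50)]
targets = np.array([rng.integers(0, n_roles) for _ in sentences])

X = np.array([run_reservoir(np.eye(n_words)[s])[-1] for s in sentences])  # final states
Y = np.eye(n_roles)[targets]

# Ridge-regression readout: a single linear solve, i.e. one pass over the data.
ridge = 1e-6
W_out = np.linalg.solve(X.T @ X + ridge * np.eye(n_reservoir), X.T @ Y).T

pred = (W_out @ X.T).T.argmax(axis=1)
train_acc = (pred == targets).mean()
```

The one-shot ridge solve is what makes reservoir training cheap compared to Back-Propagation Through Time, which must unroll the recurrent network and iterate over the data for many epochs.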
We analyzed the hidden activations of internal units of both models. Despite the deep differences between the two models (trained versus fixed internal weights), we were able to uncover similar inner dynamics: the most useful units (those with the strongest weights to the output layer) seemed tuned to keep track of several specific words in the sentence. Because of its learning algorithm, such behavior is expected in an LSTM but not in a reservoir; indeed, the LSTM contained more tuned-like units than the reservoir. These differences between LSTMs and reservoirs highlight differences between classical Deep Learning approaches (based on the back-propagation algorithm) and more plausible brain learning mechanisms. First, the reservoir is more efficient in terms of training time and cost (the LSTM needs several passes over the training data, while the reservoir uses them only once). Second, only the reservoir model seems to scale to larger corpora without the need to specifically adapt the hyperparameters of the model. Finally, the presence of more tuned units in the LSTM compared to the reservoir might explain why the LSTM seems to overfit the training data and shows limited generalization capabilities when the available training data becomes limited.
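The unit analysis described above ranks hidden units by the strength of their connections to the output layer before inspecting them for word-specific tuning. A minimal sketch of that ranking step, with a hypothetical readout weight matrix standing in for the trained model's weights:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical learned readout weights, shape (n_outputs, n_hidden).
n_outputs, n_hidden = 5, 200
W_out = rng.normal(0.0, 1.0, (n_outputs, n_hidden))

# Rank hidden units by the L2 norm of their outgoing weights: units with the
# strongest influence on the readout are the candidates to inspect for tuning.
unit_strength = np.linalg.norm(W_out, axis=0)
top_units = np.argsort(unit_strength)[::-1][:10]  # indices of the 10 strongest units
```

The activations of these top-ranked units would then be plotted against the input word sequence to see whether each unit tracks specific words, as the analysis reports for both models.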

https://hal.inria.fr/hal-03146558
Contributor: Xavier Hinaut
Submitted on: Friday, February 19, 2021 - 5:03:55 PM
Last modification on: Friday, January 21, 2022 - 3:10:39 AM
Long-term archiving on: Thursday, May 20, 2021 - 6:28:19 PM

File

HinautVariengien_SNL2020_poste...
Files produced by the author(s)

Identifiers

  • HAL Id : hal-03146558, version 1

Citation

Xavier Hinaut, Alexandre Variengien. Recurrent Neural Networks Models for Developmental Language Acquisition: Reservoirs Outperform LSTMs. SNL 2020 - 12th Annual Meeting of the Society for the Neurobiology of Language, Oct 2020, Virtual Edition, Canada. ⟨hal-03146558⟩
