Cross-Situational Learning Towards Robot Grounding - Inria - Institut national de recherche en sciences et technologies du numérique
Preprint, Working Paper. Year: 2022

Cross-Situational Learning Towards Robot Grounding

Subba Reddy Oota
Frédéric Alexandre
Xavier Hinaut

Abstract

How do children acquire language through unsupervised or noisy supervision? How do their brains process language? We bring this perspective to machine learning and robotics, where part of the problem is understanding how language models can perform grounded language acquisition through noisy supervision, and discussing how they can account for brain learning dynamics. Most prior work has tracked the co-occurrence between single words and referents to model how infants learn word-referent mappings. This paper studies cross-situational learning (CSL) with full sentences: we want to understand the brain mechanisms that enable children to learn mappings between words and their meanings from full sentences in early language learning. We investigate the CSL task on a few training examples with two sequence-based models: (i) Echo State Networks (ESN) and (ii) Long Short-Term Memory networks (LSTM). Most importantly, we explore several word representations, including One-Hot, GloVe, pretrained BERT, and fine-tuned BERT representations (last-layer token representations), to perform the CSL task. We apply our approach to three diverse datasets (two grounded language datasets and a robotic dataset) and observe that: (1) One-Hot, GloVe, and pretrained BERT representations are less efficient than representations obtained from fine-tuned BERT. (2) ESN online with final learning (FL) yields superior performance over ESN online continual learning (CL), offline learning, and LSTMs, indicating the greater biological plausibility of ESNs and of the cognitive process of sentence reading. (3) LSTMs with fewer hidden units show higher performance on small datasets, but LSTMs with more hidden units are needed to perform reasonably well on larger corpora. (4) ESNs demonstrate better generalization than LSTM models for increasingly large vocabularies.
Overall, these models are able to learn from scratch to link complex relations between words and their corresponding meaning concepts, handling polysemous and synonymous words. Moreover, we argue that such models can extend to help current human-robot interaction studies on language grounding and better understand children's developmental language acquisition. We make the code publicly available.
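To make the idea of cross-situational learning concrete, here is a minimal sketch of the single-word co-occurrence tracking that the abstract attributes to most prior work. It is not the ESN or LSTM approach studied in the paper, and it is not the authors' code. The toy sentences, the referent labels (`CUP`, `BOWL`, `LEFT`, `RIGHT`), and the function names `learn_mappings` and `best_referent` are all assumptions made up for illustration.

```python
from collections import defaultdict

# Hypothetical toy data: each situation pairs an utterance with the set of
# referents visible in the scene. No word is ever labeled individually.
SITUATIONS = [
    ("the cup is on the left", {"CUP", "LEFT"}),
    ("the cup is on the right", {"CUP", "RIGHT"}),
    ("the bowl is on the left", {"BOWL", "LEFT"}),
    ("the bowl is on the right", {"BOWL", "RIGHT"}),
]


def learn_mappings(situations):
    """Count how often each word co-occurs with each referent."""
    counts = defaultdict(lambda: defaultdict(int))
    for utterance, referents in situations:
        # Use a set so a repeated word ("the") counts once per situation.
        for word in set(utterance.split()):
            for referent in referents:
                counts[word][referent] += 1
    return counts


def best_referent(counts, word):
    """Return the referent that co-occurred most often with `word`."""
    candidates = counts.get(word)
    if not candidates:
        return None  # word never observed
    return max(candidates, key=candidates.get)


counts = learn_mappings(SITUATIONS)
for word in ("cup", "bowl", "left", "right"):
    print(word, "->", best_referent(counts, word))
```

Within any single situation, "cup" is equally compatible with `CUP` and with the position referent, so one scene alone cannot settle its meaning. The ambiguity resolves only once counts accumulate across situations. Function words such as "the" co-occur evenly with every referent and stay ambiguous under this scheme, which hints at why sentence-level models are of interest.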
Main file
Journal_of_Social_and_Robotics.pdf (5.32 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-03628290, version 1 (01-04-2022)
hal-03628290, version 2 (08-04-2022)

Identifiers

  • HAL Id: hal-03628290, version 2

Cite

Subba Reddy Oota, Frédéric Alexandre, Xavier Hinaut. Cross-Situational Learning Towards Robot Grounding. 2022. ⟨hal-03628290v2⟩
