Conference paper, Year: 2017

From Phonemes to Robot Commands with a Neural Parser

Xavier Hinaut

Abstract

Computational models could improve our understanding of how children acquire language [1][2], from phonemes to syntax, particularly when they are integrated in robots [3], e.g. by interacting with users [4] or grounding language cues [5]. Recently, speech recognition systems have greatly improved thanks to deep learning. However, in domain-specific applications such as Human-Robot Interaction, generic recognition tools such as the Google API often return words that are unknown to the robotic system, when they are not simply irrelevant [6]. Additionally, such recognition systems give little indication of how our brains acquire or process these phonemes, words or grammatical constructions (i.e. sentence templates). Moreover, to our knowledge, they do not provide useful tools for learning from small corpora, from which a child may bootstrap. Here, we propose a neuro-inspired approach that processes sentences word by word, or phoneme by phoneme, with no prior knowledge of the semantics of the words. Previously, we demonstrated that this RNN-based model was able to generalize over grammatical constructions [7], even with unknown words (i.e. words outside the vocabulary of the training data) [8]. In this preliminary study, in an attempt to overcome word misrecognition, we tested whether the same architecture can solve the same task by directly processing phonemes instead of grammatical constructions [9]. Applied to a small corpus, the model shows similar (though slightly weaker) performance when using phonemes as inputs instead of grammatical constructions. We speculate that this phoneme version could outperform the previous model when dealing with real, noisy phoneme inputs, thus improving its performance in real-time human-robot interaction.
Main file
Camera_ready_Hinaut_WS_language-learning-ICDLepirob2017.pdf (145.9 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-01665823, version 1 (17-12-2017)

Identifiers

  • HAL Id: hal-01665823, version 1

Cite

Xavier Hinaut. From Phonemes to Robot Commands with a Neural Parser. IEEE ICDL-EPIROB Workshop on Language Learning, Sep 2017, Lisbon, Portugal. pp.1-2. ⟨hal-01665823⟩