
From Phonemes to Robot Commands with a Neural Parser

Xavier Hinaut


Computational models could improve our understanding of how children acquire language [1][2], from phonemes to syntax, particularly when they are integrated in robots [3], for example by interacting with users [4] or grounding language cues [5]. Speech recognition systems have recently improved greatly thanks to deep learning. However, in domain-specific applications such as Human-Robot Interaction, generic recognition tools such as the Google API often return words that are unknown to the robotic system, or simply irrelevant [6]. Such recognition systems also give little indication of how our brains acquire or process phonemes, words or grammatical constructions (i.e. sentence templates). Moreover, to our knowledge, they do not provide useful tools for learning from small corpora of the kind a child may bootstrap from. Here, we propose a neuro-inspired approach that processes sentences word by word, or phoneme by phoneme, with no prior knowledge of the semantics of the words. Previously, we demonstrated that this RNN-based model was able to generalize on grammatical constructions [7], even with unknown words (i.e. words outside the vocabulary of the training data) [8]. In this preliminary study, in order to overcome word misrecognition, we tested whether the same architecture can solve the same task directly by processing phonemes instead of grammatical constructions [9]. Applied to a small corpus, the model performs similarly (if slightly worse) when using phonemes as inputs instead of grammatical constructions. We speculate that this phoneme version could outperform the previous model when dealing with real, noisy phoneme inputs, thus improving its performance in real-time human-robot interaction.
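The idea of reading a sentence phoneme by phoneme and emitting a robot command can be illustrated with a minimal sketch. The abstract only says the model is RNN-based, so everything specific here is an assumption made for illustration: a fixed random recurrent layer with a ridge-regression readout on the final state, the class name `PhonemeParser`, its hyperparameters, and the toy corpus of phoneme-like symbols and command names. None of it is taken from the paper.

```python
import numpy as np

# Hypothetical toy corpus: space-separated phoneme-like symbols -> command.
# The transcriptions and command names are illustrative, not from the paper.
CORPUS = [
    ("p U S D @ k V p", "push(cup)"),
    ("p U S D @ b O l", "push(ball)"),
    ("g r a b D @ k V p", "grab(cup)"),
    ("g r a b D @ b O l", "grab(ball)"),
]

SYMBOLS = sorted({s for sentence, _ in CORPUS for s in sentence.split()})
COMMANDS = sorted({command for _, command in CORPUS})


def one_hot(sentence):
    """Encode a sentence as a (time, n_symbols) one-hot matrix."""
    idx = [SYMBOLS.index(s) for s in sentence.split()]
    return np.eye(len(SYMBOLS))[idx]


class PhonemeParser:
    """Assumed architecture: random recurrent layer plus linear readout."""

    def __init__(self, n_units=100, spectral_radius=0.9, ridge=1e-6, seed=0):
        rng = np.random.default_rng(seed)
        self.w_in = rng.uniform(-1.0, 1.0, (n_units, len(SYMBOLS)))
        w = rng.normal(size=(n_units, n_units))
        # Rescale the recurrent weights to the requested spectral radius.
        w *= spectral_radius / np.max(np.abs(np.linalg.eigvals(w)))
        self.w = w
        self.ridge = ridge
        self.w_out = None

    def final_state(self, sentence):
        """Feed one symbol per time step and return the last hidden state."""
        x = np.zeros(self.w.shape[0])
        for u in one_hot(sentence):
            x = np.tanh(self.w_in @ u + self.w @ x)
        return x

    def fit(self, corpus):
        """Train only the readout, by ridge regression on final states."""
        X = np.array([self.final_state(s) for s, _ in corpus])
        Y = np.eye(len(COMMANDS))[[COMMANDS.index(c) for _, c in corpus]]
        A = X.T @ X + self.ridge * np.eye(X.shape[1])
        self.w_out = np.linalg.solve(A, X.T @ Y)
        return self

    def predict(self, sentence):
        """Return the command whose readout score is highest."""
        scores = self.final_state(sentence) @ self.w_out
        return COMMANDS[int(np.argmax(scores))]


parser = PhonemeParser().fit(CORPUS)
```

Calling `parser.predict("g r a b D @ k V p")` returns one of the entries of `COMMANDS`. Only the readout is trained, which is one way to learn from a very small corpus; symbols outside `SYMBOLS` raise a `ValueError` in this sketch, so it says nothing about robustness to noisy or unseen phonemes.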
Origin : Files produced by the author(s)

Dates and versions

hal-01665823 , version 1 (17-12-2017)




Xavier Hinaut. From Phonemes to Robot Commands with a Neural Parser. IEEE ICDL-EPIROB Workshop on Language Learning, Sep 2017, Lisbon, Portugal. pp.1-2. ⟨hal-01665823⟩

