Humanoidly Speaking - How the Nao humanoid robot can learn the name of objects and interact with them through common speech

Xavier Hinaut; Johannes Twiefel; Marcelo Borghetti Soares; Pablo Barros; Luiza Mici; Stefan Wermter

Conference paper. Year: 2015

Xavier Hinaut

Role: Author
PersonId: 8171
IdHAL: xavier-hinaut
ORCID: 0000-0002-1924-1184
IdRef: 22823218X

Knowledge Technology group [Hamburg]

Johannes Twiefel

Role: Author

Knowledge Technology group [Hamburg]

Marcelo Borghetti Soares

Role: Author

Knowledge Technology group [Hamburg]

Pablo Barros

Role: Author

Knowledge Technology group [Hamburg]

Luiza Mici

Role: Author

Knowledge Technology group [Hamburg]

Stefan Wermter

Role: Author

Knowledge Technology group [Hamburg]

Abstract

This video shows a friendly human-robot interaction using humanoid Nao robots. The speaker teaches the robot the names of objects through speech. This work shows the successful integration of three projects based mainly on Artificial Neural Networks: (1) object recognition with an RGB-D (color and depth) sensor, (2) speech-to-text using an approach that post-processes Google's speech recognition hypotheses, and (3) syntactic interpretation of sentences. The robot is able to identify surfaces in the environment (tables, floor, walls) and to relate these surfaces to the segmented clusters (the objects). Multiple viewpoints are easily obtained from the segmented clusters and used for training a Convolutional Neural Network. The features obtained allow the robot to recognise objects and to generalise to unknown viewpoints and scales. The speech recognition system maps the results from Google to the sentences expected in the given scenario using phonemic matching. The syntactic interpretation of the sentence is done with a Recurrent Neural Network (namely an Echo State Network), which maps each semantic word in a sentence to its thematic role. Together, the roles form predicates that indicate what should be performed (e.g. learning a new object or performing motor actions). At the start, the robot does not know any objects. During the learning of new objects, increasingly complex sentences are used to describe the position of new objects. Motor commands (e.g. pointing) are also given in order to check the knowledge of the robot. Because the human user speaks in natural, complex sentences, anyone could interact with the robot, not only robot programmers. Furthermore, complex sentences containing multiple commands can be correctly interpreted as a temporal action sequence (e.g. "Before doing 'B' do 'A'") without adding any complementary mechanism.
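For readers unfamiliar with the Echo State Network named for module (3), the standard leaky-integrator formulation from the reservoir computing literature is sketched here. The abstract does not give the authors' exact equations or hyperparameters, so every symbol is an assumption made for illustration: u(t) is the encoding of the word presented at step t, x(t) is the reservoir state, y(t) is the readout (read here as thematic-role activations), alpha is the leak rate, X and Y collect the states and target outputs over the training sentences as columns, and lambda is a ridge regularisation term. The input and recurrent weights are random and stay fixed; only the readout is trained, and ridge regression is one common way to do so.

```latex
\begin{aligned}
\mathbf{x}(t) &= (1-\alpha)\,\mathbf{x}(t-1)
  + \alpha \tanh\!\bigl(W_{\mathrm{in}}\,\mathbf{u}(t) + W\,\mathbf{x}(t-1)\bigr) \\
\mathbf{y}(t) &= W_{\mathrm{out}}\,\mathbf{x}(t) \\
W_{\mathrm{out}} &= Y X^{\top}\bigl(X X^{\top} + \lambda I\bigr)^{-1}
\end{aligned}
```

Because the state x(t) depends on the whole word sequence seen so far, a linear readout can in principle assign different roles to the same word depending on the sentence structure around it.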
In addition to teaching objects and relationships to the robot, in the future this kind of interaction scheme could also be used by children to learn (e.g. new objects) while interacting with the robot. In other words, by teaching the robot, the child is learning. It could also be used to teach new instructions or even new languages to the robot, with only a few changes in modules (2) and (3). Conversely, if the robot already knows the environment in a language that a child does not, the child could learn that language while interacting with the robot.
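The idea behind module (2), mapping speech recognition hypotheses to the sentences expected in the scenario, can be sketched as a nearest-match search. This is a minimal illustration and not the authors' implementation: letters stand in for phonemes (a real system would first convert text to phoneme sequences), the distance is plain Levenshtein distance, and the function names and example sentences are hypothetical. It also hints at why adapting to new instructions or a new language would mostly mean changing the list of expected sentences and the phoneme conversion.

```python
from typing import Iterable, List, Sequence


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two symbol sequences."""
    prev = list(range(len(b) + 1))
    for i, sym_a in enumerate(a, start=1):
        curr = [i]
        for j, sym_b in enumerate(b, start=1):
            cost = 0 if sym_a == sym_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def to_symbols(sentence: str) -> List[str]:
    # ASSUMPTION: letters are used as a crude stand-in for phonemes.
    return [ch for ch in sentence.lower() if ch.isalpha()]


def closest_expected(hypotheses: Iterable[str], expected: Iterable[str]) -> str:
    """Return the expected sentence closest to any recognition hypothesis."""
    expected = list(expected)
    candidates = (
        (edit_distance(to_symbols(h), to_symbols(e)), e)
        for h in hypotheses
        for e in expected
    )
    # Compare on distance only, so ties keep the first candidate seen.
    return min(candidates, key=lambda pair: pair[0])[1]


# Hypothetical scenario sentences and recogniser output.
EXPECTED = ["point to the cup", "this is a cup", "this is a ball"]
print(closest_expected(["this is a cop", "this is a cap"], EXPECTED))
```

Restricting the output to a closed set of scenario sentences trades coverage for robustness: a misheard word is corrected as long as the whole sentence is still closer to the intended one than to any other.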

Domains

Neural networks [cs.NE], Machine learning [cs.LG], Robotics [cs.RO], Neuroscience [q-bio.NC], Linguistics

Hinaut-et-al2015_IJCAI_video_competion.pdf (46.02 KB)

Origin: Files produced by the author(s)

Contributor: Xavier Hinaut

https://inria.hal.science/hal-02561332

Submitted on: Sunday, May 3, 2020, 18:39:30

Last modified on: Sunday, November 22, 2020, 12:34:06

Dates and versions

hal-02561332, version 1 (03-05-2020)

Identifiers

HAL Id: hal-02561332, version 1

Cite

Xavier Hinaut, Johannes Twiefel, Marcelo Borghetti Soares, Pablo Barros, Luiza Mici, et al. Humanoidly Speaking - How the Nao humanoid robot can learn the name of objects and interact with them through common speech. International Joint Conference on Artificial Intelligence (IJCAI), Video Competition, Jul 2015, Buenos Aires, Argentina. ⟨hal-02561332⟩
