Exploring the anatomical encoding of voice with a mathematical model of the vocal system.

Abstract: The faculty of language depends on the interplay between the production and perception of speech sounds. A relevant open question is whether the dimensions that organize voice perception in the brain are acoustical or depend on properties of the vocal system that produced the voice. One of the main empirical difficulties in answering this question is generating sounds that vary along a continuum according to the anatomical properties of the vocal apparatus that produced them. Here we use a mathematical model that offers the unique possibility of synthesizing vocal sounds by controlling a small set of anatomically based parameters. In a first stage, the quality of the synthetic voice was evaluated. Using specific time traces for sub-glottal pressure and tension of the vocal folds, the synthetic voices generated perceptual responses that are indistinguishable from those of real speech. The synthesizer was then used to investigate how the auditory cortex responds to the perception of voice depending on the anatomy of the vocal apparatus. Our fMRI results show that sounds are perceived as human vocalizations when produced by a vocal system that follows a simple relationship between the size of the vocal folds and the vocal tract. We found that these anatomical parameters encode the perceptual vocal identity (male, female, child) and show that the brain areas that respond to human speech also encode vocal identity. On the basis of these results, we propose that this low-dimensional model of the vocal system is capable of generating realistic voices and represents a novel tool to explore voice perception with precise control of the anatomical variables that generate speech. Furthermore, the model provides an explanation of how auditory cortices encode voices in terms of the anatomical parameters of the vocal system.
Document type:
Journal article
NeuroImage, Elsevier, 2016, 141, pp.31-9. 〈10.1016/j.neuroimage.2016.07.033〉

https://hal.inria.fr/hal-01498364
Contributor: Gaël Varoquaux
Submitted on: Wednesday, March 29, 2017 - 21:14:14
Last modified on: Wednesday, March 21, 2018 - 18:57:51

Citation

M Florencia Assaneo, Jacobo Sitt, Gaël Varoquaux, Mariano Sigman, Laurent Cohen, et al. Exploring the anatomical encoding of voice with a mathematical model of the vocal system. NeuroImage, Elsevier, 2016, 141, pp.31-9. 〈10.1016/j.neuroimage.2016.07.033〉. 〈hal-01498364〉
