Conference papers

F0 modeling using DNN for Arabic parametric speech synthesis

Imene Zangar 1, Zied Mnasri 1, Vincent Colotte 2, Denis Jouvet 2
2 MULTISPEECH - Speech Modeling for Facilitating Oral-Based Communication
Inria Nancy - Grand Est, LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
Abstract : Deep neural networks (DNN) are attracting increasing interest in speech processing applications, especially text-to-speech synthesis. Currently, state-of-the-art speech generation tools such as MERLIN and WAVENET are entirely DNN-based. However, each language has to be modeled separately with DNN. A key component of a speech synthesis system is the module that generates prosodic parameters from contextual input features, and in particular the fundamental frequency (F0) generation module. Because F0 is responsible for intonation, it must be modeled accurately to produce intelligible and natural speech. F0 modeling is also highly language-dependent, so language-specific characteristics have to be taken into account. In this paper, we model F0 for Arabic speech synthesis with feedforward and recurrent DNNs, using features specific to Arabic, such as vowel quantity and gemination, in order to improve the quality of Arabic parametric speech synthesis.

Cited literature [24 references]
Contributor : Vincent Colotte
Submitted on : Tuesday, July 9, 2019 - 9:41:52 AM
Last modification on : Saturday, June 25, 2022 - 7:40:42 PM




  • HAL Id : hal-02177496, version 1


Imene Zangar, Zied Mnasri, Vincent Colotte, Denis Jouvet. F0 modeling using DNN for Arabic parametric speech synthesis. INNSBDDL 2019 - INNS Big Data and Deep Learning, Apr 2019, Sestri Levante, Italy. ⟨hal-02177496⟩


