
DNN-Based Speech Synthesis for Arabic: Modelling and Evaluation

Amal Houidhek 1 Vincent Colotte 1 Zied Mnasri 2 Denis Jouvet 1
1 MULTISPEECH - Speech Modeling for Facilitating Oral-Based Communication
Inria Nancy - Grand Est, LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
Abstract : This paper investigates the use of deep neural networks (DNN) for Arabic speech synthesis. In parametric speech synthesis, whether HMM-based or DNN-based, each speech segment is described with a set of contextual features. These contextual features correspond to linguistic, phonetic and prosodic information that may affect the pronunciation of the segments. Gemination and vowel quantity (short vowel vs. long vowel) are two particular and important phenomena in the Arabic language. Hence, it is worth investigating whether those phenomena must be handled by using specific speech units, or whether their specification in the contextual features is enough. Consequently, four modelling approaches are evaluated by considering geminated consonants (respectively long vowels) either as fully-fledged phoneme units or as the same phoneme as their simple (respectively short) counterparts. Although no significant difference was observed in previous studies relying on HMM-based modelling, this paper examines these modelling variants in the framework of DNN-based speech synthesis. Listening tests are conducted to evaluate the four modelling approaches, and to assess the performance of DNN-based Arabic speech synthesis with respect to the previous HMM-based approach.
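
As a rough illustration of the four modelling approaches described in the abstract, the sketch below enumerates the 2 x 2 choice: geminated consonants as separate units or merged with their simple counterparts, and long vowels as separate units or merged with their short counterparts. It is not the authors' implementation. The `Segment` class, the `label` and `four_variants` functions, the `:` suffix for distinct units, and the choice to keep the gemination/length flag in the contextual features in every variant are all assumptions made for this example.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Segment:
    """One speech segment (hypothetical representation)."""
    base: str        # base phoneme symbol, e.g. "t" or "a"
    is_vowel: bool   # True for vowels, False for consonants
    marked: bool     # True if geminated (consonant) or long (vowel)


def label(seg, split_geminates, split_long_vowels):
    """Return (unit_name, contextual_features) for one segment.

    When the relevant 'split' flag is set, a geminated consonant or a
    long vowel becomes its own unit (assumed naming: base + ':').
    Otherwise it shares the unit of its simple or short counterpart.
    Assumption: the flag stays in the contextual features either way.
    """
    split = split_long_vowels if seg.is_vowel else split_geminates
    unit = seg.base + ":" if (seg.marked and split) else seg.base
    feature_name = "long" if seg.is_vowel else "geminated"
    return unit, {feature_name: seg.marked}


def four_variants(segments):
    """Map each (split_geminates, split_long_vowels) pair to unit names."""
    variants = {}
    for split_gem, split_long in product((False, True), repeat=2):
        variants[(split_gem, split_long)] = [
            label(s, split_gem, split_long)[0] for s in segments
        ]
    return variants


if __name__ == "__main__":
    # Toy sequence: k, short a, geminated t, long a (illustrative only).
    toy = [
        Segment("k", False, False),
        Segment("a", True, False),
        Segment("t", False, True),
        Segment("a", True, True),
    ]
    for key, units in four_variants(toy).items():
        print(key, units)
```

In the merged variants, the distinction between simple and geminated consonants (or short and long vowels) is carried only by the contextual features, which is the alternative the abstract contrasts with dedicated speech units.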
Document type :
Conference papers

Contributor : Denis Jouvet
Submitted on : Thursday, October 25, 2018 - 9:54:53 AM
Last modification on : Saturday, October 16, 2021 - 11:26:10 AM
Long-term archiving on : Saturday, January 26, 2019 - 1:57:22 PM


Files produced by the author(s)


  • HAL Id : hal-01904512, version 1



Amal Houidhek, Vincent Colotte, Zied Mnasri, Denis Jouvet. DNN-Based Speech Synthesis for Arabic: Modelling and Evaluation. SLSP 2018 - 6th International Conference on Statistical Language and Speech Processing, Oct 2018, Mons, Belgium. ⟨hal-01904512⟩


