Reconnaissance Statistique de la Parole Continue pour Voix Laryngée et Alaryngée

Abstract: Automatic Speech Recognition (ASR) has always been a scientific challenge. In recent years, many research efforts have produced solutions and assistive systems that carry out tasks previously performed only by humans. Speech is considered the most natural mode of communication and an easy way for humans to exchange information. A laryngectomee cannot speak normally because his or her vocal cords were lost in the surgical removal of the larynx, and with them the ability to phonate. Only reeducation by a speech therapist allows this person to produce a new substitute voice, called “esophageal”. Unlike laryngeal (normal) speech, esophageal (alaryngeal) speech is hoarse and weak in intensity and intelligibility, which makes it difficult to understand. The goal of this thesis is to implement an automatic esophageal (alaryngeal) speech recognition system. This system should be able to recover most of the phonetic information contained in the esophageal speech signal. Connecting the decoding part of this system to a text-to-speech synthesizer should allow a laryngeal voice to be reconstructed. Such a system should make oral communication with other people easier for laryngectomees. Our first contribution is the development of an automatic laryngeal speech recognition system using hidden Markov models. The few existing corpora of esophageal speech are not suited to recognition because they contain too little data (in practice, only a few dozen sentences are recorded). For this reason, we designed our own database dedicated to esophageal speech recognition, containing 480 sentences spoken by a laryngectomee speaker. In the second part, our laryngeal speech recognition system was adapted and applied to this esophageal speech.
The final contribution of this thesis is a hybrid system (correction = conversion + recognition) based on voice conversion, which projects the acoustic feature vectors of esophageal speech into a less disturbed space related to laryngeal speech. We demonstrate that this hybrid system improves the recognition of alaryngeal speech.
Document type:
Thesis
Computation and Language [cs.CL]. École Nationale Supérieure d’Informatique et d’Analyse des Systèmes, 2017. French

Cited literature [103 references]

https://hal.inria.fr/tel-01563766
Contributor: Joseph Di Martino
Submitted on: Tuesday, 18 July 2017 - 10:09:59
Last modified on: Tuesday, 24 April 2018 - 13:34:39
Document(s) archived on: Saturday, 27 January 2018 - 07:48:58

Identifiers

  • HAL Id: tel-01563766, version 1

Citation

Othman Lachhab. Reconnaissance Statistique de la Parole Continue pour Voix Laryngée et Alaryngée. Computation and Language [cs.CL]. École Nationale Supérieure d’Informatique et d’Analyse des Systèmes, 2017. French. 〈tel-01563766〉

Metrics

Record views

522

File downloads

227