A preliminary study on improving the recognition of esophageal speech using a hybrid system based on statistical voice conversion

Othman Lachhab; Joseph Di Martino; El Hassane Ibn Elhaj; Ahmed Hammouch

doi:10.1186/s40064-015-1428-2

Journal article, SpringerPlus, Year: 2015

Othman Lachhab

Role: Author

Ecole Normale Supérieure de l'Enseignement Technique [Rabat]

Speech Modeling for Facilitating Oral-Based Communication

Joseph Di Martino

Role: Author
PersonId : 16557
IdHAL : joseph-di-martino
IdRef : 179331531

Speech Modeling for Facilitating Oral-Based Communication

El Hassane Ibn Elhaj

Role: Author
PersonId : 962249

Institut National des Postes et Télécommunications [Rabat]

Ahmed Hammouch

Role: Author

Ecole Normale Supérieure de l'Enseignement Technique [Rabat]

Abstract

In this paper, we propose a hybrid system based on a modified statistical GMM (Gaussian mixture model) voice conversion algorithm for improving the recognition of esophageal speech. The system uses voice conversion to compensate for the distorted information present in esophageal acoustic features. The esophageal speech is converted into “target” laryngeal speech through an iterative statistical estimation of a transformation function. No speech synthesizer is applied to reconstruct the converted speech signal, because the converted Mel cepstral vectors are used directly as input to our speech recognition system. Furthermore, the feature vectors are linearly transformed by the HLDA (heteroscedastic linear discriminant analysis) method, which projects them into a lower-dimensional space with good discriminative properties. The experimental results show that the proposed system improves phone recognition accuracy by 3.40 % absolute compared with the accuracy obtained with neither HLDA nor voice conversion.
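
To make the idea of a statistical transformation function concrete, the sketch below shows the classic joint-density GMM mapping, in which a source feature vector x is converted to the conditional expectation E[y | x] under a mixture fitted on paired source and target vectors. This is an illustration only: the abstract describes a modified, iteratively estimated algorithm whose details are not given here, so the function names (`gaussian_logpdf`, `convert_frame`) and the parameter layout are assumptions, and the GMM parameters are assumed to have been estimated beforehand.

```python
import numpy as np


def gaussian_logpdf(x, mean, cov):
    """Log density of a multivariate normal evaluated at one vector x."""
    d = x.shape[0]
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + maha)


def convert_frame(x, weights, mu_x, mu_y, sigma_xx, sigma_yx):
    """Map one source vector x to E[y | x] under a joint-density GMM.

    Hypothetical parameter layout (not taken from the paper):
      weights[k]   mixture weight of component k
      mu_x[k]      source-side mean, mu_y[k] target-side mean
      sigma_xx[k]  source covariance, sigma_yx[k] target-source cross-covariance
    """
    n_components = len(weights)

    # Posterior probability of each component given x, computed in the
    # log domain for numerical stability.
    log_post = np.array([
        np.log(weights[k]) + gaussian_logpdf(x, mu_x[k], sigma_xx[k])
        for k in range(n_components)
    ])
    log_post -= np.max(log_post)
    post = np.exp(log_post)
    post /= post.sum()

    # Posterior-weighted sum of the per-component linear regressions.
    y_hat = np.zeros_like(mu_y[0], dtype=float)
    for k in range(n_components):
        regression = sigma_yx[k] @ np.linalg.solve(sigma_xx[k], x - mu_x[k])
        y_hat += post[k] * (mu_y[k] + regression)
    return y_hat
```

In a setup like the one the abstract describes, each converted Mel cepstral vector would be passed straight to the recognizer's front end rather than to a synthesizer; that wiring is outside the scope of this sketch.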

Keywords

Voice conversion; Pathological voices; Automatic speech recognition (ASR); Speech enhancement; Esophageal speech assessment

Domains

Signal and image processing [eess.SP]

https://inria.hal.science/hal-01221503

Submitted on: Wednesday, 28 October 2015, 10:52:47

Last modified on: Monday, 11 September 2023, 17:41:19

Dates and versions

hal-01221503 , version 1 (28-10-2015)

Identifiers

HAL Id : hal-01221503 , version 1
DOI : 10.1186/s40064-015-1428-2
PUBMEDCENTRAL : PMC4627987

Cite

Othman Lachhab, Joseph Di Martino, El Hassane Ibn Elhaj, Ahmed Hammouch. A preliminary study on improving the recognition of esophageal speech using a hybrid system based on statistical voice conversion. SpringerPlus, 2015, ⟨10.1186/s40064-015-1428-2⟩. ⟨hal-01221503⟩

Collections

CNRS INRIA UNIV-LORRAINE INRIA2 LORIA LORIA-NLPKD
