Experiments in audio source separation with one sensor for robust speech recognition

Laurent Benaroya; Frédéric Bimbot; Guillaume Gravier; Rémi Gribonval

doi:10.1016/j.specom.2005.11.002

Journal article, Speech Communication, Year: 2006

Laurent Benaroya

Role: Author
PersonId: 7037
IdHAL: elie-laurent-benaroya
IdRef: 07600953X

Speech and sound data modeling and processing

Frédéric Bimbot

Role: Author
PersonId: 830967

Speech and sound data modeling and processing

Guillaume Gravier

Role: Author
PersonId: 1046
IdHAL: guig
ORCID: 0000-0002-2266-5682
IdRef: 110355415

Speech and sound data modeling and processing

Rémi Gribonval

Role: Author
PersonId: 1255
IdHAL: remi-gribonval
ORCID: 0000-0002-9450-8125
IdRef: 113181590

Speech and sound data modeling and processing

Abstract

This paper focuses on the problem of noise compensation in speech signals for robust speech recognition. We investigate a novel paradigm based on source separation techniques to remove music from speech, a common situation in broadcast news transcription tasks. The two proposed methods, adaptive Wiener filtering and adaptive shrinkage, rely on a dictionary of spectral shapes to deal with the non-stationarity of the signals. Unlike most classical noise suppression methods, we assume prior knowledge of the sources that are mixed. The proposed algorithms are compared to simple standard approaches on the source separation task and assessed in terms of average distortion. Their effect on the entire transcription system is then compared in terms of word error rate. Results indicate that source separation techniques show some effectiveness for robust transcription at signal-to-noise ratios lower than 15 dB. We also observe that the improvement in word error rate is correlated with spectral distortion rather than with source-separation-specific performance measures such as the signal-to-interference ratio.

Keywords

source separation

Domains

Signal and image processing [eess.SP]

Contributor: Rémi Gribonval

https://inria.hal.science/inria-00544988

Submitted on: Tuesday, 20 August 2013, 15:51:32

Last modified on: Friday, 24 March 2023, 14:52:53

Dates and versions

inria-00544988, version 1 (20-08-2013)

Identifiers

HAL Id: inria-00544988, version 1
DOI: 10.1016/j.specom.2005.11.002

Cite

Laurent Benaroya, Frédéric Bimbot, Guillaume Gravier, Rémi Gribonval. Experiments in audio source separation with one sensor for robust speech recognition. Speech Communication, 2006, Non Linear Speech Processing (NOLISP), 48 (7), pp.848-854. ⟨10.1016/j.specom.2005.11.002⟩. ⟨inria-00544988⟩

Collections

EC-PARIS UNIV-RENNES1 CNRS INRIA INSA-RENNES IRISA IRISA-D5 INRIA2 UR1-MATH-STIC UR1-UFR-ISTIC UNIV-RENNES INSA-GROUPE UR1-MATH-NUM
