Automatic speech recognition and speech variability: A review

Mohamed Benzeghiba; Renato de Mori; Olivier Deroo; Stéphane Dupont; T. Erbes; Denis Jouvet; Luciano Fissore; Pietro Laface; Alfred Mertins; Christophe Ris; Richard Rose; Vivek Tyagi; Christian Wellekens

doi:10.1016/j.specom.2007.02.006

Journal article, Speech Communication, Year: 2007

Mohamed Benzeghiba

Role: Author

Eurecom [Sophia Antipolis]

Renato de Mori

Role: Author

Laboratoire Informatique d'Avignon

Olivier Deroo

Role: Author

Acapela

Stéphane Dupont

Role: Author

Multitel Asbl

T. Erbes

Role: Author

Eurecom [Sophia Antipolis]

Denis Jouvet

Role: Author
PersonId: 15904
IdHAL: denis-jouvet
IdRef: 029418666

France Télécom R&D

Luciano Fissore

Role: Author

LOQUENDO

Pietro Laface

Role: Author
PersonId: 873336

Dipartimento di Automatica e Informatica [Torino]

Alfred Mertins

Role: Author

Medizinische Physik

Christophe Ris

Role: Author

Multitel Asbl

Richard Rose

Role: Author

Vivek Tyagi

Role: Author

Eurecom [Sophia Antipolis]

Christian Wellekens

Role: Author
PersonId: 873362

Eurecom [Sophia Antipolis]

Abstract

Major progress is being recorded regularly in both the technology and the exploitation of automatic speech recognition (ASR) and spoken language systems. However, technological barriers to flexible solutions and user satisfaction remain under some circumstances. These are related to several factors, such as sensitivity to the environment (background noise) or the weak representation of grammatical and semantic knowledge. Current research also emphasizes deficiencies in dealing with the variation naturally present in speech. For instance, the lack of robustness to foreign accents precludes use by specific populations. Also, some applications, like directory assistance, particularly stress the core recognition technology because of the very large active vocabulary (application perplexity). Many factors affect speech realization: regional, sociolinguistic, or related to the environment or the speaker herself. These create a wide range of variations that may not be modeled correctly (speaker, gender, speaking rate, vocal effort, regional accent, speaking style, non-stationarity, etc.), especially when resources for system training are scarce. This paper outlines current advances related to these topics.

Domains

Signal and Image Processing [eess.SP]

Contributor: Denis Jouvet

https://inria.hal.science/inria-00616506

Submitted on: Monday, August 22, 2011, 17:54:19

Last modified on: Tuesday, April 25, 2023, 15:04:06

Dates and versions

inria-00616506, version 1 (22-08-2011)

Identifiers

HAL Id: inria-00616506, version 1
DOI: 10.1016/j.specom.2007.02.006

Cite

Mohamed Benzeghiba, Renato de Mori, Olivier Deroo, Stéphane Dupont, T. Erbes, et al. Automatic speech recognition and speech variability: A review. Speech Communication, 2007, ⟨10.1016/j.specom.2007.02.006⟩. ⟨inria-00616506⟩

Collections

UNIV-AVIGNON EURECOM LIA