Investigating Stranded GMM for Improving Automatic Speech Recognition

Arseniy Gorin; Denis Jouvet; Emmanuel Vincent; Dung Tran

Conference paper. Year: 2014

Arseniy Gorin

Role: Author
PersonId : 957227

Analysis, perception and recognition of speech

Denis Jouvet

Role: Author
PersonId : 15904
IdHAL : denis-jouvet
IdRef : 029418666

Analysis, perception and recognition of speech

Emmanuel Vincent

Role: Author
PersonId : 1256
IdHAL : emmanuelv
ORCID : 0000-0002-0183-7289
IdRef : 089360176

Analysis, perception and recognition of speech

Dung Tran

Role: Author
PersonId : 953494

Analysis, perception and recognition of speech

Abstract

This paper investigates the recently proposed Stranded Gaussian Mixture acoustic Model (SGMM) for Automatic Speech Recognition (ASR). This model extends the conventional hidden Markov model with Gaussian mixture observation densities (HMM-GMM) by explicitly introducing dependencies between the components of the observation Gaussian mixture densities. The main objective of the paper is to study experimentally how useful SGMM can be for handling data that contains different sources of acoustic variability. The first sources of variability studied are age and gender in a quiet environment (TIdigits task, including child speech). Second, SGMM modeling is applied to data produced by different speakers and corrupted by non-stationary noise (CHiME 2013 challenge data). Finally, SGMM is applied to the same noisy data after speech enhancement (i.e., the remaining variability comes mostly from residual noise and different speakers). Although SGMM was originally proposed for robust recognition of noisy speech, this work found that the model is more efficient at handling speaker variability in a quiet environment.
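The dependency between mixture components described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single HMM state, one-dimensional Gaussians, and a hypothetical component-to-component transition matrix `trans`, whereas the model studied in the paper also conditions on the HMM states. The function names are illustrative only. The sketch compares the likelihood of a non-empty observation sequence under a conventional GMM, where each frame picks its component independently, with a forward recursion in which the component at one frame depends on the component at the previous frame.

```python
import math


def gauss_pdf(x, mean, var):
    """Density of a one-dimensional Gaussian at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def gmm_likelihood(obs, weights, means, variances):
    """Conventional GMM: the component of each frame is drawn independently."""
    total = 1.0
    for x in obs:
        total *= sum(
            w * gauss_pdf(x, m, v) for w, m, v in zip(weights, means, variances)
        )
    return total


def stranded_likelihood(obs, init_weights, trans, means, variances):
    """Forward recursion over components (single-state simplification).

    trans[i][j] is an assumed probability of moving from component i at the
    previous frame to component j at the current frame. obs must be non-empty.
    """
    n = len(means)
    # Initial frame: ordinary mixture weights.
    alpha = [
        init_weights[j] * gauss_pdf(obs[0], means[j], variances[j]) for j in range(n)
    ]
    # Later frames: the weight of component j depends on the previous component.
    for x in obs[1:]:
        alpha = [
            sum(alpha[i] * trans[i][j] for i in range(n))
            * gauss_pdf(x, means[j], variances[j])
            for j in range(n)
        ]
    return sum(alpha)


# Illustrative values only, not taken from the paper.
weights = [0.3, 0.7]
means = [0.0, 2.0]
variances = [1.0, 0.5]
frames = [0.1, 1.9, 2.2, -0.3]
sticky = [[0.9, 0.1], [0.2, 0.8]]  # components tend to persist across frames

independent = gmm_likelihood(frames, weights, means, variances)
dependent = stranded_likelihood(frames, weights, sticky, means, variances)
```

When every row of `trans` equals the mixture weights, the recursion reduces to the conventional GMM likelihood, so in this simplified setting the conventional model is a special case of the dependent one.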

Keywords

dynamic Bayesian network; hidden Markov model; trajectory modeling; robust speech recognition

Domains

Sound [cs.SD]

Main file

ago_HSCMA14_v5.2.pdf (135.78 KB)

Origin: Files produced by the author(s)

https://inria.hal.science/hal-01003054

Submitted on: Tuesday, June 10, 2014, 11:14:08

Last modified on: Thursday, February 1, 2024, 10:03:58

Long-term archiving on: Wednesday, September 10, 2014, 11:30:21

Dates and versions

hal-01003054 , version 1 (09-06-2014)

hal-01003054 , version 2 (10-06-2014)

Identifiers

HAL Id : hal-01003054 , version 2

Cite

Arseniy Gorin, Denis Jouvet, Emmanuel Vincent, Dung Tran. Investigating Stranded GMM for Improving Automatic Speech Recognition. 4th Joint Workshop on Hands-free Speech Communication and Microphone Arrays (HSCMA 2014), May 2014, Nancy, France. ⟨hal-01003054v2⟩

Collections

UNIV-RENNES1 CNRS INRIA IRISA GRID5000 UNIV-LORRAINE INRIA2 LORIA LORIA-NLPKD UR1-MATH-STIC UR1-UFR-ISTIC UNIV-RENNES SILECS UR1-MATH-NUM
