Speaker normalization for template based speech recognition - Inria - Institut national de recherche en sciences et technologies du numérique
Conference paper, Year: 2009

Speaker normalization for template based speech recognition

Dirk van Compernolle
  • Role: Author

Abstract

Vocal Tract Length Normalization (VTLN) has been shown to be an efficient speaker normalization tool for HMM-based systems. In this paper we show that it is equally efficient for a template-based recognition system. Template-based systems, while promising, have a potential drawback: besides the essential phonemic properties, templates retain all non-phonetic detail, i.e. information about the speaker and the acoustic recording conditions. This may lead to very inefficient use of the database. We show that, after VTLN, significantly more speakers, including speakers of the opposite gender, contribute templates to the matching sequence than in the non-normalized case. In experiments on the Wall Street Journal database this leads to a relative word error rate reduction of 10%.
File not deposited

Dates and versions

inria-00583853, version 1 (06-04-2011)

Identifiers

  • HAL Id: inria-00583853, version 1

Cite

Sébastien Demange, Dirk van Compernolle. Speaker normalization for template based speech recognition. 10th Annual Conference of the International Speech Communication Association - Interspeech 2009, Sep 2009, Brighton, United Kingdom. pp. 560-563. ⟨inria-00583853⟩
