Statistical methods in multi-speaker automatic speech recognition

Anne Boyer 1, Joseph Di Martino 1, P. Divoux, Jean-Paul Haton 1, Jean-Francois Mari 1, Kamel Smaïli 1
1 PAROLE - Analysis, perception and recognition of speech
INRIA Lorraine, LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
Abstract: Automatic speech recognition and understanding (ASR) plays an important role in man-machine communication. Substantial industrial developments are currently in progress in this area. However, after some 40 years of effort, several fundamental questions remain open. This paper presents a comparative study of four different methods for multi-speaker word recognition: (i) clustering of acoustic templates, (ii) comparison with a finite state automaton, (iii) dynamic programming and vector quantization, (iv) stochastic Markov sources. To make the results comparable, the four methods were tested on the same material: the ten digits (0 to 9), each pronounced four times by 60 different speakers (30 male and 30 female). In our experiments we distinguish between multi-speaker systems (capable of recognizing words pronounced by speakers who were used during the training phase of the system) and speaker-independent systems (capable of recognizing words pronounced by speakers totally unknown to the system). Half of the corpus (the utterances of 15 male and 15 female speakers) was used for training, and the remaining half for testing.
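The corpus figures in the abstract imply 60 × 10 × 4 = 2400 utterances, of which 1200 come from the 30 training speakers. The following is a minimal sketch of such a speaker-disjoint split, not the authors' procedure. The speaker labels (`M00`, `F00`, ...) and the rule "first 15 speakers of each gender go to training" are assumptions made for illustration only; the abstract does not say how the training speakers were selected. The split shown corresponds to the speaker-independent setting, where no test speaker is seen during training.

```python
from itertools import product

DIGITS = range(10)        # the ten digits, 0 to 9
REPETITIONS = range(4)    # each digit is pronounced four times

# Hypothetical speaker labels: M00..M29 (male) and F00..F29 (female).
SPEAKERS = [f"{gender}{i:02d}" for gender in "MF" for i in range(30)]


def build_corpus():
    """Enumerate every (speaker, digit, repetition) utterance key."""
    return list(product(SPEAKERS, DIGITS, REPETITIONS))


def split_by_speaker(corpus, n_train_per_gender=15):
    """Speaker-disjoint split.

    Assumption: the first `n_train_per_gender` speakers of each gender
    are used for training; all other speakers are held out for testing.
    """
    train_speakers = {s for s in SPEAKERS if int(s[1:]) < n_train_per_gender}
    train = [u for u in corpus if u[0] in train_speakers]
    test = [u for u in corpus if u[0] not in train_speakers]
    return train, test


corpus = build_corpus()
train, test = split_by_speaker(corpus)
print(len(corpus), len(train), len(test))  # prints: 2400 1200 1200
```

Splitting by speaker rather than by utterance is what keeps the test speakers "totally unknown to the system"; a multi-speaker evaluation would instead test on speakers who also appear in the training set.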
Document type:
Conference paper
ASMDA - 4th International Symposium on Applied Stochastic Models and Data Analysis, 1988, Nancy, France.

https://hal.inria.fr/hal-00835451
Contributor: Joseph Di Martino
Submitted on: Tuesday, June 18, 2013 - 16:37:52
Last modified on: Thursday, January 11, 2018 - 06:19:56

Identifiers

  • HAL Id: hal-00835451, version 1

Citation

Anne Boyer, Joseph Di Martino, P. Divoux, Jean-Paul Haton, Jean-Francois Mari, et al. Statistical methods in multi-speaker automatic speech recognition. ASMDA - 4th International Symposium on Applied Stochastic Models and Data Analysis, 1988, Nancy, France. 〈hal-00835451〉
