Statistical methods in multi-speaker automatic speech recognition

Anne Boyer; Joseph Di Martino; P. Divoux; Jean-Paul Haton; Jean-François Mari; Kamel Smaïli

doi:10.1002/asm.3150060302

Journal article, Applied Stochastic Models and Data Analysis, Year: 1990

Anne Boyer

Role: Author

Analysis, perception and recognition of speech

Joseph Di Martino

Role: Author
PersonId : 16557
IdHAL : joseph-di-martino
IdRef : 179331531

Analysis, perception and recognition of speech

P. Divoux

Role: Author

Jean-Paul Haton

Role: Author
PersonId : 830987

Analysis, perception and recognition of speech

Jean-François Mari

Role: Author
PersonId : 830048

Analysis, perception and recognition of speech

Kamel Smaïli

Role: Author
PersonId : 2521
IdHAL : kamel-smaili
IdRef : 034429700

Analysis, perception and recognition of speech

Abstract

Automatic speech recognition and understanding (ASR) plays an important role in man-machine communication, and substantial industrial developments are currently in progress in this area. However, after some 40 years of effort, several fundamental questions remain open. This paper presents a comparative study of four methods for multi-speaker word recognition: (i) clustering of acoustic templates, (ii) comparison with a finite state automaton, (iii) dynamic programming and vector quantization, and (iv) stochastic Markov sources. To make the results comparable, the four methods were tested on the same material: the ten digits (0 to 9), each pronounced four times by 60 different speakers (30 male and 30 female). The experiments distinguish between multi-speaker systems (capable of recognizing words pronounced by speakers who were used during the training phase of the system) and speaker-independent systems (capable of recognizing words pronounced by speakers totally unknown to the system). Half of the speakers (15 male and 15 female) were used for training, and the remaining half for testing.
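
The corpus arithmetic and the speaker-disjoint split described in the abstract can be sketched as follows. This is only an illustration under stated assumptions: the speaker identifiers, the function names `build_corpus` and `split_by_speaker`, and the rule that the first 15 speakers of each sex go to training are all hypothetical, since the abstract gives only the counts. How the multi-speaker condition was evaluated is not specified in the abstract, so the sketch covers only the split by speaker.

```python
from itertools import product

DIGITS = range(10)   # the ten digits, 0 to 9
REPETITIONS = 4      # each digit is pronounced four times per speaker

# Hypothetical speaker IDs: the abstract states only 30 male and 30 female speakers.
SPEAKERS = [f"{sex}{i:02d}" for sex in ("M", "F") for i in range(30)]


def build_corpus():
    """Return one (speaker, digit, repetition) record per utterance."""
    return list(product(SPEAKERS, DIGITS, range(REPETITIONS)))


def split_by_speaker(corpus, n_per_sex=15):
    """Speaker-disjoint split: no test speaker is seen during training.

    Assumption: the first n_per_sex speakers of each sex are used for
    training; the abstract does not say which speakers were chosen.
    """
    train_speakers = {s for s in SPEAKERS if int(s[1:]) < n_per_sex}
    train = [u for u in corpus if u[0] in train_speakers]
    test = [u for u in corpus if u[0] not in train_speakers]
    return train, test


corpus = build_corpus()
train, test = split_by_speaker(corpus)
print(len(corpus), len(train), len(test))
```

Because the test speakers never appear in training, scoring on `test` corresponds to the speaker-independent condition as the abstract defines it.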

Keywords

Automatic speech recognition; Multi-speaker; Markov models; Dynamic programming; Clustering

Domains

Signal and Image Processing [eess.SP]

https://inria.hal.science/hal-00835107

Submitted on: Tuesday, 18 June 2013, 09:57:26

Last modified on: Friday, 24 March 2023, 14:52:57

Dates and versions

hal-00835107 , version 1 (18-06-2013)

Identifiers

HAL Id : hal-00835107 , version 1
DOI : 10.1002/asm.3150060302

Cite

Anne Boyer, Joseph Di Martino, P. Divoux, Jean-Paul Haton, Jean-François Mari, et al. Statistical methods in multi-speaker automatic speech recognition. Applied Stochastic Models and Data Analysis, 1990, 6 (3), pp.143-155. ⟨10.1002/asm.3150060302⟩. ⟨hal-00835107⟩

Collections

CNRS INRIA UNIV-LORRAINE INRIA2 LORIA