Fast training of Large Margin diagonal Gaussian mixture models for speaker identification

Reda Jourani; Khalid Daoudi; Régine André-Obrecht; Driss Aboutajdine

Conference paper. Year: 2011

Reda Jourani

Role: Author
PersonId: 881619
IdRef: 165708018

Équipe Structuration, Analyse et MOdélisation de documents Vidéo et Audio

Khalid Daoudi

Role: Author
PersonId: 1329075
ORCID: 0000-0003-3536-1060
IdRef: 115483500

Geometry and Statistics in acquisition data

Régine André-Obrecht

Role: Author
PersonId: 740810
IdHAL: obrecht
IdRef: 060375965

Équipe Structuration, Analyse et MOdélisation de documents Vidéo et Audio

Université Toulouse III - Paul Sabatier

Driss Aboutajdine

Role: Author

Laboratoire de Recherche en Informatique et Télécommunications [Rabat]

Abstract

Gaussian mixture models (GMMs) have been widely and successfully used in speaker recognition over the past decades. They are generally trained using the generative criterion of maximum likelihood estimation. In earlier work, we proposed an algorithm for discriminative training of GMMs with diagonal covariances under a large margin criterion. In this paper, we present a new version of this algorithm whose major advantage is its high computational efficiency, which makes it well suited to large-scale databases. We carry out experiments on a speaker identification task using NIST-SRE'2006 data and compare the new algorithm to the baseline generative GMM at different GMM sizes. The results show that our system significantly outperforms the baseline GMM in all configurations while remaining computationally efficient.
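As a rough illustration of the setting described in the abstract, the sketch below scores feature frames under diagonal-covariance GMMs and picks the speaker whose model yields the highest average log-likelihood. It does not reproduce the large margin training proposed in the paper. The function names (`diag_gmm_loglik`, `identify`), the toy models, and the toy frames are all assumptions made for illustration.

```python
import numpy as np


def diag_gmm_loglik(frames, weights, means, variances):
    """Per-frame log-likelihood under a diagonal-covariance GMM.

    frames: (T, D) feature vectors
    weights: (M,) mixture weights summing to 1
    means, variances: (M, D) per-component parameters
    """
    diff = frames[:, None, :] - means[None, :, :]                # (T, M, D)
    # Log of the Gaussian normalisation constant for each component.
    log_norm = -0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)  # (M,)
    log_comp = log_norm - 0.5 * np.sum(diff ** 2 / variances, axis=2)  # (T, M)
    log_joint = np.log(weights) + log_comp                       # (T, M)
    # Log-sum-exp over components, shifted by the maximum for stability.
    peak = log_joint.max(axis=1, keepdims=True)
    return peak[:, 0] + np.log(np.exp(log_joint - peak).sum(axis=1))


def identify(frames, speaker_models):
    """Return the best-scoring speaker and all average log-likelihoods."""
    scores = {
        name: float(diag_gmm_loglik(frames, *params).mean())
        for name, params in speaker_models.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical toy models: two speakers, two components each, 2-D features.
toy_models = {
    "A": (np.array([0.5, 0.5]),
          np.array([[0.0, 0.0], [1.0, 1.0]]),
          np.ones((2, 2))),
    "B": (np.array([0.5, 0.5]),
          np.array([[5.0, 5.0], [6.0, 6.0]]),
          np.ones((2, 2))),
}
# Hypothetical test frames placed near speaker B's component means.
toy_frames = np.array([[5.1, 4.9], [6.0, 6.2], [5.5, 5.4]])

best_speaker, all_scores = identify(toy_frames, toy_models)
print(best_speaker, all_scores)
```

In this generative baseline each speaker model is scored independently; a discriminative large margin criterion would instead adjust the model parameters so that the correct speaker's score exceeds the others by a margin.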

Domains

Signal and image processing [eess.SP]

Main file

SpeD-2011.pdf (498.33 KB)

Origin: Files produced by the author(s)

https://inria.hal.science/hal-00647213

Submitted on: Thursday, December 1, 2011, 16:33:53

Last modified on: Friday, February 2, 2024, 03:34:20

Long-term archiving on: Friday, March 2, 2012, 02:36:58

Dates and versions

hal-00647213, version 1 (01-12-2011)

Identifiers

HAL Id: hal-00647213, version 1

Cite

Reda Jourani, Khalid Daoudi, Régine André-Obrecht, Driss Aboutajdine. Fast training of Large Margin diagonal Gaussian mixture models for speaker identification. International Conference on Speech Technology and Human-Computer Dialogue (SpeD), May 2011, Brasov, Romania. ⟨hal-00647213⟩

Collections

UNIV-TLSE2 UNIV-RENNES1 CNRS INRIA IRISA UT1-CAPITOLE INRIA2 UR1-MATH-STIC UR1-UFR-ISTIC UNIV-RENNES IRIT IRIT-SAMOVA UR1-MATH-NUM IRIT-SI TOULOUSE-INP UNIV-UT3 UT3-TOULOUSEINP

248 views

427 downloads
