Probabilistic scoring using decision trees for fast and scalable speaker recognition

Gilles Gonon (1), Frédéric Bimbot (1), Rémi Gribonval (1)
(1) METISS - Speech and sound data modeling and processing
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, Inria Rennes – Bretagne Atlantique
Abstract: In the context of fast, low-cost speaker recognition, this article investigates several techniques based on decision trees. A new approach is introduced in which the trees are used to estimate a score function rather than to return a decision among classes. This technique is developed to approximate the GMM log-likelihood ratio (LLR) score function. Building on this approach, several solutions are derived to improve the accuracy of the proposed trees. The first studies the quantization of the LLR function in order to create classification trees on the LLR values. The second makes use of knowledge of the GMM distribution of the acoustic features in order to build oblique trees. The third uses a low-complexity score function in each of the tree leaves. A series of comparative experiments is performed on the NIST 2005 speaker recognition evaluation data to evaluate the impact of the proposed improvements in terms of efficiency, execution time and algorithmic complexity. Against a baseline system with an Equal Error Rate (EER) of 9.6% on the NIST 2005 evaluation, the best tree-based configuration achieves an EER of 12.9%, with a computational cost suited to embedded devices and an execution time suitable for real-time speaker identification.
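The core idea of the abstract, a tree that returns a score instead of a class, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes one-dimensional features, two hypothetical toy GMMs (`SPEAKER` and `WORLD`, with made-up parameters), training points taken on a regular grid, and a plain greedy regression tree with axis-aligned thresholds and constant leaves. Averaging frame scores over an utterance in `utterance_score` is also an assumption made for illustration.

```python
import math

def gmm_logpdf(x, weights, means, stds):
    """Log-density of a 1-D Gaussian mixture at x (log-sum-exp for stability)."""
    terms = [
        math.log(w) - 0.5 * math.log(2.0 * math.pi * s * s) - 0.5 * ((x - m) / s) ** 2
        for w, m, s in zip(weights, means, stds)
    ]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))

# Hypothetical toy models (weights, means, standard deviations); not from the paper.
SPEAKER = ([0.5, 0.5], [-1.0, 2.0], [0.7, 0.9])
WORLD = ([0.5, 0.5], [-1.5, 1.5], [1.2, 1.2])

def llr(x):
    """Reference score: log p(x | speaker) - log p(x | world)."""
    return gmm_logpdf(x, *SPEAKER) - gmm_logpdf(x, *WORLD)

def build_tree(points, depth):
    """Greedy regression tree on (x, target) pairs that are sorted by x."""
    n = len(points)
    total = sum(y for _, y in points)
    if depth == 0 or n < 4:
        return ("leaf", total / n)  # constant score in the leaf
    total_sq = sum(y * y for _, y in points)
    best_sse, best_i = None, None
    left_sum = left_sq = 0.0
    for i in range(1, n):  # candidate split between points[i-1] and points[i]
        y = points[i - 1][1]
        left_sum += y
        left_sq += y * y
        right_sum, right_sq = total - left_sum, total_sq - left_sq
        # Sum of squared errors around the mean on each side of the split.
        sse = (left_sq - left_sum ** 2 / i) + (right_sq - right_sum ** 2 / (n - i))
        if best_sse is None or sse < best_sse:
            best_sse, best_i = sse, i
    threshold = 0.5 * (points[best_i - 1][0] + points[best_i][0])
    return ("node", threshold,
            build_tree(points[:best_i], depth - 1),
            build_tree(points[best_i:], depth - 1))

def tree_score(tree, x):
    """Descend the tree with one comparison per level and return the leaf score."""
    while tree[0] == "node":
        _, threshold, left, right = tree
        tree = left if x <= threshold else right
    return tree[1]

def rmse(tree, points):
    """Root mean squared error between tree scores and target scores."""
    return math.sqrt(sum((tree_score(tree, x) - y) ** 2 for x, y in points) / len(points))

def utterance_score(tree, frames):
    """Average of the per-frame tree scores (an assumption of this sketch)."""
    return sum(tree_score(tree, x) for x in frames) / len(frames)

# Training targets are the exact LLR values on a regular grid over [-5, 5].
train = [(x, llr(x)) for x in (-5.0 + 10.0 * k / 999 for k in range(1000))]
stump = build_tree(train, 1)
tree = build_tree(train, 6)
print(rmse(stump, train), rmse(tree, train))
```

In this sketch, scoring one frame costs at most `depth` comparisons, whereas the reference `llr` evaluates every Gaussian component of both mixtures. The printed errors show how closely the deeper tree tracks the LLR on the training grid compared with a single split.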
Document type:
Journal article
Speech Communication, Elsevier : North-Holland, 2009, 51 (11), pp.1065 - 1081. 〈10.1016/j.specom.2009.02.007〉

https://hal.inria.fr/inria-00544959
Contributor: Rémi Gribonval
Submitted on: Sunday, 6 February 2011 - 22:17:07
Last modified on: Thursday, 11 January 2018 - 06:20:09
Document(s) archived on: Saturday, 7 May 2011 - 02:30:03

File

article_specom_gonon_decision_...
Files produced by the author(s)

Citation

Gilles Gonon, Frédéric Bimbot, Rémi Gribonval. Probabilistic scoring using decision trees for fast and scalable speaker recognition. Speech Communication, Elsevier : North-Holland, 2009, 51 (11), pp.1065 - 1081. 〈10.1016/j.specom.2009.02.007〉. 〈inria-00544959〉
