Metric learning loss functions to reduce domain mismatch in the x-vector space for language recognition

Raphaël Duroselle; Denis Jouvet; Irina Illina

Conference paper, Year: 2020

Raphaël Duroselle

Role: Author
PersonId : 1072633

Speech Modeling for Facilitating Oral-Based Communication

Denis Jouvet

Role: Author
PersonId : 15904
IdHAL : denis-jouvet
IdRef : 029418666

Speech Modeling for Facilitating Oral-Based Communication

Irina Illina

Role: Author
PersonId : 15663
IdHAL : irina-illina
IdRef : 120731746

Speech Modeling for Facilitating Oral-Based Communication

Abstract

State-of-the-art language recognition systems are based on discriminative embeddings called x-vectors. Channel and gender distortions produce mismatch in the x-vector space, where embeddings corresponding to the same language are not grouped in a single cluster. To control this mismatch, we propose to train the x-vector DNN with metric learning objective functions. Combining a classification loss with the metric learning n-pair loss improves language recognition performance. Such a system achieves robustness comparable to that of a system trained with a domain adaptation loss function, but without using the domain information. We also analyze the mismatch due to channel and gender, in comparison to language proximity, in the x-vector space. This is achieved using the Maximum Mean Discrepancy divergence measure between groups of x-vectors. Our analysis shows that using the metric learning loss function reduces gender and channel mismatch in the x-vector space, even for languages observed on only one channel in the training set.
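The two quantities named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the Gaussian kernel and its `gamma` value, the biased MMD estimator, and the `weight` used to combine the losses are all assumptions chosen for illustration. The n-pair loss is written in its usual multi-class form, where row i of `anchors` and row i of `positives` are embeddings of the same language and every other row is treated as a negative.

```python
import numpy as np


def n_pair_loss(anchors, positives):
    """Multi-class n-pair loss over N (anchor, positive) embedding pairs.

    Row i of both arrays is assumed to come from the same language;
    rows j != i are assumed to come from other languages.
    """
    logits = anchors @ positives.T                        # (N, N) similarities
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_softmax = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Cross-entropy with the matching pair (the diagonal) as the target.
    return -np.mean(np.diag(log_softmax))


def combined_loss(classification_loss, anchors, positives, weight=1.0):
    """Classification loss plus a weighted n-pair term (weight is hypothetical)."""
    return classification_loss + weight * n_pair_loss(anchors, positives)


def gaussian_kernel(x, y, gamma):
    """Gaussian kernel matrix exp(-gamma * ||x_i - y_j||^2) (assumed kernel)."""
    sq_dist = (x ** 2).sum(1)[:, None] + (y ** 2).sum(1)[None, :] - 2.0 * x @ y.T
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))


def mmd_squared(x, y, gamma=0.05):
    """Biased estimate of the squared Maximum Mean Discrepancy between two groups."""
    return (gaussian_kernel(x, x, gamma).mean()
            + gaussian_kernel(y, y, gamma).mean()
            - 2.0 * gaussian_kernel(x, y, gamma).mean())
```

In this sketch, `mmd_squared` would be called on two groups of embeddings, for example x-vectors of one language from two different channels; a larger value indicates a larger mismatch between the groups.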

Keywords

language recognition; domain adaptation; domain mismatch; x-vector embedding; metric learning

Domains

Computer Science [cs]

Main file

raphael_interspeech_v9.pdf (215.38 KB)

Origin: Files produced by the author(s)

https://hal.science/hal-02920460

Submitted on: Monday, August 24, 2020, 16:39:08

Last modified on: Wednesday, November 20, 2024, 15:28:17

Long-term archiving on: Tuesday, December 1, 2020, 20:37:46

Dates and versions

hal-02920460 , version 1 (24-08-2020)

Identifiers

HAL Id: hal-02920460, version 1

Cite

Raphaël Duroselle, Denis Jouvet, Irina Illina. Metric learning loss functions to reduce domain mismatch in the x-vector space for language recognition. INTERSPEECH 2020, Oct 2020, Shanghai / Virtual, China. ⟨hal-02920460⟩

Collections

CNRS INRIA GRID5000 UNIV-LORRAINE INRIA2 LORIA LORIA-NLPKD SILECS
