The Speed Submission to DIHARD II: Contributions & Lessons Learned

Md Sahidullah; Jose Patino; Samuele Cornell; Ruiqing Yin; Sunit Sivasankaran; Hervé Bredin; Pavel Korshunov; Alessio Brutti; Romain Serizel; Emmanuel Vincent; Nicholas Evans; Sébastien Marcel; Stefano Squartini; Claude Barras

Preprint, Working Paper, Year: 2019


Md Sahidullah

Role: Author
PersonId: 737397
IdHAL: sahid

Speech Modeling for Facilitating Oral-Based Communication

Jose Patino

Role: Author
PersonId: 743667
IdHAL: jose-patino
ORCID: 0000-0001-7193-0721
IdRef: 241999308

Eurecom [Sophia Antipolis]

Samuele Cornell

Role: Author

Polytechnic University of Marche

Ruiqing Yin

Role: Author

Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur

Sunit Sivasankaran

Role: Author

Speech Modeling for Facilitating Oral-Based Communication

Hervé Bredin

Role: Author
PersonId: 15856
IdHAL: hbredin
ORCID: 0000-0002-3739-925X
IdRef: 121165779

Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur

Pavel Korshunov

Role: Author

IDIAP Research Institute

Alessio Brutti

Role: Author

Fondazione Bruno Kessler [Trento, Italy]

Romain Serizel

Role: Author
PersonId: 741857
IdHAL: nicolas-furnon

Speech Modeling for Facilitating Oral-Based Communication

Emmanuel Vincent

Role: Author
PersonId: 1256
IdHAL: emmanuelv
ORCID: 0000-0002-0183-7289
IdRef: 089360176

Speech Modeling for Facilitating Oral-Based Communication

Nicholas Evans

Role: Author
PersonId: 938450

Eurecom [Sophia Antipolis]

Sébastien Marcel

Role: Author

IDIAP Research Institute

Stefano Squartini

Role: Author

Polytechnic University of Marche

Claude Barras

Role: Author

Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur

Abstract

This paper describes the speaker diarization systems developed for the Second DIHARD Speech Diarization Challenge (DIHARD II) by the Speed team. Besides describing the system, which considerably outperformed the challenge baselines, we also focus on the lessons learned from numerous approaches that we tried for single and multi-channel systems. We present several components of our diarization system, including categorization of domains, speech enhancement, speech activity detection, speaker embeddings, clustering methods, resegmentation, and system fusion. We analyze and discuss the effect of each such component on the overall diarization performance within the realistic settings of the challenge.

Keywords

DIHARD challenge; single-channel and multi-channel speech; Single-channel; Multichannel; Speaker Diarization; DIHARD 2019; Speech Activity Detection; Speaker recognition

Domains

Machine Learning [cs.LG]; Artificial Intelligence [cs.AI]; Multimedia [cs.MM]; Acoustics [physics.class-ph]; Signal and Image Processing [eess.SP]

Main file

Speed_DIHARDII_Manuscript.pdf (210.8 KB)

Origin: Files produced by the author(s)

Contributor: Md Sahidullah

https://inria.hal.science/hal-02352840

Submitted on: Tuesday, June 30, 2020, 19:04:00

Last modified on: Saturday, October 7, 2023, 21:36:20

Dates and versions

hal-02352840, version 1 (07-11-2019)

hal-02352840, version 2 (30-06-2020)

Identifiers

HAL Id: hal-02352840, version 2

Cite

Md Sahidullah, Jose Patino, Samuele Cornell, Ruiqing Yin, Sunit Sivasankaran, et al. The Speed Submission to DIHARD II: Contributions & Lessons Learned. 2019. ⟨hal-02352840v2⟩


Collections

CNRS INRIA EURECOM LIMSI GRID5000 UNIV-LORRAINE INRIA2 LORIA LORIA-NLPKD UNIV-PARIS-SACLAY SORBONNE-UNIVERSITE SILECS LISN GS-ENGINEERING GS-COMPUTER-SCIENCE

162 views

463 downloads
