Conference paper, Year: 2022

End-to-End and Self-Supervised Learning for ComParE 2022 Stuttering Sub-Challenge

Abstract

In this paper, we present end-to-end and speech-embedding-based systems, trained in a self-supervised fashion, for the stuttering sub-challenge of the ACM Multimedia 2022 ComParE Challenge. In particular, we exploit embeddings from the pre-trained Wav2Vec2.0 model for stuttering detection (SD) on the KSoF dataset. After embedding extraction, we benchmark several methods for SD. Our proposed self-supervised SD system achieves an unweighted average recall (UAR) of 36.9% and 41.0% on the validation and test sets respectively, which is 31.32% (validation set) and 1.49% (test set) higher in relative terms than the best challenge baseline (CBL), DeepSpectrum. Moreover, we show that concatenating layer embeddings with Mel-frequency cepstral coefficient (MFCC) features further improves the UAR, by 33.81% and 5.45% over the CBL on the validation and test sets respectively. Finally, we demonstrate that summing information across all the layers of Wav2Vec2.0 surpasses the CBL by a relative margin of 45.91% and 5.69% on the validation and test sets respectively.
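The abstract reports its results as UAR and as gains over the baseline. The following is a minimal sketch of how those two quantities are usually computed, not the authors' evaluation code. It assumes that UAR is the plain mean of per-class recalls and that the reported gains are relative, as the wording "relative margin" suggests. The function names and the toy labels are hypothetical, and the baseline values are only back-calculated from the figures in the abstract.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls, so every class counts equally."""
    total = defaultdict(int)    # number of reference items per class
    correct = defaultdict(int)  # correctly predicted items per class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


def relative_improvement(system, baseline):
    """Relative gain of `system` over `baseline`, in percent."""
    return 100.0 * (system - baseline) / baseline


def implied_baseline(system, relative_gain_pct):
    """Baseline score implied by a system score and its relative gain."""
    return system / (1.0 + relative_gain_pct / 100.0)


# Toy labels for illustration only (hypothetical, not KSoF data).
y_true = ["block", "block", "block", "fluent", "fluent", "filler"]
y_pred = ["block", "block", "fluent", "fluent", "fluent", "filler"]
uar = unweighted_average_recall(y_true, y_pred)

# Baselines implied by the abstract's figures, assuming relative gains.
val_baseline = implied_baseline(36.9, 31.32)   # roughly 28.1
test_baseline = implied_baseline(41.0, 1.49)   # roughly 40.4
```

Because UAR averages recall per class rather than per sample, a rare class affects the score as much as a frequent one, which matters when class sizes are imbalanced.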
Main file
Stuttering_Challenge_ACM_MultiMedia_Conference.pdf (759.74 KB)
Origin: Publisher files allowed on an open archive

Dates and versions

hal-03728331, version 1 (20-07-2022)

Identifiers

  • HAL Id: hal-03728331, version 1

Cite

Shakeel A Sheikh, Md Sahidullah, Fabrice Hirsch, Slim Ouni. End-to-End and Self-Supervised Learning for ComParE 2022 Stuttering Sub-Challenge. ACM Multimedia 2022 Computational Paralinguistics Challenge (ComParE), Oct 2022, Lisbon, Portugal. ⟨hal-03728331⟩
