End-to-End and Self-Supervised Learning for ComParE 2022 Stuttering Sub-Challenge

Shakeel A Sheikh; Md Sahidullah; Fabrice Hirsch; Slim Ouni

Conference paper, Year: 2022


Shakeel A Sheikh

Role: Author

Speech Modeling for Facilitating Oral-Based Communication

Md Sahidullah

Role: Author
PersonId: 737397
IdHAL: sahid

Speech Modeling for Facilitating Oral-Based Communication

Fabrice Hirsch

Role: Author

Praxiling

Slim Ouni

Role: Author
PersonId: 1158
IdHAL: slim-ouni
ORCID: 0000-0001-5286-7368

Speech Modeling for Facilitating Oral-Based Communication

Abstract

In this paper, we present end-to-end and speech-embedding-based systems trained in a self-supervised fashion for the ACM Multimedia 2022 ComParE Challenge, specifically the stuttering sub-challenge. In particular, we exploit embeddings from the pre-trained Wav2Vec2.0 model for stuttering detection (SD) on the KSoF dataset. After embedding extraction, we benchmark several SD methods. Our proposed self-supervised SD system achieves an unweighted average recall (UAR) of 36.9% and 41.0% on the validation and test sets respectively, which is 31.32% (validation set) and 1.49% (test set) higher in relative terms than the best challenge baseline (CBL), DeepSpectrum. Moreover, we show that concatenating layer embeddings with Mel-frequency cepstral coefficient (MFCC) features yields a further gain, with relative UAR improvements of 33.81% and 5.45% over the CBL on the validation and test sets respectively. Finally, we demonstrate that summing information across all layers of Wav2Vec2.0 surpasses the CBL by a relative margin of 45.91% and 5.69% on the validation and test sets respectively.
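
The abstract reports its results as UAR and as margins over the baseline, so a small worked example may help when reading the figures. The following is a minimal sketch, not the authors' evaluation code: it assumes UAR means the unweighted mean of per-class recalls and that the quoted margins are relative gains. The function names and the toy labels are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls, so every class counts equally regardless of size."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    return sum(correct[c] / total[c] for c in total) / len(total)


def relative_improvement(system, baseline):
    """Relative gain of `system` over `baseline`, in percent."""
    return 100.0 * (system - baseline) / baseline


def implied_baseline(system, relative_gain_percent):
    """Baseline score implied by a system score and its relative gain (assumption)."""
    return system / (1.0 + relative_gain_percent / 100.0)


# Hypothetical toy labels, not taken from the KSoF dataset.
y_true = ["block", "block", "block", "block", "fluent", "fluent"]
y_pred = ["block", "block", "block", "fluent", "fluent", "block"]
print(unweighted_average_recall(y_true, y_pred))

# Baseline UAR implied by the abstract's test-set figures,
# assuming the 1.49% margin is a relative gain.
print(implied_baseline(41.0, 1.49))
```

Because each class contributes equally, UAR is less forgiving than plain accuracy on imbalanced data: a system that favours the majority class scores well on accuracy but poorly on UAR.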

Keywords

Speech disorders; Disfluency; Stuttering detection; ComParE stuttering sub-challenge; Stuttering

Domains

Machine Learning [cs.LG]; Signal and Image Processing [eess.SP]; Artificial Intelligence [cs.AI]

Main file

Stuttering_Challenge_ACM_MultiMedia_Conference.pdf (759.74 KB)

Origin: Publisher files allowed on an open archive

Contributor: Shakeel Ahmad Sheikh

https://inria.hal.science/hal-03728331

Submitted on: Wednesday, 20 July 2022, 16:21:34

Last modified on: Friday, 26 April 2024, 10:14:03

Long-term archiving on: Friday, 21 October 2022, 18:39:49

Dates and versions

hal-03728331, version 1 (20-07-2022)

Identifiers

HAL Id: hal-03728331, version 1

Cite

Shakeel A Sheikh, Md Sahidullah, Fabrice Hirsch, Slim Ouni. End-to-End and Self-Supervised Learning for ComParE 2022 Stuttering Sub-Challenge. ACM Multimedia 2022 Computational Paralinguistics Challenge (ComParE), Oct 2022, Lisbon, Portugal. ⟨hal-03728331⟩

Collections

CNRS INRIA UNIV-MONTP3 PRAXILING UNIV-LORRAINE INRIA2 LORIA LORIA-NLPKD

75 views

157 downloads