Exploiting the Intermittency of Speech for Joint Separation and Diarization

Natural conversations are spontaneous exchanges in which two or more people speak intermittently. One therefore expects such conversations to contain intervals during which some of the speakers are silent. Yet most multichannel audio source separation (MASS) methods assume that the sound sources emit continuously over the total duration of the processed mixture. In this paper we propose a probabilistic model for MASS in which the sources may have pauses. The activity of the sources is modeled as a hidden state, the diarization state, which enables us to activate or deactivate the sound sources at time-frame resolution. We plug the diarization model into the spatial covariance matrix model proposed for MASS, and obtain an improvement in performance over the state of the art when separating mixtures with intermittent speakers.
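To illustrate how a per-frame activity state can switch sources on and off inside a spatial covariance model, here is a minimal sketch. It assumes the common local Gaussian formulation, in which each source contributes a variance-scaled spatial covariance matrix to the mixture covariance at one time-frequency bin. The function name `mixture_covariance`, its parameters, and the toy values are hypothetical and are not taken from the paper, whose exact parameterization and inference procedure may differ.

```python
def mixture_covariance(activity, variances, spatial_covs, noise_var=0.0):
    """Covariance of the mixture at one time-frequency bin (illustrative only).

    activity     -- one 0/1 flag per source: the assumed diarization state
    variances    -- one non-negative source variance per source
    spatial_covs -- one square matrix (list of rows) per source
    noise_var    -- variance of an assumed spatially white noise term
    """
    n_ch = len(spatial_covs[0])
    # Start from the noise covariance (a scaled identity matrix).
    cov = [[noise_var if r == c else 0.0 for c in range(n_ch)]
           for r in range(n_ch)]
    for active, var, R in zip(activity, variances, spatial_covs):
        if not active:
            continue  # a silent source contributes nothing in this frame
        for r in range(n_ch):
            for c in range(n_ch):
                cov[r][c] += var * R[r][c]
    return cov


# Toy example: two sources observed by two microphones (made-up values).
R1 = [[1.0, 0.5], [0.5, 1.0]]
R2 = [[1.0, -0.5], [-0.5, 1.0]]
only_first = mixture_covariance((1, 0), [2.0, 3.0], [R1, R2])
both = mixture_covariance((1, 1), [2.0, 3.0], [R1, R2])
print(only_first)
print(both)
```

In this sketch the activity flags are given; in a joint separation and diarization setting they would be hidden variables inferred together with the source parameters.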

Keywords

Audio source separation; speaker diarization; spatial covariance matrix; EM

Domains

Machine Learning [cs.LG]; Sound [cs.SD]

Main file

main.pdf (2.48 MB)

Origin: Files produced by the author(s)

https://inria.hal.science/hal-01568813

Submitted on: Tuesday, July 25, 2017, 18:07:17

Last modified on: Thursday, April 4, 2024, 21:08:54

Dates and versions

hal-01568813 , version 1 (25-07-2017)

Identifiers

HAL Id: hal-01568813, version 1
DOI: 10.1109/WASPAA.2017.8169991

Cite

Dionyssos Kounades-Bastian, Laurent Girin, Xavier Alameda-Pineda, Radu Horaud, Sharon Gannot. Exploiting the Intermittency of Speech for Joint Separation and Diarization. WASPAA 2017 - IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, Oct 2017, New Paltz, NY, United States. pp.41-45, ⟨10.1109/WASPAA.2017.8169991⟩. ⟨hal-01568813⟩
