Advances in audio source separation and multisource audio content retrieval

Emmanuel Vincent

Other publication, Year: 2012

Role: Author
PersonId : 1256
IdHAL : emmanuelv
ORCID : 0000-0002-0183-7289
IdRef : 089360176

Speech and sound data modeling and processing

Abstract

Audio source separation aims to extract the signals of individual sound sources from a given recording. In this paper, we review three recent advances which improve the robustness of source separation in challenging real-world scenarios and enable its use for multisource content retrieval tasks, such as automatic speech recognition (ASR) or acoustic event detection (AED) in noisy environments. We present the Flexible Audio Source Separation Toolkit (FASST) and discuss its advantages over earlier approaches such as independent component analysis (ICA) and sparse component analysis (SCA). We explain how cues as diverse as harmonicity, spectral envelope, temporal fine structure or spatial location can be jointly exploited by this toolkit. We then present the uncertainty decoding (UD) framework for the integration of audio source separation and audio content retrieval. We show how the uncertainty about the separated source signals can be accurately estimated and propagated to the features. Finally, we explain how this uncertainty can be efficiently exploited by a classifier, at both the training and the decoding stages. We illustrate the resulting performance improvements in terms of speech separation quality and speaker recognition accuracy.
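
The decoding-stage idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes the textbook Gaussian form of uncertainty decoding: if a feature is only known as a Gaussian with some mean and diagonal variance, integrating it out against a Gaussian class model amounts to adding the feature variance to the class variance. This is not FASST's API nor the estimator described in the paper; the function names and the two toy speaker models are hypothetical.

```python
import math


def gaussian_log_pdf(x, mean, var):
    """Log-density of a diagonal-covariance Gaussian, summed over dimensions."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def uncertainty_decode(feat_mean, feat_var, classes):
    """Return the best class and all scores for an uncertain feature.

    feat_mean, feat_var: estimated feature and its per-dimension uncertainty.
    classes: name -> (mean, var) of a diagonal Gaussian class model.
    The feature uncertainty is added to each class variance before scoring.
    """
    scores = {}
    for name, (mean, var) in classes.items():
        inflated = [v + u for v, u in zip(var, feat_var)]
        scores[name] = gaussian_log_pdf(feat_mean, mean, inflated)
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical toy models (assumption: not taken from the paper).
classes = {
    "speaker_a": ([0.0, 0.0], [1.0, 1.0]),
    "speaker_b": ([3.0, 2.0], [1.0, 1.0]),
}

# Same feature estimate, scored without and with uncertainty on dimension 0.
plain, _ = uncertainty_decode([2.5, 0.0], [0.0, 0.0], classes)
robust, _ = uncertainty_decode([2.5, 0.0], [100.0, 0.0], classes)
print(plain, robust)
```

In this toy setting, treating the first feature dimension as highly uncertain lets the reliable second dimension dominate the decision, so the selected class changes from speaker_b to speaker_a.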

Domains

Signal and image processing [eess.SP]

Main file

vincent_DSS12.pdf (129.27 KB)

Origin: Files produced by the author(s)

https://inria.hal.science/hal-00664090

Submitted on: Monday, 27 February 2012, 09:50:37

Last modified on: Friday, 24 March 2023, 14:52:55

Long-term archiving on: Thursday, 14 June 2012, 16:27:57

Dates and versions

hal-00664090 , version 1 (27-02-2012)

Identifiers

HAL Id : hal-00664090 , version 1

Cite

Emmanuel Vincent. Advances in audio source separation and multisource audio content retrieval. 2012. ⟨hal-00664090⟩

Collections

EC-PARIS UNIV-RENNES1 CNRS INRIA INSA-RENNES IRISA IRISA-D5 INRIA2 UR1-MATH-STIC UR1-UFR-ISTIC UNIV-RENNES INSA-GROUPE UR1-MATH-NUM

336 views

224 downloads
