Online Spectrogram Inversion for Low-Latency Audio Source Separation

Paul Magron; Tuomas Virtanen

doi:10.1109/LSP.2020.2970310

Journal article, IEEE Signal Processing Letters. Year: 2020

Paul Magron

Role: Author
PersonId: 1085197
ORCID: 0000-0002-8561-0961

Signal et Communications

Tuomas Virtanen

Role: Author
PersonId: 1113848

University of Tampere [Finland]

Abstract

Audio source separation is usually achieved by estimating the short-time Fourier transform (STFT) magnitude of each source and then applying a spectrogram inversion algorithm to retrieve time-domain signals. In particular, the multiple input spectrogram inversion (MISI) algorithm has been used successfully in several recent works. However, this algorithm suffers from two drawbacks, which we address in this paper. First, it was originally introduced in a heuristic fashion: we propose a rigorous optimization framework from which MISI is derived, thus proving the convergence of this algorithm. Second, while MISI operates offline, we propose an online version called oMISI, which is suitable for low-latency source separation, an important requirement for applications such as hearing aids. oMISI also allows the use of alternative phase initialization schemes that exploit the temporal structure of audio signals. Experiments conducted on a speech separation task show that oMISI performs as well as its offline counterpart, thus demonstrating its potential for real-time source separation.
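To make the pipeline described in the abstract concrete, the following is a minimal sketch of the offline MISI iteration as it is commonly described: invert each source STFT, spread the mixing error equally over the sources, recompute the STFTs, and keep only their phases while restoring the target magnitudes. It is not the authors' code and it does not implement oMISI. The function names (`stft`, `istft`, `misi`), the window and hop sizes, the mixture-phase initialization, and the toy signals are all assumptions made for illustration.

```python
import numpy as np


def stft(x, n_fft=512, hop=128):
    """STFT with a periodic Hann window; returns (n_frames, n_fft // 2 + 1)."""
    win = np.hanning(n_fft + 1)[:-1]
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * win for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)


def istft(X, length, n_fft=512, hop=128):
    """Inverse STFT by weighted overlap-add (least-squares synthesis)."""
    win = np.hanning(n_fft + 1)[:-1]
    frames = np.fft.irfft(X, n=n_fft, axis=1) * win
    x = np.zeros(length)
    norm = np.zeros(length)
    for i, frame in enumerate(frames):
        x[i * hop:i * hop + n_fft] += frame
        norm[i * hop:i * hop + n_fft] += win ** 2
    # Floor the normalizer to avoid dividing by ~0 at the signal edges.
    return x / np.maximum(norm, 1e-8)


def misi(mixture, magnitudes, n_iter=20, n_fft=512, hop=128):
    """Offline MISI sketch.

    mixture:    1-D time-domain mixture.
    magnitudes: array (n_src, n_frames, n_bins) of target STFT magnitudes.
    Returns an array (n_src, len(mixture)) of time-domain estimates.
    """
    magnitudes = np.asarray(magnitudes)
    n_src = magnitudes.shape[0]
    # Assumed initialization: give every source the phase of the mixture.
    S = magnitudes * np.exp(1j * np.angle(stft(mixture, n_fft, hop)))
    for _ in range(n_iter):
        sigs = np.stack([istft(Sj, len(mixture), n_fft, hop) for Sj in S])
        # Mixing error, distributed equally over the sources.
        err = mixture - sigs.sum(axis=0)
        Y = np.stack([stft(s + err / n_src, n_fft, hop) for s in sigs])
        # Keep the updated phase, restore the target magnitude.
        S = magnitudes * np.exp(1j * np.angle(Y))
    return np.stack([istft(Sj, len(mixture), n_fft, hop) for Sj in S])


# Toy usage with oracle magnitudes (illustrative signals, not real data).
rng = np.random.default_rng(0)
t = np.arange(4096) / 16000.0
sources = np.stack([np.sin(2 * np.pi * 440 * t), 0.5 * rng.standard_normal(4096)])
mixture = sources.sum(axis=0)
magnitudes = np.abs(np.stack([stft(s) for s in sources]))
estimates = misi(mixture, magnitudes, n_iter=5)
```

In practice the magnitudes would come from a separation model rather than from the true sources. Note that every iteration here needs the whole signal, which is the offline limitation the abstract says oMISI is designed to remove.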

Keywords

audio source separation; low-latency; online spectrogram inversion; phase recovery; sinusoidal modeling

Domains

Signal and Image Processing [eess.SP]; Sound [cs.SD]

Main file

Online_MISI.pdf (525.72 KB)

Origin: Files produced by the author(s)

https://inria.hal.science/hal-03132170

Submitted on: Monday, October 11, 2021, 15:16:37

Last modified on: Monday, November 20, 2023, 11:44:23

Long-term archiving on: Wednesday, January 12, 2022, 20:05:50

Dates and versions

hal-03132170, version 1 (11-10-2021)

Identifiers

HAL Id: hal-03132170, version 1
arXiv: 1911.03128
DOI: 10.1109/LSP.2020.2970310

Cite

Paul Magron, Tuomas Virtanen. Online Spectrogram Inversion for Low-Latency Audio Source Separation. IEEE Signal Processing Letters, 2020, 27, pp.306-310. ⟨10.1109/LSP.2020.2970310⟩. ⟨hal-03132170⟩

Collections

UNIV-TLSE2 CNRS SMS UT1-CAPITOLE IRIT IRIT-SC IRIT-SI TOULOUSE-INP UNIV-UT3 UT3-TOULOUSEINP
