Extension of sparse, adaptive signal decompositions to semi-blind audio source separation

Andrew Nesbit 1, Emmanuel Vincent 2, Mark Plumbley 1
2 METISS - Speech and sound data modeling and processing
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, Inria Rennes – Bretagne Atlantique
Abstract : We apply sparse, fast and flexible adaptive lapped orthogonal transforms to underdetermined audio source separation using the time-frequency masking framework. This normally requires the sources to overlap as little as possible in the time-frequency plane. In this work, we apply our adaptive transform schemes to the semi-blind case, in which the mixing system is already known, but the sources are unknown. By assuming that exactly two sources are active at each time-frequency index, we determine both the adaptive transforms and the estimated source coefficients using l1-norm minimisation. We show average performance of 12-13 dB SDR on speech and music mixtures, and show that the adaptive transform scheme offers improvements in the order of several tenths of a dB over transforms with constant block length. Comparison with previously studied upper bounds suggests that the potential for future improvements is significant.
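
The coefficient-estimation step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a two-channel mixture, a known real-valued mixing matrix whose columns are pairwise linearly independent, and transform coefficients that have already been computed; the adaptive choice of transform is not covered. The function name separate_two_active and the array layout are hypothetical. For each time-frequency index, every pair of sources is tried, the 2x2 subsystem is solved exactly, and the pair giving the smallest l1 norm is kept.

```python
import itertools
import numpy as np


def separate_two_active(X, A):
    """Estimate source coefficients with at most two active sources per index.

    X : (2, n_coeffs) array of mixture transform coefficients (assumed two channels).
    A : (2, n_sources) known mixing matrix (assumed pairwise independent columns).
    Returns S : (n_sources, n_coeffs) array such that A @ S equals X.
    """
    n_sources = A.shape[1]
    n_coeffs = X.shape[1]
    S = np.zeros((n_sources, n_coeffs), dtype=X.dtype)
    best_cost = np.full(n_coeffs, np.inf)

    for j, k in itertools.combinations(range(n_sources), 2):
        # Solve the 2x2 system for this candidate pair at every index at once.
        candidate = np.linalg.solve(A[:, [j, k]], X)
        # l1 norm of the candidate source coefficients at each index.
        cost = np.abs(candidate).sum(axis=0)
        better = cost < best_cost
        # Overwrite indices where this pair gives a smaller l1 norm.
        S[:, better] = 0
        S[j, better] = candidate[0, better]
        S[k, better] = candidate[1, better]
        best_cost[better] = cost[better]
    return S


if __name__ == "__main__":
    # Toy example: three sources, two channels, at most two sources active per index.
    A = np.array([[1.0, 0.7, 0.2],
                  [0.2, 0.7, 1.0]])
    S_true = np.array([[1.0, 0.0, 0.5],
                       [0.0, 2.0, 0.0],
                       [-1.0, 0.0, 0.0]])
    X = A @ S_true
    S_est = separate_two_active(X, A)
    print(np.allclose(A @ S_est, X))
```

Because each candidate pair reproduces the mixture exactly, the estimate always satisfies the mixing equation; whether the selected pair matches the truly active sources depends on how well the two-active-sources assumption holds in the chosen transform domain.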

Cited literature [9 references]

https://hal.inria.fr/inria-00544153
Contributor : Emmanuel Vincent
Submitted on : Tuesday, December 7, 2010 - 1:57:19 PM
Last modification on : Thursday, March 21, 2019 - 2:20:42 PM
Long-term archiving on : Tuesday, March 8, 2011 - 4:18:50 AM

File

nesbit_ICA09.pdf
Publisher files allowed on an open archive

Identifiers

  • HAL Id : inria-00544153, version 1

Citation

Andrew Nesbit, Emmanuel Vincent, Mark Plumbley. Extension of sparse, adaptive signal decompositions to semi-blind audio source separation. 8th Int. Conf. on Independent Component Analysis and Signal Separation (ICA), Mar 2009, Paraty, Brazil. pp.605--612. ⟨inria-00544153⟩
