Conference papers

Extension of sparse, adaptive signal decompositions to semi-blind audio source separation

Andrew Nesbit 1 Emmanuel Vincent 2 Mark Plumbley 1
2 METISS - Speech and sound data modeling and processing
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, Inria Rennes – Bretagne Atlantique
Abstract : We apply sparse, fast and flexible adaptive lapped orthogonal transforms to underdetermined audio source separation using the time-frequency masking framework. This normally requires the sources to overlap as little as possible in the time-frequency plane. In this work, we apply our adaptive transform schemes to the semi-blind case, in which the mixing system is already known, but the sources are unknown. By assuming that exactly two sources are active at each time-frequency index, we determine both the adaptive transforms and the estimated source coefficients using l1 norm minimisation. We show average performance of 12-13 dB SDR on speech and music mixtures, and show that the adaptive transform scheme offers improvements in the order of several tenths of a dB over transforms with constant block length. Comparison with previously studied upper bounds suggests that the potential for future improvements is significant.
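The coefficient-estimation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a two-channel (stereo) mixture, real-valued transform coefficients, and a known real mixing matrix `A`, and it leaves out the adaptive choice of transform entirely. The function name `l1_min_two_active` and its arguments are hypothetical. For each time-frequency index it tries every pair of sources, solves the resulting 2x2 system exactly, and keeps the pair whose solution has the smallest l1 norm.

```python
from itertools import combinations

import numpy as np


def l1_min_two_active(A, X):
    """Estimate source coefficients with exactly two active sources per index.

    A : (2, N) known real mixing matrix (assumed stereo mixture, N sources).
    X : (2, T) mixture coefficients at T time-frequency indices.
    Returns S : (N, T) with at most two non-zero entries per column.
    """
    n_src = A.shape[1]
    n_coef = X.shape[1]
    best_cost = np.full(n_coef, np.inf)
    S = np.zeros((n_src, n_coef))

    for j, k in combinations(range(n_src), 2):
        A_jk = A[:, [j, k]]
        # Skip pairs whose mixing directions are (nearly) collinear.
        if abs(np.linalg.det(A_jk)) < 1e-12:
            continue
        # Exact solution of the 2x2 system for all indices at once.
        S_jk = np.linalg.solve(A_jk, X)
        cost = np.abs(S_jk).sum(axis=0)  # l1 norm per index
        better = cost < best_cost
        best_cost[better] = cost[better]
        # Overwrite the previous best pair at the improved indices.
        S[:, better] = 0.0
        S[j, better] = S_jk[0, better]
        S[k, better] = S_jk[1, better]
    return S
```

Each selected pair reproduces the mixture exactly, so `A @ S` equals `X` up to rounding. Whether the selected pair matches the truly active sources depends on the mixing directions and the coefficient values, which is one reason sparsity of the transform matters.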
Contributor : Emmanuel Vincent
Submitted on : Tuesday, December 7, 2010 - 1:57:19 PM
Last modification on : Wednesday, June 16, 2021 - 3:41:18 AM
Long-term archiving on : Tuesday, March 8, 2011 - 4:18:50 AM


Publisher files allowed on an open archive


  • HAL Id : inria-00544153, version 1


Andrew Nesbit, Emmanuel Vincent, Mark Plumbley. Extension of sparse, adaptive signal decompositions to semi-blind audio source separation. 8th Int. Conf. on Independent Component Analysis and Signal Separation (ICA), Mar 2009, Paraty, Brazil. pp.605--612. ⟨inria-00544153⟩


