A multi-resolution approach to common fate-based audio separation

Fatemeh Pishdadian 1, Bryan Pardo 1, Antoine Liutkus 2
2 MULTISPEECH - Speech Modeling for Facilitating Oral-Based Communication
Inria Nancy - Grand Est, LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
Abstract : We propose the Multi-resolution Common Fate Transform (MCFT), a signal representation that increases the separability of audio sources with significant energy overlap in the time-frequency domain. The MCFT combines the desirable features of two existing representations: the invertibility of the recently proposed Common Fate Transform (CFT) and the multi-resolution property of the cortical stage output of an auditory model. We compare the utility of the MCFT to the CFT by measuring the quality of source separation performed via ideal binary masking using each representation. Experiments on harmonic sounds with overlapping fundamental frequencies and different spectro-temporal modulation patterns show that ideal masks based on the MCFT yield better separation than those based on the CFT.
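As a rough illustration of the ideal binary masking used for evaluation in the abstract, the sketch below builds one mask per source by marking the coefficients where that source has the largest magnitude, then applies the masks to the mixture. It is a generic sketch, not the authors' implementation: random complex arrays stand in for the coefficients of any linear representation (STFT, CFT, or MCFT), and the function names `ideal_binary_masks` and `apply_masks` are hypothetical.

```python
import numpy as np


def ideal_binary_masks(source_reps):
    """Return one binary mask per source.

    Each mask is 1 where that source has the largest magnitude among
    all sources and 0 elsewhere. Requires the true sources, hence "ideal".
    """
    mags = np.abs(np.stack(source_reps))   # shape: (n_sources, ...)
    winner = np.argmax(mags, axis=0)       # index of the dominant source
    return [(winner == k).astype(float) for k in range(len(source_reps))]


def apply_masks(mixture_rep, masks):
    """Estimate each source by masking the mixture representation."""
    return [m * mixture_rep for m in masks]


# Toy data (assumption): random complex coefficients stand in for the
# transform of each source; linearity makes the mixture their sum.
rng = np.random.default_rng(0)
shape = (4, 6)
s1 = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
s2 = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
mix = s1 + s2

masks = ideal_binary_masks([s1, s2])
estimates = apply_masks(mix, masks)
```

With an invertible representation, each masked array would then be passed through the inverse transform to obtain a time-domain estimate of the corresponding source.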
Document type :
Conference papers

Cited literature : 18 references

https://hal.inria.fr/hal-01515951
Contributor : Antoine Liutkus
Submitted on : Friday, April 28, 2017 - 12:36:15 PM
Last modification on : Tuesday, December 18, 2018 - 4:38:02 PM
Long-term archiving on : Saturday, July 29, 2017 - 1:20:13 PM

File

pishdadian.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01515951, version 1

Citation

Fatemeh Pishdadian, Bryan Pardo, Antoine Liutkus. A multi-resolution approach to common fate-based audio separation. 42nd International Conference on Acoustics, Speech and Signal Processing (ICASSP), Mar 2017, New Orleans, United States. ⟨hal-01515951⟩
