An EM Algorithm for Audio Source Separation Based on the Convolutive Transfer Function

Xiaofei Li (1), Laurent Girin (2, 1), Radu Horaud (1)
(1) PERCEPTION - Interpretation and Modelling of Images and Videos
Inria Grenoble - Rhône-Alpes, LJK - Laboratoire Jean Kuntzmann, INPG - Institut National Polytechnique de Grenoble
(2) GIPSA-CRISSP - CRISSP
GIPSA-DPC - Département Parole et Cognition
Abstract: This paper addresses the problem of audio source separation from (possibly under-determined) multichannel convolutive mixtures. We propose a separation method based on the convolutive transfer function (CTF) in the short-time Fourier transform domain. For strongly reverberant signals, the CTF is a much more appropriate model than the widely-used multiplicative transfer function approximation. An Expectation-Maximization (EM) algorithm is proposed to jointly estimate the model parameters, including the CTF coefficients of the mixing filters, and infer the sources. Experiments show that the proposed method provides very satisfactory performance on highly reverberant speech mixtures.
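
As a rough illustration of the distinction drawn in the abstract, the sketch below contrasts the two mixing models for a single source and a single microphone in the STFT domain. Under the multiplicative transfer function approximation, each time-frequency bin of the source is scaled by one coefficient per frequency; under the CTF, each frequency band is convolved along the frame axis with a short filter. The function names, array shapes, and filter length Q are illustrative assumptions and are not taken from the paper. The EM estimation itself is not shown.

```python
import numpy as np

def mtf_mix(s, a0):
    """Multiplicative model: x[k, p] = a0[k] * s[k, p].

    s  : (K, P) complex source STFT (K frequency bins, P frames)
    a0 : (K,)   one complex coefficient per frequency bin
    """
    return a0[:, None] * s

def ctf_mix(s, a):
    """Convolutive model: x[k, p] = sum_q a[k, q] * s[k, p - q].

    s : (K, P) complex source STFT
    a : (K, Q) CTF coefficients, Q filter taps per frequency bin
    The output is truncated to the first P frames.
    """
    K, P = s.shape
    x = np.zeros((K, P), dtype=complex)
    for k in range(K):
        # Convolution along the frame axis, independently in each band.
        x[k] = np.convolve(s[k], a[k])[:P]
    return x

# Hypothetical sizes, chosen only for the demonstration.
rng = np.random.default_rng(0)
K, P, Q = 4, 10, 3
s = rng.standard_normal((K, P)) + 1j * rng.standard_normal((K, P))
a = rng.standard_normal((K, Q)) + 1j * rng.standard_normal((K, Q))

x_ctf = ctf_mix(s, a)
x_mtf = mtf_mix(s, a[:, 0])
```

With a single tap (Q = 1) the convolutive model reduces to the multiplicative one. Additional taps let energy from earlier frames leak into the current frame, which is how this kind of model accounts for reverberation longer than the STFT window.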

https://hal.inria.fr/hal-01568818
Contributor: Team Perception
Submitted on: Tuesday, July 25, 2017 - 6:12:12 PM
Last modification on: Friday, September 14, 2018 - 1:14:57 AM

File

Xiaofei_WASPAA_2017.pdf
Files produced by the author(s)

Citation

Xiaofei Li, Laurent Girin, Radu Horaud. An EM Algorithm for Audio Source Separation Based on the Convolutive Transfer Function. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, Oct 2017, New Paltz, NY, United States. pp.56-60, ⟨10.1109/WASPAA.2017.8169994⟩. ⟨hal-01568818⟩
