For the PSDs $v_s(n, f)$, $v_z(n, f)$ and $v_b(n, f)$, we use the same procedure as in Section 4.1. Here, the residual echo $z(n, f)$ is the only latent signal

Note that the definition of the 5 components of the early near-end signal $s_e(t)$ is an extension of


Baselines. Hereafter we denote our joint NN-supported approach as NN-joint. We compare it with four baselines:

1. Togami: our implementation of Togami et al.'s approach.
2. Cascade: a cascade approach where the echo cancellation filter $H(f)$, the dereverberation filter $G(f)$ and the Wiener postfilter $W_{s_e}(n, f)$ are estimated and applied one after another. Echo cancellation relies on SpeexDSP, which implements Valin's adaptive approach and is particularly suitable for time-varying conditions.
3. NN-parallel: the variant of NN-joint where the echo cancellation filter $H(f)$ and the dereverberation filter $G(f)$ are applied in parallel, as in Togami et al.'s approach.
4. NN-cascade: the variant of Cascade where the echo cancellation filter $H(f)$ is estimated using the NN-supported approach similar to NN-joint (see Section 6) instead of Valin's adaptive approach. As WPE dereverberates similarly to its NN-supported counterpart in the multichannel case [16], NN-cascade corresponds to a cascade variant of NN-joint which estimates each filter separately using NN-supported optimization algorithms.
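To make the "one after another" structure of the Cascade baseline concrete, the following is a minimal sketch for a single frequency bin. It is not the authors' implementation: it assumes a single channel, single-tap filters `h` and `g`, a fixed prediction `delay`, and a scalar Wiener gain built from the three PSDs. All function and parameter names are hypothetical.

```python
def wiener_gain(v_s, v_z, v_b, eps=1e-12):
    """Scalar Wiener gain: target PSD over total PSD (single-channel assumption)."""
    return v_s / max(v_s + v_z + v_b, eps)


def cascade_one_bin(mic, far_end, h, g, delay, psds):
    """Apply the three stages in sequence to the frames of one frequency bin.

    mic, far_end: lists of STFT coefficients (one per frame)
    h, g:         single-tap echo and dereverberation filters (assumed given)
    delay:        prediction delay in frames for the dereverberation stage
    psds:         list of (v_s, v_z, v_b) tuples, one per frame
    """
    # Stage 1: echo cancellation, subtract the filtered far-end signal.
    echo_cancelled = [d - h * x for d, x in zip(mic, far_end)]

    output = []
    for n, e in enumerate(echo_cancelled):
        # Stage 2: dereverberation, subtract a prediction from a delayed frame.
        past = echo_cancelled[n - delay] if n >= delay else 0.0
        dereverberated = e - g * past
        # Stage 3: Wiener postfilter on the remaining residuals.
        v_s, v_z, v_b = psds[n]
        output.append(wiener_gain(v_s, v_z, v_b) * dereverberated)
    return output
```

In this sketch each stage only sees the output of the previous one, which is the defining property of a cascade. A parallel variant would instead subtract the echo and reverberation predictions from the same input.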

G. Carbajal, R. Serizel, E. Vincent, and E. Humbert, Joint DNN-based multichannel reduction of echo, reverberation and noise, IEEE/ACM Transactions on Audio, Speech, and Language Processing.

T. Nakatani, T. Yoshioka, K. Kinoshita, M. Miyoshi, and B. H. Juang, Speech dereverberation based on variance-normalized delayed linear prediction, IEEE Transactions on Audio, Speech, and Language Processing, vol.18, issue.7, pp.1717-1731, 2010.

T. Yoshioka, T. Nakatani, M. Miyoshi, and H. G. Okuno, Blind separation and dereverberation of speech mixtures by joint optimization, IEEE Transactions on Audio, Speech, and Language Processing, vol.19, issue.1, pp.69-84, 2011.

A. A. Nugraha, A. Liutkus, and E. Vincent, Multichannel audio source separation with deep neural networks, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol.24, issue.9, pp.1652-1664, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01163369

A. Ozerov and C. Févotte, Multichannel nonnegative matrix factorization in convolutive mixtures for audio source separation, IEEE Transactions on Audio, Speech, and Language Processing, vol.18, issue.3, pp.550-563, 2010.

T. Yoshioka and T. Nakatani, Generalization of multi-channel linear prediction methods for blind MIMO impulse response shortening, IEEE Transactions on Audio, Speech, and Language Processing, vol.20, issue.10, pp.2707-2720, 2012.

N. Q. Duong, E. Vincent, and R. Gribonval, Under-determined reverberant audio source separation using a full-rank spatial covariance model, IEEE Transactions on Audio, Speech, and Language Processing, vol.18, issue.7, pp.1830-1840, 2010.
URL : https://hal.archives-ouvertes.fr/inria-00541865

A. A. Nugraha, A. Liutkus, and E. Vincent, Multichannel music separation with deep neural networks, Proc. EUSIPCO, pp.1748-1752, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01334614

A. Liutkus, D. Fitzgerald, and Z. Rafii, Scalable audio separation with light kernel additive modelling, Proc. ICASSP, pp.76-80, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01114890

G. Carbajal, R. Serizel, E. Vincent, and E. Humbert, Joint DNN-based multichannel reduction of echo, reverberation and noise: Supporting document, Inria, 2019.

J. M. Valin, On adjusting the learning rate in frequency domain echo cancellation with double-talk, IEEE Transactions on Audio, Speech, and Language Processing, vol.15, issue.3, pp.1030-1034, 2007.

G. Carbajal, R. Serizel, E. Vincent, and E. Humbert, Multiple-input neural network-based residual echo suppression, Proc. ICASSP, pp.231-235, 2018.
URL : https://hal.archives-ouvertes.fr/hal-01723630

E. Vincent, D. R. Campbell, and R. , Available: http, 2008.

J. Le Roux, S. Wisdom, H. Erdogan, and J. R. Hershey, SDR - half-baked or well done?, Proc. ICASSP, pp.626-630, 2019.

M. Togami and Y. Kawaguchi, Simultaneous optimization of acoustic echo reduction, speech dereverberation, and noise reduction against mutual interference, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol.22, issue.11, pp.1612-1623, 2014.

K. Kinoshita, M. Delcroix, H. Kwon, T. Mori, and T. Nakatani, Neural network-based spectrum estimation for online WPE dereverberation, pp.384-388, 2017.