P. C. Loizou, Speech enhancement: theory and practice, 2007.

A. Ozerov and C. Févotte, Multichannel nonnegative matrix factorization in convolutive mixtures for audio source separation, IEEE Trans. Audio, Speech, Language Process, vol.18, issue.3, pp.550-563, 2010.
DOI : 10.1109/tasl.2009.2031510

N. Q. K. Duong, E. Vincent, and R. Gribonval, Underdetermined reverberant audio source separation using a full-rank spatial covariance model, IEEE Trans. Audio, Speech, Language Process, vol.18, issue.7, pp.1830-1840, 2010.

S. Arberet, A. Ozerov, N. Q. K. Duong, E. Vincent, R. Gribonval et al., Nonnegative matrix factorization and spatial covariance model for underdetermined reverberant audio source separation, Proc. IEEE Int. Conf. Information Sciences, Signal Process. and Applications (ISSPA), pp.1-4, 2010.
DOI : 10.1109/isspa.2010.5605570

URL : https://hal.archives-ouvertes.fr/inria-00541436

A. Ozerov, E. Vincent, and F. Bimbot, A general flexible framework for the handling of prior information in audio source separation, IEEE Trans. Audio, Speech, Language Process, vol.20, issue.4, pp.1118-1133, 2012.
URL : https://hal.archives-ouvertes.fr/inria-00536917

H. Sawada, H. Kameoka, S. Araki, and N. Ueda, Multichannel extensions of non-negative matrix factorization with complex-valued data, IEEE Trans. Audio, Speech, Language Process, vol.21, issue.5, pp.971-982, 2013.

D. Kitamura, N. Ono, H. Sawada, H. Kameoka, and H. Saruwatari, Determined blind source separation unifying independent vector analysis and nonnegative matrix factorization, IEEE Trans. Audio, Speech, Language Process, vol.24, issue.9, pp.1626-1641, 2016.
DOI : 10.1109/taslp.2016.2577880

URL : https://doi.org/10.1109/taslp.2016.2577880

A. A. Nugraha, A. Liutkus, and E. Vincent, Multichannel audio source separation with deep neural networks, IEEE Trans. Audio, Speech, Language Process, vol.24, issue.9, pp.1652-1664, 2016.
DOI : 10.1109/taslp.2016.2580946

URL : https://hal.archives-ouvertes.fr/hal-01163369

S. Leglaive, R. Badeau, and G. Richard, Multichannel audio source separation with probabilistic reverberation priors, IEEE Trans. Audio, Speech, Language Process, vol.24, issue.12, pp.2453-2465, 2016.
DOI : 10.1109/taslp.2016.2614140

URL : https://hal.archives-ouvertes.fr/hal-01370051

D. Wang and J. Chen, Supervised speech separation based on deep learning: An overview, IEEE Trans. Audio, Speech, Language Process, vol.26, issue.10, pp.1702-1726, 2018.
DOI : 10.1109/taslp.2018.2842159

URL : http://arxiv.org/pdf/1708.07524

D. P. Kingma and M. Welling, Auto-encoding variational Bayes, Proc. Int. Conf. Learning Representations (ICLR), 2014.

Y. Bando, M. Mimura, K. Itoyama, K. Yoshii, and T. Kawahara, Statistical speech enhancement based on probabilistic integration of variational autoencoder and nonnegative matrix factorization, Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), pp.716-720, 2018.
DOI : 10.1109/icassp.2018.8461530

S. Leglaive, L. Girin, and R. Horaud, A variance modeling framework based on variational autoencoders for speech enhancement, Proc. IEEE Int. Workshop Machine Learning Signal Process. (MLSP), 2018.
DOI : 10.1109/mlsp.2018.8516711

URL : https://hal.archives-ouvertes.fr/hal-01832826

S. Leglaive, U. Şimşekli, A. Liutkus, L. Girin, and R. Horaud, Speech enhancement with variational autoencoders and alpha-stable distributions, Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), 2019.
DOI : 10.1109/icassp.2019.8682546

URL : https://hal.archives-ouvertes.fr/hal-02005106

C. Févotte, N. Bertin, and J. Durrieu, Nonnegative matrix factorization with the Itakura-Saito divergence: With application to music analysis, Neural Computation, vol.21, issue.3, pp.793-830, 2009.

G. C. G. Wei and M. A. Tanner, A Monte Carlo implementation of the EM algorithm and the poor man's data augmentation algorithms, Journal of the American Statistical Association, vol.85, issue.411, pp.699-704, 1990.

K. Sekiguchi, Y. Bando, K. Yoshii, and T. Kawahara, Bayesian multichannel speech enhancement with a deep speech prior, Proc. Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), pp.1233-1239, 2018.
DOI : 10.23919/apsipa.2018.8659591

F. D. Neeser and J. L. Massey, Proper complex random processes with applications to information theory, IEEE Trans. Information Theory, vol.39, issue.4, pp.1293-1302, 1993.
DOI : 10.1109/18.243446

URL : http://www.isiweb.ee.ethz.ch/papers/arch/fneese-mass-inspec-1993-1.pdf

A. Liutkus, R. Badeau, and G. Richard, Gaussian processes for underdetermined source separation, IEEE Trans. Signal Process, vol.59, issue.7, pp.3155-3167, 2011.
DOI : 10.1109/tsp.2011.2119315

URL : https://hal.archives-ouvertes.fr/hal-00643951

C. P. Robert and G. Casella, Monte Carlo Statistical Methods, 2005.

K. S. Chan and J. Ledolter, Monte Carlo EM estimation for time series models involving counts, Journal of the American Statistical Association, vol.90, issue.429, pp.242-252, 1995.
DOI : 10.2307/2291149

J. S. Garofolo, L. F. Lamel, W. M. Fisher, J. G. Fiscus, D. S. Pallett et al., TIMIT acoustic phonetic continuous speech corpus, 1993.

J. Thiemann, N. Ito, and E. Vincent, The Diverse Environments Multi-channel Acoustic Noise Database (DEMAND): A database of multichannel environmental noise recordings, Proc. Int. Cong. on Acoust, 2013.

V. Nair and G. E. Hinton, Rectified linear units improve restricted Boltzmann machines, Proc. Int. Conf. Machine Learning (ICML), pp.807-814, 2010.

S. Ioffe and C. Szegedy, Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, Proc. Int. Conf. Machine Learning (ICML), pp.448-456, 2015.

D. P. Kingma and J. Ba, Adam: A method for stochastic optimization, Proc. Int. Conf. Learning Representations (ICLR), 2015.

X. Glorot and Y. Bengio, Understanding the difficulty of training deep feedforward neural networks, Proc. Int. Conf. Artif. Intelligence and Stat, pp.249-256, 2010.

E. Vincent, R. Gribonval, and C. Févotte, Performance measurement in blind audio source separation, IEEE Trans. Audio, Speech, Language Process, vol.14, issue.4, pp.1462-1469, 2006.
DOI : 10.1109/tsa.2005.858005

URL : https://hal.archives-ouvertes.fr/inria-00544230

A. W. Rix, J. G. Beerends, M. P. Hollier, and A. P. Hekstra, Perceptual evaluation of speech quality (PESQ)-a new method for speech quality assessment of telephone networks and codecs, Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), pp.749-752, 2001.

C. H. Taal, R. C. Hendriks, R. Heusdens, and J. Jensen, An algorithm for intelligibility prediction of time-frequency weighted noisy speech, IEEE Trans. Audio, Speech, Language Process, vol.19, issue.7, pp.2125-2136, 2011.

S. Boyd and L. Vandenberghe, Convex optimization, Cambridge University Press, 2004.

F. Hansen and G. K. Pedersen, Jensen's operator inequality, Bulletin of the London Mathematical Society, vol.35, issue.4, pp.553-564, 2003.