J. M. Baker, L. Deng, J. Glass, S. Khudanpur, C. Lee et al., Developments and directions in speech recognition and understanding, Part 1 [DSP Education], IEEE Signal Processing Magazine, vol.26, issue.3, pp.75-80, 2009.
DOI : 10.1109/MSP.2009.932166

J. S. Downie, D. Byrd, and T. Crawford, Ten years of ISMIR: Reflections on challenges and opportunities, Proceedings of International Symposium on Music Information Retrieval (ISMIR), pp.13-18, 2009.

L. Deng, Front-end, back-end, and hybrid techniques for noise-robust speech recognition, in Robust Speech Recognition of Uncertain or Missing Data - Theory and Applications, pp.67-99, 2011.

J. J. Bosch, J. Janer, F. Fuhrmann, and P. Herrera, A comparison of sound segregation techniques for predominant instrument recognition in musical audio signals, Proceedings of 13th International Society for Music Information Retrieval Conference (ISMIR), pp.559-564, 2012.

J. Zapata and E. Gómez, Improving beat tracking in the presence of highly predominant vocals using source separation techniques: Preliminary study, Proceedings of 9th International Symposium on Computer Music Modeling and Retrieval (CMMR), pp.583-590, 2012.

E. Vincent, N. Bertin, R. Gribonval, and F. Bimbot, From Blind to Guided Audio Source Separation: How models and side information can improve the separation of sound, IEEE Signal Processing Magazine, vol.31, issue.3, pp.107-115, 2014.
DOI : 10.1109/MSP.2013.2297440

URL : https://hal.archives-ouvertes.fr/hal-00922378

T. Virtanen, J. F. Gemmeke, B. Raj, and P. Smaragdis, Compositional Models for Audio Processing: Uncovering the structure of sound mixtures, IEEE Signal Processing Magazine, vol.32, issue.2, pp.125-144, 2015.
DOI : 10.1109/MSP.2013.2288990

U. Şimşekli, T. Virtanen, and A. T. Cemgil, Non-negative tensor factorization models for Bayesian audio processing, Digital Signal Processing, vol.47, pp.178-191, 2015.

A. Ozerov, E. Vincent, and F. Bimbot, A General Flexible Framework for the Handling of Prior Information in Audio Source Separation, IEEE Transactions on Audio, Speech, and Language Processing, vol.20, issue.4, pp.1118-1133, 2012.
DOI : 10.1109/TASL.2011.2172425

URL : https://hal.archives-ouvertes.fr/inria-00536917

N. Q. Duong, E. Vincent, and R. Gribonval, Under-Determined Reverberant Audio Source Separation Using a Full-Rank Spatial Covariance Model, IEEE Transactions on Audio, Speech, and Language Processing, vol.18, issue.7, pp.1830-1840, 2010.
DOI : 10.1109/TASL.2010.2050716

URL : https://hal.archives-ouvertes.fr/inria-00435807

D. Fitzgerald, M. Cranitch, and E. Coyle, Extended Nonnegative Tensor Factorisation Models for Musical Sound Source Separation, Computational Intelligence and Neuroscience, vol.2008, p.872425, 2008.
DOI : 10.1155/2008/872425

J. Durrieu, G. Richard, B. David, and C. Févotte, Source/Filter Model for Unsupervised Main Melody Extraction From Polyphonic Audio Signals, IEEE Transactions on Audio, Speech, and Language Processing, vol.18, issue.3, pp.564-575, 2010.
DOI : 10.1109/TASL.2010.2041114

Y. Salaün, E. Vincent, N. Bertin, N. Souviraà-Labastie, X. Jaureguiberry et al., The flexible audio source separation toolbox version 2.0, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) Show & Tell, 2014.

M. Fakhry, P. Svaizer, and M. Omologo, Audio source separation using a redundant library of source spectral bases for non-negative tensor factorization, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.2015-251
DOI : 10.1109/ICASSP.2015.7177970

L. Le Magoarou, A. Ozerov, and N. Q. Duong, Text-informed audio source separation. Example-based approach using non-negative matrix partial co-factorization, Journal of Signal Processing Systems, vol.79, issue.2, pp.117-131, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01010602

N. Souviraà-Labastie, A. Olivero, E. Vincent, and F. Bimbot, Multi-Channel Audio Source Separation Using Multiple Deformed References, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol.23, issue.11, pp.1775-1787, 2015.
DOI : 10.1109/TASLP.2015.2450494

D. T. Tran, E. Vincent, D. Jouvet, and K. Adiloğlu, Using full-rank spatial covariance models for noise-robust ASR, Proceedings of 2nd International Workshop on Machine Listening in Multisource Environments (CHiME), pp.31-32, 2013.
URL : https://hal.archives-ouvertes.fr/hal-00801162

B. Lehner and G. Widmer, Monaural blind source separation in the context of vocal detection, Proceedings of International Symposium on Music Information Retrieval (ISMIR), pp.309-315, 2015.

A. Ozerov, Ç. Bilen, and P. Pérez, Multichannel audio declipping, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.31-32, 2016.
DOI : 10.1109/ICASSP.2016.7471757

URL : https://hal.archives-ouvertes.fr/hal-01254950

H. Attias, A variational Bayesian framework for graphical models, Advances in Neural Information Processing Systems (NIPS), 1999.

C. Févotte and S. J. Godsill, A Bayesian Approach for Blind Separation of Sparse Sources, IEEE Transactions on Audio, Speech and Language Processing, vol.14, issue.6, pp.2174-2188, 2006.
DOI : 10.1109/TSA.2005.858523

R. Chen and Y. N. Wu, A null space method for over-complete blind source separation, Computational Statistics & Data Analysis, vol.51, issue.12, pp.5519-5536, 2007.
DOI : 10.1016/j.csda.2007.03.009

C. Févotte, B. Torrésani, L. Daudet, and S. J. Godsill, Sparse Linear Regression With Structured Priors and Application to Denoising of Musical Audio, IEEE Transactions on Audio, Speech, and Language Processing, vol.16, issue.1, pp.174-185, 2008.
DOI : 10.1109/TASL.2007.909290

K. Adiloğlu and E. Vincent, An uncertainty estimation approach for the extraction of source features in multisource recordings, Proceedings of 19th European Signal Processing Conference (EUSIPCO), pp.1663-1667, 2011.

M. Kim, P. Smaragdis, G. G. Ko, and R. A. Rutenbar, Stereophonic spectrogram segmentation using Markov random fields, 2012 IEEE International Workshop on Machine Learning for Signal Processing, 2012.
DOI : 10.1109/MLSP.2012.6349754

J. Chien and H. Hsieh, Bayesian group sparse learning for music source separation, EURASIP Journal on Audio, Speech, and Music Processing, vol.2013, issue.1, p.18, 2013.
DOI : 10.1186/1687-4722-2013-18

A. T. Cemgil, C. Févotte, and S. J. Godsill, Variational and stochastic inference for Bayesian source separation, Digital Signal Processing, vol.17, issue.5, pp.891-913, 2007.
DOI : 10.1016/j.dsp.2007.03.008

S. J. Rennie, J. R. Hershey, and P. A. Olsen, Single-Channel Multitalker Speech Recognition, IEEE Signal Processing Magazine, vol.27, issue.6, pp.66-80, 2010.
DOI : 10.1109/MSP.2010.938081

M. D. Hoffman, D. M. Blei, and P. R. Cook, Bayesian nonparametric matrix factorization for recorded music, Proceedings of the International Conference on Machine Learning (ICML), 2010.

G. Mysore and M. Sahani, Variational inference in non-negative factorial hidden Markov models for efficient audio source separation, Proceedings of 29th International Conference on Machine Learning (ICML), pp.1887-1894, 2012.

T. Otsuka, K. Ishiguro, H. Sawada, and H. G. Okuno, Bayesian unification of sound source localization and separation with permutation resolution, Proceedings of 26th AAAI Conference on Artificial Intelligence, pp.2038-2045, 2012.

J. Chien and H. Hsieh, Nonstationary Source Separation Using Sequential and Variational Bayesian Learning, IEEE Transactions on Neural Networks and Learning Systems, vol.24, issue.5, pp.681-694, 2013.
DOI : 10.1109/TNNLS.2013.2242090

J. Chien and P. Yang, Bayesian Factorization and Learning for Monaural Source Separation, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol.24, issue.1, pp.185-195, 2016.
DOI : 10.1109/TASLP.2015.2502141

L. Deng, J. Droppo, and A. Acero, Dynamic compensation of HMM variances using the feature enhancement uncertainty computed from a parametric model of speech distortion, IEEE Transactions on Speech and Audio Processing, vol.13, issue.3, pp.412-421, 2005.
DOI : 10.1109/TSA.2005.845814

M. Delcroix, T. Nakatani, and S. Watanabe, Static and Dynamic Variance Compensation for Recognition of Reverberant Speech With Dereverberation Preprocessing, IEEE Transactions on Audio, Speech, and Language Processing, vol.17, issue.2, pp.324-334, 2009.
DOI : 10.1109/TASL.2008.2010214

D. Kolossa, R. F. Astudillo, E. Hoffmann, and R. Orglmeister, Independent component analysis and time-frequency masking for speech recognition in multitalker conditions, EURASIP Journal on Audio, Speech, and Music Processing, vol.2010, 2010.

R. F. Astudillo, Integration of short-time Fourier domain speech enhancement and observation uncertainty techniques for robust automatic speech recognition, Ph.D. thesis, Technische Universität Berlin, 2010.

F. Nesta, M. Matassoni, and R. F. Astudillo, A flexible spatial blind source extraction framework for robust speech recognition in noisy environments, Proceedings of the 2nd International Workshop on Machine Listening in Multisource Environments (CHiME), pp.33-40, 2013.

D. T. Tran, E. Vincent, and D. Jouvet, Nonparametric Uncertainty Estimation and Propagation for Noise Robust ASR, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol.23, issue.11, pp.1835-1846, 2015.
DOI : 10.1109/TASLP.2015.2450497

URL : https://hal.archives-ouvertes.fr/hal-01114329

A. H. Abdelaziz, S. Watanabe, J. R. Hershey, E. Vincent, and D. Kolossa, Uncertainty propagation through deep neural networks, Proceedings of Interspeech, pp.3561-3565, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01162550

A. Ozerov, M. Lagrange, and E. Vincent, Uncertainty-based learning of acoustic models from noisy data, Computer Speech & Language, vol.27, issue.3, pp.874-894, 2013.
DOI : 10.1016/j.csl.2012.07.002

URL : https://hal.archives-ouvertes.fr/hal-00717992

C. Yu, G. Liu, S. Hahm, and J. H. Hansen, Uncertainty propagation in front end factor analysis for noise robust speaker recognition, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.2014-4017
DOI : 10.1109/ICASSP.2014.6854356

M. Lagrange, A. Ozerov, and E. Vincent, Robust singer identification in polyphonic music using melody enhancement and uncertainty-based learning, Proceedings of 13th International Society for Music Information Retrieval Conference (ISMIR), pp.595-600, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00709826

M. J. Gales, Model-based techniques for noise robust speech recognition, 1995.

P. J. Moreno, B. Raj, and R. M. Stern, A vector Taylor series approach for environment-independent speech recognition, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings, pp.733-736, 1996.
DOI : 10.1109/ICASSP.1996.543225

R. F. Astudillo and R. Orglmeister, A MMSE estimator in Mel-cepstral domain for robust large vocabulary automatic speech recognition using uncertainty propagation, Proceedings of Interspeech, pp.713-716, 2010.

K. Adiloğlu and E. Vincent, A general variational Bayesian framework for robust feature extraction in multisource recordings, Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing, pp.2012-273

C. M. Bishop, Pattern Recognition and Machine Learning, 2006.

C. Blandin, A. Ozerov, and E. Vincent, Multi-source TDOA estimation in reverberant audio using angular spectra and clustering, Signal Processing, vol.92, issue.8, pp.1950-1960, 2012.
DOI : 10.1016/j.sigpro.2011.09.032

URL : https://hal.archives-ouvertes.fr/inria-00576297

E. Vincent, S. Araki, F. J. Theis, G. Nolte, P. Bofill et al., The signal separation evaluation campaign (2007-2010): Achievements and remaining challenges, Signal Processing, vol.92, issue.8, pp.1928-1936, 2012.
URL : https://hal.archives-ouvertes.fr/inria-00579398

E. Vincent, R. Gribonval, and C. Févotte, Performance measurement in blind audio source separation, IEEE Transactions on Audio, Speech and Language Processing, vol.14, issue.4, pp.1462-1469, 2006.
DOI : 10.1109/TSA.2005.858005

URL : https://hal.archives-ouvertes.fr/inria-00544230

E. Vincent, J. Barker, S. Watanabe, J. Le Roux, F. Nesta et al., The second 'CHiME' speech separation and recognition challenge: Datasets, tasks and baselines, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, pp.2013-162
DOI : 10.1109/ICASSP.2013.6637622

A. Hurmalainen, J. F. Gemmeke, and T. Virtanen, Modelling non-stationary noise with spectral factorisation in automatic speech recognition, Computer Speech & Language, vol.27, issue.3, pp.763-779, 2013.
DOI : 10.1016/j.csl.2012.07.008

N. Moritz, M. R. Schädler, K. Adiloğlu, B. T. Meyer, T. Jürgens et al., Noise robust distant automatic speech recognition utilizing NMF based source separation and auditory feature extraction, Proceedings of 2nd CHiME challenge workshop, 2013.

D. T. Tran, E. Vincent, and D. Jouvet, Extension of uncertainty propagation to dynamic MFCCs for noise robust ASR, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.2014-2045
DOI : 10.1109/ICASSP.2014.6854656

URL : https://hal.archives-ouvertes.fr/hal-00954654

L. K. Saul and M. I. Jordan, Exploiting tractable substructures in intractable networks, Advances in Neural Information Processing Systems (NIPS), pp.486-492, 1995.

Y. Wang, A. Narayanan, and D. Wang, On Training Targets for Supervised Speech Separation, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol.22, issue.12, pp.1849-1858, 2014.
DOI : 10.1109/TASLP.2014.2352935

Y. Xu, J. Du, L. Dai, and C. Lee, An Experimental Study on Speech Enhancement Based on Deep Neural Networks, IEEE Signal Processing Letters, vol.21, issue.1, pp.65-68, 2014.
DOI : 10.1109/LSP.2013.2291240

F. Weninger, H. Erdogan, S. Watanabe, E. Vincent, J. Le Roux et al., Speech Enhancement with LSTM Recurrent Neural Networks and its Application to Noise-Robust ASR, Proceedings of 12th Int. Conf. on Latent Variable Analysis and Signal Separation, pp.91-99, 2015.
DOI : 10.1007/978-3-319-22482-4_11

URL : https://hal.archives-ouvertes.fr/hal-01163493

S. Sivasankaran, A. A. Nugraha, E. Vincent, J. A. Morales-Cordovilla, S. Dalmia et al., Robust ASR using neural network based speech enhancement and feature simulation, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), pp.482-489, 2015.
DOI : 10.1109/ASRU.2015.7404834

URL : https://hal.archives-ouvertes.fr/hal-01204553

B. Jorgensen, Statistical Properties of the Generalized Inverse-Gaussian Distribution, 1982.
DOI : 10.1007/978-1-4612-5698-4