Acero et al., HMM adaptation using vector Taylor series for noisy speech recognition, Proc. ICSLP, pp.869-872, 2000.

A. Acero and R. M. Stern, Environmental robustness in automatic speech recognition, International Conference on Acoustics, Speech, and Signal Processing, pp.849-852, 1990.
DOI : 10.1109/ICASSP.1990.115971

Arberet et al., Nonnegative matrix factorization and spatial covariance model for under-determined reverberant audio source separation, 10th International Conference on Information Science, Signal Processing and their Applications (ISSPA 2010), pp.1-4, 2010.
DOI : 10.1109/ISSPA.2010.5605570
URL : https://hal.archives-ouvertes.fr/inria-00541436

R. F. Astudillo, Integration of Short-Time Fourier Domain Speech Enhancement and Observation Uncertainty Techniques for Robust Automatic Speech Recognition, PhD thesis, TU Berlin, 2010.

R. F. Astudillo and D. Kolossa, Uncertainty Propagation, Robust Speech Recognition of Uncertain or Missing Data - Theory and Applications, 2011.
DOI : 10.1007/978-3-642-21317-5_3

R. F. Astudillo, An Extension of STFT Uncertainty Propagation for GMM-Based Super-Gaussian a Priori Models, IEEE Signal Processing Letters, vol.20, issue.12, pp.1163-1166, 2013.
DOI : 10.1109/LSP.2013.2283493

Astudillo et al., A multichannel feature compensation approach for robust ASR in noisy and reverberant environments, Workshop REVERB, 2014.

Bourlard et al., CDNN: a context dependent neural network for continuous speech recognition, [Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing, pp.349-352, 1992.
DOI : 10.1109/ICASSP.1992.226048

H. Bourlard and C. Wellekens, Links between Markov models and multilayer perceptrons, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.12, issue.12, pp.1167-1178, 1990.
DOI : 10.1109/34.62605

Brutti et al., WOZ acoustic data collection for interactive TV, Proc. LREC, 2008.
DOI : 10.1007/s10579-010-9116-x

W. Buntine and A. Weigend, Bayesian backpropagation, Complex Systems, vol.5, issue.6, pp.603-643, 1991.

J. F. Cardoso, Infomax and maximum likelihood for blind source separation, IEEE Signal Processing Letters, vol.4, issue.4, pp.112-114, 1997.
DOI : 10.1109/97.566704
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.56.3619

I. Cohen, Noise spectrum estimation in adverse environments: improved minima controlled recursive averaging, IEEE Transactions on Speech and Audio Processing, vol.11, issue.5, pp.466-475, 2003.
DOI : 10.1109/TSA.2003.811544

P. Comon, Independent component analysis, a new concept?, Signal Processing, vol.36, issue.3, pp.287-314, 1994.

Cooke et al., An audiovisual corpus for speech perception and automatic speech recognition, The Journal of the Acoustical Society of America, vol.120, issue.5, pp.2421-2424, 2006.

Cooke et al., Robust automatic speech recognition with missing and unreliable acoustic data, Speech Communication, vol.34, issue.3, pp.267-285, 2001.
DOI : 10.1016/S0167-6393(00)00034-0

Cristoforetti et al., The DIRHA simulated corpus, Proc. LREC, 2014.

Dahl et al., Context-Dependent Pre-Trained Deep Neural Networks for Large-Vocabulary Speech Recognition, IEEE Transactions on Audio, Speech, and Language Processing, vol.20, issue.1, pp.30-42, 2012.
DOI : 10.1109/TASL.2011.2134090

S. B. Davis and P. Mermelstein, Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences, IEEE Transactions on Acoustics, Speech, and Signal Processing, vol.28, issue.4, pp.357-366, 1980.

J. de Leeuw, Block-relaxation Algorithms in Statistics, Information Systems and Data Analysis, pp.308-325, 1994.
DOI : 10.1007/978-3-642-46808-7_28

J. de Leeuw and K. Lange, Sharp quadratic majorization in one dimension, Computational Statistics & Data Analysis, vol.53, issue.7, pp.2471-2484, 2009.
DOI : 10.1016/j.csda.2009.01.002

Delcroix et al., Speech recognition in living rooms: Integrated speech enhancement and recognition system based on spatial, spectral and temporal modeling of sounds, Computer Speech & Language, vol.27, issue.3, pp.851-873, 2013.
DOI : 10.1016/j.csl.2012.07.006

Delcroix et al., Static and Dynamic Variance Compensation for Recognition of Reverberant Speech With Dereverberation Preprocessing, IEEE Transactions on Audio, Speech, and Language Processing, vol.17, issue.2, pp.324-334, 2009.
DOI : 10.1109/TASL.2008.2010214

L. Deng, Front-End, Back-End, and Hybrid Techniques for Noise-Robust Speech Recognition, Robust Speech Recognition of Uncertain or Missing Data - Theory and Applications, pp.67-99, 2011.
DOI : 10.1007/978-3-642-21317-5_4

Deng et al., Large vocabulary speech recognition under adverse acoustic environments, Proc. ICSLP, pp.806-809, 2000.

Deng et al., Dynamic compensation of HMM variances using the feature enhancement uncertainty computed from a parametric model of speech distortion, IEEE Transactions on Speech and Audio Processing, vol.13, issue.3, pp.412-421, 2005.
DOI : 10.1109/TSA.2005.845814

S. Doclo and M. Moonen, GSVD-based optimal filtering for single and multimicrophone speech enhancement, IEEE Transactions on Signal Processing, vol.50, issue.9, pp.2230-2244, 2002.
DOI : 10.1109/TSP.2002.801937
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.6.9502

Droppo et al., Uncertainty decoding with SPLICE for noise robust speech recognition, Proc. ICASSP, pp.56-60, 2002.

Duchi et al., Adaptive subgradient methods for online learning and stochastic optimization, Journal of Machine Learning Research, vol.12, pp.2121-2159, 2011.

Duong et al., Under-Determined Reverberant Audio Source Separation Using a Full-Rank Spatial Covariance Model, IEEE Transactions on Audio, Speech, and Language Processing, vol.18, issue.7, pp.1830-1840, 2010.
DOI : 10.1109/TASL.2010.2050716
URL : https://hal.archives-ouvertes.fr/inria-00435807

Y. Ephraim and D. Malah, Speech enhancement using a minimum mean-square error short-time spectral amplitude estimator, IEEE Transactions on Acoustics, Speech, and Signal Processing, vol.32, issue.6, pp.1109-1121, 1984.

Y. Ephraim and D. Malah, Speech enhancement using a minimum mean-square error log-spectral amplitude estimator, IEEE Transactions on Acoustics, Speech, and Signal Processing, vol.33, issue.2, pp.443-445, 1985.
DOI : 10.1109/tassp.1985.1164550

J. G. Fiscus, A post-processing system to yield reduced word error rates: Recognizer Output Voting Error Reduction (ROVER), 1997 IEEE Workshop on Automatic Speech Recognition and Understanding Proceedings, pp.347-354, 1997.
DOI : 10.1109/ASRU.1997.659110

Flanagan et al., Spatially selective sound capture for speech and audio processing, Speech Communication, vol.13, issue.1-2, pp.207-222, 1993.
DOI : 10.1016/0167-6393(93)90072-S

Frey et al., ALGONQUIN: iterating Laplace's method to remove multiple types of acoustic distortion for robust speech recognition, Proc. Eurospeech, pp.901-904, 2001.

C. Févotte and J. F. Cardoso, Maximum likelihood approach for blind audio source separation using time-frequency Gaussian source models, IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, pp.78-81, 2005.
DOI : 10.1109/ASPAA.2005.1540173

Févotte et al., Non-negative dynamical system with application to speech and audio, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, pp.3158-3162, 2013.
DOI : 10.1109/ICASSP.2013.6638240

M. J. F. Gales and S. J. Young, Robust continuous speech recognition using parallel model combination, IEEE Transactions on Speech and Audio Processing, vol.4, issue.5, pp.352-359, 1996.

M. J. F. Gales and S. J. Young, The application of hidden Markov models in speech recognition, Foundations and Trends in Signal Processing, vol.1, issue.3, pp.195-304, 2008.

M. J. Gales, Maximum likelihood linear transformations for HMM-based speech recognition, Computer Speech & Language, vol.12, issue.2, pp.75-98, 1998.
DOI : 10.1006/csla.1998.0043

Gannot et al., Signal enhancement using beamforming and nonstationarity with applications to speech, IEEE Transactions on Signal Processing, vol.49, issue.8, pp.1614-1626, 2001.
DOI : 10.1109/78.934132

Garofalo et al., CSR-I (WSJ0) complete, Linguistic Data Consortium, 2007.

J.-L. Gauvain and C.-H. Lee, Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains, IEEE Transactions on Speech and Audio Processing, vol.2, issue.2, pp.291-298, 1994.
DOI : 10.1109/89.279278

Gemmeke et al., Exemplar-Based Sparse Representations for Noise Robust Automatic Speech Recognition, IEEE Transactions on Audio, Speech, and Language Processing, vol.19, issue.7, pp.2067-2080, 2011.
DOI : 10.1109/TASL.2011.2112350

Z. Ghahramani and M. I. Jordan, Supervised learning from incomplete data via an EM approach, Proc. NIPS, pp.120-127, 1994.

Goodfellow et al., Maxout networks, Proc. ICML, pp.1319-1327, 2013.

I. S. Gradshteyn and I. M. Ryzhik, Table of Integrals, Series and Products, 1995.

S. Greenberg and B. E. Kingsbury, The modulation spectrogram: in pursuit of an invariant representation of speech, 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing, pp.1647-1650, 1997.
DOI : 10.1109/ICASSP.1997.598826

R. Haeb-Umbach and H. Ney, Linear discriminant analysis for improved large vocabulary continuous speech recognition, [Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing, pp.13-16, 1992.
DOI : 10.1109/ICASSP.1992.225984

Hansen et al., CU-Move: Analysis corpus development for interactive in-vehicle speech systems, Proc. EUROSPEECH, pp.2023-2026, 2001.

W. J. Heiser, Convergent computing by iterative majorization: theory and applications in multidimensional data analysis, Recent Advances in Descriptive Multivariate Analysis, pp.157-189, 1995.

H. Hermansky, Perceptual linear predictive (PLP) analysis of speech, The Journal of the Acoustical Society of America, vol.87, issue.4, pp.1738-1752, 1990.
DOI : 10.1121/1.399423

H. Hermansky, Tandem connectionist feature extraction for conventional HMM systems, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100), pp.1635-1638, 2000.
DOI : 10.1109/ICASSP.2000.862024

Hinton et al., Deep neural networks for acoustic modeling in speech recognition, IEEE Signal Processing Magazine, vol.29, issue.6, pp.82-97, 2012.

Hinton et al., A Fast Learning Algorithm for Deep Belief Nets, Neural Computation, vol.18, issue.7, pp.1527-1554, 2006.
DOI : 10.1162/neco.2006.18.7.1527

S. Hochreiter and J. Schmidhuber, Long Short-Term Memory, Neural Computation, vol.9, issue.8, pp.1735-1780, 1997.
DOI : 10.1162/neco.1997.9.8.1735

Hurmalainen et al., Nonnegative matrix deconvolution in noise robust speech recognition, Proc. ICASSP, pp.4588-4591, 2011.

V. Ion and R. Haeb-Umbach, Uncertainty decoding for distributed speech recognition over error-prone networks, Speech Communication, vol.48, issue.11, pp.1435-1446, 2006.
DOI : 10.1016/j.specom.2006.03.007

S. Julier and J. Uhlmann, Unscented Filtering and Nonlinear Estimation, Proceedings of the IEEE, vol.92, issue.3, pp.401-422, 2004.
DOI : 10.1109/JPROC.2003.823141
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.136.6539

Kallasjoki et al., Estimating Uncertainty to Improve Exemplar-Based Feature Enhancement for Noise Robust Speech Recognition, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol.22, issue.2, pp.368-380, 2014.
DOI : 10.1109/TASLP.2013.2292328

Kallasjoki et al., Mask estimation and sparse imputation for missing data speech recognition in multisource reverberant environments, Proc. CHiME, pp.58-63, 2011.

C. Kim and R. Stern, Power-normalized cepstral coefficients (PNCC) for robust speech recognition, Proc. Interspeech, pp.1231-1234, 2009.
DOI : 10.1109/icassp.2012.6288820
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.220.2217

Kolossa et al., CHiME challenge: approaches to robustness using beamforming and uncertainty-of-observation techniques, Proc. CHiME, pp.6-11, 2011.

D. Kolossa and R. Haeb-Umbach, Robust Speech Recognition of Uncertain or Missing Data, 2011.
DOI : 10.1007/978-3-642-21317-5

R. Kompass, A Generalized Divergence Measure for Nonnegative Matrix Factorization, Neural Computation, vol.19, issue.3, pp.780-791, 2007.
DOI : 10.1162/neco.2007.19.3.780

A. Krueger and R. Haeb-Umbach, Model based feature enhancement for automatic speech recognition in reverberant environments, Proc. ICASSP, pp.126-130, 2013.

Kumatani et al., Microphone Array Processing for Distant Speech Recognition: From Close-Talking Microphones to Far-Field Sensors, IEEE Signal Processing Magazine, vol.29, issue.6, pp.127-140, 2012.
DOI : 10.1109/MSP.2012.2205285

Lange et al., Optimization transfer using surrogate objective functions (with discussion), Journal of Computational and Graphical Statistics, vol.9, pp.1-20, 2000.

J. Le Roux and E. Vincent, A categorization of robust speech processing datasets, 2014.
URL : https://hal.archives-ouvertes.fr/hal-01063805

LeCun et al., Gradient-based learning applied to document recognition, Proceedings of the IEEE, vol.86, issue.11, pp.2278-2324, 1998.
DOI : 10.1109/5.726791
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.138.1115

D. D. Lee and H. S. Seung, Learning the parts of objects with nonnegative matrix factorization, Nature, vol.401, pp.788-791, 1999.

C. J. Leggetter and P. C. Woodland, Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models, Computer Speech & Language, vol.9, issue.2, pp.171-185, 1995.
DOI : 10.1006/csla.1995.0010

B. Li and K. C. Sim, An ideal hidden-activation mask for deep neural networks based noise-robust speech recognition, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.200-204, 2014.
DOI : 10.1109/ICASSP.2014.6853586

H. Liao, Uncertainty Decoding for Noise Robust Speech Recognition, 2007.
URL : https://hal.archives-ouvertes.fr/hal-00499200

J. Martens, Extracting and composing robust features with denoising autoencoders, Proc. ICML, 2010.

R. Martin, Statistical Methods for the Enhancement of Noisy Speech, Proc. IWAENC, pp.1-6, 2003.
DOI : 10.1007/3-540-27489-8_3

R. J. McAulay and M. L. Malpass, Speech enhancement using a soft-decision noise suppression filter, IEEE Transactions on Acoustics, Speech, and Signal Processing, vol.28, issue.2, pp.137-145, 1980.
DOI : 10.1109/TASSP.1980.1163394

McDermott et al., Discriminative training based on an integrated view of MPE and MMI in margin and error space, 2010 IEEE International Conference on Acoustics, Speech and Signal Processing, pp.4894-4897, 2010.
DOI : 10.1109/ICASSP.2010.5495106

Moreno et al., A vector Taylor series approach for environment-independent speech recognition, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings, pp.733-736, 1996.
DOI : 10.1109/ICASSP.1996.543225

N. Ono, Stable and fast update rules for independent vector analysis based on auxiliary function technique, 2011 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp.189-192, 2011.
DOI : 10.1109/ASPAA.2011.6082320

A. Ozerov and C. Févotte, Multichannel Nonnegative Matrix Factorization in Convolutive Mixtures for Audio Source Separation, IEEE Transactions on Audio, Speech, and Language Processing, vol.18, issue.3, pp.550-563, 2010.
DOI : 10.1109/TASL.2009.2031510

Ozerov et al., Uncertainty-based learning of acoustic models from noisy data, Computer Speech & Language, vol.27, issue.3, pp.874-894, 2013.
DOI : 10.1016/j.csl.2012.07.002
URL : https://hal.archives-ouvertes.fr/hal-00717992

Ozerov et al., A General Flexible Framework for the Handling of Prior Information in Audio Source Separation, IEEE Transactions on Audio, Speech, and Language Processing, vol.20, issue.4, pp.1118-1133, 2012.
DOI : 10.1109/TASL.2011.2172425
URL : https://hal.archives-ouvertes.fr/inria-00536917

D. Povey, Discriminative training for large vocabulary speech recognition, 2005.

Povey et al., Boosted MMI for model and feature-space discriminative training, 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, 2008.
DOI : 10.1109/ICASSP.2008.4518545

Povey et al., fMPE: Discriminatively Trained Features for Speech Recognition, Proceedings (ICASSP '05), IEEE International Conference on Acoustics, Speech, and Signal Processing, pp.961-964, 2005.
DOI : 10.1109/ICASSP.2005.1415275

D. Povey and P. C. Woodland, Minimum phone error and i-smoothing for improved discriminative training, Proc. ICASSP, pp.105-108, 2002.

L. R. Rabiner and B.-H. Juang, Fundamentals of Speech Recognition, 1993.

L. R. Rabiner and S. E. Levinson, Isolated and Connected Word Recognition - Theory and Selected Applications, IEEE Transactions on Communications, vol.29, issue.5, pp.621-659, 1981.
DOI : 10.1109/TCOM.1981.1095031
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.84.3715

L. R. Rabiner, A tutorial on hidden Markov models and selected applications in speech recognition, Proceedings of the IEEE, vol.77, issue.2, pp.257-286, 1989.

S. Renals and P. Swietojanski, Neural networks for distant speech recognition, 2014 4th Joint Workshop on Hands-free Speech Communication and Microphone Arrays (HSCMA), pp.172-176, 2014.
DOI : 10.1109/HSCMA.2014.6843274
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.589.6568

S. Rice, Mathematical Analysis of Random Noise, Bell System Technical Journal, vol.23, issue.3, pp.282-332, 1944.
DOI : 10.1002/j.1538-7305.1944.tb00874.x

F. Rosenblatt, The perceptron: A probabilistic model for information storage and organization in the brain., Psychological Review, vol.65, issue.6, pp.386-408, 1958.
DOI : 10.1037/h0042519

Rumelhart et al., Learning representations by back-propagating errors, Nature, vol.323, issue.6088, pp.533-536, 1986.
DOI : 10.1038/323533a0

Seide et al., Conversational speech transcription using context-dependent deep neural networks, Proc. Interspeech, pp.437-440, 2011.

M. L. Seltzer, Robustness is dead! Long live robustness! Keynote speech, 2014.

Seltzer et al., Likelihood-Maximizing Beamforming for Robust Hands-Free Speech Recognition, IEEE Transactions on Speech and Audio Processing, vol.12, issue.5, pp.489-498, 2004.
DOI : 10.1109/TSA.2004.832988

Seltzer et al., An investigation of noise robustness of deep neural networks, Proc. ICASSP, pp.7398-7402, 2013.

P. Smaragdis, Convolutive Speech Bases and Their Application to Supervised Speech Separation, IEEE Transactions on Audio, Speech and Language Processing, vol.15, issue.1, pp.1-14, 2007.
DOI : 10.1109/TASL.2006.876726

S. Srinivasan and D. Wang, Transforming Binary Uncertainties for Robust Speech Recognition, IEEE Transactions on Audio, Speech and Language Processing, vol.15, issue.7, pp.2130-2140, 2007.
DOI : 10.1109/TASL.2007.901836

Tran et al., Fast DNN training based on auxiliary function technique, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.2160-2164, 2015.
DOI : 10.1109/ICASSP.2015.7178353
URL : https://hal.archives-ouvertes.fr/hal-01107809

Tran et al., Extension of uncertainty propagation to dynamic MFCCs for noise robust ASR, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.5507-5511, 2014.
DOI : 10.1109/ICASSP.2014.6854656
URL : https://hal.archives-ouvertes.fr/hal-00954654

Tran et al., Fusion of multiple uncertainty estimators and propagators for noise robust ASR, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.5512-5516, 2014.
DOI : 10.1109/ICASSP.2014.6854657
URL : https://hal.archives-ouvertes.fr/hal-00955185

Tran et al., Discriminative uncertainty estimation for noise robust ASR, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.5038-5042, 2015.
DOI : 10.1109/ICASSP.2015.7178930
URL : https://hal.archives-ouvertes.fr/hal-01103969

Tran et al., Nonparametric Uncertainty Estimation and Propagation for Noise Robust ASR, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol.23, issue.11, pp.1835-1846, 2015.
DOI : 10.1109/TASLP.2015.2450497
URL : https://hal.archives-ouvertes.fr/hal-01114329

Tran et al., Using full-rank spatial covariance models for noise-robust ASR, Proc. CHiME, pp.31-32, 2013.
URL : https://hal.archives-ouvertes.fr/hal-00801162

Veselý et al., Sequence-discriminative training of deep neural networks, Proc. Interspeech, pp.2345-2349, 2013.

O. Viikki and K. Laurila, Cepstral domain segmental feature vector normalization for noise robust speech recognition, Speech Communication, vol.25, issue.1-3, pp.133-147, 1998.
DOI : 10.1016/S0167-6393(98)00033-8

Vincent et al., The second ‘CHiME’ speech separation and recognition challenge: An overview of challenge systems and outcomes, 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, pp.162-167, 2013.
DOI : 10.1109/ASRU.2013.6707723

Vincent et al., The second ‘CHiME’ speech separation and recognition challenge: Datasets, tasks and baselines, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, pp.126-130, 2013.
DOI : 10.1109/ICASSP.2013.6637622

A. J. Viterbi, Error bounds for convolutional codes and an asymptotically optimum decoding algorithm, IEEE Transactions on Information Theory, vol.13, issue.2, pp.260-269, 1967.
DOI : 10.1109/TIT.1967.1054010

Y. Wang and D. L. Wang, Towards Scaling Up Classification-Based Speech Separation, IEEE Transactions on Audio, Speech, and Language Processing, vol.21, issue.7, pp.1381-1390, 2013.
DOI : 10.1109/TASL.2013.2250961

Weng et al., Single-channel mixed speech recognition using deep neural networks, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.5632-5636, 2014.
DOI : 10.1109/ICASSP.2014.6854681

Wilson et al., Speech denoising using nonnegative matrix factorization with priors, Proc. ICASSP, pp.4029-4032, 2008.
DOI : 10.1109/icassp.2008.4518538
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.417.4333

M. Wölfel and J. McDonough, Distant Speech Recognition, 2009.

Zeiler et al., On rectified linear units for speech processing, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, pp.3517-3521, 2013.
DOI : 10.1109/ICASSP.2013.6638312