Supervised speech separation based on deep learning: An overview, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol.26, pp.1702-1726, 2018.
Exploring monaural features for classification-based speech segregation, IEEE Transactions on Audio, Speech, and Language Processing, vol.21, issue.2, pp.270-279, 2013.
Binaural classification for reverberant speech segregation using deep neural networks, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol.22, issue.12, pp.2112-2121, 2014.
Gated residual networks with dilated convolutions for monaural speech enhancement, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol.27, issue.1, pp.189-198, 2019.
Single-channel speech separation with memory-enhanced recurrent neural networks, IEEE International Conference on Acoustics, Speech and Signal Processing, pp.3709-3713, 2014.
Long short-term memory for speaker generalization in supervised speech separation, The Journal of the Acoustical Society of America, vol.141, issue.6, pp.4705-4714, 2017.
Neural network based spectral mask estimation for acoustic beamforming, IEEE International Conference on Acoustics, Speech and Signal Processing, pp.196-200, 2016.
Exploring practical aspects of neural mask-based beamforming for far-field speech recognition, IEEE International Conference on Acoustics, Speech and Signal Processing, pp.6697-6701, 2018.
Deep learning based binaural speech separation in reverberant environments, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol.25, issue.5, pp.1075-1084, 2017.
Combining spectral and spatial features for deep learning based blind speaker separation, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol.27, pp.457-468, 2019.
Low-latency speaker-independent continuous speech separation, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.6980-6984, 2019.
Distant speech separation using predicted time-frequency masks from spatial features, Speech Communication, vol.68, pp.97-106, 2015.
Time-frequency masking based online multi-channel speech enhancement with convolutional recurrent neural networks, IEEE Journal of Selected Topics in Signal Processing, vol.13, issue.4, pp.787-799, 2019.
Deep beamforming networks for multi-channel speech recognition, IEEE International Conference on Acoustics, Speech and Signal Processing, pp.5745-5749, 2016.
Multichannel signal processing with deep neural networks for automatic speech recognition, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol.25, issue.5, pp.965-979, 2017.
Non-stationary noise power spectral density estimation based on regional statistics, IEEE International Conference on Acoustics, Speech and Signal Processing, pp.181-185, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01250892
Unbiased MMSE-based noise power estimation with low complexity and low tracking delay, IEEE Transactions on Audio, Speech, and Language Processing, vol.20, issue.4, pp.1383-1393, 2012.
Signal enhancement using beamforming and nonstationarity with applications to speech, IEEE Transactions on Signal Processing, vol.49, issue.8, pp.1614-1626, 2001.
Estimation of relative transfer function in the presence of stationary noise based on segmental power spectral density matrix subtraction, IEEE International Conference on Acoustics, Speech and Signal Processing, pp.320-324, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01119186
Audio-noise power spectral density estimation using long short-term memory, IEEE Signal Processing Letters, 2019.
URL : https://hal.archives-ouvertes.fr/hal-02100059
Coherent-to-diffuse power ratio estimation for dereverberation, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol.23, pp.1006-1018, 2015.
Speech enhancement using a minimum-mean square error short-time spectral amplitude estimator, IEEE Transactions on Acoustics, Speech and Signal Processing, vol.32, issue.6, pp.1109-1121, 1984.
Speech enhancement for non-stationary noise environments, Signal Processing, vol.81, issue.11, pp.2403-2418, 2001.
Microphone arrays: signal processing techniques and applications, 2013.
Multichannel speech enhancement based on time-frequency masking using subband long short-term memory, IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2019.
URL : https://hal.archives-ouvertes.fr/hal-02264247
On training targets for supervised speech separation, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol.22, pp.1849-1858, 2014.
Complex ratio masking for monaural speech separation, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol.24, issue.3, pp.483-492, 2016.
The third 'CHiME' speech separation and recognition challenge: Dataset, task and baselines, IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), pp.504-511, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01211376
Keras, 2015.
Adam: A method for stochastic optimization, International Conference on Learning Representations, 2015.
Perceptual evaluation of speech quality (PESQ)-a new method for speech quality assessment of telephone networks and codecs, IEEE International Conference on Acoustics, Speech, and Signal Processing, vol.2, pp.749-752, 2001.
An algorithm for intelligibility prediction of time-frequency weighted noisy speech, IEEE Transactions on Audio, Speech, and Language Processing, vol.19, issue.7, pp.2125-2136, 2011.
An improved non-intrusive intelligibility metric for noisy and reverberant speech, International Workshop on Acoustic Signal Enhancement (IWAENC), pp.55-59, 2014.
The MERL/SRI system for the 3rd CHiME challenge using beamforming, robust feature extraction, and advanced speech recognition, IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), pp.475-481, 2015.
Acoustic beamforming for speaker diarization of meetings, IEEE Transactions on Audio, Speech, and Language Processing, vol.15, issue.7, pp.2011-2022, 2007.