Enhancement and analysis of conversational speech: JSALT 2017, in ICASSP, pp.5154-5158, 2018.
Diarization is hard: Some experiences and lessons learned for the JHU team in the inaugural DIHARD challenge, in Interspeech, pp.2808-2812, 2018.
BUT system for DIHARD speech diarization challenge, in Interspeech, pp.2798-2802, 2018.
The second DIHARD diarization challenge: Dataset, task, and baselines, pp.978-982, 2019.
Robust Automatic Speech Recognition: A Bridge to Practical Applications. Elsevier, 2015.
New Era for Robust Speech Recognition: Exploiting Deep Learning, 2017.
Audio Source Separation and Speech Enhancement, 2018.
URL : https://hal.archives-ouvertes.fr/hal-01881431
Speech processing for digital home assistants: Combining signal processing with deep-learning techniques, IEEE Signal Processing Magazine, vol.36, issue.6, pp.111-124, 2019.
Overlapped speech detection for improved speaker diarization in multiparty meetings, ICASSP, pp.4353-4356, 2008.
Two's a crowd: Improving speaker diarization by automatically identifying and excluding overlapped speech, pp.32-35, 2008.
Efficient use of overlap information in speaker diarization, pp.683-686, 2007.
Efficient voice activity detection algorithms using long-term speech information, Speech Communication, vol.42, issue.3-4, pp.271-287, 2004.
CHiME-6 challenge: Tackling multispeaker speech recognition for unsegmented recordings, CHiME, 2020.
URL : https://hal.archives-ouvertes.fr/hal-02546993
Detecting overlapping speech with long short-term memory recurrent neural networks, pp.1668-1672, 2013.
Detecting overlapped speech on short timeframes using deep learning, pp.1198-1202, 2017.
Leveraging LSTM models for overlap detection in multi-party meetings, in ICASSP, pp.5249-5253, 2018.
The AMI meeting corpus, 5th International Conference on Methods and Techniques in Behavioral Research, pp.137-140, 2005.
Detection of overlapping speech for the purposes of speaker diarization, International Conference on Speech and Computer, pp.247-257, 2019.
Overlap-aware diarization: Resegmentation using neural end-to-end overlapped speech detection, ICASSP, pp.7114-7118, 2020.
Classification vs. regression in supervised learning for single channel speaker count estimation, in ICASSP, pp.436-440, 2018.
CountNet: Estimating the number of concurrent speakers using supervised learning, IEEE/ACM Transactions on Audio, Speech and Language Processing, vol.27, issue.2, pp.268-282, 2019.
Overlapped speech detection and competing speaker counting: humans versus deep learning, IEEE Journal of Selected Topics in Signal Processing, vol.13, issue.4, pp.850-862, 2019.
An empirical evaluation of generic convolutional and recurrent networks for sequence modeling, 2018.
Conv-TasNet: Surpassing ideal time-frequency magnitude masking for speech separation, IEEE/ACM Transactions on Audio, Speech and Language Processing, vol.27, pp.1256-1266, 2019.
MobileNets: Efficient convolutional neural networks for mobile vision applications, 2017.
Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification, 2015 IEEE International Conference on Computer Vision (ICCV), pp.1026-1034, 2015.
Layer normalization, 2016.
On the variance of the adaptive learning rate and beyond, 2019.
LibriSpeech: an ASR corpus based on public domain audio books, ICASSP, pp.5206-5210, 2015.
gpuRIR: A Python library for room impulse response simulation with GPU acceleration, 2018.
Montreal Forced Aligner: Trainable text-speech alignment using Kaldi, pp.498-502, 2017.
The Kaldi speech recognition toolkit, ASRU, 2011.
Microsoft COCO: Common objects in context, European Conference on Computer Vision (ECCV), pp.740-755, 2014.
Statistical comparisons of classifiers over multiple data sets, Journal of Machine Learning Research, vol.7, 2006.