Audiovisual probabilistic tracking of multiple speakers in meetings, IEEE Transactions on Audio, Speech and Language Processing, vol.15, issue.2, pp.601-616, 2007. ,
Structure inference for Bayesian multisensory scene understanding, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.30, issue.12, pp.2140-2157, 2008. ,
A multimodal approach to blind source separation of moving sources, IEEE Journal of Selected Topics in Signal Processing, vol.4, issue.5, pp.895-910, 2010. ,
Audio assisted robust visual tracking with adaptive particle filtering, IEEE Transactions on Multimedia, vol.17, issue.2, pp.186-200, 2015. ,
Information-driven active audio-visual source localization, PloS one, vol.10, issue.9, 2015. ,
Mean-shift and sparse sampling-based SMC-PHD filtering for audio informed visual speaker tracking, IEEE Transactions on Multimedia, vol.18, issue.12, pp.2417-2431, 2016. ,
An on-line variational Bayesian model for multi-person tracking from cluttered scenes, Computer Vision and Image Understanding, vol.153, pp.64-76, 2016. ,
URL : https://hal.archives-ouvertes.fr/hal-01349763
Confidence-based data association and discriminative deep appearance learning for robust online multi-object tracking, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.40, issue.3, pp.595-610, 2018. ,
Robust localization and tracking of simultaneous moving sound sources using beamforming and particle filtering, Robotics and Autonomous Systems, vol.55, issue.3, pp.216-228, 2007. ,
TDOA estimation for multiple sound sources in noisy and reverberant environments using broadband independent component analysis, IEEE Transactions on Audio, Speech, and Language Processing, vol.19, issue.6, pp.1490-1503, 2011. ,
A geometric approach to sound source localization from time-delay estimates, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol.22, pp.1082-1095, 2014. ,
URL : https://hal.archives-ouvertes.fr/hal-00910081
Tree-based recursive expectation-maximization algorithm for localization of acoustic sources, IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP), vol.23, issue.10, pp.1692-1703, 2015. ,
Estimation of the direct-path relative transfer function for supervised sound-source localization, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol.24, issue.11, pp.2171-2186, 2016. ,
URL : https://hal.archives-ouvertes.fr/hal-01349691
Multiple-speaker localization based on direct-path features and likelihood maximization with spatial sparsity regularization, IEEE/ACM Transactions on Audio, Speech and Language Processing, vol.25, issue.10, pp.1997-2012, 2017. ,
URL : https://hal.archives-ouvertes.fr/hal-01413417
Co-localization of audio sources in images using binaural features and locally-linear regression, IEEE/ACM Transactions on Audio, Speech and Language Processing, vol.23, issue.4, pp.718-731, 2015. ,
URL : https://hal.archives-ouvertes.fr/hal-01112834
Speech and audio signal processing: processing and perception of speech and music, 2011. ,
AV16.3: An audiovisual corpus for speaker localization and tracking, Machine Learning for Multimodal Interaction, pp.182-195, 2004. ,
Audio-visual speaker diarization based on spatiotemporal Bayesian fusion, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.40, issue.5, pp.1086-1099, 2018. ,
URL : https://hal.archives-ouvertes.fr/hal-01413403
A metric for performance evaluation of multi-target tracking algorithms, IEEE Transactions on Signal Processing, vol.59, issue.7, pp.3452-3457, 2011. ,
DiarTk: an open source toolkit for research in multistream speaker diarization and its application to meeting recordings, INTERSPEECH, pp.2170-2173, 2012. ,
Multiple person and speaker activity tracking with a particle filter, IEEE International Conference on Acoustics, Speech, and Signal Processing, pp.881-884, 2004. ,
Particle flow SMC-PHD filter for audio-visual multi-speaker tracking, International Conference on Latent Variable Analysis and Signal Separation, pp.344-353, 2017. ,
Non-zero diffusion particle flow SMC-PHD filter for audio-visual multi-speaker tracking, IEEE International Conference on Acoustics, Speech, and Signal Processing, pp.4304-4308, 2018. ,
3D audio-visual speaker tracking with an adaptive particle filter, IEEE International Conference on Acoustics, Speech and Signal Processing, pp.2896-2900, 2017. ,
EM algorithms for weighted-data clustering with application to audio-visual scene analysis, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.38, issue.12, pp.2402-2415, 2016. ,
URL : https://hal.archives-ouvertes.fr/hal-01261374
Online Localization and Tracking of Multiple Moving Speakers in Reverberant Environments, IEEE Journal of Selected Topics in Signal Processing, vol.13, issue.1, pp.88-103, 2019. ,
URL : https://hal.archives-ouvertes.fr/hal-01851985
Tracking Multiple Audio Sources with the Von Mises Distribution and Variational EM, IEEE Signal Processing Letters, vol.26, issue.6, pp.798-802, 2019. ,
URL : https://hal.archives-ouvertes.fr/hal-01969050
Exploiting the complementarity of audio and visual data in multi-speaker tracking, IEEE ICCV Workshop on Computer Vision for Audio-Visual Media, pp.446-454, 2017. ,
URL : https://hal.archives-ouvertes.fr/hal-01577965
Accounting for room acoustics in audio-visual multi-speaker tracking, IEEE International Conference on Acoustics, Speech and Signal Processing, pp.6553-6557, 2018. ,
URL : https://hal.archives-ouvertes.fr/hal-01718114
On a measure of divergence between two statistical populations defined by their probability distributions, Bull. Calcutta Math. Soc, vol.35, pp.99-109, 1943. ,
High-dimensional regression with Gaussian mixtures and partially-latent response variables, Statistics and Computing, vol.25, issue.5, pp.893-911, 2015. ,
URL : https://hal.archives-ouvertes.fr/hal-01107604
Pattern Recognition and Machine Learning, 2006. ,
The Variational Bayes Method in Signal Processing, 2006. ,
Speaker diarization: A review of recent research, IEEE Transactions on Audio, Speech, and Language Processing, vol.20, issue.2, pp.356-370, 2012. ,
Multimodal speaker diarization, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.34, issue.1, pp.79-93, 2012. ,
A sector-based, frequency-domain approach to detection and localization of multiple speakers, IEEE International Conference on Acoustics, Speech, and Signal Processing, vol.3, pp.265-268, 2005. ,
Realtime multi-person 2D pose estimation using part affinity fields, IEEE Conference on Computer Vision and Pattern Recognition, pp.7291-7299, 2017. ,
Person re-identification in the wild, IEEE Conference on Computer Vision and Pattern Recognition, pp.1367-1376, 2017. ,
MOT16: A benchmark for multi-object tracking, 2016. ,
Multimodal multi-channel on-line speaker diarization using sensor fusion through SVM, IEEE Transactions on Multimedia, vol.17, issue.10, pp.1694-1705, 2015. ,