T. Kinnunen and H. Li, An overview of text-independent speaker recognition: From features to supervectors, Speech Communication, vol.52, issue.1, pp.12-40, 2010.
URL : https://hal.archives-ouvertes.fr/hal-00587602

N. Dehak, P. J. Kenny, R. Dehak, P. Dumouchel, and P. Ouellet, Front-end factor analysis for speaker verification, IEEE Transactions on Audio, Speech, and Language Processing, vol.19, issue.4, pp.788-798, 2011.

E. Variani, Deep neural networks for small footprint text-dependent speaker verification, Proc. ICASSP, pp.4052-4056, 2014.

C. Li, X. Ma, B. Jiang, X. Li, X. Zhang et al., Deep speaker: an end-to-end neural speaker embedding system, CoRR, 2017.

D. Snyder, D. Garcia-Romero, G. Sell, D. Povey, and S. Khudanpur, X-vectors: Robust DNN embeddings for speaker recognition, Proc. ICASSP, pp.5329-5333, 2018.

D. Snyder, D. Garcia-Romero, G. Sell, A. McCree, D. Povey et al., Speaker recognition for multi-speaker conversations using x-vectors, Proc. ICASSP, pp.5796-5800, 2019.

L. You, W. Guo, L. R. Dai, and J. Du, Multi-Task learning with high-order statistics for X-vector based text-independent speaker verification, Proc. INTERSPEECH, pp.1158-1162, 2019.

Y. Li, F. Gao, Z. Ou, and J. Sun, Angular softmax loss for end-to-end speaker verification, 11th International Symposium on Chinese Spoken Language Processing, pp.190-194, 2018.

T. Ko, V. Peddinti, D. Povey, M. L. Seltzer, and S. Khudanpur, A study on data augmentation of reverberant speech for robust speech recognition, Proc. ICASSP, pp.5220-5224, 2017.

D. Snyder, G. Chen, and D. Povey, MUSAN: A Music, Speech, and Noise Corpus, 2015.

A. Nagrani, J. S. Chung, and A. Zisserman, VoxCeleb: A large-scale speaker identification dataset, Proc. INTERSPEECH, pp.2616-2620, 2017.

M. Ravanelli and Y. Bengio, Speaker recognition from raw waveform with SincNet, Proc. SLT, pp.1021-1028, 2018.

T. Kinnunen, Low-variance multitaper MFCC features: A case study in robust speaker verification, IEEE Transactions on Audio, Speech, and Language Processing, vol.20, issue.7, pp.1990-2001, 2012.

P. Rajan, T. Kinnunen, C. Hanilçi, J. Pohjalainen, and P. Alku, Using group delay functions from all-pole models for speaker recognition, Proc. INTERSPEECH, 2013.

C. Kim and R. M. Stern, Power-normalized cepstral coefficients (PNCC) for robust speech recognition, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol.24, issue.7, pp.1315-1329, 2016.

M. Sahidullah, T. Kinnunen, and C. Hanilçi, A comparison of features for synthetic speech detection, Proc. INTERSPEECH, pp.2087-2091, 2015.

C. Hanilçi, Features and classifiers for replay spoofing attack detection, 2017 10th International Conference on Electrical and Electronics Engineering, pp.1187-1191, 2017.

M. McLaren, L. Ferrer, D. C. Lavilla, and A. Lawson, The speakers in the wild (SITW) speaker recognition database, Proc. INTERSPEECH, pp.818-822, 2016.

M. Todisco, H. Delgado, and N. Evans, Articulation rate filtering of CQCC features for automatic speaker verification, Proc. INTERSPEECH, pp.3628-3632, 2016.

X. Jing, J. Ma, J. Zhao, and H. Yang, Speaker recognition based on principal component analysis of LPCC and MFCC, Proc. ICSPCC, pp.403-408, 2014.

M. J. Alam, Multitaper MFCC and PLP features for speaker verification using i-vectors, Speech Communication, vol.55, issue.2, pp.237-251, 2013.

J. M. Kua, Investigation of spectral centroid magnitude and frequency for speaker recognition, Proc. Odyssey, pp.34-39, 2010.

P. Rajan, S. H. Parthasarathi, and H. A. Murthy, Robustness of phase based features for speaker recognition, Proc. INTERSPEECH, 2009.

T. Thiruvaran, E. Ambikairajah, and J. Epps, Group delay features for speaker recognition, 2007 6th International Conference on Information, pp.1-5, 2007.

S. Sadjadi and J. Hansen, Mean Hilbert envelope coefficients (MHEC) for robust speaker and language identification, Speech Communication, vol.72, pp.138-148, 2015.

N. Wang and L. Wang, Robust speaker recognition based on multi-stream features, 2016 IEEE International Conference on Consumer Electronics-China, pp.1-4, 2016.

A. G. Adami, Modeling prosodic differences for speaker recognition, Speech Communication, vol.49, issue.4, pp.277-291, 2007.

D. J. Thomson, Spectrum estimation and harmonic analysis, Proceedings of the IEEE, vol.70, issue.9, pp.1055-1096, 1982.

M. Hansson-Sandsten and J. Sandberg, Optimal cepstrum estimation using multiple windows, Proc. ICASSP, pp.3077-3080, 2009.

M. Hansson, T. Gansler, and G. Salomonsson, A multiple window method for estimation of a peaked spectrum, Proc. ICASSP, vol.3, pp.1617-1620, 1995.

J. Makhoul, Linear prediction: A tutorial review, Proceedings of the IEEE, vol.63, issue.4, pp.561-580, 1975.

L. Rabiner and B. Juang, Fundamentals of Speech Recognition, 1993.

H. Hermansky, Perceptual linear predictive (PLP) analysis of speech, The Journal of the Acoustical Society of America, vol.87, issue.4, pp.1738-1752, 1990.

J. Youngberg and S. Boll, Constant-Q signal analysis and synthesis, Proc. ICASSP, vol.3, pp.375-378, 1978.

C. Schörkhuber and A. Klapuri, Constant-Q transform toolbox for music processing, 7th Sound and Music Computing Conference, 2010.

M. Todisco, H. Delgado, and N. W. Evans, A new feature for automatic speaker verification anti-spoofing: Constant Q cepstral coefficients, Proc. Odyssey, pp.283-290, 2016.

H. Delgado, Further optimisations of constant Q cepstral processing for integrated utterance verification and text-dependent speaker verification, Proc. SLT, 2016.

H. A. Murthy and V. Gadde, The modified group delay function and its application to phoneme recognition, Proc. ICASSP, vol.1, p.68, 2003.

Z. Wu, X. Xiao, E. S. Chng, and H. Li, Synthetic speech detection using temporal modulation feature, Proc. ICASSP, pp.7234-7238, 2013.

J. Yang and L. Liu, Playback speech detection based on magnitude-phase spectrum, Electronics Letters, vol.54, 2018.

L. Cohen, Time-Frequency Analysis: Theory and Applications, 1995.

P. Ghahremani, A pitch extraction algorithm tuned for automatic speech recognition, Proc. ICASSP, pp.2494-2498, 2014.

F. J. Harris, On the use of windows for harmonic analysis with the discrete Fourier transform, Proceedings of the IEEE, vol.66, issue.1, pp.51-83, 1978.

S. Ioffe, Probabilistic linear discriminant analysis, Computer Vision - ECCV 2006, pp.531-542, 2006.