T. Kinnunen and H. Li, An overview of text-independent speaker recognition: From features to supervectors, Speech Communication, vol.52, issue.1, pp.12-40, 2010.
URL : https://hal.archives-ouvertes.fr/hal-00587602

J. P. Campbell, W. Shen, W. M. Campbell, R. Schwartz, J. Bonastre et al., Forensic speaker recognition, IEEE Signal Processing Magazine, vol.26, issue.2, pp.95-103, 2009.

ICICI bank introduces voice recognition for biometric authentication, pp.2019-2020

Interpol's new software will recognize criminals by their voices, pp.2019-2020

J. P. Campbell, Speaker recognition: A tutorial, Proceedings of the IEEE, vol.85, issue.9, pp.1437-1462, 1997.

N. Dehak, P. Kenny, R. Dehak, P. Dumouchel, and P. Ouellet, Front-end factor analysis for speaker verification, IEEE Transactions on Audio, Speech, and Language Processing, vol.19, issue.4, pp.788-798, 2011.

P. Matějka, O. Glembek, O. Novotný, O. Plchot, F. Grézl et al., Analysis of DNN approaches to speaker identification, Proc. ICASSP, pp.5100-5104, 2016.

D. Snyder, D. Garcia-Romero, G. Sell, D. Povey, and S. Khudanpur, X-vectors: Robust DNN embeddings for speaker recognition, Proc. ICASSP, 2018.

A. Poddar, M. Sahidullah, and G. Saha, Speaker verification with short utterances: a review of challenges, trends and opportunities, IET Biometrics, vol.7, issue.3, pp.91-101, 2018.

Y. A. Solewicz, G. Jardine, T. Becker, and S. Gfroerer, Estimated intra-speaker variability boundaries in forensic speaker recognition casework, Proceedings of Biometric Technologies in Forensic Science (BTFS), pp.31-33, 2013.

J. Ming, T. J. Hazen, J. R. Glass, and D. A. Reynolds, Robust speaker recognition in noisy conditions, IEEE Transactions on Audio, Speech, and Language Processing, vol.15, issue.5, pp.1711-1723, 2007.

R. Saeidi, P. Alku, and T. Bäckström, Feature extraction using power-law adjusted linear prediction with application to speaker recognition under severe vocal effort mismatch, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol.24, issue.1, pp.42-53, 2016.

M. McLaren, L. Ferrer, and A. Lawson, Exploring the role of phonetic bottleneck features for speaker and language recognition, Proc. ICASSP, pp.5575-5579, 2016.

S. Parthasarathy, C. Zhang, J. H. Hansen, and C. Busso, A study of speaker verification performance with expressive speech, Proc. ICASSP, pp.5540-5544, 2017.

D. Wang, Y. Zou, J. Liu, and Y. Huang, A robust DBN-vector based speaker verification system under channel mismatch conditions, 2016 IEEE International Conference on Digital Signal Processing (DSP), pp.94-98, 2016.

V. Vestman, D. Gowda, M. Sahidullah, P. Alku, and T. Kinnunen, Time-varying autoregressions for speaker verification in reverberant conditions, Proc. INTERSPEECH, pp.1512-1516, 2017.

A. Kanagasundaram, R. Vogt, D. B. Dean, S. Sridharan, and M. W. Mason, I-vector based speaker recognition on short utterances, Proc. INTERSPEECH, International Speech Communication Association (ISCA), pp.2341-2344, 2011.

M. I. Mandasari, M. McLaren, and D. A. van Leeuwen, Evaluation of i-vector speaker recognition systems for forensic application, Proc. INTERSPEECH, pp.21-24, 2011.

A. Kanagasundaram, R. J. Vogt, D. B. Dean, and S. Sridharan, PLDA based speaker recognition on short utterances, Proc. Odyssey: The Speaker and Language Recognition Workshop, ISCA, 2012.

A. K. Sarkar, D. Matrouf, P. Bousquet, and J. Bonastre, Study of the effect of i-vector modeling on short and mismatch utterance duration for speaker verification, Proc. INTERSPEECH, 2012.
URL : https://hal.archives-ouvertes.fr/hal-01320313

B. G. Fauve, N. W. Evans, N. Pearson, J. Bonastre, and J. S. Mason, Influence of task duration in text-independent speaker verification, Proc. INTERSPEECH, pp.794-797, 2007.
URL : https://hal.archives-ouvertes.fr/hal-01312885

L. Ferrer, H. Bratt, V. R. Gadde, S. S. Kajarekar, E. Shriberg et al., Modeling duration patterns for speaker recognition, Proc. EUROSPEECH, 2003.

B. Fauve, N. Evans, and J. Mason, Improving the performance of text-independent short duration SVM- and GMM-based speaker verification, Proc. Odyssey: The Speaker and Language Recognition Workshop, p.18, 2008.

T. Hasan, R. Saeidi, J. H. Hansen, and D. van Leeuwen, Duration mismatch compensation for i-vector based speaker recognition systems, Proc. ICASSP, pp.7663-7667, 2013.

A. Kanagasundaram, D. Dean, S. Sridharan, J. Gonzalez-Dominguez, J. Gonzalez-Rodriguez et al., Improving short utterance i-vector speaker verification using utterance variance modelling and compensation techniques, Speech Communication, vol.59, pp.69-82, 2014.

M. I. Mandasari, R. Saeidi, and D. A. van Leeuwen, Quality measures based calibration with duration and noise dependency for speaker recognition, Speech Communication, vol.72, pp.126-137, 2015.

M. I. Mandasari, R. Saeidi, M. McLaren, and D. A. van Leeuwen, Quality measure functions for calibration of speaker recognition systems in various duration conditions, IEEE Transactions on Audio, Speech, and Language Processing, vol.21, issue.11, pp.2425-2438, 2013.

L. Li, D. Wang, C. Zhang, and T. F. Zheng, Improving short utterance speaker recognition by modeling speech unit classes, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol.24, issue.6, pp.1129-1139, 2016.

C. Zhang, K. Koishida, and J. Hansen, Text-independent speaker verification based on triplet convolutional neural network embeddings, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol.26, issue.9, pp.1633-1644, 2018.

J. Guo, N. Xu, K. Qian, Y. Shi, K. Xu et al., Deep neural network based i-vector mapping for speaker verification using short utterances, Speech Communication, vol.105, pp.92-102, 2018.

N. Poh, J. Kittler, and T. Bourlai, Quality-based score normalization with device qualitative information for multimodal biometric fusion, IEEE Transactions on Systems, Man, and Cybernetics-Part A: Systems and Humans, vol.40, issue.3, pp.539-554, 2010.

N. Poh and S. Bengio, Improving fusion with margin-derived confidence in biometric authentication tasks, International Conference on Audio- and Video-Based Biometric Person Authentication, pp.474-483, 2005.

N. Poh, T. Bourlai, and J. Kittler, A multimodal biometric test bed for quality-dependent, cost-sensitive and client-specific score-level fusion algorithms, Pattern Recognition, vol.43, issue.3, pp.1094-1105, 2010.

N. Poh and J. Kittler, A unified framework for biometric expert fusion incorporating quality measures, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.34, issue.1, pp.3-18, 2012.

J. Fierrez-Aguilar, J. Ortega-Garcia, J. Gonzalez-Rodriguez, and J. Bigun, Discriminative multimodal biometric authentication based on quality measures, Pattern Recognition, vol.38, issue.5, pp.777-779, 2005.

P. Grother and E. Tabassi, Performance of biometric quality measures, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.29, issue.4, pp.531-543, 2007.

D. Garcia-Romero, J. Fierrez-Aguilar, J. Gonzalez-Rodriguez, and J. Ortega-Garcia, Using quality measures for multilevel speaker recognition, Computer Speech & Language, vol.20, issue.2-3, pp.192-209, 2006.

D. Garcia-Romero, J. Fierrez-Aguilar, J. Gonzalez-Rodriguez, and J. Ortega-Garcia, On the use of quality measures for text-independent speaker recognition, Proc. Odyssey: The Speaker and Language Recognition Workshop, 2004.

T. Hasan, S. O. Sadjadi, G. Liu, N. Shokouhi, H. Bořil et al., CRSS systems for 2012 NIST speaker recognition evaluation, Proc. ICASSP, pp.6783-6787, 2013.

A. Harriero, D. Ramos, J. Gonzalez-Rodriguez, and J. Fierrez, Analysis of the utility of classical and novel speech quality measures for speaker verification, International Conference on Biometrics, pp.434-442, 2009.

F. Alonso-Fernandez, J. Fierrez, and J. Ortega-Garcia, Quality measures in biometric systems, IEEE Security & Privacy, vol.10, issue.6, pp.52-62, 2012.

C. C. Chibelushi, F. Deravi, and J. S. Mason, A review of speech-based bimodal recognition, IEEE Transactions on Multimedia, vol.4, issue.1, pp.23-37, 2002.

J. Kittler, N. Poh, O. Fatukasi, K. Messer, K. Kryszczuk et al., Quality dependent fusion of intramodal and multimodal biometric experts, Defense and Security Symposium, International Society for Optics and Photonics, pp.653903-653903, 2007.

A. Poddar, M. Sahidullah, and G. Saha, Novel quality metric for duration variability compensation in speaker verification, Proc. Ninth International Conference on Advances in Pattern Recognition, 2017.

A. Poddar, M. Sahidullah, and G. Saha, Improved i-vector extraction technique for speaker verification with short utterances, International Journal of Speech Technology, vol.21, issue.3, pp.473-488, 2018.

D. A. Reynolds, T. F. Quatieri, and R. B. Dunn, Speaker verification using adapted Gaussian mixture models, Digital Signal Processing, vol.10, issue.1, pp.19-41, 2000.

W. M. Campbell, D. E. Sturim, D. A. Reynolds, and A. Solomonoff, SVM based speaker verification using a GMM supervector kernel and NAP variability compensation, Proc. ICASSP, vol.1, pp.I-I, 2006.

P. Kenny, G. Boulianne, P. Ouellet, and P. Dumouchel, Joint factor analysis versus eigenchannels in speaker recognition, IEEE Transactions on Audio, Speech, and Language Processing, vol.15, issue.4, pp.1435-1447, 2007.

P. Kenny, Bayesian speaker verification with heavy-tailed priors, Proc. Odyssey: The Speaker and Language Recognition Workshop, p.14, 2010.

W. Li, T. Fu, H. You, J. Zhu, and N. Chen, Feature sparsity analysis for i-vector based speaker verification, Speech Communication, vol.80, pp.60-70, 2016.

A. H. Poorjam, R. Saeidi, T. Kinnunen, and V. Hautamäki, Incorporating uncertainty as a quality measure in i-vector based language recognition, Proc. Odyssey: The Speaker and Language Recognition Workshop, ISCA, pp.74-80, 2016.

L. Ferrer, M. K. Sönmez, and S. S. Kajarekar, Class-dependent score combination for speaker recognition, Proc. INTERSPEECH, pp.2173-2176, 2005.

G. R. Doddington, M. A. Przybocki, A. F. Martin, and D. A. Reynolds, The NIST speaker recognition evaluation-overview, methodology, systems, results, perspective, Speech Communication, vol.31, issue.2, pp.225-254, 2000.

J. Bigun, J. Fierrez-Aguilar, J. Ortega-Garcia, and J. Gonzalez-Rodriguez, Multimodal biometric authentication using quality signals in mobile communications, Proc. 12th International Conference on Image Analysis and Processing, 2003.

V. Hautamäki, T. Kinnunen, F. Sedlák, K. A. Lee, B. Ma et al., Sparse classifier fusion for speaker verification, IEEE Transactions on Audio, Speech, and Language Processing, vol.21, issue.8, pp.1622-1631, 2013.

BOSARIS Toolkit.

The NIST year 2008 speaker recognition evaluation plan, Tech. Rep.

The NIST year 2010 speaker recognition evaluation plan, Tech. Rep.

A. Poddar, M. Sahidullah, and G. Saha, Performance comparison of speaker recognition systems in presence of duration variability, Proc. 2015 Annual IEEE India Conference (INDICON), pp.1-6, 2015.

S. B. Davis and P. Mermelstein, Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences, IEEE Transactions on Acoustics, Speech and Signal Processing, vol.28, issue.4, pp.357-366, 1980.

M. Sahidullah and G. Saha, Design, analysis and experimental evaluation of block based transformation in MFCC computation for speaker recognition, Speech Communication, vol.54, issue.4, pp.543-565, 2012.

M. Sahidullah and G. Saha, A novel windowing technique for efficient computation of MFCC for speaker recognition, IEEE Signal Processing Letters, vol.20, issue.2, pp.149-152, 2013.

M. Sahidullah and G. Saha, Comparison of speech activity detection techniques for speaker recognition

P. Kenny, P. Ouellet, N. Dehak, V. Gupta, and P. Dumouchel, A study of interspeaker variability in speaker verification, IEEE Transactions on Audio, Speech, and Language Processing, vol.16, issue.5, pp.980-988, 2008.

Arnab Poddar received his MS (by research) degree in the area of speech processing and machine learning from the Department of Electronics & Electrical Communication Engineering, 2018. Prior to that he worked as a research project person in the project "Development of Optical Character Recognition system on printed Indian Languages" in the Computer Vision and Pattern Recognition (CVPR) Unit, Indian Statistical Institute (ISI).

Prior to that he obtained the Bachelor of Engineering degree in Electronics and Communication Engineering from Vidyasagar University in 2004 and the Master of Engineering degree in Computer Science and Engineering (with specialization in Embedded System) from West Bengal University of Technology in 2006. He was a postdoctoral researcher with the School of Computing.

Goutam Saha received his B.Tech. and Ph.D. degrees from the Department of Electronics & Electrical Communication Engineering, Indian Institute of Technology (IIT) Kharagpur, India in 1990 and 2000, respectively. In between, he served industry for about four years and obtained a five-year fellowship from the Council of Scientific Industrial Research, India.