Conference papers

Voice Activity Detection Based on Statistical Likelihood Ratio With Adaptive Thresholding

Xiaofei Li (1), Radu Horaud (1), Laurent Girin (1, 2), Sharon Gannot (3)
(1) PERCEPTION - Interpretation and Modelling of Images and Videos
Inria Grenoble - Rhône-Alpes, Grenoble INP - Institut polytechnique de Grenoble - Grenoble Institute of Technology, LJK - Laboratoire Jean Kuntzmann
Abstract: The statistical likelihood ratio test is a widely used voice activity detection (VAD) method, in which the likelihood ratio of the current temporal frame is compared with a threshold. A fixed threshold is commonly used, but a single value is not suitable across different types of noise. In this paper, an adaptive threshold is proposed as a function of the local statistics of the likelihood ratio. This threshold represents the upper bound of the likelihood ratio for the non-speech frames, while generally remaining lower than the likelihood ratio for the speech frames. As a result, a high non-speech hit rate can be achieved while keeping the speech hit rate as high as possible.
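The idea in the abstract, comparing each frame's likelihood ratio with a threshold derived from local statistics, can be illustrated with a minimal sketch. The abstract does not give the paper's threshold function, so everything specific here is an assumption made for illustration: the function name `adaptive_vad`, the rule "local mean plus a margin of `k` standard deviations", the `floor` on that margin, the trailing window of frames previously judged non-speech, and the bootstrap that treats the first frames as noise. It should not be read as the authors' method.

```python
from collections import deque
from statistics import mean, pstdev


def adaptive_vad(log_lr, window=10, k=3.0, floor=0.5):
    """Label each frame as speech (True) or non-speech (False).

    log_lr : per-frame log-likelihood ratios (speech vs. noise hypothesis).
    window : number of recent non-speech frames used for local statistics
             (hypothetical parameter).
    k      : margin above the local mean, in standard deviations
             (hypothetical parameter).
    floor  : minimum margin, so a nearly flat noise floor does not yield a
             threshold equal to the noise level itself (hypothetical).
    """
    history = deque(maxlen=window)  # recent values judged non-speech
    decisions = []
    thresholds = []
    for value in log_lr:
        if len(history) < 2:
            # Assumption: the recording starts with noise only, so the
            # first frames are never labelled as speech.
            threshold = float("inf")
        else:
            mu = mean(history)
            sigma = pstdev(history)
            # Illustrative stand-in for an upper bound on the non-speech
            # likelihood ratio, not the bound derived in the paper.
            threshold = mu + max(k * sigma, floor)
        is_speech = value > threshold
        if not is_speech:
            # Only non-speech frames update the local statistics.
            history.append(value)
        decisions.append(is_speech)
        thresholds.append(threshold)
    return decisions, thresholds
```

Called as `decisions, thresholds = adaptive_vad(values)` on a list of per-frame log-likelihood ratios, the sketch returns one boolean and one threshold per frame. Because the threshold tracks the recent non-speech values rather than a fixed constant, it rises and falls with the noise level, which is the behaviour the abstract motivates.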

Contributor: Perception Team
Submitted on: Thursday, July 28, 2016 - 4:36:05 PM
Last modification on: Wednesday, November 3, 2021 - 5:13:17 AM
Long-term archiving on: Saturday, October 29, 2016 - 10:40:28 AM


Files produced by the author(s)



Xiaofei Li, Radu Horaud, Laurent Girin, Sharon Gannot. Voice Activity Detection Based on Statistical Likelihood Ratio With Adaptive Thresholding. IWAENC 2016 - International Workshop on Acoustic Signal Enhancement (IWAENC), Sep 2016, Xi'an, China. pp.1-5, ⟨10.1109/IWAENC.2016.7602911⟩. ⟨hal-01349776⟩


