Conference papers

Voice Activity Detection Based on Statistical Likelihood Ratio With Adaptive Thresholding

Xiaofei Li 1 Radu Horaud 1 Laurent Girin 1, 2 Sharon Gannot 3
1 PERCEPTION - Interpretation and Modelling of Images and Videos
Inria Grenoble - Rhône-Alpes, Grenoble INP - Institut polytechnique de Grenoble - Grenoble Institute of Technology, LJK - Laboratoire Jean Kuntzmann
Abstract : The statistical likelihood ratio test is a widely used voice activity detection (VAD) method, in which the likelihood ratio of the current temporal frame is compared with a threshold. A fixed threshold is typically used, but a single fixed value is not suitable across different types of noise. In this paper, an adaptive threshold is proposed as a function of the local statistics of the likelihood ratio. This threshold represents the upper bound of the likelihood ratio for non-speech frames, while generally remaining lower than the likelihood ratio for speech frames. As a result, a high non-speech hit rate can be achieved while keeping the speech hit rate as high as possible.
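
The abstract does not give the threshold formula, so the following is only a minimal sketch of the general idea: compare each frame's likelihood ratio against a threshold derived from local statistics of the ratio itself. Everything specific here is an assumption made for illustration and is not taken from the paper: the function name adaptive_threshold_vad, the rule "local mean plus k times local standard deviation", the parameters window, k and init_frames, the decision-directed update that keeps only frames classified as non-speech, and the bootstrap assumption that the first few frames contain noise only.

```python
from collections import deque
from statistics import mean, pstdev


def adaptive_threshold_vad(log_lr, window=20, k=3.0, init_frames=5):
    """Return (decisions, thresholds) for per-frame log-likelihood ratios.

    Assumed rule (illustrative, NOT the paper's formula):
        threshold = local mean + k * local standard deviation,
    computed over the most recent frames classified as non-speech.
    """
    if init_frames < 1:
        raise ValueError("init_frames must be at least 1")

    # Sliding buffer of recent likelihood ratios from non-speech frames.
    noise_history = deque(maxlen=window)
    decisions, thresholds = [], []

    for t, value in enumerate(log_lr):
        if t < init_frames:
            # Bootstrap assumption: the leading frames are noise only.
            noise_history.append(value)
            decisions.append(False)
            thresholds.append(float("inf"))
            continue

        # Local statistics give an estimated upper bound for non-speech frames.
        threshold = mean(noise_history) + k * pstdev(noise_history)
        is_speech = value > threshold

        # Only non-speech frames update the local statistics, so that
        # speech segments do not inflate the threshold.
        if not is_speech:
            noise_history.append(value)

        decisions.append(is_speech)
        thresholds.append(threshold)

    return decisions, thresholds


if __name__ == "__main__":
    # Toy likelihood-ratio track: low values for noise, a short burst for speech.
    ratios = [0.1, 0.2, 0.1, 0.2, 0.1, 0.15, 5.0, 6.0, 0.1]
    flags, levels = adaptive_threshold_vad(ratios)
    print(flags)
```

Because the threshold follows the recent non-speech likelihood ratios, it rises and falls with the noise level instead of staying at one value, which is the behaviour the abstract motivates. The actual statistic used in the paper may differ.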

Cited literature [16 references]
Contributor : Team Perception
Submitted on : Thursday, July 28, 2016 - 4:36:05 PM
Last modification on : Thursday, November 19, 2020 - 1:02:13 PM
Long-term archiving on : Saturday, October 29, 2016 - 10:40:28 AM


Xiaofei Li, Radu Horaud, Laurent Girin, Sharon Gannot. Voice Activity Detection Based on Statistical Likelihood Ratio With Adaptive Thresholding. International Workshop on Acoustic Signal Enhancement (IWAENC), Sep 2016, Xi'an, China. pp.1-5, ⟨10.1109/IWAENC.2016.7602911⟩. ⟨hal-01349776⟩


