Conference papers

Parameterized Channel Normalization for Far-field Deep Speaker Verification

Xuechen Liu 1, 2, Md Sahidullah 1, Tomi Kinnunen 2
1 MULTISPEECH - Speech Modeling for Facilitating Oral-Based Communication
Inria Nancy - Grand Est, LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
Abstract : We address far-field speaker verification with a deep neural network (DNN) based speaker embedding extractor, where mismatch between enrollment and test data often comes from convolutive effects (e.g. room reverberation) and noise. To mitigate these effects, we focus on two parametric normalization methods: per-channel energy normalization (PCEN) and parameterized cepstral mean normalization (PCMN). Both methods contain differentiable parameters and can thus be conveniently integrated into, and jointly optimized with, the DNN using automatic differentiation. We consider both fixed and trainable (data-driven) variants of each method. We evaluate performance on Hi-MIA, a recent large-scale far-field speech corpus with varied microphone and positional settings. Our methods outperform conventional mel filterbank features, with a maximum of 33.5% and 39.5% relative improvement in equal error rate under matched-microphone and mismatched-microphone conditions, respectively.
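For readers unfamiliar with PCEN, the following is a minimal NumPy sketch of the fixed (non-trainable) form as it is commonly stated in the literature: a first-order smoother M(t, f) = (1 - s) M(t-1, f) + s E(t, f), followed by PCEN(t, f) = (E(t, f) / (eps + M(t, f))^alpha + delta)^r - delta^r. It is not taken from the paper and may differ from the variant the authors use. The function name `pcen`, the default parameter values, and the choice to initialize the smoother with the first frame are all assumptions made for illustration. In a trainable variant, alpha, delta, r and s would instead be learnable parameters inside an automatic differentiation framework.

```python
import numpy as np


def pcen(mel_energy, alpha=0.98, delta=2.0, r=0.5, s=0.025, eps=1e-6):
    """Illustrative fixed-parameter PCEN (hypothetical helper, not the paper's code).

    mel_energy: non-negative array of shape (frames, channels), e.g. mel
    filterbank energies. Default parameter values are assumed, not sourced
    from the paper.
    """
    E = np.asarray(mel_energy, dtype=np.float64)
    M = np.empty_like(E)

    # Assumption: the smoother starts from the first frame.
    M[0] = E[0]
    for t in range(1, E.shape[0]):
        # First-order IIR smoothing along time, independently per channel.
        M[t] = (1.0 - s) * M[t - 1] + s * E[t]

    # Adaptive gain control (division by the smoothed energy),
    # then root compression with offset delta.
    return (E / (eps + M) ** alpha + delta) ** r - delta ** r
```

With alpha set to 1, the gain-control step divides each channel by its own smoothed energy, so a constant per-channel gain applied to the input is largely cancelled. This is the property that makes this kind of normalization attractive when enrollment and test recordings come from different microphones or distances.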
Contributor : Md Sahidullah
Submitted on : Thursday, September 30, 2021 - 5:47:38 AM
Last modification on : Friday, February 4, 2022 - 11:05:47 AM
Long-term archiving on : Friday, December 31, 2021 - 6:10:58 PM


Files produced by the author(s)


  • HAL Id : hal-03359174, version 1


Xuechen Liu, Md Sahidullah, Tomi Kinnunen. Parameterized Channel Normalization for Far-field Deep Speaker Verification. ASRU 2021 - IEEE Automatic Speech Recognition and Understanding Workshop, Dec 2021, Cartagena, Colombia. ⟨hal-03359174⟩


