Supervised Group Nonnegative Matrix Factorisation With Similarity Constraints And Applications To Speaker Identification

Romain Serizel 1 Victor Bisot 2 Slim Essid 2 Gael Richard 2
1 MULTISPEECH - Speech Modeling for Facilitating Oral-Based Communication
Inria Nancy - Grand Est, LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
Abstract : This paper presents supervised feature learning approaches for speaker identification that rely on nonnegative matrix factorisation. Recent studies have shown that group nonnegative matrix factorisation and task-driven supervised dictionary learning can help perform effective feature learning for audio classification problems. This paper proposes to integrate a recent method that relies on group nonnegative matrix factorisation into a task-driven supervised framework for speaker identification. The goal is to capture both speaker variability and session variability while exploiting the discriminative learning aspect of the task-driven approach. Results on a subset of the ESTER corpus show that the proposed approach can be competitive with I-vectors.
Index Terms: Nonnegative matrix factorisation, feature learning, dictionary learning, online learning, speaker identification
Document type :
Conference papers

Cited literature [27 references]

https://hal.inria.fr/hal-01484744
Contributor : Romain Serizel
Submitted on : Tuesday, March 7, 2017 - 4:25:06 PM
Last modification on : Wednesday, June 24, 2020 - 4:19:02 PM
Document(s) archived on : Thursday, June 8, 2017 - 2:22:38 PM

File

supervised-group-nonnegative.p...
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01484744, version 1

Citation

Romain Serizel, Victor Bisot, Slim Essid, Gael Richard. Supervised Group Nonnegative Matrix Factorisation With Similarity Constraints And Applications To Speaker Identification. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Mar 2017, New Orleans, United States. ⟨hal-01484744⟩

Metrics

Record views : 695
File downloads : 297