A consolidated perspective on multi-microphone speech enhancement and source separation

Abstract: Speech enhancement and separation are core problems in audio signal processing, with commercial applications in devices as diverse as mobile phones, conference call systems, hands-free systems, or hearing aids. In addition, they are crucial pre-processing steps for noise-robust automatic speech and speaker recognition. Many devices now have two to eight microphones. The enhancement and separation capabilities offered by these multichannel interfaces are usually greater than those of single-channel interfaces. Research in speech enhancement and separation has followed two convergent paths, starting with microphone array processing and blind source separation, respectively. These communities are now strongly interrelated and routinely borrow ideas from each other. Yet, a comprehensive overview of the common foundations and the differences between these approaches is lacking at present. In this article, we propose to fill this gap by analyzing a large number of established and recent techniques according to four transverse axes: a) the acoustic impulse response model, b) the spatial filter design criterion, c) the parameter estimation algorithm, and d) optional postfiltering. We conclude this overview paper by providing a list of software and data resources and by discussing perspectives and future trends in the field.
Document type:
Journal article
IEEE/ACM Transactions on Audio, Speech and Language Processing, Institute of Electrical and Electronics Engineers, 2017, 25 (4), pp.692-730

https://hal.inria.fr/hal-01414179
Contributor: Emmanuel Vincent
Submitted on: Saturday, March 4, 2017 - 22:57:43
Last modified on: Thursday, January 11, 2018 - 06:27:31
Document(s) archived on: Tuesday, June 6, 2017 - 12:07:06

File

gannot_TASLP17.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01414179, version 2

Citation

Sharon Gannot, Emmanuel Vincent, Shmulik Markovich-Golan, Alexey Ozerov. A consolidated perspective on multi-microphone speech enhancement and source separation. IEEE/ACM Transactions on Audio, Speech and Language Processing, Institute of Electrical and Electronics Engineers, 2017, 25 (4), pp.692-730. 〈hal-01414179v2〉

Metrics

Record views: 699

File downloads: 931