Time-frequency processing - Spectral properties

Tuomas Virtanen; Emmanuel Vincent; Sharon Gannot

Book chapter, Year: 2018

Tuomas Virtanen

Role: Author

Tampere University of Technology [Tampere]

Emmanuel Vincent

Role: Author
PersonId: 1256
IdHAL: emmanuelv
ORCID: 0000-0002-0183-7289
IdRef: 089360176

Speech Modeling for Facilitating Oral-Based Communication

Sharon Gannot

Role: Author

Bar-Ilan University [Israel]

Abstract

Many audio signal processing algorithms operate not on raw time-domain audio signals but on time-frequency representations. A raw audio signal encodes the amplitude of a sound as a function of time. Its Fourier spectrum represents the amplitude as a function of frequency but does not capture variations over time. A time-frequency representation presents the amplitude of a sound as a function of both time and frequency, and can jointly account for its temporal and spectral characteristics (Gröchenig, 2001). In the context of source separation and speech enhancement, time-frequency representations are appropriate for three reasons. First, separation and enhancement often require modeling the structure of sound sources. Natural sound sources have a prominent structure in both time and frequency, which can be easily modeled in the time-frequency domain. Second, the sound sources are often mixed convolutively, and this convolutive mixing process can be approximated with simpler operations in the time-frequency domain. Third, natural sounds are more sparsely distributed and overlap less with each other in the time-frequency domain than in the time or frequency domain, which facilitates their separation. In this chapter we introduce the most common time-frequency representations used for source separation and speech enhancement. Section 2.1 describes the procedure for calculating a time-frequency representation and converting it back to the time domain, using the short-time Fourier transform (STFT) as an example. It also presents other common time-frequency representations and their relevance for separation and enhancement. Section 2.2 discusses the properties of sound sources in the time-frequency domain, including sparsity, disjointness, and more complex structures such as harmonicity. Section 2.3 explains how to achieve separation by time-varying filtering in the time-frequency domain. Section 2.4 summarizes the main concepts and provides links to other chapters and more advanced topics.
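
As an illustration of the pipeline the abstract outlines (STFT analysis, masking in the time-frequency domain, and conversion back to the time domain), the following is a minimal NumPy sketch. It is not taken from the chapter. The frame length, hop size, square-root Hann window, sampling rate, and the two test tones are assumptions chosen for the example, and the binary mask is an oracle mask computed from the known sources, which would not be available in practice.

```python
import numpy as np

FRAME_LEN = 256  # assumed frame length in samples (not from the chapter)
HOP = 128        # assumed hop size, i.e. 50% overlap


def sqrt_hann(frame_len):
    # Square root of a periodic Hann window. Applying it at analysis and at
    # synthesis makes the overlapping squared windows sum to one at 50% overlap.
    return np.sqrt(np.hanning(frame_len + 1)[:-1])


def stft(x, frame_len=FRAME_LEN, hop=HOP):
    # Cut the signal into overlapping windowed frames and take the DFT of each.
    window = sqrt_hann(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)  # shape: (n_frames, frame_len // 2 + 1)


def istft(X, frame_len=FRAME_LEN, hop=HOP):
    # Inverse DFT of each frame, synthesis windowing, then overlap-add.
    window = sqrt_hann(frame_len)
    frames = np.fft.irfft(X, n=frame_len, axis=1) * window
    out = np.zeros(frame_len + hop * (X.shape[0] - 1))
    for i, frame in enumerate(frames):
        out[i * hop:i * hop + frame_len] += frame
    return out


# Hypothetical test signals: two tones that barely overlap in frequency.
fs = 8000                       # assumed sampling rate in Hz
t = np.arange(fs) / fs          # one second of audio
low = np.sin(2 * np.pi * 440 * t)
high = np.sin(2 * np.pi * 2000 * t)
mix = low + high

# Analysis followed by synthesis reconstructs the signal, except for the
# first and last HOP samples, which are covered by a single frame only.
X = stft(mix)
recon = istft(X)
interior = slice(HOP, len(recon) - HOP)
recon_error = np.max(np.abs(recon[interior] - mix[interior]))

# Oracle binary mask: keep each time-frequency bin where the low tone dominates.
mask = np.abs(stft(low)) > np.abs(stft(high))
estimate = istft(mask * X)
rel_error = (np.linalg.norm(estimate[interior] - low[interior])
             / np.linalg.norm(low[interior]))

print("max reconstruction error:", recon_error)
print("relative separation error:", rel_error)
```

Because the two tones occupy nearly disjoint sets of time-frequency bins, the masked mixture stays close to the low tone. This is the sparsity and disjointness property that the abstract gives as the third reason for working in the time-frequency domain.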

Subject areas

Signal and Image Processing [eess.SP]

Main file

virtanen_book18_chap2.pdf (991.39 KB)

Origin: Files produced by the author(s)

Contributor: Emmanuel Vincent

https://inria.hal.science/hal-01881426

Submitted on: Tuesday, September 25, 2018, 21:29:43

Last modified on: Thursday, February 1, 2024, 10:05:50

Long-term archiving on: Wednesday, December 26, 2018, 17:17:31

Dates and versions

hal-01881426, version 1 (25-09-2018)

Identifiers

HAL Id: hal-01881426, version 1

Cite

Tuomas Virtanen, Emmanuel Vincent, Sharon Gannot. Time-frequency processing - Spectral properties. Emmanuel Vincent; Tuomas Virtanen; Sharon Gannot. Audio source separation and speech enhancement, Wiley, 2018, 978-1-119-27989-1. ⟨hal-01881426⟩

Collections

UNIV-RENNES1 CNRS INRIA IRISA UNIV-LORRAINE INRIA2 LORIA LORIA-NLPKD UR1-MATH-STIC UR1-UFR-ISTIC UNIV-RENNES UR1-MATH-NUM

170 views

1910 downloads
