Non-linear speech representation based on local predictability exponents

Abstract: In search of new perspectives for analyzing the non-linear dynamics of speech, this paper presents a novel approach based on a microcanonical multiscale formulation, which allows a geometric and statistical description of the multiscale properties of complex dynamics. Speech is a complex system whose dynamics can be accessed, to some extent, geometrically and statistically through the computation of Local Predictability Exponents (LPEs). These exponents make it possible to determine the most informative subset of the signal (the Most Singular Manifold, or MSM), which leads to a compact representation and to reconstruction from that subset. However, the complex intertwining of different dynamics in speech (in addition to purely turbulent descriptions) suggests defining appropriate multiscale functionals that might influence the evaluation of LPEs and hence lead to a more compact MSM. Using the classical and generic Sauer/Allebach algorithm for signal reconstruction from irregularly spaced samples, we show that high-quality speech reconstruction can be achieved from an MSM of low cardinality. Moreover, to further demonstrate the potential of the new methodology, we develop a simple and efficient waveform coder that achieves almost the same level of perceptual quality as a standard coder at a lower bit-rate.
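The general idea described in the abstract (keep a small, irregularly spaced subset of the least predictable samples, then reconstruct the signal from that subset) can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the score is a plain second-difference prediction error, not the paper's LPEs; the kept subset is not the MSM; and piecewise-linear interpolation stands in for the Sauer/Allebach reconstruction. The function names and the synthetic test signal are hypothetical.

```python
import math

def unpredictability_proxy(x):
    """Crude stand-in for a predictability measure (NOT the paper's LPEs):
    magnitude of the error of the linear prediction 2*x[n-1] - x[n-2]."""
    scores = [0.0] * len(x)
    for n in range(2, len(x)):
        scores[n] = abs(x[n] - 2.0 * x[n - 1] + x[n - 2])
    return scores

def select_subset(x, fraction):
    """Indices of the highest-scoring samples, plus both endpoints."""
    scores = unpredictability_proxy(x)
    k = max(2, int(round(fraction * len(x))))
    ranked = sorted(range(len(x)), key=lambda n: scores[n], reverse=True)
    keep = set(ranked[:k]) | {0, len(x) - 1}
    return sorted(keep)

def reconstruct_linear(indices, values, length):
    """Piecewise-linear interpolation through the kept samples
    (a simple substitute for the Sauer/Allebach algorithm)."""
    y = [0.0] * length
    knots = list(zip(indices, values))
    for (i0, v0), (i1, v1) in zip(knots, knots[1:]):
        for n in range(i0, i1 + 1):
            t = (n - i0) / (i1 - i0)
            y[n] = v0 + t * (v1 - v0)
    return y

def snr_db(x, y):
    """Signal-to-noise ratio of the reconstruction y against the original x."""
    signal = sum(a * a for a in x)
    noise = sum((a - b) ** 2 for a, b in zip(x, y))
    return float("inf") if noise == 0 else 10.0 * math.log10(signal / noise)

# Synthetic two-tone signal (hypothetical test data, not speech).
N = 400
x = [math.sin(2 * math.pi * 3 * n / N) + 0.5 * math.sin(2 * math.pi * 17 * n / N)
     for n in range(N)]

idx = select_subset(x, 0.25)                      # keep roughly a quarter of the samples
y = reconstruct_linear(idx, [x[n] for n in idx], N)
print(len(idx), "of", N, "samples kept; SNR =", round(snr_db(x, y), 1), "dB")
```

Lowering `fraction` trades reconstruction quality for a sparser subset, which is the same compactness-versus-quality trade-off the abstract refers to when it speaks of an MSM of low cardinality.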
Document type:
Journal article
Neurocomputing Journal, Elsevier, 2013, Special issue on Non-Linear Speech Signal processing

Cited literature: 19 references

https://hal.inria.fr/hal-00685019
Contributor: Vahid Khanagha
Submitted on: Tuesday, April 16, 2013 - 21:31:26
Last modified on: Wednesday, January 3, 2018 - 14:18:08
Document(s) archived on: Wednesday, July 17, 2013 - 02:25:09

File

Neuro2012.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-00685019, version 1

Citation

Vahid Khanagha, Khalid Daoudi, Oriol Pont, Hussein Yahia, Antonio Turiel. Non-linear speech representation based on local predictability exponents. Neurocomputing Journal, Elsevier, 2013, Special issue on Non-Linear Speech Signal processing. 〈hal-00685019〉
