B-spline model order selection with optimal MDL criterion applied to speech fundamental frequency stylisation

Damien Lolive 1, Nelly Barbot 1, Olivier Boëffard 1
1 CORDIAL - Human-machine spoken dialogue
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, INRIA Rennes, ENSSAT - École Nationale Supérieure des Sciences Appliquées et de Technologie
Abstract: In the speech processing field, stylization of the fundamental frequency (F0) has been the subject of numerous studies. Models proposed in the literature rely on knowledge stemming from phonology and linguistics. We propose an approach to F0 curve stylization that requires as few linguistic assumptions as possible, within the framework of B-spline models. A B-spline model, characterized by a sequence of knots with which control points are associated, makes it possible to formalize discontinuities in the derivatives of the sequence of observed values. Beyond implementing a B-spline model to stylize an open curve sampled with a constant step, we address the problem of choosing the optimal model order. We propose a parsimony criterion based on a minimum description length (MDL) approach to optimize the number of knots. We derive several criteria relying on bounds estimated from parameter values, and we demonstrate the optimality of these choices in the theoretical MDL framework. We introduce a notion of variable parameter precision, which enables a good compromise between modeling precision and the degrees of freedom of the estimated models. Experiments on a French speech corpus compare three MDL criteria. Combining the B-spline model with the MDL methodology enables efficient modeling of F0 curves and yields an RMS error of around 1 Hz while allowing a relatively high compression rate of about 40%.
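To make the idea of the abstract concrete, the following is a minimal sketch of knot-count selection for a least-squares cubic B-spline fit. It is not the authors' method: the score used is the textbook two-part MDL (BIC-like) formula, not the criteria derived in the paper, and the knots are assumed uniform. All names (`basis`, `clamped_knots`, `fit`, `mdl_score`, `select_order`) and the synthetic F0-like curve are hypothetical and chosen only for illustration.

```python
import math
import random

def basis(i, p, knots, t):
    """Cox-de Boor recursion: i-th B-spline basis function of degree p at t."""
    if p == 0:
        return 1.0 if knots[i] <= t < knots[i + 1] else 0.0
    left = right = 0.0
    if knots[i + p] > knots[i]:
        left = (t - knots[i]) / (knots[i + p] - knots[i]) * basis(i, p - 1, knots, t)
    if knots[i + p + 1] > knots[i + 1]:
        right = ((knots[i + p + 1] - t) / (knots[i + p + 1] - knots[i + 1])
                 * basis(i + 1, p - 1, knots, t))
    return left + right

def clamped_knots(m, p=3):
    """Clamped uniform knot vector on [0, 1] with m interior knots (assumption)."""
    interior = [k / (m + 1) for k in range(1, m + 1)]
    return [0.0] * (p + 1) + interior + [1.0] * (p + 1)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def fit(ts, ys, m, p=3):
    """Least-squares control points for m interior knots; returns (ctrl, rss)."""
    knots = clamped_knots(m, p)
    nc = m + p + 1  # number of control points
    B = [[basis(i, p, knots, t) for i in range(nc)] for t in ts]
    # Normal equations (B^T B) c = B^T y
    A = [[sum(row[i] * row[j] for row in B) for j in range(nc)] for i in range(nc)]
    b = [sum(row[i] * y for row, y in zip(B, ys)) for i in range(nc)]
    ctrl = solve(A, b)
    rss = sum((y - sum(c * v for c, v in zip(ctrl, row))) ** 2
              for row, y in zip(B, ys))
    return ctrl, rss

def mdl_score(rss, n, n_params):
    """Textbook two-part MDL in nats: data cost plus parameter cost."""
    return 0.5 * n * math.log(max(rss, 1e-12) / n) + 0.5 * n_params * math.log(n)

def select_order(ts, ys, max_knots=12, p=3):
    """Return the interior-knot count with the lowest score, and all scores."""
    scores = {}
    for m in range(max_knots + 1):
        _, rss = fit(ts, ys, m, p)
        scores[m] = mdl_score(rss, len(ts), m + p + 1)
    return min(scores, key=scores.get), scores

# Synthetic F0-like contour in Hz (made-up data, constant sampling step).
rng = random.Random(0)
n = 80
ts = [j / n for j in range(n)]  # samples stay inside [0, 1)
ys = [120 + 20 * math.sin(2 * math.pi * t) + 8 * math.sin(6 * math.pi * t)
      + rng.gauss(0, 1) for t in ts]
best_m, scores = select_order(ts, ys)
_, rss = fit(ts, ys, best_m)
print("interior knots:", best_m, "RMS error (Hz):", math.sqrt(rss / n))
```

The penalty term grows with the number of control points, so the selected model trades residual error against degrees of freedom, which is the kind of compromise the abstract describes. Sampling on [0, 1) sidesteps the half-open interval convention of the degree-0 basis at the right endpoint.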
Document type:
Journal article
IEEE Journal of Selected Topics in Signal Processing, IEEE, 2010, 4 (3), pp.571 - 581. 〈10.1109/JSTSP.2010.2048236〉

https://hal.inria.fr/inria-00538937
Contributor: Equipe-Projet Cordial
Submitted on: Tuesday, November 23, 2010 - 15:19:10
Last modified on: Wednesday, May 16, 2018 - 11:23:02

Citation

Damien Lolive, Nelly Barbot, Olivier Boëffard. B-spline model order selection with optimal MDL criterion applied to speech fundamental frequency stylisation. IEEE Journal of Selected Topics in Signal Processing, IEEE, 2010, 4 (3), pp.571 - 581. 〈10.1109/JSTSP.2010.2048236〉. 〈inria-00538937〉
