Selecting Representative Speakers for a Speech Database on the Basis of Heterogeneous Similarity Criteria

Frédéric Bimbot 1 Olivier Boëffard 2 Delphine Charlet 3 Dominique Fohr 4 Sacha Krstulovic 1 Odile Mella 4
1 METISS - Speech and sound data modeling and processing
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, Inria Rennes – Bretagne Atlantique
2 CORDIAL - Human-machine spoken dialogue
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, INRIA Rennes, ENSSAT - École Nationale Supérieure des Sciences Appliquées et de Technologie
4 PAROLE - Analysis, perception and recognition of speech
INRIA Lorraine, LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
Abstract: In the context of the Neologos French speech database creation project, a general methodology was defined for selecting representative speaker recordings. The selection aims to provide good coverage of speaker variability while limiting the number of recorded speakers. This is intended to make the resulting database both better suited to the development of recently proposed multi-model methods and less expensive to collect. The methodology selects speakers by optimizing a quality criterion that can be defined in a variety of speaker similarity modeling frameworks. The selection can be carried out with respect to a single similarity criterion, using classical clustering methods such as Hierarchical or K-Medians clustering, or it can combine several speaker similarity criteria by means of a newly developed clustering method called Focal Speakers Selection. In this framework, four speaker similarity criteria are tested and three speaker clustering algorithms are compared. Results pertaining to the collection of the Neologos database are also discussed.
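As a rough illustration of selection with respect to a single similarity criterion, the sketch below picks k reference speakers from a pairwise dissimilarity matrix by alternating assignment and re-selection, in the spirit of K-Medians clustering. It is not the chapter's implementation: the function names, the toy one-dimensional "speaker positions", and the quality criterion (sum over speakers of the dissimilarity to the closest reference) are all assumptions made for the example.

```python
import random


def selection_cost(dissim, refs):
    """Sum, over all speakers, of the dissimilarity to the closest reference."""
    return sum(min(row[r] for r in refs) for row in dissim)


def select_references(dissim, k, n_iter=100, seed=0):
    """Pick k reference speakers with a K-Medians-style alternating search.

    dissim[i][r] is the (assumed) cost of representing speaker i by speaker r.
    This is a local search: it may stop at a local optimum of the criterion.
    """
    n = len(dissim)
    rng = random.Random(seed)
    refs = rng.sample(range(n), k)  # hypothetical random initialization
    for _ in range(n_iter):
        # Step 1: assign every speaker to its closest current reference.
        clusters = {r: [] for r in refs}
        for i in range(n):
            closest = min(refs, key=lambda r: dissim[i][r])
            clusters[closest].append(i)
        # Step 2: in each cluster, keep the member with the lowest total
        # dissimilarity to the other members as the new reference.
        new_refs = []
        for ref, members in clusters.items():
            if not members:  # can only happen when dissimilarities tie
                new_refs.append(ref)
                continue
            best = min(members, key=lambda c: sum(dissim[i][c] for i in members))
            new_refs.append(best)
        if set(new_refs) == set(refs):  # no change: converged
            break
        refs = new_refs
    return sorted(refs)


# Toy data (invented): six speakers placed on a line, forming two groups.
positions = [0, 1, 2, 50, 51, 52]
dissim = [[abs(a - b) for b in positions] for a in positions]

refs = select_references(dissim, k=2)
print(refs, selection_cost(dissim, refs))
```

On this toy matrix the search settles on the middle speaker of each group. Combining several similarity criteria, as Focal Speakers Selection does, is not covered by this sketch.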
Document type:
Book chapter
Christian Müller. Speaker Classification II, 4441, Springer Berlin / Heidelberg, pp.276-292, 2007, Lecture Notes in Computer Science, 978-3-540-74121-3. 〈10.1007/978-3-540-74122-0_21〉

https://hal.inria.fr/inria-00187732
Contributor: Dominique Fohr
Submitted on: Thursday, November 15, 2007 - 11:02:27
Last modified on: Wednesday, May 16, 2018 - 11:23:03


Citation

Frédéric Bimbot, Olivier Boëffard, Delphine Charlet, Dominique Fohr, Sacha Krstulovic, et al.. Selecting Representative Speakers for a Speech Database on the Basis of Heterogeneous Similarity Criteria. Christian Müller. Speaker Classification II, 4441, Springer Berlin / Heidelberg, pp.276-292, 2007, Lecture Notes in Computer Science, 978-3-540-74121-3. 〈10.1007/978-3-540-74122-0_21〉. 〈inria-00187732〉
