Automates lexico-phonétiques pour l'indexation et la recherche de segments de parole

Julien Fayolle 1 Fabienne Moreau 1 Christian Raymond 1 Guillaume Gravier 1
1 TEXMEX - Multimedia content-based indexing
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, Inria Rennes – Bretagne Atlantique
Abstract : This paper presents a method for indexing spoken utterances that combines lexical and phonetic hypotheses in a hybrid index built from automata. Retrieval relies on a lexical-phonetic, semi-imperfect matching designed to improve recall. Each transition is weighted by a feature vector containing edit distance scores and a confidence measure, which helps filter the list of candidate utterances for a more precise search. Experimental results show that the lexical and phonetic representations are complementary, and the hybrid search is compared with the state-of-the-art cascaded search on named entity queries.

https://hal.inria.fr/hal-00742848
Contributor : Christian Raymond
Submitted on : Wednesday, October 17, 2012 - 1:40:58 PM
Last modification on : Friday, November 16, 2018 - 1:22:24 AM
Long-term archiving on : Saturday, December 17, 2016 - 2:16:36 AM

File

JEP2012.pdf
Publisher files allowed on an open archive

Identifiers

  • HAL Id : hal-00742848, version 1

Citation

Julien Fayolle, Fabienne Moreau, Christian Raymond, Guillaume Gravier. Automates lexico-phonétiques pour l'indexation et la recherche de segments de parole. JEP - Journées d'Études sur la Parole, Jun 2012, Grenoble, France. pp.49-56. ⟨hal-00742848⟩
