Tree-Structured Named Entities Extraction from Competing Speech Transcriptions

Davy Weissenbacher 1, * Christian Raymond 1
* Corresponding author
1 LinkMedia - Creating and exploiting explicit links between multimedia fragments
Inria Rennes – Bretagne Atlantique , IRISA-D6 - MEDIA ET INTERACTIONS
Abstract: When real applications work with automatic speech transcription, the first source of error does not originate from incoherence in the analysis of the application but from the noise in the automatic transcriptions. This study presents a simple but effective method to generate a new transcription of better quality by combining utterances from competing transcriptions. We have extended a structured Named Entity (NE) recognizer submitted during the ETAPE Challenge. Working on French TV and radio programs, our system revises the transcriptions provided by making use of the NEs it has detected. Our results suggest that combining the transcribed utterances which optimize the F-measure, rather than those which minimize the WER, allows the generation of a better transcription for NE extraction. The results show a small but significant improvement of 0.9% SER against the baseline system on the ROVER transcription. These are the best performances reported to date on this corpus.
Index Terms: speech transcription, structured named entities, multi-pass decoding.
When real applications work with automatic speech transcription, the first error does not originate from incoherence in the analysis of the application, but from the noise of the automatic transcription outputs. With a rate often close to one in three words transcribed incorrectly, the quality of the preprocessing is low and, as a result, the output analysis of the application is often unusable. An explanation for this low performance of speech recognizers can be found in [8]. Little lexical and syntactic information is effectively used when decoding the acoustic output. More complex information is reintegrated in a second decoding pass where only the best sequences of words produced during the first pass are considered.
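The error rate of roughly one in three words mentioned above is usually reported as a word error rate (WER): the number of word substitutions, deletions and insertions needed to turn the transcription into the reference, divided by the number of reference words. The sketch below is a minimal, generic illustration of that metric and not the scoring tool used in the study; the function name `word_error_rate` and the example sentences are invented for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # previous[j] holds the edit distance between the reference prefix
    # processed so far and the first j words of the hypothesis.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1       # reference word missing in hypothesis
            insertion = current[j - 1] + 1   # extra word in hypothesis
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)


# Invented example: one substitution and one deletion over six reference words.
reference = "the president of france visited rennes"
hypothesis = "the present of france visited"
print(round(word_error_rate(reference, hypothesis), 3))  # 0.333
```

In this invented example a single misrecognized word ("present" for "president") is enough to lose a person-role entity, which is why a transcription with a lower WER is not necessarily the better one for NE extraction.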
The main contribution of this study is a simple but effective method to generate a new transcription of better quality by combining several competing transcriptions. Current Automatic Speech Recognition (ASR) systems rely on various strategies and/or resources to recover the utterances originally pronounced. As a consequence, the errors made by competing ASR systems differ, which makes the transcriptions complementary. The ROVER method exploits this complementarity to recombine several transcriptions and output a
Document type:
Conference paper
International Conference on Application of Natural Language to Information Systems, Jun 2015, Passau, Germany. International Conference on Application of Natural Language to Information Systems, 2015, <10.1007/978-3-319-19581-0_22>


https://hal.inria.fr/hal-01196808
Contributor: Christian Raymond
Submitted on: Thursday, 10 September 2015 - 14:27:29
Last modified on: Friday, 17 February 2017 - 16:11:23
Document(s) archived on: Monday, 28 December 2015 - 23:55:51

File

NLDB2015.pdf
Files produced by the author(s)

Citation

Davy Weissenbacher, Christian Raymond. Tree-Structured Named Entities Extraction from Competing Speech Transcriptions. International Conference on Application of Natural Language to Information Systems, Jun 2015, Passau, Germany. International Conference on Application of Natural Language to Information Systems, 2015, <10.1007/978-3-319-19581-0_22>. <hal-01196808>
