Conference paper, Year: 2004

Experiments on the Construction of a Phonetically Balanced Corpus from the Web

Luis Villaseñor-Pineda, Manuel Montes-Y-Gómez, Dominique Vaufreydaz, Jean-François Serignat

Abstract

Building a speech recognition system requires a recorded set of phrases from which the relevant acoustic models are computed. This set of phrases must be phonetically rich and balanced in order to obtain a robust recognizer. Traditionally, the set is defined manually, which demands considerable human effort. In this paper we propose an automated method for assembling a phonetically balanced corpus (set of phrases) from the Web. The proposed method was used to construct a phonetically balanced corpus for Mexican Spanish.
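
The abstract does not describe the selection procedure, so the following is only an illustrative sketch of what "phonetically balanced" could mean operationally, not the authors' method. It assumes a hypothetical target distribution over units and uses plain letters as a crude stand-in for phonemes; a real system would first convert each phrase to a phoneme sequence. The names `unit_counts`, `distance`, and `select_balanced` are invented for this example.

```python
from collections import Counter


def unit_counts(phrase):
    """Count letters as a crude proxy for phonemes (an assumption of this sketch)."""
    return Counter(ch for ch in phrase.lower() if ch.isalpha())


def distance(counts, target):
    """Sum of absolute differences between observed relative frequencies and the target."""
    total = sum(counts.values())
    if total == 0:
        return float("inf")
    units = set(counts) | set(target)
    return sum(abs(counts.get(u, 0) / total - target.get(u, 0.0)) for u in units)


def select_balanced(phrases, target, k):
    """Greedily pick up to k phrases whose pooled distribution stays closest to the target."""
    chosen, pooled = [], Counter()
    remaining = list(phrases)
    for _ in range(min(k, len(remaining))):
        # Choose the phrase that, once added, leaves the pool closest to the target.
        best = min(remaining, key=lambda p: distance(pooled + unit_counts(p), target))
        chosen.append(best)
        pooled += unit_counts(best)
        remaining.remove(best)
    return chosen


if __name__ == "__main__":
    candidates = ["aaaa", "bbbb", "abab"]
    uniform = {"a": 0.5, "b": 0.5}  # hypothetical target distribution
    print(select_balanced(candidates, uniform, 1))
```

A greedy pass like this is cheap but not optimal; it is shown only to make the notion of balance against a target distribution concrete.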
Main file: Villasenor04.pdf (19.58 KB)
Origin: Files produced by the author(s)

Dates and versions

inria-00326519 , version 1 (03-10-2008)

Identifiers

  • HAL Id : inria-00326519 , version 1

Cite

Luis Villaseñor-Pineda, Manuel Montes-Y-Gómez, Dominique Vaufreydaz, Jean-François Serignat. Experiments on the Construction of a Phonetically Balanced Corpus from the Web. Conference on Intelligent Text Processing and Computational Linguistics CICLing-2004, Feb 2004, Seoul, South Korea. 4 p. ⟨inria-00326519⟩