The IFCASL Corpus of French and German Non-native and Native Read Speech

Abstract: The IFCASL corpus is a French-German bilingual phonetic learner corpus designed, recorded and annotated in a project on individualized feedback in computer-assisted spoken language learning. The motivation for setting up this corpus was that there is no phonetically annotated and segmented corpus for this language pair of comparable size and coverage. In contrast to most learner corpora, the IFCASL corpus incorporates data for a language pair in both directions, i.e. in our case French learners of German and German learners of French. In addition, the corpus is complemented by two sub-corpora of native speech by the same speakers. The corpus provides spoken data by about 100 speakers with comparable productions, annotated and segmented at the word and phone levels, with more than 50% manually corrected data. The paper reports on inter-annotator agreement and the optimization of the acoustic models for forced speech-text alignment in exercises for computer-assisted pronunciation training. Example studies based on the corpus data with a phonetic focus include topics such as the realization of /h/ and glottal stop, final devoicing of obstruents, vowel quantity and quality, pitch range, and tempo.
Document type:
Conference paper
LREC'2016, 10th edition of the Language Resources and Evaluation Conference, May 2016, Portorož, Slovenia. Proceedings LREC'2016

https://hal.inria.fr/hal-01293935
Contributor: Denis Jouvet
Submitted on: Friday, 25 March 2016 - 17:20:52
Last modified on: Wednesday, 14 March 2018 - 16:38:58
Document(s) archived on: Monday, 14 November 2016 - 06:02:06

File

LREC_2016--251_Paper_2016.03.1...
Files produced by the author(s)

Identifiers

  • HAL Id: hal-01293935, version 1

Citation

Jürgen Trouvain, Anne Bonneau, Vincent Colotte, Camille Fauth, Dominique Fohr, et al. The IFCASL Corpus of French and German Non-native and Native Read Speech. LREC'2016, 10th edition of the Language Resources and Evaluation Conference, May 2016, Portorož, Slovenia. Proceedings LREC'2016. 〈hal-01293935〉
