
Impact of frame rate on automatic speech-text alignment for corpus-based phonetic studies

Katarina Bartkova 1, * Denis Jouvet 2
* Corresponding author
2 MULTISPEECH - Speech Modeling for Facilitating Oral-Based Communication
Inria Nancy - Grand Est, LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
Abstract : Phonetic segmentation is the basis for many phonetic and linguistic studies. As manual segmentation is a lengthy and tedious task, automatic procedures have been developed over the years. They rely on acoustic Hidden Markov Models. Many studies have been conducted, and refinements developed, for corpus-based speech synthesis, where the technology is mainly used in a speaker-dependent context and applied to good-quality speech signals. In a different research direction, automatic speech-text alignment is also used for phonetic and linguistic studies on large speech corpora. In this case, speaker-independent acoustic models are mandatory, and the speech quality may be lower. The speech models rely on a 10 ms shift between acoustic frames, and their topology leads to strong minimum duration constraints. This paper focuses on the acoustic analysis frame rate, and gives a first insight into the impact of the frame rate on corpus-based phonetic studies.
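The link between frame shift and minimum duration can be illustrated with a short calculation. This is a minimal sketch, not the authors' setup: it assumes a left-to-right phone model with three emitting states and no skip transitions, which the abstract does not specify. The function name `min_phone_duration_ms` and the alternative frame shifts of 5 ms and 2.5 ms are hypothetical and chosen only for illustration. Under that assumption, each emitting state must consume at least one frame, so the shortest segment the aligner can produce is the number of states times the frame shift.

```python
def min_phone_duration_ms(n_emitting_states: int, frame_shift_ms: float) -> float:
    """Shortest segment a left-to-right HMM without skip transitions can align.

    Each emitting state must be visited and must consume at least one
    acoustic frame, so the minimum duration is states * frame shift.
    """
    if n_emitting_states < 1 or frame_shift_ms <= 0:
        raise ValueError("need at least one state and a positive frame shift")
    return n_emitting_states * frame_shift_ms


if __name__ == "__main__":
    # Assumed topology: 3 emitting states per phone (illustrative only).
    # 10 ms is the frame shift named in the abstract; 5 and 2.5 ms are
    # hypothetical shorter shifts used for comparison.
    for shift in (10.0, 5.0, 2.5):
        duration = min_phone_duration_ms(3, shift)
        print(f"frame shift {shift} ms -> minimum phone duration {duration} ms")
```

With these assumed values, halving the frame shift halves the shortest phone segment the alignment can return, which is why the frame rate matters when short segments are of phonetic interest.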
Document type :
Conference papers

https://hal.inria.fr/hal-01183637
Contributor : Denis Jouvet
Submitted on : Monday, August 10, 2015 - 3:01:42 PM
Last modification on : Thursday, March 5, 2020 - 4:51:06 PM
Document(s) archived on : Wednesday, November 11, 2015 - 10:24:53 AM

File

FrameRateAndSpeechTextALignmen...
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01183637, version 1

Citation

Katarina Bartkova, Denis Jouvet. Impact of frame rate on automatic speech-text alignment for corpus-based phonetic studies. ICPhS'2015 - 18th International Congress of Phonetic Sciences, Aug 2015, Glasgow, United Kingdom. ⟨hal-01183637⟩

Metrics

Record views

443

Files downloads

277