Analysis and Automatic Classification of Some Discourse Particles on a Large Set of French Spoken Corpora

Abstract : In French, quite a number of words and expressions are frequently used as discourse particles in spoken language, especially in spontaneous speech. The semantic load of these words or expressions differs depending on whether or not they are used as discourse particles. Therefore, the correct identification of their discourse function remains of great importance. In this paper, the distribution of discourse versus non-discourse use, and of the detailed discourse functions of some of these words, is studied on a large set of French corpora ranging from prepared speech (e.g. storytelling and broadcast news) to spontaneous speech (e.g. interviews and interactions between people). The paper focuses on a subset of discourse particles that recur in the considered corpora. The discourse function of a few thousand occurrences of these words has been manually annotated. A statistical analysis of the functions of the words is presented and discussed with respect to the types of spoken corpora. Finally, some statistics on a few prosodic correlates of the discourse particles are presented, together with results of automatic classification and detection of the word function (discourse particle or not) using prosodic features.
Document type :
Conference papers

https://hal.inria.fr/hal-01585567
Contributor : Denis Jouvet
Submitted on : Monday, September 11, 2017 - 4:20:06 PM
Last modification on : Tuesday, December 18, 2018 - 4:38:02 PM

File

DiscourseParticles-SLSP-v1.1-s...
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01585567, version 1

Citation

Denis Jouvet, Katarina Bartkova, Mathilde Dargnat, Lou Lee. Analysis and Automatic Classification of Some Discourse Particles on a Large Set of French Spoken Corpora. SLSP'2017, 5th International Conference on Statistical Language and Speech Processing, Oct 2017, Le Mans, France. ⟨hal-01585567⟩
