Conference papers

Analysis and Automatic Classification of Some Discourse Particles on a Large Set of French Spoken Corpora

Abstract : In French, many words and expressions are frequently used as discourse particles in spoken language, especially in spontaneous speech. The semantic load of these words or expressions differs depending on whether or not they are used as discourse particles. Therefore, the correct identification of their discourse function is of great importance. This paper studies the distribution of discourse versus non-discourse uses, and of the detailed discourse functions of some of these words, on a large set of French corpora ranging from prepared speech (e.g. storytelling and broadcast news) to spontaneous speech (e.g. interviews and interactions between people). The paper focuses on a subset of discourse particles that recur in the considered corpora. The discourse function of a few thousand occurrences of these words has been manually annotated. A statistical analysis of the functions of the words is presented and discussed with respect to the types of spoken corpora. Finally, some statistics on a few prosodic correlates of the discourse particles are presented, as well as some results of automatic classification and detection of the word function (discourse particle or not) using prosodic features.
Document type :
Conference papers

Cited literature [25 references]
Contributor : Denis Jouvet
Submitted on : Monday, September 11, 2017 - 4:20:06 PM
Last modification on : Tuesday, May 18, 2021 - 6:29:01 PM




  • HAL Id : hal-01585567, version 1


Denis Jouvet, Katarina Bartkova, Mathilde Dargnat, Lou Lee. Analysis and Automatic Classification of Some Discourse Particles on a Large Set of French Spoken Corpora. SLSP'2017, 5th International Conference on Statistical Language and Speech Processing, Oct 2017, Le Mans, France. ⟨hal-01585567⟩


