Conference papers

Using MMIL for the High Level Semantic Annotation of the French MEDIA Dialogue Corpus

Lina Maria Rojas Barahona 1, Thierry Bazillon 2, Matthieu Quignard 3, Fabrice Lefèvre 2
1 TALARIS - Natural Language Processing: representation, inference and semantics
Inria Nancy - Grand Est, LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
Abstract: The MultiModal Interface Language (MMIL) formalism has been selected as the High Level Semantic (HLS) formalism for annotating the French MEDIA dialogue corpus. The corpus consists of human-machine dialogues in the domain of hotel reservation and tourist information. Utterances in these dialogues were previously annotated with a flat concept-value semantics for studying and evaluating spoken language understanding modules in dialogue systems. We now investigate the use of more complex representations to improve understanding capability. MMIL is a high-level intermediate semantic formalism that carries relevant linguistic information, from syntax up to discourse. This representation should increase the expressivity of the current annotation, though at the cost of a more complex annotation process. In this paper we present our first attempt at defining guidelines for the HLS annotation of the MEDIA corpus and its effect on the annotation process itself, as revealed by annotators' disagreements stemming from the different levels of hierarchy and the granularity of the features defined in MMIL.
Document type: Conference papers

Cited literature [4 references]

https://hal.inria.fr/inria-00638000
Contributor: Lina Maria Rojas Barahona
Submitted on: Thursday, November 3, 2011 - 5:10:04 PM
Last modification on: Friday, February 4, 2022 - 3:31:08 AM
Long-term archiving on: Thursday, November 15, 2012 - 11:06:04 AM

File

HLSCertifiedAnnIWCSShort.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: inria-00638000, version 1

Citation

Lina Maria Rojas Barahona, Thierry Bazillon, Matthieu Quignard, Fabrice Lefèvre. Using MMIL for the High Level Semantic Annotation of the French MEDIA Dialogue Corpus. Ninth International Conference on Computational Semantics - IWCS 2011, ACL, Jan 2011, London, United Kingdom. ⟨inria-00638000⟩
