Annotation en rôles sémantiques du français en domaine spécifique (Semantic role labeling of French in specific domains)

Quentin Pradet 1
1 ALPAGE - Analyse Linguistique Profonde à Grande Echelle ; Large-scale deep linguistic processing
Inria Paris-Rocquencourt, UPD7 - Université Paris Diderot - Paris 7
Abstract : In this Natural Language Processing Ph.D. thesis, we aim to perform semantic role labeling on French domain-specific texts. This task first disambiguates the sense of each predicate in a given text, then annotates the predicate's child chunks with semantic roles such as Agent, Patient or Destination. Semantic role labeling helps many applications in domains where annotated corpora exist, but is difficult to use otherwise. We first evaluate, on the FrameNet corpus, an existing method based on VerbNet, which is what makes the method domain-independent. We show that substantial improvements can be obtained, first by using syntactic information to handle the passive voice, then by using semantic information through the selectional restrictions present in VerbNet. To apply this method to French, we translate two lexical resources: first the WordNet lexical database, then the VerbNet lexicon, which is organized semantically using syntactic information. We obtain its translation, VerbeNet, by reusing two French verb lexicons (the Lexique-Grammaire and Les Verbes Français) and by manually modifying and reorganizing the resulting lexicon. Finally, once those building blocks are in place, we evaluate the feasibility of semantic role labeling of French and English in three specific domains. We study the pros and cons of using VerbNet and VerbeNet to annotate those domains before explaining our future work.
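To make the idea of frame-based role labeling with passive-voice handling more concrete, here is a minimal sketch in Python. It is not the thesis implementation: the lexicon `TOY_FRAMES`, the slot names, and the functions `normalize_passive` and `label_roles` are all hypothetical, and the matching rule is a deliberately simplified stand-in for VerbNet frame matching. Selectional restrictions are not modeled.

```python
from typing import Dict, List

# Hypothetical toy lexicon, loosely in the spirit of VerbNet frames.
# Each frame maps a syntactic slot to a semantic role. Not real VerbNet data.
TOY_FRAMES: Dict[str, List[Dict[str, str]]] = {
    "send": [
        {"subject": "Agent", "object": "Theme", "to": "Destination"},
    ],
}


def normalize_passive(slots: Dict[str, str], passive: bool) -> Dict[str, str]:
    """Map the slots of a passive clause back to active-voice slots."""
    if not passive:
        return dict(slots)
    # Keep every slot except the two that change position in the passive.
    active = {k: v for k, v in slots.items() if k not in ("subject", "by")}
    if "subject" in slots:
        # The passive subject is the logical object.
        active["object"] = slots["subject"]
    if "by" in slots:
        # The "by" phrase, when present, is the logical subject.
        active["subject"] = slots["by"]
    return active


def label_roles(verb: str, slots: Dict[str, str], passive: bool = False) -> Dict[str, str]:
    """Return a chunk-to-role mapping from the first compatible frame."""
    active = normalize_passive(slots, passive)
    for frame in TOY_FRAMES.get(verb, []):
        # Simplified rule: a frame is compatible when every observed slot
        # is licensed by that frame.
        if set(active) <= set(frame):
            return {chunk: frame[slot] for slot, chunk in active.items()}
    return {}


if __name__ == "__main__":
    # "The letter was sent to Paris by Mary."
    print(label_roles(
        "send",
        {"subject": "the letter", "to": "Paris", "by": "Mary"},
        passive=True,
    ))
```

Without the passive normalization step, the same clause would have no compatible frame in this toy lexicon, because no frame licenses a "by" slot. This is the kind of case the syntactic handling mentioned in the abstract is meant to cover.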
Document type :
Theses

Cited literature : 223 references

https://hal.inria.fr/tel-01182711
Contributor : Quentin Pradet
Submitted on : Saturday, August 15, 2015 - 2:28:00 PM
Last modification on : Friday, January 4, 2019 - 5:33:24 PM
Long-term archiving on : Monday, November 16, 2015 - 10:44:12 AM

Licence


Distributed under a Creative Commons Attribution - ShareAlike 4.0 International License

Identifiers

  • HAL Id : tel-01182711, version 1

Citation

Quentin Pradet. Annotation en rôles sémantiques du français en domaine spécifique. Informatique et langage [cs.CL]. Université Paris Diderot (Paris 7), 2015. Français. ⟨tel-01182711⟩
