
Extraction automatique de cadres de sous-catégorisation verbale pour le français à partir d'un corpus arboré

Abstract : We present work on the automatic extraction of subcategorisation frames for 1362 French verbs. We use a treebank of 15000 sentences, from which we extract 12510 verb occurrences. We first evaluate the results using a functional representation of frames, which yields 39 different frames, 1.54 per lemma on average. We then adopt a mixed representation (functions and categories), which leads to 925 different frames, 3.44 per lemma on average. We investigate several methods for reducing ambiguity (e.g., neutralisation of passive forms or clitic arguments), which bring the inventory down to 235 frames, 1.94 per lemma on average. We also present a brief comparison with existing work on French and English.
Document type : Conference papers

https://hal.inria.fr/inria-00420997
Contributor : Anna Kupsc
Submitted on : Wednesday, September 30, 2009 - 12:53:30 PM
Last modification on : Saturday, July 17, 2021 - 3:52:33 AM

Identifiers

  • HAL Id : inria-00420997, version 1

Citation

Anna Kupść. Extraction automatique de cadres de sous-catégorisation verbale pour le français à partir d'un corpus arboré. TALN, 2007, Toulouse, France. ⟨inria-00420997⟩
