Building a free French wordnet from multilingual resources

Abstract : This paper describes the automatic construction of a freely available wordnet for French (WOLF), based on Princeton WordNet (PWN), using various multilingual resources. Polysemous words were handled with an approach in which a parallel corpus for five languages was word-aligned and the extracted multilingual lexicon was disambiguated with the existing wordnets for these languages. For monosemous words, by contrast, a bilingual approach sufficed to acquire equivalents: bilingual lexicons were extracted from Wikipedia and thesauri. The results obtained from each resource were merged and ranked according to the number of resources yielding the same literal. The merged wordnet was evaluated automatically against the French WordNet (FREWN), and a sample of the generated synsets was also evaluated manually. The precision figures indicate that the approach is very promising, and applications that use the resulting wordnet are already planned.
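The merge-and-rank step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `merge_and_rank`, the resource labels, the synset identifiers and the toy French literals are all hypothetical, and the only idea taken from the abstract is that a literal proposed by more resources ranks higher.

```python
from collections import defaultdict


def merge_and_rank(candidates_by_resource):
    """Merge (synset_id, literal) candidates from several resources.

    Returns a dict mapping each synset id to a list of
    (literal, number_of_supporting_resources) pairs, sorted so that
    literals proposed by more resources come first.
    """
    # Record which resources proposed each (synset, literal) pair.
    support = defaultdict(set)
    for resource, pairs in candidates_by_resource.items():
        for synset_id, literal in pairs:
            support[(synset_id, literal)].add(resource)

    # Group the literals by synset, keeping the support count.
    merged = defaultdict(list)
    for (synset_id, literal), resources in support.items():
        merged[synset_id].append((literal, len(resources)))

    # Rank by descending support; ties are broken alphabetically.
    for synset_id in merged:
        merged[synset_id].sort(key=lambda item: (-item[1], item[0]))
    return dict(merged)


# Toy, made-up candidates (hypothetical ids and resource names).
candidates = {
    "parallel_corpus": [("synset-A", "chien"), ("synset-B", "chat")],
    "wikipedia": [("synset-A", "chien"), ("synset-A", "toutou")],
    "thesaurus": [("synset-A", "chien"), ("synset-B", "chat"), ("synset-B", "matou")],
}
ranked = merge_and_rank(candidates)
for synset_id in sorted(ranked):
    print(synset_id, ranked[synset_id])
```

Counting distinct resources rather than raw occurrences means that a literal repeated many times within a single resource does not outrank one confirmed independently by several resources.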
Document type :
Conference papers

https://hal.inria.fr/inria-00614708
Contributor : Benoît Sagot
Submitted on : Monday, August 15, 2011 - 11:34:04 AM
Last modification on : Thursday, August 29, 2019 - 2:24:09 PM
Long-term archiving on : Friday, November 25, 2011 - 11:11:56 AM

File

Ontolex08.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : inria-00614708, version 1

Citation

Benoît Sagot, Darja Fišer. Building a free French wordnet from multilingual resources. OntoLex, May 2008, Marrakech, Morocco. ⟨inria-00614708⟩
