
Combining multiple resources to build reliable wordnets

Abstract : This paper compares automatically generated synsets (sets of synonyms) in the French and Slovene wordnets with respect to the resources used to construct them. Polysemous words were disambiguated via a five-language word alignment of the SEERA.NET parallel corpus, a subcorpus of the JRC Acquis; the extracted multilingual lexicon was then disambiguated against the existing wordnets for these languages. For monosemous words, a bilingual approach sufficed to acquire equivalents: bilingual lexicons were extracted from several resources, including Wikipedia, Wiktionary and the EUROVOC thesaurus. A representative sample of the generated synsets was evaluated against gold standards.
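
The abstract does not spell out how the aligned lexicon is disambiguated against existing wordnets. One common way to do this, shown here only as a minimal sketch under assumptions, is to intersect the synset identifiers of the words aligned across languages. The language codes, words, synset IDs, the `WORDNETS` table and the `disambiguate` function are all hypothetical and chosen for illustration; they are not taken from the paper.

```python
# Hypothetical toy wordnets: language -> word -> set of synset IDs.
# All entries and IDs are invented for illustration.
WORDNETS = {
    "en": {"bank": {"s1", "s2"}},
    "cs": {"banka": {"s1"}},
    "ro": {"bancă": {"s1", "s3"}},
}


def disambiguate(aligned):
    """Return the synset IDs shared by all aligned words found in a wordnet.

    `aligned` maps a language code to the word aligned in that language.
    Languages without a wordnet entry for their word are skipped.
    """
    candidate_sets = [
        WORDNETS[lang][word]
        for lang, word in aligned.items()
        if lang in WORDNETS and word in WORDNETS[lang]
    ]
    if not candidate_sets:
        return set()
    # Keep only the senses that every available wordnet agrees on.
    return set.intersection(*candidate_sets)


# The French word has no wordnet entry here; it would inherit the shared sense.
shared = disambiguate({"en": "bank", "cs": "banka", "ro": "bancă", "fr": "banque"})
print(shared)
```

In this sketch, a target-language word such as the French one would be attached to whatever synset IDs survive the intersection; an empty result means the alignment gives no usable evidence.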
Document type :
Conference papers
Contributor : Benoît Sagot
Submitted on : Monday, August 15, 2011 - 11:25:04 AM
Last modification on : Thursday, February 11, 2021 - 2:38:02 PM
Long-term archiving on : Friday, November 25, 2011 - 11:11:50 AM




  • HAL Id : inria-00614706, version 1



Darja Fišer, Benoît Sagot. Combining multiple resources to build reliable wordnets. TSD 2008 - Text Speech and Dialogue, 2008, Brno, Czech Republic. ⟨inria-00614706⟩


