Conference Papers Year : 2008

Combining multiple resources to build reliable wordnets

Darja Fišer, Benoît Sagot

Abstract

This paper compares automatically generated sets of synonyms in French and Slovene wordnets with respect to the resources used in the construction process. Polysemous words were disambiguated via a five-language word alignment of the SEERA.NET parallel corpus, a subcorpus of the JRC Acquis. The extracted multilingual lexicon was disambiguated with the existing wordnets for these languages. For monosemous words, by contrast, a bilingual approach sufficed to acquire equivalents. Bilingual lexicons were extracted from different resources, including Wikipedia, Wiktionary and the EUROVOC thesaurus. A representative sample of the generated synsets was evaluated against the gold standards.
Main file
TSD08.pdf (149.31 KB)
Origin : Files produced by the author(s)

Dates and versions

inria-00614706 , version 1 (15-08-2011)

Identifiers

  • HAL Id : inria-00614706 , version 1

Cite

Darja Fišer, Benoît Sagot. Combining multiple resources to build reliable wordnets. TSD 2008 - Text Speech and Dialogue, 2008, Brno, Czech Republic. ⟨inria-00614706⟩