Le système WoDiS - WOlf & DIStributions pour la substitution lexicale [The WoDiS system - WOlf & DIStributions for lexical substitution]

Kata Gábor 1
1 ALPAGE - Analyse Linguistique Profonde à Grande Echelle ; Large-scale deep linguistic processing
Inria Paris-Rocquencourt, UPD7 - Université Paris Diderot - Paris 7
Abstract : In this paper we describe the WoDiS system, as entered in the SemDis-TALN2014 lexical substitution shared task. Substitution candidates are generated from WOLF (WordNet Libre du Français) and clustered according to the structure of the synsets that contain them, so that the clusters reflect the different senses of the target word. These senses are represented in a vector space specific to the target word, built from distributional data extracted from a corpus. This vector space is then mapped to the context using simple topical similarity metrics of the kind used in document classification. To overcome data sparseness when representing the less frequent senses, we apply a lexical expansion method that allows us to extract a larger number of relevant contexts and to compensate for the bias present in corpus-based distributional vectors. Our system ranked fourth in the final evaluation.
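The sense-to-context matching step described in the abstract can be illustrated with a minimal sketch. This is not the WoDiS implementation: the abstract does not name the similarity metric or the vector weighting, so cosine similarity over raw co-occurrence counts is an assumption, and the function names, the target word ("avocat") and all toy data below are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_substitutes(sense_clusters, sense_vectors, context_tokens):
    """Order candidates so that those from the best-matching sense come first.

    sense_clusters: sense label -> list of substitution candidates
    sense_vectors:  sense label -> Counter of co-occurring words
    context_tokens: tokens of the sentence containing the target word
    """
    context = Counter(context_tokens)
    ordered_senses = sorted(
        sense_clusters,
        key=lambda sense: cosine(sense_vectors[sense], context),
        reverse=True,
    )
    return [cand for sense in ordered_senses for cand in sense_clusters[sense]]


# Invented toy data for the ambiguous French word "avocat" (lawyer / avocado).
SENSE_CLUSTERS = {
    "lawyer": ["défenseur", "juriste"],
    "fruit": ["fruit"],
}
SENSE_VECTORS = {
    "lawyer": Counter({"tribunal": 3, "procès": 2, "juge": 2}),
    "fruit": Counter({"salade": 3, "mûr": 2, "manger": 2}),
}

if __name__ == "__main__":
    sentence = ["une", "salade", "avec", "un", "avocat", "mûr"]
    print(rank_substitutes(SENSE_CLUSTERS, SENSE_VECTORS, sentence))
```

In this toy setting the context shares words only with the "fruit" vector, so candidates from that cluster are ranked first. The lexical expansion step mentioned in the abstract is not modeled here.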

https://hal.inria.fr/hal-01022406
Contributor : Kata Gábor
Submitted on : Thursday, July 10, 2014 - 1:16:16 PM
Last modification on : Friday, January 4, 2019 - 5:33:24 PM
Long-term archiving on : Friday, October 10, 2014 - 11:41:52 AM

File

semdis2014_submission_3_1_.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01022406, version 1

Citation

Kata Gábor. Le système WoDiS - WOlf & DIStributions pour la substitution lexicale. Sémantique Distributionnelle - Atelier TALN 2014, Jul 2014, Marseille, France. ⟨hal-01022406⟩
