Automatic Validation of Terminology by Means of Formal Concept Analysis
Abstract
Term extraction tools extract candidate terms and annotate their occurrences in texts. However, not all of these occurrences are terminological, and distinguishing when a candidate term is actually used with a terminological meaning remains very challenging. The validation of term annotations is presented as a binary classification model that labels each term occurrence as terminological or non-terminological. A context-based hypothesis approach is applied to a training corpus: we assume that the words in the sentence containing the studied occurrence can be used to build positive and negative hypotheses, which are then used to classify undetermined examples. The method is applied and evaluated on a French corpus in the linguistics domain, and we also mention some improvements suggested by a quantitative and qualitative evaluation.
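To make the idea of positive and negative hypotheses concrete, the following is a minimal sketch, not the authors' implementation. It assumes that each occurrence is represented by the set of words in its sentence, that a hypothesis is a non-empty intersection of two or more contexts of one class that is not contained in any context of the opposite class, and that an occurrence is classified only when it matches hypotheses of exactly one class. The function names and the toy English contexts are invented for illustration.

```python
from itertools import combinations


def hypotheses(own, other):
    """Build simplified hypotheses for one class (an assumed formulation).

    A hypothesis is a non-empty intersection of two or more `own` contexts
    that is not a subset of any `other` context.
    """
    own = [frozenset(c) for c in own]
    other = [frozenset(c) for c in other]
    closed = set()
    # Start from pairwise intersections, then close under intersection.
    frontier = {a & b for a, b in combinations(own, 2)}
    while frontier:
        closed |= frontier
        frontier = {h & c for h in frontier for c in own} - closed
    return {h for h in closed if h and not any(h <= o for o in other)}


def classify(context, pos_hyps, neg_hyps):
    """Label a context by the hypotheses it contains."""
    ctx = frozenset(context)
    pos = any(h <= ctx for h in pos_hyps)
    neg = any(h <= ctx for h in neg_hyps)
    if pos and not neg:
        return "terminological"
    if neg and not pos:
        return "non-terminological"
    return "undetermined"  # no match, or contradictory matches


# Invented toy contexts for occurrences of the candidate term "subject".
positive = [{"subject", "verb", "agrees"}, {"subject", "verb", "sentence"}]
negative = [{"subject", "discussion", "today"}, {"subject", "meeting", "today"}]

pos_hyps = hypotheses(positive, negative)
neg_hyps = hypotheses(negative, positive)

print(classify({"subject", "verb", "plural"}, pos_hyps, neg_hyps))
print(classify({"subject", "today", "email"}, pos_hyps, neg_hyps))
print(classify({"subject", "matter"}, pos_hyps, neg_hyps))
```

In this sketch an occurrence that matches hypotheses of both classes, or of neither, stays undetermined. That is one possible design choice; a real system might instead rank or weight competing hypotheses.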
Origin: Files produced by the author(s)