Cooperation of statistical and symbolic methods for the unsupervised adaptation of a named entity tagging system

Abstract : Named entity recognition and typing can be performed by both symbolic and probabilistic systems. We report on an experiment in which the rule-based system NP, a high-precision system that was developed on AFP news corpora and relies on the Aleda named entity database, interacts with LIANE, a high-recall probabilistic system trained on speech transcriptions from the ESTER corpus. We show that a probabilistic system such as LIANE can be adapted to a new type of corpus in an unsupervised way, using large-scale corpora automatically annotated by NP. This adaptation requires no additional manual annotation and illustrates the complementarity of numeric and symbolic techniques for tackling linguistic tasks.
Document type :
Conference papers

https://hal.inria.fr/inria-00617068
Contributor : Benoît Sagot
Submitted on : Thursday, August 25, 2011 - 10:40:32 PM
Last modification on : Thursday, August 29, 2019 - 2:24:09 PM
Long-term archiving on : Sunday, December 4, 2016 - 6:21:52 PM

File

taln11entnom.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : inria-00617068, version 1

Citation

Frédéric Béchet, Benoît Sagot, Rosa Stern. Coopération de méthodes statistiques et symboliques pour l'adaptation non-supervisée d'un système d'étiquetage en entités nommées. TALN'2011 - Traitement Automatique des Langues Naturelles, Jun 2011, Montpellier, France. ⟨inria-00617068⟩
