Conference papers

Extraction d'entités dans des collections évolutives (Entity extraction in evolving collections)

Thierry Despeyroux 1 Eduardo Fraschini 1 Anne-Marie Vercoustre 1
1 AxIS - Usage-centered design, analysis and improvement of information systems
CRISAM - Inria Sophia Antipolis - Méditerranée, Inria Paris-Rocquencourt
Abstract : The goal of our work is to extract named entities from a set of reports, in our case the names of industrial or academic partners. Starting with an initial list of entities, we use a first set of documents to identify syntactic patterns, which are then validated in a supervised learning phase on a set of annotated documents. The complete collection is then explored. This approach is similar to those used for data extraction from semi-structured documents (wrappers) and requires neither linguistic resources nor a large training set. As our collection of documents will evolve over the years, we hope that extraction performance will improve as the training set grows.
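
The three phases named in the abstract (pattern identification from seed entities, validation on annotated documents, exploration of the collection) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the choice of the preceding word as the "syntactic pattern", the capitalized-token shape of a candidate name, the precision threshold, and the sample sentences are all assumptions made for illustration.

```python
import re
from collections import Counter

# Assumed shape of a candidate name: one or more capitalized tokens.
NAME = r"((?:[A-Z][\w&-]*)(?:\s+[A-Z][\w&-]*)*)"

def learn_patterns(seeds, documents, width=1):
    """Collect the `width` words preceding each seed occurrence as a left-context pattern."""
    counts = Counter()
    for doc in documents:
        for seed in seeds:
            for m in re.finditer(re.escape(seed), doc):
                left = tuple(doc[:m.start()].split()[-width:])
                if left:  # skip seeds found at the very start of a document
                    counts[left] += 1
    return counts

def extract(left, text):
    """Return the candidate names that directly follow the left context in `text`."""
    prefix = r"\s+".join(re.escape(word) for word in left)
    return set(re.findall(r"\b" + prefix + r"\s+" + NAME, text))

def validate(patterns, annotated, min_precision=0.8):
    """Keep the patterns whose precision on annotated documents reaches the threshold."""
    kept = []
    for left in patterns:
        found = correct = 0
        for text, gold in annotated:
            hits = extract(left, text)
            found += len(hits)
            correct += len(hits & gold)
        if found and correct / found >= min_precision:
            kept.append(left)
    return kept

def explore(patterns, collection):
    """Apply the validated patterns to every document of the collection."""
    entities = set()
    for doc in collection:
        for left in patterns:
            entities |= extract(left, doc)
    return entities

# Hypothetical data, for illustration only.
seeds = {"Inria", "France Telecom"}
first_set = [
    "We collaborate with Inria on web mining.",
    "A joint project with France Telecom started in 2005.",
]
annotated = [
    ("A contract with Alcatel covers this work.", {"Alcatel"}),
    ("We also worked with Philips Research last year.", {"Philips Research"}),
]
collection = [
    "The team signed an agreement with Thales in 2006.",
    "No partner is named here.",
]

patterns = validate(learn_patterns(seeds, first_set), annotated)
partners = explore(patterns, collection)
print(partners)
```

In this sketch, newly confirmed entities could be added to `seeds` and newly annotated reports to `annotated` as the collection grows, which is one way to read the hope expressed in the abstract that performance improves with the size of the training set.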

https://hal.inria.fr/inria-00116910
Contributor : Anne-Marie Vercoustre
Submitted on : Friday, July 20, 2007 - 4:46:29 PM
Last modification on : Wednesday, May 30, 2018 - 10:30:34 AM
Long-term archiving on : Friday, November 25, 2016 - 7:22:56 PM

Files

etam.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : inria-00116910, version 4
  • ARXIV : 0706.2797

Citation

Thierry Despeyroux, Eduardo Fraschini, Anne-Marie Vercoustre. Extraction d'entités dans des collections évolutives. 7ièmes Journées francophones Extraction et Gestion des Connaissances EGC 2007, Jan 2007, Namur, Belgique. pp.533-538. ⟨inria-00116910v4⟩

Metrics

Record views

305

Files downloads

161