Conference papers

Extraction d'entités dans des collections évolutives (Entity extraction in evolving collections)

Thierry Despeyroux 1 Eduardo Fraschini 1 Anne-Marie Vercoustre 1
1 AxIS - Usage-centered design, analysis and improvement of information systems
CRISAM - Inria Sophia Antipolis - Méditerranée , Inria Paris-Rocquencourt
Abstract : The goal of our work is to extract named entities from a set of reports, in our case the names of industrial or academic partners. Starting with an initial list of entities, we use a first set of documents to identify syntactic patterns, which are then validated in a supervised learning phase on a set of annotated documents. The complete collection is then explored. This approach is similar to those used for data extraction from semi-structured documents (wrappers) and requires neither linguistic resources nor a large training set. Because our collection of documents will evolve over the years, we hope that extraction performance will improve as the training set grows.
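The pipeline outlined in the abstract (seed entities, pattern identification, validation on annotated documents, exploration of the full collection) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the choice of a two-word left context as the "syntactic pattern", the capitalized-token candidate regex, the precision threshold, the function names, and all sample sentences and partner names are assumptions made purely for illustration.

```python
import re
from collections import Counter

# Assumption: a candidate entity is a run of capitalized tokens.
CANDIDATE = r"([A-Z][\w&-]*(?:\s+[A-Z][\w&-]*)*)"


def learn_patterns(docs, seeds, width=2):
    """Count the `width`-word left contexts that precede known seed entities."""
    patterns = Counter()
    for doc in docs:
        for seed in seeds:
            for m in re.finditer(re.escape(seed), doc):
                left = tuple(doc[:m.start()].split()[-width:])
                if len(left) == width:
                    patterns[left] += 1
    return patterns


def apply_pattern(left, doc):
    """Return the candidate entities that follow the left context in `doc`."""
    regex = r"(?<!\S)" + r"\s+".join(map(re.escape, left)) + r"\s+" + CANDIDATE
    return {m.group(1) for m in re.finditer(regex, doc)}


def validate(patterns, annotated, min_precision=0.8):
    """Keep patterns whose precision on annotated documents is high enough."""
    kept = []
    for left in patterns:
        hits = correct = 0
        for doc, gold in annotated:
            found = apply_pattern(left, doc)
            hits += len(found)
            correct += len(found & gold)
        if hits and correct / hits >= min_precision:
            kept.append(left)
    return kept


def extract(docs, patterns):
    """Apply every validated pattern to every document of the collection."""
    found = set()
    for doc in docs:
        for left in patterns:
            found |= apply_pattern(left, doc)
    return found


# Invented toy data; the partner names are fictional.
seeds = {"Acme Labs"}
first_docs = ["This work was done in collaboration with Acme Labs on data mining."]
annotated = [("A study in collaboration with Globex was started.", {"Globex"})]
collection = ["We worked in collaboration with Initech and others.", "No partner here."]

kept = validate(learn_patterns(first_docs, seeds), annotated)
new_entities = extract(collection, kept)
```

In this sketch, newly extracted entities could be added to `seeds` as the collection grows, which is one way to read the hope that performance improves with a larger training set.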


https://hal.inria.fr/inria-00116910
Contributor : Anne-Marie Vercoustre
Submitted on : Friday, July 20, 2007 - 4:46:29 PM
Last modification on : Friday, February 4, 2022 - 3:13:47 AM
Long-term archiving on : Friday, November 25, 2016 - 7:22:56 PM

Files

etam.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : inria-00116910, version 4
  • ARXIV : 0706.2797


Citation

Thierry Despeyroux, Eduardo Fraschini, Anne-Marie Vercoustre. Extraction d'entités dans des collections évolutives. 7ièmes Journées francophones Extraction et Gestion des Connaissances EGC 2007, Jan 2007, Namur, Belgique. pp.533-538. ⟨inria-00116910v4⟩

