Conference papers

Extraction d'entités dans des collections évolutives (Entity extraction in evolving collections)

Thierry Despeyroux 1 Eduardo Fraschini 1 Anne-Marie Vercoustre 1
1 AxIS - Usage-centered design, analysis and improvement of information systems
CRISAM - Inria Sophia Antipolis - Méditerranée, Inria Paris-Rocquencourt
Abstract : The goal of our work is to extract named entities from a set of reports, in our case the names of industrial or academic partners. Starting with an initial list of entities, we use a first set of documents to identify syntactic patterns, which are then validated in a supervised learning phase on a set of annotated documents. The complete collection is then explored. This approach is similar to those used for data extraction from semi-structured documents (wrappers) and requires neither linguistic resources nor a large training set. Because our collection of documents will evolve over the years, we hope that extraction performance will improve as the training set grows.
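The pipeline outlined in the abstract (seed entities, pattern discovery, validation on annotated documents, exploration of the full collection) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that a "syntactic pattern" is simply the two tokens preceding a known entity, that candidate entities are runs of capitalized tokens, and that validation is a precision threshold. All function names, parameters, and sample texts are hypothetical.

```python
import re
from collections import Counter


def learn_patterns(docs, seeds, width=2):
    """Count the `width` tokens that precede each seed occurrence (candidate patterns)."""
    patterns = Counter()
    for doc in docs:
        for seed in seeds:
            for m in re.finditer(re.escape(seed), doc):
                left = doc[:m.start()].split()[-width:]
                if len(left) == width:
                    patterns[" ".join(left)] += 1
    return patterns


def apply_pattern(pattern, doc):
    """Return the runs of capitalized tokens that directly follow `pattern` in `doc`."""
    regex = re.escape(pattern) + r"\s+([A-Z][\w&-]*(?:\s+[A-Z][\w&-]*)*)"
    return [m.group(1) for m in re.finditer(regex, doc)]


def validate(patterns, annotated, min_precision=0.8):
    """Keep patterns whose extractions on (text, gold_entities) pairs are precise enough."""
    kept = []
    for pattern in patterns:
        hits = correct = 0
        for text, gold in annotated:
            for entity in apply_pattern(pattern, text):
                hits += 1
                correct += entity in gold
        if hits and correct / hits >= min_precision:
            kept.append(pattern)
    return kept


def extract(docs, patterns):
    """Apply the validated patterns to the whole collection."""
    found = set()
    for doc in docs:
        for pattern in patterns:
            found.update(apply_pattern(pattern, doc))
    return found


# Hypothetical data, for illustration only.
seeds = ["Acme Labs"]
first_docs = [
    "We work in collaboration with Acme Labs on indexing.",
    "We thank Acme Labs for support.",
]
annotated = [
    ("We thank Professor Smith. In collaboration with Globex University "
     "we built a prototype.", {"Globex University"}),
]
collection = ["This was done in collaboration with Initech this year."]

candidates = learn_patterns(first_docs, seeds)
validated = validate(candidates, annotated)
print(validated)
print(extract(collection, validated))
```

In this toy setting, the annotated document is what rejects the noisy context "We thank", which matches a person rather than a partner. Entities found in new documents could be added to the seed list, which is one way a growing collection might improve extraction over time.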

Cited literature: 8 references

Contributor : Anne-Marie Vercoustre
Submitted on : Friday, July 20, 2007 - 4:46:29 PM
Last modification on : Friday, February 4, 2022 - 3:13:47 AM
Long-term archiving on : Friday, November 25, 2016 - 7:22:56 PM


Files produced by the author(s)


  • HAL Id : inria-00116910, version 4
  • ARXIV : 0706.2797



Thierry Despeyroux, Eduardo Fraschini, Anne-Marie Vercoustre. Extraction d'entités dans des collections évolutives. 7ièmes Journées francophones Extraction et Gestion des Connaissances EGC 2007, Jan 2007, Namur, Belgique. pp.533-538. ⟨inria-00116910v4⟩


