Extraction d'entités dans des collections évolutives (Entity extraction in evolving collections)

Thierry Despeyroux; Eduardo Fraschini; Anne-Marie Vercoustre

Conference paper, Year: 2007


Thierry Despeyroux

Role: Author
PersonId : 830028

Usage-centered design, analysis and improvement of information systems

Eduardo Fraschini

Role: Author

Usage-centered design, analysis and improvement of information systems

Anne-Marie Vercoustre

Role: Author
PersonId : 830030

Usage-centered design, analysis and improvement of information systems

Abstract

The goal of our work is to extract named entities, in our case the names of partners, from a set of reports. Starting with an initial list of entities, we use a first set of documents to identify syntactic patterns, which are then validated, and their performance tested, in a supervised learning phase on a set of annotated documents. The complete collection is then explored. This approach derives from the one used for data extraction from semi-structured documents (wrappers) and needs neither linguistic resources nor a large training set. As our collection of documents evolves, we hope that extraction performance will improve year after year.
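The pipeline summarized in the abstract (seed list, pattern discovery, validation on annotated documents, exploration of the full collection) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the two-word left context used as a "pattern", the capitalized-token candidate shape, and the 0.8 precision threshold are all assumptions chosen for illustration, and only the standard library is used.

```python
import re
from collections import Counter

# Assumed candidate shape: one or more space-separated capitalized tokens.
CANDIDATE = r"([A-Z][\w&-]*(?: [A-Z][\w&-]*)*)"

def learn_contexts(seeds, docs, width=2):
    """Count the `width` words found just before each seed occurrence."""
    contexts = Counter()
    for doc in docs:
        for seed in seeds:
            for m in re.finditer(re.escape(seed), doc):
                left = doc[:m.start()].split()[-width:]
                if len(left) == width:
                    contexts[" ".join(left)] += 1
    return contexts

def apply_context(context, doc):
    """Return the candidates that directly follow `context` in `doc`."""
    regex = re.escape(context) + " " + CANDIDATE
    return {m.group(1) for m in re.finditer(regex, doc)}

def validate(contexts, annotated, min_precision=0.8):
    """Keep contexts whose precision on (text, gold_entities) pairs is high enough."""
    kept = []
    for context in contexts:
        found = correct = 0
        for text, gold in annotated:
            hits = apply_context(context, text)
            found += len(hits)
            correct += len(hits & gold)
        if found and correct / found >= min_precision:
            kept.append(context)
    return kept

def extract(contexts, docs):
    """Apply the validated contexts to the whole collection."""
    entities = set()
    for doc in docs:
        for context in contexts:
            entities |= apply_context(context, doc)
    return entities

# Toy data, invented for this example only.
training = [
    "This project was done in collaboration with INRIA and funded by CNRS.",
    "A tool was built in collaboration with CSIRO last year.",
]
annotated = [
    ("The study, funded by Grant ANR-05, ran in collaboration with Philips.",
     {"Philips"}),
]
collection = ["The prototype was tested in collaboration with Xerox Research."]

contexts = learn_contexts(["INRIA", "CSIRO", "CNRS"], training)
validated = validate(contexts, annotated)
print(extract(validated, collection))
```

In this toy run the context "funded by" is discarded during validation because it captures a grant name rather than a partner. For an evolving collection, one plausible design (again an assumption) is to add each year's validated extractions to the seed list before the next round.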

Keywords

Entity extraction; wrapping method; extraction pattern

Domains

Document and text processing; Information retrieval [cs.IR]

Main file

etam.pdf (42.96 KB)

Origin: Files produced by the author(s)


https://inria.hal.science/inria-00116910

Submitted on: Tuesday, 19 June 2007, 15:17:15

Last modified on: Wednesday, 15 March 2023, 08:58:08

Long-term archiving on: Wednesday, 29 March 2017, 13:34:13

Dates and versions

inria-00116910 , version 1 (28-11-2006)

inria-00116910 , version 2 (19-06-2007)

inria-00116910 , version 3 (13-07-2007)

inria-00116910 , version 4 (20-07-2007)

Identifiers

HAL Id : inria-00116910 , version 2
ARXIV : 0706.2797

Cite

Thierry Despeyroux, Eduardo Fraschini, Anne-Marie Vercoustre. Extraction d'entités dans des collections évolutives. 7ièmes Journées francophones Extraction et Gestion des Connaissances EGC 2007, Jan 2007, Namur, Belgique. ⟨inria-00116910v2⟩


