Extraction d'entités dans des collections évolutives

Thierry Despeyroux; Eduardo Fraschini; Anne-Marie Vercoustre

Conference paper. Year: 2007

Thierry Despeyroux

Role: Author
PersonId : 830028

Usage-centered design, analysis and improvement of information systems

Eduardo Fraschini

Role: Author

Usage-centered design, analysis and improvement of information systems

Anne-Marie Vercoustre

Role: Author
PersonId : 830030

Usage-centered design, analysis and improvement of information systems

Abstract

The goal of our work is to extract named entities from a set of reports, in our case the names of industrial or academic partners. Starting with an initial list of entities, we use a first set of documents to identify syntactic patterns, which are then validated in a supervised learning phase on a set of annotated documents. The complete collection is then explored. This approach is similar to those used for data extraction from semi-structured documents (wrappers) and needs neither linguistic resources nor a large training set. As our collection of documents will evolve over the years, we hope that extraction performance will improve as the training set grows.
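
The abstract gives no implementation details, so the following is only a minimal sketch of the three stages it names (pattern identification from seed entities, validation on annotated documents, exploration of the collection). It assumes, purely for illustration, that a "pattern" is the pair of words immediately to the left of a known entity and that a candidate entity is a run of capitalized tokens. All function names, the precision threshold, and the sample texts are hypothetical and are not taken from the paper.

```python
import re
from collections import Counter

# Assumption: a candidate entity is one or more consecutive capitalized tokens.
CANDIDATE = r"((?:[A-Z][\w&-]*)(?:\s+[A-Z][\w&-]*)*)"


def learn_patterns(docs, seeds, width=2):
    """Count the `width` words found just before each seed entity occurrence."""
    counts = Counter()
    for doc in docs:
        for seed in seeds:
            for m in re.finditer(re.escape(seed), doc):
                left = doc[:m.start()].split()[-width:]
                if len(left) == width:
                    counts[" ".join(left)] += 1
    return counts


def apply_pattern(left, doc):
    """Return the capitalized candidates that follow the left context in doc."""
    context = r"\s+".join(re.escape(word) for word in left.split())
    regex = r"\b" + context + r"\s+" + CANDIDATE
    return {m.group(1) for m in re.finditer(regex, doc)}


def validate(patterns, annotated, min_precision=0.8):
    """Keep patterns whose precision on annotated (text, gold set) pairs is high enough."""
    kept = []
    for left in patterns:
        found = correct = 0
        for text, gold in annotated:
            hits = apply_pattern(left, text)
            found += len(hits)
            correct += len(hits & gold)
        if found and correct / found >= min_precision:
            kept.append(left)
    return kept


def extract(docs, patterns):
    """Apply every validated pattern to every document of the collection."""
    entities = set()
    for doc in docs:
        for left in patterns:
            entities |= apply_pattern(left, doc)
    return entities


# Hypothetical data, invented for this sketch.
seeds = {"INRIA"}
first_docs = [
    "We worked in collaboration with INRIA on this project.",
    "This is joint work with INRIA and others.",
]
annotated = [
    ("A study in collaboration with CSIRO was completed.", {"CSIRO"}),
    ("They work with Great care on everything.", set()),
]
collection = [
    "Funded in collaboration with Siemens AG since 2005.",
    "A tool built in collaboration with the team.",
]

candidate_patterns = learn_patterns(first_docs, seeds)
validated = validate(candidate_patterns, annotated)
partners = extract(collection, validated)
```

In this toy run the context "work with" is learned from the seed but rejected during validation because it also captures a non-entity, which illustrates why the annotated set matters. As the collection grows, newly confirmed entities could be added to `seeds` and the three steps repeated.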

Keywords

Entity extraction; wrapping method; extraction pattern

Domains

Information retrieval [cs.IR]; Text and document processing

Main file

etam.pdf (36.48 KB)

Origin: Files produced by the author(s)

Contributor: Anne-Marie Vercoustre

https://inria.hal.science/inria-00116910

Submitted on: Friday, 13 July 2007, 12:09:15

Last modified on: Wednesday, 15 March 2023, 08:58:08

Long-term archiving on: Wednesday, 29 March 2017, 13:37:01

Dates and versions

inria-00116910 , version 1 (28-11-2006)

inria-00116910 , version 2 (19-06-2007)

inria-00116910 , version 3 (13-07-2007)

inria-00116910 , version 4 (20-07-2007)

Identifiers

HAL Id : inria-00116910 , version 3
ARXIV : 0706.2797

Cite

Thierry Despeyroux, Eduardo Fraschini, Anne-Marie Vercoustre. Extraction d'entités dans des collections évolutives. 7ièmes Journées francophones Extraction et Gestion des Connaissances EGC 2007, Jan 2007, Namur, Belgium. ⟨inria-00116910v3⟩

109 views

73 downloads
