Extraction d'entités dans des collections évolutives (Entity extraction in evolving collections)

Thierry Despeyroux; Eduardo Fraschini; Anne-Marie Vercoustre

Conference paper. Year: 2007


Thierry Despeyroux

Role: Author
PersonId : 830028

Usage-centered design, analysis and improvement of information systems

Eduardo Fraschini

Role: Author

Usage-centered design, analysis and improvement of information systems

Anne-Marie Vercoustre

Role: Author
PersonId : 830030

Usage-centered design, analysis and improvement of information systems

Abstract

The goal of our work is to extract named entities from a set of reports, in our case the names of industrial or academic partners. Starting with an initial list of entities, we use a first set of documents to identify syntactic patterns, which are then validated in a supervised learning phase on a set of annotated documents. The complete collection is then explored. This approach is similar to those used for data extraction from semi-structured documents (wrappers) and needs neither linguistic resources nor a large training set. As our collection of documents will evolve over the years, we hope that extraction performance will improve as the training set grows.
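The three phases described in the abstract (pattern identification from seed entities, validation on annotated documents, exploration of the full collection) can be illustrated with a minimal Python sketch. The abstract does not say what the patterns look like, so everything here is an assumption made for illustration: a pattern is taken to be the two words preceding a known entity, a candidate entity is a run of capitalized tokens, and validation keeps patterns whose precision on the annotated set reaches a threshold. The function names, the sample sentences and the partner names are hypothetical and are not taken from the paper.

```python
import re
from collections import Counter

# Assumption: a candidate entity is a run of capitalized tokens.
ENTITY = r"([A-Z][\w&-]*(?: [A-Z][\w&-]*)*)"

def learn_patterns(docs, seeds, width=2):
    """Collect the `width` words preceding each seed mention as candidate patterns."""
    patterns = Counter()
    for doc in docs:
        for seed in seeds:
            for m in re.finditer(re.escape(seed), doc):
                left = doc[:m.start()].split()[-width:]
                if len(left) == width:
                    patterns[" ".join(left)] += 1
    return patterns

def apply_pattern(pattern, doc):
    """Return the candidate entities that directly follow `pattern` in `doc`."""
    return set(re.findall(re.escape(pattern) + r"\s+" + ENTITY, doc))

def validate(patterns, annotated, min_precision=0.8):
    """Keep patterns whose precision on (text, gold_entities) pairs is high enough."""
    kept = []
    for p in patterns:
        found = correct = 0
        for text, gold in annotated:
            hits = apply_pattern(p, text)
            found += len(hits)
            correct += len(hits & gold)
        if found and correct / found >= min_precision:
            kept.append(p)
    return kept

def extract(docs, patterns):
    """Apply every validated pattern to every document of the collection."""
    entities = set()
    for doc in docs:
        for p in patterns:
            entities |= apply_pattern(p, doc)
    return entities

# Hypothetical data, invented for this example.
seeds = {"Acme Labs", "Globex"}
first_set = [
    "We have a collaboration with Acme Labs on parsing.",
    "This year, a collaboration with Globex started.",
    "We met Globex in Paris.",
]
annotated = [
    ("Our collaboration with Initech continues.", {"Initech"}),
    ("We met Monday to plan.", set()),
]
collection = ["A new collaboration with Umbrella Corp began in 2006."]

candidates = learn_patterns(first_set, seeds)   # "collaboration with", "We met"
validated = validate(candidates, annotated)     # "We met" is rejected (it matches "Monday")
print(extract(collection, validated))           # {'Umbrella Corp'}
```

In this sketch, entities found in the full collection could be added to the seed list and the process repeated as new reports arrive, which is one way to read the abstract's expectation that performance improves as the collection grows.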

Keywords

Entity extraction; wrapping method; extraction pattern

Domains

Document and Text Processing; Information Retrieval [cs.IR]

Main file

etam.pdf (50.89 KB)

Origin: Files produced by the author(s)

Contributor: Anne-Marie Vercoustre

https://inria.hal.science/inria-00116910

Submitted on: Friday, 20 July 2007, 16:46:29

Last modified on: Wednesday, 15 March 2023, 08:58:08

Long-term archiving on: Friday, 25 November 2016, 19:22:56

Dates and versions

inria-00116910 , version 1 (28-11-2006)

inria-00116910 , version 2 (19-06-2007)

inria-00116910 , version 3 (13-07-2007)

inria-00116910 , version 4 (20-07-2007)

Identifiers

HAL Id : inria-00116910 , version 4
ARXIV : 0706.2797

Cite

Thierry Despeyroux, Eduardo Fraschini, Anne-Marie Vercoustre. Extraction d'entités dans des collections évolutives. 7ièmes Journées francophones Extraction et Gestion des Connaissances EGC 2007, Jan 2007, Namur, Belgique. pp.533-538. ⟨inria-00116910v4⟩


Collections

INRIA INRIA2

109 views

73 downloads
