Le Web : une source d'information pour l'intégration de multi-termes dans un processus de Recherche d'Information

Mohamed Hatem Haddad; Mathias Géry; Dominique Vaufreydaz

Conference paper, Year: 2002

Mohamed Hatem Haddad

Role: Author

Communication Langagière et Interaction Personne-Système

Mathias Géry

Role: Author
PersonId : 843869

Communication Langagière et Interaction Personne-Système

Dominique Vaufreydaz

Role: Author
PersonId : 8656
IdHAL : vaufreydaz
ORCID : 0000-0002-8825-0973
IdRef : 064812596

Equipe GEOD, Groupe d'étude sur l'oral et le dialogue

Abstract

The Web is a rich and diversified source of information. In this article, we propose to exploit this richness to collect and analyze documents, with the aim of relational indexing based on noun phrases. The proposed data processing chain includes a spider that collects data to build textual corpora and a linguistic module that analyzes text to extract information. A comparison of the obtained corpus with the corpus from the Amaryllis conference shows the linguistic diversity of the collected corpora, and particularly the richness of the extracted noun phrases.

Domains

Computation and Language [cs.CL]; Information Retrieval [cs.IR]

Main file

Haddad02a.pdf (193.27 KB)

Origin: Files produced by the author(s)

Contributor: Dominique Vaufreydaz

https://inria.hal.science/inria-00326404

Submitted on: Thursday, October 2, 2008, 21:40:51

Last modified on: Thursday, April 4, 2024, 21:12:32

Long-term archiving on: Friday, June 4, 2010, 12:09:13

Dates and versions

inria-00326404 , version 1 (02-10-2008)

Identifiers

HAL Id : inria-00326404 , version 1

Cite

Mohamed Hatem Haddad, Mathias Géry, Dominique Vaufreydaz. Le Web : une source d'information pour l'intégration de multi-termes dans un processus de Recherche d'Information. Journées Francophones d'Accès Intelligent aux Documents Multimédias sur l'Internet (MediaNet 2002), Jun 2002, Sousse, Tunisia. pp. 257-268. ⟨inria-00326404⟩

Collections

UGA IMAG CNRS LIG LIG_SIDCH
