Participation de l'IRISA à DeFT2012 : recherche d'information et apprentissage pour la génération de mots-clés

Vincent Claveau 1, Christian Raymond 1, *
* Corresponding author
1 TEXMEX - Multimedia content-based indexing
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, Inria Rennes – Bretagne Atlantique
Abstract : This paper describes IRISA's participation in the DeFT 2012 text-mining challenge, which consisted of automatically assigning or generating keywords for scientific journal articles. Two tasks were proposed, which led us to test two different strategies. For the first task, a list of keywords was provided; our strategy treats this as an information retrieval problem in which the keywords are the queries, and each keyword is assigned to its best-ranked documents. This approach yielded very good results. For the second task, only the articles were known; our approach there relies chiefly on a term extraction system whose results are reordered by machine learning.
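
The first strategy (keywords as queries, each assigned to its best-ranked documents) can be illustrated with a minimal sketch. The abstract does not say which retrieval model, tokenization, or cut-off the authors used, so everything below is an assumption made for illustration: a plain TF-IDF cosine ranking, a whitespace tokenizer, and the hypothetical names `assign_keywords` and `top_k`.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase whitespace tokenizer (a simplifying assumption)."""
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict) per document, plus the IDF table."""
    tokenized = [tokenize(d) for d in docs]
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    n_docs = len(docs)
    # Smoothed IDF; the exact weighting scheme is an assumption.
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in doc_freq.items()}
    vectors = [
        {t: tf * idf[t] for t, tf in Counter(toks).items()} for toks in tokenized
    ]
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def assign_keywords(docs, keywords, top_k=1):
    """Use each keyword as a query and attach it to its top_k ranked documents.

    Returns a dict mapping document index -> list of assigned keywords.
    Documents with a zero score for a keyword never receive it.
    """
    vectors, idf = tfidf_vectors(docs)
    assigned = defaultdict(list)
    for keyword in keywords:
        query = {
            t: tf * idf.get(t, 0.0) for t, tf in Counter(tokenize(keyword)).items()
        }
        scored = [(cosine(query, vec), i) for i, vec in enumerate(vectors)]
        # Highest score first; ties are broken by document order.
        ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
        for score, doc_index in ranked[:top_k]:
            if score > 0.0:
                assigned[doc_index].append(keyword)
    return dict(assigned)
```

Calling `assign_keywords(articles, keyword_list, top_k=3)` would attach each keyword to at most three articles. A real system would likely need language-specific preprocessing (for French, at least lemmatization or stemming) and a tuned cut-off, neither of which is attempted here.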
Document type :
Conference papers

https://hal.inria.fr/hal-00758259
Contributor : Christian Raymond
Submitted on : Wednesday, November 28, 2012 - 2:06:23 PM
Last modification on : Friday, November 16, 2018 - 1:25:07 AM
Long-term archiving on : Saturday, December 17, 2016 - 4:26:25 PM

File

W12-1106.pdf
Publisher files allowed on an open archive

Identifiers

  • HAL Id : hal-00758259, version 1

Citation

Vincent Claveau, Christian Raymond. Participation de l'IRISA à DeFT2012 : recherche d'information et apprentissage pour la génération de mots-clés. JEP-TALN-RECITAL 2012, Workshop DEFT 2012: DÉfi Fouille de Textes (DEFT 2012 Workshop: Text Mining Challenge), Jun 2012, Grenoble, France. pp. 49-60. ⟨hal-00758259⟩
