Apprentissage d'une classification thématique générique et cross-langue à partir des catégories de la Wikipédia

Abstract : Cross-lingual and generic text categorization. Text categorization usually requires a significant investment, often compounded by the need for domain adaptation. The approach we propose here makes it possible to associate a fine-grained graph of Wikipedia categories with any text written in a given language. Moreover, the online encyclopedia's inter-lingual index makes it possible to obtain a subset of this graph in most other languages.
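The last sentence of the abstract can be illustrated with a minimal sketch. It is not the author's implementation: the function name `project_category_graph`, the edge representation, and the toy category names are all assumptions made for illustration. It shows why only a subset of the graph carries over: an edge survives only when both of its categories have an entry in the inter-lingual index.

```python
from typing import Dict, Set, Tuple

# An edge links a subcategory to its parent category (assumed representation).
Edge = Tuple[str, str]


def project_category_graph(
    edges: Set[Edge], interlingual_index: Dict[str, str]
) -> Set[Edge]:
    """Map a category graph into another language.

    An edge is kept only if both endpoints have an equivalent in the
    target language, so the result is in general a subset of the input.
    """
    projected: Set[Edge] = set()
    for child, parent in edges:
        if child in interlingual_index and parent in interlingual_index:
            projected.add((interlingual_index[child], interlingual_index[parent]))
    return projected


# Toy data, invented for this example (not taken from the paper).
fr_edges: Set[Edge] = {
    ("Catégorie:Sous-thème A", "Catégorie:Thème"),
    ("Catégorie:Sous-thème B", "Catégorie:Thème"),
}
fr_to_en: Dict[str, str] = {
    "Catégorie:Sous-thème A": "Category:Subtopic A",
    "Catégorie:Thème": "Category:Topic",
    # "Catégorie:Sous-thème B" has no English equivalent in this toy index.
}

en_edges = project_category_graph(fr_edges, fr_to_en)
```

Here `en_edges` holds the single edge whose two categories both appear in `fr_to_en`; the edge involving the unmapped category is dropped.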

https://hal.inria.fr/hal-00851794
Contributor : François-Régis Chaumartin
Submitted on : Sunday, August 18, 2013 - 9:39:20 PM
Last modification on : Monday, August 19, 2013 - 3:36:47 PM
Long-term archiving on : Tuesday, November 19, 2013 - 4:11:35 AM

File

taln-2013-court-020.pdf
Publisher files allowed on an open archive

Identifiers

  • HAL Id : hal-00851794, version 1

Citation

François-Régis Chaumartin. Apprentissage d'une classification thématique générique et cross-langue à partir des catégories de la Wikipédia. TALN - Traitement Automatique des Langues Naturelles - 2013, ATALA, Jun 2013, Les Sables d'Olonne, France. pp.659-666. ⟨hal-00851794⟩
