A Comparative Assessment of State-Of-The-Art Methods for Multilingual Unsupervised Keyphrase Extraction

Nikolaos Giarelis; Nikos Kanakaris; Nikos Karacapilidis

doi:10.1007/978-3-030-79150-6_50

Conference paper, Year: 2021


Nikolaos Giarelis

Role: Author
PersonId: 1105446

University of Patras

Nikos Kanakaris

Role: Author
PersonId: 1105447

University of Patras

Nikos Karacapilidis

Role: Author
PersonId: 1033582

University of Patras

Abstract

Keyphrase extraction is a fundamental task in information management, often used as a preliminary step in information retrieval and natural language processing tasks. The main contribution of this paper is a comparative assessment of prominent multilingual unsupervised keyphrase extraction methods that build on statistical (RAKE, YAKE), graph-based (TextRank, SingleRank) and deep learning (KeyBERT) approaches. For the experiments reported in this paper, we employ well-known keyphrase extraction datasets in five natural languages (English, French, Spanish, Portuguese and Polish). We use the F1 score and a partial match evaluation framework to investigate whether the number of terms in a document and the language of each dataset affect the accuracy of the selected methods. Our experimental results yield a set of insights about the suitability of the selected methods for texts of different sizes, as well as their performance on datasets in different languages.
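
To make the evaluation idea concrete, the following is a minimal sketch of an F1 score computed under a partial match criterion. The abstract does not define the matching rule, so the rule used here (two keyphrases match when they share at least one lowercase token) is an assumption, as are the function names and the sample keyphrases; it is not necessarily the framework used in the paper.

```python
def tokenize(phrase):
    """Lowercase a phrase and split it on whitespace."""
    return phrase.lower().split()


def partially_matches(predicted, gold):
    """Assumed rule: two phrases match if they share at least one token."""
    return bool(set(tokenize(predicted)) & set(tokenize(gold)))


def partial_match_f1(predicted, gold):
    """Return (precision, recall, F1) under the assumed partial-match rule."""
    if not predicted or not gold:
        return 0.0, 0.0, 0.0
    # A predicted phrase counts as correct if it partially matches any gold phrase.
    hit_pred = sum(any(partially_matches(p, g) for g in gold) for p in predicted)
    # A gold phrase counts as recovered if any predicted phrase partially matches it.
    hit_gold = sum(any(partially_matches(p, g) for p in predicted) for g in gold)
    precision = hit_pred / len(predicted)
    recall = hit_gold / len(gold)
    if precision + recall == 0:
        return 0.0, 0.0, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical extractor output and gold annotations for one document.
predicted = ["keyphrase extraction", "neural network", "graph ranking"]
gold = ["unsupervised keyphrase extraction", "graph-based models"]
precision, recall, f1 = partial_match_f1(predicted, gold)
print(f"P={precision:.2f} R={recall:.2f} F1={f1:.2f}")
```

In this example only "keyphrase extraction" overlaps a gold keyphrase, so precision is 1/3, recall is 1/2 and F1 is 0.40. An exact match criterion would score the same prediction as zero, which is why partial matching is often preferred when extracted phrases differ from the gold ones only in their boundaries.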

Keywords

Natural language processing; Keyphrase extraction; Unsupervised learning; Deep learning; Graph-based models; Empirical research

Domains

Computer Science [cs]

Main file

509922_1_En_50_Chapter.pdf (268.81 KB)

Origin: Files produced by the author(s)


https://inria.hal.science/hal-03287681

Submitted on: Thursday, July 15, 2021, 18:10:48

Last modified on: Wednesday, February 15, 2023, 04:16:04

Long-term archiving on: Saturday, October 16, 2021, 19:06:20

Dates and versions

hal-03287681, version 1 (15-07-2021)

License

Attribution

Identifiers

HAL Id: hal-03287681, version 1
DOI: 10.1007/978-3-030-79150-6_50

Cite

Nikolaos Giarelis, Nikos Kanakaris, Nikos Karacapilidis. A Comparative Assessment of State-Of-The-Art Methods for Multilingual Unsupervised Keyphrase Extraction. 17th IFIP International Conference on Artificial Intelligence Applications and Innovations (AIAI), Jun 2021, Hersonissos, Crete, Greece. pp.635-645, ⟨10.1007/978-3-030-79150-6_50⟩. ⟨hal-03287681⟩


Collections

IFIP IFIP-AICT IFIP-TC IFIP-WG IFIP-TC12 IFIP-AIAI IFIP-WG12-5 IFIP-AICT-627

