Conference paper, Year: 2017

An end-to-end learning solution for assessing the quality of Wikipedia articles

Abstract

Wikipedia is considered the largest knowledge repository in human history and plays a crucial role in modern daily life. Assigning the correct quality class to Wikipedia articles is an important task that provides guidance for both authors and readers. Manual review cannot keep pace with the editing speed of Wikipedia, so automatic classification of article quality is required. Most existing approaches rely on traditional machine learning with manual feature engineering, which requires considerable expertise and effort. Furthermore, it is known that no general, perfect feature set exists, because information leak always occurs in the feature extraction phase. In addition, each language edition of Wikipedia requires a new feature set. In this paper, we present a deep learning approach to quality classification of Wikipedia articles. Our solution relies on Recurrent Neural Networks (RNNs), an end-to-end learning technique that eliminates the disadvantages of feature engineering. Our approach learns directly from raw data without human intervention and is language-neutral. Experimental results on English, French and Russian Wikipedia datasets show that our approach outperforms state-of-the-art solutions.

Dates and versions

hal-01559693 , version 1 (19-07-2017)
hal-01559693 , version 2 (25-07-2017)
hal-01559693 , version 3 (28-07-2017)

Cite

Quang-Vinh Dang, Claudia-Lavinia Ignat. An end-to-end learning solution for assessing the quality of Wikipedia articles. OpenSym 2017 - International Symposium on Open Collaboration, Aug 2017, Galway, Ireland. ⟨10.1145/3125433.3125448⟩. ⟨hal-01559693v3⟩
