An end-to-end learning solution for assessing the quality of Wikipedia articles

Quang-Vinh Dang 1 Claudia-Lavinia Ignat 1
1 COAST - Web Scale Trustworthy Collaborative Service Systems
Inria Nancy - Grand Est, LORIA - NSS - Department of Networks, Systems and Services
Abstract: Wikipedia is widely regarded as the largest knowledge repository in human history and plays a crucial role in modern daily life. Assigning the correct quality class to Wikipedia articles is an important task because it provides guidance to both authors and readers. Manual review cannot keep pace with the editing speed of Wikipedia, so automatic classification of article quality is required. Most existing approaches rely on traditional machine learning with manual feature engineering, which requires considerable expertise and effort. Furthermore, it is known that no general, perfect feature set exists, because information leak always occurs in the feature extraction phase. In addition, each language edition of Wikipedia requires a new feature set. In this paper, we present a deep learning approach to quality classification of Wikipedia articles. Our solution relies on Recurrent Neural Networks (RNNs), an end-to-end learning technique that eliminates the disadvantages of feature engineering. Our approach learns directly from raw data without human intervention and is language-neutral. Experimental results on English, French and Russian Wikipedia datasets show that our approach outperforms state-of-the-art solutions.
Document type:
Conference paper
OpenSym 2017 - International Symposium on Open Collaboration, Aug 2017, Galway, Ireland. 2017, 〈10.1145/3125433.3125448〉

Cited literature [61 references]

https://hal.inria.fr/hal-01559693
Contributor: Quang Vinh Dang
Submitted on: Friday, July 28, 2017 - 15:02:04
Last modified on: Thursday, January 11, 2018 - 06:27:29

Files

OpenSym2017 (1).pdf
Publisher files allowed on an open archive

Identifiers

HAL Id: hal-01559693, version 3
DOI: 10.1145/3125433.3125448

Citation

Quang-Vinh Dang, Claudia-Lavinia Ignat. An end-to-end learning solution for assessing the quality of Wikipedia articles. OpenSym 2017 - International Symposium on Open Collaboration, Aug 2017, Galway, Ireland. 2017, 〈10.1145/3125433.3125448〉. 〈hal-01559693v3〉


Metrics

Record views

161

File downloads

499