A Random Forests Text Transliteration System for Greek Digraphia

Abstract: Transcribing Greeklish to Greek is a challenging task because it cannot be accomplished by directly mapping each Greek character to a corresponding symbol of the Latin alphabet. Greeklish users do not follow a standardized transliteration scheme, and the resulting ambiguity complicates transcription back to the Greek alphabet. Although many deterministic approaches to the task exist, this paper presents a non-deterministic, vocabulary-free approach based on Random Forests, an ensemble classification methodology from data mining. The approach produces comparable or even better results and supports argot and other linguistic peculiarities. Using data from real users, drawn from sources such as blogs, forums, and email lists, as well as artificial data from a robust stochastic Greek-to-Greeklish transcriber, the proposed approach achieves results in the range of 91.5%-98.5%, comparable to an alternative commercial approach.
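To illustrate the ambiguity the abstract describes, the following minimal Python sketch enumerates every Greek spelling that a one-to-many character table allows for a single Greeklish word. The table `CANDIDATES` and the function `greek_candidates` are hypothetical and are not taken from the paper; the table covers only a handful of commonly seen substitutions and ignores accents and digraphs.

```python
from itertools import product

# Hypothetical, deliberately tiny one-to-many table (an assumption, not from
# the paper): each Latin symbol lists Greek letters it is often used for.
CANDIDATES = {
    "x": ["χ", "ξ"],
    "i": ["ι", "η", "υ"],
    "o": ["ο", "ω"],
    "n": ["ν"],
}


def greek_candidates(greeklish):
    """Return every unaccented Greek spelling the table allows for a word."""
    # Symbols missing from the table are passed through unchanged.
    options = [CANDIDATES.get(ch, [ch]) for ch in greeklish.lower()]
    return ["".join(combo) for combo in product(*options)]


if __name__ == "__main__":
    spellings = greek_candidates("xioni")
    print(len(spellings))        # 2 * 3 * 2 * 1 * 3 = 36 candidate spellings
    print("χιονι" in spellings)  # True: the intended word is one of them
```

Even for a five-letter word, this toy table yields 36 candidates, which suggests why a fixed character-to-character mapping is insufficient and why choosing each Greek character from its context can be framed as a classification problem.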
Document type:
Conference paper
Lazaros Iliadis; Ilias Maglogiannis; Harris Papadopoulos. 12th Engineering Applications of Neural Networks (EANN 2011) and 7th Artificial Intelligence Applications and Innovations (AIAI), Sep 2011, Corfu, Greece. Springer, IFIP Advances in Information and Communication Technology, AICT-364 (Part II), pp.196-201, 2011, Artificial Intelligence Applications and Innovations. 〈10.1007/978-3-642-23960-1_24〉


https://hal.inria.fr/hal-01571492
Contributor: Hal Ifip
Submitted on: Wednesday, August 2, 2017 - 16:22:32
Last modified on: Wednesday, February 14, 2018 - 11:46:01

File

978-3-642-23960-1_24_Chapter.p...
Files produced by the author(s)

License

Distributed under a Creative Commons Attribution 4.0 International License


Citation

Alexandros Panteli, Manolis Maragoudakis. A Random Forests Text Transliteration System for Greek Digraphia. Lazaros Iliadis; Ilias Maglogiannis; Harris Papadopoulos. 12th Engineering Applications of Neural Networks (EANN 2011) and 7th Artificial Intelligence Applications and Innovations (AIAI), Sep 2011, Corfu, Greece. Springer, IFIP Advances in Information and Communication Technology, AICT-364 (Part II), pp.196-201, 2011, Artificial Intelligence Applications and Innovations. 〈10.1007/978-3-642-23960-1_24〉. 〈hal-01571492〉

