
Building Resources for Algerian Arabic Dialects

Abstract : The Algerian Arabic dialects are under-resourced languages that lack both corpora and Natural Language Processing (NLP) tools, although they are increasingly used in written form, especially on social media and forums. In this paper we build, for the first time, parallel corpora for Algerian dialects, with the ultimate goal of Machine Translation (MT) between Modern Standard Arabic (MSA) and Algerian dialects (AD) in both directions. We also propose language tools to process these dialects. First, we develop a morphological analysis model for the dialects by adapting BAMA, a well-known MSA analyzer. Then we propose a diacritization system, based on an MT process, that restores the vowels in dialect corpora. Finally, we present machine translation results between MSA and Algerian dialects.
Document type :
Conference papers

Cited literature [22 references]
Contributor : Kamel Smaïli
Submitted on : Monday, September 22, 2014 - 5:14:46 PM
Last modification on : Saturday, October 16, 2021 - 11:26:09 AM


Files produced by the author(s)


  • HAL Id : hal-01066989, version 1



Salima Harrat, Karima Meftouh, Mourad Abbas, Kamel Smaïli. Building Resources for Algerian Arabic Dialects. 15th Annual Conference of the International Speech Communication Association (Interspeech), ISCA, Sep 2014, Singapore. ⟨hal-01066989⟩


