A Neural-Linguistic Approach for the Recognition of a Wide Arabic Word Lexicon

Imen Ben Cheikh 1, Afef Kacem 1, Abdel Belaïd 2
2 READ, LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
Abstract: We recently investigated the use of Arabic linguistic knowledge to improve the recognition of a wide Arabic word lexicon. A neural-linguistic approach was proposed, mainly to deal with the canonical vocabulary of decomposable words derived from tri-consonant healthy roots. The basic idea is to factorize words by their roots and schemes. To this end, we designed two neural networks, TNN_R and TNN_S, to recognize roots and schemes, respectively, from the structural primitives of words. The proposed approach achieved promising results. In this paper, we focus on how to reach better results in terms of accuracy and recognition rate. The current improvements mainly concern the training stage: 1) benefiting from the order of letters within a word, 2) considering "sister letters" (letters having the same features), 3) supervising the networks' behaviour, 4) splitting up neurons to preserve letter occurrences, and 5) resolving observed ambiguities. With these improvements, experiments carried out on a 1500-word vocabulary show a significant enhancement: the top-4 rate of TNN_R (resp. TNN_S) rose from 77% to 85.8% (resp. from 65% to 97.9%). Enlarging the vocabulary from 1000 to 1700 words in steps of 100 words confirmed the results without altering the networks' stability.
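The factorization idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Latin transliteration, the slot notation ("1", "2", "3" marking where the root consonants go), the function names, and the rule of ranking word hypotheses by the product of root and scheme scores are all assumptions made for illustration. The point it shows is that a lexicon of |roots| x |schemes| words can be covered by two classifiers with only |roots| and |schemes| output classes.

```python
from itertools import product


def apply_scheme(root: str, scheme: str) -> str:
    """Fill slots '1', '2', '3' of a scheme with the root's three consonants.

    The slot notation is a hypothetical convention, not taken from the paper.
    """
    if len(root) != 3:
        raise ValueError("expected a tri-consonant root")
    slots = {"1": root[0], "2": root[1], "3": root[2]}
    return "".join(slots.get(ch, ch) for ch in scheme)


def combine_candidates(root_scores, scheme_scores, k=4):
    """Rank word hypotheses built from the top-k roots and top-k schemes.

    Scoring by the product of the two scores is an assumption; the paper's
    abstract does not say how root and scheme outputs are combined.
    """
    top_roots = sorted(root_scores.items(), key=lambda kv: kv[1], reverse=True)[:k]
    top_schemes = sorted(scheme_scores.items(), key=lambda kv: kv[1], reverse=True)[:k]
    hypotheses = [
        (apply_scheme(r, s), r_score * s_score)
        for (r, r_score), (s, s_score) in product(top_roots, top_schemes)
    ]
    return sorted(hypotheses, key=lambda h: h[1], reverse=True)


# Toy data: three transliterated roots and three schemes (illustrative only).
roots = ["ktb", "drs", "nzl"]
schemes = ["1a2a3a", "1aa2i3", "ma12uu3"]

# Nine words are covered by 3 root classes + 3 scheme classes.
lexicon = {apply_scheme(r, s) for r, s in product(roots, schemes)}

# Hypothetical network outputs for one word image.
root_scores = {"ktb": 0.6, "drs": 0.3, "nzl": 0.1}
scheme_scores = {"1a2a3a": 0.7, "1aa2i3": 0.2, "ma12uu3": 0.1}
ranked = combine_candidates(root_scores, scheme_scores, k=2)
```

With these toy scores, the best-ranked hypothesis combines the highest-scoring root with the highest-scoring scheme; keeping the top k candidates from each side mirrors the top-4 evaluation mentioned in the abstract.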
Document type:
Conference paper
Document Recognition and Retrieval XVII, Jan 2010, San Jose, United States. 2010
https://hal.inria.fr/inria-00579680
Contributor: Abdel Belaid
Submitted on: Thursday, March 24, 2011 - 15:42:37
Last modified on: Tuesday, April 24, 2018 - 13:36:05

File

Imen-DRR2010.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: inria-00579680, version 1

Citation

Imen Ben Cheikh, Afef Kacem, Abdel Belaïd. A Neural-Linguistic Approach for the Recognition of a Wide Arabic Word Lexicon. Document Recognition and Retrieval XVII, Jan 2010, San Jose, United States. 2010. 〈inria-00579680〉
