Comparative study of Arabic and French statistical language models - Inria - Institut national de recherche en sciences et technologies du numérique
Conference Papers Year : 2009

Comparative study of Arabic and French statistical language models

Karima Meftouh
  • Function : Author
  • PersonId : 857254
Kamel Smaïli
Med Tayeb Laskri
  • Function : Author
  • PersonId : 857255

Abstract

In this paper, we present a comparative study of statistical language models of Arabic and French. The objective of this study is to understand how to better model both languages. Several experiments using different smoothing techniques have been carried out. For French, trigram models are the most appropriate whatever the smoothing technique used. For Arabic, higher-order n-gram models smoothed with the Witten-Bell method are more efficient. Tests are carried out with corpora and vocabularies of comparable size.
Main file
ICAART.pdf (136.84 KB)
Origin : Files produced by the author(s)

Dates and versions

inria-00352927 , version 1 (14-11-2017)

Identifiers

  • HAL Id : inria-00352927 , version 1

Cite

Karima Meftouh, Kamel Smaïli, Med Tayeb Laskri. Comparative study of Arabic and French statistical language models. ICAART'09 - International Conference on Agents and Artificial Intelligence, INSTICC, Jan 2009, Porto, Portugal. ⟨inria-00352927⟩
