Comparative study of Arabic and French statistical language models

Karima Meftouh 1, Kamel Smaïli 2, Med Tayeb Laskri 1
2 PAROLE - Analysis, perception and recognition of speech
INRIA Lorraine, LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
Abstract : In this paper, we present a comparative study of statistical language models for Arabic and French. The objective is to understand how best to model each language. Several experiments using different smoothing techniques were carried out. For French, trigram models are the most appropriate regardless of the smoothing technique used. For Arabic, higher-order n-gram models smoothed with the Witten-Bell method are more efficient. The tests were carried out with corpora and vocabularies of comparable size.
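As a rough illustration of the kind of smoothing the abstract refers to, the sketch below builds a bigram model with interpolated Witten-Bell smoothing. It is not the authors' implementation: the function name `train_witten_bell_bigram`, the toy sentence, and the choice of an unsmoothed unigram distribution as the lower-order model are all assumptions made for this example, and the paper itself studies higher orders and real corpora.

```python
from collections import Counter, defaultdict


def train_witten_bell_bigram(tokens):
    """Return prob(w, h) = P(w | h) under interpolated Witten-Bell smoothing.

    Hypothetical helper for illustration only. The lower-order model is a
    plain maximum-likelihood unigram distribution (an assumption).
    """
    unigram = Counter(tokens)
    total = len(tokens)
    bigram = Counter(zip(tokens, tokens[1:]))

    followers = defaultdict(set)   # distinct word types seen after each context
    context_count = Counter()      # how often each context is followed by a word
    for (h, w), c in bigram.items():
        followers[h].add(w)
        context_count[h] += c

    def prob(w, h):
        p_uni = unigram[w] / total
        n_h = context_count[h]
        t_h = len(followers.get(h, ()))
        if n_h == 0:
            # Unseen context: fall back entirely on the unigram estimate.
            return p_uni
        # Witten-Bell: mass t_h / (n_h + t_h) is reserved for the lower order.
        return (bigram[(h, w)] + t_h * p_uni) / (n_h + t_h)

    return prob


tokens = "the cat sat on the mat the cat ran".split()
prob = train_witten_bell_bigram(tokens)
vocab = sorted(set(tokens))
print({w: round(prob(w, "the"), 4) for w in vocab})
```

The weight given to the lower-order model grows with the number of distinct words observed after a context, so contexts that are followed by many different words defer more to the unigram estimate.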
Document type :
Conference papers

https://hal.inria.fr/inria-00352927
Contributor : Kamel Smaïli
Submitted on : Tuesday, November 14, 2017 - 12:24:53 PM
Last modification on : Sunday, April 8, 2018 - 11:48:13 AM
Long-term archiving on : Thursday, February 15, 2018 - 4:33:37 PM

File

ICAART.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : inria-00352927, version 1

Citation

Karima Meftouh, Kamel Smaïli, Med Tayeb Laskri. Comparative study of Arabic and French statistical language models. ICAART'09 - International Conference on Agents and Artificial Intelligence, INSTICC, Jan 2009, Porto, Portugal. ⟨inria-00352927⟩
