Book chapter, Year: 2011

Segmentation of ancient Arabic documents

Abdel Belaïd
  • Role: Author
  • PersonId: 830137
Nazih Ouwayed
  • Role: Author
  • PersonId: 871683

Abstract

This chapter addresses the problem of segmenting ancient Arabic documents. Because ancient documents have neither a real physical structure nor a logical one, segmentation is limited to extracting textual areas or the lines within those areas. Although this type of segmentation appears quite simple, its implementation remains a challenging task. This is due to the state of old documents: the image is of low quality, and the lines are not straight but sinuous and connected to one another. Given the failure of traditional methods, we propose a method for line extraction in multi-oriented documents. The method is based on an image meshing that allows orientations to be detected locally and reliably. These orientations are then extended to larger areas. The orientation estimation uses the energy distribution of Cohen's class, which is more accurate than the projection method. The method then exploits the projection peaks to follow the connected components forming text lines. The approach ends with a final separation of connected lines, based on the morphology of terminal letters.
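
To make the idea of meshing and local orientation detection concrete, here is a minimal Python sketch. It is not the authors' estimator: it uses the simpler projection method that the abstract names as the less accurate baseline, not the Cohen's class energy distribution. The function names (`projection_score`, `estimate_tile_orientations`), the tile size, the candidate angles, and the `min_pixels` threshold are all hypothetical choices made for illustration.

```python
import numpy as np


def projection_score(ys, xs, angle_deg):
    """Score how well ink pixels align with lines of slope `angle_deg`.

    Angles are measured in image coordinates (row index grows downward).
    Each pixel is projected onto the normal of the candidate direction; a
    peaky projection histogram (large sum of squares) means good alignment.
    """
    theta = np.deg2rad(angle_deg)
    offsets = ys * np.cos(theta) - xs * np.sin(theta)
    # One-pixel-wide bins covering the whole range of projected offsets.
    edges = np.arange(np.floor(offsets.min()), np.ceil(offsets.max()) + 2)
    hist, _ = np.histogram(offsets, bins=edges)
    return float(np.sum(hist.astype(float) ** 2))


def estimate_tile_orientations(binary, tile, angles, min_pixels=10):
    """Split a binary image (ink = nonzero) into square tiles and return
    the best-scoring candidate angle for each tile that has enough ink."""
    height, width = binary.shape
    orientations = {}
    for top in range(0, height, tile):
        for left in range(0, width, tile):
            ys, xs = np.nonzero(binary[top:top + tile, left:left + tile])
            if ys.size < min_pixels:
                continue  # too little ink to estimate an orientation
            scores = [projection_score(ys, xs, a) for a in angles]
            orientations[(top, left)] = angles[int(np.argmax(scores))]
    return orientations
```

Calling `estimate_tile_orientations(page, 64, list(range(-20, 25, 5)))` on a binarized page returns a dictionary mapping each tile's top-left corner to its estimated angle. Extending these local estimates to larger areas and following connected components along the projection peaks, as the abstract describes, are outside the scope of this sketch.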
Main file
author.pdf (2.09 MB)
Origin: Files produced by the author(s)

Dates and versions

inria-00579840, version 1 (25-03-2011)

Identifiers

  • HAL Id: inria-00579840, version 1

Cite

Abdel Belaïd, Nazih Ouwayed. Segmentation of ancient Arabic documents. Volker Märgner and Haikal El Abed. Guide to OCR for Arabic Scripts, Springer, 2011. ⟨inria-00579840⟩
192 views
369 downloads
