Line and Word Segmentation of Arabic handwritten documents using Neural Networks

Ahlem Belabiod (1), Abdel Belaïd (1)
(1) READ - Recognition of writing and analysis of documents
LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
Abstract : Segmenting documents into lines and words is a critical step before recognition. It is particularly difficult for ancient and calligraphic writing, as is often the case in Arabic manuscripts. In this work, we propose a deep-learning approach to segmenting documents into lines and words. For line segmentation, we use an RU-net that performs pixel-wise classification, separating text-line pixels from background pixels. For segmenting lines into words, since no image-level ground truth is available for word segmentation, we use the line transcription to locate the words. A BLSTM-CTC learns this mapping directly between the transcription and the line image, without any prior segmentation. A CNN precedes the BLSTM to extract features, feeding it the essential information from the line image. Tested on the KHATT Arabic database, the system correctly segments about 96.7% of lines and 80.1% of words.
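As a rough illustration of how a CTC output can yield word boundaries without image-level word annotations, the sketch below works on a per-frame best-path label sequence. It is not the report's implementation: the blank symbol `-`, the use of a space label as word separator, and the function names `ctc_collapse` and `word_spans` are all assumptions made for this example.

```python
from typing import List, Optional, Tuple

BLANK = "-"  # assumed CTC blank symbol (hypothetical choice)
SPACE = " "  # assumed word-separator label in the transcription alphabet


def ctc_collapse(frames: List[str]) -> str:
    """Collapse a best-path label sequence: merge repeats, then drop blanks."""
    out: List[str] = []
    prev: Optional[str] = None
    for label in frames:
        # Emit a label only when it differs from the previous frame
        # and is not the blank symbol.
        if label != prev and label != BLANK:
            out.append(label)
        prev = label
    return "".join(out)


def word_spans(frames: List[str]) -> List[Tuple[int, int]]:
    """Return (first_frame, last_frame) for each word.

    A word starts at its first non-blank, non-space frame and ends at
    its last such frame before a space frame (or the end of the line).
    """
    spans: List[Tuple[int, int]] = []
    start: Optional[int] = None
    last = 0
    for i, label in enumerate(frames):
        if label == SPACE:
            if start is not None:
                spans.append((start, last))
                start = None
        elif label != BLANK:
            if start is None:
                start = i
            last = i
    if start is not None:
        spans.append((start, last))
    return spans
```

Frame indices would still have to be mapped back to pixel columns of the line image, for instance by multiplying by the horizontal downsampling factor of the CNN; that factor depends on the architecture and is not given in the abstract.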
Document type :
Reports

https://hal.inria.fr/hal-01910559
Contributor : Abdel Belaid
Submitted on : Monday, November 5, 2018 - 3:39:00 PM
Last modification on : Thursday, June 17, 2021 - 3:02:33 AM
Long-term archiving on : Wednesday, February 6, 2019 - 2:59:19 PM

File

ArabicSegmentation.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01910559, version 1

Citation

Ahlem Belabiod, Abdel Belaïd. Line and Word Segmentation of Arabic handwritten documents using Neural Networks. [Research Report] LORIA - Université de Lorraine; READ. 2018. ⟨hal-01910559⟩
