Neighborhood Label Extension for Handwritten/Printed Text Separation in Arabic Documents

Abstract : This paper addresses the separation of handwritten and printed text in Arabic document images. The objective is to extract the handwritten text from the rest of the document, so that specialized processing can subsequently be applied to the extracted handwritten part, or to the printed one. Documents are first preprocessed to remove possible noise and to correct the document orientation. Each document is then segmented into pseudo-lines, which are in turn segmented into pseudo-words. A local classification step, using a Gaussian-kernel SVM, assigns each pseudo-word to either the handwritten or the printed class. This label is then propagated within the pseudo-word's neighborhood in order to recover from classification errors. The proposed methodology has been tested on a set of public, real Arabic documents, achieving a separation rate of around 90%.
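
As an illustration of the neighborhood label propagation idea, the following minimal sketch relabels each pseudo-word by a majority vote over nearby pseudo-words. The abstract does not specify the propagation rule, so the majority vote, the Euclidean distance between pseudo-word centers, the `radius` parameter, and the function name `smooth_labels` are all assumptions made for illustration, not the authors' method. The initial labels stand in for the output of the local SVM classifier.

```python
from collections import Counter
from math import hypot


def smooth_labels(words, radius=50.0):
    """Relabel pseudo-words by majority vote within a neighborhood.

    words  : list of (x, y, label) tuples; (x, y) is a hypothetical
             pseudo-word center, label is the initial class.
    radius : assumed neighborhood radius, in the same units as x and y.
    Returns the list of smoothed labels, in input order.
    """
    smoothed = []
    for i, (xi, yi, own) in enumerate(words):
        votes = Counter({own: 1})  # the pseudo-word votes for itself
        for j, (xj, yj, other) in enumerate(words):
            if i != j and hypot(xi - xj, yi - yj) <= radius:
                votes[other] += 1
        best = max(votes.values())
        winners = [label for label, count in votes.items() if count == best]
        # On a tie, keep the label given by the local classifier.
        smoothed.append(own if own in winners else winners[0])
    return smoothed


# Four pseudo-words on one pseudo-line; the third is assumed misclassified.
line = [
    (0, 0, "printed"),
    (40, 0, "printed"),
    (80, 0, "handwritten"),
    (120, 0, "printed"),
]
result = smooth_labels(line, radius=50.0)
```

In this toy example the third pseudo-word has two printed neighbors within the radius, so its label is flipped to printed, while a pseudo-word with no neighbors keeps its original label. All votes are computed from the initial labels, so the result does not depend on processing order.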
Document type :
Conference papers

Cited literature : 28 references

https://hal.inria.fr/hal-01981519
Contributor : Abdel Belaid
Submitted on : Tuesday, January 15, 2019 - 10:18:31 AM
Last modification on : Friday, January 15, 2021 - 5:42:02 PM
Long-term archiving on : Tuesday, April 16, 2019 - 12:56:30 PM

File

Awal-Belaid_paper_13.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01981519, version 1

Citation

Ahmad-Montaser Awal, Abdel Belaïd. Neighborhood Label Extension for Handwritten/Printed Text Separation in Arabic Documents. International Workshop on Arabic Script Analysis and Recognition, Apr 2017, Nancy, France. ⟨hal-01981519⟩
