
Neighborhood Label Extension for Handwritten/Printed Text Separation in Arabic Documents

Abstract : This paper addresses the separation of handwritten and printed text in Arabic document images. The objective is to extract the handwritten text from the rest of the document, so that specialized processing can subsequently be applied to the extracted handwritten part, or to the printed one. Documents are first preprocessed to remove any noise and to correct the document orientation. Each document is then segmented into pseudo-lines, which are in turn segmented into pseudo-words. A local classification step, using a Gaussian kernel SVM, assigns each pseudo-word to either the handwritten or the printed class. This label is then propagated within the pseudo-word's neighborhood in order to recover from classification errors. The proposed methodology has been tested on a set of public real Arabic documents, achieving a separation rate of around 90%.
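
The label propagation step can be illustrated with a minimal sketch. The abstract does not specify the propagation rule, so the version below substitutes a simple majority vote over neighboring pseudo-words on the same pseudo-line; the function name `smooth_labels` and the `radius` parameter are hypothetical, and the input labels stand in for the output of the SVM classifier.

```python
from collections import Counter
from typing import List


def smooth_labels(labels: List[str], radius: int = 1) -> List[str]:
    """Relabel each pseudo-word by majority vote over its line neighbors.

    `labels` holds the per-pseudo-word classifier outputs for one
    pseudo-line, in reading order. `radius` (an assumed parameter) is the
    number of neighbors considered on each side. Ties keep the original
    label. This is an illustrative rule, not the paper's exact method.
    """
    smoothed = []
    for i, own in enumerate(labels):
        # Window of the pseudo-word itself plus up to `radius` neighbors
        # on each side, clipped at the ends of the pseudo-line.
        lo = max(0, i - radius)
        hi = min(len(labels), i + radius + 1)
        ranked = Counter(labels[lo:hi]).most_common()
        best, best_count = ranked[0]
        if len(ranked) > 1 and ranked[1][1] == best_count:
            smoothed.append(own)  # tie: trust the local classifier
        else:
            smoothed.append(best)
    return smoothed


line = ["printed", "printed", "handwritten", "printed", "printed"]
print(smooth_labels(line))
```

Votes are always counted on the original labels rather than on already smoothed ones, so the result does not depend on the order in which pseudo-words are visited. An isolated label surrounded by the other class is flipped, which is the kind of local classification error the propagation step is meant to correct.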
Document type : Conference papers

Contributor : Abdel Belaid
Submitted on : Tuesday, January 15, 2019 - 10:18:31 AM
Last modification on : Wednesday, April 27, 2022 - 4:22:57 AM
Long-term archiving on : Tuesday, April 16, 2019 - 12:56:30 PM


Files produced by the author(s)


  • HAL Id : hal-01981519, version 1


Ahmad-Montaser Awal, Abdel Belaïd. Neighborhood Label Extension for Handwritten/Printed Text Separation in Arabic Documents. International Workshop on Arabic Script Analysis and Recognition, Apr 2017, Nancy, France. ⟨hal-01981519⟩


