Conference papers

Segmentation de documents composites par une technique de recouvrement des espaces blancs

Yves Rangoni (1), Abdel Belaid (1)
(1) READ - Recognition of writing and analysis of documents
LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
Abstract : We present a method for segmenting composite documents. Unlike most published work, we focus on non-Manhattan layouts, which typically result from compositing. The pages to be processed therefore contain several sub-documents that must be isolated. Our approach draws on the white space cover technique introduced by Baird et al., combined with a suite of pre- and post-processing steps specific to this kind of document. The evaluation is carried out on administrative records from various sources, provided by our industrial partner. Since no ground truth is available for these documents, we compare our results with those of a commercial OCR system, which our method outperforms.
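
To give a rough idea of what covering a page with white space involves, the sketch below finds the largest all-white rectangle in a small binary page image. It is only an illustration under stated assumptions: it uses the classic histogram-and-stack method for the largest empty rectangle, which is a simplification and not the enumeration of maximal white rectangles described by Baird et al., nor the pre- and post-processing pipeline of the paper. The function name `largest_white_rectangle` and the 0 = white, 1 = ink convention are hypothetical choices made for this example.

```python
from typing import List, Tuple


def largest_white_rectangle(page: List[List[int]]) -> Tuple[int, int, int, int, int]:
    """Return (area, top, left, height, width) of the largest all-white rectangle.

    Assumption for this sketch: page[r][c] == 0 is a white pixel, 1 is ink.
    An empty page or a page with no white pixel yields (0, 0, 0, 0, 0).
    """
    if not page or not page[0]:
        return (0, 0, 0, 0, 0)

    n_cols = len(page[0])
    # heights[c] = number of consecutive white pixels ending at the current row.
    heights = [0] * n_cols
    best = (0, 0, 0, 0, 0)

    for r, row in enumerate(page):
        for c, pixel in enumerate(row):
            heights[c] = heights[c] + 1 if pixel == 0 else 0

        # Largest rectangle in the histogram `heights`, using a stack of
        # column indices whose heights are strictly increasing.
        stack: List[int] = []
        for c in range(n_cols + 1):
            h_here = heights[c] if c < n_cols else 0  # sentinel flushes the stack
            while stack and heights[stack[-1]] >= h_here:
                h = heights[stack.pop()]
                left = stack[-1] + 1 if stack else 0
                width = c - left
                area = h * width
                if area > best[0]:
                    best = (area, r - h + 1, left, h, width)
            stack.append(c)

    return best


# Two blocks of ink on each side of a white gutter, crossed by a white band.
page = [
    [1, 1, 0, 1, 1],
    [1, 1, 0, 1, 1],
    [0, 0, 0, 0, 0],
    [1, 1, 0, 1, 1],
]
print(largest_white_rectangle(page))
```

In a full white space cover, many such rectangles would be collected and ranked, and the selected ones would act as separators between blocks; the sketch stops at finding a single rectangle.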

https://hal.inria.fr/hal-00779235
Contributor : Abdel Belaid
Submitted on : Wednesday, January 23, 2013 - 5:49:30 PM
Last modification on : Friday, January 15, 2021 - 5:42:02 PM
Long-term archiving on : Wednesday, April 24, 2013 - 3:54:32 AM

File

cifed_version_publiee_yves.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-00779235, version 1

Citation

Yves Rangoni, Abdel Belaid. Segmentation de documents composites par une technique de recouvrement des espaces blancs. CIFED-CORIA, Mar 2012, Bordeaux, France. ⟨hal-00779235⟩

Metrics

Record views : 251
File downloads : 864