Visual perception of unitary elements for layout analysis of unconstrained documents in heterogeneous databases

Abstract : Document layout analysis remains a challenging problem, particularly for heterogeneous documents. In this paper, we present our contribution to the layout analysis competition of the international Maurdor Campaign. Our method is based on a grammatical description of the content of elements. It consists of iteratively finding and then removing the most structuring elements of documents. The method draws on notions of perceptive vision: a combination of points of view of the document, and the analysis of salient contents. Our description is generic enough to deal with a very wide range of heterogeneous documents. This method obtained second place in Run 2 of the Maurdor Campaign (on 1000 documents), and the best results in terms of pixel labeling for text blocks and graphic regions.
Document type :
Conference papers
https://hal.inria.fr/hal-01088807
Contributor : Aurélie Lemaitre
Submitted on : Friday, November 28, 2014 - 4:58:03 PM
Last modification on : Friday, November 16, 2018 - 1:35:36 AM
Long-term archiving on : Friday, April 14, 2017 - 11:10:15 PM

File

PID3215629.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01088807, version 1

Citation

Baptiste Poirriez, Aurélie Lemaitre, Bertrand Coüasnon. Visual perception of unitary elements for layout analysis of unconstrained documents in heterogeneous databases. 14th International Conference on Frontiers in Handwriting Recognition (ICFHR-2014), Sep 2014, Crete island, Greece. ⟨hal-01088807⟩
