Conference papers

A Clustering Backed Deep Learning Approach for Document Layout Analysis

Abstract : Large organizations generate documents and records on a daily basis, often in such volume that processing them manually becomes unduly time-consuming. Automated document processing systems are therefore desirable, as they would reduce the time spent handling them. Unfortunately, documents are often not designed to be machine-readable, so parsing them is a difficult problem. Image segmentation techniques and deep-learning architectures have been proposed as a solution, but they have difficulty retaining accuracy when page layouts are especially dense. This can lead to data being duplicated, lost, or inaccurate during retrieval. We propose a way of refining these segmentations using a clustering-based approach that can be easily combined with existing rule-based refinements. We show that on a financial document corpus of 2675 pages, when using DBSCAN, this method is capable of significantly increasing the accuracy of existing deep-learning methods for image segmentation. This improves the reliability of the results in the context of automatic document analysis.
Contributor : Hal Ifip
Submitted on : Thursday, November 4, 2021 - 3:58:28 PM
Last modification on : Friday, November 5, 2021 - 3:57:59 AM
Long-term archiving on : Saturday, February 5, 2022 - 7:10:38 PM


 Restricted access
To satisfy the distribution rights of the publisher, the document is embargoed until : 2023-01-01



Distributed under a Creative Commons Attribution 4.0 International License



Rhys Agombar, Max Luebbering, Rafet Sifa. A Clustering Backed Deep Learning Approach for Document Layout Analysis. 4th International Cross-Domain Conference for Machine Learning and Knowledge Extraction (CD-MAKE), Aug 2020, Dublin, Ireland. pp.423-430, ⟨10.1007/978-3-030-57321-8_23⟩. ⟨hal-03414749⟩


