Conference papers

Table information extraction and structure recognition using query patterns

Thotreingam Kasar 1, Tapan. K. Bhowmik 1, Belaïd Abdel 1
1 READ - Recognition of writing and analysis of documents
LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
Abstract : In this paper, we present a query-based approach to selectively extract tabular information and recognize the table structure from scanned documents. Unlike conventional table processing paradigms, we adopt a client-driven approach where clients provide a query pattern by specifying a set of key-fields in the document image. The query pattern is first transformed into an attributed relational graph where each node is described with features and the edges with spatial relationships between the nodes. A fast graph matching technique is then used to retrieve other similar graphs from the document image. Further, the extracted graphs are collectively analyzed to deduce the overall tabular structure. Experiments on a dataset of 101 commercial transaction documents demonstrate the effectiveness of the proposed method.
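As an illustration only, the idea of turning a query pattern into an attributed relational graph and retrieving similar graphs can be sketched as follows. This is a minimal sketch under stated assumptions, not the method of the paper: the `Field` type, the coarse `kind` node feature, the `relation` edge labels, the tolerance `tol`, and the toy page are all hypothetical, and the exhaustive search in `find_matches` stands in for the fast graph matching technique, which the abstract does not detail.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class Field:
    """A text field on the page (hypothetical): a coarse content type and its box centre."""
    kind: str   # assumed node feature, e.g. "text" or "number"
    x: float    # horizontal centre
    y: float    # vertical centre (grows downwards, as in image coordinates)


def relation(a: Field, b: Field, tol: float = 5.0) -> str:
    """Coarse spatial relation of b with respect to a (assumed edge label)."""
    dx, dy = b.x - a.x, b.y - a.y
    horiz = "same_col" if abs(dx) <= tol else ("right" if dx > 0 else "left")
    vert = "same_row" if abs(dy) <= tol else ("below" if dy > 0 else "above")
    return f"{horiz}/{vert}"


def build_graph(fields):
    """Attributed relational graph: node attributes plus a label on every ordered pair."""
    n = len(fields)
    edges = {(i, j): relation(fields[i], fields[j])
             for i in range(n) for j in range(n) if i != j}
    return [f.kind for f in fields], edges


def find_matches(query, page):
    """Exhaustively find index tuples in `page` whose node and edge labels equal the query's."""
    q_nodes, q_edges = build_graph(query)
    matches = []
    for combo in permutations(range(len(page)), len(query)):
        # Node attributes must agree before edges are checked.
        if any(page[p].kind != k for p, k in zip(combo, q_nodes)):
            continue
        if all(relation(page[combo[i]], page[combo[j]]) == label
               for (i, j), label in q_edges.items()):
            matches.append(combo)
    return matches


# Toy page: three table rows, each with a description and two numeric fields.
page = [Field(kind, x, y)
        for y in (50.0, 80.0, 110.0)
        for kind, x in (("text", 10.0), ("number", 100.0), ("number", 200.0))]

# The client marks the key-fields of the first row as the query pattern.
query = page[:3]
matches = find_matches(query, page)
```

In this toy setting each match corresponds to one row with the same layout as the query, so collecting the matches gives the rows of the table, and reading the matched fields position by position gives its columns. The exhaustive search is only practical for a handful of fields.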
Document type :
Conference papers

Cited literature [13 references]

https://hal.inria.fr/hal-01254761
Contributor : Abdel Belaid
Submitted on : Thursday, January 14, 2016 - 4:11:43 PM
Last modification on : Friday, January 15, 2021 - 5:42:02 PM

Citation

Thotreingam Kasar, Tapan. K. Bhowmik, Belaïd Abdel. Table information extraction and structure recognition using query patterns. International Conference on Document Analysis and Recognition, Aug 2015, Nancy, France. ⟨10.1109/ICDAR.2015.7333928⟩. ⟨hal-01254761⟩
