Document Information Extraction and its Evaluation based on Client's Relevance - Inria - Institut national de recherche en sciences et technologies du numérique
Conference Papers, Year: 2013

Document Information Extraction and its Evaluation based on Client's Relevance

Abstract

In this paper, we present a model-based approach to extracting information content from documents and evaluate it in depth based on clients' relevance. Real-world users (i.e., clients) first provide a set of key fields from the document image that they consider important. These fields are used to build a graph in which nodes (i.e., fields) are labelled with dynamic semantics along with other features, and edges are attributed with spatial relations. This attributed relational graph (ARG) is then used to mine similar graphs from a document image; each time such graphs are extracted, they are used to reinforce or update the initial graph, iteratively producing a model. Models can therefore be employed in the absence of clients. We have validated the concept and evaluated its scientific impact on a real-world industrial problem, where table extraction was found to be the best-suited application.
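To make the graph representation concrete, the following is a minimal sketch of an attributed relational graph built from client-selected fields. It is not the authors' implementation: the `Field` class, the labels, the bounding-box format, and the four-way relation vocabulary (`left_of`, `right_of`, `above`, `below`) are all assumptions made for illustration, since the abstract does not specify the features or spatial relations actually used.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class Field:
    """A key field chosen by the client (hypothetical structure).

    bbox is (x0, y0, x1, y1) in image pixels, with y growing downward.
    """
    name: str
    label: str
    bbox: tuple


def centre(f):
    """Centre point of a field's bounding box."""
    x0, y0, x1, y1 = f.bbox
    return (x0 + x1) / 2, (y0 + y1) / 2


def spatial_relation(a, b):
    """Coarse position of b relative to a (assumed rule, not from the paper)."""
    ax, ay = centre(a)
    bx, by = centre(b)
    dx, dy = bx - ax, by - ay
    if abs(dx) >= abs(dy):
        return "right_of" if dx > 0 else "left_of"
    return "below" if dy > 0 else "above"


def build_arg(fields):
    """Return (nodes, edges): nodes keyed by name, edges keyed by ordered pair."""
    nodes = {f.name: f for f in fields}
    edges = {
        (a.name, b.name): spatial_relation(a, b)
        for a, b in permutations(fields, 2)
    }
    return nodes, edges


# Illustrative fields, as if picked from one table in a document image.
fields = [
    Field("ref", "code", (10, 10, 60, 20)),
    Field("qty", "number", (100, 10, 130, 20)),
    Field("ref2", "code", (10, 30, 60, 40)),
]
nodes, edges = build_arg(fields)
print(edges[("ref", "qty")])   # relation of "qty" relative to "ref"
print(edges[("ref", "ref2")])  # relation of "ref2" relative to "ref"
```

Storing edges per ordered pair keeps the relation directional, so a candidate graph found elsewhere in the image could be compared with the initial one edge by edge. The mining and iterative model-update steps are left out because the abstract does not describe them in enough detail to sketch.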
Main file
kc_ICDAR2013_5.pdf (415.72 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-00822479, version 1 (14-05-2013)

Identifiers

Cite

Santosh K.C., Abdel Belaïd. Document Information Extraction and its Evaluation based on Client's Relevance. ICDAR - International Conference on Document Analysis and Recognition - 2013, Aug 2013, Washington DC, United States. ⟨10.1109/ICDAR.2013.16⟩. ⟨hal-00822479⟩
