Document Information Extraction and its Evaluation based on Client's Relevance - Inria - Institut national de recherche en sciences et technologies du numérique
Conference paper, Year: 2013

Document Information Extraction and its Evaluation based on Client's Relevance

Abstract

In this paper, we present a model-based document information content extraction approach and perform an in-depth evaluation based on clients' relevance. Real-world users, i.e., clients, first provide a set of key fields from the document image that they consider important. These fields are used to build a graph in which nodes (i.e., fields) are labelled with dynamic semantics and other features, and edges are attributed with spatial relations. This attributed relational graph (ARG) is then used to mine similar graphs from a document image; each time such graphs are extracted, they are used to reinforce or update the initial graph, and iterating this process produces a model. Models can therefore be employed in the absence of clients. We have validated the concept and evaluated its scientific impact on a real-world industrial problem, where table extraction is found to be the best-suited application.
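As a rough illustration of the data structure described in the abstract, the sketch below builds a small attributed relational graph from a few client-selected fields. It is not the authors' implementation: the `Field` class, its attributes, the sample field values, and the four coarse relations derived from bounding-box centres are all assumptions made for this example, since the abstract does not specify the node features or the spatial relations actually used.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class Field:
    """A key field selected by a client (hypothetical structure)."""
    text: str
    label: str   # assumed semantic label, e.g. "date" or "header"
    x: float     # left edge of the bounding box
    y: float     # top edge (y grows downward, as in image coordinates)
    w: float     # box width
    h: float     # box height

    @property
    def center(self):
        return (self.x + self.w / 2, self.y + self.h / 2)


def spatial_relation(a: Field, b: Field) -> str:
    """Coarse position of b relative to a, from box centres (an assumption)."""
    (ax, ay), (bx, by) = a.center, b.center
    dx, dy = bx - ax, by - ay
    if abs(dx) >= abs(dy):
        return "right_of" if dx >= 0 else "left_of"
    return "below" if dy >= 0 else "above"


def build_arg(fields):
    """Return (nodes, edges): nodes keyed by index, edges keyed by ordered pair."""
    nodes = {i: f for i, f in enumerate(fields)}
    edges = {
        (i, j): spatial_relation(nodes[i], nodes[j])
        for i, j in permutations(nodes, 2)
    }
    return nodes, edges


# Made-up sample fields, for illustration only.
fields = [
    Field("Invoice date", "header", x=10, y=10, w=80, h=12),
    Field("2013-05-14", "date", x=120, y=10, w=60, h=12),
    Field("Total", "header", x=10, y=200, w=40, h=12),
]
nodes, edges = build_arg(fields)
print(edges[(0, 1)], edges[(0, 2)])
```

A directed edge per ordered pair keeps the relation unambiguous ("b is right of a" versus "a is left of b"). Mining similar graphs and iteratively updating the initial graph, as the abstract describes, would operate on top of such a structure and is not sketched here.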
Main file
kc_ICDAR2013_5.pdf (415.72 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-00822479, version 1 (14-05-2013)

Identifiers

HAL Id: hal-00822479
DOI: 10.1109/ICDAR.2013.16

Cite

Santosh K.C., Abdel Belaïd. Document Information Extraction and its Evaluation based on Client's Relevance. ICDAR - International Conference on Document Analysis and Recognition - 2013, Aug 2013, Washington DC, United States. ⟨10.1109/ICDAR.2013.16⟩. ⟨hal-00822479⟩
