Conference paper, Year: 2016

Inexact graph matching for entity recognition in OCRed documents

Abstract

This paper proposes an entity recognition system for document images processed by OCR. The system relies on a graph matching technique and is guided by a database whose records describe the entities. The input is a document whose content has been labeled with entity attributes. A first grouping of these labels, based on a score function, yields a set of candidate entities. Entity labels that are located close to one another are modeled by a structure graph, which is then matched against model graphs learned for this purpose. The graph matching technique relies on a specific cost function that integrates feature dissimilarities. The matching results are used to correct mislabeling errors and then to validate the entity recognition. Evaluated on three datasets covering different kinds of entities, the system achieves recall between 88.3% and 95% and precision between 94.3% and 95.7%.
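To make the idea of matching a structure graph against a model graph under a dissimilarity-based cost more concrete, here is a minimal Python sketch. It is not the authors' method: the abstract does not give the actual cost function, features, or search strategy, so everything below is an assumption made for illustration. The graph layout (a dict of node labels plus a dict of edge attributes), the 0/1 label dissimilarity, the edge attribute standing in for a normalized distance between labels, the weight `alpha`, and the function names `matching_cost`, `best_match`, and `corrected_labels` are all hypothetical. The search is a brute-force enumeration that is only practical for very small graphs.

```python
from itertools import permutations


def edge(graph, u, v):
    """Return the attribute of the undirected edge (u, v), or None if absent."""
    edges = graph["edges"]
    return edges.get((u, v), edges.get((v, u)))


def matching_cost(candidate, model, mapping, alpha=0.5):
    """Weighted sum of node-label and edge-attribute dissimilarities.

    Assumed cost (not taken from the paper): a label mismatch costs 1,
    an edge present in only one graph costs 1, and two matched edges
    cost the absolute difference of their attributes.
    """
    node_term = sum(
        0.0 if candidate["nodes"][u] == model["nodes"][mapping[u]] else 1.0
        for u in candidate["nodes"]
    )
    ids = list(candidate["nodes"])
    edge_term = 0.0
    for i, u in enumerate(ids):
        for v in ids[i + 1:]:
            a = edge(candidate, u, v)
            b = edge(model, mapping[u], mapping[v])
            if a is None and b is None:
                continue
            edge_term += 1.0 if (a is None or b is None) else abs(a - b)
    return alpha * node_term + (1 - alpha) * edge_term


def best_match(candidate, model, alpha=0.5):
    """Try every injective node mapping and keep the cheapest one."""
    cand_ids = list(candidate["nodes"])
    best_cost, best_mapping = float("inf"), None
    for perm in permutations(model["nodes"], len(cand_ids)):
        mapping = dict(zip(cand_ids, perm))
        cost = matching_cost(candidate, model, mapping, alpha)
        if cost < best_cost:
            best_cost, best_mapping = cost, mapping
    return best_cost, best_mapping


def corrected_labels(model, mapping):
    """Relabel each candidate node with the label of its matched model node."""
    return {u: model["nodes"][m] for u, m in mapping.items()}


# Toy model graph for an address-like entity (hypothetical data).
MODEL = {
    "nodes": {"n": "name", "s": "street", "c": "city"},
    "edges": {("n", "s"): 0.1, ("s", "c"): 0.1},
}

# Candidate graph built from OCR labels; node 1 is mislabeled as "city".
CANDIDATE = {
    "nodes": {0: "name", 1: "city", 2: "city"},
    "edges": {(0, 1): 0.1, (1, 2): 0.1},
}

cost, mapping = best_match(CANDIDATE, MODEL)
print(cost, mapping, corrected_labels(MODEL, mapping))
```

In this toy case the structure of the candidate graph outweighs the single wrong label, so the cheapest mapping pairs node 1 with the model's street node, and the label can be corrected from the model. A real system would replace the exhaustive enumeration with a scalable inexact matching algorithm.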
Main file: ICPR_2016.pdf (851.59 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-01515412, version 1 (27-04-2017)

Cite

Nihel Kooli, Abdel Belaid. Inexact graph matching for entity recognition in OCRed documents. ICPR, Dec 2016, Mexico, Mexico. pp. 4071-4076, ⟨10.1109/ICPR.2016.7900271⟩. ⟨hal-01515412⟩