Complex Document Classification and Localization Application on Identity Document Images

Abstract: This paper studies document image classification, and more specifically the classification of documents that carry little textual information on a complex background, such as identity documents. Unlike most existing systems, the proposed approach simultaneously locates the document and recognizes its class. The class is defined by the document type (passport, ID, etc.), issuing country, version, and visible side (main or back). The task is challenging because of unconstrained capture conditions, sparse textual information, and variable components that are irrelevant to classification, e.g. the photo, names, and address. First, a base of document models is created from reference images. We show that training images are not necessary: a single reference image is enough to create a document model. Then, the query image is matched against all models in the base. Unknown documents are rejected using a quality score estimated from the extracted document. The matching process is optimized so that execution time is independent of the number of document models. Once the document model is found, a more accurate matching is performed to locate the document and facilitate information extraction. Our system is evaluated on several datasets with up to 3042 real documents (representing 64 classes), achieving an accuracy of 96.6%.
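The matching and rejection steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each document model and each query image has already been reduced to a set of discrete feature tokens (for instance quantized local descriptors), and it uses a simple inverted-index vote with a fixed rejection threshold. The names `MODELS`, `build_index`, `classify`, and `min_score`, as well as the class labels and tokens, are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical base of document models: one reference image per class,
# reduced here to a set of feature tokens (an assumption of this sketch).
MODELS = {
    "passport_FR_v1_front": {"a1", "a2", "a3", "a4"},
    "id_FR_v2_back": {"b1", "b2", "a1", "b3"},
}


def build_index(models):
    """Map each feature token to the set of model names that contain it."""
    index = defaultdict(set)
    for name, features in models.items():
        for token in features:
            index[token].add(name)
    return index


def classify(query_features, models, index, min_score=0.5):
    """Return (model_name, score), or (None, score) if the query is rejected.

    The score is the fraction of a model's tokens found in the query.
    Only models sharing at least one token with the query are visited.
    """
    votes = Counter()
    for token in set(query_features):
        for name in index.get(token, ()):
            votes[name] += 1
    if not votes:
        return None, 0.0
    # Pick the highest score; ties are broken by name to stay deterministic.
    best, hits = max(
        votes.items(), key=lambda kv: (kv[1] / len(models[kv[0]]), kv[0])
    )
    score = hits / len(models[best])
    # Reject unknown documents whose best match is too weak.
    return (best, score) if score >= min_score else (None, score)


if __name__ == "__main__":
    idx = build_index(MODELS)
    print(classify({"a1", "a2", "a3", "x9"}, MODELS, idx))
    print(classify({"a1", "z1", "z2"}, MODELS, idx))
```

In this toy setting the first query shares three of the four tokens of the passport model and is accepted, while the second shares only one token with any model and is rejected. The inverted index is one common way to keep lookup cost tied to the query's features rather than to the size of the model base; the paper's own optimization and its localization step are not reproduced here.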
Document type:
Conference paper
ICDAR 2017 - The 14th IAPR International Conference on Document Analysis and Recognition, Nov 2017, Kyoto, Japan. pp.1-6, 2017

Cited literature: 35 references

https://hal.inria.fr/hal-01660504
Contributor: Teddy Furon
Submitted on: Sunday, December 10, 2017 - 23:28:01
Last modified on: Wednesday, February 21, 2018 - 01:54:51

File

ICDAR2017.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-01660504, version 1

Citation

Ahmad-Montaser Awal, Nabil Ghanmi, Ronan Sicre, Teddy Furon. Complex Document Classification and Localization Application on Identity Document Images. ICDAR 2017 - The 14th IAPR International Conference on Document Analysis and Recognition, Nov 2017, Kyoto, Japan. pp.1-6, 2017. 〈hal-01660504〉


Metrics

Record views: 102

File downloads: 74