Context Processing to Read Text on Damaged Wooden Tablets

Abstract: This paper describes context processing that presents candidates for damaged scripts on wooden tablets (mokkans). Because mokkans excavated from old strata have been damaged, even archaeologists can hardly read the scripts on them. Very often, ink in several areas has faded or is completely lost, and some characters on which the reading of other characters depends may be misrecognized. The context processing extends the Aho-Corasick method to allow self-transitions and presents candidates even for scripts with lost ink and misrecognized characters. For evaluation, we employed 4,041 place names from 8th-century Japan as the vocabulary. Each place name consists of 9 to 11 characters. Test keywords were prepared with 1 to 6 characters lost and 0 to 2 characters replaced by other characters from the vocabulary. Even for keywords with 5 characters lost and one character replaced, the method places the correct name among the top 10 candidates with 71.7% correctness.
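The candidate-presentation idea in the abstract can be illustrated with a small sketch. This is not the authors' extended Aho-Corasick automaton: it swaps in a plain dynamic program run against each vocabulary entry, which is far less efficient but shows the same matching behavior. The `GAP` marker, the function names `min_substitutions` and `candidates`, and the toy Latin-letter vocabulary are all assumptions made for illustration.

```python
GAP = "*"  # hypothetical marker: a run of lost characters of unknown length


def min_substitutions(reading, name):
    """Fewest replaced characters needed for `reading` to match `name`.

    A GAP token absorbs zero or more characters of `name`. Every other
    token must line up with exactly one character and costs 1 if it
    differs. Returns float("inf") when no alignment exists.
    """
    inf = float("inf")
    # prev[j] = best cost of matching the tokens seen so far to name[:j]
    prev = [0] + [inf] * len(name)
    for token in reading:
        cur = [inf] * (len(name) + 1)
        if token == GAP:
            # The gap may end at any j at or after where the prefix ended.
            best = inf
            for j in range(len(name) + 1):
                best = min(best, prev[j])
                cur[j] = best
        else:
            for j in range(1, len(name) + 1):
                cur[j] = prev[j - 1] + (token != name[j - 1])
        prev = cur
    return prev[len(name)]


def candidates(reading, vocabulary, max_substitutions=2, top_n=10):
    """Rank vocabulary entries by how few misread characters they imply."""
    scored = []
    for name in vocabulary:
        cost = min_substitutions(reading, name)
        if cost <= max_substitutions:
            scored.append((cost, name))
    scored.sort()  # lowest cost first, ties broken alphabetically
    return [name for _, name in scored[:top_n]]


if __name__ == "__main__":
    # Toy stand-ins for 9-character place names (not real data).
    vocabulary = ["abcdefghi", "abcxefghi", "zzzdefghi"]
    # Two lost runs ("*") and one misrecognized character ("X").
    print(candidates("a*dXf*i", vocabulary, max_substitutions=1))
```

Scanning every entry costs time proportional to the vocabulary size times the name length for each query. An automaton built once over the whole vocabulary, as the abstract describes, avoids repeating that work for each entry.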
Document type:
Conference paper
Guy Lorette. Tenth International Workshop on Frontiers in Handwriting Recognition, Oct 2006, La Baule (France), Suvisoft, 2006
Cited literature [7 references]

https://hal.inria.fr/inria-00104895
Contributor: Anne Jaigu
Submitted on: Monday, October 9, 2006 - 16:23:01
Last modified on: Monday, October 9, 2006 - 16:28:16
Document(s) archived on: Tuesday, April 6, 2010 - 17:41:36

Identifiers

  • HAL Id : inria-00104895, version 1

Citation

Akihito Kitadai, Kazu Nishijima, Kei Saito, Masaki Nakagawa, Hajime Baba, et al. Context Processing to Read Text on Damaged Wooden Tablets. Guy Lorette. Tenth International Workshop on Frontiers in Handwriting Recognition, Oct 2006, La Baule (France), Suvisoft, 2006. 〈inria-00104895〉
