Conference paper. Year: 2012

Comparing Sanskrit Texts for Critical Editions: the sequences move problem

Abstract

A critical edition takes into account various versions of the same text in order to show the differences between two distinct versions, in terms of words that are missing, changed, omitted or displaced. Traditionally, Sanskrit is written without spaces between words, and the word order can be changed without altering the meaning of a sentence. This paper describes the characteristics that make comparing Sanskrit texts a distinct problem. It presents two methods for comparing Sanskrit texts, which can be used to develop a computer-assisted critical edition. The first method uses the longest common subsequence (L.C.S.), while the second uses the global alignment algorithm. Comparing them, we see that the second method provides better results, but that neither method can detect when a word or a sentence fragment has been moved. We then present a method based on N-grams that can detect such a move when the fragment is not too far from its original location. We show how the method behaves on several examples and consider possible future developments.
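To make the move problem concrete, here is a minimal Python sketch. It is not the authors' implementation. It assumes the texts are already segmented into word tokens (which the abstract notes is not a given for Sanskrit), it uses the standard library's `difflib.SequenceMatcher` as a stand-in for an L.C.S.-style alignment, and the function name `candidate_moves`, its parameters `n` and `max_distance`, and the placeholder tokens `w1` to `w6` are all hypothetical. The idea illustrated: an alignment reports a moved fragment as one deletion plus one insertion, and matching N-grams between the deleted and inserted regions, within a bounded distance, flags the pair as a likely move.

```python
from difflib import SequenceMatcher


def ngrams(tokens, n):
    """Return (start_index, n-gram) pairs for a list of tokens."""
    return [(i, tuple(tokens[i:i + n])) for i in range(len(tokens) - n + 1)]


def candidate_moves(a, b, n=2, max_distance=10):
    """Flag n-grams deleted from `a` and inserted into `b` close by.

    Assumption: SequenceMatcher stands in for an L.C.S.-style alignment;
    it is a related but different matching heuristic.
    """
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    removed, added = [], []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # Collect n-grams from regions the alignment could not match.
        if tag in ("delete", "replace"):
            removed.extend((i1 + k, g) for k, g in ngrams(a[i1:i2], n))
        if tag in ("insert", "replace"):
            added.extend((j1 + k, g) for k, g in ngrams(b[j1:j2], n))
    moves = []
    for i, g in removed:
        for j, h in added:
            # Same n-gram on both sides, not too far apart: likely a move.
            if g == h and abs(i - j) <= max_distance:
                moves.append((g, i, j))
    return moves


# Hypothetical token sequences: the fragment "w4 w5" appears earlier in the second version.
version_a = ["w1", "w2", "w3", "w4", "w5", "w6"]
version_b = ["w1", "w4", "w5", "w2", "w3", "w6"]
print(candidate_moves(version_a, version_b))
# [(('w4', 'w5'), 3, 1)]
```

The `max_distance` bound mirrors the abstract's remark that a move is detectable only when the fragment stays close to its original location; a fragment moved further than the bound is left as an unrelated deletion and insertion.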

Domains

Other [cs.OH]
Main file
ACTI-KEMMAR-2012-1.pdf (459.92 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-00796131, version 1 (11-07-2014)

Identifiers

  • HAL Id: hal-00796131, version 1

Cite

Nicolas Béchet, Marc Le Pouliquen, Marc Csernel. Comparing Sanskrit Texts for Critical Editions: the sequences move problem. 13th International Conference on Intelligent Text Processing and Computational Linguistics, Indian Institute of Technology Delhi, 2012, New Delhi, India. ⟨hal-00796131⟩
