A Probabilistic analysis of a string edit problem

Guy Louchard, Wojciech Szpankowski 1
1 ALGO - Algorithms
Inria Paris-Rocquencourt
Abstract: We consider a string edit problem in a probabilistic framework. This problem is of considerable interest to many areas of science, most notably molecular biology and computer science. String editing transforms one string into another by performing a series of weighted edit operations of overall maximum (minimum) cost. An edit operation can be the deletion of a symbol, the insertion of a symbol, or the substitution of one symbol for another. We assume that these weights can be arbitrarily distributed. We reduce the problem to finding an optimal path in a weighted grid graph and provide several results regarding the typical behavior of such a path. In particular, we observe that the cost of the optimal path (i.e., the edit distance) is asymptotically almost surely (a.s.) equal to a·n, where a is a constant and n is the sum of the lengths of both strings. We also obtain explicit bounds on the constant a. More importantly, we show that the edit distance is well concentrated around its average value. As a by-product of our results, we also present a precise estimate of the number of alignments between two strings. To prove these findings we use techniques from random walks, diffusion limiting processes, generating functions, and the method of bounded differences.
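The reduction to a grid graph mentioned in the abstract can be illustrated with a minimal sketch. It is not the report's own formulation: the function names, the symbol-dependent weight callbacks (`w_del`, `w_ins`, `w_sub`) and the `best` parameter are assumptions made for illustration. Vertex (i, j) of the grid stands for "the first i symbols of one string have been edited into the first j symbols of the other"; a vertical edge is a deletion, a horizontal edge an insertion, and a diagonal edge a substitution. The second function counts source-to-sink paths in that grid, under the assumption that each such path corresponds to one alignment.

```python
def optimal_edit_cost(a, b, w_del, w_ins, w_sub, best=min):
    """Cost of the best path from (0, 0) to (len(a), len(b)) in the edit grid.

    w_del(x), w_ins(y), w_sub(x, y) are hypothetical weight callbacks;
    best is min for a minimum-cost problem or max for a maximum-cost one.
    """
    rows, cols = len(a), len(b)
    cost = [[0.0] * (cols + 1) for _ in range(rows + 1)]
    # First column: only deletions are possible.
    for i in range(1, rows + 1):
        cost[i][0] = cost[i - 1][0] + w_del(a[i - 1])
    # First row: only insertions are possible.
    for j in range(1, cols + 1):
        cost[0][j] = cost[0][j - 1] + w_ins(b[j - 1])
    # Interior vertices: pick the best of the three incoming edges.
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            cost[i][j] = best(
                cost[i - 1][j] + w_del(a[i - 1]),                # vertical edge
                cost[i][j - 1] + w_ins(b[j - 1]),                # horizontal edge
                cost[i - 1][j - 1] + w_sub(a[i - 1], b[j - 1]),  # diagonal edge
            )
    return cost[rows][cols]


def count_grid_paths(rows, cols):
    """Number of paths from (0, 0) to (rows, cols) using the three edge types."""
    paths = [[1] * (cols + 1) for _ in range(rows + 1)]
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            paths[i][j] = paths[i - 1][j] + paths[i][j - 1] + paths[i - 1][j - 1]
    return paths[rows][cols]
```

With unit deletion and insertion weights, a substitution weight of 0 for matching symbols and 1 otherwise, and `best=min`, the first function reduces to the classical Levenshtein distance. Passing random weights instead gives one sample of the random optimal cost whose typical behavior the abstract describes.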
Document type:
[Research Report] RR-1814, INRIA. 1992

Contributor: Rapport de Recherche Inria
Submitted on: Wednesday, May 24, 2006 - 16:35:28
Last modified on: Friday, May 25, 2018 - 12:02:02
Document(s) archived on: Tuesday, April 12, 2011 - 19:46:49



  • HAL Id : inria-00074858, version 1



Guy Louchard, Wojciech Szpankowski. A Probabilistic analysis of a string edit problem. [Research Report] RR-1814, INRIA. 1992. 〈inria-00074858〉


