A Probabilistic analysis of a string edit problem - Inria - Institut national de recherche en sciences et technologies du numérique Accéder directement au contenu
Rapport (Rapport De Recherche) Année : 1992

A Probabilistic analysis of a string edit problem

Guy Louchard
  • Fonction : Auteur
Wojciec Szpankowski
  • Fonction : Auteur

Résumé

We consider a string edit problem in a probabilistic framework. This problem is of considerable interest to many facets of science, most notably molecular biology and computer science. A string editing transformes one string into another by performing a series of weighted edit operations of overall maximum (minimum) cost. An edit operation can be the deletion of a symbol, the insertion of a symbol or the substitution of a symbol. We assume that these weights can be arbitrary distributed. We reduce the problem to finding an optimal path in a weighted grid graph and provide several results regarding a typical behavior of such a path. In particular, we observe that the optimal path (i.e., edit distance) is asymptotically almost surely (a.s) equal to an where a is a constant and n is the sum of lengths of both strings. We also obtain explicit bounds on the constant a. More importantly, we show that the edit distance is well concentrated around its average value. As a by-product of our results, we also present a precise estimate of the number of alignments between two strings. To prove these findings we use techniques of random walks, diffusion limiting processes, generating functions and the method of bounded difference.

Domaines

Autre [cs.OH]
Fichier principal
Vignette du fichier
RR-1814.pdf (1.18 Mo) Télécharger le fichier

Dates et versions

inria-00074858 , version 1 (24-05-2006)

Identifiants

  • HAL Id : inria-00074858 , version 1

Citer

Guy Louchard, Wojciec Szpankowski. A Probabilistic analysis of a string edit problem. [Research Report] RR-1814, INRIA. 1992. ⟨inria-00074858⟩
52 Consultations
26 Téléchargements

Partager

Gmail Facebook X LinkedIn More