Weighted Automata Computation of Edit Distances with Consolidations and Fragmentations

Mathieu Giraud 1, 2, * Florent Jacquemard 3, *
* Corresponding author
2 Algomus
MIS - Modélisation, Information & Systèmes, CRIStAL - Centre de Recherche en Informatique, Signal et Automatique de Lille (CRIStAL) - UMR 9189
Abstract : We study edit distances between strings, based on operations of character substitutions, insertions, deletions and additionally consolidations and fragmentations. The two latter operations transform a sequence of characters into one character and vice-versa. They correspond to the compression and expansion in Dynamic Time-Warping algorithms for speech recognition and are also used for the formal analysis of written music. Such edit distances are not computable in general for arbitrary rulesets. We propose weighted automaton constructions to compute an edit distance taking into account both consolidations and deletions, or both fragmentations and insertions. Assuming that the operation ruleset has a constant size, these constructions are polynomial into the lengths of the involved strings. We finally show that the optimal weight of sequences made of consolidations chained with fragmentations, in that order, is computable for arbitrary rulesets, and not computable if we reverse the order of fragmentations and consolidations.
Complete list of metadatas

https://hal.inria.fr/hal-01857267
Contributor : Mathieu Giraud <>
Submitted on : Wednesday, October 31, 2018 - 10:56:07 PM
Last modification on : Friday, April 19, 2019 - 4:54:58 PM
Long-term archiving on : Friday, February 1, 2019 - 5:03:01 PM

File

ms-automata.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01857267, version 3

Citation

Mathieu Giraud, Florent Jacquemard. Weighted Automata Computation of Edit Distances with Consolidations and Fragmentations. 2018. ⟨hal-01857267v3⟩

Share

Metrics

Record views

72

Files downloads

353