Journal article, Bioinformatics Advances, Year: 2023

Seedability: optimizing alignment parameters for sensitive sequence comparison

Abstract

Motivation: Most sequence alignment techniques make use of exact k-mer hits, called seeds, as anchors to optimize alignment speed. A large number of bioinformatics tools employing seed-based alignment techniques, such as Minimap2, use a single value of k per sequencing technology, without a strong guarantee that this is the best possible value. Given the ubiquity of sequence alignment, identifying values of k that lead to more sensitive alignments is thus an important task. To aid this, we present Seedability, a seed-based alignment framework designed for estimating an optimal seed k-mer length (as well as a minimal number of shared seeds) based on a given alignment identity threshold. In particular, we were motivated to make Minimap2 more sensitive in the pairwise alignment of short sequences.

Results: The experimental results herein show improved alignments of short and divergent sequences when using the parameter values determined by Seedability in comparison to the default values of Minimap2. We also show several cases of pairs of real divergent sequences, where the default parameter values of Minimap2 yield no output alignments, but the values output by Seedability produce plausible alignments.

Availability and implementation: https://github.com/lorrainea/Seedability (distributed under GPL v3.0).
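To make the notion of seeds concrete, the following minimal Python sketch counts the exact k-mers shared by two sequences for several values of k. It is only an illustration of why the choice of k matters for divergent sequences and is not the estimation procedure implemented in Seedability. The function names `kmers` and `shared_seeds` and the toy sequences are hypothetical.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings (k-mers) of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_seeds(a, b, k):
    """Count the distinct k-mers that occur exactly in both sequences."""
    return len(kmers(a, k) & kmers(b, k))


# Hypothetical toy sequences: b is a copy of a with two substitutions.
a = "ACGTACGTTAGCCGATAGGCTA"
b = "ACGTACCTTAGCCGTTAGGCTA"

# Print the number of shared seeds for a few candidate seed lengths.
for k in (3, 5, 7, 9):
    print(k, shared_seeds(a, b, k))
```

In a sketch like this, a longer k leaves fewer exact matches between divergent sequences, so a seed-based aligner with a fixed k and a fixed minimum number of shared seeds may find no anchors at all for such a pair.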
Main file
vbad108.pdf (900.99 KB)
Origin: Publication funded by an institution

Dates and versions

hal-04385612, version 1 (10-01-2024)

License

Attribution

Identifiers

Cite

Lorraine A K Ayad, Rayan Chikhi, Solon P Pissis. Seedability: optimizing alignment parameters for sensitive sequence comparison. Bioinformatics Advances, 2023, 3 (1), ⟨10.1093/bioadv/vbad108⟩. ⟨hal-04385612⟩