Journal article, BMC Bioinformatics, Year: 2021

IUPACpal: efficient identification of inverted repeats in IUPAC-encoded DNA sequences

Abstract

Background: An inverted repeat is a DNA sequence followed downstream by its reverse complement, potentially with a gap in the centre. Inverted repeats are found in both prokaryotic and eukaryotic genomes, and they have been linked with countless possible functions. Many international consortia provide a comprehensive description of common genetic variation, making alternative sequence representations, such as IUPAC encoding, necessary for leveraging the full potential of such broad variation datasets.

Results: We present IUPACpal, an exact tool for efficient identification of inverted repeats in IUPAC-encoded DNA sequences, also allowing for potential mismatches and gaps in the inverted repeats.

Conclusion: Within the parameters that were tested, our experimental results show that IUPACpal compares favourably to a similar application packaged with EMBOSS. We show that IUPACpal identifies many previously unidentified inverted repeats when compared with EMBOSS, and that it does so orders of magnitude faster.
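The definition in the abstract (a sequence followed downstream by its reverse complement, possibly with a central gap and some mismatches) can be illustrated with a small brute-force check. This is only a sketch and not IUPACpal's algorithm. The function names `can_pair` and `is_inverted_repeat` are hypothetical, and the pairing rule for ambiguous symbols (two IUPAC symbols pair if any base encoded by one is the Watson-Crick complement of any base encoded by the other) is an assumption made for illustration. Input is assumed to be uppercase.

```python
# Illustrative brute-force check; NOT the method used by IUPACpal.
# Standard IUPAC nucleotide codes mapped to the bases they encode.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def can_pair(x, y):
    """Assumed rule: x and y pair if some base encoded by x is the
    Watson-Crick complement of some base encoded by y."""
    return any(COMPLEMENT[a] in IUPAC[y] for a in IUPAC[x])


def is_inverted_repeat(seq, start, arm_len, gap, max_mismatches=0):
    """Return True if seq[start:] begins with a left arm of arm_len symbols,
    then `gap` symbols, then a right arm that is the reverse complement of
    the left arm, tolerating up to max_mismatches non-pairing positions."""
    end = start + 2 * arm_len + gap
    if start < 0 or arm_len <= 0 or gap < 0 or end > len(seq):
        return False
    mismatches = 0
    for k in range(arm_len):
        left = seq[start + k]
        right = seq[end - 1 - k]  # walk the right arm backwards
        if not can_pair(left, right):
            mismatches += 1
            if mismatches > max_mismatches:
                return False
    return True


print(is_inverted_repeat("GAATTC", 0, 3, 0))     # True: GAA / TTC
print(is_inverted_repeat("ACGAAACGT", 0, 3, 3))  # True: ACG, gap AAA, CGT
print(is_inverted_repeat("RAATTY", 0, 3, 0))     # True: R (A/G) can pair with Y (C/T)
print(is_inverted_repeat("GAATTA", 0, 3, 0))     # False: G cannot pair with A
print(is_inverted_repeat("GAATTA", 0, 3, 0, 1))  # True: one mismatch allowed
```

Checking one candidate this way costs time proportional to the arm length, and scanning every start position, arm length, and gap multiplies that cost, which is why a dedicated tool is needed for long sequences.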
Main file
s12859-021-03983-2.pdf (2.15 MB)
Origin: Publisher files allowed on an open archive

Dates and versions

hal-03498463, version 1 (21-12-2021)

Identifiers

HAL Id: hal-03498463
DOI: 10.1186/s12859-021-03983-2

Cite

Hayam Alamro, Mai Alzamel, Costas S Iliopoulos, Solon P Pissis, Steven Watts. IUPACpal: efficient identification of inverted repeats in IUPAC-encoded DNA sequences. BMC Bioinformatics, 2021, 22, pp.1-12. ⟨10.1186/s12859-021-03983-2⟩. ⟨hal-03498463⟩

Collections

INRIA INRIA2