Journal article, BMC Bioinformatics, Year: 2021

IUPACpal: efficient identification of inverted repeats in IUPAC-encoded DNA sequences

Abstract

Background: An inverted repeat is a DNA sequence followed downstream by its reverse complement, potentially with a gap in the centre. Inverted repeats are found in both prokaryotic and eukaryotic genomes and have been linked with countless possible functions. Many international consortia provide a comprehensive description of common genetic variation, making alternative sequence representations, such as IUPAC encoding, necessary for leveraging the full potential of such broad variation datasets.

Results: We present IUPACpal, an exact tool for efficient identification of inverted repeats in IUPAC-encoded DNA sequences, which also allows for potential mismatches and gaps in the inverted repeats.

Conclusion: Within the parameters that were tested, our experimental results show that IUPACpal compares favourably to a similar application packaged with EMBOSS. We show that IUPACpal identifies many previously unidentified inverted repeats when compared with EMBOSS, and that it does so with orders of magnitude improved speed.
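To make the definition in the abstract concrete, the following is a minimal brute-force sketch of checking one candidate inverted repeat in an IUPAC-encoded sequence, with a central gap and a mismatch budget. It is not IUPACpal's algorithm, which the abstract does not describe. The function names (`pairs_with`, `is_inverted_repeat`) and parameters (`arm_len`, `gap`, `max_mismatches`) are hypothetical, and the matching rule is an assumption: two IUPAC symbols are treated as complementary when at least one base encoded by the first is the Watson-Crick complement of a base encoded by the second. Input is assumed to be upper case.

```python
# Standard IUPAC nucleotide codes mapped to the bases they represent.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def pairs_with(x, y):
    """Assumed rule: True if some base encoded by x complements some base encoded by y."""
    complements_of_y = {COMPLEMENT[base] for base in IUPAC[y]}
    return bool(set(IUPAC[x]) & complements_of_y)


def is_inverted_repeat(seq, start, arm_len, gap=0, max_mismatches=0):
    """Check whether seq[start:start+arm_len] is followed, after `gap` symbols,
    by its reverse complement, tolerating up to `max_mismatches` mismatches."""
    right_start = start + arm_len + gap
    right_end = right_start + arm_len
    if start < 0 or arm_len <= 0 or gap < 0 or right_end > len(seq):
        return False
    mismatches = 0
    for i in range(arm_len):
        left = seq[start + i]            # read the left arm forwards
        right = seq[right_end - 1 - i]   # read the right arm backwards
        if not pairs_with(left, right):
            mismatches += 1
            if mismatches > max_mismatches:
                return False
    return True
```

For example, under these assumptions `is_inverted_repeat("GAATTC", 0, 3)` holds because `GAA` is followed directly by its reverse complement `TTC`, and `is_inverted_repeat("ACGNNNCGT", 0, 3, gap=3)` holds with the three central symbols acting as the gap. Scanning every start position, arm length, and gap this way would be far slower than a dedicated tool; the sketch only illustrates what is being searched for.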
Main file
s12859-021-03983-2.pdf (2.15 MB)
Origin: Publisher files allowed on an open archive

Dates and versions

hal-03498463, version 1 (21-12-2021)

Cite

Hayam Alamro, Mai Alzamel, Costas S Iliopoulos, Solon P Pissis, Steven Watts. IUPACpal: efficient identification of inverted repeats in IUPAC-encoded DNA sequences. BMC Bioinformatics, 2021, 22, pp.1-12. ⟨10.1186/s12859-021-03983-2⟩. ⟨hal-03498463⟩

Collections

INRIA INRIA2