A New Challenge for Compression Algorithms: Genetic Sequences

Abstract: Universal data compression algorithms fail to compress genetic sequences. This failure is due to the specificity of this particular kind of "text". We analyze in some detail the properties of the sequences that cause classical algorithms to fail. We then present a lossless algorithm, biocompress-2, which compresses the information contained in DNA and RNA sequences by detecting regularities such as the presence of palindromes. The algorithm combines substitutional and statistical...
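
To make the notion of a "palindrome" in a genetic sequence concrete, here is a minimal Python sketch of a search for reverse-complement repeats, the kind of regularity the abstract mentions. It is not the biocompress-2 algorithm: the function names, the brute-force search, and the `min_len` threshold are assumptions chosen for illustration only.

```python
# Illustrative sketch only; NOT the biocompress-2 algorithm from the paper.
# All names and the min_len threshold are hypothetical.

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string over A, C, G, T."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def longest_palindrome_match(seq, pos, min_len=4):
    """Find the longest prefix of seq[pos:] whose reverse complement
    already occurs entirely inside seq[:pos].

    Returns (start, length) of the earlier occurrence, or None if no
    match of at least min_len bases exists. Brute force, for clarity.
    """
    max_len = min(pos, len(seq) - pos)
    # Try the longest candidate first so the first hit is the best one.
    for length in range(max_len, min_len - 1, -1):
        target = reverse_complement(seq[pos:pos + length])
        start = seq.find(target, 0, pos)  # search restricted to seq[:pos]
        if start != -1:
            return (start, length)
    return None


# "GAGCTT" is the reverse complement of the earlier segment "AAGCTC".
print(longest_palindrome_match("AAGCTCGAGCTT", 6))  # (0, 6)
```

A substitutional compressor could, in principle, replace such a segment by a pointer to its earlier occurrence plus a length. How biocompress-2 actually encodes these references is not stated in the abstract shown here.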
Document type:
Journal article
Information Processing & Management, Elsevier, 1994, 30

https://hal.inria.fr/inria-00180949
Contributor: Chine Publications Liama
Submitted on: Monday, October 22, 2007 - 14:53:14
Last modified on: Tuesday, October 23, 2007 - 11:55:16
Document(s) archived on: Sunday, April 11, 2010 - 23:33:03

Identifiers

  • HAL Id: inria-00180949, version 1

Citation

Stéphane Grumbach, Fariza Tahi. A New Challenge for Compression Algorithms: Genetic Sequences. Information Processing & Management, Elsevier, 1994, 30. 〈inria-00180949〉

Metrics

Record views: 407

File downloads: 1586