Coding Region Prediction Based on a Universal DNA Sequence Representation Method

Dominique Lavenier; Xianyang Jiang; Stephen Yau

doi:10.1089/cmb.2008.0041

Journal article, Journal of Computational Biology, Year: 2008


Dominique Lavenier

Role: Author
PersonId: 1401
IdHAL: dominique-lavenier
ORCID: 0000-0003-2557-680X

Biological systems and models, bioinformatics and sequences

Xianyang Jiang

Role: Author

Institute of Microelectronics and Information Technology

Stephen Yau

Role: Author

Department of Mathematics, Statistics and Computer Science [Chicago]

Abstract

Graphical representation of DNA sequences provides a simple and intuitive way of viewing, anchoring, and comparing various gene structures, so a simple and non-degenerate method is attractive to both biologists and computational biologists. In this study, a universal graphical representation method for DNA sequences, based on S.S.-T. Yau's method, is presented. The method adopts a trigonometric function to represent the four nucleotides A, G, C, and T. Some interesting characteristics of the universal representation are introduced. We apply frequency analysis to DNA sequences encoded with our representation method, demonstrating possible applications in coding region prediction and sequence analysis. Based on the statistical results of this frequency analysis, a simple coding region predictor and an optimized one are presented. An experiment on the broadly accepted ROSETTA data set demonstrates that the performance of the optimized predictor is comparable to that of other popular methods.
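The abstract does not give the trigonometric mapping or the exact frequency analysis, so the following is only a minimal sketch of the general idea under stated assumptions. The angles in `ANGLES` are placeholders chosen for illustration, not the values from the paper, and the period-3 spectral measure is a commonly used signal-processing indicator for coding regions that is swapped in here; whether the paper's predictors use this exact measure is not stated. The function names `encode`, `period3_power`, and `sliding_period3` are hypothetical.

```python
import cmath
import math

# ASSUMPTION: placeholder angles, one per nucleotide. These are NOT the
# values used in the paper; they only make the four symbols distinct.
ANGLES = {"A": math.pi / 6, "G": math.pi / 3, "C": -math.pi / 3, "T": -math.pi / 6}


def encode(seq):
    """Map a DNA string to a numeric signal via sin() of the placeholder angles."""
    return [math.sin(ANGLES[base]) for base in seq.upper()]


def period3_power(signal):
    """Normalized power of the Fourier component at frequency 1/3.

    The mean is removed first so that a constant signal yields zero power.
    """
    n = len(signal)
    mean = sum(signal) / n
    acc = sum(
        (x - mean) * cmath.exp(-2j * math.pi * i / 3)
        for i, x in enumerate(signal)
    )
    return abs(acc) ** 2 / n


def sliding_period3(seq, window=9, step=3):
    """Return (start, period-3 power) pairs for each window along the sequence."""
    signal = encode(seq)
    return [
        (start, period3_power(signal[start:start + window]))
        for start in range(0, len(signal) - window + 1, step)
    ]


if __name__ == "__main__":
    # A sequence with strict codon-like periodicity versus a homopolymer.
    print(period3_power(encode("ATG" * 10)))
    print(period3_power(encode("A" * 30)))
    print(sliding_period3("ATG" * 6))
```

In a predictor built along these lines, windows whose period-3 power exceeds some threshold would be flagged as candidate coding regions; the window size, step, and threshold are free parameters that the abstract does not specify.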

Keywords

DNA sequences; coding region; bioinformatics; frequency analysis; mining methods and algorithms; signal processing

Domains

Hardware Architecture [cs.AR]

https://inria.hal.science/inria-00347594

Submitted on: Tuesday, December 16, 2008, 11:42:50

Last modified on: Friday, March 24, 2023, 14:52:51

Dates and versions

inria-00347594, version 1 (16-12-2008)

Identifiers

HAL Id: inria-00347594, version 1
DOI: 10.1089/cmb.2008.0041

Cite

Dominique Lavenier, Xianyang Jiang, Stephen Yau. Coding Region Prediction Based on a Universal DNA Sequence Representation Method. Journal of Computational Biology, 2008, 15 (10), pp.1237-1256. ⟨10.1089/cmb.2008.0041⟩. ⟨inria-00347594⟩

Collections

EC-PARIS UNIV-RENNES1 CNRS INRIA INSA-RENNES IRISA IRISA-D7 INRIA2 UR1-MATH-STIC UR1-UFR-ISTIC UNIV-RENNES INSA-GROUPE UR1-MATH-NUM

103 views

0 downloads
