Coding Region Prediction Based on a Universal DNA Sequence Representation Method

Abstract: Graphical representation of DNA sequences provides a simple and intuitive way of viewing, anchoring, and comparing gene structures, so a simple, non-degenerate method is attractive to both biologists and computational biologists. In this study, a universal graphical representation method for DNA sequences, based on S.S.-T. Yau's method, is presented. The method uses a trigonometric function to represent the four nucleotides A, G, C, and T. Some interesting characteristics of the universal representation are introduced. We apply frequency analysis to DNA sequences encoded with this representation, demonstrating possible applications in coding region prediction and sequence analysis. Based on the statistical results of this frequency analysis, a simple coding region predictor and an optimized one are presented. An experiment on the widely used ROSETTA data set shows that the performance of the optimized predictor is comparable to that of other popular methods.
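The general idea of combining a trigonometric encoding with frequency analysis can be illustrated with a minimal Python sketch. It is not the predictor from the paper: the angles in `ANGLES`, the function names, and the window and step sizes are all hypothetical choices made for illustration. The sketch relies on the well-known observation that protein-coding regions tend to show a period-3 component in their Fourier spectrum, and it measures the strength of that component in a sliding window.

```python
import cmath
import math

# Hypothetical mapping: each nucleotide gets an angle on the unit circle.
# These angles are illustrative assumptions, not the values used in the paper.
ANGLES = {
    "A": math.pi / 4,
    "G": 3 * math.pi / 4,
    "C": 5 * math.pi / 4,
    "T": 7 * math.pi / 4,
}


def encode(seq):
    """Map a DNA string to complex numbers cos(theta) + i*sin(theta).

    Raises KeyError for symbols other than A, G, C, T.
    """
    return [cmath.exp(1j * ANGLES[base]) for base in seq.upper()]


def period3_power(seq):
    """Power of the Fourier component at frequency 1/3, divided by length."""
    signal = encode(seq)
    n = len(signal)
    if n == 0:
        return 0.0
    coeff = sum(
        x * cmath.exp(-2j * math.pi * k / 3) for k, x in enumerate(signal)
    )
    return abs(coeff) ** 2 / n


def sliding_scores(seq, window=351, step=3):
    """Return (start, score) pairs for each window along the sequence."""
    return [
        (start, period3_power(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]
```

Under these assumptions, windows with a high score would be candidate coding regions, and a threshold on the score would act as a very simple predictor. Choosing that threshold, and the window length, would need tuning on annotated data.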
Document type:
Journal article
Journal of Computational Biology, Mary Ann Liebert, 2008, 15 (10), pp.1237-1256. 〈10.1089/cmb.2008.0041〉
https://hal.inria.fr/inria-00347594
Contributor: Dominique Lavenier
Submitted on: Tuesday, December 16, 2008 - 11:42:50
Last modified on: Wednesday, April 11, 2018 - 01:59:48

Citation

Dominique Lavenier, Xianyang Jiang, Stephen Yau. Coding Region Prediction Based on a Universal DNA Sequence Representation Method. Journal of Computational Biology, Mary Ann Liebert, 2008, 15 (10), pp.1237-1256. 〈10.1089/cmb.2008.0041〉. 〈inria-00347594〉
