
Fouille de données du génome à l'aide de modèles de Markov cachés [Genome data mining using hidden Markov models]

Sébastien Hergalant 1 Bertrand Aigle 2 Pierre Leblond 2 Jean-François Mari 1
1 ORPAILLEUR - Knowledge representation, reasoning
INRIA Lorraine, LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
Abstract: We propose a new data mining method based on second-order hidden Markov models (HMM2). It couples a background model with dedicated a posteriori decoding algorithms to extract DNA heterogeneities. Unsupervised training and a state-splitting algorithm specify an HMM2 that observes fixed-length sequences (k-mers and k-d-k-mers) rather than single nucleotides. The training process requires no a priori knowledge. We tested this method on Actinomycete genomes (Streptomyces and Mycobacterium) and found many sequences that appear to be parts of transcription factor binding sites.
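To illustrate what observing fixed-length sequences rather than nucleotides might look like, here is a minimal Python sketch that turns a DNA string into overlapping k-mers and gapped pairs. It is not taken from the paper: the function names `kmers` and `kdk_mers` are hypothetical, and reading a "k-d-k mer" as two k-mers separated by a gap of d nucleotides is an assumption, since the abstract does not define the term.

```python
def kmers(seq, k):
    """Return every overlapping k-mer of seq, left to right."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def kdk_mers(seq, k, d):
    """Return (left, right) pairs of k-mers separated by d bases.

    Assumption: a "k-d-k mer" is a k-mer, a gap of d nucleotides,
    then another k-mer. The abstract does not spell this out.
    """
    span = 2 * k + d  # total window length covered by one pair
    return [
        (seq[i:i + k], seq[i + k + d:i + span])
        for i in range(len(seq) - span + 1)
    ]


if __name__ == "__main__":
    print(kmers("ACGTAC", 3))
    print(kdk_mers("ACGTACGT", 2, 1))
```

Under these assumptions, each position in the genome yields one symbol for the model to observe, so the observation sequence is nearly as long as the genome itself.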
Document type :
Conference papers

https://hal.inria.fr/inria-00001213
Contributor : Agnès Vidard
Submitted on : Wednesday, April 5, 2006 - 3:57:05 PM
Last modification on : Wednesday, April 14, 2021 - 3:14:02 PM
Long-term archiving on : Wednesday, March 29, 2017 - 12:13:01 PM

Files

 Restricted access
To satisfy the distribution rights of the publisher, the document is embargoed until : never


Identifiers

  • HAL Id : inria-00001213, version 1

Citation

Sébastien Hergalant, Bertrand Aigle, Pierre Leblond, Jean-François Mari. Fouille de données du génome à l'aide de modèles de Markov cachés. Extraction et Gestion de Connaissances - EGC 2005, Jan 2005, Paris, France. pp. 141-148. ⟨inria-00001213⟩
