Conference papers

Fouille de données du génome à l'aide de modèles de Markov cachés [Genome data mining using hidden Markov models]

Sébastien Hergalant 1 Bertrand Aigle 2 Pierre Leblond 2 Jean-François Mari 1
1 ORPAILLEUR - Knowledge representation, reasoning
INRIA Lorraine, LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
Abstract : We propose a new data mining method based on second-order hidden Markov models (HMM2). It couples a background model with dedicated a posteriori decoding algorithms to extract DNA heterogeneities. Unsupervised training and a state-splitting algorithm specify an HMM2 that observes fixed-length sequences (k-mers and k-d-k-mers) rather than single nucleotides. The training process requires no a priori knowledge. We tested the method on Actinomycete genomes (Streptomyces and Mycobacterium) and found many sequences that appear to be parts of transcription factor binding sites.
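The observation alphabet mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `kmers` and `gapped_kmers` are hypothetical, the use of overlapping windows is an assumption, and reading a "k-d-k-mer" as two k-mers separated by d unobserved nucleotides is an interpretation that the abstract does not confirm.

```python
def kmers(seq, k):
    """Return every overlapping k-mer of seq, scanning left to right."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def gapped_kmers(seq, k, d):
    """Return (left, right) pairs of k-mers separated by a gap of d nucleotides.

    Assumption: this is one plausible reading of "k-d-k-mer"; the gap
    positions are skipped and do not appear in the observation.
    """
    span = 2 * k + d  # total window width: k-mer + gap + k-mer
    return [
        (seq[i:i + k], seq[i + k + d:i + span])
        for i in range(len(seq) - span + 1)
    ]


if __name__ == "__main__":
    dna = "ACGTACGT"
    print(kmers(dna, 3))
    print(gapped_kmers(dna, 2, 3))
```

Under these assumptions, a sequence shorter than the window simply yields an empty list of observations.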
Document type :
Conference papers

Contributor : Agnès Vidard
Submitted on : Wednesday, April 5, 2006 - 3:57:05 PM
Last modification on : Friday, March 11, 2022 - 9:52:27 AM
Long-term archiving on : Wednesday, March 29, 2017 - 12:13:01 PM


 Restricted access
To satisfy the distribution rights of the publisher, the document is embargoed until : never



  • HAL Id : inria-00001213, version 1



Sébastien Hergalant, Bertrand Aigle, Pierre Leblond, Jean-François Mari. Fouille de données du génome à l'aide de modèles de Markov cachés. Extraction et Gestion de Connaissances - EGC 2005, Jan 2005, Paris, France. pp. 141-148. ⟨inria-00001213⟩


