Fouille de données du génome à l'aide de modèles de Markov cachés (Genome data mining using hidden Markov models)

Sébastien Hergalant; Bertrand Aigle; Pierre Leblond; Jean-François Mari

Conference paper. Year: 2005

Sébastien Hergalant

Role: Author
PersonId : 18401
IdHAL : sebastien-hergalant
ORCID : 0000-0001-8456-7992

Knowledge representation, reasoning

Bertrand Aigle

Role: Author
PersonId : 13778
IdHAL : bertrand-aigle
ORCID : 0000-0001-5266-5926
IdRef : 120677407

Laboratoire de génétique et microbiologie

Pierre Leblond

Role: Author
PersonId : 13774
IdHAL : pierre-leblond
ORCID : 0000-0002-8703-454X
IdRef : 074217127

Laboratoire de génétique et microbiologie

Jean-François Mari

Role: Author
PersonId : 1202580

Knowledge representation, reasoning

Abstract

We propose a new data mining method based on second-order hidden Markov models (HMM2). It couples a background model with dedicated a posteriori decoding algorithms to extract DNA heterogeneities. Unsupervised training and a state-splitting algorithm specify an HMM2 that observes fixed-length sequences (k-mers and k-d-k-mers) rather than single nucleotides. The training process requires no a priori knowledge. We tested the method on Actinomycete genomes (Streptomyces and Mycobacterium) and found many sequences that appear to be parts of transcription factor binding sites.
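To make the observation alphabet concrete, here is a minimal Python sketch of how a DNA sequence might be turned into fixed-length observations before being fed to a model. It is not the authors' implementation. The function names `kmers` and `kdk_mers` are hypothetical, and the reading of a "k-d-k mer" as two k-mers separated by a gap of d unconstrained nucleotides is an assumption, since the abstract does not define the term.

```python
def kmers(seq, k):
    """Return the overlapping k-mers of seq, one per start position."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def kdk_mers(seq, k, d):
    """Return (left, right) k-mer pairs separated by a gap of d nucleotides.

    Assumption: a "k-d-k mer" is read here as two k-mers with d
    unconstrained nucleotides between them; the abstract does not define it.
    """
    span = 2 * k + d  # total window length covered by one k-d-k mer
    return [
        (seq[i:i + k], seq[i + k + d:i + span])
        for i in range(len(seq) - span + 1)
    ]


if __name__ == "__main__":
    print(kmers("ACGTAC", 3))
    print(kdk_mers("ACGTACGT", 2, 3))
```

Under this reading, each position in the genome yields one observation symbol, so a model over k-mers sees a sequence almost as long as the genome itself, but drawn from an alphabet of 4^k symbols instead of 4.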

Keywords

HMM, data mining, genome, promoter

Domains

Artificial intelligence [cs.AI]

File under embargo

Visibility date undetermined

Contributor: Agnès Vidard

https://inria.hal.science/inria-00001213

Submitted on: Wednesday, 5 April 2006, 15:57:05

Last modified on: Friday, 24 March 2023, 14:52:47

Long-term archiving on: Wednesday, 29 March 2017, 12:13:01

Dates and versions

inria-00001213, version 1 (05-04-2006)

Identifiers

HAL Id: inria-00001213, version 1

Cite

Sébastien Hergalant, Bertrand Aigle, Pierre Leblond, Jean-François Mari. Fouille de données du génome à l'aide de modèles de Markov cachés. Extraction et Gestion de Connaissances - EGC 2005, Jan 2005, Paris, France. pp. 141-148. ⟨inria-00001213⟩

Collections

CNRS INRIA INRA UNIV-LORRAINE INRIA2 LORIA DYNAMIC-UL INRAE
