hal-00328181, version 1

Bayesian hidden Markov model for DNA segmentation: A prior sensitivity analysis

Darfiana Nur (corresponding author) 1, David Allingham 2, Judith Rousseau 3,4, Kerrie Mengersen 5, Ross McVinish 5

Computational Statistics & Data Analysis (2009) 9999

Abstract: This paper examines sensitivity to the specification of the prior in a hidden Markov model describing homogeneous segments of DNA sequences. An intron from the chimpanzee α-fetoprotein gene, which plays an important role in embryonic development in mammals, is analysed. Three main aims are considered: (i) to assess the sensitivity to prior specification in Bayesian hidden Markov models for DNA sequence segmentation; (ii) to examine the impact of replacing the standard Dirichlet prior with a mixture Dirichlet prior; and (iii) to propose and illustrate a more comprehensive approach to sensitivity analysis, using importance sampling. The results show that (i) the posterior estimates obtained under a Bayesian hidden Markov model are indeed sensitive to the specification of the prior distributions; (ii) compared with the standard Dirichlet prior, the mixture Dirichlet prior is more flexible, less sensitive to the choice of hyperparameters and less constraining in the analysis, thus improving posterior estimates; and (iii) importance sampling is computationally feasible, fast and effective in allowing a richer sensitivity analysis.
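
The importance-sampling idea mentioned in the abstract can be illustrated with a minimal sketch. It does not reproduce the paper's hidden Markov model or its MCMC sampler. Instead it assumes a much simpler conjugate Dirichlet-multinomial model for the nucleotide frequencies of a single segment, so that the reweighted estimate can be checked against a closed-form answer. The counts, hyperparameters and function names are all hypothetical. Draws from the posterior under one Dirichlet prior are reweighted by the ratio of the new prior density to the old one, which gives posterior summaries under the new prior without resampling.

```python
import math
import random


def dirichlet_logpdf(theta, alpha):
    """Log density of a Dirichlet(alpha) distribution evaluated at theta."""
    log_norm = math.lgamma(sum(alpha)) - sum(math.lgamma(a) for a in alpha)
    return log_norm + sum((a - 1.0) * math.log(t) for a, t in zip(alpha, theta))


def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet(alpha) vector by normalising independent gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def reweighted_posterior_mean(samples, old_alpha, new_alpha):
    """Estimate the posterior mean under the new prior from draws made under the old prior.

    The likelihood cancels in the importance ratio, so each weight is
    proportional to new_prior(theta) / old_prior(theta).
    """
    log_w = [dirichlet_logpdf(t, new_alpha) - dirichlet_logpdf(t, old_alpha)
             for t in samples]
    shift = max(log_w)  # subtract the maximum for numerical stability
    w = [math.exp(lw - shift) for lw in log_w]
    total = sum(w)
    mean = [sum(wi * t[j] for wi, t in zip(w, samples)) / total
            for j in range(len(new_alpha))]
    # Effective sample size: a low value signals unreliable reweighting.
    ess = total ** 2 / sum(wi * wi for wi in w)
    return mean, ess


# Hypothetical nucleotide counts (A, C, G, T) for one segment; not data from the paper.
counts = [30, 20, 25, 25]
old_alpha = [1.0, 1.0, 1.0, 1.0]  # assumed baseline Dirichlet prior
new_alpha = [5.0, 5.0, 5.0, 5.0]  # assumed alternative prior to test sensitivity

rng = random.Random(0)
posterior_alpha = [a + c for a, c in zip(old_alpha, counts)]
samples = [sample_dirichlet(posterior_alpha, rng) for _ in range(20000)]

is_mean, ess = reweighted_posterior_mean(samples, old_alpha, new_alpha)

# Conjugacy gives the exact posterior mean under the new prior for comparison.
n_total = sum(new_alpha) + sum(counts)
exact_mean = [(a + c) / n_total for a, c in zip(new_alpha, counts)]

print("reweighted:", [round(m, 4) for m in is_mean])
print("exact:     ", [round(m, 4) for m in exact_mean])
print("effective sample size:", round(ess))
```

In a non-conjugate setting such as a hidden Markov model, no exact answer is available for comparison, so the effective sample size is the main diagnostic: when the alternative prior is far from the baseline, the weights degenerate and a fresh sampler run is the safer choice.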

  • 1: School of Mathematical and Physical Sciences
  • University of Newcastle
  • 2: ARC Centre of Excellence for Complex Dynamic Systems and Control
  • ARC Centre of Excellence for Complex Dynamic Systems and Control
  • 3: CEntre de REcherches en MAthématiques de la DEcision (CEREMADE)
  • CNRS: UMR7534 – Université Paris IX - Paris Dauphine
  • 4: Centre de Recherche en Économie et Statistique (CREST)
  • INSEE – École Nationale de la Statistique et de l'Administration Économique
  • 5: School of Mathematical Sciences
  • Queensland University of Technology
  • Collaboration: Darfiana Nur, David Allingham, Judith Rousseau, Kerrie L. Mengersen, Ross McVinish
  • Domain: Mathematics/Statistics
    Statistics/Theory
  • Keywords: DNA sequence – hidden Markov model – Bayesian model – sensitivity analysis – α-fetoprotein – Markov chain Monte Carlo – importance sampling
 
  • oai:hal.archives-ouvertes.fr:hal-00328181
  • Contributor:
  • Submitted on: Friday 10 October 2008, 22:10:25
  • Last modified on: Saturday 14 February 2009, 17:00:07