Everything is difficult before it is simple ,
Bayes and the one obtained by random initialization with the V-Bayes algorithm (optimal for the ICL criterion (4,1)) lie in classes 2 and 3 for the rows and in the central classes for the columns. The partition from random initialization created more (and therefore smaller) groups for the columns, which results in more contrasted blocks. The Gibbs sampler with the VEM algorithm returns the same row partition as the partition retained with the most ,
4 shows that the best strategy is again random initialization coupled with the V-Bayes algorithm. Next comes the Gibbs+V-Bayes combination. As a general rule, random initialization and the Gibbs sampler give similar results. Note that the criteria are worse than those of Section 4.2.4, where for each pair (g, m) we used three ,
the LG algorithm gives more concordant results than for the Merovingian coats of arms. The ternary data probably provide more information for this algorithm ,
Algorithme EM et classification non supervisée, 2004. ,
Identifiability of parameters in latent structure models with many observed variables. The Annals of Statistics, pp.3099-3132, 2009. ,
URL : https://hal.archives-ouvertes.fr/hal-00591202
A Comparison of Segment Retention Criteria for Finite Mixture Logit Models, Journal of Marketing Research, vol.40, issue.2, pp.235-243, 2003. ,
DOI : 10.1509/jmkr.40.2.235.19225
Ségrégation professionnelle hommes-femmes : les théories en présence, 1997. ,
Modèle à blocs latents pour l'analyse de données métagénomiques, 2014. ,
A generalized maximum entropy approach to Bregman co-clustering and matrix approximation, Proceedings of the 2004 ACM SIGKDD international conference on Knowledge discovery and data mining, KDD '04, pp.1919-1986, 2007. ,
DOI : 10.1145/1014052.1014111
Model-Based Gaussian and Non-Gaussian Clustering, Biometrics, vol.49, issue.3, pp.803-821, 1993. ,
DOI : 10.2307/2532201
Sélection de modèle pour la classification non supervisée, 2009. ,
Discovering local structure in gene expression data, Proceedings of the sixth annual international conference on Computational biology , RECOMB '02, pp.49-57, 2002. ,
DOI : 10.1145/565196.565203
The Netflix prize, Proceedings of KDD cup and workshop, p.35, 2007. ,
Modèle génératif pour données ordinales, 44e Journées de Statistique, 2012. ,
Assessing a mixture model for clustering with the integrated completed likelihood. Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol.22, issue.7, pp.719-725, 2000. ,
Le logiciel MIXMOD d'analyse de mélange pour la classification et l'analyse discriminante. La revue de Modulad, pp.25-44, 2006. ,
Measure and Probability, 1986. ,
Pattern recognition and machine learning, 2006. ,
Simultaneous clustering of objects and variables, Analyse des données et Informatique, pp.187-203, 1979. ,
Revue bibliographique pour la classification croisée, 2014. ,
Latent Bloc Model : a Review, 2014. ,
General methods for monitoring convergence of iterative simulations, Journal of computational and graphical statistics, vol.7, issue.4, pp.434-455, 1998. ,
Multimodel Inference, Sociological Methods & Research, vol.33, issue.2, pp.261-304, 2004. ,
DOI : 10.1177/0049124104268644
A classification EM algorithm for clustering and two stochastic versions, Computational Statistics & Data Analysis, vol.14, issue.3, pp.315-332, 1992. ,
DOI : 10.1016/0167-9473(92)90042-E
URL : https://hal.archives-ouvertes.fr/inria-00075196
Une histoire de discrétisation, La Revue de Modulad, vol.11, pp.7-44, 1993. ,
On Stochastic Versions of the EM Algorithm, 1995. ,
URL : https://hal.archives-ouvertes.fr/inria-00074164
Computational and Inferential Difficulties with Mixture Posterior Distributions, Journal of the American Statistical Association, vol.95, issue.451, pp.957-970, 2000. ,
DOI : 10.1080/01621459.1995.10476589
URL : https://hal.archives-ouvertes.fr/inria-00073049
Consistency of maximum-likelihood and variational estimators in the stochastic block model, Electronic Journal of Statistics, vol.6, issue.0, pp.1847-1899, 2012. ,
DOI : 10.1214/12-EJS729
URL : https://hal.archives-ouvertes.fr/hal-00593644
Classification and estimation in the Stochastic Blockmodel based on the empirical degrees, Electronic Journal of Statistics, vol.6, issue.0, pp.2574-2601, 2012. ,
DOI : 10.1214/12-EJS753
URL : https://hal.archives-ouvertes.fr/hal-01190224
Détermination du nombre de classes dans les méthodes de bipartitionnement, 17ème Rencontres de la Société Francophone de Classification, pp.119-122, 2010. ,
Biclustering of expression data, International Conference on Intelligent Systems for Molecular Biology (ISMB), pp.93-103, 1999. ,
Maximum likelihood from incomplete data via the EM algorithm, Journal of the Royal Statistical Society, vol.39, issue.1, pp.1-38, 1977. ,
Simultaneous co-clustering and modeling of market data, Proceedings of the Workshop for Data Mining in Marketing, 2007. ,
Information-theoretic co-clustering, Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining , KDD '03, pp.89-98, 2003. ,
DOI : 10.1145/956750.956764
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.13.9802
A permutation-based algorithm for block clustering, Journal of Classification, vol.8, issue.1, pp.65-91, 1991. ,
DOI : 10.1007/BF02616248
The Stochastic EM Algorithm: Estimation and Asymptotic Results, Bernoulli, vol.6, issue.3, pp.457-489, 2000. ,
DOI : 10.2307/3318671
How Many Clusters? Which Clustering Method? Answers Via Model-Based Cluster Analysis, The Computer Journal, vol.41, issue.8, pp.578-588, 1998. ,
DOI : 10.1093/comjnl/41.8.578
Mixtures : estimation and applications, chapter Dealing with label switching under model uncertainty, pp.193-218, 2011. ,
Inversion probabiliste bayésienne en analyse d'incertitude, Thèse, 2012. ,
Sampling-Based Approaches to Calculating Marginal Densities, Journal of the American Statistical Association, vol.85, issue.410, pp.398-409, 1990. ,
DOI : 10.1080/01621459.1986.10478240
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.512.2330
Inference from iterative simulation using multiple sequences. Statistical science, pp.457-472, 1992. ,
DOI : 10.1214/ss/1177011136
Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. Pattern Analysis and Machine Intelligence, IEEE Transactions on, issue.6, pp.721-741, 1984. ,
Practical Markov Chain Monte Carlo, Statistical Science, vol.7, issue.4, pp.473-483, 1992. ,
DOI : 10.1214/ss/1177011137
Categorization of classification, Mathematics and Computer Science in Biology and Medicine, Her Majesty's Stationery Office, 1965. ,
Algorithme de classification d'un tableau de contingence, First international symposium on data analysis and informatics, pp.487-500, 1977. ,
Classification croisée. Thèse d'état, 1983. ,
Simultaneous clustering of rows and columns, Control and Cybernetics, vol.24, issue.4, pp.437-458, 1995. ,
Clustering with block mixture models, Pattern Recognition, vol.36, issue.2, pp.463-473, 2003. ,
DOI : 10.1016/S0031-3203(02)00074-2
Clustering of contingency table and mixture model, European Journal of Operational Research, vol.183, issue.3, pp.1055-1066, 2007. ,
DOI : 10.1016/j.ejor.2005.10.074
Block clustering with Bernoulli mixture models: Comparison of different approaches, Computational Statistics & Data Analysis, vol.52, issue.6, pp.3233-3245, 2008. ,
DOI : 10.1016/j.csda.2007.09.007
Un modèle de mélange pour la classification croisée d'un tableau de données continues. In CAP'09, 11e conférence sur l'apprentissage artificiel, pp.287-302, 2009. ,
Co-Clustering, 2013. ,
DOI : 10.1002/9781118649480
URL : https://hal.archives-ouvertes.fr/hal-00933301
Non-uniqueness in probabilistic numerical identification of bacteria, Journal of Applied Probability, vol.31, issue.02, pp.542-548, 1994. ,
DOI : 10.1099/00207713-24-4-494
Two-mode Clustering with Genetic Algorithms, Classification, automation, and new media, pp.87-93, 2002. ,
DOI : 10.1007/978-3-642-55991-4_9
Classification and Clustering, Journal of Marketing Research, vol.18, issue.4, 1975. ,
DOI : 10.2307/3151350
Bloc Voting in the United States Senate, Journal of Classification, vol.17, issue.1, pp.29-49, 2000. ,
DOI : 10.1007/s003570000003
Gene-expression profiles in hereditary breast cancer, New England Journal of Medicine, vol.344, pp.539-548, 2001. ,
Approximate Bayesian inference for simple mixtures, In COMPSTAT, pp.331-336, 2000. ,
DOI : 10.1007/978-3-642-57678-2_42
Defining transcription modules using large-scale gene expression data, Bioinformatics, vol.20, issue.13, pp.1993-2003, 2004. ,
DOI : 10.1093/bioinformatics/bth166
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.100.48
Analyzing in situ gene expression in the mouse brain with image registration, feature extraction and block clustering, BMC Bioinformatics, vol.8, issue.Suppl 10, p.5, 2007. ,
DOI : 10.1186/1471-2105-8-S10-S5
An Invariant Form for the Prior Probability in Estimation Problems, Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences, vol.186, issue.1007, pp.453-461, 1946. ,
DOI : 10.1098/rspa.1946.0056
The Selection of Prior Distributions by Formal Rules, Journal of the American Statistical Association, vol.91, issue.435, pp.1343-1370, 1996. ,
DOI : 10.1080/01621459.1996.10477003
Learning systems of concepts with an infinite relational model, Proceedings of The Twenty-First National Conference on Artificial Intelligence, pp.381-388, 2006. ,
Consistent estimation of the order of mixture models, Sankhya Series A, vol.62, pp.49-66, 2000. ,
Méthodes bayésiennes variationnelles : concepts et applications en neuroimagerie, pp.107-131, 2010. ,
Estimation d'un modèle à blocs latent par l'algorithme SEM, 42e Journées de Statistique, 2010. ,
Model selection for the binary latent block model, 20th International Conference on Computational Statistics ,
URL : https://hal.archives-ouvertes.fr/hal-00924210
Estimation and selection for the latent block model on categorical data, Statistics and Computing, vol.22, issue.2, pp.1-16, 2014. ,
DOI : 10.1007/s11222-014-9472-2
URL : https://hal.archives-ouvertes.fr/hal-00802764
Spectral Biclustering of Microarray Data: Coclustering Genes and Conditions, Genome Research, vol.13, issue.4, pp.703-716, 2003. ,
DOI : 10.1101/gr.648603
URL : http://www.ncbi.nlm.nih.gov/pmc/articles/PMC430175
On information and sufficiency. The Annals of Mathematical Statistics, pp.79-86, 1951. ,
DOI : 10.1214/aoms/1177729694
Essai philosophique sur les probabilités, 1825. ,
Co-clustering with generative models. Rapport technique, 2009. ,
Plaid models for gene expression data, Statistica Sinica, vol.12, pp.61-86, 2002. ,
Le critère BIC : fondements théoriques et interprétation, 2004. ,
Et si vous étiez un bayésien qui s'ignore ? Revue Modulad, pp.92-105, 2005. ,
Les plaques-boucles mérovingiennes. Dossiers de l'archéologie, pp.83-87, 1980. ,
Sélection de modèle pour la classification croisée de données continues, Thèse, 2012. ,
An Approximation of the Integrated Classification Likelihood for the Latent Block Model, 2012 IEEE 12th International Conference on Data Mining Workshops, pp.147-153, 2012. ,
DOI : 10.1109/ICDMW.2012.32
URL : https://hal.archives-ouvertes.fr/hal-00933245
Model selection in block clustering by the integrated classification likelihood, pp.519-530, 2012. ,
URL : https://hal.archives-ouvertes.fr/hal-00730829
Un protocole de simulation de données pour la classification croisée, 2012. ,
Co-clustering by block value decomposition, Proceeding of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining , KDD '05, pp.635-640, 2005. ,
DOI : 10.1145/1081870.1081949
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.381.2448
Some methods for classification and analysis of multivariate observations, Proceedings of the fifth Berkeley symposium on mathematical statistics and probability, p.14, 1967. ,
Biclustering algorithms for biological data analysis: a survey, IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol.1, issue.1, pp.24-45, 2004. ,
DOI : 10.1109/TCBB.2004.2
Convergence of the groups posterior distribution in latent or stochastic block models. arXiv preprint, 2012. ,
URL : https://hal.archives-ouvertes.fr/hal-01174515
Uncovering latent structure in valued graphs: A variational approach, The Annals of Applied Statistics, vol.4, issue.2, pp.715-742, 2010. ,
DOI : 10.1214/07-AOAS361SUPP
URL : https://hal.archives-ouvertes.fr/hal-01197514
Modeling heterogeneity in random graphs : a selective review, 2014. ,
URL : https://hal.archives-ouvertes.fr/hal-00948421
Sélection de variables pour la classification par mélanges gaussiens pour prédire la fonction des gènes orphelins, pp.69-80, 2009. ,
The EM algorithm and extensions, 2008. ,
The classification and mixture maximum likelihood approaches to cluster analysis, Handbook of statistics, pp.199-208, 1982. ,
DOI : 10.1016/S0169-7161(82)02012-4
Nonparametric Bayesian biclustering, 2007. ,
Equation of state calculations by fast computing machines. The journal of chemical physics, pp.1087-1092, 1953. ,
An examination of procedures for determining the number of clusters in a data set, Psychometrika, vol.50, issue.2, pp.159-179, 1985. ,
DOI : 10.1007/BF02294245
Extracting conserved gene expression motifs from gene expression data, Biocomputing 2003, pp.77-88, 2003. ,
DOI : 10.1142/9789812776303_0008
Block clustering and statistical modelling, Symposium on mixture modeling, 2007. ,
Estimation and Prediction for Stochastic Blockstructures, Journal of the American Statistical Association, vol.96, issue.455, pp.1077-1087, 2001. ,
DOI : 10.1198/016214501753208735
Application of matrix clustering to web log analysis and access prediction. In WEBKDD 2001-Mining Web Log Data Across All Customers Touch Points, Third International Workshop, pp.13-21, 2001. ,
Consommation de médicaments. la santé et le bien-être, p.445, 2001. ,
Les contes de Perrault, 1867. ,
A general strategy for the simultaneous classification of variables and objects in ecological data tables, Journal of Vegetation Science, vol.2, issue.4, pp.435-444, 1991. ,
DOI : 10.2307/3236025
A systematic comparison and evaluation of biclustering methods for gene expression data, Bioinformatics, vol.22, issue.9, pp.1122-1129, 2006. ,
DOI : 10.1093/bioinformatics/btl060
Bayesian Model Selection in Social Research, Sociological Methodology, vol.25, pp.111-164, 1995. ,
DOI : 10.2307/271063
[Practical Markov Chain Monte Carlo]: Comment: One Long Run with Diagnostics: Implementation Strategies for Markov Chain Monte Carlo, Statistical Science, vol.7, issue.4, pp.493-497, 1992. ,
DOI : 10.1214/ss/1177011143
Applied statistical decision theory. Division of Research, Harvard Business School, 1961. ,
Le choix bayésien : Principes et pratique, 2006. ,
Two-mode multi-partitioning, Computational Statistics & Data Analysis, vol.52, issue.4, pp.1984-2003, 2008. ,
DOI : 10.1016/j.csda.2007.06.025
Two-dimensional clusters in grammatical relations, AAAI Symposium on Representation and Acquisition of Lexical Knowledge, 1995. ,
The Mondrian process, pp.1377-1384, 2008. ,
Selecting Among Multi-Mode Partitioning Models of Different Complexities: A Comparison of Four Model Selection Criteria, Journal of Classification, vol.25, issue.1, pp.67-85, 2008. ,
DOI : 10.1007/s00357-008-9005-9
Estimating the dimension of a model. The annals of statistics, pp.461-464, 1978. ,
Pac-Bayesian analysis of co-clustering and beyond, The Journal of Machine Learning Research, vol.11, pp.3595-3646, 2010. ,
Algorithms for non-negative matrix factorization, Advances in Neural Information Processing Systems 13, pp.556-562, 2001. ,
Model-based overlapping co-clustering, Proceeding of SIAM Conference on Data Mining, 2006. ,
Bayesian Co-clustering, 2008 Eighth IEEE International Conference on Data Mining, pp.530-539, 2008. ,
DOI : 10.1109/ICDM.2008.91
Discovering statistically significant biclusters in gene expression data, Proceedings of ISMB 2002, pp.136-144, 2002. ,
DOI : 10.1093/bioinformatics/18.suppl_1.S136
The information bottleneck method, annual Allerton Conference on Communication, Control, and Computing, 1999. ,
A Bayesian approach to two-mode clustering, 2009. ,
Modèles de régression linéaire en grande dimension pour l'étude des facteurs de transcription d'arabidopsis thaliana, 2013. ,
Convergence and asymptotic normality of variational bayesian approximations for exponential family models with missing values, Proceedings of the 20th conference on Uncertainty in artificial intelligence, pp.577-584, 2004. ,
Variational Bayes Estimation of Mixing Coefficients, Deterministic and statistical methods in machine learning, pp.281-295, 2005. ,
DOI : 10.1007/11559887_17
Inadequacy of interval estimates corresponding to variational bayesian approximations, 2004. ,
Convergence properties of a general algorithm for calculating variational bayesian estimates for a normal mixture model, Bayesian Analysis, vol.1, issue.3, pp.625-650, 2006. ,
Nonparametric Bayesian Co-clustering Ensembles, The 2011 SIAM International Conference on Data Mining, 2011. ,
DOI : 10.1137/1.9781611972818.29
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.231.4768
Block clustering with collapsed latent block models, Statistics and Computing, vol.28, issue.2, pp.1-14, 2010. ,
DOI : 10.1007/s11222-011-9233-4
URL : http://arxiv.org/abs/1011.2948
δ-clusters: Capturing subspace correlation in a large data set, 2002. ,
Orthogonal nonnegative matrix tri-factorization for co-clustering: Multiplicative updates on Stiefel manifolds. Information processing & management, pp.559-570, 2010. ,
DOI : 10.1016/j.ipm.2009.12.007