M. Berger, G. Badis, A. Gehrke, and S. Talukder, Variation in homeodomain DNA binding revealed by high-resolution analysis of sequence preferences, Cell, 2008.

N. Dalal and B. Triggs, Histograms of oriented gradients for human detection, CVPR, 2005.
URL: https://hal.archives-ouvertes.fr/inria-00548512

A. Dempster, N. Laird, and D. Rubin, Maximum likelihood from incomplete data via the EM algorithm, Journal of the Royal Statistical Society, 1977.

P. Felzenszwalb, D. McAllester, and D. Ramanan, A discriminatively trained, multiscale, deformable part model, CVPR, 2008.

A. Gelman, J. Carlin, H. Stern, and D. Rubin, Bayesian Data Analysis, 1995.

J. Havrda and F. Charvát, Quantification method of classification processes: Concept of structural α-entropy, Kybernetika, 1967.

G. Heitz, G. Elidan, B. Packer, and D. Koller, Shape-based object localization for descriptive classification, 2009.

T. Jaakkola, M. Meila, and T. Jebara, Maximum entropy discrimination, NIPS, 1999.

E. Jaynes, Probability Theory: The Logic of Science, 2003.

T. Jebara, Discriminative, generative and imitative learning, 2001.

T. Jebara and T. Jaakkola, Feature selection and dualities in maximum entropy discrimination, UAI, 2000.

T. Joachims, T. Finley, and C. Yu, Cutting-plane training of structural SVMs, Machine Learning, 2009.

M. P. Kumar, B. Packer, and D. Koller, Self-paced learning for latent variable models, NIPS, 2010.

A. Mathai and P. Rathie, Basic Concepts in Information Theory and Statistics, 1974.

R. Neal and G. Hinton, A view of the EM algorithm that justifies incremental, sparse, and other variants, Learning in Graphical Models, 1999.

C. Rao, Diversity and dissimilarity coefficients: A unified approach, Theoretical Population Biology, 1982.

A. Rényi, On measures of entropy and information, Berkeley Symposium on Mathematics, Statistics and Probability, 1961.

J. Salojärvi, K. Puolamäki, and S. Kaski, Expectation maximization algorithms for conditional likelihoods, ICML, 2005.

S. Shalev-Shwartz, Y. Singer, and N. Srebro, Pegasos: Primal estimated sub-gradient solver for SVM, ICML, 2007.

B. Sriperumbudur and G. Lanckriet, On the convergence of the concave-convex procedure, NIPS Workshop on Optimization for Machine Learning, 2009.

R. Sundberg, Maximum likelihood theory for incomplete data from an exponential family, Scandinavian Journal of Statistics, 1974.

I. Tsochantaridis, T. Hofmann, Y. Altun, and T. Joachims, Support vector machine learning for interdependent and structured output spaces, ICML, 2004.

C. Yu and T. Joachims, Learning structural SVMs with latent variables, ICML, 2009.

A. Yuille and A. Rangarajan, The concave-convex procedure, Neural Computation, 2003.