S. Ashida, The effect of reminiscence music therapy sessions on changes in depressive symptoms in elderly persons with dementia, Journal of Music Therapy, vol.37, issue.3, pp.170-182, 2000.

J. C. Broutart, P. Robert, D. Balas, N. Broutart, and J. Cahors, Démence et perte cognitive: Prise en charge du patient et de sa famille, chap. Mnémothérapie, reviviscence et maladie d'Alzheimer, 2017.

A. Dantcheva, P. Bilinski, H. T. Nguyen, J. C. Broutart, and F. Bremond, Expression Recognition for Severely Demented Patients in Music Reminiscence-Therapy, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01543231

A. Dantcheva and F. Bremond, Gender estimation based on smile-dynamics, IEEE Transactions on Information Forensics and Security (TIFS), vol.12, issue.3, pp.719-729, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01412408

P. N. Dawadi, D. J. Cook, M. Schmitter-Edgecombe, and C. Parsey, Automated assessment of cognitive health using smart home technologies, Technology and health care, vol.21, issue.4, pp.323-343, 2013.

H. Dibeklioglu, Z. Hammal, and J. F. Cohn, Dynamic multimodal measurement of depression severity using deep autoencoding, IEEE Journal of Biomedical and Health Informatics, vol.PP, issue.99, pp.1-1, 2017.

P. Ekman and W. Friesen, Facial action coding system: a technique for the measurement of facial movement, 1978.

C. Feichtenhofer, A. Pinz, and A. Zisserman, Convolutional two-stream network fusion for video action recognition, IEEE Conference on Computer Vision and Pattern Recognition, 2016.

P. F. Felzenszwalb, R. B. Girshick, D. McAllester, and D. Ramanan, Object detection with discriminatively trained part-based models, vol.32, pp.1627-1645, 2010.

M. F. Folstein, S. E. Folstein, and P. R. McHugh, "Mini-mental state": a practical method for grading the cognitive state of patients for the clinician, Journal of psychiatric research, vol.12, issue.3, pp.189-198, 1975.

S. Han, Z. Meng, A. S. Khan, and Y. Tong, Incremental boosting convolutional neural network for facial action unit recognition, Advances in Neural Information Processing Systems, vol.29, pp.109-117, 2016.

B. Hasani and M. H. Mahoor, Facial expression recognition using enhanced deep 3d convolutional neural networks, Computer Vision and Pattern Recognition Workshops, pp.2278-2288, 2017.

K. He, X. Zhang, S. Ren, and J. Sun, Deep residual learning for image recognition, 2015.

H. Jung, S. Lee, J. Yim, S. Park, and J. Kim, Joint fine-tuning in deep neural networks for facial expression recognition, Computer Vision (ICCV), 2015 IEEE International Conference on, pp.2983-2991, 2015.

A. König, C. F. Crispim-Junior, A. Derreumaux, G. Bensadoun, P. D. Petit et al., Validation of an automatic video monitoring system for the detection of instrumental activities of daily living in dementia patients, Journal of Alzheimer's Disease, vol.44, issue.2, pp.675-685, 2015.

M. Leo, G. Medioni, M. Trivedi, T. Kanade, and G. M. Farinella, Computer vision for assistive technologies, Computer Vision and Image Understanding, vol.154, pp.1-15, 2017.

W. Li, F. Abtahi, and Z. Zhu, Action unit detection with region adaptation, multilabeling learning and optimal temporal fusing, Computer Vision and Pattern Recognition (CVPR), 2017 IEEE Conference on, pp.6766-6775, 2017.

P. Liu, S. Han, Z. Meng, and Y. Tong, Facial expression recognition via a boosted deep belief network, Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), pp.1805-1812, 2014.

B. Martinez, M. F. Valstar, B. Jiang, and M. Pantic, Automatic analysis of facial actions: A survey, IEEE Transactions on Affective Computing, 2017.

M. Mathias, R. Benenson, M. Pedersoli, and L. Van Gool, Face detection without bells and whistles, European conference on computer vision, pp.720-735, 2014.

F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion et al., Scikit-learn: Machine learning in Python, Journal of Machine Learning Research, vol.12, pp.2825-2830, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00650905

A. Raglio, G. Bellelli, P. Mazzola, D. Bellandi, A. Giovagnoli et al., Music, music therapy and dementia: a review of literature and the recommendations of the italian psychogeriatric association, Maturitas, vol.72, issue.4, pp.305-310, 2012.

H. M. Ridder and E. Gummesen, The use of extemporizing in music therapy to facilitate communication in a person with dementia: An explorative case study, Australian Journal of Music Therapy, vol.26, 2015.

P. Rodriguez, G. Cucurull, J. Gonzàlez, J. M. Gonfaus, K. Nasrollahi et al., Deep pain: Exploiting long short-term memory networks for facial expression classification, IEEE Transactions on Cybernetics, 2017.

R. Romdhane, E. Mulin, A. Derreumeaux, N. Zouba, J. Piano et al., Automatic video monitoring system for assessment of Alzheimer's disease symptoms, The journal of nutrition, health & aging, vol.16, issue.3, pp.213-218, 2012.

S. Saha, R. Navarathna, L. Helminger, and R. M. Weber, Unsupervised deep representations for learning audience facial behaviors, 2018.

G. Sandbach, S. Zafeiriou, M. Pantic, and D. Rueckert, Recognition of 3d facial expression dynamics, Image and Vision Computing, vol.30, issue.10, pp.762-773, 2012.

E. Sariyanidi, H. Gunes, and A. Cavallaro, Automatic analysis of facial affect: A survey of registration, representation, and recognition, vol.37, pp.1113-1133, 2015.

K. Simonyan and A. Zisserman, Very deep convolutional networks for large-scale image recognition, 2014.

K. Simonyan and A. Zisserman, Two-stream convolutional networks for action recognition in videos, Advances in Neural Information Processing Systems, vol.27, pp.568-576, 2014.

K. Soomro, A. Roshan Zamir, and M. Shah, UCF101: A dataset of 101 human actions classes from videos in the wild, 2012.

M. Suzuki, M. Kanamori, M. Watanabe, S. Nagasawa, E. Kojima et al., Behavioral and endocrinological evaluation of music therapy for elderly patients with dementia, Nursing & Health Sciences, vol.6, issue.1, pp.11-18, 2004.

H. Svansdottir and J. Snaedal, Music therapy in moderate and severe dementia of Alzheimer's type: a case-control study, International psychogeriatrics, vol.18, issue.4, pp.613-621, 2006.

D. L. Tran, R. Walecki, O. Rudovic, S. Eleftheriadis, B. W. Schuller et al., Deepcoder: Semi-parametric variational autoencoders for facial action unit intensity estimation, 2017.

D. Tran, L. Bourdev, R. Fergus, L. Torresani, and M. Paluri, Learning spatiotemporal features with 3d convolutional networks, Proceedings of the IEEE international conference on computer vision, pp.4489-4497, 2015.

A. C. Vink, M. S. Bruinsma, and R. J. Scholten, Music therapy for people with dementia, 2003.

P. Viola and M. J. Jones, Robust real-time face detection, International journal of computer vision, vol.57, issue.2, pp.137-154, 2004.

R. Walecki, O. Rudovic, V. Pavlovic, and M. Pantic, Variable-state latent conditional random field models for facial expression analysis, Image and Vision Computing, vol.58, pp.25-37, 2017.

H. Wang, A. Kläser, C. Schmid, and C. L. Liu, Dense trajectories and motion boundary descriptors for action recognition, INRIA, 2012.
DOI : 10.1007/s11263-012-0594-8
URL : https://hal.archives-ouvertes.fr/hal-00803241

H. Wang and C. Schmid, Action recognition with improved trajectories, Proceedings of the IEEE international conference on computer vision, pp.3551-3558, 2013.
DOI : 10.1109/iccv.2013.441
URL : https://hal.archives-ouvertes.fr/hal-00873267

L. Wang, Y. Qiao, and X. Tang, Action recognition with trajectory-pooled deep-convolutional descriptors, The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015.
DOI : 10.1109/cvpr.2015.7299059
URL : http://arxiv.org/pdf/1505.04868

L. Wang, Y. Xiong, Z. Wang, and Y. Qiao, Towards good practices for very deep two-stream convnets, 2015.
DOI : 10.1007/978-3-319-46484-8_2
URL : http://arxiv.org/pdf/1608.00859

L. Zafeiriou, S. Nikitidis, S. Zafeiriou, and M. Pantic, Slow features nonnegative matrix factorization for temporal data decomposition, Image Processing (ICIP), 2014 IEEE International Conference on, pp.1430-1434, 2014.
DOI : 10.1109/icip.2014.7025286
URL : http://ibug.doc.ic.ac.uk/media/uploads/documents/nmf_sfa_icip2014.pdf

K. Zhao, W. S. Chu, and H. Zhang, Deep region and multi-label learning for facial action unit detection, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp.3391-3399, 2016.
DOI : 10.1109/cvpr.2016.369

Y. Zhu, Y. Shang, Z. Shao, and G. Guo, Automated depression diagnosis based on deep networks to encode facial appearance and dynamics, IEEE Transactions on Affective Computing, vol.PP, issue.99, pp.1-1, 2017.
DOI : 10.1109/taffc.2017.2650899
