R. W. Picard, Affective computing: challenges, International Journal of Human-Computer Studies, vol.59, issue.1-2, pp.55-64, 2003.

S. G. Barsade and A. P. Knight, Group affect, Annu. Rev. Organ. Psychol. Organ. Behav., vol.2, issue.1, pp.21-46, 2015.

D. Dupré, E. G. Krumhuber, D. Küster, and G. J. McKeown, A performance comparison of eight commercially available automatic classifiers for facial affect recognition, PLOS ONE, vol.15, issue.4, pp.1-17, 2020.

A. Dhall, R. Goecke, J. Joshi, J. Hoey, and T. Gedeon, EmotiW 2016: Video and group-level emotion recognition challenges, Proceedings of the 18th ACM International Conference on Multimodal Interaction, pp.427-432, 2016.

K. Ahuja, D. Kim, F. Xhakaj, V. Varga, A. Xie et al., EduSense: Practical classroom sensing at scale, Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, vol.3, pp.1-26, 2019.

R. Laurent, D. Vaufreydaz, and P. Dessus, Ethical Teaching Analytics in a Context-Aware Classroom: A Manifesto, ERCIM News, pp.39-40, 2020.
URL : https://hal.archives-ouvertes.fr/hal-02438020

A. James, Y. H. Chua, T. Maszczyk, A. M. Núñez, R. Bull et al., Automated classification of classroom climate by audio analysis, 9th International Workshop on Spoken Dialogue System Technology, pp.41-49, 2019.

A. Dhall, G. Sharma, R. Goecke, and T. Gedeon, EmotiW 2020: Driver gaze, group emotion, student engagement and physiological signal based challenges, Proceedings of the ACM International Conference on Multimodal Interaction, 2020.

G. Sharma, S. Ghosh, and A. Dhall, Automatic group level affect and cohesion prediction in videos, 2019 8th International Conference on Affective Computing and Intelligent Interaction Workshops and Demos (ACIIW), pp.161-167, 2019.

C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna, Rethinking the inception architecture for computer vision, Proceedings of the IEEE conference on computer vision and pattern recognition, pp.2818-2826, 2016.

X. Guo, B. Zhu, L. F. Polania, C. Boncelet, and K. E. Barner, Group-level emotion recognition using hybrid deep models based on faces, scenes, skeletons and visual attentions, Proceedings of the 20th ACM International Conference on Multimodal Interaction, pp.635-639, 2018.

K. Simonyan and A. Zisserman, Very deep convolutional networks for large-scale image recognition, 2014.

K. He, X. Zhang, S. Ren, and J. Sun, Deep residual learning for image recognition, Proceedings of the IEEE conference on computer vision and pattern recognition, pp.770-778, 2016.

C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed et al., Going deeper with convolutions, Proceedings of the IEEE conference on computer vision and pattern recognition, pp.1-9, 2015.

S. Hochreiter and J. Schmidhuber, Long short-term memory, Neural computation, vol.9, issue.8, pp.1735-1780, 1997.

J. Hu, L. Shen, and G. Sun, Squeeze-and-excitation networks, Proceedings of the IEEE conference on computer vision and pattern recognition, pp.7132-7141, 2018.

K. Wang, X. Zeng, J. Yang, D. Meng, K. Zhang et al., Cascade attention networks for group emotion recognition with face, body and image cues, Proceedings of the 20th ACM International Conference on Multimodal Interaction, pp.640-645, 2018.

K. Zhang, Z. Zhang, Z. Li, and Y. Qiao, Joint face detection and alignment using multitask cascaded convolutional networks, IEEE Signal Processing Letters, vol.23, issue.10, pp.1499-1503, 2016.

Z. Cao, T. Simon, S. Wei, and Y. Sheikh, Realtime multi-person 2d pose estimation using part affinity fields, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp.7291-7299, 2017.

A. S. Khan, Z. Li, J. Cai, Z. Meng, J. O'Reilly et al., Group-level emotion recognition using deep models with a four-stream hybrid network, Proceedings of the 20th ACM International Conference on Multimodal Interaction, pp.623-629, 2018.

J. Deng, W. Dong, R. Socher, L. Li, K. Li et al., ImageNet: A large-scale hierarchical image database, 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp.248-255, 2009.

S. Li, W. Zheng, Y. Zong, C. Lu, C. Tang et al., Bi-modality fusion for emotion recognition in the wild, 2019 International Conference on Multimodal Interaction, pp.589-594, 2019.

M. Schuster and K. K. Paliwal, Bidirectional recurrent neural networks, IEEE Transactions on Signal Processing, vol.45, issue.11, pp.2673-2681, 1997.

F. Eyben, M. Wöllmer, and B. Schuller, openSMILE: The Munich versatile and fast open-source audio feature extractor, Proceedings of the 18th ACM International Conference on Multimedia, pp.1459-1462, 2010.

H. Zhou, D. Meng, Y. Zhang, X. Peng, J. Du et al., Exploring emotion features and fusion strategies for audio-video emotion recognition, 2019 International Conference on Multimodal Interaction, pp.562-566, 2019.

A. Ramakrishnan, E. Ottmar, J. LoCasale-Crouch, and J. Whitehill, Toward automated classroom observation: Predicting positive and negative climate, 2019 14th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2019), pp.1-8, 2019.

G. Varol, J. Romero, X. Martin, N. Mahmood, M. J. Black et al., Learning from synthetic humans, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp.109-117, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01505711

F. Yu, A. Seff, Y. Zhang, S. Song, T. Funkhouser et al., LSUN: Construction of a large-scale image dataset using deep learning with humans in the loop, 2015.

N. C. Ebner, M. Riediger, and U. Lindenberger, FACES: A database of facial expressions in young, middle-aged, and older women and men: development and validation, Behavior Research Methods, vol.42, issue.1, pp.351-362, 2010.

E. Goeleven, R. De Raedt, L. Leyman, and B. Verschuere, The Karolinska Directed Emotional Faces: A validation study, Cognition and Emotion, vol.22, issue.6, pp.1094-1118, 2008.

A. Dawel, L. Wright, J. Irons, R. Dumbleton, R. Palermo et al., Perceived emotion genuineness: normative ratings for popular facial expression stimuli and the development of perceived-as-genuine and perceived-as-fake sets, Behavior Research Methods, vol.49, issue.4, pp.1539-1562, 2017.

A. Joseph and P. Geetha, Facial emotion detection using modified eyemap-mouthmap algorithm on an enhanced image and classification with TensorFlow, The Visual Computer, vol.36, issue.3, pp.529-539, 2020.

J. Howse, OpenCV computer vision with python, 2013.

X. Wang, K. Wang, and S. Lian, A survey on face data augmentation, 2019.

J. Chen, Q. Ou, Z. Chi, and H. Fu, Smile detection in the wild with deep convolutional neural networks, Machine vision and applications, vol.28, pp.173-183, 2017.

D. Trampe, J. Quoidbach, and M. Taquet, Emotions in everyday life, PLoS ONE, vol.10, issue.12, p.e0145450, 2015.

A. Paszke, S. Gross, S. Chintala, G. Chanan, E. Yang et al., Automatic differentiation in pytorch, 2017.

S. Zagoruyko and N. Komodakis, Wide residual networks, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01832503

M. Sandler, A. Howard, M. Zhu, A. Zhmoginov, and L. Chen, MobileNetV2: Inverted residuals and linear bottlenecks, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp.4510-4520, 2018.

G. Huang, Z. Liu, L. van der Maaten, and K. Q. Weinberger, Densely connected convolutional networks, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp.4700-4708, 2017.

S. Xie, R. Girshick, P. Dollár, Z. Tu, and K. He, Aggregated residual transformations for deep neural networks, Proceedings of the IEEE conference on computer vision and pattern recognition, pp.1492-1500, 2017.

F. N. Iandola, S. Han, M. W. Moskewicz, K. Ashraf, W. J. Dally et al., SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5 MB model size, 2016.

H. Wang, Z. Wang, M. Du, F. Yang, Z. Zhang et al., Score-CAM: Score-weighted visual explanations for convolutional neural networks, 2019.

U. Ozbulak, PyTorch CNN visualizations, 2019.

Q. Xu, Z. Qin, and T. Wan, Generative cooperative net for image generation and data augmentation, 2017.

X. Wang, Y. Wang, and W. Li, U-Net conditional GANs for photo-realistic and identity-preserving facial expression synthesis, ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), vol.15, pp.1-23, 2019.