J. B. Tenenbaum and W. T. Freeman, Separating style and content with bilinear models, Neural Computation, vol.12, issue.6, pp.1247-1283, 2000.

D. P. Kingma and M. Welling, Auto-encoding variational Bayes, Proc. ICLR, 2014.

J. Y. Zhu, T. Park, P. Isola, and A. Efros, Unpaired image-to-image translation using cycle-consistent adversarial networks, Proc. ICCV, 2017.

Y. Lee, A. Efros, and M. Hebert, Style-aware mid-level representation for discovering visual connections in space and time, Proc. ICCV, 2013.

S. Ginosar, K. Rakelly, S. Sachs, B. Yin, and A. Efros, A century of portraits: A visual historical record of American high school yearbooks, 2015.

J. B. Michel, Y. Shen, A. Aiden, A. Veres, and M. Gray, Quantitative analysis of culture using millions of digitized books, Science, vol.331, issue.6014, 2011.

D. Roy, The human speechome project. Symbol Grounding and Beyond, pp.192-196, 2006.

D. Marcus, A. Fotenos, J. Csernansky, J. Morris, and R. Buckner, Open access series of imaging studies (OASIS): Longitudinal MRI data in nondemented and demented older adults, Journal of Cognitive Neuroscience, vol.22, issue.12, pp.2677-2684, 2010.

A. Tang, Canadian association of radiologists white paper on artificial intelligence in radiology, Canadian Association of Radiologists Journal, vol.69, issue.2, pp.120-135, 2018.

E. J. Crowley and A. Zisserman, Of gods and goats: Weakly supervised learning of figurative art, Proc. BMVC, 2013.

X. Shen, I. Pastrolin, O. Bounou, S. Gidaris, M. Smith et al., Large-scale historical watermark recognition: dataset and a new consistency-based approach, 2019.

The Visual Arts Data Service (VADS).

C. Doersch, S. Singh, A. Gupta, J. Sivic, and A. A. Efros, What makes paris look like paris?, Proc. ACM SIGGRAPH, 2012.
URL : https://hal.archives-ouvertes.fr/hal-01248528

C. Doersch, A. Gupta, and A. Efros, Mid-level visual element discovery as discriminative mode seeking, Proc. NIPS, 2013.

S. Lee, N. Maisonneuve, D. Crandall, A. Efros, and J. Sivic, Linking past to present: Discovering style in two centuries of architecture, International Conference on Computational Photography, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01152482

N. Dalal and B. Triggs, Histogram of Oriented Gradients for Human Detection, Proc. CVPR, vol.2, pp.886-893, 2005.

M. Kirby and L. Sirovich, Application of the Karhunen-Loève procedure for the characterization of human faces, IEEE PAMI, vol.12, issue.1, pp.103-108, 1990.

M. A. Turk and A. P. Pentland, Face recognition using eigenfaces, Proc. CVPR, pp.586-591, 1991.

M. A. Vasilescu and D. Terzopoulos, Multilinear analysis of image ensembles: TensorFaces, Proc. ECCV, pp.447-460, 2002.

Y. Tang, R. Salakhutdinov, and G. E. Hinton, Tensor analyzers, Proc. ICML, pp.163-171, 2013.

E. Denton, S. Chintala, and R. Fergus, Deep generative image models using a Laplacian pyramid of adversarial networks, Proc. NIPS, 2015.

A. Dosovitskiy and T. Brox, Generating images with perceptual similarity metrics based on deep networks, Proc. NIPS, 2016.

B. Cheung, J. Livezey, A. Bansal, and B. Olshausen, Discovering hidden factors of variation in deep networks, 2014.

A. Radford, L. Metz, and S. Chintala, Unsupervised representation learning with deep convolutional generative adversarial networks, Proc. ICLR, 2016.

T. Quack, B. Leibe, and L. Van Gool, World-scale mining of objects and events from community photo collections, Proc. CIVR, 2008.

R. Martin-Brualla, D. Gallup, and S. Seitz, Time-lapse mining from internet photos, ACM Trans. Graph., vol.34, issue.4, 2015.

K. Matzen and N. Snavely, Scene chronology, Proc. ECCV, 2014.

J. Sivic and A. Zisserman, Video data mining using configurations of viewpoint invariant regions, Proc. CVPR, 2004.

K. Grauman and T. Darrell, Unsupervised learning of categories from sets of partially matching image features, Proc. CVPR, 2006.

H. Lee, R. Grosse, R. Ranganath, and A. Y. Ng, Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations, Proc. ICML, pp.609-616, 2009.

Y. J. Lee and K. Grauman, Learning the easy things first: Self-paced visual category discovery, Proc. CVPR, 2011.

S. Singh, A. Gupta, and A. A. Efros, Unsupervised discovery of mid-level discriminative patches, Proc. ECCV, 2012.

J. Sivic, B. C. Russell, A. A. Efros, A. Zisserman, and W. T. Freeman, Discovering objects and their location in images, Proc. ICCV, 2005.

L. Karlinsky, M. Dinerstein, and S. Ullman, Unsupervised feature optimization (ufo): Simultaneous selection of multiple features with their detection parameters, Proc. CVPR, 2009.

B. C. Russell, A. A. Efros, J. Sivic, W. T. Freeman, and A. Zisserman, Using multiple segmentations to discover objects and their extent in image collections, Proc. CVPR, 2006.

S. Todorovic and N. Ahuja, Extracting subimages of an unknown category from a set of images, Proc. CVPR, 2006.

K. Matzen and N. Snavely, Bubblenet: Foveated imaging for visual discovery, Proc. ICCV, 2015.

M. Juneja, A. Vedaldi, C. V. Jawahar, and A. Zisserman, Blocks that shout: Distinctive parts for scene classification, Proc. CVPR, 2013.

S. Vittayakorn, A. C. Berg, and T. L. Berg, When was that made?, Proc. WACV, pp.715-724, 2017.

F. Palermo, J. Hays, and A. A. Efros, Dating historical color images, Proc. ECCV, Lecture Notes in Computer Science, vol.7577, pp.499-512, 2012.

T. Y. Lin, A. Roychowdhury, and S. Maji, Bilinear CNN models for fine-grained visual recognition, Proc. ICCV, pp.1449-1457, 2015.

A. Fukui, D. H. Park, D. Yang, A. Rohrbach, T. Darrell et al., Multimodal compact bilinear pooling for visual question answering and visual grounding, Proc. EMNLP, 2016.

G. E. Hinton, S. Osindero, and Y. W. Teh, A fast learning algorithm for deep belief nets, Neural Computation, vol.18, issue.7, pp.1527-1554, 2006.

N. Le Roux, N. Heess, J. Shotton, and J. Winn, Learning a generative model of images by factoring appearance and shape, Neural Computation, vol.23, issue.3, pp.593-650, 2011.

S. Osindero and G. E. Hinton, Modeling image patches with a directed hierarchy of Markov random fields, Proc. NIPS, 2008.

R. Salakhutdinov and G. E. Hinton, Deep Boltzmann machines, Proc. AISTATS, 2009.

P. Vincent, H. Larochelle, Y. Bengio, and P. A. Manzagol, Extracting and composing robust features with denoising autoencoders, Proc. ICML, 2008.

T. Kulkarni, P. Kohli, J. Tenenbaum, and V. Mansinghka, Picture: A probabilistic programming language for scene perception, Proc. CVPR, 2015.

T. Kulkarni, W. Whitney, P. Kohli, and J. Tenenbaum, Deep convolutional inverse graphics network, Proc. NIPS, 2015.

M. Tatarchenko, A. Dosovitskiy, and T. Brox, Multi-view 3d models from single images with a convolutional network, Proc. ECCV, pp.322-337, 2016.

S. Reed, K. Sohn, Y. Zhang, and H. Lee, Learning to disentangle factors of variation with manifold interaction, Proc. ICML, 2014.

X. Yan, J. Yang, K. Sohn, and H. Lee, Attribute2image: Conditional image generation from visual attributes, Proc. ECCV, 2016.

J. Johnson, A. Alahi, and L. Fei-fei, Perceptual losses for real-time style transfer and super-resolution, Proc. ECCV, 2016.

X. Wang and A. Gupta, Generative image modeling using style and structure adversarial networks, Proc. ECCV, 2016.

J. Y. Zhu, P. Krahenbuhl, E. Shechtman, and A. A. Efros, Generative visual manipulation on the natural image manifold, Proc. ECCV, 2016.

I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley et al., Generative adversarial nets, Proc. NIPS, 2014.

Y. Zhou and T. L. Berg, Learning temporal transformations from timelapse videos, Proc. ECCV, pp.262-277, 2016.

P. Isola, J. Y. Zhu, T. Zhou, and A. A. Efros, Image-to-image translation with conditional adversarial networks, Proc. CVPR, 2017.

Y. Choi, M. Choi, M. Kim, J. W. Ha, S. Kim et al., StarGAN: Unified generative adversarial networks for multi-domain image-to-image translation, Proc. CVPR, pp.8789-8797, 2018.

P. Baldi and K. Hornik, Neural networks and principal component analysis: Learning from examples without local minima, Neural Networks, vol.2, issue.1, pp.53-58, 1989.

T. D. Sanger, Optimal unsupervised learning in a single-layer linear feedforward neural network, Neural Networks, vol.2, issue.6, pp.459-473, 1989.

A. Dosovitskiy, J. Springenberg, M. Tatarchenko, and T. Brox, Learning to generate chairs, tables and cars with convolutional networks, IEEE PAMI, 2016.

A. Krizhevsky, I. Sutskever, and G. E. Hinton, ImageNet classification with deep convolutional neural networks, Proc. NIPS, pp.1106-1114, 2012.

O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh et al., ImageNet large scale visual recognition challenge, International Journal of Computer Vision, vol.115, issue.3, pp.211-252, 2015.

X. Mao, Q. Li, H. Xie, R. Y. Lau, Z. Wang et al., Least squares generative adversarial networks, 2016.

K. He, X. Zhang, S. Ren, and J. Sun, Deep residual learning for image recognition, Proc. CVPR, pp.770-778, 2016.

R. Collobert, K. Kavukcuoglu, and C. Farabet, Torch7: A Matlab-like environment for machine learning, BigLearn, NIPS Workshop, 2011.

S. Ren, K. He, R. Girshick, and J. Sun, Faster R-CNN: Towards real-time object detection with region proposal networks, Proc. NIPS, pp.91-99, 2015.

T. Salimans, I. Goodfellow, W. Zaremba, V. Cheung, A. Radford et al., Improved techniques for training GANs, Proc. NIPS, 2016.

R. Zhang, P. Isola, and A. A. Efros, Colorful image colorization, Proc. ECCV, 2016.