Supplementary material. Also available in the arXiv technical report

B. Ans, S. Rousset, R. M. French, and S. Musca, Self-refreshing memory in artificial neural networks: learning temporal sequences without catastrophic forgetting, Connection Science, vol.16, issue.2, pp.71-99, 2004.
DOI : 10.1080/09540090412331271199
URL : https://hal.archives-ouvertes.fr/hal-00170922

Y. Bengio, A. Courville, and P. Vincent, Representation Learning: A Review and New Perspectives, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.35, issue.8, pp.1798-1828, 2013.
DOI : 10.1109/TPAMI.2013.50

G. Cauwenberghs and T. Poggio, Incremental and decremental support vector machine learning, In: NIPS, 2000.

X. Chen, A. Shrivastava, and A. Gupta, NEIL: Extracting Visual Knowledge from Web Data, 2013 IEEE International Conference on Computer Vision (ICCV), 2013.
DOI : 10.1109/ICCV.2013.178

C. Cortes and V. Vapnik, Support-vector networks, Machine Learning, vol.20, issue.3, pp.273-297, 1995.
DOI : 10.1007/BF00994018

S. Divvala, A. Farhadi, and C. Guestrin, Learning Everything about Anything: Webly-Supervised Visual Concept Learning, 2014 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014.
DOI : 10.1109/CVPR.2014.412

R. M. French, Dynamically constraining connectionist networks to produce distributed, orthogonal representations to reduce catastrophic interference, 1994.

T. Furlanello, J. Zhao, A. M. Saxe, L. Itti, and B. S. Tjan, Active long term memory networks. ArXiv e-prints, 2016.

I. Goodfellow, M. Mirza, D. Xiao, A. Courville, and Y. Bengio, An empirical investigation of catastrophic forgetting in gradient-based neural networks. ArXiv e-prints, arXiv:1312.6211, 2013.

K. He, X. Zhang, S. Ren, and J. Sun, Deep Residual Learning for Image Recognition, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
DOI : 10.1109/CVPR.2016.90

G. Hinton, O. Vinyals, and J. Dean, Distilling the knowledge in a neural network, In: NIPS workshop, 2014.

H. Jung, J. Ju, M. Jung, and J. Kim, Less-forgetting learning in deep neural networks. ArXiv e-prints, arXiv:1607.00122, 2016.

J. Kirkpatrick, R. Pascanu, N. Rabinowitz, J. Veness, G. Desjardins et al., Overcoming catastrophic forgetting in neural networks, Proc. National Academy of Sciences, vol.114, issue.13, pp.3521-3526, 2017.
DOI : 10.1073/pnas.1611835114

A. Krizhevsky, Learning multiple layers of features from tiny images, Tech. rep, 2009.

Z. Li and D. Hoiem, Learning without forgetting, PAMI, 2018.

D. Lopez-Paz and M. A. Ranzato, Gradient episodic memory for continual learning, In: NIPS, 2017.

M. McCloskey and N. J. Cohen, Catastrophic Interference in Connectionist Networks: The Sequential Learning Problem, Psychology of Learning and Motivation, vol.24, pp.109-165, 1989.
DOI : 10.1016/S0079-7421(08)60536-8

T. Mensink, J. Verbeek, F. Perronnin, and G. Csurka, Distance-Based Image Classification: Generalizing to New Classes at Near-Zero Cost, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.35, issue.11, pp.2624-2637, 2013.
DOI : 10.1109/TPAMI.2013.83
URL : https://hal.archives-ouvertes.fr/hal-00817211

T. Mitchell, W. Cohen, E. Hruschka, P. Talukdar, J. Betteridge et al., Never-ending learning. In: AAAI, 2015.

A. Neelakantan, L. Vilnis, Q. V. Le, I. Sutskever, L. Kaiser et al., Adding gradient noise improves learning for very deep networks. ArXiv e-prints, 2017.

R. Ratcliff, Connectionist models of recognition memory: Constraints imposed by learning and forgetting functions, Psychological Review, vol.97, issue.2, pp.285-308, 1990.
DOI : 10.1037/0033-295X.97.2.285

S. A. Rebuffi, A. Kolesnikov, G. Sperl, and C. H. Lampert, iCaRL: Incremental Classifier and Representation Learning, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
DOI : 10.1109/CVPR.2017.587

S. Ren, K. He, R. Girshick, and J. Sun, Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.39, issue.6, pp.1137-1149, 2017.
DOI : 10.1109/TPAMI.2016.2577031

M. Ristin, M. Guillaumin, J. Gall, and L. Van Gool, Incremental Learning of NCM Forests for Large-Scale Image Classification, 2014 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014.
DOI : 10.1109/CVPR.2014.467

S. Rüping, Incremental learning with support vector machines, Proceedings 2001 IEEE International Conference on Data Mining, 2001.
DOI : 10.1109/ICDM.2001.989589

O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh et al., ImageNet Large Scale Visual Recognition Challenge, International Journal of Computer Vision, vol.115, issue.3, pp.211-252, 2015.
DOI : 10.1007/s11263-015-0816-y

A. A. Rusu, N. C. Rabinowitz, G. Desjardins, H. Soyer, J. Kirkpatrick et al., Progressive neural networks. ArXiv e-prints, 2016.

P. Ruvolo and E. Eaton, ELLA: An efficient lifelong learning algorithm, In: ICML, 2013.

K. Shmelkov, C. Schmid, and K. Alahari, Incremental Learning of Object Detectors without Catastrophic Forgetting, 2017 IEEE International Conference on Computer Vision (ICCV), 2017.
DOI : 10.1109/ICCV.2017.368
URL : https://hal.archives-ouvertes.fr/hal-01573623

K. Simonyan and A. Zisserman, Two-stream convolutional networks for action recognition in videos, In: NIPS, 2014.

A. V. Terekhov, G. Montone, and J. K. O'Regan, Knowledge Transfer in Deep Block-Modular Neural Networks, In: Biomimetic and Biohybrid Systems, 2015.
DOI : 10.1007/978-3-319-22979-9_27

S. Thrun, Lifelong Learning Algorithms, In: Learning to Learn, pp.181-209, 1998.
DOI : 10.1007/978-1-4615-5529-2_8

A. R. Triki, R. Aljundi, M. B. Blaschko, and T. Tuytelaars, Encoder based lifelong learning, In: ICCV, 2017.

A. Vedaldi and K. Lenc, Convolutional Neural Networks for MATLAB. In: ACM Multimedia, 2015.

M. Welling, Herding dynamical weights to learn, Proceedings of the 26th Annual International Conference on Machine Learning, ICML '09, 2009.
DOI : 10.1145/1553374.1553517

T. Xiao, J. Zhang, K. Yang, Y. Peng, and Z. Zhang, Error-Driven Incremental Learning in Deep Convolutional Neural Network for Large-Scale Image Classification, Proceedings of the ACM International Conference on Multimedia, MM '14, 2014.