Going deeper with convolutions, The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015. ,
Deep residual learning for image recognition, The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016. ,
Densely connected convolutional networks, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017. ,
Yangqing Jia, and Kaiming He. Accurate, large minibatch sgd: Training imagenet in 1 hour, 2017. ,
Imagenet training in 24 minutes, 2017. ,
High bandwidth, low latency, burst-mode optical interconnect for high performance computing systems, Conference on Lasers and Electro-Optics, vol.1, p.4, 2004. ,
Microbenchmark performance comparison of high-speed cluster interconnects, IEEE Micro, vol.24, issue.1, pp.42-51, 2004. ,
Bohb: Robust and efficient hyperparameter optimization at scale, 2018. ,
Toward optimal run racing: Application to deep learning calibration, 2017. ,
URL : https://hal.archives-ouvertes.fr/hal-01634381
Toward optimal run racing: Application to deep learning calibration, 2017. ,
URL : https://hal.archives-ouvertes.fr/hal-01634381
Parallelized stochastic gradient descent, Advances in neural information processing systems, pp.2595-2603, 2010. ,
Gpu asynchronous stochastic gradient descent to speed up neural network training, 2013. ,
Terngrad: Ternary gradients to reduce communication in distributed deep learning, Advances in Neural Information Processing Systems, vol.30, pp.1509-1519, 2017. ,
Qsgd: Communication-efficient sgd via gradient quantization and encoding ,
, Advances in Neural Information Processing Systems, vol.30, pp.1709-1720, 2017.
Understanding the difficulty of training deep feedforward neural networks, Proceedings of the thirteenth international conference on artificial intelligence and statistics, pp.249-256, 2010. ,
Training on the edge: The why and the how, 1st Workshop on Parallel AI and Systems for the Edge, 2019. ,
URL : https://hal.archives-ouvertes.fr/hal-02069728
Motivation for and evaluation of the first tensor processing unit, IEEE Micro, vol.38, issue.3, pp.10-19, 2018. ,
Large scale distributed deep networks, Advances in neural information processing systems, pp.1223-1231, 2012. ,
Distributed deep learning using synchronous stochastic gradient descent, 2016. ,
Complete register allocation problems, SIAM journal on Computing, vol.4, issue.3, pp.226-248, 1975. ,
An application of generalized tree pebbling to sparse matrix factorization, SIAM Journal on Algebraic Discrete Methods, vol.8, issue.3, pp.375-395, 1987. ,
Scheduling seriesparallel task graphs to minimize peak memory, Theoretical Computer Science, vol.707, pp.1-23, 2018. ,
URL : https://hal.archives-ouvertes.fr/hal-01397299
Mathematical Programming: recent developments and applications, vol.6, pp.83-107, 1989. ,
Optimal multistage algorithm for adjoint computation, SIAM Journal on Scientific Computing, vol.38, issue.3, pp.232-255, 2016. ,
URL : https://hal.archives-ouvertes.fr/hal-01354902
Algorithm 799: Revolve: an implementation of checkpointing for the reverse or adjoint mode of computational differentiation, ACM Transactions on Mathematical Software (TOMS), vol.26, issue.1, pp.19-45, 2000. ,
Memoryefficient backpropagation through time, Advances in Neural Information Processing Systems, pp.4125-4133, 2016. ,
Training deep nets with sublinear memory cost, 2016. ,
Automatic differentiation in pytorch, 2017. ,
, Periodic checkpointing in pytorch, 2018.
Mitgcm user manual, 2008. ,
Engineering Design Optimization using Calculus Level Methods, 2016. ,
Backpropagation for long sequences: beyond memory constraints with constant overheads, 2018. ,
Providing the archer community with adjoint modelling tools for high-performance oceanographic and cryospheric computation, 2016. ,
Devito: an embedded domain-specific language for finite differences and geophysical exploration, 2018. ,
Achieving logarithmic growth of temporal and spatial complexity in reverse automatic differentiation, Optimization Methods and software, vol.1, issue.1, pp.35-54, 1992. ,
Optimal time and minimum spacetime product for reversing a certain class of programs, Computational Differentiation: Techniques, Applications, and Tools, pp.95-106, 1996. ,
URL : https://hal.archives-ouvertes.fr/inria-00073896
Multistage approaches for optimal offline checkpointing, SIAM Journal on Scientific Computing, vol.31, issue.3, pp.1946-1967, 2009. ,
Periodicity in optimal hierarchical checkpointing schemes for adjoint computations, Optimization Methods and Software, vol.32, issue.3, pp.594-624, 2017. ,
URL : https://hal.archives-ouvertes.fr/hal-01654632
Asynchronous two-level checkpointing scheme for large-scale adjoints in the spectral-element solver nek5000, Procedia Computer Science, vol.80, pp.1147-1158, 2016. ,
H-Revolve: A Framework for Adjoint Computation on Synchrone Hierarchical Platforms, 2019. ,
URL : https://hal.archives-ouvertes.fr/hal-02080706
Very deep convolutional networks for large-scale image recognition. CoRR, abs/1409.1556, 2014. ,
Long short-term memory, Neural Comput, vol.9, issue.8, pp.1735-1780, 1997. ,
Finding structure in time, Cognitive science, vol.14, issue.2, pp.179-211, 1990. ,
Recipe1m: A dataset for learning cross-modal embeddings for cooking recipes and food images, 2018. ,
Cross-modal music retrieval and applications: An overview of key methodologies, IEEE Signal Processing Magazine, vol.36, issue.1, pp.52-62, 2019. ,
Signature verification using a" siamese" time delay neural network, Advances in neural information processing systems, pp.737-744, 1994. ,
Siamese convolutional neural networks for authorship verification, Proceedings, 2017. ,
Descriptor learning for omnidirectional image matching, Registration and Recognition in Images and Videos, pp.49-62, 2014. ,
Deep metric learning using triplet network, International Workshop on Similarity-Based Pattern Recognition, pp.84-92, 2015. ,