Optimal multistage algorithm for adjoint computation, SIAM Journal on Scientific Computing, vol.38, issue.3, pp.232-255, 2016.
URL: https://hal.archives-ouvertes.fr/hal-01147155
Reversible architectures for arbitrarily deep residual neural networks, Thirty-Second AAAI Conference on Artificial Intelligence, 2018.
Training deep nets with sublinear memory cost, 2016.
Distributed deep learning using synchronous stochastic gradient descent, 2016.
Large scale distributed deep networks, Advances in Neural Information Processing Systems, pp.1223-1231, 2012.
Improving strong-scaling of CNN training by exploiting finer-grained parallelism, IEEE International Parallel and Distributed Processing Symposium, 2019.
The reversible residual network: Backpropagation without storing activations, Advances in Neural Information Processing Systems, pp.2214-2224, 2017.
Algorithm 799: Revolve: an implementation of checkpointing for the reverse or adjoint mode of computational differentiation, ACM Transactions on Mathematical Software (TOMS), vol.26, issue.1, pp.19-45, 2000.
Evaluating derivatives: principles and techniques of algorithmic differentiation, vol.105, 2008.
On automatic differentiation, Mathematical Programming: Recent Developments and Applications, vol.6, issue.6, pp.83-107, 1989.
Memory-efficient backpropagation through time, Advances in Neural Information Processing Systems, pp.4125-4133, 2016.