And the Bit Goes Down: Revisiting the Quantization of Neural Networks

Pierre Stock 1,2,3, Armand Joulin 1, Rémi Gribonval 2,3, Benjamin Graham 1, Hervé Jégou 1

2 DANTE - Dynamic Networks: Temporal and Structural Capture Approach, Inria Grenoble - Rhône-Alpes, LIP - Laboratoire de l'Informatique du Parallélisme, IXXI - Institut Rhône-Alpin des Systèmes Complexes
3 PANAMA - Parcimonie et Nouveaux Algorithmes pour le Signal et la Modélisation Audio, Inria Rennes - Bretagne Atlantique, IRISA-D5 - Signaux et Images Numériques, Robotique
Abstract: In this paper, we address the problem of reducing the memory footprint of convolutional network architectures. We introduce a vector quantization method that aims at preserving the quality of the reconstruction of the network outputs rather than of its weights. The principle of our approach is that it minimizes the loss reconstruction error for in-domain inputs. Our method only requires a set of unlabelled data at quantization time and allows for efficient inference on CPU by using byte-aligned codebooks to store the compressed weights. We validate our approach by quantizing a high-performing ResNet-50 model to a memory size of 5 MB (a 20× compression factor) while preserving a top-1 accuracy of 76.1% on ImageNet object classification, and by compressing a Mask R-CNN with a 26× factor.
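The core idea of the abstract — quantizing weights so that the layer's *outputs*, not the weights themselves, are well reconstructed for in-domain inputs — can be sketched as an activation-weighted k-means over the columns of a linear layer's weight matrix. This is a simplified illustration under assumptions, not the paper's full method (which quantizes subvectors with a product-quantization-style codebook and fine-tunes codewords by distillation); the function name and parameters below are hypothetical.

```python
import numpy as np

def activation_weighted_kmeans(W, X, k, n_iter=10, seed=0):
    """Quantize the columns of W (d x n) with k codewords, minimizing the
    output reconstruction error ||X W - X W_hat||^2 over a batch of
    in-domain activations X (b x d), instead of the plain weight error
    ||W - W_hat||^2. Simplified sketch of the idea in the abstract."""
    rng = np.random.default_rng(seed)
    d, n = W.shape
    # Initialize codewords with randomly chosen columns of W.
    C = W[:, rng.choice(n, size=k, replace=False)].copy()  # d x k
    assign = np.zeros(n, dtype=int)
    for _ in range(n_iter):
        # Assignment step: each column goes to the codeword minimizing
        # the *output-space* distance ||X w_j - X c_i||^2.
        XW = X @ W  # b x n
        XC = X @ C  # b x k
        dists = (np.sum(XW**2, axis=0)[:, None]
                 - 2.0 * XW.T @ XC
                 + np.sum(XC**2, axis=0)[None, :])  # n x k
        assign = np.argmin(dists, axis=1)
        # Update step: sum_j ||X (w_j - c)||^2 is quadratic in c with
        # Hessian proportional to X^T X, so the plain column mean of the
        # assigned columns is always a minimizer.
        for i in range(k):
            members = W[:, assign == i]
            if members.size:
                C[:, i] = members.mean(axis=1)
    W_hat = C[:, assign]  # reconstructed weights, storable as k codewords + indices
    return C, assign, W_hat
```

The only change from standard k-means is the metric: distances between a column and a codeword are measured after projection through the in-domain activations `X`, which is what ties the compression objective to the network's outputs rather than to its parameters.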
Document type :
Conference papers
Complete list of metadata

Cited literature [48 references]

https://hal.archives-ouvertes.fr/hal-02434572
Contributor: Pierre Stock
Submitted on : Friday, January 10, 2020 - 10:55:47 AM
Last modification on : Monday, February 10, 2020 - 12:17:19 PM

File

1907.05686 (1).pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-02434572, version 1

Citation

Pierre Stock, Armand Joulin, Rémi Gribonval, Benjamin Graham, Hervé Jégou. And the Bit Goes Down: Revisiting the Quantization of Neural Networks. ICLR 2020 - Eighth International Conference on Learning Representations, Apr 2020, Addis Ababa, Ethiopia. pp. 1-11. ⟨hal-02434572⟩
