Scaling the Scattering Transform: Deep Hybrid Networks

Abstract: We use the scattering network as a generic and fixed initialization of the first layers of a supervised hybrid deep network. We show that early layers do not necessarily need to be learned, obtaining the best results to date with pre-defined representations while remaining competitive with deep CNNs. Using a shallow cascade of 1 × 1 convolutions, which encodes scattering coefficients corresponding to very small spatial windows, we obtain AlexNet accuracy on ImageNet ILSVRC2012. We demonstrate that this local encoding explicitly learns invariance with respect to rotations. Combining scattering networks with a modern ResNet, we achieve a single-crop top-5 error of 11.4% on ImageNet ILSVRC2012, comparable to the ResNet-18 architecture, while using only 10 layers. We also find that hybrid architectures can yield excellent performance in the small-sample regime, exceeding their end-to-end counterparts, through their ability to incorporate geometric priors. We demonstrate this on subsets of the CIFAR-10 dataset and on the STL-10 dataset.
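The hybrid architecture the abstract describes has two parts: a fixed, non-learned scattering front end (wavelet modulus coefficients averaged over small spatial windows) followed by learned 1 × 1 convolutions, i.e. per-pixel linear maps over the channel dimension. A minimal NumPy sketch of this idea, using a simple Gabor filter bank as a stand-in for the paper's Morlet wavelets and circular FFT convolution for simplicity (both are assumptions for illustration, not the authors' implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

def gabor_bank(size, n_angles, sigma=2.0, freq=0.25):
    """Fixed (non-learned) complex Gabor filters at several orientations.
    A stand-in for the Morlet wavelet filter bank of a scattering network."""
    ax = np.arange(size) - size // 2
    y, x = np.meshgrid(ax, ax, indexing="ij")
    filters = []
    for k in range(n_angles):
        theta = np.pi * k / n_angles
        xr = x * np.cos(theta) + y * np.sin(theta)
        env = np.exp(-(x**2 + y**2) / (2 * sigma**2))
        filters.append(env * np.exp(2j * np.pi * freq * xr))
    return filters

def first_order_scattering(img, filters, pool=4):
    """First-order scattering-like coefficients: |img * psi| followed by
    average pooling over small spatial windows (circular conv via FFT)."""
    H, W = img.shape
    F = np.fft.fft2(img)
    out = []
    for psi in filters:
        resp = np.abs(np.fft.ifft2(F * np.fft.fft2(psi, s=img.shape)))
        pooled = resp.reshape(H // pool, pool, W // pool, pool).mean(axis=(1, 3))
        out.append(pooled)
    return np.stack(out)  # shape: (n_angles, H/pool, W/pool)

def conv1x1(coeffs, weight):
    """A 1x1 convolution is a linear map over channels applied at each pixel."""
    return np.einsum("oc,chw->ohw", weight, coeffs)

img = rng.standard_normal((32, 32))
S = first_order_scattering(img, gabor_bank(7, n_angles=4), pool=4)
W = rng.standard_normal((16, S.shape[0]))  # learned by SGD in the real network
features = conv1x1(S, W)
print(S.shape, features.shape)  # (4, 8, 8) (16, 8, 8)
```

Only `W` would be trained; the filter bank stays fixed, which is why the early layers need no learning.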
Document type:
Conference paper
International Conference on Computer Vision (ICCV), Oct 2017, Venice, Italy

https://hal.inria.fr/hal-01495734
Contributor: Eugene Belilovsky <>
Submitted on: Monday, April 3, 2017 - 17:42:12
Last modified on: Thursday, January 11, 2018 - 06:21:19
Document(s) archived on: Tuesday, July 4, 2017 - 14:30:20

Identifiers

  • HAL Id : hal-01495734, version 2
  • ARXIV : 1703.08961

Citation

Edouard Oyallon, Eugene Belilovsky, Sergey Zagoruyko. Scaling the Scattering Transform: Deep Hybrid Networks. International Conference on Computer Vision (ICCV), Oct 2017, Venice, Italy. 〈hal-01495734v2〉
