FiLM: Visual Reasoning with a General Conditioning Layer

Ethan Perez; Florian Strub; Harm de Vries; Vincent Dumoulin; Aaron Courville

Conference paper, Year: 2018


Ethan Perez

Role: Author
PersonId: 1023763

Rice University [Houston]

Université de Montréal

Florian Strub

Role: Author
PersonId: 18649
IdHAL: florian-strub
ORCID: 0000-0001-7271-5345

Sequential Learning

Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189

Harm de Vries

Role: Author

Université de Montréal

Vincent Dumoulin

Role: Author

Université de Montréal

Aaron Courville

Role: Author
PersonId: 1011047

Université de Montréal

Abstract

We introduce a general-purpose conditioning method for neural networks called FiLM: Feature-wise Linear Modulation. FiLM layers influence neural network computation via a simple, feature-wise affine transformation based on conditioning information. We show that FiLM layers are highly effective for visual reasoning (answering image-related questions that require a multi-step, high-level process), a task that has proven difficult for standard deep learning methods that do not explicitly model reasoning. Specifically, we show on visual reasoning tasks that FiLM layers 1) halve state-of-the-art error for the CLEVR benchmark, 2) modulate features in a coherent manner, 3) are robust to ablations and architectural modifications, and 4) generalize well to challenging, new data from few examples or even zero-shot.
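The feature-wise affine transformation mentioned in the abstract can be illustrated with a short sketch. This is not the authors' implementation: it assumes each feature map (channel) c is scaled by a coefficient gamma[c] and shifted by beta[c], and that both coefficients are produced from a conditioning vector by a plain linear map. The names `linear`, `film_parameters`, and `film`, as well as the single-linear-map generator, are hypothetical choices made for illustration, and pure Python lists stand in for tensors.

```python
from typing import List, Sequence, Tuple

FeatureMap = List[List[float]]  # one channel, stored as H rows of W values


def linear(x: Sequence[float],
           weights: Sequence[Sequence[float]],
           bias: Sequence[float]) -> List[float]:
    """Affine map: out[i] = sum_j weights[i][j] * x[j] + bias[i]."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def film_parameters(cond: Sequence[float],
                    w_gamma: Sequence[Sequence[float]],
                    b_gamma: Sequence[float],
                    w_beta: Sequence[Sequence[float]],
                    b_beta: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Hypothetical generator: map a conditioning vector to (gamma, beta).

    Assumption: one linear map per coefficient set; a real model could use
    any function of the conditioning input here.
    """
    return linear(cond, w_gamma, b_gamma), linear(cond, w_beta, b_beta)


def film(features: Sequence[FeatureMap],
         gamma: Sequence[float],
         beta: Sequence[float]) -> List[FeatureMap]:
    """Apply gamma[c] * value + beta[c] to every value in channel c."""
    if not (len(features) == len(gamma) == len(beta)):
        raise ValueError("need one (gamma, beta) pair per feature map")
    return [[[g * v + b for v in row] for row in fmap]
            for fmap, g, b in zip(features, gamma, beta)]


# Two 2x2 feature maps, modulated by hand-picked coefficients.
features = [[[1.0, 2.0], [3.0, 4.0]],
            [[0.0, -1.0], [2.0, 5.0]]]
modulated = film(features, gamma=[2.0, 0.0], beta=[1.0, 3.0])
```

In this toy call the first channel is scaled and shifted, while the second is multiplied by zero and replaced by its constant shift, which shows how conditioning can amplify, suppress, or offset individual feature maps without touching the others.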

Domains

Neural networks [cs.NE]; Artificial intelligence [cs.AI]

Contributor: Florian Strub

https://inria.hal.science/hal-01648685

Submitted on: Tuesday, 28 November 2017, 03:13:55

Last modified on: Wednesday, 24 January 2024, 09:54:23

Dates and versions

hal-01648685, version 1 (28-11-2017)

Identifiers

HAL Id: hal-01648685, version 1
arXiv: 1707.03017

Cite

Ethan Perez, Florian Strub, Harm de Vries, Vincent Dumoulin, Aaron Courville. FiLM: Visual Reasoning with a General Conditioning Layer. AAAI Conference on Artificial Intelligence, Feb 2018, New Orleans, United States. ⟨hal-01648685⟩


Collections

CNRS INRIA CRISTAL INRIA2 CRISTAL-SEQUEL UNIV-LILLE

