A Simple Recipe for Language-guided Domain Generalized Segmentation

Mohammad Fahes; Tuan-Hung Vu; Andrei Bursuc; Patrick Pérez; Raoul de Charette

Preprint, Working Paper. Year: 2023

Mohammad Fahes

Role: Author

Systèmes de transport automatisés et sécurisés

Tuan-Hung Vu

Role: Author

Systèmes de transport automatisés et sécurisés

Valeo.ai

Andrei Bursuc

Role: Author
PersonId: 3798
IdHAL: andrei-bursuc
IdRef: 172354633

Valeo.ai

Systèmes de transport automatisés et sécurisés

Patrick Pérez

Role: Author

Systèmes de transport automatisés et sécurisés

Valeo.ai

Raoul de Charette

Role: Author
PersonId: 15614
IdHAL: rdecharette
IdRef: 168198606

Systèmes de transport automatisés et sécurisés

Abstract

Generalization to new domains not seen during training is one of the long-standing goals and challenges in deploying neural networks in real-world applications. Existing generalization techniques necessitate substantial data augmentation, potentially sourced from external datasets, and aim at learning invariant representations by imposing various alignment constraints. Large-scale pretraining has recently shown promising generalization capabilities, along with the potential of bridging different modalities. For instance, the recent advent of vision-language models like CLIP has opened the doorway for vision models to exploit the textual modality. In this paper, we introduce a simple framework for generalizing semantic segmentation networks by employing language as the source of randomization. Our recipe comprises three key ingredients: i) the preservation of the intrinsic CLIP robustness through minimal fine-tuning, ii) language-driven local style augmentation, and iii) randomization by locally mixing the source and augmented styles during training. Extensive experiments report state-of-the-art results on various generalization benchmarks. The code will be made available.
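
As a rough illustration of the third ingredient (locally mixing source and augmented styles), here is a minimal NumPy sketch. It rests on assumptions the abstract does not state: "style" is taken to mean the per-channel mean and standard deviation of a feature patch, as in AdaIN-like style transfer, and the mixing weight is drawn uniformly per patch. The augmented statistics are random placeholders standing in for language-driven styles. The names `patch_stats` and `mix_local_styles` are hypothetical and not taken from the authors' code.

```python
import numpy as np

def patch_stats(feat, grid):
    """Per-patch, per-channel mean and std of a (C, H, W) feature map.

    Assumes H and W are divisible by `grid`.
    """
    c, h, w = feat.shape
    ph, pw = h // grid, w // grid
    # View the map as grid x grid patches: (C, grid, ph, grid, pw).
    patches = feat.reshape(c, grid, ph, grid, pw)
    mu = patches.mean(axis=(2, 4))           # (C, grid, grid)
    sigma = patches.std(axis=(2, 4)) + 1e-6  # small constant avoids division by zero
    return mu, sigma

def mix_local_styles(feat, aug_mu, aug_sigma, grid, rng):
    """Re-style each patch with a random blend of source and augmented statistics."""
    c, h, w = feat.shape
    ph, pw = h // grid, w // grid
    mu, sigma = patch_stats(feat, grid)
    # Assumption: one mixing weight per patch, shared across channels.
    alpha = rng.uniform(0.0, 1.0, size=(1, grid, grid))
    mix_mu = alpha * mu + (1.0 - alpha) * aug_mu
    mix_sigma = alpha * sigma + (1.0 - alpha) * aug_sigma
    patches = feat.reshape(c, grid, ph, grid, pw)
    # Normalize each patch with its own statistics, then apply the mixed ones.
    normed = (patches - mu[:, :, None, :, None]) / sigma[:, :, None, :, None]
    out = normed * mix_sigma[:, :, None, :, None] + mix_mu[:, :, None, :, None]
    return out.reshape(c, h, w)

rng = np.random.default_rng(0)
feat = rng.normal(size=(4, 8, 8))                  # toy source features
aug_mu = rng.normal(size=(4, 2, 2))                # placeholder augmented means
aug_sigma = rng.uniform(0.5, 1.5, size=(4, 2, 2))  # placeholder augmented stds
mixed = mix_local_styles(feat, aug_mu, aug_sigma, grid=2, rng=rng)
print(mixed.shape)  # (4, 8, 8)
```

In this sketch the content of each patch (its normalized activations) is left untouched and only its first- and second-order statistics change, so passing the source statistics as the "augmented" ones returns the input unchanged.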

Domains

Computer Vision and Pattern Recognition [cs.CV]

Contributor: Raoul de Charette

https://hal.science/hal-04366798

Submitted on: Friday, December 29, 2023, 14:38:23

Last modified on: Thursday, January 4, 2024, 16:45:58

Dates and versions

hal-04366798, version 1 (29-12-2023)

Identifiers

HAL Id: hal-04366798, version 1
arXiv: 2311.17922

Cite

Mohammad Fahes, Tuan-Hung Vu, Andrei Bursuc, Patrick Pérez, Raoul de Charette. A Simple Recipe for Language-guided Domain Generalized Segmentation. 2023. ⟨hal-04366798⟩

Collections

INRIA INRIA2 ANR
