Conference paper, Year: 2022

LaRa: Latents and Rays for Multi-Camera Bird's-Eye-View Semantic Segmentation

Abstract

Recent works in autonomous driving have widely adopted the bird's-eye-view (BEV) semantic map as an intermediate representation of the world. Online prediction of these BEV maps involves non-trivial operations such as multi-camera data extraction as well as fusion and projection into a common top-view grid. This is usually done with error-prone geometric operations (e.g., homography or back-projection from monocular depth estimation) or expensive direct dense mapping between image pixels and pixels in BEV (e.g., with MLP or attention). In this work, we present 'LaRa', an efficient encoder-decoder, transformer-based model for vehicle semantic segmentation from multiple cameras. Our approach uses a system of cross-attention to aggregate information over multiple sensors into a compact, yet rich, collection of latent representations. These latent representations, after being processed by a series of self-attention blocks, are then reprojected with a second cross-attention in the BEV space. We demonstrate that our model outperforms the best previous works using transformers on nuScenes. The code and trained models are available at https://github.com/valeoai/LaRa.
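
The three-stage data flow described in the abstract (cross-attention from camera features into latents, self-attention among the latents, then a second cross-attention from BEV queries) can be illustrated with a minimal sketch. This is not the authors' implementation, which is available at the repository linked above. It is a toy version in plain Python that omits learned projections, multiple heads, residual connections, MLPs and any ray or positional embeddings. All names (`attention`, `lara_like_forward`, `random_vectors`) and all sizes (6 cameras, 20 tokens per camera, 16 latents, a 10x10 BEV grid, 8-dimensional features) are assumptions chosen for illustration.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Single-head scaled dot-product attention, no learned weights (toy)."""
    scale = 1.0 / math.sqrt(len(keys[0]))
    out = []
    for q in queries:
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[d] for w, v in zip(weights, values))
                    for d in range(len(values[0]))])
    return out


def random_vectors(n, dim, rng):
    """Hypothetical stand-in for image features or learned query vectors."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n)]


def lara_like_forward(camera_tokens, latents, bev_queries, num_self_blocks=2):
    # 1) Cross-attention: the latents (queries) gather information from the
    #    tokens of all cameras at once (keys and values).
    latents = attention(latents, camera_tokens, camera_tokens)
    # 2) A series of self-attention blocks operating only on the latents.
    for _ in range(num_self_blocks):
        latents = attention(latents, latents, latents)
    # 3) Second cross-attention: each BEV cell (query) reads from the latents.
    return attention(bev_queries, latents, latents)


rng = random.Random(0)
dim = 8                                   # assumed feature size
num_cameras, tokens_per_camera = 6, 20    # assumed camera setup
camera_tokens = random_vectors(num_cameras * tokens_per_camera, dim, rng)
latents = random_vectors(16, dim, rng)    # assumed number of latents
bev_queries = random_vectors(10 * 10, dim, rng)  # assumed 10x10 BEV grid

bev_features = lara_like_forward(camera_tokens, latents, bev_queries)
print(len(bev_features), len(bev_features[0]))
```

In this sketch the BEV queries never attend to the camera tokens directly. With the assumed sizes, the two cross-attentions compute 16 x 120 and 100 x 16 score pairs, compared with 100 x 120 for a direct dense BEV-to-image attention. That gap widens as image and BEV resolutions grow while the number of latents stays fixed.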
Main file
LaRa_clean_arxiv.pdf (8.92 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-03875582, version 1 (28-11-2022)

Cite

Florent Bartoccioni, Éloi Zablocki, Andrei Bursuc, Patrick Pérez, Matthieu Cord, et al. LaRa: Latents and Rays for Multi-Camera Bird's-Eye-View Semantic Segmentation. CoRL 2022 - Conference on Robot Learning, Dec 2022, Auckland, New Zealand. ⟨hal-03875582⟩
