Conference papers

TransFuseGrid: Transformer-based Lidar-RGB fusion for semantic grid prediction

Abstract : Semantic grids are a succinct and convenient way to represent the environment for mobile robotics and autonomous driving applications. Although Lidar sensors are now widely used in robotics, most semantic grid prediction approaches in the literature focus only on RGB data. In this paper, we present an approach for semantic grid prediction that uses a transformer architecture to fuse Lidar sensor data with RGB images from multiple cameras. Our proposed method, TransFuseGrid, first transforms both input streams into top-view embeddings, and then fuses these embeddings at multiple scales with Transformers. Finally, a decoder transforms the fused, top-view feature map into a semantic grid of the vehicle's environment. We evaluate the performance of our approach on the nuScenes dataset for the vehicle, drivable area, lane divider and walkway segmentation tasks. The results show that TransFuseGrid outperforms competing RGB-only and Lidar-only methods. Additionally, the Transformer feature fusion leads to a significant improvement over naive RGB-Lidar concatenation. In particular, for the segmentation of vehicles, our model outperforms state-of-the-art RGB-only and Lidar-only methods by 24% and 53%, respectively.
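
The fusion step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`self_attention`, `fuse_top_view`), the identity query/key/value projections, the residual addition, and the single fusion scale are all assumptions made for illustration, and the abstract does not specify the actual layer design. The sketch flattens a Lidar and an RGB top-view feature grid into tokens, lets every token attend to all tokens of both modalities, and adds the attended features back to each grid.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens):
    """Single-head self-attention with identity Q/K/V projections.

    Assumption: a real model would use learned projections and several heads.
    """
    dim = len(tokens[0])
    scale = 1.0 / math.sqrt(dim)
    out = []
    for q in tokens:
        # Scaled dot-product score of this query against every token.
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in tokens]
        weights = softmax(scores)
        # Weighted sum of the value vectors (here, the tokens themselves).
        out.append([sum(w * v[c] for w, v in zip(weights, tokens))
                    for c in range(dim)])
    return out

def fuse_top_view(lidar_grid, rgb_grid):
    """Fuse two top-view feature grids of shape [H][W][C] at one scale.

    Hypothetical helper: both grids must share H, W and C.
    """
    h, w = len(lidar_grid), len(lidar_grid[0])
    n = h * w
    # Flatten each grid into a sequence of per-cell feature vectors.
    lidar_tokens = [cell for row in lidar_grid for cell in row]
    rgb_tokens = [cell for row in rgb_grid for cell in row]
    # Joint attention over the tokens of both modalities.
    attended = self_attention(lidar_tokens + rgb_tokens)

    def add_back(tokens, delta):
        # Residual connection, then reshape the sequence back to [H][W][C].
        fused = [[x + d for x, d in zip(t, dl)] for t, dl in zip(tokens, delta)]
        return [fused[r * w:(r + 1) * w] for r in range(h)]

    return add_back(lidar_tokens, attended[:n]), add_back(rgb_tokens, attended[n:])

# Toy 2x2 grids with 2 channels per cell (made-up values).
lidar = [[[1.0, 0.0], [0.0, 1.0]], [[0.5, 0.5], [0.0, 0.0]]]
rgb = [[[0.0, 1.0], [1.0, 0.0]], [[0.2, 0.8], [1.0, 1.0]]]
fused_lidar, fused_rgb = fuse_top_view(lidar, rgb)
```

In a multi-scale setting, a function like `fuse_top_view` would be applied to the feature maps at each resolution of the two encoders; the naive alternative mentioned in the abstract would instead concatenate the two grids along the channel axis without any attention.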
Contributor : David Sierra-Gonzalez
Submitted on : Friday, September 2, 2022 - 2:39:07 PM
Last modification on : Tuesday, September 6, 2022 - 9:09:58 AM


Files produced by the author(s)


  • HAL Id : hal-03768008, version 1



Gustavo Salazar-Gomez, David Sierra González, Manuel Alejandro Diaz-Zapata, Anshul Paigwar, Wenqian Liu, et al. TransFuseGrid: Transformer-based Lidar-RGB fusion for semantic grid prediction. ICARCV 2022 - 17th International Conference on Control, Automation, Robotics and Vision, Dec 2022, Singapore, Singapore. pp.1-6. ⟨hal-03768008⟩


