Conference papers

A Benchmark of Four Methods for Generating 360° Saliency Maps from Eye Tracking Data

Abstract : Modeling and visualization of user attention in Virtual Reality is important for many applications, such as gaze prediction, robotics, retargeting, video compression, and rendering. Several methods have been proposed to model eye tracking data as saliency maps. We benchmark the performance of four such methods for 360° images. We provide a comprehensive analysis and implementations of these methods to assist researchers and practitioners. Finally, we make recommendations based on our benchmark analyses and the ease of implementation.

Index Terms: Saliency, Visualization, Eye movements in VR, 360° images

I. INTRODUCTION

With the explosive growth of commercial VR and AR systems, there is increased access to innovative 3D experiences. In particular, 360° images and videos are being used for entertainment, storytelling, and advertising. Content creators are actively working out techniques and building tools to guide user attention in these new media. A critical enabler for these efforts is measuring and visualizing eye tracking data. Eye trackers built into VR headsets serve as a reliable tool to understand how attention is allocated in 3D environments.

The study of attention and eye movements in 2D content is well established. Saliency maps highlight regions that attract the most visual attention and have applications in predicting gaze [1], compression [2], and selective rendering [3], to name a few. In 360° images, the user is surrounded by a photo-realistic virtual scene. Because only a fraction of the scene is viewed at once, the allocation of visual attention differs from that in 2D content. Moreover, due to the spherical nature of 360° content, novel saliency map generation methods are required.

To generate 2D saliency maps, eye tracking data is processed to identify fixations. Fixations are aggregated in a map that is convolved with a 2D Gaussian kernel. For 2D displays, the number of pixels per visual degree is assumed to be the same in the horizontal and vertical directions, so an isotropic Gaussian is used. The Kent distribution is an analog of a 2D Gaussian on the surface of a 3D sphere [4]. 360° images encode spherical data, and the natural extension is to process them using such a distribution. However, computing a Kent-based saliency map is slow because the kernel varies spatially across the image. Fortunately, several approximate alternatives exist.

In this paper, we benchmark four alternative methods for generating 360° saliency maps. We report accuracy and run-time for each algorithm, and present pseudocode to implement them. Based on these analyses and ease of implementation, we identify the most favorable approach.
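To illustrate why a spherical kernel varies across a 360° image, the following is a minimal sketch and not one of the four benchmarked methods. It assumes an equirectangular image layout and uses an isotropic spherical kernel (the Kent distribution with its ovalness parameter set to zero, i.e. a von Mises-Fisher kernel). The function names, the concentration value `kappa`, and the image size are hypothetical choices made for illustration.

```python
import numpy as np

def pixel_directions(height, width):
    """Unit vectors for the pixel centers of an equirectangular image (assumed layout)."""
    lat = (0.5 - (np.arange(height) + 0.5) / height) * np.pi      # top row near +pi/2
    lon = ((np.arange(width) + 0.5) / width - 0.5) * 2.0 * np.pi  # -pi .. +pi
    lon, lat = np.meshgrid(lon, lat)
    return np.stack([np.cos(lat) * np.cos(lon),
                     np.cos(lat) * np.sin(lon),
                     np.sin(lat)], axis=-1)

def spherical_saliency(fixations_deg, height, width, kappa=200.0):
    """Sum an isotropic spherical kernel, exp(kappa * (cos(angle) - 1)), per fixation.

    fixations_deg: iterable of (latitude, longitude) pairs in degrees.
    kappa: hypothetical concentration; larger values give a narrower kernel.
    """
    dirs = pixel_directions(height, width)
    saliency = np.zeros((height, width))
    for lat_deg, lon_deg in fixations_deg:
        lat, lon = np.radians(lat_deg), np.radians(lon_deg)
        center = np.array([np.cos(lat) * np.cos(lon),
                           np.cos(lat) * np.sin(lon),
                           np.sin(lat)])
        # dirs @ center is the cosine of the angle between each pixel and the fixation.
        saliency += np.exp(kappa * (dirs @ center - 1.0))
    return saliency / saliency.max()

# The same kernel on the sphere covers far more pixels near a pole than at the equator.
equator = spherical_saliency([(0.0, 0.0)], 90, 180)
polar = spherical_saliency([(80.0, 0.0)], 90, 180)
print((equator > 0.5).sum(), (polar > 0.5).sum())
```

Because the kernel footprint in pixel space depends on latitude, a single fixed 2D convolution kernel cannot reproduce it; this sketch evaluates the kernel at every pixel for every fixation, which is the kind of cost that motivates approximate alternatives.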

Contributor : Olivier Le Meur
Submitted on : Thursday, December 13, 2018 - 12:45:42 PM
Last modification on : Saturday, August 6, 2022 - 3:33:02 AM
Long-term archiving on : Thursday, March 14, 2019 - 1:45:43 PM


Files produced by the author(s)


  • HAL Id : hal-01953877, version 1


Brendan John, Pallavi Raiturkar, Olivier Le Meur, Eakta Jain. A Benchmark of Four Methods for Generating 360° Saliency Maps from Eye Tracking Data. Proceedings of The First IEEE International Conference on Artificial Intelligence and Virtual Reality, Dec 2018, Taichung, Taiwan. ⟨hal-01953877⟩


