Skip to Main content Skip to Navigation
Journal articles

Spectral bandits

Tomáš Kocák 1, 2 Rémi Munos 3, 2 Branislav Kveton 4 Shipra Agrawal 5 Michal Valko 3
2 Scool - Scool
Inria Lille - Nord Europe, CRIStAL - Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189
Abstract : Smooth functions on graphs have wide applications in manifold and semi-supervised learning. In this work, we study a bandit problem where the payoffs of arms are smooth on a graph. This framework is suitable for solving online learning problems that involve graphs, such as content-based recommendation. In this problem, each item we can recommend is a node of an undirected graph and its expected rating is similar to the one of its neighbors. The goal is to recommend items that have high expected ratings. We aim for the algorithms where the cumulative regret with respect to the optimal policy would not scale poorly with the number of nodes. In particular, we introduce the notion of an effective dimension, which is small in real-world graphs, and propose three algorithms for solving our problem that scale linearly and sublinearly in this dimension. Our experiments on content recommendation problem show that a good estimator of user preferences for thousands of items can be learned from just tens of node evaluations.
Document type :
Journal articles
Complete list of metadata
Contributor : Michal Valko Connect in order to contact the contributor
Submitted on : Sunday, December 20, 2020 - 11:30:58 PM
Last modification on : Friday, January 21, 2022 - 3:12:35 AM
Long-term archiving on: : Sunday, March 21, 2021 - 6:18:45 PM


Files produced by the author(s)


  • HAL Id : hal-03084249, version 1


Tomáš Kocák, Rémi Munos, Branislav Kveton, Shipra Agrawal, Michal Valko. Spectral bandits. Journal of Machine Learning Research, Microtome Publishing, 2020. ⟨hal-03084249⟩



Les métriques sont temporairement indisponibles