Secure Cumulative Reward Maximization in Linear Stochastic Bandits

The linear stochastic multi-armed bandit is a sequential learning setting, where, at each round, a learner chooses an arm and receives a stochastic reward based on an unknown linear function of the chosen arm. The goal is to collect as much reward as possible. Linear bandits have popular applications such as online recommendation based on user preferences, where obtaining a high reward means recommending an item with high expected rating. We address the security concerns that occur when outsourcing the data and the cumulative reward maximization algorithm to an honest-but-curious cloud. We propose LinUCB-DS, a distributed and secure protocol that achieves the same cumulative reward as the standard LinUCB algorithm, without disclosing to the cloud the linear function used to draw arm rewards. We formally prove the complexity and security properties of LinUCB-DS. We also show that LinUCB-DS can be easily adapted to secure the SpectralUCB algorithm, which improves LinUCB for a class of linear bandits. We show the feasibility of our protocols via a proof-of-concept experimental study using the MovieLens movie recommendation dataset.

Domaines

Cryptographie et sécurité [cs.CR] Apprentissage [cs.LG] Base de données [cs.DB]

Radu Ciucanu : Connectez-vous pour contacter le contributeur

https://inria.hal.science/hal-02942694

Soumis le : vendredi 18 septembre 2020-10:46:44

Dernière modification le : mardi 18 avril 2023-10:00:06

Dates et versions

hal-02942694 , version 1 (18-09-2020)

Identifiants

HAL Id : hal-02942694 , version 1
DOI : 10.1007/978-3-030-62576-4_13

Citer

Radu Ciucanu, Anatole Delabrouille, Pascal Lafourcade, Marta Soare. Secure Cumulative Reward Maximization in Linear Stochastic Bandits. International Conference on Provable and Practical Security (ProvSec), Nov 2020, Conférence online, Singapore. pp.257-277, ⟨10.1007/978-3-030-62576-4_13⟩. ⟨hal-02942694⟩

Exporter

BibTeX XML-TEI Dublin Core DC Terms EndNote DataCite

Collections

PRES_CLERMONT CNRS UNIV-ORLEANS LIMOS INSA-GROUPE INSA-CVL CLERMONT-AUVERGNE-INP

177 Consultations

4 Téléchargements