HAL will be down for maintenance from Friday, June 10 at 4pm through Monday, June 13 at 9am. More information
Skip to Main content Skip to Navigation
Journal articles

Secure protocols for cumulative reward maximization in stochastic multi-armed bandits

Abstract : We consider the problem of cumulative reward maximization in multi-armed bandits. We address the security concerns that occur when data and computations are outsourced to an honest-but-curious cloud i.e., that executes tasks dutifully, but tries to gain as much information as possible. We consider situations where data used in bandit algorithms is sensitive and has to be protected e.g., commercial or personal data. We rely on cryptographic schemes and propose UCB - MS, a secure multi-party protocol based on the UCB algorithm. We prove that UCB - MS computes the same cumulative reward as UCB while satisfying desirable security properties. In particular, cloud nodes cannot learn the cumulative reward or the sum of rewards for more than one arm. Moreover, by analyzing messages exchanged among cloud nodes, an external observer cannot learn the cumulative reward or the sum of rewards produced by some arm. We show that the overhead due to cryptographic primitives is linear in the size of the input. Our implementation confirms the linear-time behavior and the practical feasibility of our protocol, on both synthetic and real-world data.
Complete list of metadata

https://hal.inria.fr/hal-03564146
Contributor : Radu Ciucanu Connect in order to contact the contributor
Submitted on : Thursday, February 10, 2022 - 11:04:31 AM
Last modification on : Wednesday, May 4, 2022 - 8:19:03 AM

Identifiers

Citation

Radu Ciucanu, Pascal Lafourcade, Marius Lombard-Platet, Marta Soare. Secure protocols for cumulative reward maximization in stochastic multi-armed bandits. Journal of Computer Security, IOS Press, 2022, pp.1-27. ⟨10.3233/JCS-210051⟩. ⟨hal-03564146⟩

Share

Metrics

Record views

30