Laura Invited Speaker at the Lancaster-DeepMind Workshop on Bandit

The Statistics group at the Department of Mathematics at Lancaster University and Google Deepmind will be holding a specialist workshop on multi-armed bandits in London on the 25th and 26th of September.

Laura will give a talk on Laplacian-regularized Graph Bandits 

Abstract: We study a stochastic linear bandit problem with multiple users, where we model user relationship as a graph, and assume parameters (preferences) of users form smooth signals on the graph. We propose a bandit algorithm employing a novel Upper Bound Confidence (UCB) defined based on the random-walk normalized graph Laplacian. Our algorithm yields a regret bound of $\Tilde{O}(\Psi d \sqrt{T})$ for each user where $\Psi$ depends on the connectivity between this user and its neighbors. Specifically, $\Psi \in (0,1]$ and denser connectivity leads to smaller $\Psi$, hence to lower regret. This proves the gain in exploring the graph smoothness prior, as opposed to consider all users independently (with a single-user regret bound of $\Tilde{O}(d \sqrt{T})$). Moreover, involving users graph into bandit algorithms typically results in a computation complexity scaling quadratically with the number of users. Hence, we propose a simplified algorithm, which enjoys the same regret bound theoretically but scales linearly with the number of user. We also present a finite-time analysis on proposed algorithms, and demonstrate their advantage in comparison with state-of-the-art graph-based bandit algorithms on both synthetic and real-world datasets.

0 comments on “Laura Invited Speaker at the Lancaster-DeepMind Workshop on Bandit

Leave a Reply

%d bloggers like this: