Kullback-Leibler Upper Confidence Bounds for Optimal Sequential Allocation.

2013, Discussing articles

Olivier Cappé, Aurélien Garivier,
Odalric-Ambrym Maillard,
Rémi Munos, Gilles Stoltz.
In The Annals of Statistics, 2013.



We consider optimal sequential allocation in the context of the so-called stochastic multi-armed bandit model. We describe a generic index policy, in the sense of  Gittins (1979), based on upper confidence bounds of the arm payoffs computed using the Kullback-Leibler divergence. We consider two classes of distributions for which instances of this general idea are analyzed: The kl-UCB algorithm is designed for one-parameter exponential families and the empirical KL-UCB algorithm for bounded and finitely supported distributions. Our main contribution is a unified finite-time analysis of the regret of these algorithms that asymptotically matches the lower bounds of Lai and Robbins (1985) and Burnetas et Katehakis (1996), respectively. We also investigate the behavior of these algorithms when used with general bounded rewards, showing in particular that they provide significant improvements over the state-of-the-art.

You can dowload the paper from the Annals of Statistics (here) or from the HAL online open depository* (here).

AUTHOR = {Olivier Capp\'{e}, Aur\'{e}lien Garivier, Odalric-Ambrym Maillard, R\'{e}mi Munos, Gilles Stoltz},
TITLE = {Kullback–Leibler upper confidence bounds for optimal sequential allocation},
JOURNAL = {Ann. Statist.},
FJOURNAL = {Annals of Statistics},
YEAR = {2013},
VOLUME = {41},
NUMBER = {3},
PAGES = {1516-1541},
ISSN = {0090-5364},
DOI = {10.1214/13-AOS1119} }
Related Publications:
Finite-Time Analysis of Multi-armed Bandits Problems with Kullback-Leibler Divergences.
Odalric-Ambrym Maillard, Gilles Stoltz, Rémi Munos.
In Proceedings of the 24th annual Conference On Learning Theory, COLT, 2011.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s