Odalric-Ambrym Maillard.
PhD thesis, Université de Lille 1, October 2011.
[AFIA PhD Prize 2012]
[Download]
Abstract: 
This thesis studies the following topics in Machine Learning: Bandit theory, Statistical learning and Reinforcement learning. The common underlying thread is the non-asymptotic study of various notions of adaptation: to an environment or an opponent in part I about bandit theory, to the structure of a signal in part II about statistical theory, to the structure of states and rewards or to some state-model of the world in part III about reinforcement learning.
First we derive a non-asymptotic analysis of a Kullback-Leibler-based algorithm for the stochastic multi-armed bandit that matches, in the case of distributions with finite support, the asymptotic distribution-dependent lower bound known for this problem. Next, for a multi-armed bandit with a possibly adaptive opponent, we introduce history-based models to capture some weakness of the opponent, and show how one can benefit from such models to design algorithms that adapt to this weakness.
Then we contribute to the regression setting and show how the use of random matrices can be beneficial both theoretically and numerically when the considered hypothesis space has a large, possibly infinite, dimension. We also use random matrices in the sparse recovery setting to build sensing operators that allow for recovery when the basis is far from being orthogonal.
Finally we combine parts I and II to first provide a non-asymptotic analysis of reinforcement learning algorithms such as Bellman-residual minimization and a version of least-squares temporal-difference that uses random projections and then, upstream of the Markov Decision Problem setting, discuss the practical problem of choosing a good model of states.

You can download my Ph.D. manuscript from the University website (here).
Bibtex: 
@phdthesis{maillard2011apprentissage,
title={APPRENTISSAGE S{\'E}QUENTIEL: Bandits, Statistique et Renforcement.},
author={Maillard, Odalric-Ambrym},
year={2011},
school={Universit{\'e} des Sciences et Technologie de Lille - Lille I}
} 