Community - Partner Event

RL Theory Seminar: Adaptive Reward-Free Exploration

When
Oct. 13, 2020
11:00 AM - 12:00 PM
Where

Online

Amii is proud to support our province's growing AI community. The RL Theory Seminars are hosted independently by researchers Gergely Neu, Ciara Pike-Burke, and Amii Fellow Csaba Szepesvári.

Speaker: Pierre Ménard (Inria Lille)

Paper: https://arxiv.org/abs/2006.06294

Authors: Emilie Kaufmann, Pierre Ménard, Omar Darwiche Domingues, Anders Jonsson, Edouard Leurent, Michal Valko

Abstract: Reward-free exploration is a reinforcement learning setting recently studied by Jin et al., who address it by running several algorithms with regret guarantees in parallel. In our work, we instead propose a more adaptive approach for reward-free exploration which directly reduces upper bounds on the maximum MDP estimation error. We show that, interestingly, our reward-free UCRL algorithm can be seen as a variant of an algorithm of Fiechter from 1994, originally proposed for a different objective that we call best-policy identification. We prove that RF-UCRL needs O((SAH^4/ε^2) ln(1/δ)) episodes to output, with probability 1−δ, an ε-approximation of the optimal policy for any reward function. We empirically compare it to oracle strategies using a generative model.
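To give a feel for how the episode bound quoted in the abstract scales, here is a minimal Python sketch. It assumes the usual reading of S, A, and H as the number of states, the number of actions, and the episode horizon (the abstract does not define them), and it drops the constant hidden by the O(·) notation, so only ratios between values are meaningful. The function name and the example values are hypothetical and chosen purely for illustration.

```python
import math


def rf_ucrl_episode_scale(S, A, H, eps, delta):
    """Leading-order term (S * A * H^4 / eps^2) * ln(1/delta).

    The constant hidden by O(.) is omitted, so the result is not an
    actual episode count; compare values only as ratios.
    """
    return (S * A * H**4 / eps**2) * math.log(1.0 / delta)


# Hypothetical problem sizes, for illustration only.
base = rf_ucrl_episode_scale(S=10, A=4, H=5, eps=0.1, delta=0.05)
halved_eps = rf_ucrl_episode_scale(S=10, A=4, H=5, eps=0.05, delta=0.05)
doubled_H = rf_ucrl_episode_scale(S=10, A=4, H=10, eps=0.1, delta=0.05)

print("halving eps multiplies the bound by about", round(halved_eps / base, 2))
print("doubling H multiplies the bound by about", round(doubled_H / base, 2))
```

Under these assumptions, the bound grows quadratically as the accuracy ε shrinks and with the fourth power of the horizon H, while the confidence level δ enters only logarithmically.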
