AI Seminar – Emilie Kaufmann
Online
Title: On the complexity of learning good policies with and without rewards
Abstract: This talk will revolve around two performance criteria that have been studied in the context of episodic reinforcement learning: an old one, Best Policy Identification (BPI) [Fiechter, 1994], and a new one, Reward-Free Exploration (RFE) [Jin et al., 2020]. We will see that a variant of the very first BPI algorithm can actually be used for the more challenging reward-free exploration problem. This Reward-Free UCRL algorithm, which adaptively explores the MDP and adaptively decides when to stop exploration, requires fewer exploration episodes than state-of-the-art algorithms. We will then present alternative algorithms for the BPI objective and discuss the relative complexity of BPI and RFE.
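To illustrate the reward-free exploration setting discussed in the abstract, here is a minimal sketch of the two-phase protocol in a tabular episodic MDP. It assumes a hypothetical environment interface (env.reset and env.step returning only next states, with no reward signal) and uses uniformly random actions as a placeholder exploration rule; this is not the speaker's Reward-Free UCRL, which instead selects actions via uncertainty-based bonuses and stops exploration adaptively.

# Sketch of reward-free exploration: explore without rewards, then plan
# for any reward function supplied afterwards. Environment interface and
# names are illustrative assumptions, not taken from the paper.
import numpy as np

def explore_without_rewards(env, num_episodes, horizon, rng):
    """Exploration phase: collect transition counts without observing rewards."""
    counts = np.zeros((env.n_states, env.n_actions, env.n_states))
    for _ in range(num_episodes):
        s = env.reset()
        for _ in range(horizon):
            # Placeholder: uniformly random actions. RF-UCRL would instead
            # pick actions that reduce uncertainty about the transition model.
            a = rng.integers(env.n_actions)
            s_next = env.step(s, a)
            counts[s, a, s_next] += 1
            s = s_next
    return counts

def plan_for_reward(counts, reward, horizon):
    """Planning phase: backward induction on the empirical transition model
    for an arbitrary reward function of shape (n_states, n_actions)."""
    n_states, n_actions, _ = counts.shape
    totals = counts.sum(axis=2, keepdims=True)
    # Empirical transition probabilities; unvisited pairs default to uniform.
    p_hat = np.where(totals > 0, counts / np.maximum(totals, 1), 1.0 / n_states)
    v = np.zeros(n_states)
    policy = np.zeros((horizon, n_states), dtype=int)
    for h in reversed(range(horizon)):
        q = reward + p_hat @ v          # Q-values, shape (n_states, n_actions)
        policy[h] = q.argmax(axis=1)
        v = q.max(axis=1)
    return policy

In the adaptive algorithms the abstract refers to, the number of exploration episodes is not fixed in advance: the agent monitors its estimation error and stops exploring once near-optimal planning is guaranteed for every possible reward function.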
Bio: Emilie Kaufmann is a CNRS researcher in the CRIStAL laboratory at the University of Lille. She is also a member of the Inria Scool team (formerly SequeL), whose expertise is in sequential decision making. She has worked extensively on the stochastic multi-armed bandit problem, in particular toward a better understanding of the difference between reward maximization and pure-exploration problems. She has also recently worked on exploration for reinforcement learning.
The University of Alberta Artificial Intelligence (AI) Seminar is a weekly meeting where researchers (including students, developers, and professors) interested in AI can share their current research. Presenters include local speakers from the University of Alberta and industry, as well as other institutions. The seminars cover a wide range of topics related in any way to Artificial Intelligence, from foundational theoretical work to innovative applications of AI techniques to new fields and problems. Learn more at the AI Seminar website and by subscribing to the mailing list!