News
The Tea Time Talks are back! Throughout the summer, take in 20-minute talks on early-stage ideas, prospective research and technical topics delivered by students, faculty and guests. Presented by Amii and the RLAI Lab at the University of Alberta, the talks are a relaxed and informal way of hearing leaders in AI discuss future lines of research they may explore.
Watch select talks from the first week of the series now:
The first Tea Time Talk of 2021 features a panel of reinforcement learning (RL) researchers -- all Amii Fellows, Canada CIFAR AI Chairs and UAlberta professors. Martha White moderates this panel featuring Adam White, Csaba Szepesvári, Matthew E. Taylor and Michael Bowling.
Abstract: Planning, a computational process widely thought essential to intelligence, consists of imagining courses of action and their consequences, and deciding ahead of time which ones to do. In the standard RLAI agent architecture, the component that does the imagining of consequences is called the model of the environment, and the deciding in advance is via a change in the agent’s policy. Planning and model learning have been studied for seven decades and yet remain largely unsolved in the face of genuine approximation—models that remain approximate (do not become exact) in the high-data limit. In this talk, Richard Sutton briefly assesses the challenges of extending RL-style planning (value iteration) in the most important ways: average reward, partial observability, stochastic transitions, and temporal abstraction (options). His assessment is that these extensions are straightforward until they are combined with genuine approximation in the model, in which case we have barely a clue how to proceed in a scalable way. Nevertheless, we do have a few clues; Rich suggests the ideas of expectation models, ‘meta data’, and search as general strategies for learning approximate environment models suitable for use in planning.
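For readers wanting a concrete anchor, the planning computation Sutton refers to, value iteration, is commonly written as the Bellman backup (a standard textbook formulation, not a quotation from the talk):

$$ v_{k+1}(s) = \max_a \sum_{s', r} \hat{p}(s', r \mid s, a)\,\big[\, r + \gamma\, v_k(s') \,\big], $$

where $\hat{p}$ is the agent's learned model of the environment and $\gamma$ is the discount factor. The difficulties the talk highlights arise when $\hat{p}$ stays only approximately correct even with unlimited data, so the backups are computed against a model that never converges to the true dynamics.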
Abstract: Policy gradient methods are a natural choice for learning a parameterized policy, especially for continuous actions, in a model-free way. These methods update policy parameters with stochastic gradient descent by estimating the gradient of a policy objective. Many of these methods can be derived from or connected to a well-known policy gradient theorem that writes the true gradient in the form of the gradient of the action likelihood, which is suitable for model-free estimation. In this talk, Rupam Mahmood revisits this theorem and looks for other forms of writing the true gradient that may give rise to new classes of policy gradient methods.
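For reference, the policy gradient theorem mentioned in the abstract is commonly stated as (a standard form, not taken verbatim from the talk):

$$ \nabla_\theta J(\theta) \;\propto\; \sum_s \mu_\pi(s) \sum_a q_\pi(s,a)\, \nabla_\theta \pi(a \mid s, \theta) \;=\; \mathbb{E}_{\pi}\!\big[\, q_\pi(S,A)\, \nabla_\theta \ln \pi(A \mid S, \theta) \,\big], $$

where $\mu_\pi$ is the state distribution under the current policy. The log-likelihood form on the right is what makes model-free, sample-based estimation possible; the question raised in the talk is whether other ways of writing the same gradient could be estimated just as directly.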
Like what you’re learning here? Take a deeper dive into the world of RL with the Reinforcement Learning Specialization, offered by the University of Alberta and Amii. Taught by Martha White and Adam White, this specialization explores how RL solutions help solve real-world problems through trial-and-error interaction, showing learners how to implement a complete RL solution from beginning to end. Enroll in this specialization now!
Nov 7th 2024
News
Amii partners with pipikwan pêhtâkwan and its startup company wâsikan kisewâtisiwin to harness AI in efforts to challenge misinformation about Indigenous People and to include Indigenous People in the development of AI. The project is supported by the PrairiesCan commitment to accelerate AI adoption among SMEs in the Prairie region.
Nov 7th 2024
News
Amii Fellow and Canada CIFAR AI Chair Russ Greiner and University of Alberta researcher and collaborator David Wishart were awarded the Brockhouse Canada Prize for Interdisciplinary Research in Science and Engineering from the Natural Sciences and Engineering Research Council of Canada (NSERC).
Nov 6th 2024
News
Amii founding member Jonathan Schaeffer has spent 40 years making huge impacts in game theory and AI. Now he’s retiring from academia and sharing some of the insights he’s gained over his impressive career.