News
Now that the 2020 Tea Time Talks are on YouTube, you can always have time for tea with Amii and the RLAI Lab! Hosted by Amii’s Chief Scientific Advisor, Dr. Richard S. Sutton, these 20-minute talks on technical topics are delivered by students, faculty and guests. The talks are a relaxed and informal way of hearing leaders in AI discuss future lines of research they may explore, with topics ranging from ideas starting to take root to fully finished projects.
Week eleven of the Tea Time Talks features:
Many supervised learning algorithms are designed to operate under i.i.d. sampling. When those algorithms are applied to problems with nonstationary sampling, they can misbehave, which is not surprising once one understands the conditions under which an algorithm's behaviour is (or is not) guaranteed. Dynamical systems analysis offers some tools for extending those guarantees to certain kinds of nonstationary sampling. This talk illustrates these ideas in a simple setting: optimizing linear regression models with SGD+momentum under simple periodic nonstationarity.
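To make the setting concrete, here is a minimal Python sketch of linear regression trained with SGD and heavy-ball momentum while the input distribution switches periodically between two phases. It is a toy illustration under stated assumptions, not the speaker's experiment: the function name `sgd_momentum_periodic`, the target weights, the two-phase sampling scheme, and all hyper-parameter values are hypothetical.

```python
import random


def sgd_momentum_periodic(steps=4000, period=50, lr=0.01, beta=0.9, seed=0):
    """Fit y = w . x with SGD + heavy-ball momentum while the input
    distribution alternates between two phases every `period` steps.

    All constants here are illustrative assumptions, not values from the talk.
    """
    rng = random.Random(seed)
    w_true = [2.0, -1.0]   # hypothetical target weights (fixed across phases)
    w = [0.0, 0.0]         # model weights
    v = [0.0, 0.0]         # momentum buffer
    errors = []            # squared distance to w_true after each update

    for t in range(steps):
        # Periodic nonstationarity: phase 0 mostly excites feature 0,
        # phase 1 mostly excites feature 1.
        phase = (t // period) % 2
        scales = (1.0, 0.1) if phase == 0 else (0.1, 1.0)
        x = [rng.gauss(0.0, s) for s in scales]

        # Noise-free label from the fixed target weights.
        y = sum(wi * xi for wi, xi in zip(w_true, x))
        err = sum(wi * xi for wi, xi in zip(w, x)) - y

        # The gradient of 0.5 * err**2 with respect to w is err * x.
        for i in range(2):
            v[i] = beta * v[i] - lr * err * x[i]
            w[i] += v[i]

        errors.append(sum((a - b) ** 2 for a, b in zip(w, w_true)))

    return w, errors


w, errors = sgd_momentum_periodic()
print("final weights:", w)
print("final squared error:", errors[-1])
```

Varying `period`, `lr` and `beta` in this sketch is one way to explore how the switching schedule interacts with the optimizer's dynamics.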
Multi-step greedy policies have been extensively used in model-based reinforcement learning (RL), both when a model of the environment is available (for example, in the game of Go) and when it is learned. In this talk, Manan presents a paper he co-authored which explores the benefits of multi-step greedy policies in model-free RL, when employed using multi-step dynamic programming algorithms: $\kappa$-Policy Iteration ($\kappa$-PI) and $\kappa$-Value Iteration ($\kappa$-VI). These methods iteratively compute the next policy ($\kappa$-PI) and value function ($\kappa$-VI) by solving a surrogate decision problem with a shaped reward and a smaller discount factor. The authors derive model-free RL algorithms based on $\kappa$-PI and $\kappa$-VI in which the surrogate problem can be solved by any discrete or continuous action RL method, such as DQN and TRPO. They also identify the importance of a hyper-parameter that controls the extent to which the surrogate problem is solved, and suggest a way to set it. When evaluated on a range of Atari and MuJoCo benchmark tasks, their results indicate that for the right range of $\kappa$, their algorithms outperform DQN and TRPO. This suggests that their multi-step greedy algorithms are general enough to be applied on top of any existing RL algorithm and can significantly improve its performance.
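For readers who want the surrogate problem spelled out, the following is a sketch of its usual form in the $\kappa$-greedy literature. The symbols $\gamma$ (the original discount factor), $V$ (the current value estimate), $r_t$ and $s_t$ are notation assumed here and do not appear in the summary above.

$$
r_t(\kappa, V) = r_t + \gamma (1 - \kappa)\, V(s_{t+1}), \qquad \gamma_\kappa = \kappa \gamma, \qquad \kappa \in [0, 1].
$$

Under this form, $\kappa = 0$ reduces to the ordinary one-step greedy choice with respect to $V$, while $\kappa = 1$ recovers the original problem with reward $r_t$ and discount $\gamma$.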
Robin shares some highlights and lessons from a year of interviewing RL researchers on the TalkRL podcast. He also dives deep into a Pommerman agent he designed.
In 1990, Scott E. Fahlman and Christian Lebiere proposed a constructive neural network architecture, the cascade-correlation, as an alternative to training deep neural networks with fixed architectures using backpropagation. Despite showing promising results and spurring several follow-up papers, the cascade-correlation is not popular in the deep learning community. In this talk, Juan explores why the cascade-correlation fell out of favour, presenting several empirical results that demonstrate its performance under several settings and in different domains. He discusses disadvantages of the cascade-correlation that have been found in the literature, as well as extensions that have been proposed to address each of them. He concludes by arguing why the cascade-correlation is worth caring about.
The Tea Time Talks have now concluded for the year, but stay tuned as we will be uploading the final talks next week. In the meantime, you can rewatch or catch up on previous talks on our YouTube playlist.