News
The Tea Time Talks are back! Throughout the summer, take in 20-minute talks on early-stage ideas, prospective research and technical topics delivered by students, faculty and guests. Presented by Amii and the RLAI Lab at the University of Alberta, the talks are a relaxed and informal way of hearing leaders in AI discuss future lines of research they may explore.
Watch select talks from the second week of the series now:
Abstract: In this talk, Michael Bowling looks at some of the often-unstated principles common in multiagent learning research, suggesting that they may be responsible for holding us back and, more importantly, might be holding back more than just multiagent learning. In response, he offers an alternative set of principles, which leads to the view of hindsight rationality, rooted in online learning (and connected to correlated equilibria). He questions the beloved approach of train-then-test and the focus on evaluating artifacts with a future-looking lens and comparison to the optimal, replacing them instead with a single lifetime and a focus on evaluating behaviour with a hindsight lens and comparison to targeted deviations of behaviour. This talk is the culmination of a year-long collaboration that introduces an alternative to Nash equilibria (with papers at AAAI and ICML this year). Michael only cursorily touches on the technical contributions of those papers, focusing instead on the more philosophical principles. View the papers if you want to dig deeper: "Hindsight and Sequential Rationality of Correlated Play" and "Efficient Deviation Types and Learning for Hindsight Rationality in Extensive-Form Games."
Abstract: The objectives of Patrick Pilarski's talk are to: 1) Define "constructivism" and "tightly coupled" in the context of human-machine interfaces (specifically the setting of neuroprostheses); 2) Propose that for maximum potential, tightly coupled interfaces should be partially or fully constructivist; 3) Give concrete examples of how this perspective leads to beneficial properties in tightly coupled interactions, drawn from his past 10 years of work on constructing predictions and state in upper-limb prosthetic interfaces.
Abstract: Policy gradient methods are a natural choice for learning a parameterized policy, especially for continuous actions, in a model-free way. These methods update policy parameters with stochastic gradient descent by estimating the gradient of a policy objective. Many of these methods can be derived from or connected to a well-known policy gradient theorem that writes the true gradient in the form of the gradient of the action likelihood, which is suitable for model-free estimation. In this talk, Rupam Mahmood revisits this theorem and looks for other forms of writing the true gradient that may give rise to new classes of policy gradient methods.
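For readers unfamiliar with it, the policy gradient theorem the abstract refers to is commonly stated in the following likelihood-ratio form (a standard sketch in conventional notation; the state distribution d^π and action-value function Q^π are standard symbols not taken from the abstract itself):

```latex
\nabla_\theta J(\theta)
  = \mathbb{E}_{s \sim d^{\pi_\theta},\; a \sim \pi_\theta(\cdot \mid s)}
    \left[ \nabla_\theta \log \pi_\theta(a \mid s)\, Q^{\pi_\theta}(s, a) \right]
```

Because the gradient applies only to the log-likelihood of the chosen action, the expectation can be estimated from sampled interaction alone, which is what makes this form suitable for model-free learning; Mahmood's talk asks whether the true gradient can be written in other forms with the same property.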
Like what you’re learning here? Take a deeper dive into the world of RL with the Reinforcement Learning Specialization, offered by the University of Alberta and Amii. Taught by Martha White and Adam White, this specialization explores how RL solutions help solve real-world problems through trial-and-error interaction, showing learners how to implement a complete RL solution from beginning to end. Enroll in this specialization now!