Research Post
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a policy neural network used to select actions to be performed by a reinforcement learning agent interacting with an environment. In one aspect, a method includes obtaining path data defining a path through the environment traversed by the agent. A consistency error is determined for the path from a combined reward, first and last soft-max state values, and a path likelihood. A value update for the current values of the policy neural network parameters is determined from at least the consistency error. The value update is used to adjust the current values of the policy neural network parameters.
Feb 15th 2022
Research Post
Read this research paper, co-authored by Amii Fellow and Canada CIFAR AI Chair Osmar Zaiane: "UCTransNet: Rethinking the Skip Connections in U-Net from a Channel-Wise Perspective with Transformer".
Sep 27th 2021
Research Post
Sep 17th 2021