Alberta Machine Intelligence Institute

1-Minute Research: Gautham Vasan, Deep Policy Gradient Methods without Batch Updates, Target Networks, or Replay Buffers

Published: Nov 29, 2024

Content Type: Technical

Subject Matter: Research

Author: Gautham Vasan