BIMSA
Seminar on Control Theory and Nonlinear Filtering
Policy Gradient Methods with Baselines and Advanced Techniques for Policy Learning
Organizer
Speaker
Yangtianze Tao
Time
Tuesday, October 25, 2022 8:00 PM - 8:30 PM
Venue
Online
Abstract
Last time we derived the policy gradient and introduced two policy gradient methods: REINFORCE and Actor-Critic. While these methods are correct in theory, they do not work well in practice. This time we introduce a baseline: policy gradient with a baseline can greatly improve the performance of policy gradient methods. With a baseline, REINFORCE becomes REINFORCE with Baseline, and Actor-Critic becomes Advantage Actor-Critic (A2C).
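As a rough illustration of why a baseline helps, the sketch below compares the single-sample policy gradient estimator (r - b) * grad log pi(a) with and without a baseline on a small bandit problem. It is not taken from the talk. The three-action softmax policy, the reward values, and the names softmax, grad_log_pi and estimator_stats are all assumptions made for this example. The baseline is taken to be the expected reward under the policy, a common choice. Because the action space is tiny, the mean and variance are computed exactly by enumeration instead of by sampling.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def grad_log_pi(probs, action):
    """Gradient of log pi(action) w.r.t. the logits of a softmax policy.

    Component k is 1[k == action] - pi_k.
    """
    return [(1.0 if k == action else 0.0) - p for k, p in enumerate(probs)]


def estimator_stats(probs, rewards, baseline):
    """Exact mean and total variance of (r - b) * grad log pi(a), a ~ pi.

    The total variance is the sum of the per-component variances.
    """
    n = len(probs)
    samples = [
        [(rewards[a] - baseline) * g for g in grad_log_pi(probs, a)]
        for a in range(n)
    ]
    mean = [sum(probs[a] * samples[a][k] for a in range(n)) for k in range(n)]
    var = sum(
        probs[a] * (samples[a][k] - mean[k]) ** 2
        for a in range(n)
        for k in range(n)
    )
    return mean, var


# Hypothetical bandit: rewards share a large offset, which inflates the
# variance of the plain estimator.
probs = softmax([0.0, 0.5, -0.5])
rewards = [10.0, 11.0, 9.0]

# Baseline (assumed here): expected reward under the current policy.
value = sum(p * r for p, r in zip(probs, rewards))

mean_plain, var_plain = estimator_stats(probs, rewards, baseline=0.0)
mean_base, var_base = estimator_stats(probs, rewards, baseline=value)

print("mean without baseline:", mean_plain)
print("mean with baseline:   ", mean_base)
print("variance without baseline:", var_plain)
print("variance with baseline:   ", var_base)
```

The two means agree because the expectation of grad log pi(a) under the policy is zero, so subtracting a baseline that does not depend on the action leaves the estimator unbiased. Only the variance changes, and in this example it is lower with the baseline.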