Deep Q-network (DQN) and Q-learning
Organizer
Speaker
陶飏天择
Time
September 13, 2022, 15:00 to 15:30
Venue
Online
Tencent 735 7908 4302
Abstract
In this talk, we describe how to approximate the optimal action-value function with a neural network, known as a deep Q-network (DQN). We also introduce the temporal difference (TD) algorithm used to train the DQN.
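
As a rough illustration of the TD idea mentioned in the abstract, the sketch below applies a single Q-learning update. It is not taken from the talk: a linear model stands in for the neural network to keep the example short, and the names `q_values` and `td_update`, the discount factor `gamma`, and the learning rate `lr` are all assumptions made for illustration.

```python
# Hypothetical sketch: one TD (Q-learning) update for a linear Q-function.
# A linear model stands in for the neural network; all names are illustrative.

def q_values(weights, state):
    """Return Q(s, a) for every action a; weights[a] is one weight vector per action."""
    return [sum(w_i * s_i for w_i, s_i in zip(w, state)) for w in weights]

def td_update(weights, state, action, reward, next_state, done, gamma=0.9, lr=0.1):
    """Apply one semi-gradient TD update in place and return the TD error."""
    # TD target: r + gamma * max_a' Q(s', a'), with no bootstrap at terminal states.
    bootstrap = 0.0 if done else max(q_values(weights, next_state))
    target = reward + gamma * bootstrap
    td_error = target - q_values(weights, state)[action]
    # For a linear model, the gradient of Q(s, a) w.r.t. weights[action] is the state.
    weights[action] = [w + lr * td_error * s for w, s in zip(weights[action], state)]
    return td_error

weights = [[0.0, 0.0], [0.0, 0.0]]  # two actions, two state features
error = td_update(weights, state=[1.0, 0.0], action=1, reward=1.0,
                  next_state=[0.0, 1.0], done=False)
```

In a DQN the same TD error would typically drive a gradient step on the network parameters rather than on a per-action weight vector.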