- Seminar on Control Theory and Nonlinear Filtering

Deep Q-network (DQN) and Q-learning

Organizer

丘成栋

Speaker

陶飏天择

Time

September 13, 2022, 15:00 to 15:30

Venue

Online

Tencent 735 7908 4302

Abstract

In this talk, we describe how to approximate the optimal action-value function with a neural network, known as a deep Q-network (DQN). We also introduce the temporal difference (TD) algorithm used to train the DQN.