Modern control theory
This course explores the connections between optimal control and reinforcement learning, bridging classical techniques (dynamic programming, LQR, MPC) with modern data-driven methods (Q-learning, policy gradient, deep RL). Students will learn the mathematical foundations (Bellman equations, value and policy iteration), optimal control (LQR, LQG, Model Predictive Control), approximate DP and RL (Monte Carlo, TD learning, actor-critic methods), and applications in robotics, autonomous systems, and finance. The course balances theory (convergence, stability) with implementation (Python examples).

Lecturer
Date
September 3, 2025 to December 31, 2025
Location
Weekday | Time | Venue | Online Platform | Meeting ID | Password |
---|---|---|---|---|---|
Wednesday | 14:20 - 16:55 | A3-1-101 | ZOOM 03 | 242 742 6089 | BIMSA |
Prerequisites
Linear algebra, probability theory, calculus, optimization
Syllabus
Foundations of Optimal Control & Exact DP
1: Introduction to Dynamic Programming
2: Deterministic Continuous-Time Optimal Control
3: Stochastic DP and the LQG Problem
4: Model Predictive Control (MPC)
5: Infinite Horizon Problems
6: Shortest Path Problems & Computational Methods
Approximate DP & RL Basics
7: Approximate Value Iteration
8: Monte Carlo & Temporal Difference Learning
9: Policy Gradient Methods
10: Approximate Policy Iteration
Advanced Topics
11: Robust DP and H-infinity Control
12: Multiagent RL and Games
13: Inverse Reinforcement Learning
14: Deep Reinforcement Learning
References
Bertsekas, D. P. (2019). Reinforcement Learning and Optimal Control. Athena Scientific.
Sutton, R. S., & Barto, A. G. Reinforcement Learning: An Introduction.
da Silva, B. C. (2022). Reinforcement Learning Lecture Notes (Fall 2022).
Audience
Undergraduate, Advanced Undergraduate, Graduate, Postdoc, Researcher
Video Public
No
Notes Public
No
Language
Chinese, English
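Lecturer Introduction
Xiaopei Jiao (焦小沛) received a bachelor's degree from Zhiyuan College, Shanghai Jiao Tong University, and a PhD from the Department of Mathematical Sciences, Tsinghua University, followed by postdoctoral positions at the Beijing Institute of Mathematical Sciences and Applications (BIMSA) and the University of Twente in the Netherlands. Current research areas include finite-dimensional filtering theory, the Yau-Yau filtering method, physics-informed neural networks, and bioinformatics. Research interests focus on (1) using geometric tools such as Lie algebras to solve partial differential equations and to classify finite-dimensional filtering systems; (2) designing novel numerical algorithms based on physics-informed neural networks.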