Modern control theory
This course explores the deep connections between optimal control and reinforcement learning, bridging classical techniques (Dynamic Programming, LQR, MPC) with modern data-driven methods (Q-Learning, Policy Gradient, Deep RL). Students will learn: Mathematical foundations (Bellman equations, value/policy iteration); Optimal control (LQR, LQG, Model Predictive Control); Approximate DP & RL (Monte Carlo, TD Learning, Actor-Critic methods); Applications in robotics, autonomous systems, and finance. The course balances theory (convergence, stability) and implementation (Python examples).
Lecturer
Date
September 3 to December 31, 2025
Location
| Weekday | Time | Venue | Online | ID | Password |
|---|---|---|---|---|---|
| Wednesday | 14:20 - 16:55 | A3-1-101 | ZOOM 03 | 242 742 6089 | BIMSA |
Prerequisites
Linear algebra, probability theory, calculus, optimization
Syllabus
Foundations of Optimal Control & Exact DP
1: Introduction to Dynamic Programming
2: Deterministic Continuous-Time Optimal Control
3: Stochastic DP and the LQG Problem
4: Model Predictive Control (MPC)
5: Infinite Horizon Problems
6: Shortest Path Problems & Computational Methods
Approximate DP & RL Basics
7: Approximate Value Iteration
8: Monte Carlo & Temporal Difference Learning
9: Policy Gradient Methods
10: Approximate Policy Iteration
Advanced Topics
11: Robust DP and H-infinity Control
12: Multiagent RL and Games
13: Inverse Reinforcement Learning
14: Deep Reinforcement Learning
References
Bertsekas, D. P. (2019). Reinforcement Learning and Optimal Control. Athena Scientific.
Sutton, R. S., & Barto, A. G. Reinforcement Learning: An Introduction.
da Silva, B. C. Reinforcement Learning Lecture Notes (Fall 2022).
Audience
Undergraduate, Advanced Undergraduate, Graduate, Postdoc, Researcher
Video Public
No
Notes Public
No
Language
Chinese, English
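Lecturer Intro
Xiaopei Jiao (焦小沛) received a bachelor's degree in 2017 from Zhiyuan College (physics class), Shanghai Jiao Tong University, and a PhD in 2022 from the Department of Mathematical Sciences, Tsinghua University, under the supervision of Professor Stephen S.-T. Yau (丘成栋; IEEE Fellow, formerly a tenured professor at the University of Illinois at Chicago). Jiao then held postdoctoral positions at the Beijing Institute of Mathematical Sciences and Applications (BIMSA) and at the University of Twente, the Netherlands (advisor: Professor Johannes Schmidt-Hieber, Fellow of the Institute of Mathematical Statistics). Current research interests include control theory, numerical partial differential equations, and bioinformatics. Jiao's work is supported by the 2025 National Young Scientists Fund (Category C).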