Beijing Institute of Mathematical Sciences and Applications (BIMSA)

Optimization Methods for Machine Learning
Stochastic Gradient Descent (SGD), in one form or another, is the workhorse method for training modern machine learning models. The field of SGD variants is vast and growing rapidly, making it difficult even for experts to survey its landscape. This course offers a mathematically rigorous and comprehensive introduction to the field, drawing on the most recent advances. It carefully develops a theory of convergence and complexity for serial, parallel, and distributed variants of SGD in the strongly convex, convex, and nonconvex settings, covering randomness arising from subsampling, compression, and other sources.
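As a running illustration (not part of the course materials), the basic single-sample SGD update can be sketched in a few lines; the least-squares problem, step size, and iteration count below are arbitrary choices for the demo:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic least-squares problem: min_x (1/n) * sum_i (a_i^T x - b_i)^2
n, d = 200, 5
A = rng.normal(size=(n, d))
x_true = rng.normal(size=d)
b = A @ x_true + 0.01 * rng.normal(size=n)

def sgd(A, b, steps=5000, lr=0.005):
    """Plain SGD with uniform single-sample subsampling."""
    x = np.zeros(A.shape[1])
    for _ in range(steps):
        i = rng.integers(A.shape[0])           # sample one data point uniformly
        grad = 2.0 * (A[i] @ x - b[i]) * A[i]  # stochastic gradient of f_i
        x -= lr * grad
    return x

x_hat = sgd(A, b)
```

The per-step gradient is an unbiased estimate of the full gradient, which is exactly the property the course's convergence analyses build on.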

The curriculum also covers advanced techniques such as acceleration via Polyak momentum or Nesterov extrapolation. A substantial portion of the course is devoted to a unified analysis of a large family of SGD variants. Historically, these variants have demanded distinct intuitions, convergence analyses, and applications, evolving separately across different communities. The unified framework covers, among other techniques, variance reduction, data sampling, coordinate sampling, arbitrary sampling, importance sampling, mini-batching, quantization, sketching, dithering, and sparsification, as well as their combinations. This comprehensive treatment aims to equip learners with a deep understanding of SGD's intricate landscape and the ability to apply and build upon these methods in their own work.
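To make the two acceleration schemes mentioned above concrete, here is a minimal sketch on a toy quadratic; the step size and momentum parameter are illustrative choices, not tuned values:

```python
import numpy as np

# Toy quadratic f(x) = 0.5 * x^T H x, on which both updates are easy to compare.
H = np.diag([1.0, 10.0])
grad = lambda x: H @ x

def heavy_ball(x0, lr=0.05, beta=0.9, steps=200):
    """Polyak momentum: x_{k+1} = x_k - lr*grad(x_k) + beta*(x_k - x_{k-1})."""
    x_prev, x = x0.copy(), x0.copy()
    for _ in range(steps):
        x, x_prev = x - lr * grad(x) + beta * (x - x_prev), x
    return x

def nesterov(x0, lr=0.05, beta=0.9, steps=200):
    """Nesterov extrapolation: the gradient is evaluated at a look-ahead point."""
    x_prev, x = x0.copy(), x0.copy()
    for _ in range(steps):
        y = x + beta * (x - x_prev)   # extrapolation step
        x, x_prev = y - lr * grad(y), x
    return x

x0 = np.array([5.0, 5.0])
x_hb = heavy_ball(x0)
x_nes = nesterov(x0)
```

The only difference between the two loops is where the gradient is evaluated: at the current iterate (Polyak) versus at the extrapolated point (Nesterov), which is exactly the distinction the course analyzes.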
Lecturer
Yi-Shuai Niu
Date
April 16 – June 11, 2024
Location
Weekday: Tuesday, Thursday
Time: 19:10 - 21:35
Venue: A3-2-303
Online: ZOOM, ID 11 435 529 7909, Password BIMSA
Prerequisites
Linear Algebra, Calculus, Convex Analysis, Probability Theory
Syllabus
1. Introduction
2. Basic Tools from Convex Analysis, Optimization and Probability
3. Gradient Descent
4. Stochastic Gradient Descent (with Sampling and Minibatching)
5. Acceleration (Polyak Momentum and Nesterov Acceleration)
6. Adaptive Learning Rate (AdaGrad, RMSProp, AdaDelta and ADAM)
7. SGD with Gradient Shift
8. SGD with Control
9. Variance Reduction (SVRG and Loopless-SVRG, SAG and SAGA)
10. Distributed Training: Compressed Gradient Descent (CGD)
11. Randomized Coordinate Descent (RCD)
12. Federated Learning and Local Gradient Descent
13. General Convergence Analysis in the Convex Setting
14. General Convergence Analysis in the Nonconvex Setting
15. Stochastic Newton Method
16. Randomized BFGS
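As an illustration of item 9 above, the SVRG variance-reduction idea can be sketched as follows; the problem data, step size, and loop lengths are illustrative assumptions, not the course's reference implementation:

```python
import numpy as np

rng = np.random.default_rng(1)

# Consistent least-squares problem: f(x) = (1/n) * sum_i (a_i^T x - b_i)^2
n, d = 100, 5
A = rng.normal(size=(n, d))
x_true = rng.normal(size=d)
b = A @ x_true

def grad_i(x, i):
    return 2.0 * (A[i] @ x - b[i]) * A[i]

def full_grad(x):
    return 2.0 * A.T @ (A @ x - b) / n

def svrg(outer=30, inner=500, lr=0.01):
    """SVRG: each inner step uses grad_i(x) - grad_i(x_ref) + full_grad(x_ref),
    an unbiased gradient estimate whose variance vanishes at the optimum."""
    x = np.zeros(d)
    for _ in range(outer):
        x_ref = x.copy()
        g_ref = full_grad(x_ref)      # full gradient at the snapshot point
        for _ in range(inner):
            i = rng.integers(n)
            v = grad_i(x, i) - grad_i(x_ref, i) + g_ref
            x -= lr * v
    return x

x_svrg = svrg()
```

Unlike plain SGD, whose constant step size leaves a variance-induced error floor, the control-variate correction lets SVRG converge linearly to the exact minimizer.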
References
1. Lectures on Convex Optimization – Y. Nesterov
2. Learning Theory from First Principles – F. Bach
3. First-Order Methods in Optimization – A. Beck
4. Large-Scale Convex Optimization: Algorithms and Analyses via Monotone Operators – E.K. Ryu and W.T. Yin
5. First-order and Stochastic Optimization Methods for Machine Learning – G.H. Lan
6. Accelerated Optimization for Machine Learning: First-Order Algorithms – Z.C. Lin, H. Li, C. Fang
Audience
Undergraduate, Advanced Undergraduate, Graduate, Postdoc, Researcher
Video
Public
Lecture Notes
Public
Language
English
Lecturer Introduction
Yi-Shuai Niu is a tenured Associate Professor of Mathematics at the Beijing Institute of Mathematical Sciences and Applications (BIMSA), specializing in optimization, scientific computing, machine learning, and computer science. Before joining BIMSA in October 2023, he was a research fellow at the Hong Kong Polytechnic University (2021-2022) and an associate professor at Shanghai Jiao Tong University (2014-2021), where he led the "Optimization and Interdisciplinary Research Group" and held a double appointment at the ParisTech Elite Institute of Technology and the School of Mathematical Sciences. His earlier roles include a postdoctoral position at the University of Paris 6 (2013-2014) and junior researcher positions at both the French National Center for Scientific Research (CNRS) and Stanford University (2010-2012). He was also a lecturer at the National Institute of Applied Sciences (INSA) of Rouen, France (2007-2010), where he earned a Ph.D. in Mathematics-Optimization in 2010 and double master's degrees in Pure and Applied Mathematics and in Mathematical Engineering (Génie Mathématique) in 2006. His research covers a wide range of applied mathematics, with a focus on optimization theory, machine learning, high-performance computing, and software development. His work spans various interdisciplinary applications, including machine learning, natural language processing, self-driving cars, finance, image processing, turbulent combustion, polymer science, quantum chemistry and computing, and plasma physics. His contributions encompass fundamental research, emphasizing novel algorithms for large-scale nonconvex and nonsmooth problems, as well as practical implementations, focusing on efficient optimization solvers and scientific computing packages built with high-performance computing techniques. He has developed more than 33 pieces of software and published about 30 articles in prestigious journals and conferences, including SIAM Journal on Optimization, Journal of Scientific Computing, Combustion and Flame, and Applied Mathematics and Computation.
He has been the PI of 5 research grants and a member of 5 joint international research projects. He received the Shanghai Teaching Achievement Award (First Prize) in 2017, two Outstanding Teaching Awards (First Prize) at Shanghai Jiao Tong University in 2016 and 2017, and 17 awards in the international mathematical contests MCM/ICM, including the INFORMS best paper award in 2017.
CONTACT

Beijing Institute of Mathematical Sciences and Applications
No. 544, Hefangkou Village, Huaibei Town, Huairou District, Beijing 101408

Tel: 010-60661855
Email: administration@bimsa.cn