BIMSA: Seminar on Control Theory and Nonlinear Filtering
Beyond the Quadratic Approximation: The Multiscale Structure of Neural Network Loss Landscapes
Organizers
Speaker
Time
November 3, 2023, 21:00 to 21:30
Location
Online
Abstract
The quadratic approximation of neural network loss landscapes has been used extensively to study the optimization of these networks. However, it usually holds only in a very small neighborhood of the minimum, so it cannot explain many phenomena observed during optimization. Numerically, we observe that neural network loss functions possess a multiscale structure, manifested in two ways: (1) in a neighborhood of minima, the loss mixes a continuum of scales and grows subquadratically, and (2) in a larger region, the loss clearly shows several separate scales.