Seminar on Control Theory and Nonlinear Filtering
Beyond the Quadratic Approximation: The Multiscale Structure of Neural Network Loss Landscapes
Organizer
Speaker
Time
Friday, November 3, 2023 9:00 PM - 9:30 PM
Venue
Online
Abstract
The quadratic approximation of neural network loss landscapes has been used extensively to study the optimization of these networks. However, because it usually holds only in a very small neighborhood of the minimum, it cannot explain many phenomena observed during optimization. Numerically, we observe that neural network loss functions possess a multiscale structure, manifested in two ways: (1) in a neighborhood of minima, the loss mixes a continuum of scales and grows subquadratically, and (2) in a larger region, the loss clearly shows several separate scales.
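
As a rough illustration of what "grows subquadratically" means, the sketch below estimates a local growth exponent by fitting the slope of log(loss(r) - loss(0)) against log(r) along one direction through a minimum. The function name growth_exponent and both toy losses are hypothetical stand-ins chosen for this example; they are not taken from the talk and are not real network losses. An exponent of 2 corresponds to the quadratic approximation, and an exponent below 2 to subquadratic growth.

```python
import math

def growth_exponent(loss, radii):
    """Least-squares slope of log(loss(r) - loss(0)) against log(r)."""
    base = loss(0.0)
    xs = [math.log(r) for r in radii]
    ys = [math.log(loss(r) - base) for r in radii]
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

# Hypothetical 1-D toy losses along a single direction, with the minimum at 0.
def quadratic(r):
    return 0.5 * r ** 2

def subquadratic(r):
    return abs(r) ** 1.5

# Distances from the minimum, spread over several orders of magnitude.
radii = [10.0 ** k for k in range(-4, 0)]

p_quad = growth_exponent(quadratic, radii)    # close to 2
p_sub = growth_exponent(subquadratic, radii)  # close to 1.5
print(p_quad, p_sub)
```

For a real network one would replace the toy functions with the loss evaluated along a chosen direction in parameter space; a fitted exponent that drifts as the range of radii changes would be one sign that a single power law does not describe the landscape.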