Seminar: Machine Learning Seminar

ECE Women Community

Global Convergence in Neural Networks via the Second Law of Thermodynamics

Date: July 6, 2026
Time: 11:00 - 12:00
Location: 506, Zisapel Building
Lecturer: Matan Tsipory

Existing convergence analyses of finite-width overparameterized deep neural networks under standard parameterization often rely on parameters remaining close to their initialization and typically neglect the role of noise, despite empirical evidence that stochasticity can improve optimization and generalization.
To bridge this gap, we study L2-regularized continuous-time Langevin dynamics (CLD) for training a feedforward neural network with smooth activations. We introduce a change-of-measure inequality, inspired by the second law of thermodynamics, that enables control of key properties of the training dynamics under the time-evolving parameter distribution. Leveraging a sharpened lower bound on the minimum eigenvalue of the Neural Tangent Kernel (NTK) at initialization, we show that injected noise helps preserve NTK stability with high probability, without confining the weights to a small neighborhood of their initialization. As a result, we derive a non-asymptotic upper bound on the expected training loss. Our analysis yields explicit conditions on the width and the noise temperature that guarantee convergence to a finite loss floor. Finally, we show experimentally that the network Jacobian can change from its value at initialization, departing from the commonly analyzed lazy regime.
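For readers unfamiliar with the setting, the following is a minimal sketch of what L2-regularized Langevin training can look like once discretized. It is not the speaker's implementation. It assumes the common convention d(theta) = -(grad L(theta) + lambda * theta) dt + sqrt(2T) dW and applies an Euler-Maruyama step to a small two-layer tanh network. The names `langevin_train`, `weight_decay`, and `temperature`, the toy data, and every hyperparameter value are hypothetical choices made for illustration; the talk analyzes the continuous-time dynamics, and its exact scaling of the noise term may differ.

```python
import math
import random


def forward(W1, w2, x):
    """Two-layer tanh network: f(x) = sum_j w2[j] * tanh(<W1[j], x>)."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W1]
    return sum(a * hj for a, hj in zip(w2, h)), h


def loss_and_grads(W1, w2, X, y):
    """Half mean squared error and its gradients with respect to W1 and w2."""
    n = len(X)
    gW1 = [[0.0] * len(row) for row in W1]
    gw2 = [0.0] * len(w2)
    loss = 0.0
    for x, target in zip(X, y):
        pred, h = forward(W1, w2, x)
        err = pred - target
        loss += 0.5 * err * err / n
        for j, hj in enumerate(h):
            gw2[j] += err * hj / n
            # Derivative of the loss with respect to pre-activation j.
            back = err * w2[j] * (1.0 - hj * hj) / n
            for k, xk in enumerate(x):
                gW1[j][k] += back * xk
    return loss, gW1, gw2


def langevin_train(X, y, width=16, steps=500, lr=1e-2,
                   weight_decay=1e-3, temperature=1e-5, seed=0):
    """Euler-Maruyama discretization of L2-regularized Langevin dynamics.

    Assumed update (an illustrative convention, not taken from the talk):
        theta <- theta - lr * (grad + weight_decay * theta)
                 + sqrt(2 * lr * temperature) * standard_normal
    """
    rng = random.Random(seed)
    d = len(X[0])
    # Gaussian initialization with variance 1 / fan_in (an assumption).
    W1 = [[rng.gauss(0.0, 1.0 / math.sqrt(d)) for _ in range(d)]
          for _ in range(width)]
    w2 = [rng.gauss(0.0, 1.0 / math.sqrt(width)) for _ in range(width)]
    noise = math.sqrt(2.0 * lr * temperature)
    losses = []
    for _ in range(steps):
        loss, gW1, gw2 = loss_and_grads(W1, w2, X, y)
        losses.append(loss)  # loss measured before this step's update
        W1 = [[w - lr * (g + weight_decay * w) + noise * rng.gauss(0.0, 1.0)
               for w, g in zip(row, grow)]
              for row, grow in zip(W1, gW1)]
        w2 = [a - lr * (g + weight_decay * a) + noise * rng.gauss(0.0, 1.0)
              for a, g in zip(w2, gw2)]
    return losses, W1, w2


# Toy regression problem (hypothetical data, for illustration only).
data_rng = random.Random(1)
X = [[data_rng.uniform(-1.0, 1.0), data_rng.uniform(-1.0, 1.0)]
     for _ in range(10)]
y = [math.sin(2.0 * x[0]) for x in X]

losses, W1, w2 = langevin_train(X, y)
print(f"initial loss: {losses[0]:.4f}, final loss: {losses[-1]:.4f}")
```

In this sketch, raising `temperature` increases the injected noise and, with it, the level around which the training loss fluctuates, which gives some intuition for why the abstract speaks of convergence to a finite loss floor rather than to zero.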

M.Sc. student under the supervision of Prof. Daniel Soudry.

 
