Tracking Feature Learning in Deep Neural Networks Using Insights from Statistical Physics
Date: December 4, 2023 Time: 10:30 - 11:30
Location: 861, Meyer Building
Lecturer: Dr. Inbar Seroussi
Deep neural networks (DNNs) are powerful tools for compressing and distilling information. Their scale and complexity, often involving billions of interdependent internal degrees of freedom, render direct microscopic analysis difficult. Several works have shown that the statistics and dynamics of DNNs simplify drastically in the infinite-width limit and become analytically tractable. However, the infinite-width limit misses several qualitative aspects, such as feature learning and the fact that real-world DNNs are not nearly as over-parameterized. This gap is particularly apparent in deep convolutional neural networks (CNNs). In this talk, I will present a novel mean-field theory for finite, fully trained, deep non-linear DNNs. Specifically, we show that DNN layers couple only through the second moments (kernels) of their post-activations and pre-activations. Moreover, in various settings, the latter fluctuate in a nearly Gaussian manner. For CNNs with infinitely many channels, these kernels are inert, while for finite CNNs they adapt to the data. In several deep non-linear CNN models trained on real data, the resulting thermodynamic theory of deep learning yields accurate predictions. In addition, it provides a new tool to analyze and understand CNNs, and DNNs in general. This is joint work with Gadi Naveh and Zohar Ringel.
For more information, see https://www.nature.com/articles/s41467-023-36361-y
Inbar Seroussi is a postdoctoral fellow in the Applied Mathematics department at Tel-Aviv University. Before that, she was a postdoctoral fellow in the mathematics department at the Weizmann Institute of Science, hosted by Prof. Ofer Zeitouni. She completed her Ph.D. in the applied mathematics department at Tel-Aviv University under the supervision of Prof. Nir Sochen. During her Ph.D., she was a long-term intern at Microsoft Research (MSR). Her research interests lie at the interface of machine learning, statistical physics, and high-dimensional probability.