Machine Learning Seminar

ECE Women Community

Generalization and Efficiency in Deep Learning

Date: March 6, 2024 Time: 10:30 - 11:30
Location: 1061, Meyer Building
Lecturer: Daniel Soudry
The talk will have four separate parts:
(1) We examine neural networks (NNs) with uniform random weights, conditioned on zero training loss. We prove they typically generalize well if there exists an underlying narrow “teacher NN” that agrees with the labels.
(2) We characterize the functions realized by shallow ReLU NN denoisers – in the common theoretical scenario of zero training loss with a minimal weight norm.
(3) We present a simple method that enables, for the first time, the use of 12-bit accumulators in deep learning with no significant degradation in accuracy. We also show that decreasing the accumulation precision further, using fine-grained gradient approximations, can improve DNN accuracy.
(4) We find an analytical relation between compute-time properties and scalability limitations caused by the compute variance of straggling workers in a distributed setting. We then propose “DropCompute”, a simple yet effective decentralized method that reduces the variation among workers and thus improves the robustness of common synchronous training.
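To make the idea of a narrow accumulator in part (3) concrete, here is a minimal toy sketch, not the method from the talk: it simulates a dot product in which every partial sum is re-quantized to a signed fixed-point grid of a given bit width (the `bits` and `scale` parameters are illustrative assumptions), so one can compare the result against full-precision accumulation.

```python
import numpy as np

def quantize(x, bits, scale):
    """Round x onto a signed fixed-point grid of the given bit width,
    saturating at the representable range (toy model of a narrow accumulator)."""
    q = np.round(np.asarray(x, dtype=np.float64) / scale)
    lim = 2 ** (bits - 1) - 1
    return np.clip(q, -lim - 1, lim) * scale

def low_precision_dot(a, b, bits=12, scale=2.0 ** -6):
    """Dot product where each partial sum is re-quantized after every
    multiply-accumulate, mimicking a fixed-width hardware accumulator."""
    acc = 0.0
    for ai, bi in zip(a, b):
        acc = float(quantize(acc + ai * bi, bits, scale))
    return acc

rng = np.random.default_rng(0)
a = rng.standard_normal(256)
b = rng.standard_normal(256)
print("full precision:", float(a @ b))
print("12-bit accumulator:", low_precision_dot(a, b))
```

With `scale = 2**-6`, a 12-bit accumulator covers roughly ±32 in steps of 1/64; shrinking `bits` quickly makes rounding and saturation error visible, which is the regime the gradient-approximation result in part (3) addresses.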
Daniel Soudry is an associate professor and Schmidt Career Advancement Chair in AI in the Electrical and Computer Engineering Department at the Technion, working in the areas of machine learning and neural networks. His recent works focus on resource efficiency and implicit bias in neural networks.

