Machine Learning Seminar


Generalization and Efficiency in Deep Learning

Date: March 6, 2024
Time: 10:30 - 11:30
Location: Room 1061, Meyer Building
Lecturer: Daniel Soudry
The talk will have four separate parts:
(1) We examine neural networks (NNs) with uniform random weights, conditioned on zero training loss. We prove that they typically generalize well if there exists an underlying narrow “teacher NN” that agrees with the labels (a toy simulation of this sampling view appears in the first sketch after this list).
(2) We characterize the functions realized by shallow ReLU NN denoisers in the common theoretical scenario of zero training loss with minimal weight norm.
(3) We present a simple method that enables, for the first time, the use of 12-bit accumulators in deep learning with no significant degradation in accuracy. We also show that, when the accumulation precision is reduced further, using fine-grained gradient approximations can improve DNN accuracy (see the second sketch after this list).
(4) We find an analytical relation between compute-time properties and the scalability limitations caused by the compute variance of straggling workers in a distributed setting. We then propose “DropCompute”, a simple yet effective decentralized method that reduces the variation among workers and thus improves the robustness of standard synchronous training (see the third sketch after this list).
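For intuition on part (1), here is a minimal "guess and check" sketch, not the paper's actual construction: it samples shallow ReLU networks with i.i.d. uniform random weights, keeps the first one whose predicted signs match labels from a narrow teacher network, and measures how well it generalizes. All sizes, the sign-agreement criterion, and the sampling cap are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

def net(params, X):
    """Shallow ReLU network: X -> relu(X @ W1) @ w2."""
    W1, w2 = params
    return relu(X @ W1) @ w2

d, n = 5, 8                               # input dimension, training set size
X = rng.standard_normal((n, d))
teacher = (rng.standard_normal((d, 2)), rng.standard_normal(2))  # narrow teacher
y = np.sign(net(teacher, X))              # labels the teacher agrees with

width = 50                                # much wider random "student"
student = None
for tries in range(1, 200_000):
    cand = (rng.uniform(-1, 1, (d, width)), rng.uniform(-1, 1, width))
    if np.all(np.sign(net(cand, X)) == y):    # condition on zero training error
        student = cand
        break

if student is not None:
    # Generalization: agreement with the teacher on fresh inputs.
    X_test = rng.standard_normal((10_000, d))
    acc = np.mean(np.sign(net(student, X_test)) == np.sign(net(teacher, X_test)))
    print(f"accepted after {tries} samples, test agreement = {acc:.3f}")
```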
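For part (3), the following sketch illustrates the general issue of low-precision accumulation, under simplified fixed-point assumptions rather than the talk's actual method: each product is quantized and the running partial sum is held in a saturating signed 12-bit accumulator, so one can compare against the exact dot product. The bit width, fractional precision, and input scale are illustrative.

```python
import numpy as np

ACC_BITS = 12
ACC_MAX = 2 ** (ACC_BITS - 1) - 1    #  2047
ACC_MIN = -(2 ** (ACC_BITS - 1))     # -2048

def dot_lowbit_acc(a, b, frac_bits=6):
    """Dot product with products and partial sums held in 12-bit fixed point."""
    scale = 2 ** frac_bits
    acc = 0
    for x, y in zip(a, b):
        prod = int(round(x * y * scale))               # quantize each product
        acc = min(max(acc + prod, ACC_MIN), ACC_MAX)   # saturating 12-bit add
    return acc / scale

rng = np.random.default_rng(0)
a = rng.standard_normal(256) * 0.2
b = rng.standard_normal(256) * 0.2
print("exact  :", float(a @ b))
print("12-bit :", dot_lowbit_acc(a, b))
```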
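For part (4), here is a toy simulation of the idea behind DropCompute, again under illustrative assumptions (exponential per-micro-batch compute times, a fixed per-step budget) rather than the method's actual mechanics: each worker processes micro-batches until a shared compute budget expires and then joins the synchronization point, so the synchronous step time is bounded by the budget instead of the slowest straggler.

```python
import numpy as np

rng = np.random.default_rng(0)

def step_time(num_workers, micro_batches, budget=None):
    """Synchronous step time: the step waits for the last worker to finish."""
    finish_times = []
    for _ in range(num_workers):
        # Per-micro-batch compute times; heavy right tail models stragglers.
        t = rng.exponential(1.0, size=micro_batches)
        cum = np.cumsum(t)
        if budget is None:
            finish_times.append(cum[-1])          # process every micro-batch
        else:
            done = cum[cum <= budget]             # drop work past the budget
            finish_times.append(done[-1] if len(done) else 0.0)
    return max(finish_times)

baseline = np.mean([step_time(64, 16) for _ in range(100)])
dropped = np.mean([step_time(64, 16, budget=20.0) for _ in range(100)])
print(f"baseline step time   : {baseline:.2f}")
print(f"with a drop budget   : {dropped:.2f}")
```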
Daniel Soudry is an associate professor and Schmidt Career Advancement Chair in AI in the Electrical and Computer Engineering Department at the Technion, working in the areas of machine learning and neural networks. His recent works focus on resource efficiency and implicit bias in neural networks.

 
