Machine Learning Seminar
Provable Benefits of Complex Parameterizations for Structured State Space Models (SSMs)
Structured State Space Models (SSMs), the core engine behind prominent neural networks such as S4 and Mamba, are linear dynamical systems often parameterized using complex numbers. The theoretical benefits of these complex parameterizations over real ones have remained an open question. In this talk, I will present results from our recent paper, Provable Benefits of Complex Parameterizations for Structured State Space Models. We establish formal gaps between real and complex diagonal SSMs, proving that complex SSMs can express all mappings of a real SSM with moderate dimensions, whereas real SSMs require significantly higher dimensions or exponentially large parameter values. Our experiments corroborate these findings and suggest an extension of the theory to account for selectivity, a new architectural feature yielding state-of-the-art performance.
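As background for the talk (not taken from the paper itself), here is a minimal NumPy sketch of the object under study: a one-dimensional diagonal linear SSM with recurrence h_{t+1} = a·h_t + b·x_t and readout y_t = Re(c·h_t). With a complex eigenvalue on the unit circle, a single state already produces an oscillatory impulse response, whereas a single real diagonal state can only produce a geometric sequence — a toy instance of the kind of real-vs-complex expressiveness gap the talk formalizes. The function name and parameter choices below are illustrative assumptions.

```python
import numpy as np

def diagonal_ssm_impulse(a, b, c, T):
    """Impulse response of a 1-state diagonal linear SSM.

    Recurrence: h_{t+1} = a * h_t + b * x_t, readout: y_t = Re(c * h_t).
    For the impulse input x_0 = 1 (zeros afterwards), h_t = a^(t-1) * b,
    so y_t = Re(c * a^(t-1) * b) for t = 1..T.
    """
    t = np.arange(1, T + 1)
    return np.real(c * (a ** (t - 1)) * b)

theta = 0.5  # illustrative rotation angle (an assumption, not from the talk)

# Complex state with eigenvalue e^{i*theta} on the unit circle:
# its impulse response is cos(theta * (t-1)), a sustained oscillation.
y_complex = diagonal_ssm_impulse(np.exp(1j * theta), 1.0, 1.0, 50)

# A real diagonal state with eigenvalue r can only produce the geometric
# sequence r^(t-1); no single real state matches the oscillation above.
y_real = diagonal_ssm_impulse(0.95, 1.0, 1.0, 50)
```

The contrast is only meant to convey the flavor of the result; the paper's theorems quantify the gap in terms of state dimension and parameter magnitude for general real diagonal SSMs.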
Yuval Milo is a Ph.D. candidate in Computer Science at Tel Aviv University, supervised by Prof. Nadav Cohen. He holds a B.Sc. in Mathematics from the Technion, where he was part of the Rothschild Excellence Program. Yuval’s research focuses on the theoretical foundations of neural networks and deep learning.