Analysis and Mathematical Physics
Towards a Geometric Theory of Deep Learning
The mathematical core of deep learning is function approximation by neural networks trained on data using stochastic gradient descent. I will present a collection of sharp results on training dynamics for the deep linear network (DLN), a phenomenological model introduced by Arora, Cohen and Hazan in 2017. Our analysis reveals unexpected ties with several areas of mathematics (minimal surfaces, geometric invariant theory and random matrix theory) as well as a conceptual picture for 'true' deep learning. This is joint work with several co-authors: Nadav Cohen (Tel Aviv), Kathryn Lindsey (Boston College), Alan Chen, Tejas Kotwal, Zsolt Veraszto and Tianmin Yu (Brown).
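As background for the abstract (not material from the talk), here is a minimal sketch of the model it names: a depth-3 deep linear network f(x) = W3 W2 W1 x, fitted to data by gradient descent on squared loss. The teacher matrix A, the dimensions, the initialization scale, and the learning rate are all illustrative assumptions.

# Minimal sketch of a deep linear network (DLN), depth 3:
# f(x) = W3 @ W2 @ W1 @ x, trained by gradient descent on squared loss.
# All hyperparameters below are illustrative, not from the talk.
import numpy as np

rng = np.random.default_rng(0)
d, n, lr, steps = 4, 200, 0.05, 3000

# Synthetic data: targets come from a fixed "teacher" matrix A (an assumption).
X = rng.normal(size=(d, n))
A = rng.normal(size=(d, d))
Y = A @ X

# Three linear factors with small random initialization.
Ws = [0.3 * rng.normal(size=(d, d)) for _ in range(3)]

for step in range(steps):
    W1, W2, W3 = Ws
    E = W3 @ W2 @ W1 @ X - Y               # residual of the end-to-end map
    # Gradients of L = (1/2n) ||W3 W2 W1 X - Y||_F^2 w.r.t. each factor.
    g1 = W2.T @ W3.T @ E @ X.T / n
    g2 = W3.T @ E @ (W1 @ X).T / n
    g3 = E @ (W2 @ W1 @ X).T / n
    Ws[0] -= lr * g1
    Ws[1] -= lr * g2
    Ws[2] -= lr * g3
    if step % 500 == 0:
        print(f"step {step:4d}  loss {0.5 * np.mean(E**2):.6f}")

final = Ws[2] @ Ws[1] @ Ws[0] @ X - Y
print("final loss:", 0.5 * np.mean(final**2))

The training dynamics studied in this line of work concern how the factors W1, W2, W3 evolve jointly; the loss is a function of the product alone, while the gradient flow acts on the (overparameterized) factors.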
Date & Time
October 07, 2025 | 2:30pm – 3:30pm
Location
Simonyi Hall 101 and Remote Access
Speakers
Govind Menon, Institute for Advanced Study