Analysis and Mathematical Physics

Towards a Geometric Theory of Deep Learning

The mathematical core of deep learning is function approximation by neural networks trained on data using stochastic gradient descent. I will present a collection of sharp results on training dynamics for the deep linear network (DLN), a phenomenological model introduced by Arora, Cohen and Hazan in 2017. Our analysis reveals unexpected ties with several areas of mathematics (minimal surfaces, geometric invariant theory and random matrix theory) as well as a conceptual picture for "true" deep learning. This is joint work with several co-authors: Nadav Cohen (Tel Aviv), Kathryn Lindsey (Boston College), Alan Chen, Tejas Kotwal, Zsolt Veraszto and Tianmin Yu (Brown).
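For context, a minimal sketch of the model in its standard formulation (the abstract does not fix notation; the depth $N$, factor matrices $W_j$, and target $\Phi$ below are illustrative): the DLN overparametrizes a single linear map as a product of matrices and trains all factors jointly, for instance by gradient flow on a quadratic loss,
\[
L(W_1,\dots,W_N) \;=\; \tfrac{1}{2}\,\bigl\lVert W_N W_{N-1} \cdots W_1 - \Phi \bigr\rVert_F^2,
\qquad
\dot{W}_j \;=\; -\,\frac{\partial L}{\partial W_j}, \quad j = 1,\dots,N.
\]
Although the end-to-end map $W = W_N \cdots W_1$ is linear, the training dynamics in the factors $W_j$ are nonlinear, which is what makes the model a useful testbed for studying deep learning.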

Date & Time

October 07, 2025 | 2:30pm – 3:30pm

Location

Simonyi Hall 101 and Remote Access

Speakers

Govind Menon, Institute for Advanced Study