Generative Solutions for Cosmic Problems
In the era of high-dimensional data and simulation-based science, machine learning is transforming the different stages of the scientific method in astrophysics. I will summarize my research on generative models in two main areas:
First, I will discuss how generative models enable high-dimensional simulation-based inference for cosmology and provide a roadmap to field-level inference. Hydrodynamical simulations are essential for self-consistent predictions of diverse cosmological observables—including the kinetic and thermal Sunyaev-Zel'dovich effects (kSZ, tSZ), fast radio bursts (FRBs), baryonification effects in weak lensing, and galaxy formation. However, their computational expense has traditionally limited their use in large-scale inference. I will show how machine learning emulators of these simulations unlock inference at scale, and how generative models can interpolate across different hydrodynamical simulators that employ varying subgrid assumptions and model different physical processes.
Second, I will discuss learning rich, low-dimensional representations that capture the underlying physical processes in astrophysical data. I will introduce ongoing work on novel methods that learn to represent simulations and observations in a joint latent space that separates shared from private information, enabling more robust inference by focusing on what simulations and observations have in common. I will also present a complementary, fully data-driven approach that disentangles the parts of a representation arising from instrumental systematics from those arising from the underlying astrophysics.
If time permits, I will briefly discuss how LLM agents can close the loop in the scientific method by proposing, implementing, and testing cosmological theories.