Deep and Shallow learning in the era of large galaxy surveys and simulations
Organiser(s):
Ferreras, Walmsley, Spurio, Lahav, Wild, Hartley, Killestein, Bowles, Cheng, Lintott, Mohan, Scaife, Spindler
Session type:
Regular
Description:
The complexity of the various processes operating in galaxy formation and evolution makes it one of the more challenging problems in physics. The advent of large galaxy surveys (such as SDSS, Euclid, DES, DESI, Gaia, Rubin-LSST, Roman, etc.), along with large volumes of data from numerical simulations of galaxy formation (such as EAGLE, Illustris-TNG, CAMELS), has enabled a data-driven approach, where the statistical properties of the large samples are exploited. Moreover, analysing such large samples becomes intractable with traditional methods. At present, machine learning (ML) techniques are routinely applied to classify, regress, and understand the distribution of survey and simulation data. At the same time, the availability of powerful ML codes that anyone can use also poses the problem of producing "black boxes" where the output is not fully understood, and where systematics arising from the sample selection, methods, etc. can be challenging to identify.
Over the past few years, ML has been transformed yet again, this time by models that match human expression in both language (e.g. ChatGPT, LaMDA) and digital art (e.g. Stable Diffusion, Midjourney). Our community is now taking the first steps toward using these powerful tools to solve astronomical problems. At the same time, familiar core issues like uncertainty quantification and domain shift continue to threaten the practical application of both old and new methods.
This session is focused on three core areas of current activity in ML applied to astrophysical data:
1) Deep learning, typically based on neural networks with multiple layers that are trained to classify or regress complex data sets.
2) Shallow learning, comprising more traditional multivariate methods that exploit the statistical properties of the data, such as principal component analysis, independent component analysis, Gaussian mixture models, etc.
3) Simulation-based inference, where state-of-the-art simulations are compared with observational data within a Bayesian framework.
We welcome machine-learning-focused contributions from all astronomy fields. Contributions should ideally go beyond measuring the performance of standard tools (based on either deep or "shallow" methods). We are especially interested in contributions that either introduce new astronomy-relevant algorithms or demonstrate how machine learning has led to new science results.
Topic:
Techniques
All attendees are expected to show respect and courtesy to other attendees and staff, and to adhere to the NAM Code of Conduct.
© 2025 Royal Astronomical Society