Issue | A&A Volume 698, June 2025 |
---|---|
Article Number | A3 |
Number of page(s) | 14 |
Section | Cosmology (including clusters of galaxies) |
DOI | https://doi.org/10.1051/0004-6361/202453284 |
Published online | 23 May 2025 |
Exploring the halo-galaxy connection with probabilistic approaches
1 Instituto de Astrofísica de Canarias, Calle Via Láctea s/n, E-38205 La Laguna, Tenerife, Spain
2 Departamento de Astrofísica, Universidad de La Laguna, E-38206 La Laguna, Tenerife, Spain
3 Departamento de Física Matemática, Instituto de Física, Universidade de São Paulo, Rua do Matão 1371, CEP 05508-090 São Paulo, Brazil
4 Center for Computational Astrophysics, Flatiron Institute, 162 5th Avenue, New York, NY 10010, USA
5 Departamento de Física, Universidad Técnica Federico Santa María, Avenida Vicuña Mackenna 3939, San Joaquín, Santiago, Chile
⋆ Corresponding authors: nvilla-ext@iac.es, natalidesanti@gmail.com
Received: 4 December 2024
Accepted: 17 March 2025
Context. The connection between galaxies and their host dark matter halos encompasses a range of intricate and interrelated processes, playing a pivotal role in our understanding of galaxy formation and evolution. Traditionally, this link has been established through physical or empirical models. On the other hand, machine learning techniques are adaptable tools capable of handling high-dimensional data and grasping associations between numerous attributes. In particular, probabilistic models in machine learning capture the stochasticity inherent to these highly complex processes and relations.
Aims. We compare different probabilistic machine learning methods to model the uncertainty in the halo-galaxy connection and to efficiently generate galaxy catalogs that faithfully resemble the reference sample. The methods predict joint distributions of central galaxy properties, namely stellar mass, color, specific star formation rate, and radius, conditioned on the features of their host halos.
Methods. The analysis is based on the IllustrisTNG300 magnetohydrodynamical simulation. The machine learning methods model the distributions in different ways. We compare a multilayer perceptron that predicts the parameters of a multivariate Gaussian distribution, a multilayer perceptron classifier, and the method of normalizing flows. The classifier predicts the parameters of a categorical distribution, which are defined in a high-dimensional parameter space through a Voronoi cell-based hierarchical scheme. The results are validated with metrics designed to test probability density distributions and the predictive power of the methods.
Results. We evaluate the models' performance under various sample selections based on halo properties. The three methods exhibit comparable results, with normalizing flows showing the best performance in most scenarios. The models not only reproduce the main features of the galaxy property distributions with high fidelity, but can also be used to reproduce the results obtained with traditional deterministic estimators. Our results also indicate that different halo and galaxy populations are subject to varying degrees of stochasticity, which has relevant implications for studies of large-scale structure.
Key words: galaxies: evolution / galaxies: halos / galaxies: statistics / dark matter / large-scale structure of Universe
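As an illustration of the first method named in the Methods paragraph, a network that outputs the parameters of a multivariate Gaussian can be trained by minimizing the negative log-likelihood of the true galaxy properties. The following is a minimal NumPy sketch of that loss for a single halo, not the authors' implementation. It assumes the covariance is parameterized through its Cholesky factor; that parameterization, the function name `gaussian_nll`, and the argument names are hypothetical choices made for this example.

```python
import numpy as np


def gaussian_nll(y, mean, chol_raw):
    """Negative log-likelihood of y under N(mean, L L^T).

    Hypothetical interface: `chol_raw` holds the d*(d+1)/2 unconstrained
    entries of the lower-triangular Cholesky factor L (row-major order).
    The diagonal is exponentiated so that it stays strictly positive.
    """
    d = y.shape[0]
    L = np.zeros((d, d))
    L[np.tril_indices(d)] = chol_raw
    diag = np.diag_indices(d)
    L[diag] = np.exp(L[diag])
    # Solve L z = (y - mean); z @ z is then the squared Mahalanobis distance.
    z = np.linalg.solve(L, y - mean)
    # log|Sigma| = 2 * sum(log(diag(L))) because Sigma = L L^T.
    log_det = 2.0 * np.sum(np.log(np.diag(L)))
    return 0.5 * (z @ z + log_det + d * np.log(2.0 * np.pi))
```

For the four properties listed in the Aims (stellar mass, color, specific star formation rate, and radius), `d` would be 4, so such a network would output 4 means and 10 Cholesky entries per halo under this assumed parameterization.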
© The Authors 2025
Open Access article, published by EDP Sciences, under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
This article is published in open access under the Subscribe to Open model. Subscribe to A&A to support open access publication.