CHEX-MATE: A non-parametric deep learning technique to deproject and deconvolve galaxy cluster X-ray temperature profiles

A. Iqbal; G. W. Pratt; J. Bobin; M. Arnaud; E. Rasia; M. Rossetti; R. T. Duffy; I. Bartalucci; H. Bourdin; F. De Luca; M. De Petris; M. Donahue; D. Eckert; S. Ettori; A. Ferragamo; M. Gaspari; F. Gastaldello; R. Gavazzi; S. Ghizzardi; L. Lovisari; P. Mazzotta; B. J. Maughan; E. Pointecouteau; M. Sereno

doi:10.1051/0004-6361/202347234

Volume 679 (November 2023)

A&A, 679 (2023) A51

Issue		A&A Volume 679, November 2023


Article Number		A51
Number of page(s)		33
Section		Cosmology (including clusters of galaxies)
DOI		https://doi.org/10.1051/0004-6361/202347234
Published online		09 November 2023

A&A 679, A51 (2023)

A. Iqbal¹, G. W. Pratt¹, J. Bobin², M. Arnaud¹, E. Rasia³,⁴, M. Rossetti⁵, R. T. Duffy¹, I. Bartalucci⁵, H. Bourdin⁶, F. De Luca⁶, M. De Petris⁷, M. Donahue⁸, D. Eckert⁹, S. Ettori¹⁰,¹¹, A. Ferragamo¹²,¹³, M. Gaspari¹⁴, F. Gastaldello⁵, R. Gavazzi¹⁵,¹⁶, S. Ghizzardi⁵, L. Lovisari¹⁷,¹⁸, P. Mazzotta⁶, B. J. Maughan¹⁹, E. Pointecouteau²⁰ and M. Sereno²¹,²²

¹ Université Paris-Saclay, Université Paris-Cité CEA, CNRS, AIM, 91191 Gif-sur-Yvette, France
² CEA IRFU/DEDIP, 91191 Gif-sur-Yvette, France
³ INAF – Osservatorio Astronomico di Trieste, Via Tiepolo 11, 34131 Trieste, Italy
⁴ IFPU, Via Beirut 2, 34151 Trieste, Italy
⁵ INAF, Istituto di Astrofisica Spaziale e Fisica Cosmica di Milano, Via A. Corti 12, 20133 Milano, Italy
⁶ Dipartimento di Fisica, Universita’ di Roma “Tor Vergata”, Via Della Ricerca Scientifica, 1, 00133 Roma, Italy
⁷ Dipartimento di Fisica, Sapienza Universitá di Roma, Piazzale Aldo Moro 5, 00185 Roma, Italy
⁸ Department of Physics and Astronomy, Michigan State University, 567 Wilson Rd, East Lansing, MI 48864, USA
⁹ Department of Astronomy, University of Geneva, ch. d’Écogia 16, 1290 Versoix, Switzerland
¹⁰ INAF, Osservatorio di Astrofisica e Scienza dello Spazio, Via Piero Gobetti 93/3, 40129 Bologna, Italy
¹¹ INFN, Sezione di Bologna, Viale Berti Pichat 6/2, 40127 Bologna, Italy
¹² Instituto de Astrofísica de Canarias (IAC), C/ Vía Láctea s/n, 38205 La Laguna, Tenerife, Spain
¹³ Dipartimento di Fisica, Sapienza Universitá di Roma, Piazzale Aldo Moro 5, 00185 Roma, Italy
¹⁴ Department of Astrophysical Sciences, Princeton University, Princeton, NJ 08544, USA
¹⁵ Laboratoire d’Astrophysique de Marseille, Aix-Marseille Université, CNRS, CNES, Marseille, France
¹⁶ Institut d’Astrophysique de Paris, CNRS, Sorbonne Université, Paris, France
¹⁷ INAF, Istituto di Astrofisica Spaziale e Fisica Cosmica di Milano, Via A. Corti 12, 20133 Milano, Italy
¹⁸ Center for Astrophysics – Harvard & Smithsonian, 60 Garden Street, Cambridge, MA 02138, USA
¹⁹ HH Wills Physics Laboratory, University of Bristol, Tyndall Ave, Bristol BS8 1TL, UK
²⁰ IRAP, Université de Toulouse, CNRS, CNES, UT3-UPS, Toulouse, France
²¹ INAF, Osservatorio di Astrofisica e Scienza dello Spazio, Via Piero Gobetti 93/3, 40129 Bologna, Italy
²² INFN, Sezione di Bologna, Viale Berti Pichat 6/2, 40127 Bologna, Italy

Received: 19 June 2023
Accepted: 1 September 2023

Abstract

Temperature profiles of the hot galaxy cluster intracluster medium (ICM) have a complex non-linear structure that traditional parametric modelling may fail to fully approximate. For this study, we made use of neural networks, for the first time, to construct a data-driven non-parametric model of ICM temperature profiles. A new deconvolution algorithm was then introduced to uncover the true (3D) temperature profiles from the observed projected (2D) temperature profiles. An auto-encoder-inspired neural network was first trained by learning a non-linear interpolatory scheme to build the underlying model of 3D temperature profiles in the radial range of [0.02–2] R₅₀₀, using a sparse set of hydrodynamical simulations from the THREE HUNDRED PROJECT. A deconvolution algorithm using a learning-based regularisation scheme was then developed. The model was tested using high and low resolution input temperature profiles, such as those expected from simulations and observations, respectively. We find that the proposed deconvolution and deprojection algorithm is robust with respect to the quality of the data, the morphology of the cluster, and the deprojection scheme used. The algorithm can recover unbiased 3D radial temperature profiles with a precision of around 5% over most of the fitting range. We applied the method to the first sample of temperature profiles obtained with XMM-Newton for the CHEX-MATE project and compared it to parametric deprojection and deconvolution techniques. Our work sets the stage for future studies that focus on the deconvolution of the thermal profiles (temperature, density, pressure) of the ICM and the dark matter profiles in galaxy clusters, using deep learning techniques in conjunction with X-ray, Sunyaev-Zel’dovich (SZ) and optical datasets.

Key words: methods: data analysis / X-rays: galaxies: clusters / galaxies: clusters: intracluster medium / large-scale structure of Universe

© The Authors 2023

Open Access article, published by EDP Sciences, under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

This article is published in open access under the Subscribe to Open model.

1. Introduction

Galaxy clusters are ideal probes of the large-scale structure of the Universe (Holder et al. 2001; Planck Collaboration XXIV 2016; Bocquet et al. 2015; Sereno et al. 2017; Abbott et al. 2022). X-ray observations of the hot gas in the ICM, which constitutes the dominant baryonic component in galaxy clusters, provide us with a useful tool for identifying and studying these objects. Shallow, wide-field X-ray surveys by ROSAT (ROentgen SATellite) and eROSITA (extended ROentgen Survey with an Imaging Telescope Array) have now discovered thousands of clusters (e.g. Piffaretti et al. 2011; Klein et al. 2022, and references therein). In recent years, the detailed X-ray follow-up of samples extracted from these surveys has exploited the high spatial resolution of Chandra and the large field of view and sensitivity of X-ray Multi-Mirror Mission (XMM-Newton) to investigate the morphological, structural, and scaling properties of the cluster population (e.g. Lovisari & Maughan 2022; Kay et al. 2022, and references therein).

The X-ray-derived radial temperature and density profiles are key ingredients to derive the thermodynamic properties of the ICM, and, under the assumption of hydrostatic equilibrium, the total mass profile in galaxy clusters (Böhringer et al. 2007; Pratt et al. 2010; Ettori et al. 2010, 2013; Eckert et al. 2022). These X-ray studies have revealed the presence of two distinct types of clusters: cool cores (CCs), characterised by dense, low-temperature cores, and non-cool cores (NCCs), which exhibit a relatively flat central density and temperature. Various morphological parameters have been introduced to analyse X-ray images and to link these to the dynamical behaviour of galaxy clusters and to the presence or absence of a low-temperature core, providing insights into their structural characteristics, internal dynamics, and evolutionary stages (Rasia et al. 2013; Campitiello et al. 2022). Although it is now well established that Active Galactic Nuclei (AGN) feedback plays a major role in suppressing the ICM cooling in cluster cores, the reason for the CC and NCC dichotomy is still not fully understood (Rasia et al. 2015; Barnes et al. 2018).

X-ray observations give access to the projected (2D) density and temperature profiles of the ICM. The latter is obtained from fitting a thermal model to the spectra extracted in concentric annuli about a given centre (usually the X-ray peak or centroid). For further scientific applications, these must then be deprojected to obtain the 3D profiles. If needed, the effect of the instrumental point spread function (PSF) can be taken into account in the deprojection step. While the deprojected (3D) gas density in shells can be easily estimated from the X-ray surface brightness (Croston et al. 2006; Bartalucci et al. 2017; Ghirardini et al. 2019a), the deprojection of ICM temperature profiles is not trivial. This is partly due to the need for sufficient photon counts to build and fit the spectrum, leading to the temperature profiles having significantly coarser angular resolution than the density.

The relationship between the observed 2D temperature profile, T_2D, and the originating 3D temperature profile, T_3D, can be expressed in matrix form as

$$ \mathbf{T}_{\rm 2D} = \mathbf{C}_{\rm PSF} \otimes \mathbf{C}_{\rm proj} \otimes \mathbf{T}_{\rm 3D} = \mathbf{C} \otimes \mathbf{T}_{\rm 3D}, $$ (1)

where ⊗ denotes the matrix product. Assuming a cluster is spherically symmetric and that the 3D temperature profile is defined in concentric spherical shells, the (i, j)th element of the matrix C_proj encodes the projection effect of the jth 3D shell onto the ith 2D annulus on the plane of the sky. The 2D annuli may have the same or different radii to the 3D shells. We note that C_PSF is a second matrix that describes the effect of the finite instrumental PSF. Its (k, i)th element contains the fraction of counts from the ith 2D annulus that are redistributed by the telescope into the kth observed 2D annulus. If there are n-model 3D shells, and correspondingly n-model 2D annuli, plus m-observed annuli, then the dimensions of C_proj and C_PSF are n × n and m × n, respectively. If the PSF is ignored, then the dimensions of C_proj should change to m × n.

The fitting of projected parametric models of the 3D temperature profiles to both observed and simulated 2D data has been widely used in the literature (De Grandi & Molendi 2002; Pizzolato et al. 2003; Ascasibar & Diego 2008; Bulbul et al. 2010; Gaspari et al. 2012; Ghirardini et al. 2019b). Initially, these were polytropic models that assumed a simple relationship between the density and the temperature distribution ($T \propto \rho^{\gamma - 1}$), but this does not fully capture all the complexities of real galaxy clusters, especially the central regions of CC clusters. The quality of recent data has necessitated more complicated models, perhaps the most widely used being that proposed by Vikhlinin et al. (2006):

$$ T_{\rm 3D}(r) = T_0 \times \frac{x+\tau}{x+1} \times \frac{(r/R_t)^{-a}}{\left(1+(r/R_t)^{b}\right)^{c/b}}, $$ (2)

where $x = (r/R_{\rm cool})^{a_{\rm cool}}$ and {T₀, τ, R_t, R_cool, a_cool, a, b, c} are the model parameters. In the framework of the Representative XMM-Newton cluster structure survey (REXCESS; Böhringer et al. 2007) and Following the most massive galaxy clusters over cosmic time (M2C; Bartalucci et al. 2019) projects, Démoclès et al. (2010) and Bartalucci et al. (2018) developed a non-parametric-like deconvolution approach. In this approach, the Vikhlinin et al. (2006) parametric model was used to perform the PSF correction and deprojection in order to estimate the temperature at the weighted radii of the 2D annular binning scheme. The 3D uncertainties were then computed consistently from the 2D errors, and random temperatures were drawn within these uncertainties to compute the temperature derivatives, which were used in the hydrostatic equilibrium total mass computations.
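To make the structure of Eq. (2) concrete, the following is a minimal sketch of the parametric model as a function. The parameter values are purely illustrative (they are not fitted values from this work or from Vikhlinin et al. 2006), and radii are assumed to be expressed in units of R₅₀₀.

```python
import numpy as np

def vikhlinin_t3d(r, T0, tau, R_t, R_cool, a_cool, a, b, c):
    """Eq. (2): 3D temperature at radius r (same length units as R_t, R_cool)."""
    r = np.asarray(r, dtype=float)
    x = (r / R_cool) ** a_cool
    cool_core = (x + tau) / (x + 1.0)          # central drop; equals 1 if tau = 1
    outer = (r / R_t) ** (-a) / (1.0 + (r / R_t) ** b) ** (c / b)
    return T0 * cool_core * outer

# Illustrative (not fitted) parameters; radii in units of R500.
r = np.logspace(np.log10(0.02), np.log10(2.0), 48)
params = dict(T0=1.3, tau=0.45, R_t=0.6, R_cool=0.05,
              a_cool=1.9, a=0.0, b=2.0, c=1.0)

T_cc = vikhlinin_t3d(r, **params)                     # cool-core-like profile
T_ncc = vikhlinin_t3d(r, **{**params, "tau": 1.0})    # no central drop
```

Setting τ = 1 switches off the cool-core term entirely, which illustrates why four of the eight parameters are unconstrained for profiles without a central temperature drop.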

However, parametric approaches are not fully satisfactory since, with a limited number of parameters, a single model could fail to capture features in the temperature profile due to shock fronts, edges, mergers, and the presence of cool cores. Moreover, a high degree of degeneracy between the parameters could be present. The Vikhlinin et al. (2006) parametric temperature model, which was developed for cool core systems, is a complex eight-parameter model, four of which correspond to the cool-core component. It is therefore not well suited to highly disturbed NCC clusters, which have flatter central temperature profiles instead of declining cool cores. Furthermore, for typical X-ray data quality, it exhibits a high degree of degeneracy between its parameters, leading to poorly constrained model parameters and results that depend on the prior choices in MCMC fitting schemes. Recently, Gianfagna et al. (2021), using a sample drawn from high resolution numerical simulations, found that the Vikhlinin et al. (2006) parametric model could provide a good fit to only 50% of their sample in the range [0.1–1] R₅₀₀¹.

Model-independent direct spectral deprojection methods offer an alternative and are commonly used to deconvolve the 3D temperature profiles. This can involve the onion-skin technique (Fabian et al. 1981; David et al. 2001; Johnstone et al. 2005; Russell et al. 2008; Lakhchaura et al. 2016), where the 3D layers are successively built up from the outside in. However, this approach is strongly dependent on the choice of the outermost bin because it is necessary to take into account the contribution to the emission from the shells outside the outermost annulus used for the analysis. Alternatively, isothermal models can be fitted to each annular spectrum and then the matrix method (i.e. Eq. (1)) can be used to deproject (e.g. Ettori et al. 2002). Ignoring the PSF effect, the equation for temperature profiles assuming that the observed projected spectra consist of a linear combination of isothermal emission models weighted by the projected emission measure simplifies to

$$ T_{{\rm 2D},k} = \sum_{j=1}^{n} \frac{w_{k,j}}{\sum_{j=1}^{n} w_{k,j}}\, T_{{\rm 3D},j}. $$ (3)

Here, T_3D, j and T_2D, k are the 3D and 2D temperatures at the jth 3D spherical shell and kth 2D observed annulus, respectively, and the weights, w_k, j, consist of the emission measure contribution of the spherical shells onto the observed annuli (e.g. Mathiesen & Evrard 2001).
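Equation (3) can be illustrated with a short sketch that builds the row-normalised weights from geometry. It assumes spherical symmetry, emission-measure weights of the form w_{k,j} = ρ_j² V_{k,j} (with V_{k,j} the volume of shell j seen through annulus k), and it neglects emission from beyond the outermost shell. The function names, the toy density and temperature profiles, and the binning are all assumptions made for illustration.

```python
import numpy as np

def shell_annulus_volumes(r_edges, R_edges):
    """V[k, j]: volume of 3D shell j intersected by the cylinder over annulus k."""
    r_in, r_out = r_edges[:-1], r_edges[1:]                 # shape (n,)
    R_in, R_out = R_edges[:-1, None], R_edges[1:, None]     # shape (m, 1)

    def term(r, R):
        # (r^2 - R^2)^(3/2), set to zero where the sphere lies inside radius R
        return np.clip(r**2 - R**2, 0.0, None) ** 1.5

    return (4.0 * np.pi / 3.0) * (
        term(r_out, R_in) - term(r_out, R_out)
        - term(r_in, R_in) + term(r_in, R_out)
    )

def projection_matrix(r_edges, R_edges, rho):
    """Row-normalised emission-measure weights w[k, j] / sum_j w[k, j]."""
    w = shell_annulus_volumes(r_edges, R_edges) * rho**2
    return w / w.sum(axis=1, keepdims=True)

# Toy set-up: 48 shells out to 2 R500 and 12 annuli out to R500.
r_edges = np.logspace(np.log10(0.02), np.log10(2.0), 49)
R_edges = np.logspace(np.log10(0.02), np.log10(1.0), 13)
r_mid = np.sqrt(r_edges[:-1] * r_edges[1:])

rho = (1.0 + (r_mid / 0.1) ** 2) ** -1.0        # hypothetical beta-like density
T3d = 1.2 / np.sqrt(1.0 + (r_mid / 0.6) ** 2)   # hypothetical 3D temperature

C = projection_matrix(r_edges, R_edges, rho)     # shape (12, 48)
T2d = C @ T3d                                    # Eq. (3) in matrix form
```

Because each row of the matrix sums to one, an isothermal 3D profile projects onto the same temperature in every annulus, which is a convenient sanity check for any implementation.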

However, such model-independent approaches are often unstable if the data are noisy because Eq. (1) is an inverse problem, meaning that any noise becomes greatly amplified by the deconvolution procedure. In addition, the simplistic emission measure weighting has been found to be inaccurate when applied to X-ray observations. In particular, it has been demonstrated that in the presence of multi-temperature gas, w is more appropriately expressed as a non-linear combination of density and temperature (Mazzotta et al. 2004; Vikhlinin 2006), further complicating the deconvolution procedure.
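The noise amplification mentioned above can be seen in a toy experiment. In the sketch below, the matrix C is not a real projection or PSF matrix: it is a hypothetical row-normalised Gaussian blur used only as a stand-in for a smoothing operator, and the profile and noise level are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 24
idx = np.arange(n)

# Hypothetical stand-in for C: Gaussian mixing between bins (width 1.5 bins),
# row-normalised so that each output bin is a weighted mean of the inputs.
C = np.exp(-0.5 * ((idx[:, None] - idx[None, :]) / 1.5) ** 2)
C /= C.sum(axis=1, keepdims=True)

T3d = 1.0 + 0.5 * np.exp(-idx / 8.0)          # arbitrary smooth "true" profile
sigma = 0.02                                   # noise level on the 2D data
T2d_noisy = C @ T3d + rng.normal(0.0, sigma, n)

# Naive deconvolution: solve C x = T2d directly, with no regularisation.
T3d_naive = np.linalg.solve(C, T2d_noisy)

rms_naive = np.sqrt(np.mean((T3d_naive - T3d) ** 2))
print(f"condition number of C : {np.linalg.cond(C):.2e}")
print(f"input noise (rms)     : {sigma:.3f}")
print(f"error after inversion : {rms_naive:.3e}")
```

The small singular values of the smoothing matrix are what inflate the noise on inversion, which is the motivation for adding a regularisation term rather than inverting Eq. (1) directly.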

Machine Learning (ML) techniques have emerged as a powerful tool for predicting key features of data and for solving inverse problems to reconstruct (deconvolve) signals, images, etc., from observations. ML techniques have also been applied to the study of galaxy clusters. Ntampaka et al. (2015) developed an ML algorithm based on Support Distribution Machines to reconstruct dynamical cluster masses using the velocity distribution of cluster members from simulations, achieving a reduction in the scatter between the predicted and true mass by a factor of two compared to standard methods. More complex ML approaches have led to similar significant improvements in the mass estimates (Armitage et al. 2019; Calderon & Berlind 2019).

Using deep learning techniques, Convolutional Neural Network (CNN) models have also been used to infer the dynamical mass of galaxy clusters (Ho et al. 2019; Ramanah et al. 2020; Ho et al. 2021; de Andres et al. 2022). In particular, Yan et al. (2020) used mock datasets of stellar mass, soft X-ray flux, bolometric X-ray flux, and Compton y-parameter images as input to train a CNN model to infer the mass of galaxy clusters, and Gupta & Reichardt (2020, 2021) trained CNN models to estimate cluster masses using mock SZ and cosmic microwave background (CMB) lensing maps. Ferragamo et al. (2023), using a combination of an auto-encoder and a random forest regression technique on a sample of 73 138 mock Compton-y parameter maps from the hydrodynamical simulations of the THREE HUNDRED PROJECT (Cui et al. 2018), were able to reconstruct the 3D gas mass profile and total mass in galaxy clusters with a scatter of about 10% with respect to the true values. de Andres et al. (2022) and Ho et al. (2022) have used real observations to estimate the total mass profiles of galaxy clusters using deep learning models trained on mock simulations. While de Andres et al. (2022) used the Planck SZ maps (Planck Collaboration XXVII 2016) to determine the masses of Planck clusters, Ho et al. (2022) used relative line-of-sight velocities and projected radial distances of galaxy pairs from Sloan Digital Sky Survey (SDSS) data (Alam et al. 2015) to determine the mass of the Coma cluster.

In this work, we show the first use of neural networks, trained on numerical simulations, to deproject the X-ray temperature profiles of galaxy clusters. Our technique is based on that proposed by Bobin et al. (2019, 2023) where a so-called Interpolatory Autoencoder (IAE) neural network is built to model the 3D temperature profiles by learning a non-linear interpolatory scheme from a limited set of example profiles called ‘anchor points’. The main advantage of the IAE neural network is that it is able to capture the intrinsic low-dimensional, non-linear nature of the profiles even when the training sample is not large in size. This is crucial as a small sample size can otherwise pose several challenges to the effectiveness of a deep learning algorithm. The model is trained and tested with a set of 315 simulated temperature profiles, in the radial range of [0.02–2] R₅₀₀, from the THREE HUNDRED PROJECT (Cui et al. 2018). A robust temperature deconvolution scheme is then introduced to fit the trained IAE model, that makes use of an efficient regularisation term in the likelihood, along with Markov chain Monte Carlo (MCMC) sampling. The technique is then applied to a pilot sample of X-ray temperature profiles from the CHEX-MATE project (Cluster HEritage project with XMM-Newton: Mass Assembly and Thermodynamics at the Endpoint of structure formation; CHEX-MATE Collaboration 2021).

The paper is organised as follows. Section 2 discusses in detail the simulations used in training the IAE model for temperature profiles. In Sect. 3 we present the IAE model, and Sect. 4 deals with model training and the learning-based deconvolution technique. The performance of the deconvolution algorithm is tested with simulations in Sect. 5, while in Sect. 6, we apply our approach for the first time to a representative sample of 28 galaxy clusters from the first data release (DR1 hereafter, Rossetti et al., in prep.) in the CHEX-MATE sample. Finally, in Sect. 7, we summarise our work. Throughout this work, we adopt a flat ΛCDM model with H₀ = 70 km s⁻¹ Mpc⁻¹, Ω_m = 0.3 and Ω_Λ = 0.7. Further, E(z) is the ratio of the Hubble constant at redshift z to its present value, H₀, and h₇₀ = H₀/(70 km s⁻¹ Mpc⁻¹) = 1.

2. Simulations

In this work, training of the neural network is undertaken using the gas mass-weighted 3D temperature profiles, T_3D, of galaxy clusters from the THREE HUNDRED PROJECT (Cui et al. 2018; Ansarifard et al. 2020). These simulations are based on the 324 Lagrangian regions centred on the z = 0 most massive galaxy clusters selected from the MultiDark dark-matter-only MDPL2 simulation (Klypin et al. 2016), carried out with the cosmological parameters from the Planck mission (Planck Collaboration XIII 2016). MDPL2 is a periodic cube of comoving size equal to 1.48 Gpc containing 3840³ dark matter particles. The selected regions were resimulated with the inclusion of baryons using the code GADGET-X (Beck et al. 2016). To treat the baryonic physics, several processes were included: metallicity-dependent radiative cooling, the effect of a uniform time-dependent UV background, a sub-resolution model for star formation from a multi-phase interstellar medium, kinetic feedback driven by supernovae, metal production from SN-II, SN-Ia and asymptotic-giant-branch stars, and AGN feedback (Rasia et al. 2015).

In the present work, we ignore the redshift dependence of the profiles, if any, and only consider the simulated sample at a fixed redshift of z = 0.33, which is the average redshift of the CHEX-MATE sample. We consider a mass range of M₅₀₀ > 10¹⁴ M_⊙, allowing us to build a library covering the full mass range of the CHEX-MATE sample. This left us with 315 clusters in the simulated sample.

The temperature profiles were derived in 48 fixed logarithmically spaced radial bins in the range [0.02–2] R₅₀₀ (Ansarifard et al. 2020). The lower radial limit of 0.02 R₅₀₀ was chosen since it encloses approximately 100 gas particles for the simulated sample, which we call the precision threshold condition, thus ensuring that the analysis is statistically robust and that the results are not affected by numerical fluctuations in the gas properties at small radii (Rasia et al. 2015). The 3D mass-weighted temperature in a given shell i, T_3D, i (i.e. the ith element of the T_3D vector), was calculated by weighting the temperature of the pth gas particle (T_p) using its gas mass (m_p) as a weighting function w,

$$ T_{{\rm 3D},i} = \frac{\sum T_p\, m_p}{\sum m_p}. $$ (4)

In this calculation, no attempt was made to exclude low-temperature sub-clumps in the outskirt regions of the clusters; however, only particles with temperature > 0.3 keV were considered.
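The binning described by Eq. (4) can be sketched with weighted histograms. This is only an illustration on synthetic particles: the function name, the particle distribution, and the fraction of cold particles are assumptions, and the actual simulation pipeline may differ in its details.

```python
import numpy as np

def mass_weighted_profile(r_p, T_p, m_p, edges, T_min=0.3):
    """Eq. (4) per radial shell, keeping only particles with T_p > T_min (keV)."""
    keep = T_p > T_min
    r_p, T_p, m_p = r_p[keep], T_p[keep], m_p[keep]
    sum_mT, _ = np.histogram(r_p, bins=edges, weights=m_p * T_p)
    sum_m, _ = np.histogram(r_p, bins=edges, weights=m_p)
    counts, _ = np.histogram(r_p, bins=edges)
    with np.errstate(invalid="ignore", divide="ignore"):
        T3d = sum_mT / sum_m            # NaN where a shell holds no particles
    return T3d, counts

# Synthetic particles (hypothetical): radii in units of R500, T in keV.
rng = np.random.default_rng(1)
n_part = 50_000
r_p = 10 ** rng.uniform(np.log10(0.02), np.log10(2.0), n_part)
T_p = 5.0 / np.sqrt(1.0 + (r_p / 0.6) ** 2) * rng.lognormal(0.0, 0.1, n_part)
m_p = rng.uniform(0.5, 1.5, n_part)
T_p[rng.random(n_part) < 0.02] = 0.1    # a few cold particles, removed by the cut

edges = np.logspace(np.log10(0.02), np.log10(2.0), 49)   # 48 shells
T3d, counts = mass_weighted_profile(r_p, T_p, m_p, edges)

# Shells below a particle-count threshold (100 in the text) can be flagged.
reliable = counts >= 100
```

Returning the per-shell particle counts alongside the profile makes it straightforward to apply a particle-number threshold of the kind described in the text.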

We estimated the projected 2D temperature profiles (T_2D) along the line of sight (l) using the 3D gas density (ρ) and temperature profiles (T_3D). The 2D temperature profiles were estimated in pre-defined logarithmically spaced annular bins by first considering the classical emission-measure weights (C = C_proj, see Eq. (3)):

$$ \mathbf{T}_{\rm 2D} = \frac{\int w\, \mathbf{T}_{\rm 3D}\, {\rm d}l}{\int w\, {\rm d}l} = \mathbf{C} \otimes \mathbf{T}_{\rm 3D}, $$ (5)

where w = ρ² (e.g. Mathiesen & Evrard 2001). We produced several versions of the T_2D profiles. First, the T_2D profiles were estimated in the same radial bins as those of the T_3D (48 bins) by using a matrix C of dimension 48 × 48 (C_48, 48). We also estimated T_2D in a coarser binning scheme to reproduce typical radial sampling from present-day X-ray observatories such as XMM-Newton and Chandra. These have either twelve or six logarithmic bins reaching only up to R₅₀₀, corresponding to matrices of dimension 12 × 48 (C_12, 48) and 6 × 48 (C_6, 48), respectively.

We also considered a more complex case where we use the spectroscopic-like weighting proposed by Mazzotta et al. (2004) to generate the 2D temperature profiles using the binning schemes discussed above. In this case, apart from the normalisation, the matrix elements of C simply change to $C_{i,j} \rightarrow C_{i,j}\, T_{{\rm 3D},j}^{-3/4}$ (or equivalently, the weights change to $w = \rho^{2}\, T_{\rm 3D}^{-3/4}$), where T_3D, j is the mass-weighted 3D temperature profile in the jth bin.
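The difference between the two weighting schemes can be illustrated with a short sketch. It takes a generic geometric matrix V (volume of shell j seen through annulus k) as input. The small V, density, and temperature arrays below are made-up numbers for illustration only, and the function names are assumptions.

```python
import numpy as np

def project_emission_measure(V, rho, T3d):
    """2D temperature with emission-measure weights w = rho^2 (Eq. 5)."""
    w = V * rho**2
    return (w @ T3d) / w.sum(axis=1)

def project_spectroscopic_like(V, rho, T3d):
    """2D temperature with spectroscopic-like weights w = rho^2 T^(-3/4)."""
    w = V * rho**2 * T3d ** (-0.75)
    return (w @ T3d) / w.sum(axis=1)

# Made-up geometry: 2 annuli, 3 shells (upper-triangular, as for projection).
V = np.array([[1.0, 0.8, 0.5],
              [0.0, 1.2, 0.9]])
rho = np.array([1.0, 0.5, 0.2])      # arbitrary density per shell
T3d = np.array([3.0, 6.0, 4.0])      # arbitrary temperature per shell (keV)

T_em = project_emission_measure(V, rho, T3d)
T_sl = project_spectroscopic_like(V, rho, T3d)
print("emission-measure weighted :", T_em)
print("spectroscopic-like        :", T_sl)
```

Since the T^(-3/4) factor up-weights cooler gas, the spectroscopic-like temperature never exceeds the emission-measure-weighted one along the same line of sight. Note also that the weights now depend on T_3D itself, so the forward model is no longer linear in the 3D temperature.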

In many clusters in the simulated sample, the temperature profiles in the first few inner bins (typically 0–13 radial bins corresponding to radii between ≈[0.02–0.07] R₅₀₀) were noisy (i.e. having < 100 gas particles). For such systems, the 2D profiles were estimated without considering such bins.

Figure 1 shows the observed scaled 2D temperature profiles of the Planck SZ sample (Planck Collaboration XI 2011) and the XMM-Newton DR1 sample (Rossetti et al., in prep., described in detail in Sect. 6.2). These are compared to 50 randomly drawn 2D temperature profiles from the THREE HUNDRED PROJECT using emission measure (left panel) and spectroscopic-like (right panel) weighting schemes and an observation-like convolution matrix, C_12, 48. Both observational and simulated temperature profiles were scaled by the average 2D temperature (T_X) in the radial range of [0.15–0.75] R₅₀₀. Figure 2 shows the distribution of the clusters in the simulated sample, Planck sample and DR1 sample on the basis of T_X. These two figures illustrate three points that will be critical for the following study:

In common with a number of works over the last 20 yr (e.g. De Grandi & Molendi 2002; Vikhlinin et al. 2006; Pratt et al. 2007; Leccardi & Molendi 2008; Ghirardini et al. 2019a), the structural similarity in the observed temperature profiles is clearly visible in Fig. 1. The central regions are characterised by a large spread, due to a mixed population of cool core and disturbed systems, while beyond the central 0.15 R₅₀₀ the profiles all decline in a similar fashion.
The simulated profiles follow the same general trend as the observed profiles. The average trend and 1-σ dispersion of the simulations are consistent with those of the CHEX-MATE DR1 sample. The simulated temperature profiles on average are slightly hotter in the centre compared to the Planck SZ sample. This may be related to the fact that there are more low mass clusters in the simulated sample compared to the Planck SZ sample. Such low mass clusters are expected to be more strongly affected by AGN feedback, potentially leading to higher temperatures in the central region (Iqbal et al. 2018). Alternatively, the higher central temperatures in the simulations may simply be due to the fact that the sample has a large number of NCC clusters.
Overall, the observed temperature profiles are well represented by the simulated sample. This fact will be key to a successful training stage of the IAE model, which relies on identifying underlying trends in the data that would not otherwise be found. We note that the simulated profiles do not have to precisely match the observed data: as we will see, the most important point is that they reproduce the overall structure and diversity of the observed profiles, which is what our IAE model learns.

Fig. 1.

Comparison of the observed 2D temperature profiles, scaled as a function of R₅₀₀ and T_X, the temperature in the [0.15–0.75] R₅₀₀ region. The thin grey lines show 50 randomly selected simulated 2D temperature profiles from the THREE HUNDRED PROJECT, extracted with an observation-like annular binning resolution, derived using emission measure (left panel) and spectroscopic-like (right panel) weighting schemes. The thin red lines show individual profiles in the Planck Collaboration XI (2011) sample. For better visibility, the error bars corresponding to the observed profiles are not shown. The regions enclosing thick black and red lines show the 1-σ dispersion (16th–84th percentile range) of the temperature profiles of the full simulated sample and the Planck sample respectively. The regions enclosing the thick blue lines show the 1-σ dispersion of the CHEX-MATE DR1 sample. Scaled by R₅₀₀ and T_X, both the emission measure and spectroscopic-like derived 2D simulated temperature profiles become somewhat self-similar.

Fig. 2.

Number of clusters as a function of T_X in the THREE HUNDRED PROJECT sample, the Planck SZ sample and the DR1 sample.

We further classified the simulated clusters using three schemes. This is important to quantify how well the IAE model reconstructs the radial temperature distribution for different types of objects and profile shapes.

2.1. CC and NCC classification

Firstly, we classify the profiles as CC and NCC by visual inspection. The objective here is simply to select simulated profiles that mimic those of observed cool-core-like clusters with a central temperature drop, and non cool-core clusters that display an almost isothermal central temperature profile. The profiles which show a decreasing trend towards the cluster centre (positive temperature gradient) were classified as CC clusters. We identify about one-third of the clusters as belonging to the CC class. In Fig. 3, grey lines in the left panels and right panels show the 3D temperature profiles (T_3D) of CC and NCC clusters respectively.

Fig. 3.

Classification of temperature profiles in the THREE HUNDRED PROJECT. Left panels: Grey lines show the visually classified CC clusters. Cyan and green lines show the 20 most relaxed clusters (top panel) and the 20 smoothest profiles (bottom panel). Right panels: Grey lines show the visually classified NCC clusters. Magenta and orange lines show the 20 most disturbed clusters (top panel) and the 20 most irregular profiles (bottom panel).

2.2. Dynamical classification

Clusters in these simulations were classified on the basis of their intrinsic dynamical state (relaxed or disturbed) using a variety of estimators (Rasia et al. 2013). The two important intrinsic estimators are f_s = M_sub/M_tot, the fraction of cluster mass (M_tot) included in substructures (M_sub), and Δ_r = |r_δ − r_cm|/R_ap, which is the measure of the offset between the central density peak (r_δ), and the centre of mass (r_cm) of the cluster normalised to aperture radius R_ap. Both of the estimators were computed at R₅₀₀. Both f_s and Δ_r are expected to be lower than 0.1 for relaxed objects (Cialone et al. 2018; De Luca et al. 2021). These two dynamical parameters can be combined (Rasia et al. 2013) to give the so-called relaxation parameter χ_D

$$ \chi_{D} = \frac{1}{2} \times \left( \frac{\Delta_{\rm r} - \Delta_{\rm r,med}}{|\Delta_{\rm r,quar} - \Delta_{\rm r,med}|} + \frac{f_{\rm s} - f_{\rm s,med}}{|f_{\rm s,quar} - f_{\rm s,med}|} \right). $$ (6)

Here Δ_r, med and f_s, med are the medians of the Δ_r and f_s distributions, respectively, and Δ_r, quar and f_s, quar are the first or the third quartiles, depending on whether the parameters of a specific cluster are smaller or larger than the median. According to this definition, clusters with χ_D < 0 are classified as relaxed, and clusters with χ_D > 0 are classified as disturbed. The left panel of Fig. 4 shows the histogram of χ_D values. The cyan and magenta hatched regions represent the 20 most relaxed clusters and the 20 most disturbed clusters, respectively. We will refer to these sub-samples as MR20 and MD20 hereafter. In the top panel of Fig. 3, we show the corresponding temperature profiles of the MR20 clusters (left panel) and the MD20 clusters (right panel) with cyan and magenta lines, respectively. It is interesting to note that only a few of the most relaxed objects are also categorised as CC clusters. Visual inspection of emissivity maps shows, as expected, that χ_D is clearly linked to the overall gas morphology, as also found in Campitiello et al. (2022).
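As an illustration of Eq. (6), the sketch below computes χ_D for a sample and selects the 20 lowest and 20 highest values. The input values are randomly generated placeholders, not the simulation measurements, and the function name and the use of linearly interpolated percentiles for the quartiles are assumptions.

```python
import numpy as np

def relaxation_parameter(delta_r, f_s):
    """Eq. (6): chi_D from the centre offset Delta_r and substructure fraction f_s."""
    def scaled(x):
        x = np.asarray(x, dtype=float)
        med = np.median(x)
        q1, q3 = np.percentile(x, [25, 75])
        # First quartile for values below the median, third quartile otherwise.
        quar = np.where(x < med, q1, q3)
        return (x - med) / np.abs(quar - med)
    return 0.5 * (scaled(delta_r) + scaled(f_s))

# Placeholder sample of 315 clusters (random numbers, for illustration only).
rng = np.random.default_rng(3)
delta_r = rng.lognormal(np.log(0.08), 0.6, 315)
f_s = rng.lognormal(np.log(0.10), 0.5, 315)

chi_D = relaxation_parameter(delta_r, f_s)
order = np.argsort(chi_D)
mr20 = order[:20]       # indices of the 20 most relaxed (lowest chi_D)
md20 = order[-20:]      # indices of the 20 most disturbed (highest chi_D)
print("relaxed fraction (chi_D < 0):", np.mean(chi_D < 0))
```

With this scaling, a cluster sitting at the first quartile of both distributions has χ_D = −1, and one at the third quartile of both has χ_D = +1.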

Fig. 4.

Distribution of clusters in the THREE HUNDRED PROJECT as a function of the χ_D (Eq. (6)) and χ_S (Eq. (8)) criteria. The hatched cyan and magenta regions show the 20 most relaxed clusters and the 20 most disturbed clusters, respectively, based on the χ_D criterion. The hatched green and orange regions show the 20 most regular profiles and the 20 most irregular profiles, respectively, based on the χ_S criterion.

2.3. Structural classification

To enable a better assessment of the performance of the IAE model for temperature profile reconstruction, we also classified the 3D temperature profiles based directly on their smoothness. Bumps in the temperature profiles are usually associated with complex astrophysical processes such as merger shocks, gas condensation, the presence of cold substructures, sloshing, and turbulence, all of which affect the temperature in a given annulus. To measure the degree of the bumpiness of the 3D temperature profiles, we used the starlet wavelet transform, which is widely used in component separation in astrophysical images (Starck et al. 2007), to split each profile into its smooth and non-smooth components. Using this technique, the 3D temperature profile T_3D(r) can be decomposed into a J + 1 coefficient set W = {w₁, …, w_J, T_J}, as a superposition of the form

$$ \mathbf{T}_{\rm 3D}(r)=\mathbf{T}_J(r)+\sum_{j=1}^{J}\mathbf{w}_j(r), $$ (7)

where T_J is a smooth (coarse resolution) version of the original temperature profile and w_j represents the structure in the temperature profile on scale 2^−j.
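The decomposition of Eq. (7) can be sketched in one dimension as below. This is a minimal illustration that assumes the usual "à trous" starlet scheme with the B3-spline kernel [1, 4, 6, 4, 1]/16 and mirror boundary conditions. The function name and the toy profile are hypothetical, and the boundary treatment used in this work is not stated in the text.

```python
def starlet_1d(signal, n_scales=2):
    """Return ([w_1, ..., w_J], T_J) such that signal = T_J + sum_j w_j."""
    kernel = (1 / 16, 4 / 16, 6 / 16, 4 / 16, 1 / 16)  # B3-spline filter
    n = len(signal)  # requires n >= 2

    def reflect(i):
        # Mirror an out-of-range index back into [0, n - 1].
        period = 2 * (n - 1)
        i = abs(i) % period
        return i if i < n else period - i

    coarse = list(signal)
    details = []
    for j in range(n_scales):
        step = 2 ** j  # spacing between filter taps doubles at each scale
        smooth = [
            sum(k * coarse[reflect(i + (m - 2) * step)] for m, k in enumerate(kernel))
            for i in range(n)
        ]
        # Detail coefficients are the difference between successive smoothings.
        details.append([c - s for c, s in zip(coarse, smooth)])
        coarse = smooth
    return details, coarse


# Toy profile only (arbitrary units).
profile = [1.0, 1.2, 1.5, 1.4, 1.1, 0.9, 0.8, 0.6]
details, smooth = starlet_1d(profile, n_scales=2)
recon = [smooth[i] + sum(w[i] for w in details) for i in range(len(profile))]
```

By construction the sum of the smooth component and the detail coefficients reproduces the input exactly, which is the property expressed by Eq. (7).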

Figure 5 shows the starlet decomposition for one of the clusters in the THREE HUNDRED PROJECT which exhibits a complex shape in the range [0.5–1] R₅₀₀. The cluster is experiencing a major merger and there is an enhancement of the temperature due to the propagation of a shock in this region. We use the starlet transform with J = 2, which we have found to be the optimal configuration to measure the non-smoothness, yielding a decomposition into a smooth temperature component and two additional non-smooth components, w₁(r) and w₂(r). We then define the root mean square deviation, χ_S, of the difference between the true and smooth temperature profiles in the radial range of [0.08–1] R₅₀₀ as a measure of the non-smoothness of the temperature profiles:

$$ \chi_{S}=\sqrt{\frac{1}{u}\sum_{i=1}^{u}(T_{\mathrm{3D},i}-T_{J,i})^{2}}, $$ (8)

Fig. 5.

Smooth (coarse) component of a complex temperature profile derived from the application of the Starlet transform with J = 2. The bottom panel shows the corresponding difference between true and smooth temperature profiles.

where u is the number of data points in the range of [0.08–1] R₅₀₀, and the lower limit of 0.08 R₅₀₀ corresponds to the radius at which all clusters satisfy the precision threshold condition. The temperature profiles were first scaled (normalised) by the average mass-weighted temperature in the radial range of [0.15–0.75] R₅₀₀ before applying the decomposition operator to calculate χ_S. The right hand panel of Fig. 4 shows the distribution of χ_S for the full sample, which follows an approximately log-normal distribution. The green and orange hatched regions represent the 20 most smooth profiles and 20 most irregular profiles, respectively, based on the χ_S criterion. We will refer to these sub-samples as MS20 and MI20 henceforth. In the bottom panel of Fig. 3, we show the corresponding temperature profiles of the MS20 (left panel) and MI20 profiles (right panel) with green and orange lines respectively. Here also, only a few of the clusters with the most smooth profiles are categorised as CC clusters. The correlation between χ_D and χ_S is shown in Fig. A.1. They are moderately correlated, with a Spearman’s correlation coefficient of 0.42 and a P value of 5 × 10⁻¹⁵.
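For concreteness, Eq. (8) amounts to the following short computation. The function name and the toy arrays are hypothetical, and the profiles are assumed to have been normalised already as described above.

```python
import math


def chi_s(radii, t_3d, t_smooth, r_min=0.08, r_max=1.0):
    """RMS deviation between a profile and its smooth component (Eq. 8).

    radii are in units of R500; only bins with r_min <= r <= r_max are used.
    """
    sq = [
        (t - s) ** 2
        for r, t, s in zip(radii, t_3d, t_smooth)
        if r_min <= r <= r_max
    ]
    return math.sqrt(sum(sq) / len(sq))


# Toy values only: three of the five bins fall inside [0.08, 1] R500.
radii = [0.05, 0.1, 0.5, 1.0, 1.5]
t_true = [1.0, 1.1, 1.0, 0.8, 0.5]
t_coarse = [1.0, 1.0, 1.0, 0.9, 0.5]
value = chi_s(radii, t_true, t_coarse)
```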

3. Neural network model for learning 3D temperature profiles

The deconvolved temperature profile can in principle be obtained by solving the following classical inverse problem

$$ \mathbf{T}_{\rm 2D}=\mathbf{C}\otimes \mathbf{T}_{\rm 3D}+\mathbf{N}, $$ (9)

where C is a non-linear operator (matrix) which represents the observational and instrumental effects (projection, PSF, etc) and N represents the statistical properties of the noise. The standard way of solving Eq. (9) is to consider least squares regression with some regularisation R

$$ \mathbf{T}^{\mathrm{fit}}_{\rm 3D}=\min_{\mathbf{T}}\,\mathbf{R}(\mathbf{T})+\left\Vert \mathbf{T}_{\rm 2D}-\mathbf{C}\otimes \mathbf{T}\right\Vert^2, $$ (10)

where $\mathbf{T}^{\mathrm{fit}}_{\mathrm{3D}}$ is the best-fitting model profile for T_3D, which is obtained by optimising the above relation with respect to T. However, Eq. (9) is an ill-posed (non-linear) problem, and using standard non-parametric methods does not result in a unique and stable solution. Therefore, one has to resort to advanced deconvolution techniques. In this work, we propose one such algorithm that makes use of neural networks to model the temperature profiles, and whose framework will be explained below. A learning-based regularisation procedure for direct deconvolution using the trained neural network is discussed in Sect. 4.2.

Our approach is based on manifold learning, which stems from the manifold hypothesis that real-world data lie on a lower dimensional manifold (Fefferman et al. 2013). This is evidently the case for galaxy cluster temperature profiles, which clearly display some degree of regularity, as seen in Fig. 3. The goal is then to find the lower dimensional manifold by learning the underlying structure of the data. When one has access to a large training set (from observations and/or simulations), it may be possible to make use of machine learning (deep learning) methods to build an underlying manifold. However, this becomes quite difficult when available training samples are sparse, as is the case for cluster temperature profiles. In such cases, rather than learning the underlying manifold structure, Bobin et al. (2023) proposed the Interpolator AutoEncoder (IAE), which learns to travel on a manifold by way of interpolation between a limited number of anchor points that belong to it.

We assume that any temperature profile in a training set {Tⁱ}^{i = 1, …, n}, where n represents the total number of elements in the set, can be interpolated from a small set of d anchor points $\{\mathbf{T}^e_a\}^{e=1,\ldots,d}$ using an appropriate metric Π

$$ \boldsymbol{\Theta}(\boldsymbol{\Lambda}^{i})=\min_{\boldsymbol{\Lambda}^{i}}\sum_{e=1}^{d}\lambda_{e}^{i}\,\boldsymbol{\Pi}(\mathbf{T}^i,\mathbf{T}^e_a), $$ (11)

where Θ is called the barycentre. The elements of the vector $\boldsymbol{\Lambda}^{i}=[\lambda_{1}^{i},\ldots,\lambda_{d}^{i}]$ are the barycentric weights ($\sum_{e=1}^{d}\lambda_e^{i}=1$), which are optimised in the above equation. If we consider the metric Π to be Euclidean, then

$$ \boldsymbol{\Theta}(\boldsymbol{\Lambda}^{i})=\min_{\boldsymbol{\Lambda}^{i}}\sum_{e=1}^{d}\lambda_{e}^{i}\,\Vert\mathbf{T}^i-\mathbf{T}_a^e\Vert^2. $$ (12)

The above equation reduces Θ(Λⁱ) to an orthogonal projection onto the span of anchor points $\mathbf{T}_{a}^{e}$, that is

$$ \mathbf{T}^{i}\equiv\boldsymbol{\Theta}(\boldsymbol{\Lambda}^{i})=\sum_{e=1}^{d}\lambda_{e}^{i}\,\mathbf{T}_{a}^{e}. $$ (13)

The problem then reduces to finding (optimising) barycentric weights such that the barycentre Θ accurately reconstructs any input temperature profile in the training sample.

However, if the profiles are non-linear, with varying amplitudes and shapes, as is the case with the temperature profiles in galaxy clusters, the standard metric Π may not reconstruct an appropriate barycentric representation. Our method, therefore, uses the approach proposed by Bobin et al. (2019, 2023), in which a data-driven metric is constructed using a deep learning neural network that is well adapted to build physically relevant barycentres of anchor points. We introduce an auto-encoder (Vincent et al. 2010) inspired neural network model which learns to transport points (temperature profiles in our case) onto the underlying manifold using a non-linear interpolation scheme between the anchor points.

The structure of the neural network we are considering is shown in the left hand panel of Fig. 6. It consists of an encoder (Φ), that takes an input, and a decoder (Ψ), that generates the desired output. The role of the encoder is to transform the input data into a lower-dimensional representation, while the decoder is responsible for mapping the lower-dimensional data back into the original space. By performing these mappings, auto-encoders are able to learn the underlying structure of the data. In contrast to standard auto-encoders, our model training is performed by minimising the error between the input and the reconstructed training sample according to the Euclidean distance onto the manifold spanned by the anchor points in the encoder (feature space).

Fig. 6.

Design of the neural network used in this work. Left panel: Neural network used in the training stage. Φ and Ψ represent the encoder and decoder respectively. Tⁱ are the elements of the training set and $\mathbf{T}^e_a$ are the elements of the anchor set. Φ(Tⁱ) and $\boldsymbol{\Phi}(\mathbf{T}^e_a)$ are the representations of Tⁱ and $\mathbf{T}^e_a$, respectively, in the encoder (feature) space. $\boldsymbol{\Theta}(\boldsymbol{\Lambda}^i)=\boldsymbol{\Theta}([\lambda_1^{i},\ldots,\lambda_d^{i}])$ is the Euclidean barycentric representation of Φ(Tⁱ) in terms of d anchor points $\boldsymbol{\Phi}(\mathbf{T}^e_a)$, which is fed to the decoder. Ψ(Θ) is the reconstructed output of the decoder. The network is trained by minimising the error between the input Tⁱ and output Ψ(Θ) temperature profiles. Right panel: Neural network (IAE model) of temperature profiles, where λ₁, …, λ_d are the input parameters and IAE([λ₁, …, λ_d]) is the output temperature profile. The encoder is not required in any step here.

More precisely, for the encoder Φ, the representation Φ(Tⁱ) of the input profile Tⁱ (belonging to the training set) is expressed in terms of the barycentre, Θ, in feature space, as an orthogonal projection onto the span of the anchor points $\boldsymbol{\Phi}(\mathbf{T}_{a}^{e})$, following Eq. (13):

$$ \boldsymbol{\Phi}(\mathbf{T}^i)\equiv\boldsymbol{\Theta}(\boldsymbol{\Lambda}^{i})=\sum_{e=1}^{d}\lambda_e^{i}\,\boldsymbol{\Phi}(\mathbf{T}^e_a). $$ (14)

The barycentric weights are constrained to sum to one so as to avoid certain scaling indeterminacies, and are not necessarily constrained to be positive like actual barycentric weights, which potentially allows us to extrapolate beyond the convex hull of the encoded anchor points. More precisely, the barycentric weights for the n elements in the training sample are computed as follows:

$$ \min_{\boldsymbol{\Lambda}^{i}}\sum_{i=1}^{n}\left\Vert\boldsymbol{\Phi}(\mathbf{T}^i)-\sum_{e=1}^{d}\lambda_e^{i}\,\boldsymbol{\Phi}(\mathbf{T}_a^e)\right\Vert^2 \quad \text{s.t.} \quad \sum_{e}\lambda_e^{i}=1, $$ (15)

which can be approximated by taking the solution to the least-squares problem followed by a rescaling of the barycentric weights in order to make them sum to one.
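The approximation described here (an unconstrained least-squares solution followed by a rescaling) can be sketched for a single encoded profile as below. This is an illustrative sketch in plain Python: the helper names and the toy vectors are hypothetical, the normal equations are solved with a small Gauss-Jordan routine for self-containedness, and the released code may proceed differently.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def barycentric_weights(phi_t, phi_anchors):
    """Least-squares weights on the encoded anchors, rescaled to sum to one."""
    d = len(phi_anchors)
    dot = lambda u, v: sum(x * y for x, y in zip(u, v))
    # Normal equations: (A^T A) lambda = A^T phi_t, with anchors as columns of A.
    gram = [[dot(phi_anchors[a], phi_anchors[b]) for b in range(d)] for a in range(d)]
    rhs = [dot(phi_anchors[a], phi_t) for a in range(d)]
    lam = solve(gram, rhs)
    total = sum(lam)
    return [l / total for l in lam]


# Toy encoded anchors and a target lying in their span.
anchors = [[1.0, 1.0, 0.0], [0.0, 1.0, 1.0]]
target = [0.25, 1.0, 0.75]  # equals 0.25 * anchors[0] + 0.75 * anchors[1]
weights = barycentric_weights(target, anchors)
```

The sketch assumes linearly independent encoded anchors and a non-zero sum of the unconstrained weights.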

Once the optimal barycentric weights (Λⁱ) are computed for each element Tⁱ of the training sample, the approximations (i.e. the barycentres) go back through the decoder Ψ to reproduce the input as $\widetilde{\mathbf{T}}^i=\boldsymbol{\Psi}(\boldsymbol{\Theta})=\boldsymbol{\Psi}(\sum_{e=1}^{d}\lambda_e^{i}\boldsymbol{\Phi}(\mathbf{T}_a^e))$. The learning stage reduces to estimating the weights and biases of the layers of Φ and Ψ using an appropriate cost function that minimises the error between the input, Tⁱ, and the output, $\boldsymbol{\Psi}(\sum_{e=1}^{d}\lambda_e^{i}\boldsymbol{\Phi}(\mathbf{T}_a^e))$, so that

$$ \min_{\boldsymbol{\Phi},\boldsymbol{\Psi}}\; \mu\sum_{i=1}^{n}\left\Vert\mathbf{T}^i-\widetilde{\mathbf{T}}^i\right\Vert^2+\sum_{i=1}^{n}\left\Vert\boldsymbol{\Phi}(\mathbf{T}^i)-\boldsymbol{\Theta}(\mathbf{T}^i)\right\Vert^2. $$ (16)

In the training stage, we thus learn the non-linear interpolation scheme that best approximates the training samples in feature space, and the mapping between the barycentres and real space. The parameter μ controls the trade-off between these two objectives. In the evaluation phase only the decoder Ψ(Θ), which embeds the mapping between the barycentric weights and 3D temperature profiles, is used. As shown in the right panel of Fig. 6, the decoder is used as a generative model that is parameterised by the barycentric weights, Λ (for convenience we drop the subscript ‘i’ from now on). This model can easily be convolved to fit the observed 2D temperature profile so as to recover the true (3D) temperature profile. From now on, we refer to the decoder as the IAE model. The number and choice of anchor points and the model training will be discussed in the following Section.

4. Model training and fitting

4.1. Model training

We use a JAX (Bradbury et al. 2018) implementation to develop and train the IAE model. As a training sample, we use 200 randomly drawn T_3D profiles from the full sample of 315 extracted from the THREE HUNDRED PROJECT simulations.

Each profile in the training sample is first normalised to entries that sum to 1. The model is trained at the same fixed radial binning as that of T_3D profiles in the [0.02–2] R₅₀₀ radial range. For the training stage, several configurations were tested, among which the following choices were found to perform the best:

Network architecture: Both the encoder and the decoder are multi-layer perceptron (MLP) neural networks, which are composed of 2 layers, each of which has a number of hidden units equal to the input signal dimension (i.e. 48). We employ a smooth and non-monotonic Mish² activation function to introduce non-linearity and enhance the learning capacity of our deep neural network model. Since the IAE model employs a barycentre transformation of the training sample in encoder space to achieve dimensionality reduction, in this work, we only focus on a specific architecture with a fixed number of neurons per layer, corresponding to the dimension of the input samples. Further exploration of more general architectures is left for future work. For both encoder and decoder, the output $\mathbf{Z}^{l+1}$ of layer l can be expressed as

$$ \mathbf{Z}^{l+1}=\mathrm{Mish}(\mathbf{W}^{l}\otimes\mathbf{Z}^{l}+\mathbf{b}^{l})+\varepsilon^{l}\,\mathbf{Z}^{l}. $$ (17)

Here, the first term represents the standard output of the neural network, with W and b defined as the weight matrix and bias vector respectively. The second term represents skip connections (He et al. 2015; Huang et al. 2016), also known as residual connections. The skip connection acts by partially re-injecting Z up to a layer-dependent scalar factor ε. The residual injection factors are typically chosen to be small for low-level layers and larger for deeper layers. This approach helps mitigate the vanishing gradient phenomenon, which is commonly encountered during the training of deep networks. For each layer l of the encoder and decoder, we adopt the functional form of ε^l used in Bobin et al. (2023)

$$ \varepsilon^{l}=\varepsilon_{0}\left(2^{1/l}-1\right), $$ (18)

where ε₀ is a constant factor. By using skip connections with re-injection and layer-dependent scaling, the model can leverage both the direct information flow from earlier layers and the higher-level abstractions learned by the deeper layers, which can lead to improved performance and better training in deep neural networks.
Cost function: The cost function defined in Eq. (16) is composed of two terms. The first term measures the reconstruction error in real space, and the second term defines the error in feature space. The parameter μ allows one to tune the trade-off between these two terms. An accurate IAE model relies on both a low reconstruction error (i.e. first term of the training loss), and an efficient interpolation scheme in feature space. It has been emphasised in Bobin et al. (2023) that the second term helps improve the training process by constraining the feature space. In addition, depending on the problem and data at stake, it can help to increase the model accuracy by reducing the interpolation error in feature space, which in turn can reduce noise propagation at inference. In the present case, we noted that the trained model is not particularly sensitive to μ, which we set to 10 000 to minimise the reconstruction error in real space.
Training hyper-parameters: The batch size (the number of training profiles processed together before updating the neural network weights) is fixed to 32. The optimisation is performed by back-propagation using the standard Adam solver (Kingma & Ba 2014) with a step size of 10⁻³ and a number of epochs equal to 25 000. It is customary to further regularise the model by adding noise to the training samples, which limits over-fitting effects. To do that, Gaussian noise with mean zero and standard deviation of 2 × 10⁻³ is added to the samples at the training stage. The batch normalisation was achieved by normalising the input batch using a global mean of 0 and a standard deviation of 1. Finally, we fix the residual parameter (ε₀) to 0.1.
Number of anchor points: Anchor points serve as the basis on which temperature profiles are reconstructed using barycentric weights. Training with a small number of anchor points results in smoother (more regular) profiles; conversely, a large number of anchor points increases the model-to-data fidelity. Thus, the choice of the number of anchor points used during the training stage is essentially equivalent to choosing a regularisation parameter. For our study, the number of anchor points is fixed at five. These are generated by first dividing the training sample into five groups using a k-means clustering algorithm. The anchor points are then assumed to be the central points (centroids) of these five groups. By using five anchor points, we can ensure that the model-to-data residual remains below 10% over the observable radial range of ≈[0.02–1] R₅₀₀, as shown in Sect. 5, and at the same time, we can avoid any possible biases that could be introduced if the observations were shallow (bias-variance problem). Figure 7 shows the anchor points used in the neural network model. In Sect. 5.1.2, we will discuss the effect of increasing the number of anchor points.
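The layer update of Eqs. (17) and (18), described under the network architecture above, can be sketched as follows. This is a plain-Python illustration rather than the JAX implementation used in this work. The function names are hypothetical, the layer index is assumed to start at 1, and Mish is written in its standard form x tanh(ln(1 + eˣ)).

```python
import math


def mish(x):
    """Mish activation: x * tanh(softplus(x))."""
    # Numerically stable softplus, ln(1 + exp(x)).
    softplus = max(x, 0.0) + math.log1p(math.exp(-abs(x)))
    return x * math.tanh(softplus)


def residual_factor(layer, eps0=0.1):
    """Eq. (18), with the layer index counted from 1 (an assumption)."""
    return eps0 * (2.0 ** (1.0 / layer) - 1.0)


def layer_forward(z, weights, bias, layer, eps0=0.1):
    """Eq. (17): Mish(W z + b) + eps^l z, for a square weight matrix."""
    eps = residual_factor(layer, eps0)
    pre = [sum(w * x for w, x in zip(row, z)) + b for row, b in zip(weights, bias)]
    return [mish(p) + eps * x for p, x in zip(pre, z)]


# Toy two-unit layer with zero weights: only the skip connection contributes.
z_in = [1.0, -2.0]
z_out = layer_forward(z_in, [[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0], layer=1)
factors = [residual_factor(l) for l in (1, 2)]
```

The skip term requires the input and output of a layer to share the same dimension, which is the case for the fixed-width architecture adopted here.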

Fig. 7.

Five anchor points (example profiles), $\mathbf{T}_a^e$ with e = 1, …, 5, used in the IAE model.
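The selection of anchor points as k-means centroids can be illustrated with a basic Lloyd iteration. The sketch below is a generic, assumed implementation: the text does not state which k-means routine, initialisation or stopping rule was used, and the function name and toy profiles are hypothetical.

```python
import random


def kmeans_centroids(profiles, k, n_iter=50, seed=0):
    """Return k centroids of the profiles using plain Lloyd iterations."""
    rng = random.Random(seed)
    # Initialise the centroids with k distinct profiles drawn at random.
    centroids = [list(p) for p in rng.sample(profiles, k)]
    for _ in range(n_iter):
        groups = [[] for _ in range(k)]
        for p in profiles:
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            groups[dists.index(min(dists))].append(p)
        for g, members in enumerate(groups):
            if members:  # keep the previous centroid if a group is empty
                centroids[g] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


# Toy two-bin "profiles" forming two well-separated groups.
toy_profiles = [[0.0, 0.0], [0.0, 0.2], [10.0, 10.0], [10.0, 10.2]]
anchors = sorted(kmeans_centroids(toy_profiles, k=2))
```

In the setting of this work the same procedure would be applied with k = 5 to the normalised training profiles.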

Table 1 provides a comprehensive summary of our neural network architecture, along with the optimal hyper-parameters used in the study. For our implementation, we used publicly available source code hosted on a GitHub repository³ (Bobin et al. 2019, 2023).

Table 1.

Details on the neural network architecture and hyper-parameters used in this work.

Since our simulated sample is small, we use the term ‘validation’ to refer to testing of the model performance on simulated data (Sect. 5) before using it on real-world data, where the 3D temperature profiles are not directly available. We therefore used the training sample itself to evaluate the convergence of the cost function. Specifically, we monitored the cost function during training and found that after approximately 25 000 iterations, the cost function reached a point where it became flat. At this stage, we considered the training process to be sufficiently converged, and we terminated the training.

4.2. Model fitting

The IAE model is tested/fitted on the validation sample consisting of the remaining 115 galaxy clusters in the sample which were not used in the training stage. We have verified that the validation sample is representative of the full sample: about one-third of the validation clusters have cool cores, and the fractions of relaxed/disturbed clusters and smooth/irregular profiles are similarly distributed in the training and validation samples.

We employ Markov chain Monte Carlo (MCMC) analysis to estimate the parameters of the IAE models and use the publicly available emcee python package (Foreman-Mackey et al. 2013) for this purpose. The parameter estimation is undertaken on all the IAE parameters: the five anchor point weights Λ = [λ₁, λ₂, λ₃, λ₄, λ₅], and the amplitude (normalisation) parameter α.

The deconvolved temperature profile can be obtained from the trained non-parametric IAE model by minimising the following log-likelihood:

$$ \begin{aligned} \mathcal{L}(\boldsymbol{\Lambda},\alpha) &= \Gamma\times\mathrm{Tr}\left((\boldsymbol{\Lambda}-\bar{\boldsymbol{\Lambda}})^{\top}\otimes\boldsymbol{\Sigma}_{t}^{-1}\otimes(\boldsymbol{\Lambda}-\bar{\boldsymbol{\Lambda}})\right) \\ &\quad + \frac{1}{2}\times\mathrm{Tr}\left((\mathbf{T}^{\mathrm{val}}-\mathbf{T}^{\mathrm{IAE}})^{\top}\otimes\boldsymbol{\Sigma}_{o}^{-1}\otimes(\mathbf{T}^{\mathrm{val}}-\mathbf{T}^{\mathrm{IAE}})\right), \end{aligned} $$ (19)

where T^val is a temperature profile (2D or 3D) in the validation sample to be fitted, Σ_o is the error covariance matrix and T^IAE = C ⊗ IAE(Λ, α) is the corresponding convolved IAE model predicted profile. Tr and ⊤ represent the trace and transpose of the matrix respectively. The first term represents the mean proximity term, with Γ controlling its overall contribution to the likelihood. This enforces the solution to be a barycentre of the example profiles (i.e. it searches for the best approximation of the input signal with respect to the learned model/network). We find that Γ in the range 0.1–1 generally provides good results, and we therefore fix it to 1. $\bar{\boldsymbol{\Lambda}}$ (the mean value of the Λs) and Σ_t (the covariance matrix of the Λs) are computed from the training set by generating 100 Monte Carlo simulations for each cluster with log-normal noise, which are subsequently fitted to the IAE model using the Adam optimiser. This cost-effective regularisation strategy is introduced to avoid model extrapolation (physically unrealistic results), and enables us to have a robust and effective deconvolution algorithm. The second term is the standard likelihood related to some additive Gaussian noise perturbation. We have used flat prior distributions and Table 2 shows the prior ranges of all parameters. We used Getdist (Lewis 2019) with the chains generated by emcee to produce 2D contours and marginal posteriors.
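The structure of Eq. (19) can be made explicit with a short sketch. Everything below is illustrative: the trained decoder is replaced by a hypothetical linear stand-in (`toy_model`), the inverse covariance matrices are passed in precomputed, and the names and toy numbers are assumptions made here.

```python
def quad_form(x, inv_cov):
    """Return x^T inv_cov x for a vector x and a square matrix inv_cov."""
    n = len(x)
    return sum(x[i] * inv_cov[i][j] * x[j] for i in range(n) for j in range(n))


def objective(lam, alpha, t_obs, model, proj, lam_mean, inv_cov_lam, inv_cov_obs, gamma=1.0):
    """Eq. (19): mean proximity term plus Gaussian data term."""
    t_3d = model(lam, alpha)
    # Apply the operator C to the model profile.
    t_model = [sum(c * t for c, t in zip(row, t_3d)) for row in proj]
    d_lam = [l - m for l, m in zip(lam, lam_mean)]
    d_t = [o - m for o, m in zip(t_obs, t_model)]
    return gamma * quad_form(d_lam, inv_cov_lam) + 0.5 * quad_form(d_t, inv_cov_obs)


# Hypothetical stand-in for the trained IAE decoder: a scaled linear
# combination of two toy anchor profiles sampled in three radial bins.
toy_anchors = [[1.0, 0.8, 0.5], [1.2, 1.0, 0.9]]


def toy_model(lam, alpha):
    return [alpha * sum(l * a[k] for l, a in zip(lam, toy_anchors)) for k in range(3)]


identity2 = [[1.0, 0.0], [0.0, 1.0]]
identity3 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
lam_mean = [0.5, 0.5]
t_obs = toy_model(lam_mean, 2.0)
at_truth = objective(lam_mean, 2.0, t_obs, toy_model, identity3, lam_mean, identity2, identity3)
shifted = objective([0.6, 0.4], 2.0, t_obs, toy_model, identity3, lam_mean, identity2, identity3)
```

Since this quantity is minimised, a sampler that expects a log-probability would presumably be given its negative, with the flat priors of Table 2 applied separately.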

Table 2.

Flat priors used for the IAE model parameters.

The IAE model testing was undertaken by fitting it to the T_3D profiles and T_2D profiles built in Sect. 2. For simplicity, we ignore the PSF in the testing phase. We tested and validated our model by considering three fitting cases:

3D–3D fit with fine binning: the T_3D profiles are directly fitted to recover the best-fitting 3D profiles from the IAE model. The goal, in this case, is to assess the ability of the IAE model to reproduce the input 3D temperature profile shape. In this case, as there is no projection, C in Eq. (19) is simply an identity matrix of size 48 × 48 (C_48, 48).
2D–3D fit with fine binning: we fitted the 2D projected temperature profiles with the IAE model convolved with a projection matrix C. In this case, we wish to assess how well the IAE model recovers the intrinsic 3D temperature profile when only 2D projected data are available. We used the same 2D radial logarithmic binning as that of the T_3D profiles, meaning that C has dimensions of 48 × 48 (C_48, 48). For this testing phase, we assume standard emission measure weighting to calculate the elements of C.
2D–3D fit with coarse binning: the 2D projected temperature profiles having coarse logarithmic radial binning of twelve or six points up to R₅₀₀ were fitted to the IAE model convolved with matrix C. Here, the goal is to assess the ability of the IAE model to recover the intrinsic 3D temperature profile when only a coarse 2D projected profile, similar to that obtained from present-day observations, is available. In this case C has dimensions of 12 × 48 (C_12, 48) and 6 × 48 (C_6, 48) for the 2D temperature profiles with twelve and six bins respectively. As above, we use standard emission measure weighting to calculate the elements of C. In Sect. 5.4, we will also consider the Mazzotta et al. (2004) temperature-dependent spectroscopic-like weights.
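One common way to build an emission-measure-weighted projection matrix of the kind used in the cases above is sketched here. It is an assumed construction, not necessarily the one used in this work: the weight of 3D shell j in 2D annulus i is taken as n_e,j² V_ij, with V_ij the geometric volume of the shell intercepted by the annulus, the gas is truncated at the outermost shell, and the density values are placeholders.

```python
import math


def projection_matrix(edges_3d, edges_2d, density):
    """Rows: 2D annuli; columns: 3D shells; each row is normalised to one."""

    def cap(r, big_r):
        # (r^2 - R^2)^(3/2), set to zero when the sphere lies inside the cylinder.
        return max(r * r - big_r * big_r, 0.0) ** 1.5

    rows = []
    for i in range(len(edges_2d) - 1):
        r_in, r_out = edges_2d[i], edges_2d[i + 1]
        row = []
        for j in range(len(edges_3d) - 1):
            s_in, s_out = edges_3d[j], edges_3d[j + 1]
            # Volume of the shell [s_in, s_out] seen through the annulus [r_in, r_out].
            vol = (4.0 * math.pi / 3.0) * (
                cap(s_out, r_in) - cap(s_out, r_out) - cap(s_in, r_in) + cap(s_in, r_out)
            )
            row.append(density[j] ** 2 * vol)
        total = sum(row)
        rows.append([w / total for w in row])
    return rows


# Toy grid with three shells and identical 2D and 3D binning.
edges = [0.0, 1.0, 2.0, 3.0]
c_matrix = projection_matrix(edges, edges, density=[1.0, 1.0, 1.0])
t_2d = [sum(c * t for c, t in zip(row, [5.0, 5.0, 5.0])) for row in c_matrix]
```

With identical 2D and 3D binning the matrix is upper triangular, since an annulus only intercepts shells at equal or larger radii, and an isothermal profile projects onto itself.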

For case 3, which seeks to mimic the typical characteristics of 2D temperature profiles measured with current X-ray satellites, we assume that the uncertainties increase linearly with radius. Based on our previous experience with XMM-Newton and Chandra observations, we assume temperature profile uncertainties that increase from 5% to 25% in the [0.02–1] R₅₀₀ radial range for the 12-bin profiles, and from 10% to 30% for the 6-bin profiles. We built a diagonal error covariance matrix, i.e. Σ_o, using this approximation. This was then incorporated in the likelihood and acts as a weighting function, giving more weight to the inner regions in the fit. In general, regardless of whether the errors increase monotonically, the inclusion of errors in the likelihood leads to an overall improvement in the fit. For cases 1 and 2 (fine binning), we do not consider errors in the likelihood and as such Σ_o is a unit matrix. Both model training and fitting a single profile with MCMC can be completed within a few minutes on a 16-core CPU.
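A diagonal error covariance matrix of this kind can be assembled as in the sketch below. The interpolation is assumed here to be linear in radius between the first and last bin, and the function name and toy profile are hypothetical.

```python
def linear_error_covariance(radii, temps, frac_inner, frac_outer):
    """Diagonal covariance with fractional errors growing linearly with radius."""
    n = len(radii)
    r0, r1 = radii[0], radii[-1]
    cov = [[0.0] * n for _ in range(n)]
    for i, (r, t) in enumerate(zip(radii, temps)):
        frac = frac_inner + (frac_outer - frac_inner) * (r - r0) / (r1 - r0)
        cov[i][i] = (frac * t) ** 2  # variance of bin i
    return cov


# Toy four-bin profile (radii in units of R500, arbitrary temperatures),
# using the 5% to 25% range quoted for the 12-bin profiles.
radii = [0.02, 0.1, 0.5, 1.0]
temps = [4.0, 5.0, 4.0, 3.0]
cov = linear_error_covariance(radii, temps, 0.05, 0.25)
```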

In the objects where the temperature profiles in the first few inner bins were not reliable (i.e. having < 100 gas particles), these bins were not considered in the fitting. However, no such constraint was applied during the training stage, as one expects the network to learn only the fundamental structure of the data rather than the noise.

5. Model evaluation

In this Section, we discuss the robustness of the non-parametric IAE model reconstruction using different schemes. We check the performance of our model with respect to the radial binning, which is important since the number of radial bins corresponding to the observations is much lower compared to the resolution of the temperature profiles in the simulated sample. We also consider different weighting schemes in the fit. The model is tested with the 115 temperature profiles in the validation sample.

The performance of the model was evaluated by comparing the original 3D and 2D temperature profiles with those recovered from the IAE model. For each case, we calculated the median fractional residual and its associated 1-σ dispersion (16th–84th percentile range) at three scaled radii (0.02 R₅₀₀, R₅₀₀, and 2 R₅₀₀), and over the full radial range.
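The summary statistics used in this Section can be computed as in the following sketch. The sign convention (recovered minus true, divided by true) and the "inclusive" percentile estimator are assumptions made here; the function name and toy profiles are hypothetical.

```python
from statistics import median, quantiles


def residual_summary(true_profiles, fitted_profiles, bin_index):
    """Median and 16th/84th percentiles of the fractional residual in one bin."""
    res = [
        (f[bin_index] - t[bin_index]) / t[bin_index]
        for t, f in zip(true_profiles, fitted_profiles)
    ]
    pct = quantiles(res, n=100, method="inclusive")  # 99 percentile cut points
    return median(res), pct[15], pct[83]


# Toy sample of five two-bin profiles.
true_profiles = [[1.0, 2.0]] * 5
fitted_profiles = [[0.90, 2.0], [0.95, 2.0], [1.00, 2.0], [1.05, 2.0], [1.10, 2.0]]
med, lo, hi = residual_summary(true_profiles, fitted_profiles, bin_index=0)
```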

5.1. 3D-3D reconstruction of temperature profiles

5.1.1. Overall performance

We first consider the simplest case, corresponding to the 3D–3D fit with fine binning, where we directly fitted the IAE model to the intrinsic 3D gas mass-weighted temperature profiles (T_3D), ignoring projection effects. The left hand panel of Fig. 8 shows the fractional residuals (ΔT_3D/T_3D) between the input (true) and recovered temperature profiles for all the individual clusters in the validation sample. The median fractional residual profile along with 1-σ dispersion (16th–84th percentile range) are also plotted.

Fig. 8.

Fractional residuals for 115 clusters in the validation sample with IAE for the 3D–3D fit. The three horizontal dashed black lines represent zero and ±5% fractional residuals; the vertical dashed black lines represent R₅₀₀. Left panel: The grey lines show the individual fractional residuals of all the clusters. The solid black line and shaded black region show the median and 1-σ dispersion of the fractional residual distribution, respectively. The histogram shows the distribution of fractional residuals over all radii. Right panel: The cyan and magenta lines in the top panel show the fractional residuals of the MR20 and MD20 sub-samples, respectively. The green and orange lines in the bottom panel show the fractional residuals of the MS20 and MI20 sub-samples, respectively. Shaded regions show the corresponding 1-σ dispersion of the fractional residual distribution. The histograms show the distribution of fractional residuals over all radii. Regions enclosed by the solid black lines show the 1-σ dispersion of the fractional residual of the full validation sample. The IAE model can reconstruct 3D temperature profiles with a fractional difference of about 5% across nearly the full radial range.

The median fractional residual profile is found to be close to zero throughout the radial range: at radii of 0.02 R₅₀₀, R₅₀₀, and 2 R₅₀₀, the values are −0.010 ± 0.060, 0.010 ± 0.051 and −0.020 ± 0.120, respectively. Moreover, the median fractional residual over the full radial range is found to be −0.001 ± 0.042. The 1-σ dispersion in the fractional residuals is nearly constant at around ±5%, except beyond 1.5 R₅₀₀.

Within the validation sample, the fractional residuals of the 20 most relaxed/disturbed clusters (MR20/MD20) are displayed at the top in the right panel of Fig. 8, while those of the 20 most smooth/irregular profiles (MS20/MI20) are shown at the bottom. In all cases, the median fractional residuals are again consistent with zero. The 1-σ dispersion in fractional residuals over all radii for the MD20 (MI20) sub-sample is ±0.045 (±0.053), which is larger, as expected, than the dispersion of ±0.032 (±0.029) found in the MR20 (MS20) sub-sample. This is reflected in the histogram of the residuals of the MR20 (MS20) sub-sample, which is more strongly peaked at zero, and hence narrower, than that of the MD20 (MI20) sub-sample. In general, we find that for disturbed clusters and for irregular profiles, the IAE model smooths out the sharp small-scale variations in the 3D temperature profiles.

5.1.2. Anchor point weights, λ_i

We have shown above that the IAE model is able to recover the average shape of the 3D profiles with high accuracy. In this context, it is interesting to consider how the anchor point weights, λ, change according to the characteristics of the profile under consideration. Figure 9 shows the temperature profiles of the most relaxed/disturbed clusters in the validation sample, classified according to the χ_D criterion discussed in Sect. 2.2, and of the most regular/irregular profiles in the validation sample, classified according to the χ_S criterion introduced in Sect. 2.3. The reconstructed median temperature profile and fractional residuals obtained with the IAE using MCMC are also shown. The IAE model produces smoother profiles on small scales by ignoring the fluctuations on such scales. At large scales, the IAE model is able to reproduce the underlying structure of the input temperature profiles. The bottom left hand panel shows the fractional residuals, which can be seen to be less than 5% over most of the radial range. The top panel of Fig. A.2 shows the corresponding posterior distribution of the parameters of the IAE model obtained using MCMC. The parameters are seen to be well constrained and, as anticipated, the most relaxed cluster profile (or the most regular profile) has tighter constraints than the most disturbed cluster (or the most irregular profile), which has relatively larger contour levels. Figure A.3 shows the comparison of the true and reconstructed temperature profiles of 20 example clusters in the validation sample.

Fig. 9.

Results for the most relaxed/disturbed clusters and for the most smooth/irregular profiles with IAE for the 3D-3D fit. Left panel: Dashed cyan and magenta lines show the true 3D temperature profiles of the most relaxed and disturbed clusters respectively in the validation sample. Similarly, dashed green and orange lines show the most smooth and irregular true 3D temperature profiles respectively. The solid lines and the corresponding shaded regions show the median and 1-σ dispersion of the reconstructed temperature profile obtained from the IAE model using MCMC.

We also tested the effect on the IAE model of increasing the number of anchor points. We found that the model fidelity can be improved by increasing the number of anchor points and that the choice of 20 anchor points reduces the residuals significantly. Figure A.4 shows the recovered ensemble plot of fractional residuals using the IAE model with 20 anchor points for the full validation sample, and for the different sub-samples. There is a significant improvement in the average fractional residual in all the cases. The median of the fractional residuals for the full sample over the entire radial range is found to be 0.002 ± 0.030, with a dispersion about 25% smaller than that of the fiducial IAE model obtained with five anchor points. However, the usefulness of this higher dimensional model is limited to simulations only. The temperature profiles that can be obtained from current X-ray satellites generally have temperature data at around 8–15 points for typical deep observations. Use of the IAE model with 20 anchor points in cases such as this would result in over-fitting and/or large variance.

5.2. 2D–3D reconstruction of temperature profiles with fine 2D binning

We now discuss the efficiency of the IAE model when fitting the 2D (projected) temperature profiles, defined at the same radial grid as in the previous case and at which the IAE model is defined (2D–3D fit with fine binning case). Here, the 3D IAE model is convolved with the standard emission-measure weighting matrix. The resulting projected model is then fitted to the input 2D temperature profiles, in order to reconstruct the 3D temperature profiles.
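The projection step described above can be sketched as follows. This is a hedged illustration rather than the implementation used here: it assumes the standard onion-peeling geometry, in which the 3D profile is piecewise constant in concentric shells sharing their edges with the 2D annuli, and it neglects emission beyond the outermost shell. The function names and the `density` input are hypothetical.

```python
import numpy as np

def shell_annulus_volumes(edges):
    """V[i, j]: volume of spherical shell j seen in projection within annulus i.

    edges : increasing radii shared by the shells and the annuli (length n + 1).
    """
    edges = np.asarray(edges, dtype=float)
    lo, hi = edges[:-1], edges[1:]

    def f(r, big_r):
        # max(r^2 - R^2, 0)^(3/2), with annuli along rows and shells along columns
        return np.clip(r[None, :] ** 2 - big_r[:, None] ** 2, 0.0, None) ** 1.5

    return (4.0 * np.pi / 3.0) * (f(hi, lo) - f(hi, hi) - f(lo, lo) + f(lo, hi))

def project_em_weighted(t3d, density, edges):
    """Emission-measure weighted 2D temperature (weights: density^2 x volume)."""
    t3d = np.asarray(t3d, dtype=float)
    density = np.asarray(density, dtype=float)
    w = shell_annulus_volumes(edges) * density[None, :] ** 2
    return (w @ t3d) / w.sum(axis=1)

# Toy check: an isothermal cluster must project to the same temperature.
edges = np.linspace(0.0, 2.0, 21)  # radii in units of R500 (illustrative grid)
t2d_flat = project_em_weighted(np.full(20, 4.0), np.ones(20), edges)
```

Because annulus i only intersects shells at equal or larger radii, the matrix is upper triangular, which is why the innermost 2D bins carry information from the whole line of sight.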

Since projection results in smoother 2D temperature profiles, washing out fluctuations at small scales, one expects the 3D reconstruction obtained from the 2D profile to be more regular compared to what was found in the previous section. It is also important to note that projection effects are dominant in the inner regions (especially in CC clusters), which can introduce degeneracy into the reconstructed 3D temperature profiles in the central region. However, both the 2D and 3D profiles of CC clusters will always display a central temperature dip. Thus one can expect a larger scatter in the 3D reconstructed temperature profiles in the central regions, as compared to the 3D-3D fitting case.

In Fig. 10, we show the ensemble plot of fractional residuals of the 2D (top panel) and 3D (bottom panel) temperature profiles for the validation sample (left panel) and sub-samples (right panel). The fractional residuals in 2D space (where the fitting is actually performed) are smaller compared to the 3D temperature residuals, as expected.

Fig. 10.

Fractional 2D and 3D residuals for 115 clusters in the validation sample with IAE for the 2D–3D fit (fine binning). The three horizontal dashed black lines represent zero and ±5% fractional residuals; the vertical dashed black lines represent R₅₀₀. Left panel: Grey lines show the individual 2D (top panel) and 3D (bottom panel) residuals of all the clusters. The solid black line and shaded black region in the left panels show the median and 1-σ dispersion of the 2D (top panel) and 3D (bottom panel) residual distribution, respectively. The histogram shows the distribution of residuals over all radii. Right panel: The cyan and magenta lines show the 2D (top panel) and 3D (bottom panel) residuals of the MR20 and MD20 sub-samples respectively. Green and orange lines show the 2D (top panel) and 3D (bottom panel) residuals of the MS20 and MI20 sub-samples respectively. Shaded regions show the corresponding 1-σ dispersion of the residual distribution. Regions enclosed by the solid black lines show the 1-σ dispersion of the median residual of the full validation sample. The histograms show the distribution of residuals over all radii. When given 2D profiles as input, the IAE model can reconstruct 3D temperature profiles with a fractional difference of about 5% across nearly the full radial range.

For the 2D fit, we find median fractional residuals at radii 0.02 R₅₀₀, R₅₀₀, and 2 R₅₀₀ to be 0.009 ± 0.027, 0.004 ± 0.040 and −0.018 ± 0.095 respectively. The median of fractional residuals for the full sample and over the entire radial range is found to be −0.002 ± 0.027. Unlike in the 3D–3D case, where the dispersion around the median was slightly larger in the outer regions only, here, it also increases towards the centre, as expected from the arguments given above. The dispersion is about ±10% at the first bin.

For the 3D reconstruction, we find median fractional residuals at 0.02 R₅₀₀, R₅₀₀, and 2 R₅₀₀ of 0.021 ± 0.110, 0.014 ± 0.052 and −0.018 ± 0.095, respectively. The median of fractional residuals for the full sample and over the entire radial range is found to be −0.003 ± 0.045. Moreover, as in the 3D–3D case, the histograms of the fractional residuals over all radii of the MR20 and MS20 sub-samples are more narrowly peaked than those of the MD20 and MI20 sub-samples, indicating again that the profiles of more relaxed clusters, or intrinsically smoother 2D temperature profiles, are reconstructed with higher fidelity in general.

In the left panel of Fig. 11, we show the recovered temperature profiles for the extreme cases of the most relaxed/disturbed cluster and the most smooth/irregular profiles in the validation sample. As in the 3D–3D case, the difference between the input and recovered temperature profiles is less than 5% over most of the radial range. The bottom panel of Fig. A.2 shows the corresponding posterior distribution of the IAE model parameters. Here also, all the parameters are well constrained. The comparison to the equivalent parameter contours for the 3D–3D case, also shown on the plot, shows that, understandably, the 2D–3D reconstruction has slightly larger contour intervals than the 3D–3D one.

Fig. 11.

Results for the most relaxed/disturbed clusters and for the most smooth/irregular profiles with IAE for the 2D–3D fit (fine binning). Left panel: Dashed cyan and magenta lines show the true 3D temperature profiles of the most relaxed and disturbed clusters in the validation sample, respectively. Dashed green and orange lines show the most smooth and irregular true 3D temperature profiles, respectively. The solid lines and the corresponding shaded regions show the median and 1-σ dispersion of the reconstructed temperature profile obtained from the IAE model using MCMC. The dotted lines show the 2D temperature profiles actually used in the fitting.

5.3. 2D–3D reconstruction of temperature profiles with an observation-like binning

So far we have tested the IAE model only with high resolution simulated temperature profiles. However, real observed 2D temperature profiles are of much lower spatial resolution, have fewer data points, and are generally detected up to R₅₀₀ only. In this section, we test the accuracy of the IAE model to recover simulated temperature profiles with resolutions similar to those found with the current X-ray observations (2D–3D fit with coarse binning case).

First, we consider a case where we fitted 2D temperature profiles having resolutions similar to that expected with moderately deep X-ray observations. In such observations, we normally expect around twelve annular data points, limited to within R₅₀₀. We also impose more realistic errors on the 2D temperature profiles: they are assumed to increase linearly with radius from 5% in the innermost bin to 25% in the outermost bin. Later in this Section, we will also consider a fitting case with 2D temperature profiles defined at only six radial points within R₅₀₀, with errors ranging from 10% to 30% from the innermost to the outermost radial bin.
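The error model described above, and the way it weights a fit, can be illustrated with the short sketch below. It is written under stated assumptions and is not the fitting code used here: the errors are taken to be linear in radius (rather than in bin index), the twelve bin centres are assumed to be log-spaced in [0.02–1] R₅₀₀, and the likelihood is an uncorrelated Gaussian with its normalisation constant dropped. All function names are hypothetical.

```python
import numpy as np

def linear_fractional_errors(radii, inner=0.05, outer=0.25):
    """Fractional errors rising linearly with radius from `inner` to `outer`."""
    radii = np.asarray(radii, dtype=float)
    frac = (radii - radii[0]) / (radii[-1] - radii[0])
    return inner + (outer - inner) * frac

def gaussian_log_likelihood(t_model, t_obs, sigma):
    """Uncorrelated Gaussian log-likelihood, up to an additive constant."""
    t_model, t_obs, sigma = (np.asarray(a, dtype=float)
                             for a in (t_model, t_obs, sigma))
    return -0.5 * np.sum(((t_obs - t_model) / sigma) ** 2)

# Twelve-bin case: 5% errors in the innermost bin, 25% in the outermost.
radii = np.geomspace(0.02, 1.0, 12)  # assumed bin centres, in units of R500
t_obs = np.full(12, 6.0)             # mock 2D temperatures in keV
sigma = linear_fractional_errors(radii) * t_obs
```

For the six bin case, the same function would be called with `inner=0.10` and `outer=0.30`. Since each residual is divided by its error, the inner bins dominate the fit.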

5.3.1. Twelve bin case

In Fig. 12, we show the ensemble plot of the 2D and 3D fractional residuals for the 2D–3D fit with the coarse binning case, by considering twelve 2D temperature data points within R₅₀₀. Even with the lower resolution, we find that within the 2D fitting range (i.e. up to R₅₀₀), the 3D fractional residuals are still close to zero, with a 1-σ dispersion of about ±5%, as in the previous cases. The median 3D fractional residuals at radii 0.02 R₅₀₀, R₅₀₀, and 2 R₅₀₀ are found to be 0.003 ± 0.071, −0.010 ± 0.064 and −0.070 ± 0.185, respectively. The median of fractional residuals for the full sample and over the entire radial range is found to be −0.006 ± 0.051. Beyond R₅₀₀, where no 2D temperature data were available to fit, and thus where the constraints on the 3D reconstruction are due only to projection effects, the scatter increases with radius, reaching a 1-σ dispersion of ±20% at the last bin (2 R₅₀₀). Moreover, beyond 1.5 R₅₀₀, 3D temperature profiles are underestimated by about 7%. However, it is important to mention that the true 3D temperature profiles mainly lie within the 1-σ dispersion of the reconstructed temperature profiles. As in the fine binning case, the dispersion in the 2D fractional residuals is much smaller than that of the 3D reconstruction.

Fig. 12.

Fractional residuals for 115 clusters in the validation sample with IAE for the 2D–3D fit (coarse binning) using 2D temperature profiles defined at twelve radial bins up to R₅₀₀. Colour coding is the same as in Fig. 10. When given 2D temperature profiles with a binning scheme typical for moderately deep X-ray observations, the IAE model can still reconstruct 3D temperature profiles with fractional differences of about 5% throughout the 2D fitting range (i.e. [0.02–1] R₅₀₀).

For the 2D fit, we find median fractional residuals at radii 0.02 R₅₀₀ and R₅₀₀ to be 0.001 ± 0.008 and −0.026 ± 0.073, respectively. The median of fractional residuals for the full sample and over the entire radial range is found to be −0.002 ± 0.026 for the 2D profiles, similar to that found in the 2D–3D fit with the fine binning case. Since we assumed that the errors increase radially outwards, as in real observations, putting more weight on the inner regions in the fit, the constraints in the inner region are better than in the 2D–3D fit with the fine binning case. For comparison, Fig. A.5 shows the 3D fractional residuals for the case where we do not consider error bars in the fit. Here, we find that the scatter is increased in the inner regions compared to both the 2D–3D fit with the fine binning case (previous case) and the coarse binning case (present case).

As in the previous cases, the histogram of the residuals of the MR20 (MS20) sub-sample has a stronger peak around zero and reduced wings compared to the MD20 (MI20) sub-sample. For example, the 1-σ dispersion in 3D fractional residuals over all radii for MD20 (MI20) sub-sample is found to be ±0.055 (±0.065) and for MR20 (MS20) sub-sample it is ±0.041 (±0.036).

In the left hand panel of Fig. 13, we show the IAE recovered temperature profiles of the most relaxed and disturbed cluster and of the most regular and irregular profile in the validation sample. As in previous cases, here also the difference between the input and recovered temperature profiles is less than 5% in the 2D fitting range of [0.02–1] R₅₀₀. Beyond R₅₀₀, as expected, the residuals can be high. The top panel of Fig. A.6 shows the corresponding posterior distribution of the parameters. One finds that the confidence intervals for the IAE model parameters are larger than in the fine binning cases (i.e. cases 1 and 2). However, we were still able to put relatively good bounds on the parameters, which are represented by nearly Gaussian posterior distributions. Figure A.7 shows the comparison of true 2D and 3D temperature profiles and the reconstructed temperature profiles of 20 example clusters in the validation sample for the twelve bin case.

Fig. 13.

Results for the most relaxed and disturbed clusters and for the most smooth and irregular profile with the 2D–3D fit (coarse binning) using 2D temperature profiles defined at twelve (left panel) and six radial bins (right panel) up to R₅₀₀. Errors in the 2D temperature profiles are assumed to increase linearly with radius from 5% (10%) in the innermost bin to 25% (30%) in the outermost bin for the twelve (six) bin case. Colour coding is the same as in Fig. 11.

5.3.2. Six bin case

The 2D and 3D fractional residuals for a fit considering only six data points in the range [0.02–1] R₅₀₀, with errors linearly increasing from 10% in the innermost bin to 30% in the outermost bin, are shown in Fig. 14. We find that the median 2D and 3D fractional residuals are still consistent with zero in the 2D fitting range. However, as expected, the 1-σ dispersion is larger than in the previous cases and temperature profiles are underestimated by about 8% beyond 1.5 R₅₀₀ (where there are no 2D data). We find median 2D fractional residuals at radii 0.02 R₅₀₀ and R₅₀₀ to be −0.006 ± 0.022 and −0.022 ± 0.070, respectively. For the 3D reconstruction, we find median fractional residuals at 0.02 R₅₀₀, R₅₀₀, and 2 R₅₀₀ to be 0.05 ± 0.128, −0.004 ± 0.090 and −0.080 ± 0.235, respectively. The median of fractional residuals for the full sample and over the entire radial range is found to be −0.008 ± 0.038 and −0.014 ± 0.075 for the 2D and 3D profiles, respectively. In the right panel of Fig. 13, we show the temperature profiles of the most relaxed and disturbed cluster and of the most regular and irregular profile in the validation sample. We find that even with only six data points in the fit, the IAE is still able to recover the 3D temperature profiles with residuals less than 10% over most of the cluster region. However, the confidence intervals of the reconstructed profiles and IAE parameters, shown in the bottom panel of Fig. A.6, are larger than in the previous cases. Finally, Fig. A.8 shows the comparison of true 2D and 3D temperature profiles and the reconstructed temperature profiles of 20 clusters in the validation sample for the six bin case.

Fig. 14.

Fractional residuals for 115 clusters in the validation sample with IAE for the 2D–3D fit (coarse binning) using 2D temperature profiles defined at six radial bins up to R₅₀₀. Colour coding is the same as in Fig. 10. For simplicity, we have not shown the sub-sample cases. Even when given 2D temperature profiles with a binning scheme typical for shallow X-ray observations, the IAE model can still reconstruct 3D temperature profiles with fractional differences of about 5% throughout the 2D fitting range (i.e. [0.02–1] R₅₀₀).

For comparison, Table 3 provides the median fractional residuals obtained for the different cases of fitting schemes discussed in this Section. Similarly, Table 4 shows the best-fitting parameters of IAE model for different cases obtained with MCMC. One can see that as we go from the high resolution simulated profiles to lower resolution observational-like profiles, the dispersion in fractional residuals and parameter estimates increases.

Table 3.

Median fractional 3D and 2D residuals obtained at 0.02 R₅₀₀ (third column), R₅₀₀ (fourth column), 2 R₅₀₀ (fifth column), and over the full radial range (sixth column) for the fitting schemes and samples in Sects. 5.1–5.3.

Table 4.

Best fit results for the IAE parameters derived with the MCMC for the fitting schemes and samples considered in Sects. 5.1–5.3.

We also checked the performance of the model with other binning schemes and found the performance of the IAE model to be robust. In particular, we checked the performance by considering five 2D data points up to 0.5 R₅₀₀ in the fit. We find that the IAE model is able to reproduce the results with an average fractional difference of about 5% up to 0.5 R₅₀₀ which then increases with radius and becomes about 10% at R₅₀₀ and 25% at 2 R₅₀₀. We also considered an IAE model with 20 anchor points, applied to the two observation-like cases, and found that its performance is very similar to that of our fiducial five-parameter IAE model, unlike in the 3D-3D case where it is found to have better performance. This implies that increasing the number of anchor points does not necessarily increase the model fidelity for these cases, as one must also have higher resolution input 2D temperature profiles for the model to be fitted against.

5.4. 2D–3D reconstruction of temperature profiles with spectroscopic-like weighting

In the previous Sections, we have only focused on 3D temperature reconstruction from the IAE model using 2D temperature profiles derived using standard emission-measure weights (Mathiesen & Evrard 2001). In this Section, we consider more complex spectroscopic-like weighting (Mazzotta et al. 2004), which has a stronger dependence on the 3D temperature profiles. This makes deconvolution a more complicated problem and, therefore, it is important to check the accuracy of the IAE model in this case.
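To make the difference between the two weighting schemes concrete, the sketch below evaluates both on a toy line of sight. It assumes the usual spectroscopic-like weight of Mazzotta et al. (2004), w = n² T^(−3/4), against the emission-measure weight w = n². The function name, the `volumes` matrix (annuli along rows, shells along columns) and the toy numbers are hypothetical.

```python
import numpy as np

def weighted_temperature(t3d, density, volumes, alpha=0.0):
    """Projected temperature with weights density^2 * T^(-alpha) * volume.

    alpha = 0    : emission-measure weighting.
    alpha = 0.75 : spectroscopic-like weighting (assumed Mazzotta et al. form).
    volumes[i, j] is the volume of shell j seen through annulus i.
    """
    t3d = np.asarray(t3d, dtype=float)
    density = np.asarray(density, dtype=float)
    volumes = np.asarray(volumes, dtype=float)
    w = volumes * (density ** 2 * t3d ** (-alpha))[None, :]
    return (w @ t3d) / w.sum(axis=1)

# Toy line of sight: one annulus crossing two equal shells at 2 and 8 keV.
t3d = np.array([2.0, 8.0])
density = np.array([1.0, 1.0])
volumes = np.array([[1.0, 1.0]])
t_ew = weighted_temperature(t3d, density, volumes, alpha=0.0)
t_sl = weighted_temperature(t3d, density, volumes, alpha=0.75)
```

In this toy case the spectroscopic-like value falls below the emission-measure value because the weights favour the cooler gas. Since the weights themselves depend on the unknown 3D temperature, the forward model is no longer linear in T.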

In Fig. 15, we show the fractional residuals for 2D and 3D temperature profiles between the input and IAE recovered temperature profiles in the 2D–3D fit with twelve data points in the range [0.02–1] R₅₀₀. We find the median fractional residuals at radii 0.02 R₅₀₀ and R₅₀₀ to be 0.002 ± 0.008 and −0.027 ± 0.065, respectively, for the 2D profiles. For the 3D reconstruction, we find median fractional residuals at 0.02 R₅₀₀, R₅₀₀, and 2 R₅₀₀ to be 0.040 ± 0.072, −0.003 ± 0.065 and −0.060 ± 0.180, respectively. We see that on average there is a small but noticeable 4% over-estimation in the 3D temperature profiles in the first four radial bins. This could be caused by the presence of dense and cold substructures that in the simulated objects could lower the central value of the 3D spectroscopic-like temperature in the innermost region, where the impact of this formulation is the strongest (see e.g. Fig. 3 of Rasia et al. 2014). Similarly, beyond R₅₀₀ the temperature profiles are underestimated by 8% on average. This effect could also play a role in the central mismatch: since the convolution is temperature dependent, the slight overestimation in the first few innermost bins may also be linked to the underestimation of the temperature profiles in the outermost bins. This suggests the importance of deriving accurate estimates of the temperature profiles beyond the 2D fitting range. A more detailed treatment is beyond the scope of this paper, and we leave the investigation of these possible explanations to future work. However, we do find that the median residual is consistent with zero over the full radial range of [0.02–2] R₅₀₀ and, as in the previous cases, for the majority of the clusters the true 3D temperature profiles lie within the 1-σ dispersion of the IAE recovered temperature profiles. The median of fractional residuals for the full sample and over the entire radial range is found to be −0.003 ± 0.038 and −0.003 ± 0.075 for the 2D and 3D profiles, respectively.

Fig. 15.

Fractional residuals for 115 clusters in the validation sample with IAE for the 2D–3D fit (coarse binning) using spectroscopic-like 2D temperature profiles defined at twelve radial bins up to R₅₀₀. For simplicity, we have not shown the sub-sample cases.

5.5. Comparison of IAE model to a parametric model

In this Section, we use the validation sample of 115 clusters to compare the non-parametric results from the IAE model to those obtained from a parametric temperature model. We first obtain the best-fitting 3D temperature profiles from the Vikhlinin et al. (2006) model (Eq. (2)) considering the prior range on each parameter given in Table 5, and using the same binning schemes as used for the IAE model in previous sections, assuming a spectroscopic-like weighting scheme. Temperature profiles were first scaled by T_X before fitting them to the parametric model, so as to bring the parameter T₀ to a comparable scale. We find that in the 2D–3D (or 3D–3D) fine binning case, the 3D reconstruction is poor compared to the observational-like cases where the fitting is weighted according to the errors, which increase with radius. We also tried to fit the temperature profiles in log space, which could effectively address any heteroscedasticity issues and stabilise the variance over the large radial range. However, this still did not improve the model reconstruction in the 2D–3D (or 3D–3D) fine binning case. This indicates that such a parametric model struggles to accurately capture the true underlying patterns in the noiseless data, or when the noise covariance is negligible. By weighting the fitting according to the errors, which reflect the inherent uncertainties in the data and which increase with radial distance, the model can better adapt to the complexities of the data, resulting in improved performance. The significant improvement achieved by incorporating error covariance can be seen in Fig. 16. Even with coarse resolution, as discussed in the next paragraph, the fit is markedly better when realistic error covariance is considered during the fitting process. The sub-optimal performance of the parametric model can also be attributed to its highly non-linear nature and the strong degeneracy between the parameters. This results in poor constraints on the parameters, and the reconstructed 3D temperature profiles could depend strongly on the choice of fitting priors.
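For reference, the parametric model discussed here can be written as a short function. The sketch assumes that Eq. (2) takes the standard eight-parameter form published by Vikhlinin et al. (2006), i.e. a cool-core term multiplied by a broken power law. The function and argument names are illustrative, and the parameter values in the example are arbitrary rather than fitted.

```python
import numpy as np

def vikhlinin_temperature(r, t0, r_t, a, b, c, t_min, r_cool, a_cool):
    """Assumed Vikhlinin et al. (2006) 3D temperature profile.

    T(r) = t0 * (x + t_min / t0) / (x + 1)
              * (r / r_t)^(-a) / [1 + (r / r_t)^b]^(c / b),
    with x = (r / r_cool)^a_cool.
    """
    r = np.asarray(r, dtype=float)
    x = (r / r_cool) ** a_cool
    cool_core = (x + t_min / t0) / (x + 1.0)                   # central dip
    outer = (r / r_t) ** (-a) / (1.0 + (r / r_t) ** b) ** (c / b)
    return t0 * cool_core * outer

# Arbitrary illustrative parameters on a grid in units of R500.
radii = np.geomspace(0.005, 2.0, 50)
t_cc = vikhlinin_temperature(radii, t0=1.0, r_t=0.5, a=0.0, b=2.0, c=1.0,
                             t_min=0.4, r_cool=0.05, a_cool=2.0)
```

Here a sets the inner slope and c the outer slope, so the prior ranges on these two parameters mostly affect the central and outer regions of the profile. Several parameter combinations can produce nearly the same curve, which illustrates the degeneracy mentioned above.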

Fig. 16.

The 1-σ dispersion in the 3D fractional differences obtained with MCMC for priors provided in Table 5 for the Vikhlinin et al. (2006) parametric model (Eq. (2)). In the figure, we consider the 2D–3D fine binning case and 2D–3D observational-like coarse binning cases with twelve and six bins. The top panel shows the results with prior ranges for a = 0 − 0.6 and c = 0 − 4, while the bottom panel presents the results with prior ranges for a = 0 − 0.1 and c = 1 − 4. The regions enclosed by cyan and magenta lines in the bottom panel show the corresponding dispersion recovered with the IAE model for the observational-like cases with twelve and six bins, respectively.

Table 5.

Flat priors used for the Vikhlinin et al. (2006) model parameters.

The arguments discussed above are illustrated in Fig. 16. The top panel of Fig. 16 shows the dispersion for the 2D–3D fine and coarse binning cases with prior ranges of a = 0 − 0.6 and c = 0 − 4; these two parameters have a significant effect on the profiles in the central and outer regions, respectively. We find, for the 2D–3D fine binning case, that the 3D reconstructed temperature profiles obtained from this parametric fitting have a large bias in both the central and outer regions, with median fractional residuals of about 30% and 11% at the first and last bin, respectively. For observational-like binning, with a weighted fit, the bias in the central regions becomes consistent with zero; however, there is still a bias beyond R₅₀₀, which increases, with a median fractional residual reaching about 18%. We find that the optimal priors for parameters a and c are a = 0 − 0.1 and c = 1 − 4, leading to a minimal bias in the central and outer regions, respectively. This is shown in the bottom panel of Fig. 16, where one finds a median consistent with zero, but with slightly larger dispersion compared to the IAE model for the observational-like cases. In the outer regions, however, the dispersion in the 2D–3D fine binning case is barely consistent with zero for the parametric model.
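The two sets of flat priors on a and c compared above can be encoded as a simple log-prior, as in the sketch below. Only the ranges quoted in the text are included. The remaining Table 5 ranges are not reproduced, and the dictionary and function names are hypothetical.

```python
import math

# Ranges for a and c quoted in the text (other Table 5 entries omitted).
WIDE_PRIORS = {"a": (0.0, 0.6), "c": (0.0, 4.0)}
OPTIMAL_PRIORS = {"a": (0.0, 0.1), "c": (1.0, 4.0)}

def log_flat_prior(params, bounds):
    """Return 0 inside the box defined by `bounds` and -inf outside it."""
    for name, (low, high) in bounds.items():
        if not low <= params[name] <= high:
            return -math.inf
    return 0.0

# A steep inner slope is allowed by the wide priors, excluded by the optimal ones.
trial = {"a": 0.3, "c": 2.0}
allowed_wide = log_flat_prior(trial, WIDE_PRIORS)
allowed_optimal = log_flat_prior(trial, OPTIMAL_PRIORS)
```

In an MCMC fit, this term would be added to the log-likelihood, so that samples outside the chosen box are always rejected.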

Considering the optimal priors for the a and c parameters discussed above, the left panel of Fig. 17 shows the reconstruction of the 3D temperature profiles with the IAE and parametric models for typical CC and NCC clusters in the simulated sample with observational-like binning having twelve bins. While the CC profile is recovered well by both models, the reconstruction is poor in the central region for the parametric fit to the NCC case, and would require larger values of a to improve the fit in the central region. Similarly, in the right panel of Fig. 17, we show the 3D reconstruction of two complex profiles. These two clusters are experiencing ongoing merger shocks. Here one sees that, in such scenarios, the parametric model performs poorly compared to the IAE model, being unable to capture the true underlying structure of the data. We find that even widening the priors on a and c did not significantly improve the parametric fit for such complex profiles. The accurate estimation of the shape of the temperature profile is vital since the estimation of total mass profiles depends on it.

Fig. 17.

Comparison of model recovery for CC and NCC clusters. Left panel: Comparison of the 3D temperature profiles of typical CC and NCC clusters in the THREE HUNDRED PROJECT sample recovered with the IAE and parametric models using twelve 2D annuli within R₅₀₀ (points with error bars). The dashed lines show the true 3D temperature profiles. The solid lines and shaded regions show the reconstructed 3D temperature profiles with 1-σ dispersion obtained with the IAE model. The dotted lines are the 3D temperature profiles recovered with the Vikhlinin et al. (2006) parametric model. For better visibility, the 1-σ dispersion for the parametric model is not shown. Right panel: 3D temperature profile reconstruction with the IAE and parametric models for two complex cases in the THREE HUNDRED PROJECT. For better visibility, 2D profiles and the 1-σ dispersion are not shown. For both figures, the bottom panel shows the fractional difference between the true and recovered 3D profiles. For NCC and CC clusters, the IAE model and the parametric model reconstruction with optimal priors are comparable, but the former exhibits slightly better performance. For the complex cases, the IAE model is more accurate in uncovering the profile shapes.

6. First application to CHEX-MATE X-ray data

6.1. Modifications to the IAE model

Although the THREE HUNDRED PROJECT provides us with one of the highest resolution hydrodynamical simulation samples to date, due to numerical issues, the thermal profiles could only reliably be estimated above 0.02 R₅₀₀ for most of the galaxy clusters in the sample. The number of available 2D annular temperature data points and their radial distribution will depend on the object mass and luminosity, the presence or absence of a cool core, and the depth of the observation⁴. From our experience of X-ray analysis of typical observations of local (z < 0.5) massive (M₅₀₀ > 10¹⁴ M_⊙) galaxy clusters available in the XMM-Newton or Chandra archives, we find that for many objects, one is generally able to obtain some temperature data points interior to 0.02 R₅₀₀ (corresponding to 20 − 40″ at z = 0.05 and 5 − 10″ at z = 0.3 for typical cluster masses).

Therefore, in order to make the best use of the available data, one needs to look for an optimal extrapolation of the IAE model that is able to reconstruct the temperature profiles robustly even in the very central regions. To build an IAE model that is suitable for application to such observations, we first extrapolated the simulated temperature profiles to 0.005 R₅₀₀ by fitting a Vikhlinin et al. (2006) parametric model in the inner regions (up to 0.5 R₅₀₀). We then re-trained the IAE model in the full radial range of [0.005–2] R₅₀₀ with the simulated dataset, augmented by the parametric model extrapolation in the very central regions.

6.2. Observed sample

We then use this updated IAE model on the latest CHEX-MATE Data Release 1, DR1 sample (Rossetti et al., in prep.) to deconvolve the temperature profiles. The DR1 sample is a ‘technical but representative’ sub-sample, which was built to test our pipeline for the extraction and reconstruction of the radial temperature and density profiles. It is composed of 30 clusters, whose distribution in mass, redshift, and Planck signal-to-noise-ratio (S/N) reflects the properties of the CHEX-MATE parent sample. Table A.1 provides the details of all the clusters in the DR1 sample. For data reduction and analysis, we used the XMM-Newton Science Analysis System (SAS), version 16.1. We refer to Bartalucci et al. (2023) for details on the data reduction procedures (calibration, standard pattern cleaning, removal of noisy MOS CCDs, and light-curve filtering) and on the detection of contaminating sources. From the EPIC images in the 0.7–1.2 keV band, we extracted both mean and median surface brightness radial profiles, centered on the peak and on the centroid within R₅₀₀. For the temperature profile, we extract spectra in concentric annuli centered on the surface brightness peak, using the MOS-spectra and PN-spectra ESAS tools (Snowden et al. 2008) embedded in SAS. For each region, we perform a joint fit of the MOS1, MOS2, and PN spectra with an absorbed thermal model, to which we add a model for all the background components (Galactic foregrounds, CXB, cosmic-ray particle background, residual soft protons). We estimate priors for the parameters of this background model that are allowed to vary within their uncertainty during the joint fit with the cluster parameters, running the Markov chain Monte Carlo method within XSPEC (see Rossetti et al., in prep., for more details). In this work, two clusters (PSZ2 G046.88+56.48 and PSZ2 G057.78+52.32) that require background treatments using offset observations were not considered in the analysis.

6.3. Method

For deconvolution of these observed profiles, we assume that the 3D temperature profiles can be represented by the IAE model, convolved with a response matrix C = C_PSF ⊗ C_proj, which simultaneously takes into account projection and PSF redistribution.

The projection matrix, C_proj, is built using the DR1 density profiles from Duffy et al. (in prep.), derived using the non-parametric deconvolution algorithm of Croston et al. (2006). More details of the derivation of the density profiles can be found in Croston et al. (2008) and Pratt et al. (2022). C_PSF is constructed as in Croston et al. (2006), which uses the parametric PSF model of Ghizzardi (2001) as a function of the energy and angular offsets, the parameters of which can be found in EPIC-MCT-TN-011⁵ and EPIC-MCT-TN-012⁶.

The IAE model was then projected, taking into account the spectroscopic-like weighting scheme proposed by Mazzotta et al. (2004), and fitted to the observed 2D profiles. In future work, we will examine the more complex Vikhlinin (2006) weighting scheme, which is more robust for lower temperature clusters/groups, and compare the results to other weighting schemes.
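The projection step can be illustrated with a minimal sketch. It assumes spherical symmetry, piecewise-constant temperature and density in concentric shells, emission truncated at the outermost shell edge, and the spectroscopic-like weight w = n² T^(−3/4) of Mazzotta et al. (2004). The function names, the toy density and temperature profiles, and the bin edges are hypothetical choices made for illustration; PSF redistribution is not included, and this is not the pipeline used in the paper.

```python
import numpy as np

def sphere_in_cylinder(r, R):
    """Volume of a sphere of radius r lying inside a cylinder of radius R
    whose axis is the line of sight through the cluster centre."""
    inner = np.clip(r**2 - R**2, 0.0, None)   # zero when the cylinder encloses the sphere
    return 4.0 * np.pi / 3.0 * (r**3 - inner**1.5)

def shell_annulus_volumes(r_edges, R_edges):
    """vol[i, j]: volume of 3D shell j seen through 2D annulus i."""
    r_in, r_out = r_edges[:-1], r_edges[1:]
    R_in, R_out = R_edges[:-1, None], R_edges[1:, None]
    return (sphere_in_cylinder(r_out, R_out) - sphere_in_cylinder(r_out, R_in)
            - sphere_in_cylinder(r_in, R_out) + sphere_in_cylinder(r_in, R_in))

def project_spectroscopic_like(T3d, n3d, vol):
    """Spectroscopic-like projected temperature in each annulus."""
    w = n3d**2 * T3d**(-0.75)                 # weight n^2 T^(-3/4)
    return (vol @ (w * T3d)) / (vol @ w)

# Toy inputs (hypothetical; radii in units of R500)
r_edges = np.geomspace(0.005, 2.0, 49)        # 48 shells
R_edges = np.geomspace(0.005, 1.0, 13)        # 12 annuli
r_mid = np.sqrt(r_edges[:-1] * r_edges[1:])
n3d = (1.0 + (r_mid / 0.1)**2) ** (-1.0)      # beta-model-like density
T3d = 1.2 / (1.0 + r_mid) ** 0.8              # smoothly declining temperature
vol = shell_annulus_volumes(r_edges, R_edges)
T2d = project_spectroscopic_like(T3d, n3d, vol)
```

Because the weights depend on the temperature itself, the projection is non-linear in T, which is why it is applied to the model profile inside the fit rather than inverted directly.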

6.4. Results

6.4.1. Estimation of profiles

In Fig. 18, we show the 3D temperature profiles reconstructed using the IAE model and the Vikhlinin et al. (2006) 8-parameter parametric model for a typical NCC and a typical CC cluster in the DR1 sample. In general, we find that with the annular resolution of the present 2D profiles, both models produce similar reconstructed 3D temperature profiles. However, the parameters of the Vikhlinin et al. (2006) model are poorly constrained, and the final reconstructed temperature profiles (especially the inner and outer regions) may depend on the chosen priors.
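For reference, the eight-parameter functional form of Vikhlinin et al. (2006) can be written as a short function. This is a sketch of the published formula only; the parameter names and the example values are illustrative and are not the priors or best-fit values used in this work.

```python
import numpy as np

def vikhlinin_temperature(r, T0, Tmin, r_cool, a_cool, r_t, a, b, c):
    """3D temperature model of Vikhlinin et al. (2006), eight free parameters."""
    x = (r / r_cool) ** a_cool
    t_cool = (x + Tmin / T0) / (x + 1.0)       # central temperature decline
    t_outer = (r / r_t) ** (-a) / (1.0 + (r / r_t) ** b) ** (c / b)  # outer decline
    return T0 * t_cool * t_outer

# Illustrative evaluation on the radial grid used for the augmented IAE model
r = np.geomspace(0.005, 2.0, 48)               # in units of R500
T_model = vikhlinin_temperature(r, T0=5.0, Tmin=2.0, r_cool=0.05, a_cool=2.0,
                                r_t=0.5, a=0.0, b=2.0, c=1.0)
```

Several parameters act on overlapping radial ranges (for example a, b and c all shape the outer slope), which is one way to see why they can be poorly constrained when only a handful of annuli are available.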

Fig. 18.

Comparison of the scaled 2D and 3D temperature profiles of a typical NCC (PSZ2 G050.40+31.17) and CC (PSZ2 G057.92+27.64) cluster in the DR1 sample recovered with the IAE and parametric models. Solid lines and the associated shaded regions show the median and 1-σ dispersion of the reconstructed 3D temperature profile obtained with MCMC. Regions enclosed by the dashed lines represent the corresponding 1-σ dispersion of the 2D temperature profiles fitted to the observed 2D data (black dots). In line with our results with simulations for observational-like cases, we find that both the IAE model and the parametric model with optimal priors generate comparable profiles for NCC and CC clusters.

Figure 19 shows the 3D temperature profiles of the clusters in the DR1 sample obtained with the IAE model, scaled by the average temperature (T_X) in the [0.15–0.75] R₅₀₀ region. We find that the fractional dispersion is about 22% in the inner region; it first decreases with radius, reaching a minimum of 3% at around 0.5 R₅₀₀, and then increases with radius to a maximum of 22% in the outer regions. Also plotted in the sub-panel is the ratio of the 3D temperature profiles recovered with the IAE and parametric models. Within the radial range of [0.1–1] R₅₀₀, the difference between the IAE and Vikhlinin et al. (2006) models is less than 10%. The difference between them can be as high as 25% in the inner and outer regions. However, on average both models predict very similar profiles, with a difference of less than 2% over the entire radial range of [0.005–2] R₅₀₀.

Fig. 19.

Scaled 3D temperature profiles of the DR1 sample recovered with the IAE model. Also shown in the bottom panel is the ratio of the 3D temperature profiles recovered with the IAE model to those recovered with the parametric model. For better visibility, the error bars corresponding to the individual profiles are not shown. The black lines and grey shaded regions represent the median and 1-σ dispersion of the sample. The difference between the IAE model and the parametric model can be as high as 20%, although the average ratio between them remains close to unity.

As a consistency check, we compared the values of the average temperature in the [0.15–0.75] R₅₀₀ region. Figure 20 shows the observed T_X compared to T_X, model, the temperature derived from a projection of the 3D non-parametric IAE and parametric models in the same annulus. Fitting a straight line to the (T_X, model, T_X) relation, one finds the slope for the IAE and parametric models to be 1.01 ± 0.01 and 1.01 ± 0.02, respectively.

Fig. 20.

Left Panel: Comparison of the observed T_X and the best-fit T_X, model obtained with non-parametric IAE and parametric Vikhlinin et al. (2006) models. Solid lines show the best fit for the data. We see that both our non-parametric and parametric approaches provide tight and accurate constraints on the average temperature of clusters.

6.4.2. Estimation of derivatives

While non-parametric models offer greater flexibility in modelling complex patterns and relationships, one requires a large amount of data to accurately estimate derivatives. Small irregularities in the profiles often amplify the noise in the derivatives. Therefore, it is often desirable to apply some degree of smoothing to the profiles to obtain accurate derivatives in non-parametric approaches. As can be seen from Fig. 19, the reconstructed 3D temperature profiles from the IAE model have a reasonably smooth underlying structure. We find that the direct computation of numerical derivatives of individual profiles derived from the MCMC chains using spline interpolation, without applying any smoothing, usually provided a good estimate of the logarithmic derivatives and the corresponding 1-σ interval. Nonetheless, we sometimes found the derivative estimates to be noisy, particularly beyond the 2D fitting range. This noise can be attributed to logarithmic binning, which can create sparsity in the outer regions. Another potential cause is small spikes between consecutive radii in the inner regions of the temperature profiles, which the model inherits from the simulations themselves owing to their limited resolution there. We therefore chose to apply very minimal smoothing, such that only the sharp discontinuities, if any (usually small in magnitude), on local scales (2–3 radial bins) are affected/corrected and the general non-linear structure is preserved. We use the algorithm developed by Cappellari et al. (2013), which implements the one-dimensional locally linear weighted regression of Cleveland (1979)⁷. It uses a tri-cube weighting function with weights (1 − u³)³, where u is the distance from the local point R under consideration, and a smoothing parameter f, which is the fraction of neighbouring points to be considered in the local fit around R. Increasing the value of f enlarges the neighbourhood of influential points, leading to smoother profiles. For our case, we apply modest smoothing with f = 0.15.
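The smoothing and derivative steps can be sketched as follows. This is a simplified stand-in written with NumPy only: it is not the loess package of Cappellari et al. (2013), it takes u as the distance scaled by the largest distance in the local neighbourhood, it smooths in log-log space, and it uses finite differences (np.gradient) rather than spline interpolation. All of these are assumptions made for illustration, as are the function names and the toy power-law profile.

```python
import numpy as np

def loess_1d(x, y, frac=0.15):
    """Locally linear regression with tri-cube weights (after Cleveland 1979)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    k = min(n, max(3, int(np.ceil(frac * n))))  # points per local fit
    y_smooth = np.empty(n)
    for i in range(n):
        d = np.abs(x - x[i])
        idx = np.argsort(d)[:k]                 # k nearest neighbours of x[i]
        u = d[idx] / d[idx].max()               # scaled distance in [0, 1]
        w = (1.0 - u**3) ** 3                   # tri-cube weights
        sw = np.sqrt(w)
        A = np.column_stack([np.ones(k), x[idx] - x[i]])
        coef, *_ = np.linalg.lstsq(A * sw[:, None], y[idx] * sw, rcond=None)
        y_smooth[i] = coef[0]                   # local line evaluated at x[i]
    return y_smooth

def log_derivative(r, T, frac=0.15):
    """d ln T / d ln r after light smoothing of ln T versus ln r."""
    ln_r = np.log(r)
    ln_T = loess_1d(ln_r, np.log(T), frac=frac)
    return np.gradient(ln_T, ln_r)

# Toy check on a pure power law, T proportional to r^(-0.3)
r = np.geomspace(0.005, 2.0, 48)
T = 5.0 * r ** (-0.3)
slope = log_derivative(r, T)
```

With 48 radial bins and f = 0.15, each local fit uses eight points, so only features narrower than a few bins are altered, in line with the intent described above.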

Figure 21 shows the corresponding logarithmic derivatives of the temperature profiles of the two clusters discussed in the previous sub-section. Here also, both the IAE and the parametric models produce consistent profiles. Furthermore, for the IAE model, the profiles obtained with and without applying the smoothing to the temperature profiles are also consistent with each other. This can also be seen in the bottom panel, where the ratio between reconstructed 3D temperature profiles with and without smoothing deviates from unity by less than 1% over most of the radial range. Figure 22 shows the logarithmic derivatives of the 3D temperature profiles of the clusters in the DR1 sample obtained with the IAE model. In the bottom panel, we also show the difference in logarithmic derivatives derived from the IAE and parametric models (Δ). We find that, although the dispersion in the difference increases with radius, the difference is consistent with zero throughout the radial range. While it is difficult to quantify this difference in the inner region, since logarithmic derivatives are close to zero, in the range [0.5–2] R₅₀₀ the difference in logarithmic derivatives between the IAE and the parametric model can be more than 20%. The impact of this on the total mass estimate is not straightforward but is expected to be about 5%–30%.
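The link between the logarithmic derivative and the mass can be made explicit with the standard hydrostatic equilibrium mass equation. It is quoted here as background and is not restated elsewhere in this section; it assumes spherical symmetry, with n_e the electron density, μ the mean molecular weight, and m_p the proton mass.

```latex
M_{\rm HE}(<r) = -\frac{k_{\rm B}\,T(r)\,r}{G\,\mu\,m_{\rm p}}
\left[\frac{{\rm d}\ln n_{\rm e}}{{\rm d}\ln r} + \frac{{\rm d}\ln T}{{\rm d}\ln r}\right],
\qquad
\frac{\delta M_{\rm HE}}{M_{\rm HE}} =
\frac{\delta\!\left({\rm d}\ln T/{\rm d}\ln r\right)}
     {{\rm d}\ln n_{\rm e}/{\rm d}\ln r + {\rm d}\ln T/{\rm d}\ln r}
```

The second expression holds at fixed temperature and density slope. It shows that a given shift in the temperature slope matters less where the density profile is steep, which is one reason the effect on the mass is not straightforward to quote as a single number.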

Fig. 21.

Comparison of the logarithmic derivatives of the 3D temperature profiles of a typical NCC (PSZ2 G050.40+31.17) and CC (PSZ2 G057.92+27.64) cluster in the DR1 sample recovered with the IAE and parametric models. Solid lines and the associated shaded regions show the median and 1-σ dispersion obtained with MCMC. The region enclosed by the dashed lines represents the 1-σ dispersion if no smoothing is applied to the profiles derived from the MCMC chain. The bottom panel shows the ratio of the median 3D temperature profiles obtained using the IAE with and without smoothing.

Fig. 22.

Logarithmic derivatives of the 3D temperature profiles of the DR1 sample recovered with the IAE model. Also shown in the bottom panel is the difference between profiles recovered with the IAE model and the parametric model. For better visibility, the error bars corresponding to the individual profiles are not shown. The black lines and grey shaded regions represent the median and 1-σ dispersion of the sample.

7. Discussion and conclusions

Classical statistical modelling techniques can be sensitive to inaccuracies and may lead to poor performance if the data are complex (non-linear) and/or have a dynamic structure. Data-driven (model-agnostic) deep-learning techniques are now becoming increasingly popular. They make use of the topology to learn the underlying structure of the data, and have often been found to give superior performance in terms of accuracy and precision when the underlying structure of the data is non-linear. However, one typically requires a massive dataset and vast computational resources to train the neural network, limiting their applicability in some scenarios. In this paper, we demonstrate the first use of deep learning techniques to build a model of galaxy cluster temperature profiles and apply this model to the problem of temperature profile deprojection. Using a non-linear interpolatory scheme with five anchor points (temperature profiles) allows us to have frugal learning with a sparse training set, and the neural network is able to uncover the lower dimensional non-linear manifold of the data by way of a mapping between latent space and real space.

The resulting Interpolatory Auto-Encoder (IAE) model is trained and evaluated in the radial range of [0.02–2] R₅₀₀ using a simulated dataset of 315 temperature profiles from the THREE HUNDRED PROJECT. We then implement a new deconvolution scheme using efficient and cost-effective learning-based regularisation to achieve a stable and accurate reconstruction of the 3D temperature profiles by optimising the latent parameters (barycentric weights) of the anchor points using MCMC. Moreover, the deconvolution algorithm can be easily extended to include the instrumental PSF effect. We test the IAE with a different set of deconvolution schemes with respect to the resolution, projection, and quality of the data. We find that, in general, the IAE model can recover unbiased 3D temperature profiles in the fitting range. The performance of the IAE model to recover the true temperature profiles can be summarised as follows:

We first considered the simplest case, where we tested the efficiency of the IAE model in directly fitting the high resolution simulated 3D temperature profiles, defined in 48 fixed radial bins in the range [0.02–2] R₅₀₀, the resolution with which the IAE model is trained. We find that in this case, the reconstruction of temperature profiles from the IAE model is robust, with the median fractional residuals centered around zero and a 1-σ dispersion (determined by the 16th and 84th percentile range of fractional residuals) of about ±5% over most of the radial range. The dispersion in the outskirts is somewhat larger (about ±10%). This can be interpreted as being due to the complex nature of the ICM as a result of merging/accretion processes that are dominant there. Moreover, dispersion in the fractional residuals for the sub-sample of 20 most relaxed clusters (MR20) and smooth temperature profiles (MS20) is about 35% smaller compared to the sub-sample of 20 most disturbed clusters (MD20) and irregular temperature profiles (MI20). We find that the model fidelity can be further improved by increasing the number of anchor points in the IAE model. However, since observed temperature profiles are generally of much lower resolution, increasing the complexity of the model is undesirable as it could lead to overfitting.
We then considered a case where we fitted the high resolution simulated 2D temperature profiles to the IAE model using classical emission measure weights. Here too we find that the median fractional residual is centered around zero, with a 1-σ dispersion of about ±5% over most of the radial range. In the first few innermost bins, however, we find that the dispersion increases to about ±10%. This is understandable since the projection operation introduces a degeneracy in the 3D temperature profiles which is significant in the inner regions, i.e. the mapping between input 2D temperature profiles and IAE reconstructed 3D temperature profiles is not as strong as the mapping between input 3D temperature profiles and IAE reconstructed 3D temperature profiles. However, this degeneracy can be mitigated to a large extent in the observational-like cases, since the 2D temperature profiles in the inner bins have relatively smaller errors associated with them compared to the rest of the radial bins. Moreover, as in the previous case, the distribution of the fractional residuals over all radii for the MR20 (MS20) sub-sample is narrowly peaked compared to the MD20 (MI20) sub-sample.
We next considered observation-like fitting cases, with typical temperature profile data quality such as would be obtained from the XMM-Newton or Chandra satellites. We first considered a case where we fit 2D temperature profiles defined at twelve radial points and up to R₅₀₀ only, mimicking the profile expected from moderately deep X-ray exposures. We find that in the 2D fitting range, i.e. [0.02–1] R₅₀₀, with the relatively low resolution input 2D temperature profiles, the performance of the IAE model is negligibly degraded. However, beyond R₅₀₀, where we do not consider any 2D data in the fit, the 1-σ dispersion in the 3D reconstruction increases with radius and becomes about ±20% in the last bin. The 3D median fractional residual is found to be close to zero over most of the radial range, except beyond 1.5 R₅₀₀, where the temperature is underestimated by about 7%. We also considered a case where we use only six 2D temperature data points in the fit and find that the IAE is still able to provide an unbiased estimate of the reconstructed temperature profile, albeit with a slightly larger uncertainty.
We considered a more realistic temperature-dependent spectroscopic-like weighting scheme (Mazzotta et al. 2004) in the deprojection. We find that there is a small bias of about 4% excess in the fractional residual in the innermost few bins, in addition to the underestimation in the outer regions as in the previous case.
We also compared the IAE model with a parametric temperature model. With the high resolution hydrodynamical simulated temperature profiles, the parametric model based on Vikhlinin et al. (2006) showed poor performance when the realistic error covariance matrix is ignored in the fit. Including the error covariance matrix improved the fit. The non-linearity and parameter degeneracy of the parametric model also contributed to sub-optimal performance, making the 3D reconstruction dependent on the choice of priors. In contrast, the IAE model performed better, particularly in complex cases with ongoing merger shocks, demonstrating its superior adaptability to diverse data scenarios.
Finally, in a first application to X-ray data, we built an augmented version of the IAE model in the radial range [0.005–2] R₅₀₀. The data augmentation was necessary because the simulated profiles did not have sufficient resolution to probe the very core regions that are accessible to good quality X-ray data. The augmentation step was achieved by extrapolating the simulated profiles to lower radii (below ≈0.02 R₅₀₀) by fitting them to the Vikhlinin et al. (2006) parametric model in the range ≈ [0.02–0.5] R₅₀₀. We then used this updated IAE model to reconstruct the 3D temperature profiles and logarithmic derivatives of the representative (DR1) sample of galaxy clusters drawn from the CHEX-MATE project. The resulting non-parametric IAE profiles were compared to those derived from parametric deprojection and deconvolution. We find that, in such observational cases where the typical number of annular data points is much smaller than in the simulations, the difference between the IAE and parametric model is less than 10% over most of the observed region. However, in the inner and outer regions, the difference between them can be as high as 25%. Moreover, the results from the Vikhlinin et al. (2006) parametric model, especially in the inner and outer regions, depend on the priors chosen for the parameters, as they are very poorly constrained during the fit.
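The median fractional residuals and 1-σ dispersions quoted in the summary above are percentile statistics taken over the cluster sample in each radial bin. A minimal sketch of that bookkeeping is given below. The sign convention (reconstructed minus true, divided by true), the function name, and the toy arrays are assumptions made for illustration, not values from the simulations.

```python
import numpy as np

def residual_summary(T_true, T_rec):
    """Median and 16th/84th percentiles of the fractional residuals per radial bin.

    Both inputs have shape (n_clusters, n_bins).
    """
    T_true = np.asarray(T_true, dtype=float)
    T_rec = np.asarray(T_rec, dtype=float)
    frac = (T_rec - T_true) / T_true
    p16, p50, p84 = np.percentile(frac, [16, 50, 84], axis=0)
    return p50, p16, p84

# Toy sample: five clusters, four radial bins, fixed fractional offsets per cluster
T_true = np.tile(np.linspace(6.0, 3.0, 4), (5, 1))
offsets = np.array([-0.10, -0.05, 0.0, 0.05, 0.10])[:, None]
T_rec = T_true * (1.0 + offsets)
median, lower, upper = residual_summary(T_true, T_rec)
```

Using percentiles rather than a standard deviation keeps the dispersion estimate insensitive to a few strongly disturbed clusters.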

It should be noted that the inner regions of the clusters, which involve processes such as AGN feeding/feedback, gas condensation, sloshing, etc., are complex and may not be accurately represented by current state-of-the-art cosmological simulations. Moreover, the augmentation of the central regions of the training set using the extrapolation of a parametric model could potentially introduce bias in the underlying model recovered from the IAE. Despite these limitations, we believe that the IAE model provides higher-fidelity results compared to traditional parametric modelling, as demonstrated in this study. As the size and quality of both X-ray observations and simulations are set to improve in the coming years, the robustness of the IAE will also be enhanced, resulting in a much lower scatter. Our future plan is to perform network training and testing on different sets of simulations so as to have a larger training and validation sample. This will potentially also help us to understand the systematics, if any, in the IAE model inherited from the particular set of numerical simulations used for training. For example, De Luca et al. (2021) showed that the dynamical state of clusters in the THREE HUNDRED PROJECT varies with redshift: the relaxed clusters decrease in number from redshift z = 0 to z = 1. It remains to be seen if issues such as possible redshift dependence have any impact on learning. This effect, in principle, can be taken into account by training the model using simulated clusters across a large redshift range.

Another important step in improving the deconvolution scheme will be to force the neural network model to learn the features shared between simulations and real data using transfer/adversarial learning (Ganin et al. 2016). This will essentially mitigate the biases inherited by the neural network model from simulations. Moreover, we expect with an upgraded IAE model, the reconstruction of 3D temperature profiles beyond the observational range of R₅₀₀ will be significantly improved due to an increase in the size of the training sample. We further plan to implement a more robust model extrapolation technique in future work.

The usefulness of the IAE is not limited to the estimation of the temperature of galaxy clusters. We further plan to use the IAE interpolatory technique to recover the underlying density, pressure and hence dark matter profiles in galaxy clusters. An important extension of this will be to train a neural network to estimate the total mass profiles of galaxy clusters directly from the thermal profiles of the ICM without considering the hydrostatic equation. Another interesting prospect for our work will be to implement the deconvolution technique for SZ and lensing data, to recover a robust model of galaxy clusters. This will further help us to understand the biases introduced in calibrating the mass and scaling relations for cosmological studies. Such studies might also be used to assess relative density/temperature fluctuations more robustly, hence constraining turbulence and related parameters (Mach number, injection scale, etc.). Our methodology can also be implemented in other areas of astrophysics and cosmology. In fact, the IAE scheme has already been implemented in a source separation algorithm to tackle physical hyper-spectral data (Gertosio et al. 2023).

One of our immediate plans is to apply the proposed deconvolution technique to the most recent high quality CHEX-MATE X-ray sample of clusters (CHEX-MATE Collaboration 2021), and compare it to other approaches such as those used in Bartalucci et al. (2018; semi-parametric reconstruction) and Eckert et al. (2022; multi-scale non-parametric reconstruction). The comparison of the estimated logarithmic derivatives will be instructive since these are closely related to the shape of the mass profiles of clusters. Our ultimate goal will be to test the ΛCDM predictions on the total mass distribution in galaxy clusters using a new and sophisticated fully non-parametric approach.

¹ The scaled radius R_Δ is defined such that R_Δ is the radius at which the mean matter density is Δρ_c, where ρ_c = 3H²(z)/(8πG) is the critical density of the universe at redshift z.

² Mish(x) = x × tanh(ln(1 + e^x)).

³ https://github.com/jbobin/IAE

⁴ See Chen et al. (2023) for a discussion of an optimal binning method.

⁵ http://www.iasf-milano.inaf.it/~simona/pub/EPIC-MCT/EPIC-MCT-TN-011.pdf

⁶ http://www.iasf-milano.inaf.it/~simona/pub/EPIC-MCT/EPIC-MCT-TN-012.pdf

⁷ https://pypi.org/project/loess/

Acknowledgments

The work of A.I. was supported by CNES, the French space agency. S.E., L.L. and F.G. acknowledge the financial contribution from the contracts ASI-INAF Athena 2019-27-HH.0, “Attività di Studio per la comunità scientifica di Astrofisica delle Alte Energie e Fisica Astroparticellare” (Accordo Attuativo ASI-INAF n. 2017-14-H.0), and from the European Union’s Horizon 2020 Programme under the AHEAD2020 project (grant agreement n. 871158). This research was supported by the International Space Science Institute (ISSI) in Bern, through ISSI International Team project #565 (Multi-Wavelength Studies of the Culmination of Structure Formation in the Universe). M.S. acknowledges the financial contribution from contract ASI-INAF n.2017-14-H.0. and from contract INAF mainstream project 1.05.01.86.10. E.P. acknowledges the financial support of CNRS/INSU and of CNES, the French Space Agency. M.E.D. acknowledges partial financial support from a NASA ADAP award/SAO subaward SV9-89010. M.D.P. and A.F. acknowledge financial contribution from Sapienza Universitá di Roma, thanks to Progetti di Ricerca Medi 2020, RM120172B32D5BE2. A.F. thanks financial support from Universidad de La Laguna (ULL), NextGenerationEU/PRTR, and Ministerio de Universidades (MIU) (UNI/511/2021) through grant “Margarita Salas”. H.B., D.dL., and P.M. acknowledge support from the Spoke 3 Astrophysics and Cosmos Observations. National Recovery and Resilience Plan (Piano Nazionale di Ripresa e Resilienza, PNRR) Project ID CN_00000013 “Italian Research Center on High-Performance Computing, Big Data and Quantum Computing” funded by MUR Missione 4 Componente 2 Investimento 1.4: Potenziamento strutture di ricerca e creazione di “campioni nazionali di R&S (M4C2-19)” – Next Generation EU (NGEU) and from the European Union’s Horizon 2020 Programme under the AHEAD2020 project (grant agreement n. 871158). 
The authors would like to thank the reviewer for his/her careful, constructive and insightful comments in relation to this work.

References

Abbott, T. M. C., Aguena, M., Alarcon, A., et al. 2022, Phys. Rev. D, 105, 023520
Alam, S., Albareti, F. D., Allende Prieto, C., et al. 2015, ApJS, 219, 12
Ansarifard, S., Rasia, E., Biffi, V., et al. 2020, A&A, 634, A113
Armitage, T. J., Kay, S. T., & Barnes, D. J. 2019, MNRAS, 484, 1526
Ascasibar, Y., & Diego, J. M. 2008, MNRAS, 383, 369
Barnes, D. J., Vogelsberger, M., Kannan, R., et al. 2018, MNRAS, 481, 1809
Bartalucci, I., Arnaud, M., Pratt, G. W., et al. 2017, A&A, 608, A88
Bartalucci, I., Arnaud, M., Pratt, G. W., & Le Brun, A. M. C. 2018, A&A, 617, A64
Bartalucci, I., Arnaud, M., Pratt, G. W., Démoclès, J., & Lovisari, L. 2019, A&A, 628, A86
Bartalucci, I., Molendi, S., Rasia, E., et al. 2023, A&A, 674, A179
Beck, A. M., Murante, G., Arth, A., et al. 2016, MNRAS, 455, 2110
Bobin, J., Acero, F., & Picquenot, A. 2019, in 2019 IEEE 8th International Workshop on Computational Advances in Multi-Sensor Adaptive Processing (CAMSAP), 450
Bobin, J., Gertosio, C. R., Bobin, C., & Thiam, C. 2023, Digital Signal Proc., 139, 104058
Bocquet, S., Saro, A., Mohr, J. J., et al. 2015, ApJ, 799, 214
Böhringer, H., Schuecker, P., Pratt, G. W., et al. 2007, A&A, 469, 363
Bradbury, J., Frostig, R., Hawkins, P., et al. 2018, JAX: Composable Transformations of Python+NumPy programs
Bulbul, G. E., Hasler, N., Bonamente, M., & Joy, M. 2010, ApJ, 720, 1038
Calderon, V. F., & Berlind, A. A. 2019, MNRAS, 490, 2367
Campitiello, M. G., Ettori, S., Lovisari, L., et al. 2022, A&A, 665, A117
Cappellari, M., McDermid, R. M., Alatalo, K., et al. 2013, MNRAS, 432, 1862
Chen, C., Arnaud, M., Pointecouteau, E., Pratt, G., & Iqbal, A. 2023, A&A, submitted
CHEX-MATE Collaboration (Arnaud, M., et al.) 2021, A&A, 650, A104
Cialone, G., De Petris, M., Sembolini, F., et al. 2018, MNRAS, 477, 139
Cleveland, W. S. 1979, J. Am. Stat. Assoc., 74, 829
Croston, J. H., Arnaud, M., Pointecouteau, E., & Pratt, G. W. 2006, A&A, 459, 1007
Croston, J. H., Pratt, G. W., Böhringer, H., et al. 2008, A&A, 487, 431
Cui, W., Knebe, A., Yepes, G., et al. 2018, MNRAS, 480, 2898
David, L. P., Nulsen, P. E. J., McNamara, B. R., et al. 2001, ApJ, 557, 546
de Andres, D., Cui, W., Ruppin, F., et al. 2022, Nat. Astron., 6, 1325
De Grandi, S., & Molendi, S. 2002, ApJ, 567, 163
De Luca, F., De Petris, M., Yepes, G., et al. 2021, MNRAS, 504, 5383
Démoclès, J., Pratt, G. W., Pierini, D., et al. 2010, A&A, 517, A52
Eckert, D., Ettori, S., Pointecouteau, E., van der Burg, R. F. J., & Loubser, S. I. 2022, A&A, 662, A123
Ettori, S., Fabian, A. C., Allen, S. W., & Johnstone, R. M. 2002, MNRAS, 331, 635
Ettori, S., Gastaldello, F., Leccardi, A., et al. 2010, A&A, 524, A68
Ettori, S., Donnarumma, A., Pointecouteau, E., et al. 2013, Space Sci. Rev., 177, 119
Fabian, A. C., Hu, E. M., Cowie, L. L., & Grindlay, J. 1981, ApJ, 248, 47
Fefferman, C., Mitter, S., & Narayanan, H. 2013, arXiv e-prints [arXiv:1310.0425]
Ferragamo, A., de Andres, D., Sbriglio, A., et al. 2023, MNRAS, 520, 4000
Foreman-Mackey, D., Hogg, D. W., Lang, D., & Goodman, J. 2013, PASP, 125, 306
Ganin, Y., Ustinova, E., Ajakan, H., et al. 2016, J. Mach. Learn. Res., 17, 2096
Gaspari, M., Brighenti, F., & Temi, P. 2012, MNRAS, 424, 190
Gertosio, R. C., Bobin, J., & Fabio, A. 2023, Signal Proc., 202, 108776
Ghirardini, V., Eckert, D., Ettori, S., et al. 2019a, A&A, 621, A41
Ghirardini, V., Ettori, S., Eckert, D., & Molendi, S. 2019b, A&A, 627, A19
Ghizzardi, S. 2001, XMM-SOC-CAL-TN-0022
Gianfagna, G., De Petris, M., Yepes, G., et al. 2021, MNRAS, 502, 5115
Gupta, N., & Reichardt, C. L. 2020, ApJ, 900, 110
Gupta, N., & Reichardt, C. L. 2021, ApJ, 923, 96
He, K., Zhang, X., Ren, S., & Sun, J. 2015, arXiv e-prints [arXiv:1512.03385]
Ho, M., Rau, M. M., Ntampaka, M., et al. 2019, ApJ, 887, 25
Ho, M., Farahi, A., Rau, M. M., & Trac, H. 2021, ApJ, 908, 204
Ho, M., Ntampaka, M., Rau, M. M., et al. 2022, Nat. Astron., 6, 936
Holder, G., Haiman, Z., & Mohr, J. J. 2001, ApJ, 560, L111
Huang, G., Liu, Z., van der Maaten, L., & Weinberger, K. Q. 2016, arXiv e-prints [arXiv:1608.06993]
Iqbal, A., Kale, R., Nath, B. B., & Majumdar, S. 2018, MNRAS, 480, L68
Johnstone, R. M., Fabian, A. C., Morris, R. G., & Taylor, G. B. 2005, MNRAS, 356, 237
Kay, S. T., & Pratt, G. W. 2022, in Handbook of X-ray and Gamma-ray Astrophysics, C. Bambi, & A. Santangelo (Springer), 100
Kingma, D. P., & Ba, J. 2014, arXiv e-prints [arXiv:1412.6980]
Klein, M., Oguri, M., Mohr, J. J., et al. 2022, A&A, 661, A4
Klypin, A., Yepes, G., Gottlöber, S., Prada, F., & Heß, S. 2016, MNRAS, 457, 4340
Lakhchaura, K., Saini, T. D., & Sharma, P. 2016, MNRAS, 460, 2625
Leccardi, A., & Molendi, S. 2008, A&A, 486, 359
Lewis, A. 2019, arXiv e-prints [arXiv:1910.13970]
Lovisari, L., & Maughan, B. J. 2022, Handbook of X-ray and Gamma-ray Astrophysics, 65
Mathiesen, B. F., & Evrard, A. E. 2001, ApJ, 546, 100
Mazzotta, P., Rasia, E., Moscardini, L., & Tormen, G. 2004, MNRAS, 354, 10
Ntampaka, M., Trac, H., Sutherland, D. J., et al. 2015, ApJ, 803, 50
Piffaretti, R., Arnaud, M., Pratt, G. W., Pointecouteau, E., & Melin, J. B. 2011, A&A, 534, A109
Pizzolato, F., Molendi, S., Ghizzardi, S., & De Grandi, S. 2003, ApJ, 592, 62
Planck Collaboration XI. 2011, A&A, 536, A11
Planck Collaboration V. 2013, A&A, 550, A131
Planck Collaboration XIII. 2016, A&A, 594, A13
Planck Collaboration XXIV. 2016, A&A, 594, A24
Planck Collaboration XXVII. 2016, A&A, 594, A27
Pratt, G. W., Böhringer, H., Croston, J. H., et al. 2007, A&A, 461, 71
Pratt, G. W., Arnaud, M., Piffaretti, R., et al. 2010, A&A, 511, A85
Pratt, G. W., Arnaud, M., Maughan, B. J., & Melin, J. B. 2022, A&A, 665, A24
Ramanah, K. D., Wojtak, R., Ansari, Z., Gall, C., & Hjorth, J. 2020, MNRAS, 499, 1985
Rasia, E., Meneghetti, M., & Ettori, S. 2013, Astron. Rev., 8, 40
Rasia, E., Lau, E. T., Borgani, S., et al. 2014, ApJ, 791, 96
Rasia, E., Borgani, S., Murante, G., et al. 2015, ApJ, 813, L17
Russell, H. R., Sanders, J. S., & Fabian, A. C. 2008, MNRAS, 390, 1207
Sereno, M., Covone, G., Izzo, L., et al. 2017, MNRAS, 472, 1946
Snowden, S. L., Mushotzky, R. F., Kuntz, K. D., & Davis, D. S. 2008, A&A, 478, 615
Starck, J.-L., Fadili, J., & Murtagh, F. 2007, IEEE Trans. Image Proc., 16, 297
Vikhlinin, A. 2006, ApJ, 640, 710
Vikhlinin, A., Kravtsov, A., Forman, W., et al. 2006, ApJ, 640, 691
Vincent, P., Larochelle, H., Lajoie, I., et al. 2010, J. Mach. Learn. Res., 11, 12
Yan, Z., Mead, A. J., Van Waerbeke, L., Hinshaw, G., & McCarthy, I. G. 2020, MNRAS, 499, 3445

Appendix A: Supplementary Material

A.1. Correlation between χ_D and χ_S

In Fig. A.1, we present the correlation between χ_D and χ_S for the clusters in THE THREE HUNDRED PROJECT. These parameters classify the clusters according to their intrinsic dynamical state and the smoothness of their temperature profiles, respectively. Specifically, small values of χ_D indicate relaxed clusters, while small values of χ_S indicate clusters with smooth temperature profiles. We find a moderate but highly significant correlation between χ_D and χ_S, with a Spearman's correlation coefficient of 0.42 and a P value of 5 × 10⁻¹⁵.
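As an illustration of the statistic quoted above, the following is a minimal sketch of Spearman's rank correlation, written with the Python standard library only. It is not the code used in this work: the function names and the toy `chi_d` and `chi_s` values are hypothetical, and the P value is not computed here.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = 0.5 * (i + j) + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman coefficient: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical toy values, NOT taken from the simulations.
chi_d = [0.2, 0.5, 0.9, 1.4, 2.0]
chi_s = [0.1, 0.4, 0.3, 0.8, 0.7]
rho = spearman_rho(chi_d, chi_s)
```

Because the coefficient depends only on ranks, it is insensitive to any monotonic rescaling of χ_D or χ_S, which makes it a natural choice for comparing two indicators defined on different scales.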

Fig. A.1.

Correlation between χ_D and χ_S for the simulated clusters in THE THREE HUNDRED PROJECT. Cyan circles and green triangles represent the 20 most relaxed clusters and the 20 smoothest profiles, respectively. Magenta circles and orange triangles represent the 20 most disturbed clusters and the 20 most irregular profiles, respectively.

A.2. Temperature profile reconstruction with IAE model for the fine binning cases

Figure A.2 shows the posterior distributions of the IAE model parameters obtained using MCMC for the 3D-3D and 2D-3D fine binning cases, for the most relaxed and most disturbed clusters and for the most regular and most irregular profiles in the validation sample. The parameters are well constrained; the constraints are tighter for the relaxed cluster profile than for the disturbed cluster profile, which shows broader contours.

Fig. A.2.

Two-dimensional joint posterior probability distributions and one-dimensional marginal posterior probability distributions of the IAE model parameters for the 3D-3D fine binning (top panel) and 2D-3D fine binning (bottom panel) cases, for the most relaxed cluster (smooth profile) and the most disturbed cluster (irregular profile). The shaded contours represent the 68% and 95% confidence regions. For comparison, the 2D contours from the 3D-3D fit are overplotted as dashed red and blue lines in the 2D-3D fine binning panel.

Figure A.3 compares 20 randomly selected 3D mass-weighted temperature profiles from THE THREE HUNDRED PROJECT validation sample with the corresponding reconstructed 3D median temperature profiles obtained from the IAE model (3D-3D fine binning case). The analysis covers the radial range [0.02-2] R₅₀₀ using 48 radial bins. The discrepancy between the true and reconstructed profiles remains below 5% across most of the radial range. In some cases, however, particularly in the outer regions (R > 1.5 R₅₀₀), we observe larger discrepancies (generally less than 20%). This behaviour can be attributed to complex physical processes in the cluster outskirts, such as merger events, shocks, and interactions between the ICM and accreting matter. These processes produce local temperature variations that the IAE model may struggle to capture accurately.

Fig. A.3.

Comparison of the 20 3D mass-weighted temperature profiles in the validation sample (dashed lines) and reconstructed 3D median temperature profiles obtained from the IAE model (solid lines). The shaded regions represent the 1-σ dispersion (16th–84th percentile range) of the recovered profile. Also shown, in the smaller subplots, are the residuals of the fit.

Figure A.4 shows the fractional residuals between the true and reconstructed temperature profiles for all the individual clusters in the validation sample, obtained with an IAE model having 20 anchor points in the 3D-3D fine binning case. The median fractional residual profile is close to zero throughout the radial range: at radii of 0.02 R₅₀₀, R₅₀₀, and 2 R₅₀₀, the values are −0.008 ± 0.025, 0.008 ± 0.037, and −0.020 ± 0.074, respectively.
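The summary statistics quoted above can be illustrated with a short sketch. It assumes, without confirmation from the text, that the fractional residual is defined as (T_rec − T_true)/T_true and that the quoted uncertainty is half the 16th–84th percentile range across clusters. The function names are hypothetical and only the Python standard library is used.

```python
from statistics import median, quantiles


def fractional_residuals(t_true, t_rec):
    """Assumed definition: (T_rec - T_true) / T_true, bin by bin."""
    return [(r - t) / t for t, r in zip(t_true, t_rec)]


def summarise_bin(residuals_in_bin):
    """Median and half-width of the 16th-84th percentile range."""
    # quantiles(..., n=100) returns the 1st..99th percentile cut points,
    # so index 15 is the 16th percentile and index 83 is the 84th.
    cuts = quantiles(residuals_in_bin, n=100, method="inclusive")
    p16, p84 = cuts[15], cuts[83]
    return median(residuals_in_bin), 0.5 * (p84 - p16)


# Hypothetical residuals of three clusters in a single radial bin.
med, half_width = summarise_bin([-0.1, 0.0, 0.1])
```

Applying `summarise_bin` to each radial bin in turn gives a median residual profile with a dispersion band, which is the kind of summary reported at 0.02 R₅₀₀, R₅₀₀, and 2 R₅₀₀.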

Fig. A.4.

Fractional residuals obtained with the IAE model having 20 anchor points for the 3D-3D fine binning case. Colour coding is the same as in Fig. 8. There is a significant improvement of 25% in the average fractional residual compared to the IAE model with 5 anchor points.

A.3. Temperature profile reconstruction with IAE model for the observational-like binning cases

Figure A.5 illustrates the 3D fractional residuals obtained when the error bars (i.e. the error covariance matrix, see Eq. 19) are not considered in fitting the 2D temperature profiles defined at twelve coarse bins with the IAE model. In this scenario, the scatter in the inner regions is larger than in the case where the error covariance matrix is taken into account. The median 3D fractional residuals at radii of 0.02 R₅₀₀, R₅₀₀, and 2 R₅₀₀ are 0.048 ± 0.090, −0.005 ± 0.063, and −0.080 ± 0.175, respectively. The median of the fractional residuals across the entire radial range for the complete sample is −0.006 ± 0.062.
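To show why dropping the error bars changes the balance between inner and outer bins, the sketch below compares an unweighted and a weighted χ² for a toy profile. It is a simplification and not the likelihood of Eq. 19: the covariance matrix is assumed to be diagonal, and the function name, temperatures, and slope of the linear error model are all hypothetical.

```python
def chi2_diagonal(data, model, sigma=None):
    """Chi-square for independent bins; sigma=None means unit weights."""
    if sigma is None:
        sigma = [1.0] * len(data)
    return sum(((d - m) / s) ** 2 for d, m, s in zip(data, model, sigma))


# Hypothetical 2D temperatures (keV) in three radial bins and a trial model.
radii = [0.1, 0.5, 1.0]  # in units of R500
t_obs = [6.0, 5.0, 3.0]
t_model = [5.8, 5.1, 3.6]
# Assumed error model: sigma grows linearly with radius (values are made up).
sigma = [0.1 + 0.5 * r for r in radii]

# Per-bin contributions to the two chi-square values.
unweighted = [(d - m) ** 2 for d, m in zip(t_obs, t_model)]
weighted = [((d - m) / s) ** 2 for d, m, s in zip(t_obs, t_model, sigma)]

outer_share_unweighted = unweighted[-1] / sum(unweighted)
outer_share_weighted = weighted[-1] / sum(weighted)
```

In this toy case the outermost bin dominates the unweighted χ², whereas with errors that grow with radius the innermost bin carries most of the weight. A fit without error bars is therefore less tightly anchored in the centre, which is qualitatively consistent with the larger inner scatter described above.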

Fig. A.5.

Fractional residuals for 115 clusters in the validation sample with the IAE model for the 2D-3D fit (coarse binning) using 2D temperature profiles defined at twelve radial bins up to R₅₀₀, without considering errors on the 2D temperature profiles. Colour coding is the same as in Fig. 10. For simplicity, only the 3D temperature reconstruction is plotted.

Figure A.6 illustrates the posterior distributions of the IAE model parameters obtained via MCMC for the 2D-3D coarse binning cases (both twelve and six bins). The analysis covers the most relaxed and most disturbed clusters, along with the most regular and most irregular profiles in the validation sample. While the parameters are well constrained, the constraints are slightly weaker than in the fine binning cases.

Fig. A.6.

Two-dimensional joint posterior probability distributions and one-dimensional marginal posterior probability distributions of the IAE model parameters with 2D-3D coarse binning of twelve (top panel) and six (bottom panel) radial bins, for the most relaxed cluster (smooth profile) and the most disturbed cluster (irregular profile). The shaded contours represent the 68% and 95% confidence regions. For comparison, the 2D contours for the 3D-3D binning case are overplotted as dashed red and blue lines in the bottom panel.

Figures A.7 and A.8 compare 20 randomly selected 2D and 3D profiles from THE THREE HUNDRED PROJECT validation sample with the reconstructed 2D and 3D median temperature profiles obtained from the IAE model, using observation-like binning of twelve and six bins, respectively, which is typical of X-ray observations. The simulated 2D temperature profiles are fitted in the radial range [0.02-1] R₅₀₀ with the convolved IAE model, assuming that the errors on the temperature profiles increase linearly with radius. The discrepancy between the true and reconstructed temperature profiles remains around 5% over the 2D fitting range [0.02-1] R₅₀₀.

Fig. A.7.

Left panel: Comparison of the 20 simulated 2D temperature profiles (solid points with errors) and reconstructed 2D temperature profiles obtained using IAE model (solid lines), the fitting being performed in the range [0.02-1] R₅₀₀ considering twelve 2D temperature bins. The shaded regions represent the 1-σ dispersion of the reconstructed 2D temperature profiles. The smaller subplots show the residuals of the fit. Right panel: Solid lines and the shaded regions show the corresponding reconstructed 3D temperature profiles and the 1-σ dispersion respectively. Also shown in the dashed lines are the true 3D mass-weighted temperature profiles.

Fig. A.8.

Left panel: Comparison of the 20 simulated 2D temperature profiles (solid points with errors) and reconstructed 2D temperature profiles obtained using IAE model (solid lines), the fitting being performed in the range [0.02-1] R₅₀₀ considering six 2D temperature bins. The shaded regions represent the 1-σ dispersion of the reconstructed 2D temperature profiles. The smaller subplots show the residuals of the fit. Right panel: Solid lines and the shaded regions show the corresponding reconstructed 3D temperature profiles and the 1-σ dispersion respectively. Also shown in the dashed lines are the true 3D mass-weighted temperature profiles.

A.4. DR1 sample used in this work

Table A.1 lists all the clusters in the DR1 sample used in this work, giving their names, redshifts, and other relevant properties. The sample comprises 30 clusters whose mass, redshift, and Planck S/N distributions are similar to those of the CHEX-MATE parent sample.

Table A.1.

List of the clusters of the DR1 sample.




