Resolving the hydrostatic mass profiles of galaxy clusters at z~1 with XMM-Newton and Chandra

We present a detailed study of the integrated total hydrostatic mass profiles of the five most massive ($M^{\mathrm{SZ}}_{500}>5 \times 10^{14}$ M$_{\odot}$) galaxy clusters selected at $z\sim1$ via the Sunyaev-Zel'dovich effect. These objects represent an ideal laboratory to test structure formation models where the primary driver is gravity. Optimally exploiting spatially-resolved spectroscopic information from XMM-Newton and Chandra observations, we used both parametric (forward, backward) and non-parametric methods to recover the mass profiles, finding that the results are extremely robust when density and temperature measurements are both available. Our X-ray masses at $R_{500}$ are higher than the weak lensing masses obtained from the Hubble Space Telescope (HST), with a mean ratio of $1.39^{+0.47}_{-0.35}$. This offset goes in the opposite direction to that expected in a scenario where the hydrostatic method yields a biased, underestimated, mass. We investigated halo shape parameters such as sparsity and concentration, and compared to local X-ray selected clusters, finding hints for evolution in the central regions (or for selection effects). The total baryonic content is in agreement with the cosmic value at $R_{500}$. Comparison with numerical simulations shows that the mass distribution and concentration are in line with expectations. These results illustrate the power of X-ray observations to probe the statistical properties of the gas and total mass profiles in this high-mass, high-redshift regime.


Introduction
In the current ΛCDM paradigm, structure formation in the Universe is driven by the gravitational collapse of the dark matter component. In this context, the form of the dark matter density profile is a sensitive test not only of the structure formation scenario, but also of the nature of the dark matter itself. In addition, it is impossible to fully comprehend the baryonic physics without first achieving a full understanding of the dominant dark matter component.
Cosmological numerical simulations uniformly predict a quasi-universal cusped dark matter density profile, whose form only depends on mass and redshift. Perhaps the best-known parameterisation of dark matter density profiles is the Navarro-Frenk-White (NFW) profile suggested by Navarro et al. (1997). This profile is flexible; in scaled coordinates (i.e. radius scaled to the virial radius) its shape is characterised by a single parameter, the concentration c, the ratio of the scale radius to the virial radius,r s /R ∆ 1 . Its normalisation, for a given concentration, is proportional to the mass. The concentration is known to exhibit a weak dependence on mass and redshift (typically a decrease of a factor 1.5 at z = 1, e.g. Duffy et al. 2008), although the exact dependence is a matter of some debate in the literature (e.g. Diemer & Kravtsov 2015).
In the local (z 0.3) Universe, there is now strong observational evidence for NFW-type dark and total matter density profiles with typical concentrations in line with expectations from simulations. Such evidence comes both from X-ray observations (e.g. Pointecouteau et al. 2005;Vikhlinin et al. 2006;Buote et al. A&A proofs: manuscript no. high_z_mass_printer sumption of hydrostatic equilibrium (HE), can give more precise results than other methods because of their good spatial resolution and signal-to-noise ratios. A combination with theoretical modelling can give crucial insights into both the dark matter collapse and the coeval evolution of the baryons in the potential well.
Here we present a pilot study of the X-ray hydrostatic mass profiles of the five most massive SZ-detected clusters at z ∼ 1, where the mass is M 500 > 5 × 10 14 M ⊙ as estimated from their SZ signal. Initial results, obtained by optimally combining spatially and spectrally resolved XMM-Newton and Chandra observations, concerned the evolution of gas properties, and were described in Bartalucci et al. (2017, hereafter B17). Here we used the same observations to probe the total mass and its spatial distribution. We discuss the various X-ray mass estimation methods used in Sect. 2, and the robustness of the recovered mass distribution in Sect. 3. Results are compared with local systems to probe evolution in Sect. 4 and cosmological numerical simulations in Sect. 5. We discuss our conclusions in Sect. 6.

Sample
A detailed description of the sample used here, including the data reduction, is given in B17. Briefly, the sample is drawn from the South Pole Telescope (SPT) and Planck SZ catalogues (Bleem et al. 2015;Planck Collaboration XXVI 2011), and consists of the five galaxy clusters with the highest SZ mass proxy value 2 (M SZ 500 5 × 10 14 M ⊙ ) at z > 0.9 (see Fig. 1 of B17). All five objects were detected in the SPT survey; PLCK G266.6+27.3 was also independently detected in the Planck SZ survey. All five have been observed by both XMM-Newton and Chandra, using the European Photon Imaging Camera (EPIC, Turner et al. 2001 andStrüder et al. 2001) and the Advanced CCD Imaging Spectrometer (ACIS, Garmire et al. 2003), respectively. Four objects were the subject of an XMM-Newton Large Programme, for which the exposure times were tuned so as to enable extraction of temperature profiles up to R 500 . Shorter archival Chandra observations were also used. The fifth object, PLCK G266.6+27.3, was initially the subject of a snapshot XMM-Newton observation (Planck Collaboration XXVI 2011), and was then subsequently observed in a deep Chandra exposure.
Dedicated pipelines, described in full in B17, were used to produce cleaned and reprocessed data products for both observatories. These pipelines apply identical background subtraction and effective area correction techniques to prepare both XMM-Newton and Chandra data for subsequent analysis. The definition of surface brightness and temperature profile extraction regions was also identical, and point source lists were combined. 2 Published SPT masses are estimated 'true' mass from the SZ signal significance, as detailed in Bleem et al. (2015). Masses in the Planck catalogue are derived iteratively from the Y SZ -M 500 relation calibrated using hydrostatic masses from XMM-Newton. They are not corrected for hydrostatic bias and are on average 0.8 times smaller. In Fig. 1 of B17, and in this work, the SPT masses were renormalised by a factor of 0.8 to the Planck standard.

Preliminaries
Under the assumptions of spherical symmetry and HE, the integrated mass profile of a cluster is given by where µ = 0.6 is the mean molecular weight in a.m.u 3 , m H is the hydrogen atom mass, and T (r) and n e (r) are the 3D temperature and density radial profiles, respectively. The key observational inputs needed for this calculation are thus the radial density and temperature profiles, plus their local gradients. A complication is that these quantities are observed in projection on the sky, and thus the bin-averaged 2D annular (projected) measurements must be converted to the corresponding measurements in the 3D shell (deprojected) quantities. A number of approaches exist in the literature for the specific case of cluster mass modelling (for a review, see e.g. Ettori et al. 2013, and references therein). Generally speaking, one can either model the mass distribution and fit the projected (2D) quantities (backward-fitting), or deproject the observable quantities to obtain the 3D profiles and calculate the resulting integrated mass profile (forward-fitting). This deprojection in turn can either be performed either by using parametric functions or be undertaken non-parametrically.
In the following, we chose to calculate all deprojected quantities at the emission-weighted effective radius, r w , assigned to each projected annulus, i, defined as in Lewis et al. (2003): Formally, r w should be calculated iteratively from the density profile, but McLaughlin (1999) has shown that the above equation is an excellent approximation for a wide range of density profile slopes.

Density
We used the combined XMM-Newton-Chandra density profiles detailed in B17, which were derived from the [0.3-2] keV band surface brightness profiles using the regularised non-parametric deprojection technique described in Croston et al. (2006). As shown in B17, the resulting 3D (deprojected) density distributions from XMM-Newton and Chandra agree remarkably well. We then fitted these profiles simultaneously with a parametric model based on that described in Vikhlinin et al. (2006, see Appendix A), allowing us to obtain for each object a combined density profile that fully exploits the high angular resolution of Chandra in the core and the large effective area of XMM-Newton in the outskirts. The resulting 3D density distribution is technically a parametric profile. However, in view of the much better statistical quality of the density profiles (compared to that of the temperature profiles), this last parametric step does not overconstrain the resulting mass distribution. As in B17, to avoid extrapolation, the minimum and maximum radii for the parametric models were set to match those of the measured deprojected profiles. Fig. 1: 3D temperature profiles of all the clusters of our sample. Radii are scaled by R Y X 500 . Section 2.3.2 describes the temperature profile calculation. For each panel: the black points represent the non-parametric-like 3D temperature profiles measured using Chandra and XMM-Newton, with round and squared points, respectively. The grey shaded area represents the best-fitting 3D parametric model . The blue and red areas represent the result of the backward fit (BP approach, see Sect. 2.4.1), assuming hydrostatic equilibrium and an NFW or an Einasto mass profile, respectively. The parametric models were estimated only in the radial range covered by the density profile. The shaded regions correspond to the 68% confidence level regions.
SPT−CL J0546−5345 presents a clear substructure in its south-west sector which was not masked in B17. Since here our focus is on the measurement of integrated mass profiles, such substructures should generally be excluded from the analysis. We thus computed a new combined density profile with the substructure masked for this system. The new profile we use in this work is described in Appendix B and is shown in Fig. B.1.

Temperature
We base our 3D (deprojected) temperature profiles on those published in B17. In a first step, we extracted spectra from concentric annuli centred on the X-ray peak and determined the 2D (projected) temperature profile by measuring the temperature in each bin. We iteratively modified the annular binning scheme defined in B17, to ensure that the R Y X 500 fell within the outermost radius of the final annulus of each profile.
We then employed two methods to obtain the 3D temperature profile: -Parametric: We fitted a model similar to that proposed by Vikhlinin et al. (2006), reducing the number of free parameters when necessary, to the 2D profiles. This model was convolved with a response matrix to take into account projection and (for XMM-Newton) PSF redistribution; during this convolution, the weighting scheme proposed by Vikhlinin (2006, see also Mazzotta et al. 2004) was used to correct for the bias introduced by fitting isothermal models to a multi-temperature plasma. Uncertainties were computed via 1000 Monte Carlo simulations of the projected temperature profiles. -Non-parametric-like: Analytical models such as those described above tend to be overconstrained, and do not reflect the fact that the temperature distribution is measured only at the points corresponding to the limited number of annuli within which spectra are extracted. To overcome these limitations we define the non-parameteric-like temperature profile by estimating the parametric model temperature at the weighted radii corresponding to the 2D annular binning scheme, and imposing the uncertainty on the annular spectral fit as a lower limit to the uncertainty in the 3D bin.
The resulting profiles are shown in Fig. 1, where the smooth grey envelope represents the parametric 3D temperature distribution, and the black points with errors represent the 3D nonparametric-like temperature profile.

Mass profile calculation
Total mass profiles were determined following Eq. 1. To examine the robustness of the recovered profiles, we used both forwardfitting and backward-fitting methods, as we describe below.
-Forward non-parametric-like (FNPL): This is our baseline mass measurement. It uses the combined 3D density profile For each panel: the black points represent the mass profiles obtained from the forward non-parametric-like method, using the HE equation and the non-parametric-like temperature profiles shown as black points in Fig. 1. The blue and red solid lines represent the fit of these forward non-parametric-like profiles using a NFW and an Einasto model, respectively. The grey area is the mass profile computed assuming hydrostatic equilibrium and using the parametric temperature profiles shown with a grey area in Fig. 1. The blue and red envelopes represent the mass profile computed using the backward method, i.e. fitting the observed temperature profile with a model derived from the HE equation and assuming a NFW and an Einasto profile, respectively, for the underlying total mass distribution. The parametric mass profiles are estimated in the wider radial range covered by the density profile.
(Sect. 2.3.1) and the non-parametric-like 3D temperature profile as input, and produces a mass profile estimate at each weighted radius, r w . The mass measurement and its uncertainty were calculated using a similar scheme to that first presented in Pratt & Arnaud (2003) and further developed in Démoclès et al. (2010). In this procedure, a random temperature was generated at each r w , and a cubic spline was used to compute the derivative. One thousand Monte Carlo simulations of this type were performed; the final mass profile and its uncertainties were then derived from the median and associated 68% confidence region. The mass profiles derived from these realisations were constrained to respect the monotonic condition (i.e. M(r + dr) > M(r)) and to be convectively stable (i.e. dln T/ dln n e < 2/3). The resulting mass profiles are shown with their corresponding error bars in Fig. 2. The relative errors are of the order of 30% in the inner core, and (somewhat counterintuitively) decrease to ∼ 10 − 15% at large radii. This effect is an intrinsic property of the typical amplitude and uncertainty on the logarithmic density and temperature gradients, and is quantified in more detail in Appendix C. -Forward parametric (FP): Here the fully parametric 3D density and temperature profiles were used to compute the total mass distribution on the radial grid of the combined density profile. Uncertainties were calculated using 1000 Monte Carlo realisations, and we did not impose any condition on the resulting mass profiles. The grey shaded areas in Fig. 2 correspond to the 68% dispersion envelopes. This method may lead to non-physical results, as can be seen at large radii in SPT−CL J2146−4633 and SPT−CL J0546−5345, where the cumulative total mass profiles start decreasing. For this reason, we do not compute a median profile and we do not use these results to perform quantitative analyses. However, these profiles retain the maximum amount of information on the intrinsic dispersion, allowing us to explore the dispersion related to density and temperature measurement errors. Additionally, these mass profiles are estimated on the finer radial grid and wider radial range of the density profiles and so they can be used to qualitatively investigate the behaviour in the cluster core and outskirt regions. We note that we did not extrapolate the parametric model of the density profiles, i.e. we did not attempt to estimate masses in regions where there are no observational constraints. -Backward parametric (BP): Here we assumed that the total mass distribution could be described by an NFW (Navarro et al. 1997) or Einasto (Einasto 1965;Navarro et al. 2004) distribution, and inverted Eq. 1, taking into account the 3D density profile, to obtain the corresponding 3D temperature profile. This was then projected and convolved with the instrument response and PSF, and fitted to the 2D temperature profile. Uncertainties were estimated through a Monte Carlo Notes: (a) SPT name: SPT-CLJ0615-5746. (b) The R HE 500 , M HE 500 and the c 500 values were calculated performing an extrapolation (see text for details). For this reason, these values were not used for quantitative analysis, and the errors are not reported. randomisation procedure using 1000 realisations. The resulting temperature and mass profiles are shown in Figs. 1 and 2. The analysis was again restricted to the radial range covered by the density profile.
-Direct fit (DF): We also directly fitted the FNPL mass profiles using the NFW and Einasto functional forms. The resulting best fits, computed on the combined density profile radial grid, are shown in Fig. 2. The corresponding uncertainties were estimated by repeating the fitting procedure on 1 000 Monte Carlo realisations of the FNPL mass profile. The NFW fit concentrations at R 500 , c 500 ≡ R 500 /r s where r s is the scale radius, are given in Table 1.

Determination of mass at fixed radius and density contrast
The value of M Y X 500 (and consequently R Y X 500 ) was determined iteratively using the M 500 -Y X relation, as calibrated in Arnaud et al. (2010), assuming self-similar evolution. Here Y X is defined as the product of the gas mass computed at R Y X 500 and the temperature measured in the [0.15 − 0.75]R Y X 500 region ). As the radial density bin widths used here differ from those used in B17, as described above in Sect. 2.3, the gas mass profiles and the quantities based on M 500 -Y X were updated. For this reason, the values in Table 1 differ slightly (∼ 1%) from those published in Table 2 of B17.
We determined the FNPL masses at density contrasts ∆ = [2500, 500], namely M HE 2500 and M HE 500 , at radii R HE 2500 and R HE 500 , respectively. We also interpolated all the mass profiles (except FP) described in Sect. 2.4, at R Y X 500 . These are referred to as M Method (R < R Y X 500 ) in the following text and figures. Radii and the corresponding masses are given in Table 1.
The M HE 500 of SPT−CL J2106−5844 reaches a non-physical value of ∼ 40 × 10 14 M ⊙ . The R Y X 500 is at the outer edge of the last temperature bin, so extrapolation is required. As the mass profile of this object is very steep, the radius at which ∆ = 500 is boosted, and the corresponding mass reaches non-realistic values. The resulting M HE 500 and R HE 500 estimates are provided in Table 1, although they are not used for any quantitative analysis. The DF NFW yields more reasonable M 500 estimates, although they are poorly constrained. The best-fit c 500 value is equal to the minimum value allowed by the fit (c 500 = 0.01), corresponding to the quasi-power law behaviour of the mass profile, and yields an M 500 that is significantly greater than M Y X 500 . A more conservative lower value of c 500 = 1 forces the curve to be higher in the core and the fit is then driven by the third point (R ∼ R 2500 ) because of its small relative error. This analysis yielded a ∼ 5 times higher χ 2 and a value of M 500 = 7.6±2.1×10 14 M ⊙ , now in agreement with M Y X 500 . This result must simply be considered as an NFW extrapolation, with priors on c 500 , of the well-determined mass at R 2500 .
Two objects from our sample, SPT−CL J0546−5345 and SPT−CL J2106−5844, were also analysed by Amodeo et al. (2016) using Chandra only datasets. The authors estimated M 200 and c 200 using the BP approach and the NFW functional form. Using the concentration and mass values published in their Table 2 to compute M 500 yields M 500 = 4.0 ± 2.9 × 10 14 M ⊙ and M 500 = 6.5 ± 3.9 × 10 14 M ⊙ for SPT−CL J0546−5345 and SPT−CL J2106−5844, respectively. These are perfectly consistent with the present BP-NFW estimates; however, our deeper observations and extended radial coverage allowed us to better constrain the measurements, the relative errors being ∼ 5 times smaller.

Robustness of X-ray mass
In this section, we first examine the robustness of the HE mass estimate to the X-ray analysis method. As the HE assumption is a known source of systematics, through the HE bias, we then compare the HE mass to lensing mass estimates, which do not rely on this assumption. Figure 2 shows the mass profiles resulting from the different mass estimation methods discussed above. The BP results indicate that while the NFW model is a good description in the case of relaxed objects (e.g. PLCK G266.6+27.3) and some perturbed systems (e.g. SPT−CL J2341−5119), the Einasto model is generally a better fit for our sample (as is evident from the figure, and from the χ 2 value) and is more able to fit a wider range of dynamical states. This is unsurprising given the larger number of parameters in the Einasto model. Forward and backward methods also give extremely consistent results. The limitations of the NFW model can be seen in SPT−CL J0546−5345, where this form is clearly a poor description of the data, leading to the BP NFW masses being somewhat different to those from other methods.

Mass profile shape
Overall, all the mass estimation methods yield remarkably robust and consistent results within the radial range covered by the spectroscopic data, i.e. within the minimum and maximum effective radii of the temperature profile bins, except in cases where the underlying model is insufficiently flexible. Mass profile uncertainties are quite different between methods, however, with the FP method yielding the smallest and the FNPL method A&A proofs: manuscript no. high_z_mass_printer Fig. 3: Comparison of the hydrostatic mass computed at fixed radius, R Y X 500 , using the different methods, in units of M Y X 500 . There is excellent agreement, with differences of less than 10%, when the radius is enclosed in the radial range covered by the spectroscopic data. yielding the largest (or most conservative). This simply reflects the restrictions each method places on the possible shape of the profile.
Outside the radial range covered by the spectroscopic data, the results are most robust and agnostic to the mass estimation method when the profiles are regular and can be described by a simple model (e.g. NFW). However, when the radial sampling is poor (the profiles have few points) or when the profile is irregular (e.g. SPT−CL J0546−5345), estimation of the mass outside the radial range probed by the spectroscopic data is less robust and will depend strongly on the method used to measure the mass. In addition, outside the region covered by the spectroscopic data, the uncertainties rapidly increase with the distance from effective radius of the final temperature measurement, in spite of the density information.

Mass within R Y X 500
We now turn to the robustness of the mass determined within a fixed radius, calculated as described in Sect. 2.4.2. Figure 3 shows the ratio between the mass obtained employing the different methods, with M Y X 500 as a reference mass. For SPT−CL J2146−4633 and PLCK G266.6+27.3 the HE mass measurements are in excellent agreement, the difference being within a small percent. SPT−CL J2341−5119 and SPT−CL J0546−5345 present larger differences (∼ 10%) according to the mass estimation method between mass estimates. Interestingly, the BP masses of SPT−CL J2341−5119 are closest of all the objects to its M Y X 500 . SPT−CL J2106−5844 is the only cluster for which all the methods yield masses greater than M Y X 500 , by a factor of ∼ 40%, except if we further restrict the possible range of concentration parameters. The difference between mass estimates is also noticeably larger than for the other objects, due to the limited radial coverage. Even if the masses are estimated at a fixed radius, R Y X 500 , this radius falls barely within the outermost temperature radial bin. We conclude that that in order to perform robust measurements, the radius at which the mass is to be estimated should lie within the weighted radial range covered by the spectroscopic data.

Comparison to weak lensing
Weak lensing mass measurements represent an additional and independent method of investigating the robustness of our mass determinations; furthermore, understanding the systematic differences between weak lensing and X-ray masses at z ∼ 1 is crucial for any future cosmological or physical exploitation of such samples. We compared our results with the weak lensing masses published in Schrabback et al. (2018), who determined M 500 for 13 SPT clusters observed with the Hubble Space Telescope. Four of their objects are in common with our sample. Schrabback et al. (2018) give different weak lensing M 500 estimates, depending on the choice of centre (X-ray peak and SZ peak). The top left panel of Fig. 4 show the comparison between M 500 measured using the X-ray peak as centre, M WL X−ray 500 , and M Y X 500 , as listed in Table 1. Formally, there is good agreement, with the masses for each individual cluster being consistent at 1σ. However, there is a clear systematic offset in the sense that all X-ray masses are higher than the WL masses, with an error-weighted mean ratio of M Y X 500 /M = 1.39 +0.51 −0.37 . We found the same results by comparing the X-ray masses with the weak lensing masses centred on the SZ peak, M WL SZ 500 , as shown in the bottom panels of Fig. 4. This result is unexpected. The so-called 'hydrostatic bias', owing to the assumption of HE, is believed to result in a net underestimate of the total mass in X-ray measurements, while lensing observations, although slightly biased, are expected to yield results that are closer to the true value. Indeed, such a trend has been found, for example in the Weighing the Giants (WtG, von der Linden et al. 2014) project and by the Canadian Cluster Comparison Project (CCCP, Hoekstra et al. 2015), where the Xray hydrostatic masses are ∼ 30% and 20% lower than the WL values 4 , respectively. While there are only four objects in our sample, we find the opposite trend here. With a HE-to-WL mass ratio of 1.39 +0.51 −0.37 , our results are marginally consistent at 1σ with the Schrabback et al. (2018) results, and inconsistent with WtG at the 2σ level.
This comparison underlines the capability and complementarity of X-ray observations with respect to optical observations, especially at these redshifts. The X-ray statistical errors are significantly smaller than the weak lensing uncertainties; furthermore, the X-ray results are remarkably robust, as we demonstrate in the previous sections. The results we find here show that while X-ray observations at high redshift are expensive and challenging, they offer a robust and precise tool which can efficiently complement measurements in other wavelengths.

Evolution of cluster properties
4.1. M Y X 500 -M HE 500 relation and evolution of the ratio The M 500 -Y X relation we use in this work was calibrated using hydrostatic masses derived from the relaxed subsample of Fig. 4: Comparison between our X-ray masses and the weak lensing masses published in Schrabback et al. (2018). All estimates for a given cluster are consistent within the statistical errors. However, there is a general trend of smaller lensing mass than the HE mass, contrary to expectation. Note also the higher statistical precision of the X-ray masses. Top left panel: comparison between M Y X 500 and weak lensing masses estimated at R 500 , centred on the X-ray peak. The grey area for the weak lensing represents the statistical errors. The black solid bars represent the sum in quadrature of systematic and statistical errors. The blue and red lines represent the bias, (1 − b), between the X-ray hydrostatic and weak lensing mass as measured by Weighting the Giants (WtG, von der Linden et al. 2014) and by the Canadian Cluster Comparison Project (CCCP, Hoekstra et al. 2015), respectively. To better visualise the points, we crop the lower values of SPT-CLJ2341-5119 and SPT-CLJ0546-5345, which are of the order of ∼ 10 −1 × 10 14 M ⊙ . Top right panel: same as the top left panel, except showing the comparison between the hydrostatic mass computed at R Y X 500 , M HE (R < R Y X 500 ), and the weak lensing masses. Bottom left and right panels: same as the top panels except that weak lensing masses are computed using the Sunyavez-Zeldovich (SZ) peak as the centre. The error-weighted mean ratio and corresponding errors are reported in each panel.
12 REXCESS objects Pratt et al. 2010), plus eight additional relaxed systems from Arnaud et al. (2007). This relation was derived from local objects and we assumed self-similar evolution. The present observations offer the opportunity to investigate the robustness of this relation when applied to a high-redshift sample dominated by disturbed objects. Figure 5 shows the resulting comparison of M Y X 500 with M HE 500 and M HE (R < R Y X 500 ), in the left and right panels, respectively. In both cases there is excellent agreement between individual measurements. The only exception is the M HE 500 of SPT−CL J2106−5844, which is subject to the systematic uncertainty discussed above. The error-weighed mean ratios are M Y X 500 /M HE 500 = 1.02 +0.15 −0.13 and M Y X 500 /M HE (R < R Y X 500 ) = 1.04 +0.09 −0.08 , consistent with unity. This suggets that the relation is robust, even when applied to such an extreme sample. The good agreement is consistent either with no evolution of the ratio between the two quantities, or with an evolution of the ratio where the evolution is counterbalanced by some other effects. However the latter explanation is unlikely given that the evolution is perfectly compensated, such that the agreement between M Y X 500 and M HE 500 is excellent as a function of redshift. We note that this result does not necessarily imply that there is no evolution of the bias between the hydrostatic mass and the true mass. However, our A&A proofs: manuscript no. high_z_mass_printer  comparison with weak lensing above would suggest that the bias cannot be dramatic.

Scaled mass and total density profiles
We calculated the total density profiles for our sample using the best-fitting DF Einasto model. The resulting profiles are shown compared to those from REXCESS in the left panel of Fig. 6. The right panel shows the corresponding cumulative total mass profiles.
A consistent picture emerges from these comparisons. At R Y X 500 all the mass profiles are in excellent agreement: the dispersion is similar, and interestingly is centred around unity (i.e. the M HE 500 is comparable to M Y X 500 , consistent with our findings in the previous section). Apart from the profile of SPT−CL J2106−5844, which is affected by poor radial coverage especially at large radius, all the density profiles of the z ∼ 1 sample lie within the envelope of the REXCESS profiles at high radii (> 0.5R 500 ). However, the profiles tend to be shallower on average in the central regions. The profiles of SPT-CLJ2146-4633 and SPT−CL J2106−5844 are even shallower in the core (< 0.3R 500 ) than the least-peaked REXCESS profile. The other three systems lie within the 1σ dispersion of REXCESS profiles, but tend to trace the lower envelope of the distribution in total density and total mass, especially towards the most central parts.
Unfortunately, due to the small size of the sample and the poor quality of SPT−CL J2106−5844, we cannot quantify whether there is a significant difference compared to REX-CESSin median mass profile shape and/or an increase in the intrinsic scatter around it. If these differences are confirmed, this behaviour can be interpreted either as evolution in the core regions or as being due to a difference between X-ray and SZ selection. Comparison with an X-ray selected sample at similar redshifts or comparison to a similar SZ-selected sample at lower redshift, would help to clarify this point.

Sparsity
The halo sparsity was introduced by Balmès et al. (2014) to characterise the form of the mass distribution in a way that is independent of any parametric model. It is defined as the ratio of masses integrated within two fixed overdensities, where ∆ 1,2 represent the overdensities at which the masses are calculated, with ∆1 < ∆2. As is discussed in Balmès et al. (2014), the properties of the sparsity are independent of the choice of the ∆ as long as the definition of the halo is not ambiguous (∆ 1 not too small), and that dynamical interaction between baryons and dark matter can be neglected (∆ 2 not too large). We chose to measure the sparsity within overdensity of ∆ 1 = 500 and ∆ 2 = 2500 with respect to the critical density. These overdensities are well matched to the sensitivity of the X-ray observations discussed here, and are sufficiently distant to properly sample the form of the mass profile.  Bartalucci et al. (2017), computed using the M Y X 500 . Gas masses were computed using the gas mass profiles derived from the combined density profiles. We used the stellar masses published in Chiu et al. (2016) Figure 7 shows the resulting sparsity measurements for our z ∼ 1 sample. These data are compared to those from REXCESS, which exhibit a peaked distribution in a narrow range, 1 < S < 3.
Three of the clusters in our z ∼ 1 sample have sparsity values that lie well within the REXCESS distribution. SPT−CL J2146−4633 and SPT−CL J2106−5844 lie outside this distribution, their sparsity being ∼ 4 − 5 times the mean value (∼ 2) compared to REXCESS. This result reflects what we already found for the mass profiles. This study and the recent parallel study of Corasaniti et al. (2017) represent the first applications of this quantity to a large sample of objects. The narrow distributions in Fig. 7 show its effectiveness in tracing the population characteristics.

Baryon fraction
The baryon fraction determined at the radius R is defined as where M star is the total stellar mass, M gas is the gas mass, M tot is the halo total mass, and all quantities are integrated within R. In B17 we presented the baryon fraction derived using M Y X 500 for the total mass estimate. Here we extended this analysis by deriving the baryon fraction using M HE (R < R Y X 500 ) for the M tot term. This is fundamental to understand possible systematics related to the fact that the gas mass profiles and M Y X 500 measurements are correlated (i.e. the Y X is based on the gas mass). Figure 8 shows the baryon fraction as a function of mass computed for this work, the results from B17, and the mean derived from A&A proofs: manuscript no. high_z_mass_printer and M Y X 500 , respectively. Our sample and simulated cluster radial profiles were scaled by their R Y X 500 and R true 500 , respectively. The common scaled radius is indicated with R 500 .
REXCESS (Pratt et al. 2009). For M star at R 500 we used the stellar masses published in Chiu et al. (2016, the  SPT−CL J2106−5844 presents a larger deviation, but the hydrostatic mass computation for this object is affected by the lack of radial coverage. The use of hydrostatic mass measurements here confirms and consolidates what we found in B17: in this redshift regime, the baryon fraction does not show any dependence with respect to the mass. The density enclosed within a certain radius is higher hence more energy is required to expel the gas. We also confirm the good agreement between the baryon fraction of our sample with the fraction derived by the Planck collaboration (Planck Collaboration et al. 2016).

Mass profiles
We now turn to a comparison with cosmological numerical simulations. We use the same simulated sample of five z = 1 galaxy clusters in the [4 − 6] × 10 14 × M ⊙ mass range described in Sect. 6 of B17, selected from the AGN 8.0 model of the suite of hydrodynamical cosmological simulations cosmo-OWLS (Le Brun et al. 2014). These simulations include baryonic physics, and represent an extension to larger volumes of the OverWhelmingly Large Simulations project (Schaye et al. 2010).
From the simulated datasets we extracted and fitted the pressure profiles using a generalised NFW model. We then derived the simulated mass profiles by applying the hydrostatic assumption to the gNFW pressure profile in combination with the density profile (see e.g. Pratt et al. 2016). Figure 9 shows the comparison between the observed FNPL and simulated mass profiles, scaled by M Y X 500 and M true 500 , respectively, where M true 500 is defined as the sum of all the particles within R true 500 . The agreement over the full radial range is remarkably good. The shape, normalisation, and scatter of the simulated profiles seem to reproduce well the observations, four of the five observed profiles lie within the 68% dispersion of the theoretical profiles computed at 0.1 and 0.5 R true 500 . Interestingly, there is also excellent agreement of the profiles at R 500 , hinting that M Y X 500 represents a robust estimate of the true mass in this mass and redshift regime. Unfortunately, we were not able to investigate the behaviour of the profiles in the core regions below 0.1R 500 . Furthermore, as the five simulated clusters discussed here are the only objects in the cosmo-OWLS cosmological box that fulfil the mass and redshift criteria, the qualitative agreement might be coincidental. A larger number of higher resolution simulations and better sampling of the X-ray profiles are needed in order to make progress on this front.

Concentration
The NFW concentration is known to evolve with redshift and mass (e.g. Dutton & Macciò 2014), although at the highest masses there is surprisingly little evolution (Le Brun et al. 2018). While the mass dependence in the local Universe has been confirmed in a number of works (e.g. Pointecouteau et al. 2005;Vikhlinin et al. 2006;Voigt & Fabian 2006;Gastaldello et al. 2007;Buote et al. 2007;Ettori et al. 2010), evolution has received less attention (Sereno & Covone 2013;Schmidt & Allen 2007;Amodeo et al. 2016). The constraints are especially poor in the high-z regime; the typical uncertainties on concentration parameters of the five clusters at z > 0.9 studied by Amodeo et al. (2016) are of the order of ±[60-80]% (their Table 2).
The very precise measurements afforded by the present observations allow us to further investigate the c − M relation and its evolution. Figure 10 shows the concentrations for four of the clusters in our sample compared to the theoretical predictions derived from the simulations of Dutton & Macciò (2014). The predictions were computed for a set of clusters in the local and distant universe, at z = 0 and z = 1, respectively, at ∆ = 200. We translated their results at ∆ = 500 using the NFW profile. At z = 1, we considered the concentrations plotted in their Fig.  10 rather than their power law fit, the c-M relation flattening to c 200 = 4. (c 500 = 2.6) in the present high-mass range. Bhattacharya et al. (2013) found that the dispersion for the c − M relation is ∼ 30%. We used this result to roughly estimate the typical dispersion for the c − M at z = 1.
Two clusters are within the 1σ dispersion of the mean expected relation, while two are [1.5-2]σ away. We iteratively computed the mean concentration, taking into account statistical errors and intrinsic scatter. Our results agrees, at the 1σ level, with the expectations: the mean concentration is c 500 = 2.06 ± 0.67, with an estimated intrinsic dispersion of 1.2 ± 0.6. Although the sample size is small, this is the first test of the c-M relation at these redshifts with precise individual concentration measurements (i.e. errors smaller than the expected scatter).

Discussion and conclusions
We have presented the individual hydrostatic mass profiles of the five most distant (z ∼ 1) and massive (M SZ 500 > 5 × 10 14 ) galaxy clusters from the SPT and Planck cluster catalogues, measuring for the first time the profiles up to R 500 . The combination of Chandra and XMM-Newton, following the technique developed in B17, allowed us to overcome cosmological dimming and to derive robust measurements from the core regions out to R 500 . The temperature profiles cover a typical radial range of [0.08−1] R 500 , while the combined XMM-Newton/Chandra density profiles are typically in the range [0.01 − 1.7] R 500 . We considered both parametric (forward and backward) and non-parametric approaches to measuring the mass profiles. The main results regarding the robustness of the X-ray profiles are the following: -X-ray hydrostatic mass measurements at this redshift regime are remarkably robust and method-independent. All the profiles are consistent within the uncertainties as long as they are determined in the radial range where there are density and temperature measurements. This robustness is also reflected in the determination of mass at fixed radius or at a particular density contrast. -In the very core region R < 0.08 R 500 , where only density information is available, parametric models are necessary. The density information brings a certain constraint to the shape of the mass profile, but with an uncertainty that increases with decreasing radius. -At R 500 , it is essential to have a temperature measurement to anchor the total mass at this radius and constrain the shape of the mass profile. Robust M 500 estimates are only possible when this condition is fulfilled. In the absence of this constraint, model extrapolation can be rapidly divergent and can yield unphysical results. -Generally, when the radial sampling is poor (the profiles have few points) or when the profile is irregular, estimation of the mass outside the radial range probed by the temperature data is less robust and will depend strongly on the method used to measure the mass. On the other hand, if the shape of the profile is well reproduced using an NFW or Einasto-type model, the resulting mass estimate outside the range with measured temperatures is more robust. -We compared M HE 500 and M Y X 500 for four clusters of our sample with weak lensing (WL) mass measurements from HST observations, finding that the X-ray and WL mass measurements are in agreement within the uncertainties. There is, however, an offset on average, in the sense that the X-ray masses appear to be systematically higher by a factor of 1.39 +0.51 −0.37 than the WL masses. This offset goes in the opposite direction to what has been found in previous works (e.g. WtG at the 2σ level), and is contrary to the expectations for a 'hydrostatic bias'.
The above results confirm the power of combining XMM-Newton and Chandra for measuring the mass profile distribution and for estimating the hydrostatic M 500 up to z ∼ 1. We expect these results to be even less sensitive to systematic effects such as background estimation, contamination by background or foreground point sources, and the absolute temperature calibration than for local clusters. This is because object angular size is much smaller than the field of view, yielding a better constraint on the background, and the spectrum is redshifted to lower energies, where the effective area calibration of X-ray telescopes is more robust. In parallel, the Chandra observations allow robust point source detection and density measurement very deep into the core regions. In contrast, WL measurements become increasingly challenging at these redshifts. The statistical quality of the WL mass data is much poorer than is reachable with X-rays, even with HST, and control of systematic effects (in particular the measure of the redshift distribution of background sources or the removal of contamination by cluster members) becomes more demanding. The fact that we find a positive HE bias is probably linked to these effects.
We then investigated the evolution by comparison with local data and with expectations from numerical simulations. The main results were: -The agreement between the hydrostatic masses at R Y X 500 or at ∆ = 500 and M Y X 500 is remarkably good, suggesting that the M 500 -Y X relation is robust and that it can be extended to samples of disturbed and distant objects. It also suggests that there is no significant evolution between M Y X 500 and M HE 500 with redshift. The comparison with WL masses would further suggest that there is no dramatic increase in the bias between the hydrostatic mass and the true mass. However, it is clear that better WL data are needed to settle this point.
-We compared the scaled mass and total density profiles to those of the X-ray selected local sample REXCESS. This comparison shows that on average there is excellent agreement with REXCESS at large radii. The clusters of our sample exhibit a larger dispersion in shape over the full radial range, and systematically trace the lower envelope of the REXCESS distribution in the core region. These results suggest either the presence of evolution, or an X-ray / SZ selection effect. -We computed the sparsity for a large sample of clusters (the five high-z objects plus REXCESS) studied with X-ray observations. The sparsity enables efficient characterisation of the mass distribution in the cluster halo, and the comparison with REXCESS confirms the above. -We extended and strengthened the baryon fraction results found in B17. Using the hydrostatic mass measurements we confirmed our previous finding indicating that the baryon fraction at this redshift does not depend significantly on the halo mass, and agrees with the value from Planck Collaboration et al. (2016). -A comparison with the cosmo-OWLS simulations (Le Brun et al. 2014) showed that there is excellent agreement between observed and simulated profiles, the latter derived by imitating an X-ray approach. The scatter of our sample is also well reproduced by the simulations over the full radial range. We also studied the concentration-mass relation for the first time at high precision in this mass and redshift range, and found good agreement with the evolution predicted by Dutton & Macciò (2014).
This work represents the first full application of the method developed in B17, confirming that the combination of Chandra and XMM-Newton is crucial in order to study high-redshift objects, and allowing us to investigate the statistical properties of the mass profiles of cluster haloes in the high-mass, high-redshift z ∼ 1 range. Despite the small sample size, we were able to obtain a first insight into the statistical properties of these cluster haloes, suggesting profiles that are slightly less peaked than in local systems, in line with the expected theoretical evolution. However, a robust low-redshift SZ-selected anchor for the radial mass distribution is badly needed, especially taking into account the now well-known issue of X-ray versus SZ selection effects (Lovisari et al. 2017;Andrade-Santos et al. 2017;Rossetti et al. 2017). Larger sample sizes are needed to better consolidate the average behaviour and its dispersion. In parallel, higher resolution numerical simulations of larger volumes (e.g. Le Brun et al. 2018) are needed to provide the theoretical counterparts to the type of objects we have studied here. , where β is defined as the logarithmic derivative of temperature and density, namely β T (R) and β n e (R), respectively, with respect to the logarithmic derivative of the radius, β ≡ dlog x/ dlog r. Equation C.1 shows that the relative error is proportional to the sum of the term A = ∆T/T , and of the term B = ∆β T /(β T + β n e ). The behaviour of the two terms as a function of the radius can be studied using simple models for the density and temperature. We employed the density and temperature parametric models in Eq. A.1 and Eq. A.2, respectively, to generate 'toy model' temperature and density profiles. The parameters are reported in Table C.1. We neglected the error on the density profiles and assumed a constant relative error on the temperature profile of 6%. These assumptions are a good approximation of a realistic case, where density profiles are well constrained and temperature profiles are tailored to have constant signal-to-noise ratio. We generated three density profiles with different inner slopes and three temperature profiles with different shapes, shown in panels a and b of Fig. C.1, respectively. These profiles are representative of what is generally found in large samples of clusters. The density profile in the core strongly varies from cluster to cluster (cool-core, dynamically disturbed, relaxed, etc.). In the outskirts the behaviour is self-similar and all the profiles present a steep gradient. The temperature profile strongly depends on the cluster characteristics. The shape ranges from the 'bell' shape of the cool-core clusters (T model X) to being almost flat (T model Z). For a gallery of individual density and temperature profiles, see e.g. Vikhlinin et al. (2006), Pratt et al. (2007), and Croston et al. (2008).
We took as reference the temperature profile 'T model X', and computed the hydrostatic mass profiles for the three density toy models. The results are shown in panel c of Fig. C.1. We observe the following: -the logarithmic slope, β, of the density and temperature profiles strongly depends on the radius. Panel d shows that β T is small at all radii and smaller than β n e . The difference between the two increases with radius; -the relative error of the mass profile ∆M/M, shown in panel e, reflects this different behaviour of the βs.In the outskirts, β n e is much larger than β T so that the B-term in Eq. C.1 becomes negligible compared to the A-term. For this reason, the mass profile relative errors in the outskirts tend to coincide with the temperature relative error, i.e. the A-term. This is not the case in the core where the difference between β T and β n e is less important and the B-term is no longer negligible. For this reason, in the core the mass relative errors are larger than the A-term only. The relative importance of the B-term is related to the distance between the two βs; A&A proofs: manuscript no. high_z_mass_printer -the behaviour of the B-term as a function of radius can be visualised studying the ratio between ∆β T and the distance D between the β n e and β T , the distance being defined as D ≡ |β T − β n e |. Panel f in Fig. C.1 shows that this ratio decreases with radius as D increases. The mass measurements in the final radial bin slightly deviate from this behaviour because of the larger error of the β T in the last bin. We derived β T and its error ∆β T by estimating the median and the 68% deviation within 1000 realisations of the temperature profile.
For each realisation, we estimated the gradient for the n-th bin determining the slope using the [n − 1, n, n + 1] bins. For the boundary bins we determined the slope selecting the first and last three radial bins, respectively. This is less constraining for the gradient so that the dispersion within all the realisations is greater and the resulting error ∆β T is larger; -there is a clear correlation in the log-log space between the relative error on the mass ∆M/M and this ratio ∆β T /D, as seen in panel g. Results using the other two temperature profiles are also shown and present the same behaviour.
The behaviour of the relative error on the mass profile is thus an intrinsic property of the hydrostatic equation, Eq. 1, and does not depend on the temperature or density profile shape. In particular, this effect is tightly linked to the general behaviour of the temperature and density gradients in galaxy clusters. Model Parameters n e model 1 a n 01 = 6.28 × 10 −3 cm −3 , r c1 = 198 kpc, β 1 = 0.7, α = 0.57, r s = 1613 kpc, ǫ = 0.1, n 02 = 10 −3 cm −3 , β 2 = 0.67, r c2 = 19.8 kpc T model X b T 0 = 9.1 keV, τ = 0.46, r cl = 22.78 kpc, a cl = 1.21, r t = 7391 kpc, b = 1.03, c = 2.56 Notes: (a) n e models 2 and 3 were obtained using the same parameters as that of model 1 except for α = 0.29 and α = 1.37, respectively. (b) model Y was obtained using the same parameters as model X except for T 0 = 7.28, τ = 0.91, and c = 1.28. The temperature model Z is a flat profile with T = 6.5.
Article number, page 14 of 15 Bartalucci et al.: Total mass distribution in high-redshift galaxy clusters Panel a: density profiles generated using the parametric model of Vikhlinin et al. (2006). The errors are set to 0. Panel b: temperature profiles generated using the parametric model of Vikhlinin et al. (2006). The relative error on each bin is fixed to 6%. For clarity, bins and corresponding errors are reported only for 'T model X'. We report only the shape form of 'T model Y' and 'T model Z' with solid grey and magenta lines, respectively. Panel c: hydrostatic mass profiles derived from the three density profiles and from the temperature profile 'T model X' shown in panel a and b, respectively, using the FP method. Panel d: β is defined as the logarithmic derivative of the density and temperature, namely β(n e ) and β(kT ), with respect to the logarithm of the radius, β ≡ ∂ ln x/∂ ln r. The errors of β(kT ) are derived via a Monte-Carlo procedure. Panel e: temperature and mass profile relative errors as a function of the radius. The dotted line represents the 6% relative error. The mass profiles are computed using the three density profiles and the 'T model X'. Panel f: ratio between the error of β(kT ) and the distance D between β(n e ) and β(kT ), the distance being defined as D ≡ |β(kT ) − β(n e ). Panel g: mass profile relative errors as a function of the same quantity shown on the y axis of panel f. For this plot, we also included the results using all the temperature profiles. Points are colour-coded in order to clearly identify the inner core (blue points, upper right part of the plot) and the outskirts (red points, lower left part of the plot).