Issue 
A&A
Volume 659, March 2022



Article Number  A126  
Number of page(s)  18  
Section  Cosmology (including clusters of galaxies)  
DOI  https://doi.org/10.1051/00046361/202039980  
Published online  17 March 2022 
Velocity dispersion and dynamical masses for 388 galaxy clusters and groups
Calibrating the M_{SZ} − M_{dyn} scaling relation for the PSZ2 sample
^{1}
Instituto de Astrofísica de Canarias, C/Vía Láctea s/n, 38205 La Laguna, Tenerife, Spain
email: alejandro.aguado.barahona@gmail.com, aaguado@iac.es
^{2}
Universidad de La Laguna, Departamento de Astrofísica, 38206 La Laguna, Tenerife, Spain
^{3}
Dipartimento di Fisica, Sapienza Università di Roma, Piazzale Aldo Moro 5, 00185 Roma, Italy
^{4}
Purple Mountain Observatory, No. 8 Yuanhua Road, Qixia District, Nanjing 210034, PR China
Received:
24
November
2020
Accepted:
23
November
2021
The second catalogue of Planck SunyaevZeldovich (SZ) sources, hereafter PSZ2, represents the largest galaxy cluster sample selected by means of their SZ signature in a fullsky survey. Using telescopes at the Canary Island observatories, we conducted the longterm observational program 128 MULTIPLE16/15B (hereafter LP15), a large and complete optical followup campaign of all the unidentified PSZ2 sources in the northern sky, with declinations above −15° and no correspondence in the first Planck catalogue PSZ1. This paper is the third and last in the series of LP15 results, after Streblyanska et al. (2019, A&A, 628, A13) and AguadoBarahona et al. (2019, A&A, 631, A148), and presents all the spectroscopic observations of the full program. We complement these LP15 spectroscopic results with Sloan Digital Sky Survey archival data and other observations from a previous program (ITP1308), and present a catalogue of 388 clusters and groups of galaxies including estimates of their velocity dispersion. The majority of them (356) are optical counterparts of PSZ2 sources. A subset of 297 of those clusters are used to construct the M_{SZ} − M_{dyn} scaling relation based on the estimated SZ mass from Planck measurements and our dynamical mass estimates. We discuss and correct for different statistical and physical biases in the estimation of the masses, such as the Eddington bias when estimating M_{SZ} and the aperture and the number of galaxies used to calculate M_{dyn}. The SZtodynamical mass ratio for those 297 PSZ2 clusters is (1 − B) = 0.80 ± 0.04 (stat) ± 0.05 (sys), with only marginal evidence for a possible mass dependence for this factor. Our value is consistent with previous results in the literature, but is associated with a significantly smaller uncertainty due to the use of the largest sample size for this type of study.
Key words: galaxies: clusters: general / catalogs / largescale structure of Universe
© ESO 2022
1. Introduction
Galaxy clusters (GCs) are the most massive bound objects in the Universe that emerge from hierarchical structure formation (Peebles 1980). They are excellent tracers of the matter density distribution on scales relevant for cosmological studies. Indeed, the evolution of GC abundance with mass and redshift is very sensitive to the amplitude of the matter density fluctuations σ_{8} and the mean matter density of the Universe Ω_{m} (Allen et al. 2011).
The European Space Agency (ESA) Planck^{1} mission (Planck Collaboration I 2014) provided, for the first time, the possibility to detect galaxy clusters using their SunyaevZel’dovich (SZ; Sunyaev & Zeldovich 1972) signature in a fullsky survey. The associated Planck data products have been widely used to study mass scaling relations. In particular, this paper is based on PSZ2 (Planck Collaboration XXVII 2016), the second Planck catalogue of SZ sources derived from data taken from the full 29month mission. This catalogue is built with the combined results from three cluster detection codes (MMF1, MMF3, and PwS), as described in detail in Planck Collaboration XXIX (2014), Planck Collaboration XXVII (2016).
The PSZ2 catalogue contains 1653 detections, and was partially validated at the time of publication using external Xray, optical, nearinfrared (NIR), and SZ data (Planck Collaboration XXVII 2016). This first validation process began with a crossmatch with PSZ1 (Planck Collaboration XXIX 2014). After that, the search for possible counterparts continued in Xrays with the MCXC catalogue (Piffaretti et al. 2011), which is based on the ROSAT All Sky Survey (RASS, Voges et al. 1999, 2000), and the serendipitous ROSAT and Einstein cluster catalogues. In the optical and NIR, validation was carried out using the Sloan Digital Sky Survey (SDSS, York et al. 2000), the redMaPPer catalogue (Rykoff et al. 2014), and the AllWISE midinfrared (MIR) source catalogue (Cutri et al. 2013). Finally, SZ information was also used, such as that provided by the catalogues obtained using the South Pole Telescope (SPT, Bleem et al. 2015), the Atacama Cosmology Telescope (ACT, Hasselfield et al. 2013), and by direct followup with the Arcminute Microkelvin Interferometer (AMI, Perrott et al. 2015).
This paper is the third (and last) in the series of publications associated with the observational program 128MULTIPLE16/15B (hereafter LP15), an optical followup campaign of all of the 190 unidentified (at the time of publication) PSZ2 sources in the northern sky with declinations above δ = −15° and no correspondence in the first Planck catalogue PSZ1. Papers I (Streblyanska et al. 2019) and II (AguadoBarahona et al. 2019) in this series presented the full program, the imaging results, and the full validation analysis of the LP15 sample, including the confirmation of new GCs and their corresponding redshifts. Here, in Paper III, we present all the spectroscopic observations of the program, including velocity dispersion and dynamical mass estimates in some cases. These LP15 observations are complemented here with the use of SDSS archival data, allowing us to significantly increase the number of PSZ2 clusters (259 new objects) with spectroscopic information in the northern sky. This work also makes use of the validation papers Planck Collaboration Int. XXXVI (2016), Barrena et al. (2018, 2020) associated with the study of the PSZ1 catalogue by means of the observational program ITP1308.
The mass of a GC is not directly measurable, but this problem can be circumvented using scaling relations based on different mass proxies (Pratt et al. 2019). Xray mass measurements are based on the assumption of hydrostatic equilibrium, and preferably use the product of gas mass and temperature (Y_{X} = kTM_{gas}) due to the low scatter of this quantity (e.g. Kravtsov et al. 2006). The SZ effect can also be used to estimate masses. The usual proxy in this case is the spherically integrated Comptonization parameter, Y_{SZ}, which is related to the integrated electron pressure along the line of sight. Optical and dynamical mass methods are based on the assumption of dynamical equilibrium where the galaxies are the main ingredients; they use the velocity dispersion as a mass proxy, via the virial theorem. This mass estimation is often biased because of violations of the hydrostatical or dynamical equilibrium, the temperature structure (Rasia et al. 2014), and selection and/or observational effects. To account for these deviations for all these methods, the mass bias parameter (1 − b) is introduced as M_{X} = (1 − b)_{X}M_{true} for X = Xray, SZ, and dynamical masses, respectively. One of the main aims of this work is to characterise this mass bias for the PSZ2 sample in the case of dynamical masses in order to obtain unbiased SZ masses which could be used for cosmological studies. Estimation of the mass bias has recently been attempted by a large number of groups in the community: Ruel et al. (2014), Sifón et al. (2016), and Amodeo et al. (2018) using SZ and dynamical masses; or von der Linden et al. (2014), Hoekstra et al. (2015), Smith et al. (2016), Battaglia et al. (2016), Sereno et al. (2017), PennaLima et al. (2017), Medezinski et al. (2018), and Miyatake et al. (2019) using SZ and weaklensing masses.
Estimation of the mass bias parameter is not a straightforward task. Although there are methods that give accurate estimations of the mass of an individual cluster, these usually come with large statistical or systematic errors (see e.g. Tremaine et al. 2002; Kelly 2007; De Martino & AtrioBarandela 2016; Sereno et al. 2017). This fact, combined with the existence of intrinsic scatter in the fitted relations, makes this estimation a complex task. Therefore, appropriate characterisation of the selected regression method used in each case is mandatory. In Appendix B we address this topic of linear regression with errors in both axes and intrinsic scatter, for the particular case of our sample.
This paper is structured as follows. Section 2 describes our reference sample and the corresponding data sets, including the final results of the LP15 program. Section 3 illustrates our methodology for the velocity dispersion estimates. In Sect. 4, we present our dynamical mass estimates, and compare them to the SZ masses. Section 5 shows the results for the characterisation of the scaling relation M_{SZ} − M_{dyn} in the PSZ2 north, and the results for the mass bias factor (1 − b). We present our conclusions in Sect. 6. Appendix A provides the results for the 362 galaxy clusters and groups in table format. Appendix B describes the simulations performed to validate the various regression methods. Throughout this paper, we adopt a ΛCDM cosmology with Ω_{m} = 0.3075, Ω_{Λ} = 0.691, and H_{0} = 67.74 km s^{−1} Mpc^{−1} (Planck Collaboration XIII 2016).
2. The reference sample
The PSZ2 catalogue (Planck Collaboration XXVII 2016) is the largest full sky sample of GCs detected via the SZ effect, and consists of 1653 detections. At the time of its publication, 1203 sources were confirmed as counterparts in other wavelengths. After a great effort by the community, 1425 objects in total have been validated to date (Burenin et al. 2018; Boada et al. 2019; Streblyanska et al. 2019; AguadoBarahona et al. 2019). From here, the PSZ2North subsample is defined as in Papers I and II (Streblyanska et al. 2019; AguadoBarahona et al. 2019), as those 1003 objects within the PSZ2 catalogue with Dec > −15°.
This article presents the velocity dispersion and dynamical mass for a sample of 388 objects (see Table 1). The majority of them (356) are the optical counterpart of a SZ source in the PSZ2North sample (we note that double detections are counted as two different clusters). Six clusters are the optical counterpart of a PSZ2 source not found in the PSZ2North sample. The remaining 26 objects were found during the process of analysis of the fields in which a SZ source is present, but were not associated with the SZ signal (see Sect. 2.5 for details). Each object in our sample comes from one particular data set. These data sets are described in the following sections
Summary of the data sets.
The possible presence of interlopers inside the cluster radial velocity catalogues might bias the velocity dispersion and mass estimates (Mamon et al. 2010). For this reason, we decided to analyse the clusters in two ways, trying to characterise the presence of this source of error. During the memberselection process, explained in detail in Sect. 3, we use two different apertures, namely 1 and 1.5 × r_{200}, to select the cluster members. The comparison of the mass bias in both samples gives no significant difference between them, and so we can safely assume that the number of interlopers within 1 and 1.5 × r_{200} is sufficiently small (compared to our statistical error) to not account for them. Table 1 includes the details about these subsamples.
Figure 1 shows the number of clusters inside the 1.5 × r_{200} subsample in comparison with the total number of objects in the PSZ2North sample as a function of the signaltonoise ratio (S/N) in the PSZ2 catalogue. Our sample covers the full range of S/N values, being approximately 30% of the total PSZ2North sample. This fact allows us to consider this sample as statistically representative to infer global properties of the full PSZ2North sample.
Fig. 1. PSZ2 cluster counts as a function of the S/N of the SZ detection The PSZ2North sample is represented in light blue, and the 1.5 × r_{200} sample is represented in beige. Dark blue bars represent the ratio between the 1.5 × r_{200} sample and the total number of clusters in the PSZ2North sample. The bin size is 0.5. 
2.1. LP15 data set
The 128MULTIPLE16/15B followup program LP15 was designed to observe all PSZ2North sources with no confirmed counterparts at the moment of publication of the catalogue. This original LP15 sample contains 190 objects (AguadoBarahona et al. 2019). The program had two main goals: to validate the SZ sources by finding their optical counterparts, and to use them for the calibration of the M_{SZ} − M_{dyn} scaling relation. The validation process was published in Streblyanska et al. (2019) and AguadoBarahona et al. (2019). In total, 184 sources were observed, with 81 of them confirmed as optical counterparts of the PSZ2 detections.
The LP15 program was performed during four consecutive semesters (2015B, 2016A, 2016B and 2017A). Due to technical telescope issues we were not able to complete this program in time, and so we were granted a further four observing nights during the semester 2018A in the frame of the program CAT18A12. All spectroscopic observations were obtained using the multiobject spectrographs Device Optimized for the LOw RESolution (DOLORES) at the Telescopio Nazionale Galileo (TNG) and Optical System for Imaging and lowIntermediateResolution Integrated Spectroscopy (OSIRIS) at the Gran Telescopio Canarias (GTC), both located at the Roque de los Muchachos Observatory (ORM) in La Palma (Spain). Details about the imaging and spectroscopic procedures can be found in Streblyanska et al. (2019) and AguadoBarahona et al. (2019).
In total, 94 sources were observed spectroscopically, 55 at the GTC and 39 at the TNG. We obtained good quality data to estimate the velocity dispersion for 82 clusters, which corresponds to a success rate of 87%. The mean (median) redshift of this data set is z_{spec} = 0.41 (0.39) and the mean (median) number of galaxy members for these clusters is N = 26 (22). All the spectroscopic results of these observations are presented here for the first time. Individual measurements for all cluster members (approximately 1400 redshifts in total) will be published online, and included in the VO.
2.2. ITP13 data set
In addition, we use part of the ITP13 sample described in Ferragamo et al. (2021). This sample consists of 61 PSZ1 clusters, of which 47 are also included in the PSZ2 catalogue. The observations of these objects were performed during four semesters in the framework of the International Time Program ITP13B08, a similar program to the LP15 but for the PSZ1 catalogue. We include these 47 objects in our analysis, finding a mean (median) redshift of z_{spec} = 0.37 (0.31) and a mean (median) number of galaxies members of the clusters N = 19 (17). Of those 47 objects, 43 are found in the PSZ2North sample. We complement the individual cluster member catalogues from the two data sets described above using spectroscopic data from the Sloan Digital Sky Survey (SDSS, York et al. 2000) Data Release (DR) 14, when available.
It is important to clarify here that there might be slight differences between the velocity dispersion estimates quoted in this paper and those from Ferragamo et al. (2021). Although the individual velocity catalogues used to estimate the velocity dispersion are the same, the methodology is slightly different, as described in detail in Sect. 3.
2.3. SDSS data
SDSS archival data give us a unique opportunity to enlarge our original sample. We retrieve every spectroscopic redshift within 15′ of the Planck nominal pointing for all the PSZ2 objects inside the SDSS footprint. For the cases with z_{spec} < 0.1, we expand this region to a 30′ radius to obtain as many cluster members as possible.
We identify 259 galaxy clusters following this procedure. In nine cases, the object found does not fulfill the criteria to be considered an optical counterpart of the corresponding SZ source. Those criteria are explained in detail in Streblyanska et al. (2019) and AguadoBarahona et al. (2019). The mean (median) redshift of this data set is z_{spec} = 0.22 (0.19), and the mean (median) number of galaxy members in the clusters is N = 43 (21).
2.4. Other PSZ2 clusters
Table A.1 also includes six GCs that do not belong to the PSZ2North sample but are found in PSZ2. These objects are PSZ2 G021.02−29.04, PSZ2 G027.77−49.72A, PSZ2 G027.77−49.72B, PSZ2 G171.08−80.38, PSZ2 G208.57−44.31, and PSZ2 G270.78+36.83. They were observed for a different project but inside the LP15 program, and so for this reason they are described here. As they do not form part of the PSZ2North sample, they are not considered for the characterisation of the scaling relation. These objects are listed as “Others in PSZ2” (Col. 3) in Table 1.
2.5. Beyond the PSZ2 sample
As outlined above, during this program we characterised 26 new clusters or groups that cannot be formally associated with the PSZ2 detection because they do not fulfill the matching criteria for being considered the optical counterpart. These objects are presented in Table A.2, which lists their velocity dispersion, dynamical mass, number of members, and redshift. As they are not associated with any SZ source, they cannot be used for characterisation of the scaling relation in Sect. 5. They are listed as “Beyond PSZ2” in Table 1.
3. Velocity dispersion estimates
Here we present the methodology and results for the estimation of the velocity dispersion for those 362 objects confirmed as the optical counterparts of SZ sources in the PSZ2 catalogue (columns two and three in Table 1). Table A.1 shows the results for these GCs and is organised as follows. Columns 1 and 2 are the official ID number and the Planck name in the PSZ2 catalogue. Columns 4 and 5 are the J2000 coordinates of the BCG when present; otherwise, the geometrical centre of the GC is provided. Columns 5 and 6 give the number of spectroscopic members retrieved. Columns 7–9 provide the mean spectroscopic redshift of the cluster and, when available, the BCGs. Columns 10 and 11 are our velocity dispersion estimates. Columns 12–14 present the dynamical and SZ mass estimates. Column 15 indicates whether the object was used in Sect. 4. Column 16 lists the data set from where the cluster was extracted (see Col. 1 in Table 1).
We also publish 26 clusters and groups found while studying the PSZ2 catalogue that are not associated with any SZ source due to either their large distance from the Planck pointing or to their low mass. These are presented in Table A.2, which is structured in a similar way to Table A.1. The difference is that these clusters and groups are not associated with any SZ source, and so instead of naming them with the Planck name, we simply quote the field around which they were found.
We follow the procedure outlined in Ferragamo et al. (2020) to estimate the velocity dispersion. The authors demonstrate (using hydrodynamical simulations) that the estimation of the velocity dispersion is biased in the lownumberofgalaxies regime, and present a functional form, depending on the number of galaxies used, to correct for this effect (see Eq. (11), Ferragamo et al. 2020). They also show that the aperture subsampling is a source of error, and provide a recipe to correct for this effect. Finally, the authors note that the appropriate value of the clipping in the lineofsight velocity field to minimise the presence of interlopers is 2.7. We adopt this value in our analysis.
We obtain the velocity dispersion in two steps. We make a first estimate using an iterative σclipping method and then we apply the corrections to the estimator. For the iterative σclipping method, we use a clip of 2.7σ and a cut in aperture of 1 or 1.5 × r_{200} which is included inside the clipping. Once we have obtained this first estimate, we apply the corrections due to the used estimator and the aperture. In this paper, we choose the gapper estimator (Wainer & Thissen 1976), as it is the one with the least dependence on the number of galaxies (Ferragamo et al. 2020). Figure 2 shows an example of the final velocity histogram of the cluster members for a particular case in our sample. Figure 3 shows the stacked distribution of all the galaxies in the phase space, for all clusters.
Fig. 2. Example of the distribution of galaxies in PSZ2 G009.04+31.09 as a function of the rest frame difference in radial velocity from the mean radial velocity of the cluster. The cluster members used to estimate the velocity dispersion are shown in blue. The red line represents the normal distribution expected for the estimated velocity dispersion of σ_{200} = 1068 km s^{−1}. 
Fig. 3. Projected phase space and velocities histogram distribution for all of the 11 867 galaxy members in our sample. Member velocities are normalised to the mean cluster velocity dispersion, whereas the distance to the centre of the cluster is normalised to the value of r_{200} in each cluster. Horizontal black dashed lines are the 2.7σ clip. The red line represents a Gaussian (normal) fit to the velocity histogram with σ = 1. 
As mentioned in Sect. 2 and above, we use two different apertures when selecting the cluster members. The main reason for this is to evaluate the possible bias introduced by the presence of interlopers inside the individual cluster catalogues. The interlopers are an important cause of uncertainty when estimating the velocity dispersion of a cluster as shown by many authors in the literature (see e.g. Mamon et al. 2010; Saro et al. 2013; Wojtak et al. 2018; Pratt et al. 2019). For each cluster, we present the values for apertures of 1.5 × r_{200} and r_{200}. We note that when restricting the aperture limit to r_{200}, we find 36 fewer GCs because of the drop in the number of members, as the minimum number of members that we consider to estimate the velocity dispersion is seven.
Among those 362 presented counterparts, 5 are what we call a ‘multiple detection’. This means that there is more than one cluster associated with the SZ signal. In addition, there are 16 objects that are clearly substructured, and so their velocity dispersion estimates should not be trusted, as they probably overestimate the true underlying velocity. We do not use these objects when characterising the M_{SZ} − M_{dyn} scaling relation.
Unfortunately, not all of the clusters we found are associated with the SZ emission. There might be lowmass systems or objects too distant from the SZ peak to be considered the counterpart. These clusters are not used for the calibration of the scaling relation. We also mark these objects in Table A.2.
4. Mass estimates
In the following subsections, we describe our methodology to obtain the dynamical and SZ masses. These masses are used in Sect. 5 to characterise their scaling relation and to obtain the bias parameter (1 − b) which is of enormous importance for cosmological studies.
The mass estimates are presented in Cols. 12–14 of Table A.1. Column 15 indicates whether an object is used in Sect. 5 to characterise the scaling relation. Columns 5 and 6 in Table 1 show the total number of GCs in each data set used for the estimation of the mass bias parameter.
4.1. M_{dyn} estimates
Estimating the dynamical mass of a cluster is not a simple task. As shown in Old et al. (2014), estimation of the mass using a low number of cluster members is problematic. For this reason, we use the method described in Ferragamo et al. (2020), where the authors study the behaviour of several velocity dispersion and dynamical mass estimators using hydrodynamical simulations in the lownumberofgalaxies regime. The authors demonstrate that estimation of the velocity dispersion is biased in this regime, and propose a functional form to correct for this that depends on the number of galaxies used (Eqs. (15) and (16), Ferragamo et al. 2020).
The scaling relation used for the estimation of the dynamical mass is Eq. (1) from Munari et al. (2013):
where A = 1177.0 km s^{−1} and α = 0.364. We note that these parameters were obtained for a velocity dispersion calculated using the biweight estimator (Beers et al. 1990). Thus, for consistency, we convert our velocity dispersion estimates to that of the biweight following the recipe in Ferragamo et al. (2020). After applying the corrections to due to the number of cluster members, we convert this mass into , so that we can compare it to . This last step is performed using the Python package NFW^{2} which implements the Navarro, Frenck, and White (Navarro et al. 1997) halo profile using the concentration parameter from Duffy et al. (2008):
The uncertainties on are based on the expected variance of our estimator as a function of the number of galaxies, as shown in Fig. 8 in Ferragamo et al. (2020). Those curves can be fitted to an equation of the type:
Finally, to obtain the uncertainity in from , we use a quadratic propagation of the error, but the uncertainty in the concentration parameter is not propagated.
4.2. M_{SZ} estimates
The Planck Collaboration provides, for every SZ source in the PSZ2 catalogue, an array of masses as a function of redshift, . These values were obtained by breaking the size–flux degeneracy of the Planck measurements using a prior relating the SZ flux (Y_{500}) and the cluster size (θ_{500}). In turn, this cluster size is connected to the total mass for a given redshift z. For each cluster in our sample, we interpolate these arrays to our measured redshift, and extract their SZ mass. Further details about the procedure to obtain can be found in Planck Collaboration XX (2014) and Planck Collaboration XXVII (2016).
These SZ masses suffer from Eddington bias (see e.g. van der Burg et al. 2016), especially in the lowS/N regime. Figure 5 in van der Burg et al. (2016) shows the magnitude of Eddington bias as a function of S/N in the PSZ2 catalogue for different redshifts. These latter authors estimate this Eddington bias by simulating a list of masses and redshifts following the Tinker et al. (2008) halo mass function and the redshiftdependent comoving volume element for their assumed cosmology. We use that figure to create a hypersurface and apply a 3D interpolation technique in order to correct our SZ masses for this effect and obtain the final . We note that this treatment is an approximation, as the correction for each cluster is purely statistical (see e.g. Appendix A in Mantz et al. 2010, for an illustration of the effect of this type of statistical bias). For this reason, our individual corrected masses should be seen as an approximation. Nevertheless, the overall mass bias for the full sample should be correctly estimated.
5. M_{SZ} − M_{dyn} scaling relation
In this section we present and discuss the scaling relation between SZ and dynamical masses for a statistically representative sample of the PSZ2 catalogue. Starting from the complete list of clusters presented in Sect. 2, we use two additional criteria to remove objects from the list. We exclude: (i) GCs that are clearly substructured, as the estimation of the dynamical mass is probably overestimated in this case, and (ii) those presenting multiple counterparts, because, using Planck data alone and due to the beam size, it is not possible to disentangle the individual contribution of each cluster to the total SZ flux.
After applying these two exclusion criteria, our final sample adopted for the computation of the scaling relation contains 297 PSZ2 clusters, all of them with members selected within 1.5 × r_{200}. Column 5 in Table 1 shows the distribution of those objects in the three data sets considered in Sect. 2. Figure 4 presents the scaling relation obtained for those 297 objects. From here, our main goal is to find the socalled mass bias factor (1 − b), which accounts for any difference between the true mass and the SZ mass proxies (, ) used to establish the scaling relations. We define this bias as
Fig. 4. Scaling relation for the sample of 297 PSZ2 clusters (1.5 × r_{200} sample). The dashed black line shows the 1 : 1 line. The orange line represents our best fit using the Nukers method with α = 1 (see text for details). The green line is the fit using the complete Nukers method for a free slope. The shaded regions represent the 1 and 2σ errors of the reconstructed parameters. The vertical red dotted line corresponds to the pivot mass of 6 × 10^{14} M_{⊙}. 
As explained in Sect. 3, we are not able to correct for all the physical effects potentially causing a bias when estimating the true velocity dispersion of the clusters. This leads to a bias between the true mass and the dynamical mass estimates. The main source of error is possibly the interlopers inside our sample. For this reason, we define the dynamical mass bias factor (1 − b_{dyn}) as
Combining Eqs. (4) and (5), we obtain
where
We study this last bias (1 − B) in our scaling relation. In principle, we would expect this bias (1 − B) to represent a lower bound to (1 − b), the reason being that the presence of interlopers generally produces an overestimation of the velocity dispersion, and thus of the dynamical mass (e.g. Ferragamo et al. 2020). However, and to estimate the real impact of interlopers in our sample, we repeat the same analysis with a smaller subsample of 261 clusters obtained by reducing the aperture to r_{200} when selecting the cluster members. As shown below, we find consistent results in this case, suggesting that the impact of interlopers for our particular sample is minimal (or at least smaller than our statistical error), as anticipated in Ferragamo et al. (2020). Column 6 in Table 1 shows the distribution of objects through the data sets of this smaller subsample.
5.1. Regression method
Here, we characterise various regression methods with realistic simulations matching the statistical properties of our sample. These simulations follow the very same procedure as the real data, using the same number of galaxies for each cluster to estimate the velocity dispersion and the dynamical mass uncertainties. These simulations are detailed in Appendix B. We explore two possibilities. First, we consider the simplest case of fitting for a global bias. However, as there are hints that suggest a possible mass dependence of the mass bias, we also explore a fit to a power law in mass to account for this dependence, using 6 × 10^{14} M_{⊙} as the pivot scale in order to be able to consistently compare our results with other works in the literature (e.g. Planck Collaboration XXIV 2016). The parametric form of the fitting function in this second case is given by:
To be clear, when fitting this power law, the result of the mass bias is estimated at this given mass of 6 × 10^{14} M_{⊙}.
Due to the large uncertainties in the dynamical mass estimates for our sample, all the methods that we have tested do not behave well and give completely biased outputs. For this reason, we do the linear regression in the logarithmic space, fitting for the relation
where α and ln(1 − B) are the slope and the intercept, respectively. It is important to note that in our limit of large dynamical mass errors, these are also considerably greater than the expected intrinsic scatter of the relation σ_{ln M} = 0.096 (Planck Collaboration XX 2014).
We study the dependence of the mass bias on mass by fitting the slope in Eq. (9). To do so, we tested five different regression methods that are usually applied in the literature: orthogonal distance regression (ODR), Nukers (Tremaine et al. 2002), maximum likelihood estimator with uniform prior (MLEU), bivariate correlated errors and intrinsic scatter (BCES, Akritas & Bershady 1996), and the complete maximum likelihood estimator (CMLE, Kelly 2007). All of them are described in detail in Appendix B, and applied to simulated data based on the noise distributions and range of masses in our real sample. Our results show that all methods are biased, although some of them are more robust and less affected by errors. In general, those methods that take into account the intrinsic scatter in an explicit way (BCES, MLEU and CMLE) fail in the recovery of this parameter, probably due to the fact that the error measurements are of the order of or larger than the intrinsic scatter. For our sample, both the ODR and the Nukers method appear to be more robust, showing a small bias of approximately 7% when recovering the intercept, with the slope being well recovered. Finally, we also tested the case of no mass dependence in the scaling relation, i.e. fixing the slope to one. We tested three methods in this limit: ODR, Nukers, and MLEU. The results are similar to those from the complete regression. The ODR and Nukers methods are biased by approximately 7% in the intercept, while the MLEU fails to recover both input values. This bias in ODR and Nukers of 7% in the intercept is robust, as is independent of the input value adopted for the mass bias in the simulations (we tested values from 0.6 to 1.2).
For completeness, we also studied other cases to understand the range of applicability of the different methods, always keeping the same number of clusters as in our sample. In particular, we considered the case of significantly decreasing the errors in the dynamical mass, and also a case with no intrinsic scatter in the simulated data. If the statistical errors in the dynamical mass measurements are significantly decreased (by a factor of ten), then all methods are able to recover both parameters, regardless of whether there is intrinsic scatter in the simulation. These tests strongly suggest that the main source of bias of the methods is the large errors of our data.
The two methods that best recover the parameters in our case are Nukers and ODR, both when fixing and not fixing the slope. However, as shown in Appendix B, both methods present a small bias in the intercept that has to be corrected. In this paper, we use Nukers as the reference method. It gives practically the same results as the ODR, while it also gives an estimation of the intrinsic scatter. Our simulations also show that this method provides consistent results whether fixing the slope or not. When quoting our final values, we correct our Nukers results accounting for the bias in the intercept (7%), and add a systematic uncertainty due to this bias.
5.2. Results
Table 2 shows the results for the fitting of Eq. (9) using the Nukers method described above, and for both samples. The statistical errors quoted in this table are computed with a bootstrapping technique.
Results for the mass bias using both subsamples 1 and 1.5 × r_{200}.
The value of the slope is (in both samples) slightly more than 1σ away from one, which might be indicative of a possible dependence of the (1 − B) on mass, but the results are not significant enough to make that claim. For comparison, von der Linden et al. (2014) and Hoekstra et al. (2015) find a slope of around 0.7, which goes in the direction of inverse dependence of the mass bias on mass. These latter authors obtained their results using the CMLE method (Kelly 2007). We note that in our particular case, the simulations show that this method presents a significant bias of approximately 20% (see Appendix B). A direct evaluation of the slope using CMLE gives α = 0.70 ± 0.06 for our sample, but after the bias correction, this number moves up to α = 0.88 ± 0.07, which is less than 2σ from unity.
We find no significant difference in the (1 − B) mass bias when considering different samples, suggesting that the effect of the interlopers in the region 1 − 1.5 × r_{200} is smaller than the quoted statistical error, as expected. For this reason, we restrict the following analysis to the case of the full sample (1.5 × r_{200}). The results for the mass bias using this sample are shown in Fig. 4.
We investigate the robustness of the results when selecting only those clusters with smaller error bars in the determination of the dynamical mass. For this, we repeated our analysis using different selections according to the number of cluster members used to determine the velocity dispersion (parameter N_{1.5} in Table A.1). Table 3 shows the results when restricting our sample to those clusters with N_{1.5} > 15, 20, 30, and 50. The values of the mass bias and the slope are consistent with each other in all cases. However, we note that there is a marginal trend (smaller than 1σ) towards lower values of the bias for the subsamples with more cluster members. We can understand this trend by noting that those clusters with more members are, on average, less massive as they are mostly lowredshift systems. As there is a marginal detection of a slope α > 1, we would expect lowmass clusters to present a lower (1 − B).
Results for the mass bias using the sample (1.5 × r_{200}), when restricting the analysis to those clusters with total number of spectroscopic members N in a certain range or interval.
5.2.1. Eddington bias
The effect of the Eddington bias correction is shown in Table 4. Clusters are distributed in five bins, keeping the same number of objects per bin. We restrict the analysis here to the case of a fixed slope (α = 1) as the size of the errors likely prevents us from deriving any constraint on the slope if left free. As expected, the correction applied reduces the mass bias to between 10% and less than 1%, depending on the S/N of the Planck SZ detection. We note that the central bin (5.57 < S/N ≤ 6.35) does not follow the trend of the others, but is still less than 2σ away from the mean value for the full sample.
Nukers (1 − B) estimates before and after the Eddington bias correction in S/N bins for the sample with an aperture cut of 1.5 × r_{200}.
5.2.2. Redshift dependence
Table 5 presents the results for the mass bias in five different redshift bins with the same number of objects. As in the previous study, there are not enough clusters in each bin to make the regression with a freely varying slope, and so we restrict the analysis again to the case of fixed (α = 1) slope. We find that three of the five bins show consistent results with the mean bias for the full sample. However, there are two bins (0.107 ≤ z < 0.200 and z ≥ 0.379) that are inconsistent with the mean bias at the level of approximately 2.7σ. We refer to them as the second and fifth bins.
Nukers (1 − B) estimates for different bins in redshift for the sample with an aperture cut of 1.5 × r_{200}.
We carried out several tests to explain the origin of these two outliers, but none of the analyses are conclusive. First, we explored whether this difference could be ascribed to a significantly different mean mass in the bin. In principle, the lowredshift bins could span a larger range of masses due to the survey selection function (see Fig. 26 in Planck Collaboration XXVII 2016). The median mass values that we find for those five bins are 3.6, 4.3, 7.1, 6.7, and 5.9 × 10^{14} M_{⊙}, respectively. These values do not show any specific trend that could explain the two outliers. This is also the case for the mean number of galaxies (90, 27, 19, 18 and 22) and the mean S/N (8.8, 7.5, 7.7, 6.1 and 5.9) in each of the five bins. We also divided the two anomalous bins into two new subbins in redshift, S/N, and number of galaxies. There is no appreciable difference in any subbin for the case of 0.107 ≤ z < 0.200. However, when we do the same for the fifth bin (z ≥ 0.379), it seems that the outlier here might be due to the high redshift clusters. This is in agreement with the result shown above indicating a small dependence of the mass bin on mass.
Therefore, we cannot find a simple explanation for the outlier in the second redshift bin (0.107 ≤ z < 0.200), and if confirmed with better statistics, this could be ascribed to a real physical effect. For comparison, we note that Ferragamo et al. (2021) also find a similar outlier in the same redshift bin when using PSZ1 clusters only.
6. Summary and conclusions
This is the third and last paper in a series describing the observational program LP15. Here, we present the spectroscopic data of the full program. In total, 94 PSZ2 sources were observed, 55 at the GTC and 39 at the TNG. We were able to estimate the velocity dispersion for 82 clusters. In addition, we used 47 clusters from the ITP sample and 259 clusters from the SDSS archival data to build a statistically representative sample of the PSZ2 in the northern hemisphere (PSZ2North).
We present the velocity dispersion and dynamical mass of 362 objects confirmed as an optical counterpart of a PSZ2 source, 356 from the PSZ2North sample and 9 from outside. We also discuss 26 clusters and groups that do not fulfill the matching criteria to be a counterpart of the SZ signal.
The combination of LP15, ITP, and SDSS samples yields a total sample of 297 galaxy clusters that can be used for the characterisation of the scaling relation M_{SZ} − M_{dyn} for the PSZ2 catalogue. This sample represents the largest set of SZselected clusters for which SZ and dynamical masses are available, and is the largest sample used to determine the mass bias using dynamical mass estimates.
Based on a set of realistic simulations which are representative of the actual noise level in our sample, we selected the Nukers method as the least biased regression method with which to extract the scaling relation. After correcting for the statistical bias of the regression method and the Eddington bias of the sample, we find the mass bias to be (1 − B) = 0.80 ± 0.04 (stat) ± 0.05 (sys). Assuming (1 − b_{dyn}) = 1, our value for (1 − b) is in agreement with the findings of previous studies (Ruel et al. 2014; Hoekstra et al. 2015; Battaglia et al. 2016; Sereno et al. 2017; PennaLima et al. 2017; Medezinski et al. 2018; Miyatake et al. 2019; Ferragamo et al. 2021), and do not solve the tension in the cosmological parameters (Ω_{m} − σ_{8} plane) between the CMB measurements and the cluster count analyses (Planck Collaboration XXIV 2016; Salvati et al. 2018; Planck Collaboration VI 2020; Remazeilles et al. 2019), which requires lower values for (1 − b).
We note that Ferragamo et al. (2021) present a similar analysis to the one carried out in this paper, but for 207 PSZ1 clusters. These latter authors find a mass bias of (1 − B) = 0.83 ± 0.07 ± 0.02, which is in complete agreement with our result. In that paper, there is a detailed comparison with similar works in the literature, including the one presented here. Finally, we only find marginal evidence of a possible dependence of the mass bias on mass. Our fitted slope is α = 1.17 ± 0.13 which is 1.3σ away from the massinvariant relation.
Planck (http://www.esa.int/Planck) is a project of the ESA with instruments provided by two scientific consortia funded by ESA member states and led by Principal Investigators from France and Italy, telescope reflectors provided through a collaboration between ESA and a scientific consortium led and funded by Denmark, and additional contributions from NASA (USA).
Acknowledgments
This article is based on observations made with the Gran Telescopio Canarias operated by the Instituto de Astrofisica de Canarias, the Isaac Newton Telescope, and the William Herschel Telescope operated by the Isaac Newton Group of Telescopes, and the Italian Telescopio Nazionale Galileo operated by the Fundacion Galileo Galilei of the INAF (Istituto Nazionale di Astrofisica). All these facilities are located at the Spanish Roque de los Muchachos Observatory of the Instituto de Astrofisica de Canarias on the island of La Palma. This research has been carried out with telescope time awarded for the program 128MULTIPLE16/15B. During our analysis, we also used the following databases: the SZCluster Database operated by the Integrated Data and Operation Center (IDOC) at the IAS under contract with CNES and CNRS and the Sloan Digital Sky Survey (SDSS) DR14 database. Funding for the SDSS has been provided by the Alfred P. Sloan Foundation, the Participating Institutions, the National Aeronautics and Space Administration, the National Science Foundation, the US Department of Energy, the Japanese Monbukagakusho, and the Max Planck Society. This work has been partially funded by the Spanish Ministry of Science and Innovation under the projects ESP201348362C21P, AYA201460438P, AYA201784185P and PID2020120514GBI00. A.S. and R.B. acknowledge financial support from the Spanish Ministry of Science and Innovation under the Severo Ochoa Programs SEV20110187 and SEV20150548.
References
 AguadoBarahona, A., Barrena, R., Streblyanska, A., et al. 2019, A&A, 631, A148 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Akritas, M. G., & Bershady, M. A. 1996, ApJ, 470, 706 [Google Scholar]
 Allen, S. W., Evrard, A. E., & Mantz, A. B. 2011, ARA&A, 49, 409 [Google Scholar]
 Amodeo, S., Mei, S., Stanford, S. A., et al. 2018, ApJ, 853, 36 [NASA ADS] [CrossRef] [Google Scholar]
 Barrena, R., Streblyanska, A., Ferragamo, A., et al. 2018, A&A, 616, A42 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Barrena, R., Ferragamo, A., RubiñoMartín, J. A., et al. 2020, A&A, 638, A146 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Battaglia, N., Leauthaud, A., Miyatake, H., et al. 2016, JCAP, 2016, 013 [NASA ADS] [CrossRef] [Google Scholar]
 Beers, T. C., Flynn, K., & Gebhardt, K. 1990, AJ, 100, 32 [Google Scholar]
 Bleem, L. E., Stalder, B., de Haan, T., et al. 2015, ApJS, 216, 27 [Google Scholar]
 Boada, S., Hughes, J. P., Menanteau, F., et al. 2019, ApJ, 871, 188 [NASA ADS] [CrossRef] [Google Scholar]
 Boggs, P. T., & Rogers, J. E. 1990, in Statistical Analysis of Measurement Error Models and Applications: Proceedings of the AMSIMSSIAM Joint Summer Research Conference held June 10–16, 1989, eds. P. Brown, & W. A. Fuller, Contemp. Math., 112, 186 [Google Scholar]
 Burenin, R. A., Bikmaev, I. F., Khamitov, I. M., et al. 2018, Astron. Lett., 44, 297 [NASA ADS] [CrossRef] [Google Scholar]
 Cutri, R. M., Wright, E. L., Conrow, T., et al. 2013, Explanatory Supplement to the AllWISE Data Release Products, Tech. Rep. [Google Scholar]
 De Martino, I., & AtrioBarandela, F. 2016, MNRAS, 461, 3222 [Google Scholar]
 Duffy, A. R., Schaye, J., Kay, S. T., & Dalla Vecchia, C. 2008, MNRAS, 390, L64 [Google Scholar]
 Ferragamo, A., RubiñoMartín, J. A., BetancortRijo, J., et al. 2020, A&A, 641, A41 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Ferragamo, A., Barrena, R., RubiñoMartín, J. A., et al. 2021, A&A, 655, A115 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Hasselfield, M., Hilton, M., Marriage, T. A., et al. 2013, JCAP, 7, 008 [CrossRef] [Google Scholar]
 Hoekstra, H., Herbonnet, R., Muzzin, A., et al. 2015, MNRAS, 449, 685 [NASA ADS] [CrossRef] [Google Scholar]
 Kelly, B. C. 2007, ApJ, 665, 1489 [Google Scholar]
 Kravtsov, A. V., Vikhlinin, A., & Nagai, D. 2006, ApJ, 650, 128 [Google Scholar]
 Mamon, G. A., Biviano, A., & Murante, G. 2010, A&A, 520, A30 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Mantz, A., Allen, S. W., Ebeling, H., Rapetti, D., & DrlicaWagner, A. 2010, MNRAS, 406, 1773 [NASA ADS] [Google Scholar]
 Medezinski, E., Battaglia, N., Umetsu, K., et al. 2018, PASJ, 70, S28 [NASA ADS] [Google Scholar]
 Miyatake, H., Battaglia, N., Hilton, M., et al. 2019, ApJ, 875, 63 [Google Scholar]
 Munari, E., Biviano, A., Borgani, S., Murante, G., & Fabjan, D. 2013, MNRAS, 430, 2638 [Google Scholar]
 Navarro, J. F., Frenk, C. S., & White, S. D. M. 1997, ApJ, 490, 493 [Google Scholar]
 Old, L., Skibba, R. A., Pearce, F. R., et al. 2014, MNRAS, 441, 1513 [Google Scholar]
 Peebles, P. J. E. 1980, The Largescale Structure of the Universe (Princeton: Princeton University Press) [Google Scholar]
 PennaLima, M., Bartlett, J. G., Rozo, E., et al. 2017, A&A, 604, A89 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Perrott, Y. C., Olamaie, M., Rumsey, C., et al. 2015, A&A, 580, A95 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Piffaretti, R., Arnaud, M., Pratt, G. W., Pointecouteau, E., & Melin, J.B. 2011, A&A, 534, A109 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Planck Collaboration I. 2014, A&A, 571, A1 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Planck Collaboration XX. 2014, A&A, 571, A20 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Planck Collaboration XXIX. 2014, A&A, 571, A29 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Planck Collaboration XIII. 2016, A&A, 594, A13 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Planck Collaboration XXIV. 2016, A&A, 594, A24 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Planck Collaboration XXVII. 2016, A&A, 594, A27 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Planck Collaboration VI. 2020, A&A, 641, A6 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Planck Collaboration Int. XXXVI. 2016, A&A, 586, A139 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Pratt, G. W., Arnaud, M., Biviano, A., et al. 2019, Space Sci. Rev., 215, 25 [Google Scholar]
 Rasia, E., Lau, E. T., Borgani, S., et al. 2014, ApJ, 791, 96 [NASA ADS] [CrossRef] [Google Scholar]
 Remazeilles, M., Bolliet, B., Rotti, A., & Chluba, J. 2019, MNRAS, 483, 3459 [Google Scholar]
 Ruel, J., Bazin, G., Bayliss, M., et al. 2014, ApJ, 792, 45 [NASA ADS] [CrossRef] [Google Scholar]
 Rykoff, E. S., Rozo, E., Busha, M. T., et al. 2014, ApJ, 785, 104 [Google Scholar]
 Salvati, L., Douspis, M., & Aghanim, N. 2018, A&A, 614, A13 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Saro, A., Mohr, J. J., Bazin, G., & Dolag, K. 2013, ApJ, 772, 47 [NASA ADS] [CrossRef] [Google Scholar]
 Sereno, M., Covone, G., Izzo, L., et al. 2017, MNRAS, 472, 1946 [Google Scholar]
 Sifón, C., Battaglia, N., Hasselfield, M., et al. 2016, MNRAS, 461, 248 [CrossRef] [Google Scholar]
 Smith, G. P., Mazzotta, P., Okabe, N., et al. 2016, MNRAS, 456, L74 [Google Scholar]
 Streblyanska, A., AguadoBarahona, A., Ferragamo, A., et al. 2019, A&A, 628, A13 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Sunyaev, R. A., & Zeldovich, Y. B. 1972, Comments Astrophys. Space Phys., 4, 173 [NASA ADS] [EDP Sciences] [Google Scholar]
 Tinker, J., Kravtsov, A. V., Klypin, A., et al. 2008, ApJ, 688, 709 [Google Scholar]
 Tremaine, S., Gebhardt, K., Bender, R., et al. 2002, ApJ, 574, 740 [NASA ADS] [CrossRef] [Google Scholar]
 van der Burg, R. F. J., Aussel, H., Pratt, G. W., et al. 2016, A&A, 587, A23 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Voges, W., Aschenbach, B., Boller, T., et al. 1999, VizieR Online Data Catalog: IX/010 [Google Scholar]
 Voges, W., Aschenbach, B., Boller, T., et al. 2000, VizieR Online Data Catalog: IX/029 [Google Scholar]
 von der Linden, A., Mantz, A., Allen, S. W., et al. 2014, MNRAS, 443, 1973 [NASA ADS] [CrossRef] [Google Scholar]
 Wainer, H., & Thissen, D. 1976, Psychometrika, 41, 9 [CrossRef] [Google Scholar]
 Wojtak, R., Old, L., Mamon, G. A., et al. 2018, MNRAS, 481, 324 [NASA ADS] [CrossRef] [Google Scholar]
 York, D. G., Adelman, J., Anderson, J. E., Jr., et al. 2000, AJ, 120, 1579 [NASA ADS] [CrossRef] [Google Scholar]
Appendix A: Tables with the main results of the paper
362 PSZ2 optical counterparts presented in this paper.
26 clusters and groups beyond PSZ2.
Appendix B: Testing the regression methods
Here we test and validate five different regression methods that account for uncertainties in both axes, for the particular case of the sample discussed in this paper. This study is essential to verify the range of applicability of the methods, and to characterise the existence of statistical biases. Noise levels in the data and the intrinsic scatter of the underlying relation play an important role in the recovery of the bestfit estimates. To test these five methods, we performed simulations tailored to mimic the same statistical properties as in our parent sample. We show that for the noise levels of our reference sample, all five methods present a bias in some of the recovered parameters. However, all of them are unbiased in the limit of high S/N (small uncertainties).
B.1. Regression methods
We consider the problem of carrying out a linear fit of two variables with errors in both axes and including intrinsic scatter. We use the following notation. The two variables are given by x_{i} and y_{i}. Each one of those has measured errors described by Gaussian statistics, with variance σ_{x, i} and σ_{y, i}, respectively. The two variables are tracing underlying quantities ξ_{i} and η_{i}, in such a way that
By definition, ⟨ϵ_{x, i}⟩=⟨ϵ_{y, i}⟩ = 0, and . The underlying model that we want to fit for is:
with parameters m (slope) and n (intercept). The intrinsic scatter, σ_{int}, is represented by ⟨ϵ_{i}⟩ = 0, and . Here we consider the following regression methods:

i)
The orthogonal distance regression (ODR) method, which uses a modified trustregion LevenbergMarquardttype algorithm (Boggs & Rogers 1990) to estimate the function parameters. It is implemented in the Python scipy.odr package.

ii)
Nukers (Tremaine et al. 2002) method, which is based on the minimisation of the χ^{2} function

iii)
Maximum likelihood estimator with uniform prior (MLEU). In this case, the full posterior distribution, assuming Gaussian statistics and flat priors for the three parameters (m, n, σ_{int}), is given by

iv)
Bivariate correlated errors and intrinsic scatter (BCES, Akritas & Bershady 1996), which is a Bayesian method commonly used by the galaxy cluster community. We use here the Python implementation from astropy.stats which corresponds to the orthogonal distances method.

v)
Complete maximum likelihood estimation (CMLE) with correct priors (Kelly 2007). Here we use the implementation of this method from Josh Meyers^{3}.
The BCES, MLEU, and CMLE methods consider the intrinsic scatter (σ_{int}) explicitly in their calculations, and indeed both MLEU and CMLE provide an estimation of its value. The ODR and Nukers methods do not explicitly take into account the intrinsic scatter. However, Tremaine et al. (2002) showed how to obtain an estimation of the σ_{int} for the Nukers method. Once the bestfit model has been obtained, we evaluate the reduced χ^{2} in equation B.3. If this value is smaller than one, then the intrinsic scatter is taken to be zero. Otherwise, the intrinsic scatter is calculated by replacing with in the denominator of equation B.3 and balancing the righthand side term until the reduced χ^{2} is equal to one.
B.2. Simulations
The five methods described in the previous section are tested here in their complete forms, fitting simultaneously for the slope, intercept, and the intrinsic scatter (if included in the method). In addition, the ODR, Nukers, and MLEU are also tested in the particular case of fixing the slope to one, which in our case means that the mass bias is not dependent on the mass.
To test these methods, we carry out a set of realistic simulations, mimicking the sample size (297 objects) and noise conditions that we have in our cluster sample. We run three sets of simulations. In the first two sets, we use the same GCs for every iteration while in the last one we generate a set of 297 synthetic clusters for each iteration. In more detail:

Set 1. We use the 297 real clusters from Table A.1. We assume the estimated SZ masses as the true masses M_{true}, and we fix the estimated redshift z and the number of cluster members N_{gal} to the real ones.

Set 2. Fitting the properties of the real parent sample from Table A.1, we obtain a realistic distribution of dynamical masses, z and N_{gal}. We use these distributions to generate the true simulated mass M_{true}, z and N_{gal} of a set of 297 synthetic clusters.

Set 3. We build a set of 297 synthetic clusters in the same way as in 2, but for every iteration.
We use the procedure explained below to obtain the measured SZ and dynamical masses.
Using the Munari et al. (2013) relation
we obtain the true velocity dispersion σ_{200}. The next step is to simulate the measured velocity dispersion which is our observable. Here, for each cluster in every realisation we create a set of N_{gal} galaxies which are normally distributed around the σ_{200} and we estimate the measured velocity dispersion σ_{measured} using these galaxies (only for 2 and 3, the N_{gal} for 1 is fixed). Now, we apply the same procedure as for the real data. We convert the σ_{measured} into measured dynamical mass M_{dyn} using Eq. 1. Then, we correct this mass using the corrections from Ferragamo et al. (2020) due to the low number of members. The measured uncertainties are directly calculated from Eq. C.1 in Ferragamo et al. (2020). We do not include the intrinsic scatter of the relation B.5 because it is less than 5% and the uncertainties in the real data masses are not less than 10% and up to 80% with an average of 40%.
On the other hand, we simulate the SZ masses M_{SZ} by calculating the observable using the inverse procedure to that used in the Planck papers. To obtain we introduce the true simulated masses into the equation:
where is the angulardiameter distance to redshift z and E^{2}(z) = Ω_{m}(1 + z)^{3} + Ω_{Λ}. The coefficients Y_{*}, α, and β are given in Table 1 in Planck Collaboration XXIV (2016). Once we have obtained , we include the intrinsic scatter of this relation by adopting a lognormal distribution for the observed Y_{SZ} around its mean value with σ_{logY} = 0.075 ± 0.01 (see Planck Collaboration XX 2014). Finally, we insert the observed Y_{SZ} into eq. B.6 to obtain the measured M_{SZ} using the mass bias (1 − b) = 0.80 and we apply a Gaussian random noise based on the real data measured uncertainties.
The theoretically predicted intrinsic scatter in the M_{dyn} − M_{SZ} (Eq. 9) is σ_{lnM} = 0.096. It is calculated by propagating the intrinsic scatter σ_{logY} from the Y − M relation (Eq. B.6) into eq. 9. There are other sources of intrinsic scatter, such as the scatter in eq. B.5, but in this simulation we only consider σ_{logY} as it is the largest.
In our particular case, we assume no dynamical mass bias (1 − b_{dyn}) and so what we are recovering is the SZ bias which is the same as the bias between the SZ and dynamical masses (1 − B) = (1 − b).
B.3. Results
Table B.1 shows the results of the regression tests performed over the simulations described in Appendix B.2. The first column names the regression method. The second column presents the median value of the mass bias (1 − b). The third column shows the median value for the slope m. The last column presents the natural logarithm of the intrinsic scatter when available.
Results for the recovered parameters. Input values are (1 − b) = e^{n} = 0.8, slope (m = 1), σ_{lnM} = 0.096.
First, we discuss the results for the case of no mass dependence in the mass bias (m = 1). There is no particular method that recovers the (1 − b) = 0.8. All three tested methods are biased, regardless of the initial settings of the simulations. As shown in Figure B.1 the ODR and the Nukers are biased upwards by 7% while the MLEU is biased by 9% in the same direction. We consider this effect as a true bias as the standard deviations in the three methods are not greater than 3%; see Figure B.1 and Table B.1. The MLEU estimates the intrinsic scatter of the relation as σ_{lnM} = 0.24 ± 0.04 which is more than 3σ away from the predicted one σ_{lnM} = 0.096. The estimation that comes out from the Nukers method is σ_{lnM} = 0.27 ± 0.05, also more than 3σ away. This might be the reason why the methods are biased, the overestimation of the intrinsic scatter may lead into a biased estimation of the intercept. The explanation for the latter is that the methods may not be able to distinguish the difference between the intrinsic scatter and the measurement errors, as they are, on average, four times larger. Another possible explanation is that the error propagation might not be as precise as required. We use symmetric errors in the logarithm space and they are calculated as the uncertainty over the quantity in the real space. This is just an approximation that, with our big uncertainties, might produce this type of bias.
Fig. B.1. Distribution of the estimation of the mass bias parameter when fixing the slope to one for Nukers, ODR, and MLEU methods (Set 3). Vertical dashed line represents the input value (1 − b) = 0.8. 
We also perform the same analysis varying the input value of the mass bias from 0.6 to 1.2 obtaining the same results as explained above. In every case, the (1 − b) parameter and the slope are biased in the same percentage as when using (1 − b) = 0.8.
Now, we discuss the case of a possible dependence of the mass bias parameter on the mass, in other words, letting the slope of the regression free to vary. The following results are independent of the initial settings of the simulation as shown in Table B.1. The ODR and the Nukers methods, as in the case of fixed slope, are biased in the recovery of the parameter (1 − b) to exactly the same percentage, as the recovery of the slope is almost perfect. The only difference is that the standard deviation is greater for obvious reasons. The MLEU, the BCES, and the CMLE fail completely when trying to recover the slope. Although the BCES and the CMLE do recover the mass bias parameter, these methods must not be trusted as the slope they recover is between 15% and 20% lower than the input value. The MLEU fails catastrophically in both tasks. The poor performance of these methods might be caused by the incorrect estimation of the intrinsic scatter because of the confusion with the huge measurement errors, similar to the case of fixed slope.
Figure B.3 shows the distribution of the intrinsic scatter for the three methods that estimate it. We think this is one of the key questions and is why the methods do not properly recover the input values of the parameters. As explained in the previous section, the theoretical intrinsic scatter can be calculated and there is no single method that estimates it correctly. We perform the same simulations setting the intrinsic scatter to zero and find very similar results to those discussed above. We also perform the simulations setting the measurement errors two orders of magnitude lower. In this case every method properly recovers the input values, and the intrinsic scatter is even within 1 and 2σ depending on the method.
In addition, we also explored whether or not a binning approach improves the result. For this study, we divided the simulated sample (set 3) into 5, 10, and 15 bins. In general, the binning approach increases the errors on the recovered parameters independently of the number of bins used, and is still fully consistent with the case of no binning. It slightly increases the bias in the recovery of the 1b from 6% to 10%, but is still consistent with the value for no binning. We conclude that this binning approach will not improve the results, and so we do not use it in this paper.
We conclude that there is no correct regression method to use in this configuration. In other words, each method is either biased or gives inaccurate results. The main source of the problem is the large measurements errors combined with the intrinsic scatter. We select the Nukers method as our reference method for two main reasons. It gives a robust estimation of the slope when we let it vary, and it has a small bias in the (1 − b) parameter which we can correct or account for. Other methods might give better estimations of the (1 − b) parameter, like the MLEU which is less biased than the Nukers, but this method does not correctly recovered the slope (see Fig. B.2). Taking into account both estimations, (1 − b) and slope, we recommend using the Nukers method for our particular set of data.
Fig. B.2. Distribution of the estimation of the mass bias parameter (top panel) and the slope (bottom panel) for the five tested regression methods (Set 3). Vertical dashed lines represent the input values of the simulation (1 − b) = 0.8 and α = 1. 
Fig. B.3. Distribution of the estimation of the intrinsic scatter for Nukers, MLEU, BCES, and CMLE (Set 3). Vertical dashed line represents the theoretical value of the intrinsic scatter σ_{lnM} = 0.096. 
All Tables
Results for the mass bias using the sample (1.5 × r_{200}), when restricting the analysis to those clusters with total number of spectroscopic members N in a certain range or interval.
Nukers (1 − B) estimates before and after the Eddington bias correction in S/N bins for the sample with an aperture cut of 1.5 × r_{200}.
Nukers (1 − B) estimates for different bins in redshift for the sample with an aperture cut of 1.5 × r_{200}.
Results for the recovered parameters. Input values are (1 − b) = e^{n} = 0.8, slope (m = 1), σ_{lnM} = 0.096.
All Figures
Fig. 1. PSZ2 cluster counts as a function of the S/N of the SZ detection The PSZ2North sample is represented in light blue, and the 1.5 × r_{200} sample is represented in beige. Dark blue bars represent the ratio between the 1.5 × r_{200} sample and the total number of clusters in the PSZ2North sample. The bin size is 0.5. 

In the text 
Fig. 2. Example of the distribution of galaxies in PSZ2 G009.04+31.09 as a function of the rest frame difference in radial velocity from the mean radial velocity of the cluster. The cluster members used to estimate the velocity dispersion are shown in blue. The red line represents the normal distribution expected for the estimated velocity dispersion of σ_{200} = 1068 km s^{−1}. 

In the text 
Fig. 3. Projected phase space and velocities histogram distribution for all of the 11 867 galaxy members in our sample. Member velocities are normalised to the mean cluster velocity dispersion, whereas the distance to the centre of the cluster is normalised to the value of r_{200} in each cluster. Horizontal black dashed lines are the 2.7σ clip. The red line represents a Gaussian (normal) fit to the velocity histogram with σ = 1. 

In the text 
Fig. 4. Scaling relation for the sample of 297 PSZ2 clusters (1.5 × r_{200} sample). The dashed black line shows the 1 : 1 line. The orange line represents our best fit using the Nukers method with α = 1 (see text for details). The green line is the fit using the complete Nukers method for a free slope. The shaded regions represent the 1 and 2σ errors of the reconstructed parameters. The vertical red dotted line corresponds to the pivot mass of 6 × 10^{14} M_{⊙}. 

In the text 
Fig. B.1. Distribution of the estimation of the mass bias parameter when fixing the slope to one for Nukers, ODR, and MLEU methods (Set 3). Vertical dashed line represents the input value (1 − b) = 0.8. 

In the text 
Fig. B.2. Distribution of the estimation of the mass bias parameter (top panel) and the slope (bottom panel) for the five tested regression methods (Set 3). Vertical dashed lines represent the input values of the simulation (1 − b) = 0.8 and α = 1. 

In the text 
Fig. B.3. Distribution of the estimation of the intrinsic scatter for Nukers, MLEU, BCES, and CMLE (Set 3). Vertical dashed line represents the theoretical value of the intrinsic scatter σ_{lnM} = 0.096. 

In the text 
Current usage metrics show cumulative count of Article Views (fulltext article views including HTML views, PDF and ePub downloads, according to the available data) and Abstracts Views on Vision4Press platform.
Data correspond to usage on the plateform after 2015. The current usage metrics is available 4896 hours after online publication and is updated daily on week days.
Initial download of the metrics may take a while.