Physical properties of Brightest Cluster Galaxies up to redshift 1.80 based on HST data

Brightest cluster galaxies (BCGs) have grown by accreting numerous smaller galaxies and can be used as tracers of cluster formation and evolution in the cosmic web. However, there is still a controversy on the main epoch of formation of BCGs, since some authors believe they have already formed before redshift z=2, while others still find them to evolve at more recent epochs. We aim to analyse the physical properties of a large sample of BCGs covering a wide redshift range up to z=1.8 and analysed in a homogeneous way, to see if their characteristics vary with redshift. As a first step, we also present a new tool to define for each cluster which galaxy is the BCG. For a sample of 137 clusters with HST images in the optical and/or infrared, we analyse the BCG properties by applying GALFIT with one or two Sersic components. For each BCG, we compute the Sersic index, effective radius, major axis position angle, surface brightness. We then search for correlations of these quantities with redshift. We find that BCGs follow the Kormendy relation (between the effective radius and the mean surface brightness), with a slope that remains constant with redshift, but with a variation with redshift of the ordinate at the origin. Although the trends are faint, we find that both the absolute magnitudes and effective radii tend to become respectively brighter and bigger with decreasing redshift. On the other hand, we find no significant correlation of the mean surface brightnesses or Sersic indices with redshift. The major axes of the cluster elongations and of the BCGs agree within 30 degrees for 73% of our clusters at redshift z<= 0.9. Our results agree with the BCGs being mainly formed before redshift z=2. The alignment of the major axes of BCGs with their clusters agree with the general idea that BCGs form at the same time as clusters by accreting matter along the filaments of the cosmic web.


Introduction
Galaxy clusters are the largest and most massive gravitationally bound structures observed in the Universe. As so, they constitute perfect probes to test cosmological models and help us better understand the history of the Universe, as they will constrain the limits of observed physical parameters through time, such as mass or brightness, in numerical simulations (Kravtsov & Borgani 2012). The ΛCDM model proposes a hierarchical evolution scenario starting from small fluctuations that assemble together via the gravitational force, and grow to form bigger and bigger structures. As a result, galaxy clusters are the latest and most massive structures to have formed.
Clusters are believed to be located at the intersection of cosmic filaments, and to form by merging with other smaller clusters or groups of galaxies, and by constantly accreting gas and galaxies that preferentially move along cosmic filaments and end up falling towards the center of the gravitational potential well, which often coincides with the peak of the X-ray emission (see De Propris et al. 2020, and references therein). Generally, the brightest galaxy of the cluster, the Brightest Cluster Galaxy (BCG hereafter) lies at the center of the cluster. It is usually a supermassive elliptical galaxy, that is formed and grows by mergers with other galaxies, and can be up to two magnitudes brighter than the second brightest galaxy. This property makes BCGs easily recognisable. BCGs have often been referred to as cD galaxies, i.e. supergiant ellipticals with a large and diffuse halo of stars. Since their properties are closely linked to those of their host cluster (Lauer et al. 2014), they can be extremely useful to trace how galaxy clusters have formed and evolved. BCGs tend to be aligned along the major axis of the cluster, which also hints at the close link between the BCG and its host cluster (Donahue et al. 2015;Durret et al. 2016;West et al. 2017;De Propris et al. 2020). This alignment suggests that the accretion of galaxies may be done along a preferential axis, with galaxies falling into clusters along cosmic filaments.
Most of the stars in today's BCGs were already formed at redshift z ≥ 2 ( Thomas et al. 2010). BCGs, especially the most massive ones, can present an extended halo made of stars that were stripped from their host galaxy during mergers, and form the Intra Cluster Light (ICL). When measuring photometric properties of galaxies, some parameters such as the BCG major axis can be difficult to measure accurately as the separation between the ICL and the external envelope of the BCG is not clear. However, the ICL is a very faint component, and as we are observing bright galaxies, the ICL should not strongly affect our study, so the ICL will not be considered in this paper.
The evolution of BCG properties with redshift is of interest to study cluster formation and evolution, but this topic remains quite controversial. Some authors report no evolution of the sizes of the BCGs with redshift (see Bai et al. 2014;Stott et al. 2011, Article number, page 1 of 30 arXiv:2102.01557v1 [astro-ph.GA] 2 Feb 2021 A&A proofs: manuscript no. main and references therein): Stott et al. (2011) found that there was no significant evolution of the sizes or shapes of the BCGs between redshift 0.25 and 1.3; whereas Ascaso et al. (2010) found that, although the shapes show little change, they have grown by a factor of 2 in the last 6 Gyrs. Bernardi (2009) found a 70% increase of the sizes of BCGs since z = 0.25, and an increase of a factor 2 since z = 0.5. Bai et al. (2014) found that while the inner region of the galaxies doesn't grow much, the light dispersed around the BCG forms the outer component, resulting in a shallow outer luminosity profile. This is an indication of an inside-out growth of BCGs: the inner component forms first and then stops growing while the outer component develops. Edwards et al. (2019) gave more evidence to justify this inside-out growth of BCGs by showing that the stars in the ICL are younger and less metal-rich than those in the cores of the BCGs. They also showed that the most extended BCGs tend to be close to the X-ray center. This last statement is supported by Lauer et al. (2014), who added that the inner component would have already been formed before the cluster, while the outer component, the envelope of the BCG, is formed and grows later. In fact, numerical simulations with AGN suppressed cooling flows, showed that about 80% of the stars are already formed at redshift z ≈ 3 in the BCG progenitors that merge together to form today's BCGs (De Lucia & Blaizot 2007). Cooke et al. (2019b) found that BCG progenitors in the COSMOS field have an active star formation phase before z = 2.25, followed by a phase of dry and wet mergers until z = 1.25 that leads to more star formation and increases the stellar mass of the progenitors, after which the stellar mass of progenitors mainly grows through dry mergers, and half of the stellar mass is formed at z = 0.5. Similarly, Cerulo et al. (2019) did not find significant stellar mass growth between z = 0.35 and z = 0.05, suggesting that most of the BCGs stellar masses were formed by z = 0.35. Durret et al. (2019) observed a possible variation with redshift of the effective radius of the outer Sérsic component of BCGs for 38 BCGs in the redshift range 0.2 ≤ z ≤ 0.9, agreeing with a scenario in which BCGs at these redshifts mostly grow by accreting smaller galaxies.
Several conflicts also arise on the growth of the stellar masses of the BCGs: Collins & Mann (1998);Collins et al. (2009) ;Stott et al. (2010) found little to no evolution. On the other hand, other studies found a strong evolution in the stellar masses of BCGs since redshift z = 2 (Aragon- Salamanca et al. 1998;Lidman et al. 2012;Lin et al. 2013;Bellstedt et al. 2016;Zhang et al. 2016).
In the present paper, we will try to characterize how the properties of BCGs have evolved since z = 1.80, based on HST data to have the best possible spatial resolution, which is particularly necessary at high redshift. When dealing with a large amount of data, identifying individually the BCG of a cluster to build a sample can be a long task to do. This is why we present here a method based on several photometric properties of the BCGs, that will allow to detect BCGs automatically. We analyze a sample of 137 galaxy clusters, covering the redshift range 0.187 ≤ z ≤ 1.80, and including various types of BCGs (star forming BCGs, SF BCGs hereafter, interacting BCGs, hosts of possible AGNs, supercluster members, etc.).
The present paper (in green in Figure 1, the figure will be described in more detail in the next section) will allow to cover a large redshift range with one of the largest samples observed with HST (indicated in red). This will enable us to obtain more significant statistics on the evolution of BCG properties.
The paper is organized as follows. We will describe the data in Section 2, the method to detect automatically the BCGs in The samples represented in black use ground-based telescope data, space-based data excluding HST, or a mix of ground-based and space-based data, while those in red only use HST data. Our initial sample is represented by the red and green dotted line, and our final BCG sample in green (see Section 4).
Section 3, and the modelization of their luminosity profiles in Section 4. The results obtained as well as a short study of the link between the BCG masses, distance between the BCG and the X-ray center of the cluster, and physical properties are given in Section 5. A final discussion and conclusions are presented in Section 6.
Throughout this paper, we assume H 0 = 70 km s −1 Mpc −1 , Ω M = 0.3 and Ω Λ = 0.7. The scales and physical distances are computed using the astropy.coordinates package 1 . Unless specified, all magnitudes are given in the AB system.

The sample
The sample studied in this paper consists of 137 galaxy clusters with HST imaging taken from Jee et al.  Sazonova et al. (2020) . We also add five more distant clusters at z ≥ 0.8, as well as the cluster Abell 2813 at z = 0.29. Among them, we identify twelve clusters which present in their center two BCGs similar in magnitude and size (see Section 3). As a result, our final BCG sample contains 149 BCGs (the number of BCGs is not equal to the number of clusters studied because of these clusters with two BCGs). This sample is a good representation of the most massive BCGs in the range 0.1 ≤ z ≤ 1.80. The redshift distribution is shown on Figure 2.
All of these clusters have data available from the Hubble Space Telescope, obtained with the Advanced Camera for Surveys (ACS) in optical bands and/or the Wide Field Camera 3 (WFC3) in infrared bands, resulting in good quality images. This allows us to perform accurate photometry with relatively good precision, and to treat all the BCGs in a homogeneous way. Contrary to other studies such as Bai et al. (2014), we do not exclude in our study clusters with bright nearby objects that may hinder our measurements near the BCG area. We also identify in our sample two clusters host of blue BCGs (negative rest frame bluered color), with active star forming regions inside the BCGs. These two BCGs will be described more in details in Section 3.
Our sample covers a large range in redshift, which will enable us to trace the history of cluster formation through time.  (Bai et al. 2014;DeMaio et al. 2019;Durret et al. 2019) and/or used ground-based data (represented in black). Our sample contains more clusters and BCGs at high redshifts (z ≥ 0.7) than that of Lidman et al. (2012) (33 in Lidman et al. 2012. With the present study, we therefore almost double the previous samples and cover a larger range in redshift. This enables us to obtain more significant statistics on the evolution of BCG properties.  West et al. (2017) andDe Propris et al. (2020) mainly focus on the alignment of the BCGs with their host cluster and on the evolution of the BCG stellar masses. Our work constitutes a deeper analysis since we also study the luminosity profiles of the BCGs.

Retrieving data and cluster information
We retrieve all the FITS images from the Hubble Legacy Archive (HLA 3 ). We look for combined or mosaic images according to what is available and download stacked images directly from the HLA. To avoid handling such heavy files, we first crop these images, define the new center on the cluster coordinates found in NED, and create a new image 1.2 Mpc wide. Linear scales (arcsec/Mpc) are determined from the cluster redshifts in the literature (from Jee et al. 2011;Postman et al. 2012b;Bai et al. 2014;Donahue et al. 2015;West et al. 2017;DeMaio et al. 2019;Durret et al. 2019;Sazonova et al. 2020, or found in NED for the five other clusters we added). Cluster information can be found in Table 1. It is necessary to add the keywords 'GAIN' and 'RD-NOISE' in the header of the FITS images, which will be used later by SExtractor or GALFIT. As the images are in units of electrons/s, we set the GAIN to 1 and multiply the images by the total exposure time (EXPTIME) to get back to units in electrons. Single exposure images were summed with AstroDrizzle to get the final combined images. We also retrieve the associated weight maps (wht fits) obtained applying the option IVM (Inverse Variance Map) of AstroDrizzle.

Procedure for the detection of the BCG
The definition of the BCG that we use throughout this paper is the following: the BCG is the brightest galaxy of the cluster that lies close to the cluster center, defined as the center of the cluster member galaxy distribution. Generally, the cluster center is defined as the X-ray center of the cluster, as X-rays trace the mass distribution better. However, it is difficult to obtain X-ray data, particularly at high redshifts. If a large sample is considered, most probably a good fraction of the clusters do not have X-ray data available. Moreover, it was shown in several studies (Patel et al. 2006;Hashimoto et al. 2014;De Propris et al. 2020) that BCGs are often displaced from the X-ray center. For these reasons and anticipating for future works with much larger samples, we use a definition which is independent from X-rays and only relies on optical and infrared photometric data. We define the center as that of the spatial distribution of cluster galaxies (as in Kluge et al. 2020). As a matter of fact, X-ray coordinates are only available for 68 out of the 137 clusters in our sample (see Table 4). These X-ray positions will only be used to study whether or not the BCG properties correlate with their position relatively to the X-ray center (see Section 5).

Method for detecting red BCGs
The method applied to select automatically red cluster BCGs is schematically summarized in Figure 3, and will be described in detail below. The efficiency of the method will be discussed in Section 3.1.4. Blue BCGs will be mentioned in Section 3.2.

Rejection of foreground sources
In order to differentiate the BCG from other objects in the field, we need to identify which objects are part of the cluster and which are not. We describe below step by step our method to detect the BCGs among all the contaminations (stars, foreground and background galaxies, artifacts) in our images.
Measurements with SExtractor (refer to Bertin & Arnouts 1996, for more details on the following mentioned parameters) were done using two different deblendings (parameters DEBLEND_MINCOUNT = 0.01 and DEBLEND_MINCOUNT = 0.02). The smallest deblending parameter, i.e, the finest deblending, is sufficient to separate two nearby galaxies without fragmenting excessively spiral galaxies in the foreground, and provides the most accurate measurements. However, BCGs can present a very diffuse and luminous halo which may be associated with intracluster light (ICL). We noticed that, in presence of nearby bright sources in the region of the BCG, SExtractor would detect only those foreground sources and process the BCG halo as a very luminous background. We therefore decided to run SExtractor in parallel with a coarser deblending to take this into account. The two catalogs obtained with two different deblending parameters are then matched: we keep the values obtained with the finer deblending, and add all new objects detected using a coarser deblending.
We compute the magnitude at which our catalogue is complete at 80%, m 80% . For this, we plot the histogram in apparent magnitudes and fit the distribution up to the magnitude at which the distribution drops. By dividing the number of detected sources by the total number of sources that is expected to be detected in a magnitude bin (given by the fit), we compute the completeness of the catalogue at each bin. We can then determine m 80% , and make a cut in apparent magnitude to reject all galaxies with m ≥ m 80% + 2, as the photometry will not be accurate for these faintest objects.
Our procedure to reject the various contaminations is as follows. First, we query in NED for all the sources in the region Fig. 4. The BCG in RX J2129+0005 at redshift z = 0.234 has an extended and luminous halo, which makes it difficult to accurately estimate the a/b axis ratio. In this case, the major axis has most likely been overestimated, as the diffuse light is extended along this axis. The image was taken with the F775W ACS filter.
of the cluster and reject all those with a spectroscopic redshift that differs by more than 0.15 from the cluster redshift (|z -z spec | > 0.15). We identify bad detections by their magnitude values which get returned as MAG = 99.99 by SExtractor. All point sources or unresolved compact galaxies are eliminated using the parameter CLASS_STAR ≥ 0.95 in SExtractor, which requires a PSF model to be fed into SExtractor, created with PSFex (Bertin 2011). Most foreground galaxies can be identified by their excessively bright absolute magnitude when computed from their MAG_AUTO magnitude and assuming they are at the cluster redshift. We thus exclude all sources with an absolute magnitude MAG_ABS ≤ −26.
We can identify edge-on spiral galaxies, which appear very elongated. They can be filtered by making cuts in elongation (defined in SExtractor as the ratio of the galaxy's major to minor axis). As will be explained in Section 3.1.2, we consider two different filters. We define two different cuts depending on the filter we are looking at: in the bluest filter, we apply the criterion ELONGATION ≤ 2.3, and in the reddest filter, ELONGATION ≤ 2.6. The latter limit may seem quite high to filter efficiently all edge-on spiral galaxies. However, because of deblending issues, measuring with precision the lengths of the major and minor axes of the sources can be difficult, and will sometimes lead to a very elongated object. A very bright and elongated halo around the BCG, which can be linked to the ICL, will possibly return a high a/b axis ratio. This is the case for the BCG in RX J2129+0005, which has the highest a/b elongation (in the F606W filter) measured in our sample, reaching a/b=2.57 (see Figure 4). As the reddest filter is more sensitive to the ICL, we prefer to define a limit that is not too strict on this filter. It will not eliminate all edge-on galaxies (for that, we should lower the limit), but we cannot take the risk of filtering out any of the BCGs we are looking for. That's why we define a different, stricter limit on the bluest filter, as it will be more sensitive to the blue stellar population present in the disk of spiral galaxies, and less to the ICL.

Selection of red cluster galaxies
Early-type galaxies in clusters are usually easily recognizable by their red colors, since they are mostly red elliptical galaxies, without star formation. While blue spiral galaxies also exist inside the cluster, they are a minority, and red elliptical galaxies draw a red sequence in a color-magnitude diagram, which has a low dispersion. We thus apply a filter in color in order to only keep the red galaxies that form the red sequence.
To extract all the red early-type galaxies in a cluster at redshift z, we model their color using a spectral energy distribution (SED) template from Bruzual & Charlot (2003). The model is similar to the one used by Hennig et al. (2017): a single period of star formation beginning at redshift z f = 5, with a Chabrier IMF and solar metallicity, that decreases exponentially with τ = 0.5 Gyr. However, the model differs from Hennig et al.
(2017) on the star formation redshift: we chose a higher z f to better model clusters at higher redshifts, as Hennig et al. (2017) limit their study to redshift z = 1.1. We reject all blue galaxies (blue-red ≤ 0) and all galaxies whose measured (blue-red) color differs by more than 0.60 magnitude from the model. While the red sequence of a cluster presents a rather narrow colormagnitude relation, and therefore very little dispersion, this large limit of 0.60 was fixed in order to take into account photometric uncertainties due to deblending issues, redshift uncertainties, or simply to the accuracy of the model used (Charlot, private communication). The color is computed considering a fixed aperture of 35 kpc in diameter (parameter MAG_APER), which is large enough to contain all of the galaxy's light. All magnitudes are Kcorrected (K-correction values are taken from the EZGAL BC03 computed model) and we also take into account galactic extinction. Dust maps were taken from Schlegel et al. (1998) and reddening values for the ACS and WFC3 bandpasses were taken from Schlafly & Finkbeiner (2011), considering a reddening law R V = 3.1. The color computed will depend on the filters available and on the redshift of the cluster. The colors computed for each cluster can be found in Table 1. The filter transmissions are normalized to 1 for better visualisation, and the break at 4000 Å is marked as a red vertical line for reference. In this case, the chosen (blue-red) rest frame color is F814W-F105W, and the filter we choose for the final step (modelisation of the luminosity profile with GALFIT) is F140W (see Section 4).
The rest frame (blue-red) color to compute is defined as the color based on two magnitudes with the smallest wavelength difference that bracket the 4000 Å break at the cluster redshift. Depending on what filters are available for each cluster, the selected filters will differ. An example is given in Figure 5, for a cluster at z = 1.322; in this case, the filters bracketing the 4000 Å break are F814W and F105W 4 . In the cases where the two optimal filters are not available, the color used for tracing the red sequence galaxies at different redshifts may not be efficient; for instance, the use of the color (F606W -F140W) would not enable us to optimize the selection, as a galaxy at higher redshift (z = 1.65 for example) than the cluster redshift (here z = 1.322) would have the same color and will not be filtered out.

Rejection of spiral and isolated galaxies
The cut in colors is an important step that allows to remove most of the spiral galaxies and to maximize the number of ellipticals in our catalogues. However, a few foreground galaxies may still remain, and we describe here the method used to remove them.
The algorithm described hereafter will be applied to every single galaxy, from the brightest to the faintest, until the cluster BCG is found. We refer the reader to the sketch shown in Figure 3. The procedure includes the following steps: -Step 1: for each cluster, we sort the catalog from the brightest to the faintest galaxies, and going down their brightnesses, we exclude galaxies that are too isolated from the rest (explained hereafter). We define the BCG as the brightest elliptical galaxy at the center of the galaxy density distribution. The method to calculate the center is as follows: We compute N neigh , the number of cluster members, i.e. red sequence galaxies, found in a fixed aperture of 200 kpc radius centered on each galaxy in the final catalogue, and note N max the maximum number computed. If N is the total number of red sequence galaxies whose colors fall within 0.60 magnitude from the model, and N neigh is the number of neighbours of a given galaxy in an aperture of 200 kpc, we consider that a galaxy is isolated and unlikely to be the BCG if the ratio P = N neigh /N is smaller than 40% of P max = N max /N. Considering that we cropped our images to cover a projected area of 1.2x1.2 Mpc 2 , the aperture of diameter 400 kpc taken here represents one third of the side of the images. This is small enough to detect high density areas on the image, and big enough to work on clusters with a high spatial extent. After several trials adopting different values, the value of 200 kpc radius is the one that works best. Less than 200 kpc becomes too small for extended clusters, while a higher radius makes it difficult to detect the smaller density fluctuations, as the covered area becomes large. The limit defined at R lim = 0.40 R max was also determined after several tests. This condition allows to take into account the cluster richness and spatial extent, as well as the possible offset of the BCG relatively to the cluster center. -Step 2: the next step consists in filtering out the last spiral galaxies that remain among the potential BCG candidates.
We run SExtractor to model the potential BCG with a bulge and a disk component. We find that spiral galaxies have a very faint bulge (parameter MAG_SPHEROID): as can be seen on Figure 6, a spiral galaxy (shown in blue) near the cluster ClG J1604+4304 prevented us from successfully detecting the BCG. We see a gap in magnitude between the spiral galaxy and the other BCGs (which are not all pure ellipticals). This enables us to define a new cut in magnitude to remove these remaining spirals: MAG_SPHEROID ≤ 24. -Step 3: if a galaxy complies with these conditions, i.e. not being isolated and not being a spiral, we keep it as BCG 1 if no other BCG candidate was found before, and as BCG 2 otherwise. We don't proceed to the next step until a BCG 2 is defined.

-
Step 4: we check if there are more red sequence members in the same aperture for BCG 2 than the number defined in Step 1 for BCG 1 , by comparing their P Neigh ratios (defined in Step 1). We will note them P BCG1 and P BCG2 . If P BCG1 ≤ 0.95 P BCG2 , and if less than 30% of N Neigh,BCG2 are in common with BCG 1 , BCG 1 is eliminated and we define BCG 2 as the new BCG 1 . The factor of 95% ensures that the overdensity in which BCG 2 resides is significantly richer than the one in which BCG 1 is. The second criterion on the number of common galaxies to BCG 1 and BCG 2 is to make sure that we are not replacing a BCG that is not at the very center of the cluster by another galaxy that is closer. This criterion is necessary to avoid eliminating BCGs that are a little offset from the center of the cluster, where the density is higher. It allows to check that the two galaxies are not in the same area in the sky, i.e, we check that BCG 2 does not belong to the same clump (overdensity) as BCG 1 , or that the two galaxies do not belong to the same cluster. -Step 5: this step is taken only if BCG 2 is defined, otherwise we repeat the previous steps until it is found. If BCG 1 and BCG 2 are similar in sizes (ratio of the major axes a BCG2 /a BCG1 ≥ 0.80) and brightnesses (magnitude difference mag BCG2 -mag BCG1 ≤ 0.2), we keep both BCG 1 and BCG 2 as the BCGs of the cluster. Otherwise, BCG 2 is eliminated and BCG 1 is defined as the BCG.
The values above do not always work for poor clusters, i.e., when the number of cluster members is low, or when the density of red sequence galaxies is low. There is no problem when all the cluster members are concentrated in the same area (with a size comparable to the previously defined aperture), but if the members are dispersed over the sky and cover a large area, an aperture of 200 kpc radius becomes too small to detect density fluctuations on the sky. We thus differentiate these clusters by their number of red sequence members, N, and by the previously defined parameter P max (see Step 1). We separate the poor clusters with P max ≤ 0.25 and N ≤ 100 (very extended cluster with no important density clumps), and P max ≤ 0.5 and N ≤ 40 (low number of red sequence galaxies, extended spatial distribution). We were not able to correctly determine the BCGs for these clusters by defining a 200-kpc radius aperture, so for these poorer clusters, we consider a bigger aperture of 500 kpc radius. To take into account the bigger aperture, we also modify the second criterion in Step 4. We check that the two BCGs candidates, BCG 1 and BCG 2 , have less than 50% galaxies in common in the same aperture, to guarantee that they are not both residing in the same cluster.

Results for detected red BCGs
Among the 137 clusters in our sample, 50 clusters only had one filter available, and were thus excluded from this procedure. For these 50 clusters without available colors, we checked visually the images to determine the BCG, and checked with X-ray maps or other studies before adding them to the final sample.
In order to assess the efficiency and accuracy of our detection method, we checked each detection visually and compared it with other studies and with any X-ray map we could find. We compared the X-ray map to the position of the detected BCG to make sure that it is not too far from the X-ray peak (but not necessarily located at the peak, in a radius of about 200 kpc).
During this verification, we found that our detection differs from that of Durret et al. (2019)  define the BCG as the one in the southern structure, whereas we detect a brighter galaxy in the northern structure, that we hence define as the BCG. We choose to keep our detection as it lies near the X-ray peak in the northern structure and is surrounded by galaxies at the cluster's redshift. We find, by checking visually, that the BCG in SpARCS-J0224 defined in Bai et al. (2014) is a spiral galaxy. We thus choose to keep our detection, which is an elliptical galaxy located just south of their detection.
A few star forming BCGs can be found in our sample. We find red BCGs with a very high SFR: we give the example of MACS J0329.6-0211 at z = 0.45. This BCG has an almost starburst level of UV continuum and star formation (Donahue et al. 2015). Images of this BCG in the UV continuum and Hα- [NII] lines are given by Fogarty et al. (2015), illustrating the distribution of star formation throughout the galaxy. The high star formation rate of about 40 M yr −1 was confirmed by Fogarty et al. (2017), based on Herschel data. Green et al. (2016) also indicate that this galaxy hosts an AGN, and is quite blue (blue − red = −0.71), with strong emission lines and a rather high X-ray luminosity of 11.85 × 10 44 erg s −1 .
Overall, all the red BCGs, even the non pure elliptical BCGs, or those with not optimized colors because of the lack of available filters, were successfully detected with our method. We successfully detect 97% of the BCGs in our sample, and all the red BCGs are found. The method is effective to detect red BCGs presenting different morphologies and characteristics (mergers, star forming, traces of dust in the core, disturbed).
It is to note, though, that this method may be less reliable for poorer clusters (as defined in the previous subsection). As we were conducting several tests, trying different values of apertures in which we computed the number of red sequence galaxies, or the threshold below which galaxies are considered as isolated, we found that the detection efficiency for poorer clusters was more sensitive to these parameters. As this method relies on the density of red sequence galaxies in a small aperture, BCGs that are a little offset from the density peak (which is more difficult to calculate for poor clusters with an extended spatial distribution) could be eliminated, and rejected as being isolated from the other red sequence galaxies.
It may also be important to note that, in the presence of more than one cluster, i.e. two clusters interacting with each other, or in the case of superclusters, only the brightest galaxy of one substructure will be detected. For MACS-J0717.5+3745 for example, which we already mentioned, the BCG of the northern clump being the brightest, it is the one that is detected by our algorithm.

Finding blue BCGs
Out of the 98 BCGs (87 clusters, 11 clusters with two BCGs) in our sample which we tried to detect, we find two peculiar BCGs with blue colors.
BCGs are in majority quiescent galaxies, and their dominant stellar population is typically red and old. As they grow by undergoing mergers through time, all their gas is consumed, and we expect the star formation to be quenched or suppressed. However, we do observe, both today and in the distant universe, BCGs with intense UV emitting filaments or knots, hinting at active star formation. Cerulo et al. (2019) found that 9% of his sample of massive BCGs in the redshift range 0.05 ≤ z ≤ 0.35 from the SDSS and WISE surveys have blue colors (which they define as galaxies with colors 2σ bluer than the median color of the cluster red sequence), and are star forming. What we will refer to from now on as star forming BCGs (SF BCGs) have only been observed in cool core clusters so far. Their morphology can be quite different from that of a simple elliptical galaxy, as was stated before. These galaxies can appear disturbed, with a complex structure showing a possible recent or ongoing merger. Such examples of SF BCGs show that BCGs are not all simple ellipticals.
Two BCGs, RX J1532+3020 (z = 0.3615) and MACS J1932-2635 (z = 0.352), were not correctly detected as they are cool core BCGs with an extremely active star forming center, so they were eliminated because of their blue colors. These two BCGs were identified by eye and added manually, after checking and confirming with other studies. Their images are shown in Figure 7. These two BCGs are the only blue BCGs in our sample (blue meaning a negative rest frame blue-red color), out of the 98 BCGs for which we compute a color. While comparing with other studies, we find a few other BCGs that are star forming, but are still red.
RX J1532+3020 is one of the most extreme cool core galaxy clusters observed today, as well as one of the most massive. An intensive study by Hlavacek-Larrondo et al. (2013) shows the existence of a western and an eastern cavity, which are used to quantify the AGN feedback at the center of the galaxy. These authors estimated that this feedback would release at least 10 45 erg s −1 , which would prevent the Intra Cluster Medium (ICM) from cooling, and would then allow to solve the cooling flow problem in cool core clusters. The BCG of this cluster is a radio loud galaxy that presents in its central regions UV filaments and knots, as well as traces of dust, hinting at recent star formation, with a SFR of at least 100 M yr  MACS J1932-2635 is another cool core cluster with a huge reservoir of cold gas in the core, of mass (1.9± 0.3) × 10 10 M , which makes it one of the largest reservoirs observed today, in which Fogarty et al. (2019) detected CO emission as well as UV knots and Hα filaments around the BCG. They measured a SFR of 250 M yr −1 and also observed an elongated tail that extends to the northwest, with traces of cold dust in the tail, which they suspect might be caused by a recent AGN outburst.
In order to detect these blue BCGs, we would have to relax the condition on the color. However, this condition is necessary in order to remove most of the spiral galaxies, and we find that allowing galaxies with blue colors will make the method much less reliable, as the red sequence will be ill-defined. Our method is thus only reliable to detect red BCGs, even if they are not pure ellipticals (star forming or merging galaxies for example).

Luminosity profiles
We fit 2D analytical models on sources with GALFIT (Peng et al. 2002). Once the BCG is defined, we run SExtractor one last time to return model fit parameters in the available filter closest to the F606W rest frame at redshift z (see Figure 5), which is at a wavelength above the 4000 Å break and thus is in the spectral region where we will get the highest flux. The chosen filters can be found in Table 1. We note that there are 37 BCGs out of the 149 for which HST data are not available in the F606W rest frame or redder. The reddest filter is either bluer than the 4000 Å break or contains it, which means that we are not only looking at the oldest, reddest star population, but at the youngest bluest stars as well. These BCGs are marked by blue squares in plots. The redshift distribution of all our BCGs is plotted in Figure 2, the blue histogram represents the clusters with filters which are bluer than the 4000 Å break. These clusters observed in too blue filters are mainly between redshifts 0.7 and 1.2.

Masking
We first need to mask all the neighbouring sources. We take the SEGMENTATION map returned by SExtractor, and unmask the BCG (which is identified by an identification number), and also mask any blank region on the image. Because of deblending issues, it is more than likely that other objects, projected on the BCG, need to be masked.
We use sharp divided images to detect any neighbouring objects that pollute the signal. Sharp divided (SD) images (see e.g. Márquez et al. 1999Márquez et al. , 2003 are obtained by dividing the images by the median filtered corresponding images. This brings out all the small neighbouring sources that may have been hidden by the luminous halo of the BCG. We run SExtractor (again) on this SD image, and mask all the objects that are farther than 0.5 arcsec from the BCG coordinates (an example is given in Figure 8), which is the minimum distance required to avoid masking the BCG center. As can be seen on Figure 8, the sources masked based on the SD image detection seem larger on the final mask than on the SD images, as the SD image does not show the true sizes of the objects. We apply a factor of 6 to the minor and major axes of the sources detected by SExtractor on the SD image to create our final mask. This factor allows to include all the luminosity of the sources and to mask them efficiently. If necessary, we identify by eye and draw the regions to be masked ourselves in SAOImage DS9 and create a new mask.

PSF model
To obtain a successful model of the galaxy profiles that also works for the inner regions, an accurate description of the PSF is needed. While the PSF we used for the photometry may have been sufficient to distinguish stars from galaxies, GALFIT requires the PSF to meet a number of criteria: a very high SNR, a flat and zero background (if not, any pattern in the background will appear on the model image when convolved with the PSF); it should match the image (diffraction rings and spikes, speckle pattern, etc.) and be correctly centered (see GALFIT Technical FAQ).
We first substract from the images the sky background, which is determined by masking all sources and blank areas on the cluster image, using the routine calc_background with a 3σ clipping method. We then use PSFex, and make a selective sample of the stars that will go into making the PSF. We select all point-like sources with FLAGS = 0, MAG_AUTO ≤ 21, ELONGATION ≤ 1.1, CLASS_STAR ≥ 0.98, SNR ≥ 20 and an isophotal print ISOAREA_IMAGE ≥ 20 pixels.
Since we work on HST observations that cover a small field of view, there may not be many bright stars in the field of the cluster that we could use to compute a PSF. We tried to take several faint stars and stack them to increase the SNR of the PSF. However, we find that this often results in a PSF with an uneven background, which stands out during the model fitting returned by GALFIT, and this usually ends up being a bad fit (too large effective radius, large uncertainties). Since we are working on space observations, the PSF does not vary much, and though it may vary with time, the variations should not be significant (see Martinet et al. 2017). This means that we can replace the PSF for a given filter by another one in the same filter with a better SNR. Higher SNR PSFs return better fits.
Modeled and theoretical PSFs are available for ACS/WFC and WFC3/IR. However, according to GALFIT Technical FAQ 5 (which we refer the reader to), the profiles obtained with models may not be realistic for space-based images, so we prefer to use observed PSFs.

Profile fitting
We use GALFIT to fit two different models to our BCGs: a single Sérsic component or two Sérsic components, to allow different contributions from the inner and outer parts of the galaxies. We also tried to apply other models or combinations of models including a de Vaucouleurs profile, but they always provided worse results (i.e. they gave a worse χ 2 , and about 30% of the BCGs were not well fitted with one or two de Vaucouleurs profiles), so we will only discuss the results with Sérsic fittings.
One needs to give GALFIT an estimate of all the initial parameters: the effective surface brightness or total magnitude, the effective radius (the radius at which half of the total light of the galaxy is contained), the elongation or the position angle (PA) of the BCG. These initial guesses are taken from the SExtractor catalogs: MU_EFF_MODEL or MAG_AUTO, FLUX_RADIUS, ELONGATION, and THETA_IMAGE. We don't have an estimate of the BCG Sérsic index, so we start from the value corresponding to the de Vaucouleurs profile: n = 4. If the fitting does not converge, we try different Sérsic indices in the range 0.5 to 10. For the second Sérsic component that accounts for the inner part, the following parameters are considered: MU_EFF_SPHEROID, SPHEROID_REFF_IMAGE, SPHEROID_SERSICN and SPHEROID_THETA_IMAGE. The suffix SPHEROID refers to the bulb component when SExtractor tries to model a disk and a bulb to a galaxy. We consider an elongation (minor to major axis ratio, b/a) of 0.90 for the inner part, as an initial guess. The region to fit is a box that is 2.5 r Kron 5 https://users.obs.carnegiescience.edu/peng/work/galfit/TFAQ.html wide (cf. GALFIT FAQ), r Kron being the Kron radius returned by SExtractor. This is large enough to contain all the light from the BCG as well as some sky background, and is a good compromise to obtain good fits of our galaxies.
We first run GALFIT to fit the BCGs with one single Sérsic component. If it does not manage to converge with a Sérsic index of n = 4, we try different values between 0.5 and 10 until it converges to a meaningful fit, and reject any fit with returned effective radius larger than half the size of the fitting region, which is to say R e ≤ 2.5 r Kron /2 pixels. We then use the output parameters as initial guesses to fit the outer part of the galaxy and add another Sérsic component to fit the inner part of the galaxy. If it does not converge towards meaningful values, we increment the Sérsic index until it manages to fit the BCG. For pairs of BCGs (two brightest cluster galaxies with similar sizes and magnitudes), we fit both of them simultaneously.

Choice of the best fit model
The quality of the fit can be estimated from the reduced χ 2 (χ 2 ν ), which should be close to 1. From our results, we notice that χ 2 ν > 1.2 or χ 2 ν < 0.8 often indicate a bad fit. This happens when the model used to fit the BCG is not adapted, or when the initial parameters given are bad estimates. In this case, GALFIT may also not have converged and/or crashed.
To decide if a second component is really necessary to fit the BCG, or if one component gives equally good results, we use the F-test (Simard et al. 2011;Margalef-Bentabol et al. 2016).
As stated in Bai et al. (2014), as the background noise is not gaussian, the meaning of χ 2 ν is not as significant, and when comparing two models, a χ 2 ν closer to unity does not necessarily mean that it is a better fit. So we prefer to use a F-test.
The F-test states that if the P-value, determined from the Fvalue and the number of degrees of freedom, is lower than a probability P 0 , then you can reject the null hypothesis and consider that the second model gives a significantly better result than the simpler one. The F-value is defined as the ratio of the reduced χ 2 , χ 2 ν , of both models. GALFIT returns the χ 2 as well as the χ 2 ν of the fit, but instead of directly considering the output χ 2 ν computed by GALFIT, we compute χ 2 ν as: Article number, page 9 of 30 A&A proofs: manuscript no. main  with n do f the number of degrees of freedom, which is here defined as the number of resolution elements, n res , minus the number of free parameters in the model, n f ree . n res can be calculated as follows (see Simard et al. 2011;Margalef-Bentabol et al. 2016): where n pixels is the number of unmasked pixels used for the fitting, and θ is the full width at half maximum (FWHM) of the given PSF, in units of pixels. n do f is then: The P-value is then calculated with the routine f.cdf from scipy.stats in python. We set P 0 = 0.32, which represents a 1σ threshold value (Margalef private communication). We show the distribution of P-values as a function of redshift in Figure 9. A P-value ≤ P 0 means that we need a second component to correctly model the BCG light distribution. On this plot are not represented BCGs which could not be fitted by either model: 9 BCGs could only be fitted with a single component, 22 could only be fitted with two components, 2 could only be fitted by fixing the Sérsic index n = 4 (de Vaucouleurs profile), and 2 BCGs could not be fitted or returned a poor fit for either of the models. On Figure 10, it appears that BCGs that need a second component to obtain a good fit tend to be at lower redshifts (peak at z = 0.3), while the distribution for those that were well fitted with a single component is flatter. We also find BCGs with a model with two Sérsic profiles at higher redshifts (14 BCGs at z ≥ 1.0). If the chosen model depended on the distance, we would have expected not to have two component BCGs at higher z, which is not the case.
We must remember that part of our sample (37 BCGs) is studied in a too blue rest frame filter, and for these clusters we are not looking at the same star population. Without taking into account those observed in too blue filters, we find that 55 out of 72 BCGs (76%) at redshift z ≤ 0.8 need a second component, while the trend is reversed at z > 0.8, as 23 out of 38 BCGs (61%) can be well modeled with only one component. We also notice that the BCGs observed with too blue filters can in majority (62%) be modeled with only one Sérsic.
We can also wonder if the existence of these two distinct populations (BCGs with two components at low z, and BCGs with a single component throughout redshift), with a limit around redshift z = 0.8, may be due to the fact that BCGs at higher redshifts will be less resolved than their lower redshift counterparts.
To test this hypothesis, we bring a sample of 44 BCGs at redshifts z ≤ 1.0 to a common physical scale at redshift z = 1.2. We smooth the images with a gaussian and repeat the previous steps. The σ gauss of the gaussian to apply is calculated as : σ gauss = sqrt(σ 2 z=1.2 − σ 2 z,cluster ) with σ z,cluster computed from the FWHM of the image we want to degrade, and σ z=1.2 the σ at the reference redshift z = 1.2, which was computed as: σ z=1.2 = σ z,cluster * pixscale z=1.2 pixscale z,cluster Ouf of the 44 BCGs at z ≤ 1.0 on which we did this test, 30 returned similar results as those obtained with the original (unsmoothed) images. We also find that a few BCGs (seven) which were better fitted with two Sérsic components can be modeled just as well, according to the F-test, with only one Sérsic after smoothing the images. Surprisingly, the opposite also happened for seven other BCGs: among these seven BCGs, four could not initially be fitted with two components while the other three are close to the P-value limit, P lim = 0.32. As 68% of the tested BCGs showed no significant difference, we can confirm that the lack of resolution for the farthest BCGs does not cause the absence of an inner component for BCGs at higher redshifts.

BCGs observed in too blue filters
In all that follows, when considering together the results from BCGs better fit with one or two Sérsic components, we consider the values obtained for the outer Sérsic component (R e,out ≥ R e,in ).
As stated before, we have 37 BCGs observed in too blue filters (relatively to the 4000 Å break). We must determine if they can be taken into account in our final study. For this, we run a test on 40 clusters with filters available on the blue side of the 4000 Å break as well as appropriate red filters, to check if the returned parameters vary depending on the filters chosen. We find that the absolute magnitude and mean effective surface brightness become fainter as the filter gets bluer. However, the dispersion is too big to simply correct for the offset to bring the BCGs observed in too blue filters to the appropriate red ones. This could be due to the fact that the filters on the blue side of the break do not always fall on the same spectral region on the SED (as the SED varies with redshift, and not all clusters were observed with the same filters). The effective radii can have their sizes halved when observed with too blue filters. As for the Sérsic indices, we find that the BCGs that need a second component tend to have Sérsic indices in bluer filters consistent with those measured in the appropriate red filters. The BCGs which could be fitted with only one Sérsic have indices that vary without any clear pattern. These observations show that we cannot directly consider together the measurements obtained looking at different parts of the SED. Therefore, we chose to exclude the BCGs observed in too blue filters in what follows.
We do however find that the position angles (PA) of the BCGs are not affected and remain consistent regardless of the filter chosen (see Figure 11). The PAs of these BCGs will thus be kept. Only one point presents a big difference between the two values (PA red -PA blue > 120 degrees). We found that the ICL associated with this BCG is more extended in the reddest filter (Ellien et al. 2019, also show that the ICL tends to be more extended in redder filters). The other BCGs with a significant difference between the values measured in the two different filters are circular in shape (b/a > 0.80), so the PAs are ill-defined, which also explains the huge error bars. Fig. 11. Difference between the PA measured on the appropriate red filter (x-axis), and the one measured in a too blue filter (y-axis). In red are BCGs with elongations b/a ≤ 0.80.

Results
As explained above, we tried fitting the BCGs with one or two Sérsic profiles. In the following, the values plotted are those from the best model determined using the F-test (see previous section). The resulting parameters are summarized in Table 2 and  Table 3.
Two BCGs were not properly fitted by either model, but were correctly fitted by fixing the Sérsic index n = 4 (corresponding to a de Vaucouleurs profile). We thus kept the parameters obtained with this fit. Two other BCGs were not correctly fitted by either model, and are thus excluded, bringing our sample size to 147 BCGs.
We summarize the total numbers of galaxies that were better fit with each model: In all the plots shown in this paper, the BCGs better fitted with two Sérsic components will be represented with triangles, and those fitted with only one component with diamonds. Before drawing conclusions, we need to know if we can consider together the results from BCGs better fit with one and with two Sérsic components. In principle, the two subsamples can be put together if, for the two components, we assume that the outer component contains most of the light of the BCG and that the outer profile represents well enough the overall luminosity of the galaxy. The more important the contribution of the inner component to the total luminosity of the BCG is, the less accurate this statement will be. If an inner component is required to model the BCG, then the resulting outer profile obtained when fitting two components may not be comparable to a profile obtained with only one component.
We show the histogram of the ratio of the inner component to total fluxes for the 70 BCGs requiring two Sérsic components (see Figure 12), and find that 24 BCGs present a very important inner component, which can contribute up to 30% of the total luminosity of the galaxy. If we choose to ignore these 24 BCGs, no obvious difference can be seen in the overall relations observed in the following. However, we prefer to exclude them, as the outer profile may not be comparable to the profile obtained with a single component modeling most of the light of the galaxy.
After excluding the galaxies with a very bright inner component and those observed in too blue filters, we obtain the final numbers: -Sérsic (1 component): 40 BCGs -Sérsic + Sérsic (2 components): 46 BCGs In the following plots, BCGs observed in too blue filters will be marked with blue squares, and those fit with two Sérsic profiles and with an important inner component will be marked with light pink squares. We will also identify the blue SF BCGs (cf. Section 3) with red squares and pairs of BCGs with black triangles.

Evolution with redshift
In order to study the evolution of the BCGs, we consider the dependence of the derived parameters as a function of redshift.
The absolute magnitudes of the BCGs, computed from the total apparent magnitudes (see Table 1 for the filters considered) calculated by GALFIT, despite the very big dispersion (4 magnitudes thoughout redshift) tend to become brighter with redshift (see Figure 13, left). The trend is faint, and can be quantified with a coefficient correlation R = −0.29 and a p-value p = 0.006563 (calculated from the coefficient R and the number of data points N 6 ). By taking a significance level of α = 0.05, we show that we can reject the null hypothesis (p < α) and conclude that the trend is significant.
There is a moderate trend for BCGs to grow with time (Figure 13, right), as those with the smallest effective radii are at higher redshifts (z ≥ 1.2). The trend in logarithmic scale is quantified by a correlation coefficient R = −0.40, and with a p-value of p = 0.000142. BCGs observed in too blue filters generally have smaller effective radii than the others at a given redshift. Those with an important inner component contribution do not appear to occupy a special place in these relations.
The mean effective surface brightness ( Figure 14) shows no significant evolution as a function of redshift (R << 0.1), with a very large dispersion (it spans a range of 6 magnitudes at z ≥ 1.25). Seven BCGs at higher redshifts (z ≥ 1.4) can be seen among the galaxies with the brightest mean effective surface brightnesses (< µ >≤ 22 mag arcsec −2 ). We confirm that nothing peculiar was observed with these BCGs. Those observed in too blue filters and those with an important inner component contribution do not occupy a specific place in the relation.
The vertical gradient in color in Figure 14 shows that the large dispersion is also linked to the effective radius. As we go towards the biggest BCGs (increasing effective radii), the relation is shifted towards the fainter mean effective surface brightnesses. This is to be linked with the Kormendy relation, which will be shown in Section 5.2.
Finally, there is no correlation between the Sérsic index and redshift (Figure 15, left). However, on the right panel, we notice two different populations: the BCGs that were modeled with only one component generally have high Sérsic indices with a strong peak at n 1comp,mean = 4 (without considering the BCGs observed in too blue filters), while the BCGs that were better modeled with two Sérsic components with lower Sérsic indices show a peak at n 2comp,mean = 2.0.

Kormendy relation
The Kormendy relation (Kormendy 1977) links the (mean) effective surface brightness of elliptical galaxies to their effective radius. This relation is plotted in Figure 16. The different colors represent different redshift bins: z ≤ 0.3, 0.3 < z ≤ 0.5, 0.5 < z ≤ 0.9, 0.9 < z ≤ 1.3, and z > 1.3. We show that all the BCGs seem to follow the Kormendy relation with the same slope, but 6 https://www.socscistatistics.com/pvalues/pearsondistribution.aspx the ordinate at the origin of the line decreases with increasing redshift.
While applying a linear regression to the relation obtained in each redshift bin (R > 0.80, p < α), we find that the slope remains quite constant at all redshift bins: m = 3.33 ± 0.73, whereas the ordinate at the origin varies as c = 2.15*z + 16.65.

Inner component
The sample requiring an inner component consists of 46 BCGs. We find that the inner component follows a Kormendy relation (Figure 17), and is a continuation of the Kormendy relation shown in Figure 16 at brighter mean effective surface brightness and smaller effective radius (R e,inn ≤ 20 kpc).
We observe a very faint trend for the inner components to have brighter surface brightnesses with decreasing redshift, but the trend is not significant (R = 0.27, p = 0.06237 > α). We don't find any clear correlation (correlation coefficient ≤ 0.2) between redshift and the absolute magnitude, effective radius or Sérsic indices of the inner component of the BCGs.

Alignment of the BCG with its host cluster
Some studies have shown that BCGs tend to have a similar orientation (hereafter PA) to that of their host cluster Durret et al. 2019). As a comparison, we reproduce this study and compare our results with those of these two papers. The PA of the host clusters are taken from West et al. (2017) (computed from the moments of inertia of the distribution of red sequence galaxies) and Durret et al. (2019) (computed from density maps of red sequence galaxies), and the PAs of the BCGs are measured here with GALFIT. If measures are given in both papers, the PA cluster in Durret et al. (2019) is taken, unless the PA cluster is ill-defined (when the clusters are circular in shape), in which case the PA from West et al. (2017) is taken. We did not measure the PA of the host clusters for the clusters that are not presented in the above quoted papers, as our images are not large enough to accurately measure the full extent and shape of the cluster.
We include all BCGs for which the PA of the cluster was measured (73 BCGs), including those observed with too blue filters, as the PA measured by GALFIT is the same regardless of the filter (see Figure 11). We show the histogram of the alignment between the BCGs and their host clusters (defined as the difference of PA between that of the cluster and the BCG) in Figure 18.
We find that 39 BCGs (53% of the BCGs) are aligned with their host cluster with a difference smaller than 30 degrees. This already shows a tendency for BCGs to align with their host clusters, as a random orientation of the BCGs would result in a flat distribution. BCGs with the highest PA difference tend to be circular in shape (elongation = b/a ≈ 1, for which it is more difficult to measure a PA, resulting in high uncertainties). We thus chose to exclude all BCGs with axis ratio b/a ≥ 0.8, in order to eliminate BCGs with ill-defined PAs, as shown in red on the histogram. We then find that 32 out of 58 of BCGs are aligned with their host cluster within 30 degrees, slightly rising the percentage to 55%. There is a secondary peak between a PA difference of 30 to 40 degrees, mainly corresponding to BCGs at redshift z ≥ 0.9. At such high redshifts, galaxies appear smaller and therefore, the accuracy of the measured PA is probably worse. If we only consider galaxies at z ≤ 0.9 (blue histogram), we find that 22 BCGs out of 30 (73%) align with their cluster within less than       Figure 16, but for the inner component of BCGs fitted with two components. Grey points are the same as in Figure 16. Fig. 18. Difference between the PA of the cluster (see West et al. 2017;Durret et al. 2019) and that of the BCG as returned by GALFIT. Only clusters found in West et al. (2017) and Durret et al. (2019) are included here. The histogram is to be compared to that shown in West et al. (2017), as a way to check that our results agree with theirs. All BCGs are represented on the grey diagram, while only those with an axis ratio b/a ≤ 0.8 are included in the red diagram. We also exclude all BCGs at redshift z > 0.9 on the blue histogram. 30 degrees. This shows that in majority BCGs tend to align with their host cluster at least at z ≤ 0.9.

BCG physical properties as a function of host cluster properties
We browsed the available bibliography to retrieve the cluster masses and X-ray center coordinates. The corresponding data can be found in Table 4. We prefer lensing based mass estimates if available. We bring all the masses to M 200 , applying the conversion factor between M 500 and M 200 : M 500 = 0.72 M 200 (Pierpaoli et al. 2003).
We show the richness of the cluster as a function of redshift and cluster mass in Figure 19. The richness N of the cluster is defined here as the number of red sequence galaxies (found in Section 3) in an aperture of 500 kpc radius around the BCG. We obtain different values of N for two different BCGs in the same cluster because the richness is computed in an aperture centered on each BCG.
As can be seen on Figure 19 (left panel), clusters seem to become richer with decreasing redshift (correlation coefficient in logarithmic scale R = −0.70 and p-value of p < 10 −5 ). Clusters at higher redshifts (z ≥ 1.0) have a lower richness, with a number of detected red sequence galaxies N ≤ 60. The right panel also shows that the most massive clusters are also the richest, and the high redshift clusters (blue points on the plot) with a low richness are also the less massive (M 200,c ≤ 5 × 10 14 M ). This low value of N could in principle be due to the depth of our images, as we have a bias due to the distance: at higher redshifts, it is more difficult to detect objects and only the brightest ones can show up.
However, when looking at Figure 20, the left panel shows that we do not observe very massive clusters at high redshifts. So we have no bias due to the distance of the galaxies when measuring cluster masses: the masses are measured via lensing or derived from X-ray or SZ maps, which are independent of distance. Thus, we conclude that clusters become richer with time, and this result is not due to the depth of our images. However, although we only observe very massive clusters at lower redshifts (M 200,c ≥ 30 × 10 14 M at z ≤ 0.8), the masses of the clusters do not vary much with time (R < 0.20). The right panel shows that  We find no correlations (R ≤ 0.2) between the BCG surface brightnesses or Sérsic indices and the cluster masses.
Using the relation given in Bai et al. (2014), we compute an estimate of the BCG masses from the cluster masses: M * BCG ∝ M 0.6 cluster . We find that the most massive BCGs are also the biggest in size (moderate correlation with R = 0.46 and p = 0.00007) and also tend to be brighter (R = −0.32, p = 007353). No correlation between the BCGs masses and redshift is seen (R < 0.20).
We also study how the BCGs behave depending on their offsets to the cluster X-ray center. We exclude superclusters and clusters which present several substructures and/or several BCGs. We show the histogram of the offets on Figure 21, top left panel. We find that 31 out of 61 (51%) are within a 30 kpc radius range from the X-ray center of the cluster, showing that BCGs tend to lie close to the cluster X-ray centers. The two star forming BCGs that have undergone recent mergers and are not at equilibrium are also located at the center of the cluster (D X ≤ 10 kpc). But, we confirm that there can be a significant offset between the two (12 out of 61 BCGs, 20%, present an offset bigger than 100 kpc). Although the corresponding plots are not shown here, we find no correlation between the offset and the absolute magnitude, effective radius, Sérsic index of the BCGs, or with the alignment previously computed.
As can be seen on the top right and bottom left panels of Figure 21, however, the more massive and the richer the cluster (or the BCG, as we converted the cluster masses to BCG masses), the closer the BCG is to the X-ray center of the cluster: the objects with the biggest offets ( ≥ 100 kpc) have masses M cluster ≤ 10 × 10 14 M and richnesses N ≤ 100. We also find that there is a moderate correlation between the offset and the mean effective surface brightness of the BCG (see Figure 21, bottom right): BCGs tend to have brighter mean effective surface brightnesses the closer they are to the X-ray center (in logarithmic scale, R = 0.34, p = 0.0395).
We also analysed whether the most luminous BCGs are special (see Lin et al. 2010;Lauer et al. 2014). We studied the distribution of the difference in magnitude between the BCGs and the second ranked galaxies of the clusters. We found that the distribution was continue, with most BCGs having a difference smaller than one magnitude with the second ranked galaxy. By selecting BCGs which are at least brighter than 1 magnitude than the second ranked galaxy of the cluster (9 BCGs), we find that A&A proofs: manuscript no. main the most luminous BCGs do not occupy a specific place in the observed relations.

Discussion and conclusions
Our work deals with the largest sample (to our knowledge) of BCGs with HST imaging, covering a broad redshift interval from z = 0.1 to z = 1.8 (see Figure 1), and thus enabling us to trace the evolution of BCGs through time. Our sample is larger than most studies found in the literature based on HST images, such as Bai et al. (2014) We also study the luminosity profiles of these galaxies and how they evolve as a function of redshift. HST images allow us to perform profile fitting with precision, and GALFIT returns accurate parameters from model fitting.
We developed a new tool to detect automatically red BCGs on optical images. We successfully detected all the red BCGs regardless of their peculiar characteristics (see Section 3). We did not manage to detect in a similar way the blue BCGs, which represent here only 2% of our sample.
We then proceeded to model the luminosity profiles of these automatically detected BCGs, as well as those which have only one filter available, bringing this sample to 149 BCGs. We removed all BCGs observed in too blue filters as well as BCGs better modeled with two components for which the inner com-ponent has an important contribution to the total luminosity of the galaxy. Our final sample consists then of 86 BCGs.
We studied how the photometric properties of BCGs correlate with redshift, and we find that, although the correlation is weak -but significant, the absolute magnitude presents a faint trend to become brighter with time.
We show that there is a faint trend (see Figure 13, right) for BCGs to become bigger with decreasing redshift. This is the behaviour we expect for galaxies that grow in size with time, by accreting gas and merging with other smaller galaxies. This trend was also observed in Durret et al. (2019) up to redshift 0.9, and can be confirmed up to redshift z=1.8. Based on this relation, we find that BCGs grow by more than a factor 3 between redshifts 1.8 and 0.1. The dispersion can be linked with the Kormendy relation, that indicates that galaxies with higher surface brightnesses have smaller effective radii.
We find no strong correlation between the other photometric properties (surface brightness or Sérsic index) of the BCGs and redshift. This is in agreement with Bai et al. (2014) who do not find any correlation between the magnitude or the mean surface brightness of the BCGs and redshift, up to redshift z = 0.9. We add that no evolution can be observed up to redshift z = 1.8 either.
Although we do only observe massive clusters at lower redshifts (M 200,c ≥ 30 × 10 14 M at z ≤ 0.8), overall, the masses of the clusters do not correlate with redshift. The growth of the cluster is mainly to be linked with the cluster richness ( Figure 19): clusters become richer with time, and we find that the number of red sequence galaxies in an aperture of 500 kpc centered on the BCG increases by almost a factor 10 between z = 1.8 and z = 0.1. We confirmed that the low richness we measured at higher redshift is not due to the depth of our images. This growth mainly seems to be happening at z ≤ 1.0, as we do not observe a significant variation of the richness of the clusters before that time.
We use the relation found in Bai et al. (2014) to compute the BCG masses from the cluster masses, based on the relation found by Bai et al. (2014). We find that bigger BCGs are also more massive (see Figure 19): R e ∝ 4.42 × M BCG , but the masses do not show a significant growth with redshift.
We thus find that the sizes of the BCGs grow faster than their masses in the same redshift range. Although we do not find that the masses of the BCGs grow significantly with time, whereas Bai et al. (2014) finds a factor 2 since z = 2, we agree that the sizes have grown significantly faster than the masses in the same redshift range. Bai et al. (2014) find that the sizes grow more than twice as fast as the masses. We confirm that the masses and sizes of BCGs do not grow at the same rate. This is in favor of a scenario in which BCGs grow thanks to minor dry mergers at the later stages of their formation and evolution. Indeed, a growth mainly due to major dry mergers would make the sizes and masses grow at the same rate.
To summarize, we can say that the sizes of the BCGs, as well as the richnesses of the clusters, evolve with redshift: clusters become richer with time and, at the same time, BCGs undergo dry mergers that increase their sizes.
Another interesting result is the distribution of Sérsic indices (see Figure 15) that shows two different populations with low Sérsic indices, mainly at low redshift (z ≤ 0.8) and high Sérsic indices. The limit is also to be linked to the fact that BCGs at lower redshifts often require a second component to correctly take into account the brighter core of the galaxy. We find that BCGs better modeled with two components have a peak Sérsic index n = 2, while those that were fit with a single component have a peak at n = 4. Those modeled with only one Sérsic component are thus comparable to pure elliptical galaxies which can be well modeled with a deVaucouleurs profile. This slightly differs from the results shown in Bai et al. (2014) who find a median value of n = 5.7. But, Bai et al. (2014) only fit a single Sérsic profile to all the BCGs in their sample. If we only look at the distribution we obtained for BCGs modeled with a single component, we find that this distribution is more comparable to that of ETGs shown in their paper. Another difference with that study is related to the filters chosen to model the luminosity profiles of the BCGs. While we consider the same spectral region of the SED for all clusters in order to only look at the same old red stellar population at all redshifts, Bai et al. (2014) observe all BCGs with the ACS F814W filter, which we find is already too blue for clusters at redshifts z ≥ 0.57. We also showed, by studying the parameters obtained in two different filters for a sample of BCGs, that the parameters will vary depending on the part of the SED you look at (when looking at a bluer filter, the absolute magnitude and mean effective surface brightness become fainter, the effective radius becomes smaller, and the Sérsic indices vary without any clear trend).
Finally, we find that the Kormendy relation (Kormendy 1977) is also a function of redshift, with the relation shifted towards fainter mean effective surface brightnesses at higher redshifts. This relation shows that, at the effective radius, smaller galaxies are brighter and denser than the bigger ones. The slope of 3.33 ± 0.73 measured with our sample remains constant with redshift. Our value is in good agreement with that given in Bai et al. (2014): <µ>=(3.50± 0.18)logR e +(18.01± 0.23) and agrees within one σ with the one given in Durret et al. (2019): <µ>=(2.64± 0.35)logR e +(19.7± 0.5). We should note that cosmology or selection effects might be contributing to the results in Figure 13, which shows a trend for BCG sizes and luminosities to increase with time. Despite its faintness, the contribution of the ICL should be taken into account. The ICL blends with the envelope of the BCG, making it difficult to differentiate the galaxy from the ICL, and this may affect our measurements (in particular those of the effective radii and Sérsic indices). The ICL might contribute at some level to measured sizes and luminosities of galaxies at low redshifts, yet might be missed in high-redshift clusters because of cosmological surface brightness dimming, or perhaps because the ICL has not yet developed in these young clusters. A concern comes from the value of the background, as GALFIT is sensitive to it, but its computation is limited by the sizes of the images (even without cropping).
We broaden the work by West et al. (2017) and Durret et al. (2019) on the alignment of BCGs with their host clusters. We removed BCGs with ill measured position angles due to their circular shape, as well as BCGs at higher redshifts, z ≥ 0.9, as they will appear smaller on the CCD, and will thus be less resolved and have less accurate measured PAs. This enables us to conclude that BCGs in majority tend to align with their host cluster at least at z ≤ 0.9, as after this selection 73% of the remaining BCGs are aligned with their host cluster within 30 degrees. This is a tighter alignment than that of Durret et al. (2019), who found an alignment for 12 out of 21 BCGs (57%) between redshifts 0.21 and 0.89. Okabe et al. (2020) study the alignment of 45 dark matter (DM) haloes and their BCGs up to z = 0.97, and find that BCGs tend to be well aligned with their DM haloes, with a mean difference of 22.2 ± 3.9 degrees. A similar study was done by Ragone-Figueroa et al. (2020) on the alignment of BCGs both with the distribution of cluster galaxies and DM haloes, by analyzing cosmological hydro-simulations of 24 clusters. They find a strong alignment at z < 2, and add that relaxed clusters tend to host BCGs that align with their major axis. Similar conclusions are made by De Propris et al. (2020), who show that BCGs are generally aligned with their host cluster even when the offset between the BCG and the X-ray center is significant.
Cerulo et al. (2019) found that 9% the BCGs between 0.05 ≤ z ≤ 0.35 have colors bluer than 2σ of the median color of the cluster red sequence. During this study, we found two peculiar blue BCGs in our sample. Apart from their colors and complex structures, these two peculiar BCGs do not have photometric properties different from the other BCGs. It would be interesting to continue this study by considering a larger sample of SF blue BCGs, to see where they lie in the previous plots.
We plan to apply the method described in this paper to more than a thousand clusters from the CFHTLS, detected by Sarron et al. (2018), up to redshift z = 0.7. This will enable us to better evaluate the accuracy of our BCG detection method on groundbased-data, and although the resolution will not be as good, the sample will be significantly larger. We also found two BCGs (2%) with blue colors, and it would be interesting to estimate the fraction of blue BCGs in the Universe up to redshift z = 0.7. We can wonder if these BCGs evolve differently from the red BCGs that we detect.
A&A proofs: manuscript no. main Table 1. Sample of the 149 BCGs studied in this paper. The columns are: full cluster name, coordinates of the BCG, redshift, class of the BCG (if two BCGs are defined for a cluster, class 1 represents the brightest of the two), instrument, filter used to model the luminosity profile of the BCG (see Section 4), associated scale, color computed to extract the red sequence of the cluster (see Section 3). The BCGs with no values in the last column only had data in one filter, and their coordinates were taken from the literature.  Table 2. Parameters obtained from fitting the luminosity profiles of the BCGs with GALFIT. Only the parameters obtained for the chosen model are shown. If fitted by two Sérsic profiles, the parameters of the outer component are given (the parameters for the inner component are then given in Table 3). The columns are: full cluster name, class of the galaxy, best model (Sérsic being a model with a single component, Sérsic2 a model with two components, and Sérsic* by fixing the Sérsic index n = 4), absolute magnitude, mean effective surface brightness, effective radius, Sérsic index, elongation (ratio of the major to minor axis), position angle, alignment of the BCG with its host cluster.    Table 3. Parameters obtained for the inner component, for BCGs fitted with two Sérsic profiles. The columns are: full cluster name, class of the galaxy, absolute magnitude, mean effective surface brightness, effective radius, Sérsic index, elongation (ratio of the major to minor axis), position angle.