Open Access
Issue
A&A
Volume 697, May 2025
Article Number A197
Number of page(s) 17
Section Cosmology (including clusters of galaxies)
DOI https://doi.org/10.1051/0004-6361/202553759
Published online 19 May 2025

© The Authors 2025

Open Access article, published by EDP Sciences, under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

This article is published in open access under the Subscribe to Open model.

1. Introduction

Galaxies are distributed in a complex structure known as the cosmic web, where the densest regions are connected by filaments and walls (e.g., Bond et al. 1996). Groups of galaxies, which reside in this large-scale environment, host the bulk of galaxies and play a crucial role in the understanding of their evolution, but also of the formation and evolution of structures like massive galaxy clusters themselves (e.g., Tully 1987; Eke et al. 2004; Lovisari et al. 2021) and their cosmological impact (e.g., see the review by Allen et al. 2011). Galaxy groups represent the transition range in halo masses from massive galaxies to galaxy clusters and show different physical properties and evolutions with respect to galaxy clusters (e.g., Giodini et al. 2009; McGee et al. 2009; McCarthy et al. 2010; Gaspari et al. 2011). Although there is no universally defined threshold to separate groups and clusters, they are generally distinguished in the literature by using limits, for instance, in mass at ∼10^14 M⊙ or in the number of galaxy members at around 50 (e.g., Paul et al. 2017; Lovisari et al. 2021). Galaxies hosted in groups are believed to evolve in a different way with respect to galaxies in the field, undergoing substantial alteration, for instance, in star formation rate (SFR) (e.g., Scoville et al. 2013; Darvish et al. 2014, 2016; Taamoli et al. 2024), morphology (e.g., Mandelbaum et al. 2006; Capak et al. 2007a; Bamford et al. 2009), and gas and metal content (e.g., Catinella et al. 2013; Chartab et al. 2021). This occurs via interactions (Hausman & Ostriker 1978), ram-pressure stripping (Gunn & Gott 1972) and a variety of other mechanisms (see e.g., Boselli & Gavazzi 2006, for a review). Although some environmental effects are observed to be more efficient in dense massive clusters, some phenomena strongly affect galaxies in lower-density environments such as groups (e.g., Bianconi et al. 2018; Lietzen et al. 2012; Vulcani et al. 2018), and even filaments (e.g., Laigle et al. 2018).

Environmental effects in galaxy groups have been widely explored at z ≲ 1.5 (e.g., Wilman et al. 2005; George et al. 2012; Salerno et al. 2019; Balogh et al. 2021; McNab et al. 2021; Reeves et al. 2021; Baxter et al. 2023; Kukstas et al. 2023; Gozaliasl et al. 2024). However, the understanding of galaxy groups and galaxy evolution in groups becomes more complicated and unexplored at z ≳ 1.5 where the classical morphology-density relation observed in the local universe (Dressler 1980) seems to fade and group and cluster galaxies exhibit properties more consistent with those of the field galaxies, in terms, for example, of SFRs and morphologies (e.g., Brodwin et al. 2013; Alberts et al. 2016). In addition to this, the interval z = 1.5 − 2.0 marks the transition between virialized objects observed locally (z < 1.5) and the maturing phase of protoclusters (2.0 < z < 3.5), the progenitors of galaxy clusters (e.g., Shimakawa et al. 2018). The study of high-redshift galaxy groups and protostructures is key to understanding the evolution and star formation history of galaxies, their connection to the dark matter distribution, and the impact of physical processes such as AGN feedback (e.g., Tanaka et al. 2012; Zhang et al. 2020; Kiyota et al. 2025).

In the COSMOS field, several group catalogs have been produced over the years, like the X-ray-selected samples by Finoguenov et al. (2007) and George et al. (2011), which used the XMM-Newton data (Hasinger et al. 2007) to robustly identify galaxy groups up to z ∼ 1.0. In a more recent effort, Gozaliasl et al. (2019) revised the X-ray catalogs by incorporating all available X-ray observations from Chandra and XMM-Newton in the 0.5−2 keV band and using photometric and spectroscopic data for the optical counterparts (e.g., Ilbert et al. 2009; McCracken et al. 2012; Laigle et al. 2016). Several other works have leveraged the availability of spectroscopic redshifts (Lilly et al. 2007, 2009) to detect groups up to z ∼ 1 (e.g., Knobel et al. 2012) and proto-groups up to z ∼ 3 (e.g., Diener et al. 2013).

In this work, we made use of the deepest contiguous 0.54 deg² galaxy catalog available to build the largest deep galaxy group catalog to date, extending from z = 0 up to z = 3.7. We used the COSMOS-Web photometric catalog of galaxies (Shuntov et al., in prep.), which is the result of the largest contiguous imaging of the sky performed with the James Webb Space Telescope (JWST). The COSMOS-Web survey (Casey et al. 2023) is a unique combination of the unprecedented depth and spatial resolution of JWST and the coverage and data availability of the COSMOS field (Scoville et al. 2007), making it a key resource for the study and definition of the large-scale structures in the cosmic web over around 50 million Mpc³ (Casey et al. 2022). The JWST NIRCam near-infrared (NIR) photometric coverage, combined with more than 30 photometric bands available for the COSMOS field (from ultraviolet to infrared), allows for high-quality photometric redshift (photo-z, hereinafter) estimation, with precision below ∼0.03 even at the faintest magnitudes in the redshift range we are interested in (Arango-Toro et al. 2025; Shuntov et al. 2025). This enables the creation of robust samples up to the protocluster regime (z ≳ 2) and a detailed study of structures potentially up to z = 7 and beyond (e.g., Morishita et al. 2023, 2025).

In order to create such a deep galaxy group catalog we made use of the Adaptive Matched Identifier of Clustered Objects (AMICO; Bellagamba et al. 2018; Maturi et al. 2019), an algorithm based on a linear optimal matched filter (Maturi et al. 2005; Bellagamba et al. 2011) which extracts the signal by maximizing the signal-to-noise ratio (S/N), without any explicit color selection of galaxies. Without requiring spectroscopic information or galaxy colors, AMICO is able to detect clusters and groups up to high redshift and down to low masses (e.g., up to z = 2 and down to less than 10^13 M⊙, as shown by Toni et al. 2024). In its simplest application, AMICO detection is based on the spatial and luminosity distribution of galaxies in clusters without using color information, which limits the possibility of biases related to the presence (or absence) of the cluster red sequence, particularly important when moving to high redshifts. The algorithm is one of the two cluster finders selected for the ESA’s Euclid mission (Laureijs et al. 2011; Mellier et al. 2018), given its performance in terms of completeness and purity when tested on Euclid-like mock catalogs (Euclid Collaboration: Adam et al. 2019). AMICO has already been validated and successfully applied to several surveys such as the Kilo Degree Survey (KiDS; Maturi et al. 2019) and the miniJPAS (Maturi et al. 2023).

More recently, Toni et al. (2024) utilized the AMICO algorithm to produce a cluster and group catalog containing 1269 candidates in total in the range 0.1 < z < 2.0. This catalog was the result of three independent AMICO runs, each using the magnitude in a different photometric band as the reference galaxy property and to compute the luminosity function of clusters. A total of 490 candidates were consistently detected in all three runs. A comparative analysis of the three runs suggested a possible cut in signal-to-noise to define a more robust sample. X-ray properties were assigned to these detections by cross-matching with the group sample by Gozaliasl et al. (2019). For unmatched detections, the X-ray properties were estimated using the same Chandra and XMM-Newton data. The final catalog includes 622 candidate clusters and groups with optical properties, X-ray flux estimates, and estimates of mass. This large sample of candidates with assigned X-ray properties enabled the calibration of scaling relations between two AMICO mass proxies (i.e., richness and amplitude) and X-ray mass, permitting the study of their redshift dependence and the selection of the most stable photometric bands.

In the work we present in this paper, we leveraged the insights and experience gained from the successful AMICO-COSMOS group search described in Toni et al. (2024) combined with the high-accuracy, deep photometric redshifts provided by the COSMOS-Web survey to present the largest deep galaxy group sample detected to date, produced by applying the AMICO algorithm to the COSMOS-Web data. Figure 1 shows the impressive resolution of the JWST color-composite image overlaid with the X-ray extended emission (Gozaliasl et al. 2019), for one of the richest and most massive groups in the COSMOS field.

Fig. 1.

JWST rgb (F444W as r, [F150W, F277W] as g, F115W as b) color-composite image of the most massive group in the COSMOS-Web field. The JWST image is overlaid with the X-ray extended emission (pink) from the combined XMM-Newton and Chandra 0.5−2 keV wavelet-filtered image. (Credits: ESA/Webb, NASA & CSA, G. Gozaliasl, K. Virolainen, A. Koekemoer, M. Franco.)

This paper is structured as follows. Section 2 introduces the COSMOS-Web data, the galaxy catalog, and the selection criteria applied to create a robust input dataset for the group search. Section 3 outlines the main steps of the detection process and the core principles of the AMICO algorithm. In Sect. 4, we present the results of the group search, first by introducing the group catalog and then by discussing the creation of realistic mocks and the evaluation of the purity of the sample. In Sect. 5, we examine detections at z > 2, compare them to known objects, and analyze their clustering properties. Finally, Sect. 6 summarizes our key findings, their implications, and the potential future research based on this work.

Throughout this paper, we use the term “group” to refer to our candidate detections, given that only a few objects in the analyzed field are expected to have masses larger than 10^14 M⊙ or more than 50 members according to previous detections performed in the COSMOS field (e.g., Knobel et al. 2012; Gozaliasl et al. 2019; Toni et al. 2024). We assume a standard concordance flat ΛCDM cosmology with Ωm = 0.3, ΩΛ = 0.7, and h = H0/(100 km/s/Mpc) = 0.7.

2. Data

2.1. Photometric catalog

The COSMOS-Web Survey (PIs: J. Kartaltepe and C. Casey; Casey et al. 2023) is a 255-hour Cycle 1 observation program conducted using JWST. The survey spans 0.54 deg² and utilizes four NIRCam filters (Rieke et al. 2023), achieving a 5σ point-source depth between 27.5 and 28.2 mag. Observations were carried out using the F115W, F150W, F277W, and F444W filters (Casey et al. 2023) and for a non-contiguous 0.19 deg² area imaged in the F770W MIRI filter (Wright et al. 2022), with a 5σ point-source depth between 25.3 and 26.0 mag.

The COSMOS field (Scoville et al. 2007; Capak et al. 2007b) benefits from a rich legacy of multi-wavelength data, from X-rays to radio (e.g., Civano et al. 2016; Hasinger et al. 2007; Smolčić et al. 2017). Optical data include the u-band observations from CFHT’s MegaCam (Sawicki et al. 2019), the high-resolution data from the Hubble Space Telescope (ACS-HST) in the F814W band (Koekemoer et al. 2007), the Hyper-Suprime-Cam (HSC) imaging in the g, r, i, z, and y bands, in addition to the 13 intermediate and narrow bands of Subaru Suprime-Cam (SC) (Taniguchi et al. 2015). The UltraVISTA survey (McCracken et al. 2012; Moneti et al. 2023) completes the coverage at NIR-wavelengths with the Y, J, H, and Ks bands, which are complementary to JWST’s NIRCam and MIRI.

The new COSMOS-Web photometric catalog has been developed, targeting specifically the portion of the COSMOS field observed by JWST, succeeding previous COSMOS galaxy catalogs (e.g., Ilbert et al. 2013; Laigle et al. 2016; Weaver et al. 2022). More than 784 000 sources have been detected over the 0.54 deg² area, using a PSF-homogenized χ² detection image, which combines all NIRCam filters. Source extraction is challenging due to the variation in PSF sizes across space- and ground-based facilities, ranging from 0.05 to 1.0 arcseconds. To address this, SourceXtractor++ (Bertin et al. 2022) has been used to model Sérsic surface brightness profiles (Sérsic 1963) at NIRCam resolution, followed by photometric extraction in each band. The resolution of JWST images enables the separation of previously blended sources (Arango-Toro et al. 2025). A more detailed description of the COSMOS-Web photometric catalog used in this study is provided by Shuntov et al. (in prep.) and Arango-Toro et al. (2025).

2.2. Photometric redshifts

Accurate photometric redshift estimation is crucial for reliably detecting galaxy groups, as the precision of these redshifts directly affects the identification and characterization of such structures. Photometric redshifts in the COSMOS-Web source catalog are computed with the template-fitting code LePhare (Arnouts et al. 2002; Ilbert et al. 2006), with an expanded parameter space for the template library allowed by the depth and coverage offered by this field. The template library consists of 12 templates based on stellar population synthesis models (Bruzual & Charlot 2003), with 42 different ages and including various star formation histories (SFHs) and metallicities (Z = 0.008 Z⊙, 0.02 Z⊙) as outlined by Ilbert et al. (2015).

In order to assess photometric redshift accuracy, the COSMOS-Web redshifts have been compared to more than 11 000 spectroscopic redshifts (spec-zs, hereinafter), with confidence level > 97%, from the compilation created by Khostovan et al. (2025) (e.g., Lilly et al. 2007, 2009; Kartaltepe et al. 2010, 2015; Silverman et al. 2015; Kashino et al. 2019). The photometric redshift precision is around 0.01 for mF444W < 24, with a 2% rate of catastrophic failures. Even for the faintest galaxies at 26 < mF444W < 28 and for high redshifts in the interval we are interested in (z < 4), the precision remains better than 0.03 with ∼10% failure (Arango-Toro et al. 2025; Shuntov et al. 2025).
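As an illustration of how such a comparison is typically quantified, the sketch below computes a robust scatter and an outlier fraction from matched photo-z and spec-z pairs. It is not the procedure used for the COSMOS-Web catalog: the normalized median absolute deviation and the 0.15 outlier cut are common conventions assumed here, and the function name is hypothetical.

```python
import numpy as np

def photoz_metrics(z_phot, z_spec, outlier_cut=0.15):
    """Return (sigma_NMAD, outlier fraction) for matched photo-z / spec-z pairs.

    Both the NMAD estimator and the default outlier cut are assumed
    conventions, not values taken from the text.
    """
    z_phot = np.asarray(z_phot, dtype=float)
    z_spec = np.asarray(z_spec, dtype=float)
    dz = (z_phot - z_spec) / (1.0 + z_spec)
    # Normalized median absolute deviation: a scatter estimate robust to outliers
    sigma_nmad = 1.48 * np.median(np.abs(dz - np.median(dz)))
    # Fraction of sources beyond the assumed catastrophic-failure cut
    outlier_frac = float(np.mean(np.abs(dz) > outlier_cut))
    return float(sigma_nmad), outlier_frac
```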

2.3. Data cleaning and visibility mask

We cleaned the data set by keeping only galaxies (including active galaxies) based on LePhare classification. We removed masked objects limiting the selection to both FLAG_STAR_JWST = 0 and FLAG_STAR_HSC = 0 to ensure not only the availability and reliability of JWST photometry but also of the external ground-based photometry which is used to estimate photometric redshifts. Removing unsafe regions for ground-based data implies a loss of area of around 10% compared to the area selected only considering JWST photometry quality. We decided to sacrifice this area in exchange for the improvement in photo-z uncertainties and in physical properties resulting from SED fitting that including good-quality ground-based photometry ensures (see Shuntov et al. 2025, and in prep., for further details). A cleaned and high-quality sample is crucial for the study of the clustering of galaxies in three dimensions, while on the contrary, the inclusion of badly characterized galaxies with uncertain and inaccurate photo-zs can contaminate the sample with spurious detections, for instance, in correspondence with artifacts. For this reason, we additionally kept only the galaxies with the best photo-z quality flag, LP_warn_fl = 0, to reject artifacts like snowballs and sources with inconsistent photometry between different bands. We then cut the catalog at the mode of the magnitude distribution in the F150W band, which we chose as the reference band (see Sect. 3). This minimizes the introduction of noise and spurious detections in the catalog and defines the depth of the galaxy catalog, that is mF150W = 27.3. As a reference, this catalog is almost 2 magnitudes deeper than the one we used in Toni et al. (2024) and more than 3 magnitudes deeper than the cluster search performed in KiDS (Maturi et al. 2019, and in prep.). 
To further reduce the contamination of the galaxy sample due to poor classification, we removed all objects with an extremely small radius, choosing as a threshold radius ∼0.01″, which is the value that divides the sources into two distinct groups in the mF150W–radius plane. The total number of selected galaxies used for the group detection is 389 248. The redshift and magnitude distribution of the cleaned input galaxy catalog is shown in Fig. 2.
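The selection steps above can be sketched as a chain of boolean cuts. This is only a minimal illustration: the flag names FLAG_STAR_JWST, FLAG_STAR_HSC, and LP_warn_fl are those quoted in the text, while the dictionary layout and the keys "is_galaxy", "mag_f150w", and "radius_arcsec" are hypothetical placeholders for the corresponding catalog columns.

```python
import numpy as np

def clean_catalog(cat, mag_limit=27.3, min_radius_arcsec=0.01):
    """Boolean mask of the sources kept for the group search.

    `cat` is a dict of NumPy arrays. The keys "is_galaxy", "mag_f150w",
    and "radius_arcsec" are hypothetical column names.
    """
    keep = cat["is_galaxy"].astype(bool)              # galaxy/AGN classification
    keep &= cat["FLAG_STAR_JWST"] == 0                # reliable JWST photometry
    keep &= cat["FLAG_STAR_HSC"] == 0                 # reliable ground-based photometry
    keep &= cat["LP_warn_fl"] == 0                    # best photo-z quality flag
    keep &= cat["mag_f150w"] <= mag_limit             # depth cut in the reference band
    keep &= cat["radius_arcsec"] > min_radius_arcsec  # drop extremely small sources
    return keep
```

Applying the cuts in sequence on a single mask keeps the original catalog untouched, so the effect of each criterion on the sample size can be inspected separately.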

Fig. 2.

Distribution in magnitude (top panel) and redshift up to z = 4 (bottom panel) of the cleaned galaxy catalog used as input for the group search.

In order to include in our group detection the information about the footprint of the survey and the areas of the field that are inaccessible or potentially contaminated during the observation phase, we need to generate a so-called visibility mask. Using a visibility mask as input in the detection process has a twofold purpose: it filters out unsafe sources excluding them from the input catalog and it allows the detection algorithm to account for inaccessible areas during the detection procedure. We based the visibility mask on the input galaxy catalog, using an approach similar to that used in Toni et al. (2024). In particular, we created the visibility mask starting by assigning the value 1 (masked) to all mask pixels devoid of galaxies in the selected catalog described above. This ensures areas lying outside the field and areas affected by bright star halos are accounted for. Then, we built polygons to cover areas that may be affected by star spikes. To do this, we used the Incremental Data Release of the HSC bright-star masks by Coupon et al. (2018), extracted from Gaia DR2 (Gaia Collaboration 2018), with magnitude G < 18. The spike and halo size follow an exponential relation with the magnitude of the star, with the same approach presented by Coupon et al. (2018), and the polygons were converted to binary masks using the venice code (Coupon 2018). Besides these polygons, we visually inspected the galaxy density maps in different redshift bins and manually masked areas with possible artifacts close to the star halos and/or field borders. During the visual inspection, due to suspiciously shaped overdensities at z > 2, we also discarded the area occupied by the central galaxies of a z ∼ 0.1 known group in COSMOS, which is listed, for instance, by Gozaliasl et al. (2019) as ID20149. This makes this known bright object undetectable but improves purity at higher redshifts. 
This masking procedure ensures our galaxy catalog is as clean as possible from spurious or bad-photometry detections, especially for high-z detections.

The resulting composite visibility mask yielded an effective area in which the group search was performed of about 0.45 deg².
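The first masking step and the effective-area computation can be illustrated with a short sketch. It covers only the "pixels devoid of galaxies" criterion, not the star-spike polygons or the manual masking, and the pixel grid (the bin edges) is a hypothetical input, since the mask resolution is not stated here.

```python
import numpy as np

def empty_pixel_mask(ra, dec, ra_edges, dec_edges):
    """Mask with value 1 (masked) in every pixel containing no catalog galaxy."""
    counts, _, _ = np.histogram2d(ra, dec, bins=[ra_edges, dec_edges])
    return (counts == 0).astype(np.uint8)

def effective_area_deg2(mask, ra_edges, dec_edges):
    """Area of the unmasked pixels in deg^2, with a cos(dec) correction."""
    d_ra = np.diff(ra_edges)[:, None]
    d_dec = np.diff(dec_edges)[None, :]
    dec_mid = 0.5 * (dec_edges[:-1] + dec_edges[1:])
    pix_area = d_ra * d_dec * np.cos(np.deg2rad(dec_mid))[None, :]
    return float(np.sum(pix_area * (mask == 0)))
```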

3. Detection of groups with AMICO

AMICO (Bellagamba et al. 2018; Maturi et al. 2019) detects galaxy clusters and groups in photometric galaxy catalogs using position, photometric redshift, and any additional galaxy property. The algorithm is based on a linear optimal matched filter (e.g., Maturi et al. 2005) that extracts a signal for which we have an a priori model. The galaxy density can be described as the sum of a signal component and a noise component, representing the cluster and the field galaxies, respectively. The signal component S is expressed by S(x) = A Mc(x), where Mc(x) is the a priori cluster model and a function of the n galaxy properties contained in the vector x. The factor A, the signal amplitude, is retrieved as a convolution of the data with an optimal filter defined via constrained minimization and is related to the cluster richness. AMICO generates a three-dimensional amplitude map and selects detections at the peaks with the highest signal-to-noise ratios. Then, it attributes membership probability to galaxies whenever a candidate is identified. This probability is then used to compute the apparent richness, λ, as the sum of all member probabilities, and the intrinsic richness, λ⋆, considering only members brighter than m⋆ + 1.5 and inside R200, where m⋆, the luminosity function knee magnitude, and R200, the virial radius, are model parameters, as we describe below and in Toni et al. (2024). In this paper, we focus on the specific characteristics of this particular application of AMICO to the new COSMOS-Web data. Therefore, for a complete description of the mathematical formalism and working principle of AMICO, we refer the reader to Bellagamba et al. (2011, 2018) and Maturi et al. (2019).
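The two richness definitions can be made concrete with a minimal sketch, assuming that membership probabilities, magnitudes, and projected distances from the group center are already available as arrays. The function and argument names are hypothetical and this is not the AMICO implementation.

```python
import numpy as np

def richness(prob, mag, radius, m_star, r200, delta_m=1.5):
    """Return (apparent, intrinsic) richness from membership probabilities.

    Apparent richness sums all probabilities; intrinsic richness keeps only
    members brighter than m_star + delta_m and inside r200.
    """
    prob = np.asarray(prob, dtype=float)
    lam = prob.sum()
    core = (np.asarray(mag) < m_star + delta_m) & (np.asarray(radius) < r200)
    lam_star = prob[core].sum()
    return float(lam), float(lam_star)
```

Because the magnitude limit follows the knee of the luminosity function and the aperture follows R200, the intrinsic richness is meant to be comparable across redshift, whereas the apparent richness depends on the depth of the data.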

For this specific application, we adopted the same amplitude-map resolution used for the AMICO-COSMOS catalog (Toni et al. 2024), that is Δr = 0.3′ (on the sky plane) and Δz = 0.01. We used AMICO with a single galaxy property, which we chose to be the magnitude in the JWST F150W band (simply m, hereinafter). This photometric band, in addition to having good resolution and low background noise with respect to other bands, was proven to be stable to the calibration of the mass-proxy scaling relation at high redshift observed in the H-band run described in Toni et al. (2024). As for the redshift probability distribution p(z) of each galaxy, we decided to rely on an analytical Gaussian distribution that peaks at the redshift with the highest probability for each galaxy, with a width set by its 1σ uncertainties.
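A Gaussian p(z) of this kind can be sketched as follows. The example assumes a single symmetric σ per galaxy (the catalog may provide asymmetric bounds) and uses a grid step equal to the Δz = 0.01 map resolution; the peak redshift and σ below are illustrative values only.

```python
import numpy as np

def gaussian_pz(z_peak, sigma_z, z_grid):
    """Gaussian p(z) sampled on a regular grid, normalized to unit integral."""
    pz = np.exp(-0.5 * ((z_grid - z_peak) / sigma_z) ** 2)
    norm = np.sum(pz) * (z_grid[1] - z_grid[0])
    return pz / norm

# Grid step of 0.01, matching the redshift resolution of the amplitude map
z_grid = np.arange(0.0, 4.0 + 1e-9, 0.01)
pz = gaussian_pz(z_peak=1.2, sigma_z=0.03, z_grid=z_grid)  # illustrative values
```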

Cluster model. To describe the expected distribution of cluster galaxies, we built an analytical model with a truncated Navarro-Frenk-White profile (Navarro et al. 1997) and a Schechter luminosity function (Schechter 1976). We used the following parameters from the literature: normalization as estimated by Hennig et al. (2017), concentration parameter from Ragagnin et al. (2021) and faint-end slope of the luminosity function from Andreon et al. (2014). The characteristic magnitude of the luminosity function, m⋆, was estimated using evolutionary synthesis models with GALEV (Kotulla et al. 2009). We used the same GALEV configuration as in Toni et al. (2024), evolving a massive (∼10^11 M⊙) elliptical galaxy. Figure 3 shows the redshift evolution of m⋆ overlaid on the selected galaxies (orange density contours) for a model with formation redshift zf = 8 and an exponentially declining star formation burst (purple line), and one with zf = 5 and without burst (light blue line). The former marks the exponential cut-off in the number of galaxies and better describes the magnitude evolution trend of the data set up to at least z ∼ 3.7, the redshift to which we extended our search. This first m⋆ evolution trend was chosen to build the cluster model. We adopted a resolution in magnitude of Δm = 0.5 for the model and the noise describing group and field galaxies, respectively. This ensures sufficient galaxy statistics in each magnitude bin.
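The luminosity part of such a model can be illustrated with the Schechter function written per unit magnitude and sampled on bins of width Δm = 0.5. This is a sketch only: the values of the characteristic magnitude and faint-end slope below are hypothetical placeholders, not the parameters adopted from the cited works, and the radial profile is not included.

```python
import numpy as np

def schechter_mag(m, m_star, alpha):
    """Unnormalized Schechter function per unit magnitude."""
    x = 10.0 ** (-0.4 * (m - m_star))   # luminosity ratio L / L*
    return x ** (alpha + 1.0) * np.exp(-x)

# Magnitude bins of width 0.5, matching the model resolution quoted in the text
m_bins = np.arange(18.0, 27.5, 0.5)
phi = schechter_mag(m_bins, m_star=22.0, alpha=-1.0)  # hypothetical parameters
```

The exponential term produces the sharp drop at magnitudes brighter than the characteristic magnitude, which is the cut-off the knee magnitude is meant to trace in the data.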

Fig. 3.

Magnitude and redshift of the galaxies of the input catalog (orange density contours) and evolution with redshift of the characteristic magnitude of the luminosity function, m⋆, for two different models with different formation redshifts and star formation burst, as shown in the legend. The model with zf = 8 and past burst is the one used for the model in this work, marked by the solid purple line. This model better describes the trend of magnitude with respect to the one represented by the solid blue line. The dashed purple line indicates the same model, but for m⋆ + 1.5, which is the limit used in the definition of the intrinsic richness.

Noise model. We estimated the noise, which describes the distribution of field galaxies, by approximating it to the general distribution of the whole galaxy sample. Given the small area covered by this group search, we investigated the impact of group galaxies on the noise and observed visible peaks localized in redshift, which could be due to physical overdensities. We cleaned the noise model from the group galaxy contamination by dividing the field into four non-overlapping quadrants of roughly the same area and by taking the median noise value of each corresponding noise bin for the four quadrants as the final one. This cancels out the contribution of localized overdensities that are present in a specific quadrant and not in the others. Additionally, we performed a regularization of the noise model similar to the one described in Toni et al. (2024), attributing a large value to empty bins and those with m < m⋆ − 3.
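The quadrant-median idea can be sketched as below. The example assumes that the quadrants are defined by splitting at the median right ascension and declination and that they have equal area, so that raw counts can be compared directly; both are simplifications, and the regularization step is not included.

```python
import numpy as np

def median_quadrant_noise(ra, dec, mag, z, mag_edges, z_edges):
    """Median over four sky quadrants of the (magnitude, redshift) histogram."""
    ra_mid, dec_mid = np.median(ra), np.median(dec)
    hists = []
    for east in (False, True):
        for north in (False, True):
            sel = ((ra >= ra_mid) == east) & ((dec >= dec_mid) == north)
            h, _, _ = np.histogram2d(mag[sel], z[sel], bins=[mag_edges, z_edges])
            hists.append(h)
    # A peak present in a single quadrant does not survive the bin-wise median
    return np.median(np.stack(hists), axis=0)
```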

4. Catalog of galaxy groups

We performed a group search with AMICO over the effective area of 0.45 deg² in the COSMOS-Web field, as previously described. We ran the code down to (S/Nnocl)min = 6.0, where S/Nnocl is the AMICO signal-to-noise ratio, which does not include shot-noise from cluster/group members in the amplitude variance. When taken as a reference, this was shown to yield a more stable redshift dependence of the purity of the sample than the standard S/N, which includes both background and member contributions (Maturi et al., in prep.). For further details on the definition and on the difference between the two signal-to-noise ratios in AMICO, we refer the reader to Bellagamba et al. (2018) and Maturi et al. (2019). We then cut the catalog at λ > 2, which is a typical value to reject single galaxies and pairs of galaxies and minimize the number of unrealistic and spurious detections. Additionally, we rejected detections falling into the first and last redshift bins, which might be affected by border effects. The final catalog we produced contains 1678 detections in the range 0.08 ≤ z ≤ 3.7 and with S/Nnocl > 6.0. Although AMICO has been widely used at z < 1 (Bellagamba et al. 2018; Maturi et al. 2019, 2023), and already tested (Euclid Collaboration: Adam et al. 2019) and applied to real data (Toni et al. 2024) up to z = 2, the detection of groups and clusters at even higher redshift, where they are still taking shape, is a largely unexplored regime that we are interested in addressing with this work. The possibility to detect objects at z > 2 with AMICO and the resulting sample derived from the application to the COSMOS-Web data are discussed in more detail in Sect. 5. As for the sample at z < 2, we compared the detections in the COSMOS-Web field with the previous COSMOS catalog constructed with AMICO (Toni et al. 2024).
We performed a three-dimensional matching with maximum radial separation dr = 1 Mpc/h and maximum redshift separation dz = 0.05(1 + z), equivalent to ∼200 cMpc at z = 2, which are typical values for group/cluster matching. In the matching, we allowed multiple associations in order not to exclude potential cases of fragmentation or over-merging of detections. For this comparison, we worked on the common volume covered by the two catalogs, which is defined by the interval 0 < z < 2 and by the smallest effective area, namely the one of the COSMOS-Web field. Therefore, we discarded masked detections from the previous AMICO-COSMOS catalog according to the visibility mask described in Sect. 2.3. We found good correspondence between the two samples on the common volume, with a total number of 847 groups matched for the COSMOS-Web sample and 520 for the AMICO-COSMOS one. We explored the possibility that the difference in counts signals fragmentation or over-merging. However, we found that the impact of fragmentation is marginal. Since we have used two different definitions of signal-to-noise as a lower limit to stop the detection process, we repeated the matching considering only COSMOS-Web detections with S/N > 3.0, which was the minimum threshold of the AMICO-COSMOS catalog. This yielded a more consistent number of counts, with 400 matched detections for COSMOS-Web and 397 for AMICO-COSMOS. The percentage of matched detections for the AMICO-COSMOS catalog is ∼97% and increases to ∼98% when considering only detections with S/N > 3.5, which was identified as the threshold for selecting the most robust sample, based on the comparison between the detection performances in different photometric bands (see Toni et al. 2024, for details).
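A matching of this type can be sketched as follows, under stated assumptions: the transverse separation is computed in physical Mpc/h at the redshift of the first catalog (the text does not specify physical or comoving), angles are treated in the small-angle limit, and distances use the flat ΛCDM parameters adopted in this paper. The function names and the tuple layout of the catalogs are hypothetical.

```python
import numpy as np

C_KMS = 299792.458  # speed of light in km/s

def comoving_distance(z, omega_m=0.3, h=0.7, n=2048):
    """Line-of-sight comoving distance in Mpc for a flat LCDM cosmology."""
    zz = np.linspace(0.0, z, n)
    inv_e = 1.0 / np.sqrt(omega_m * (1.0 + zz) ** 3 + (1.0 - omega_m))
    integral = np.sum(0.5 * (inv_e[1:] + inv_e[:-1]) * np.diff(zz))
    return C_KMS / (100.0 * h) * integral

def match_3d(cat_a, cat_b, dr_max=1.0, dz_frac=0.05, h=0.7):
    """Return all (i, j) pairs within dr_max [Mpc/h] and dz_frac * (1 + z).

    Each catalog is a sequence of (ra_deg, dec_deg, z) tuples.
    """
    pairs = []
    for i, (ra_a, dec_a, z_a) in enumerate(cat_a):
        # Physical transverse scale at z_a in Mpc/h per radian (assumption)
        scale = comoving_distance(z_a, h=h) / (1.0 + z_a) * h
        for j, (ra_b, dec_b, z_b) in enumerate(cat_b):
            d_ra = np.deg2rad(ra_a - ra_b) * np.cos(np.deg2rad(dec_a))
            d_dec = np.deg2rad(dec_a - dec_b)
            sep = scale * np.hypot(d_ra, d_dec)  # small-angle approximation
            if sep <= dr_max and abs(z_a - z_b) <= dz_frac * (1.0 + z_a):
                pairs.append((i, j))             # multiple associations allowed
    return pairs
```

With the same helper, `comoving_distance(2.15) - comoving_distance(2.0)` gives the line-of-sight comoving length corresponding to dz = 0.05(1 + z) at z = 2, which comes out close to the ∼200 cMpc quoted above.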

In Fig. 4, we show the redshift distribution of all the objects of this catalog and the cumulative signal-to-noise distribution, marking with the orange line the distributions for the detections with λ > 10, a value to select the rich-end tail of the distribution in λ and identify the richest objects in the catalog. In Fig. 5, we plot the intrinsic richness, λ⋆, versus z in three intervals of S/Nnocl. The trend of increasing intrinsic richness for increasing redshift observed up to z ∼ 2 is expected, since the further we observe, the harder it is to detect poor and faint objects. At z ≳ 2, overdensities and numerous low-λ detections might be due to the fact that AMICO detects cores and substructures of clumpy extended protostructures rather than individual virialized clusters/groups. Additionally, the redshift interval z ∼ 2.4 − 2.6 is occupied in COSMOS by several large overdensities of galaxies known, for example, as the Hyperion and Colossus protoclusters (Cucciati et al. 2018; Lee et al. 2016, see Table 1 for further details and references). Figure 6 shows four examples of detections at different redshifts. The final group catalog includes the columns described in Appendix A and it is accompanied by the list of member galaxies for each detection with membership and field probability as assigned by AMICO. This galaxy membership probability is calculated after AMICO detects candidates in the 3D amplitude map, as briefly described in Sect. 3. In particular, the probability of the i-th galaxy belonging to the j-th detection is expressed by

Fig. 4.

Redshift distribution (left panel) and normalized cumulative signal-to-noise distribution (right panel) for all the detections in the COSMOS-Web group catalog. In orange, we show the richest detections, with λ > 10.

$$ \begin{aligned} P_{i,j} = F_{i}\,\frac{A\,M_c(\boldsymbol{x}_i)\,p(z_j)}{A\,M_c(\boldsymbol{x}_i)\,p(z_j)+N}. \end{aligned} $$ (1)
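Equation (1) can be evaluated directly once its ingredients are available. The sketch below takes the amplitude A, the model value Mc(xi), the galaxy p(z) evaluated at the detection redshift, the noise N, and the field probability Fi as inputs, and is only an illustration with hypothetical names, not the AMICO code. How Fi is updated after each association is not spelled out here, so no update rule is implemented.

```python
import numpy as np

def membership_probability(amplitude, model, pz_at_group, noise, field_prob):
    """Evaluate Eq. (1) for an array of galaxies and a single detection."""
    signal = amplitude * np.asarray(model, dtype=float) * np.asarray(pz_at_group, dtype=float)
    return np.asarray(field_prob, dtype=float) * signal / (signal + np.asarray(noise, dtype=float))
```

For a galaxy with no previous association (Fi = 1), the probability reduces to the ratio between the expected group signal and the total expected density at the galaxy's position.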

Fig. 5.

Intrinsic richness, λ⋆, for the sample of detected groups and its trend with redshift, in three different S/Nnocl bins as indicated in the plot.

Table 1.

Compilation of protoclusters and high-redshift clusters and groups known in the literature for the COSMOS field.

Fig. 6.

Four examples of detections present in the group catalog, at different redshifts. Circles indicate member galaxies, color-coded with membership probability and with their redshift printed next to each circle. Purple-to-yellow contours mark the density of galaxies and green contours the X-ray emission from the combined XMM-Newton and Chandra mosaic image in the 0.5−2 keV band. White lines delimit masked areas. The center of the group is marked with a red cross. In the top left panel, a group at z = 0.71 with λ ∼ 78 and central AGN (the same group shown in Fig. 1). In the top right panel, a group at z = 1.29 with λ ∼ 14. In the bottom left panel, a group at z = 2.22 with λ ∼ 24 and on the bottom right panel one at z = 3.14 with λ ∼ 28, flagged for being on the extended X-ray emission of a low-z group and for being at the edge of the field, as visible on the top of the stamp. JWST rgb images are in the same filters as in Fig. 1.

Here, Fi is the probability of belonging to the field, which is used as a scaling factor to consider any previous associations of the galaxy. The field probability assumes a value of 1 for each galaxy when the iterations start and decreases at each new association. The vector containing the galaxy properties as described in Sect. 3, xi, contains here the sky position of the galaxy with respect to the group center. The noise and the value of the galaxy redshift probability distribution at the group redshift, zj, are here expressed by N and p(zj). The galaxy member catalog and the membership probability for each galaxy associated with a detection are part of the output returned by AMICO. However, this membership association is by definition a model-dependent quantity. If one wants to study the properties of the clusters and groups themselves, such as the luminosity function or the density profile, in a model-independent way, one needs to rely on a different method to identify possible members and field galaxies. For this reason, we included in our group catalog the possibility of retrieving statistical membership, as was done for example by Puddu et al. (2021) to study cluster luminosity functions. We proceeded as follows: in the membership catalog, we added a column named “FLAG_CYLINDER”, to which we assigned the value “1” for all galaxies located within a cylinder centered on each detection, with a radius of 0.5 Mpc/h and depth 0.01(1 + z); we then assigned the flag value “0” to all galaxies which do not fall in any of the cylinders. These flags help identify potential group and field galaxies without relying on the AMICO association probability.
Besides this, we also created two three-dimensional maps of the volume covered by the group search, marking the effective area (taking into account only unmasked regions) inside or outside the cylinders, used to easily retrieve the group and field volumes in a given redshift slice and perform background subtraction. This was done to make the membership catalog usable for galaxy population and group property studies. However, when referring to group galaxies in this paper, we will make use of the membership probability directly returned by AMICO, selecting members as galaxies associated with P > 50%, unless otherwise specified.
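The cylinder flag described above can be illustrated with a minimal Python sketch. It is not the code used for the catalog: the function names are hypothetical, the projected separation is assumed to be already expressed in Mpc/h at the detection redshift (the angular conversion requires a cosmology and is omitted), and 0.01(1 + z) is treated as the half-depth of the cylinder, which the text does not specify.

```python
def in_cylinder(r_proj, z_gal, z_det, radius=0.5, depth=0.01):
    """True if a galaxy lies inside the cylinder of one detection.

    r_proj : projected separation from the detection center in Mpc/h
             (assumed to be precomputed at the detection redshift).
    depth  : assumed here to be the half-depth, in units of (1 + z_det).
    """
    return r_proj <= radius and abs(z_gal - z_det) <= depth * (1.0 + z_det)


def flag_cylinder(z_gal, candidates):
    """Return 1 if the galaxy falls in at least one cylinder, else 0.

    candidates : iterable of (r_proj, z_det) pairs, one per detection.
    """
    return int(any(in_cylinder(r, z_gal, z_det) for r, z_det in candidates))
```

A galaxy flagged 0 by this kind of test would be counted in the field volume, and a galaxy flagged 1 in the group volume, for the background subtraction mentioned above.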

4.1. Purity and completeness of the sample with SinFoniA

We derived the purity and completeness of the sample by making use of data-driven mock catalogs, produced with the Selection Function extrActor (SinFoniA; Maturi et al. 2019). This method has already been applied to wide-field surveys such as KiDS (Maturi et al. 2019, and in prep.) and is currently part of the implementation of the Euclid pipeline. The idea is to generate mocks based on the input dataset used for the creation of our candidate catalog, which is already divided into field and group galaxies via the association probability returned by AMICO. After this, the AMICO algorithm is applied to the mock galaxy catalog in the same way as was done for the real data, and the list of generated mock groups is used as a “truth table” to study the resulting group catalog. Therefore, the reference catalog is matched with the results of the search and used to estimate the purity and completeness of the sample and the uncertainties on the retrieved properties of the groups. The SinFoniA algorithm exploits a Monte Carlo approach to create realizations of the universe and generate realistic mocks directly from the data. Therefore, the generated mocks are not based on any model or assumption, and they are able to reproduce the complexity of the real data. This reduces the possibility of introducing biases due to the cosmological assumptions, which are, for example, behind numerical simulations. Here, we briefly describe the main steps of the approach we adopted for this specific application and the results of our comparison against realistic mocks. For a complete description of SinFoniA, we refer the reader to Maturi et al. (2019).

Mock catalog. To create a data-driven mock catalog we proceeded as follows. First, we derived the cumulative distribution function (CDF) of a group sample identical to the original catalog, but extending down to S/Nnocl ∼ 0, and chose as reference ranking quantity the S/Nnocl returned by AMICO. We did this in redshift bins to account for a possible redshift dependence. The CDF of this sample is shown in Fig. 7, with different colors referring to different redshift bins. There is no significant redshift dependence for this application, proving once again the good quality of the photo-z estimation. Then, we extracted the group detections using the CDF at their signal-to-noise ratios as the probability to be extracted, since this can be interpreted as the likelihood of the detection to correspond to a real group. This was done to avoid a sharp cut in S/Nnocl when choosing which detections to consider. The field probability of galaxies is then recalculated by taking into account which detections have been rejected. Secondly, we extracted the field galaxies by using the membership information. The probability of the galaxy belonging to the field (Fi in Eq. (1)) is used as the probability for the galaxy to be extracted as part of the mock field. For this application, we kept the original position and redshift of the field galaxies to maintain the noise spatial correlation and so the main features of the large-scale structure. Then, we generated the list of possible mock members. Once again, we extracted the possible members from the catalog created via rejection based on the CDF.
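The two probabilistic extractions described above (detections drawn with the CDF value at their S/Nnocl, field galaxies drawn with their field probability Fi) can be sketched as independent Bernoulli draws. This is only an illustration of the idea and not the SinFoniA implementation: the function names are hypothetical, the CDF is a plain empirical one, and the redshift binning and the recalculation of the field probabilities are left out.

```python
import bisect
import random


def empirical_cdf(values):
    """Return a function giving the fraction of `values` that are <= x."""
    ordered = sorted(values)
    n = len(ordered)

    def cdf(x):
        return bisect.bisect_right(ordered, x) / n

    return cdf


def extract(items, probabilities, rng):
    """Keep each item with its own probability (independent draws)."""
    return [item for item, p in zip(items, probabilities) if rng.random() < p]


# Example with made-up numbers: detections ranked by signal-to-noise ratio.
rng = random.Random(42)
snr = [0.5, 2.0, 4.0, 6.0, 9.0, 12.0]
cdf = empirical_cdf(snr)
kept_detections = extract(snr, [cdf(s) for s in snr], rng)

# Field galaxies are drawn in the same way, using F_i as the probability.
field_prob = [1.0, 0.8, 0.1, 0.0]
mock_field = extract(["g1", "g2", "g3", "g4"], field_prob, rng)
```

Because the acceptance probability rises smoothly with the ranking quantity, low signal-to-noise detections are progressively down-weighted rather than removed by a sharp threshold.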

Fig. 7.

Cumulative distribution function (CDF) of S/Nnocl, in different bins of redshifts (color bar on the right), used to select detections the mocks are based on, instead of using a sharp signal-to-noise cut.

All galaxies are then collected in bins of apparent richness and redshift so that the mock groups can be built by drawing from the full population in the corresponding bin. We used a bin resolution of Δλ = 10 and Δz = 0.01 for apparent richness and redshift, respectively. With this approach, mock groups are built with members extracted from stacks of different original detections, using the membership probability (Pi, j in Eq. (1)) as the probability of being extracted as members. However, this method tends to cancel out the intrinsic shape and orientation of clusters/groups. To restore their scatter in physical shape, we performed a coordinate transformation that does not affect density but introduces an ellipsoidal shape resembling that of dark matter halos in simulations (Despali et al. 2017). Mock group position and redshift are randomly shuffled, with maximum displacement values of 0.25 Mpc/h and 0.01 in redshift. These values were chosen in order to introduce a perturbation that is not large enough to alter the features and structures on large scales, namely to preserve the three-dimensional spatial correlation. Finally, we ran AMICO on the generated mock catalog and used this as a reference to study the reliability of the original detections. Maturi et al. (in prep.) discuss the negligibility of the possible under-representation of small and blended objects in the mocks.

Matching procedure and group observables. When assessing the quality of detections through matching with a reference, it is important to ensure the good quality of the matching procedure itself. To do so, we performed an initial matching with relatively loose tolerance, namely dz = 0.05(1 + z) and dr = 0.5 Mpc/h, and used an a-priori sorting of the catalogs by richness to prioritize richer groups with respect to the more numerous poor groups. Then, we looked at the distribution of redshift and spatial separation between successfully matched detections, which is reported in the left panel of Fig. 8. The majority of matched detections are concentrated at Δr < 0.2 Mpc/h and at |Δz|/(1 + z) < 0.01, suggesting these might be more appropriate tolerance values to be chosen for the matching in order to discard those likely to be random matches (see Fig. 8). We estimated the number of random matches by fitting the scatter distribution in redshift and position with a function given by the product of two Gaussian distributions, plus a constant that represents the value of the background consisting of random matches. We then repeated the same fitting with Cauchy distributions, which better represent the tails of the distribution. These tests showed that the background value is negligible with respect to the density inside the selected rectangle. We observed that typical scatter values in radial separation assume realistic values (Δr = 0.2 Mpc/h) while for the redshift separation, the scatter distribution is very narrow, with most of the detections having their redshift matched within ∼0.005(1 + z). In the right panel of Fig. 8, we show the normalized distribution of scatter in redshift for the mock matching (dashed black line) and for the comparison between photo-z and spec-z of the detected groups with spectroscopic counterparts (solid purple and green lines, for two redshift intervals). For details about the spectroscopic counterpart assignment see Sect. 4.2. 
The spec-z values shown correspond to the mean spectroscopic redshift of members assigned with P > 90%. We also tested lower probability thresholds, but the scatter did not change significantly. The redshift scatter in the mocks is significantly smaller than the spectroscopic scatter.
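As an illustration of the matching step, the sketch below implements a greedy one-to-one matching in Python with the tightened tolerances quoted above (0.2 Mpc/h and 0.01(1 + z)). Several choices are assumptions made for the example and are not stated in the text: objects are dictionaries with hypothetical keys, positions are flat projected coordinates in Mpc/h, only the detected catalog is sorted by richness, the closest available counterpart is preferred, and the reference redshift enters the (1 + z) factor.

```python
import math


def match_catalogs(detected, truth, dr_max=0.2, dz_max=0.01):
    """Greedy one-to-one matching; returns (i_detected, i_truth) index pairs.

    Each object is a dict with keys "x", "y" (Mpc/h), "z" and "richness".
    Richer detections choose their counterpart first.
    """
    order = sorted(range(len(detected)), key=lambda i: -detected[i]["richness"])
    free = set(range(len(truth)))
    pairs = []
    for i in order:
        det = detected[i]
        best, best_r = None, None
        for j in sorted(free):
            ref = truth[j]
            r = math.hypot(det["x"] - ref["x"], det["y"] - ref["y"])
            dz = abs(det["z"] - ref["z"]) / (1.0 + ref["z"])
            if r <= dr_max and dz <= dz_max and (best is None or r < best_r):
                best, best_r = j, r
        if best is not None:
            free.remove(best)  # a reference object can be matched only once
            pairs.append((i, best))
    return pairs
```

Sorting by richness before matching means that, when a rich and a poor detection compete for the same reference object, the rich one is assigned first.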

Fig. 8.

Left panel: Distribution of matched detections in the redshift scatter, |Δz|/(1 + z) – radial separation, Δr [Mpc/h] plane for an initial matching with maximum separation dz = 0.05(1 + z) and dr = 0.5 Mpc/h. Most of the matched detections are concentrated in the rectangular area at Δr < 0.2 Mpc/h and at |Δz|/(1 + z) < 0.01. Right panel: Distribution of the redshift scatter for the mock matching (dashed black line) compared to the scatter between group photo-z and group spec-z estimated as the mean of spec-members with P > 90%, in two intervals of redshift (solid purple and green lines). The distributions are normalized to 1.0 to better compare scatters.

The matching against the true catalog allowed us to check for possible biases in the estimation of the group observables used as mass proxies: amplitude, apparent richness, and intrinsic richness. The relative scatter of the true and detected observable quantities is shown in Fig. 9 for apparent richness, λ, intrinsic richness, λ⋆, and amplitude, A. No significant bias is visible in any of the observables across the entire sample. As expected, the three proxies of mass are highly affected by the Malmquist bias, which is a selection effect and is unrelated to the detection method. This selection effect is significant for the smallest detections, as visible in Fig. 9, indicatively for λ ≲ 70, for λ⋆ ≲ 10, and for A ≲ 0.5, with a slight redshift dependence for the latter.

Fig. 9.

Relative scatter of the three observables (O, i.e., λ, λ⋆, and A, from left to right) between the detected observables, Odet, and the true observables in the mocks, Otrue. The scatter is here expressed by ΔO = Odet − Otrue. Different colors mark different redshift bins as indicated in the plot on the right. Points with a scatter larger than 3.0 and the detection in Fig. 1 (which is much richer than the rest of the sample) are not shown for better visualization of scatters and biases.

Purity and completeness. The purity of a detected sample in reference to a true sample is defined as the ratio of the successfully matched detections to the total number of detections; the completeness is computed as the ratio of the successfully matched detections to the total number of true objects. We estimated the purity and the completeness of our sample against the mock catalog produced with SinFoniA in bins of detected redshift and of detected candidate properties. For the purity, these properties are the two signal-to-noise ratios, S/Nnocl and S/N, and the two main mass proxies returned by AMICO, the amplitude, A, and the intrinsic richness, λ⋆; for the completeness, they are the two mass proxies only. The values of purity and completeness at different redshifts and for the different properties are shown in Figs. 10 and 11, respectively. As expected from the previous analyses of AMICO cluster and group samples (e.g., Maturi et al., in prep.), the S/Nnocl value is the detection property with the best stability in redshift when analyzing the purity of the catalog. This means that this signal-to-noise ratio can be easily linked to a desired purity threshold. We found that, for this application, a cut at S/Nnocl ∼ 10 (selecting around 670 detections) would identify a 90%-pure group sample, while a cut at S/Nnocl ∼ 7 (selecting around 1400 detections) would set purity at 80%. Figure 12 shows the signal-to-noise cut (horizontal axis) to be applied to this group sample for each desired level of purity (vertical axis). The completeness of this sample is less redshift-dependent when expressed in terms of the intrinsic richness (bottom panel of Fig. 11), with an 80% completeness reached at around λ⋆ ∼ 16.
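The two definitions given above translate directly into code. The following Python sketch uses hypothetical function names and made-up inputs; the second function illustrates, under the assumption of a cumulative purity computed above a signal-to-noise threshold, how a relation like the one in Fig. 12 could be tabulated.

```python
def purity_completeness(n_matched, n_detected, n_true):
    """Purity = matched / detected; completeness = matched / true."""
    purity = n_matched / n_detected if n_detected else float("nan")
    completeness = n_matched / n_true if n_true else float("nan")
    return purity, completeness


def purity_above(snr, matched, threshold):
    """Purity of the subsample with signal-to-noise ratio >= threshold.

    snr     : signal-to-noise ratio of each detection
    matched : True where the detection has a counterpart in the mock truth
    """
    selected = [m for s, m in zip(snr, matched) if s >= threshold]
    return sum(selected) / len(selected) if selected else float("nan")
```

Scanning `threshold` over a grid of values and recording the resulting purity and the number of selected detections gives the kind of lookup needed to choose a cut for a desired purity.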

Fig. 10.

Purity of the group sample evaluated against the mock catalog produced with SinFoniA. The four panels show the redshift dependence of (non-cumulative) purity referred to four different reference detection properties: signal-to-noise ratio without cluster contribution, with cluster contribution, amplitude, and intrinsic richness (from top to bottom).

Fig. 11.

Completeness of the group sample evaluated against the mock catalog produced with SinFoniA. The two panels show the redshift dependence of completeness referred to two different proxies of mass returned by AMICO: amplitude (top) and intrinsic richness, λ⋆ (bottom).

Fig. 12.

Relation between minimum S/Nnocl and purity of the sample. In correspondence with some reference values of purity, we report the number of selected detections.

4.2. Spectroscopic counterparts

We associated spectroscopic redshifts with our detections, using the compilation of spectroscopic catalogs for the COSMOS field by Khostovan et al. (2025). Public and private surveys included are, for instance, Lilly et al. (2007, 2009), Kartaltepe et al. (2015), Hasinger et al. (2018), Kashino et al. (2019) and many others. Additional references can be found, for example, in Gozaliasl et al. (2024), Table 2. We selected sources with high-quality spectra, namely with a confidence level larger than 80%. This yields a spectroscopic sample with more than 66 000 galaxies with spectroscopic redshift in the COSMOS field. We assigned spectroscopic counterparts to our galaxy members identified by AMICO, via positional matching within 1″, which is the same separation used to assign counterparts from different surveys in the compilation (Khostovan et al. 2025). Thanks to the large spectroscopic coverage offered by the COSMOS field, we were able to assign a spec-z to around 20% of the group members assigned by AMICO (4300 members assigned with P > 50% have a high-quality spectroscopic redshift). We defined the quality of the spectroscopic association to the group as Q, which is the ratio of the sum of the membership probabilities of the spectroscopic members to the number of spectroscopic members, Q = ∑iPi/Ngal. We found that 535 detections in total have at least three members with a spec-z. 948 detections have at least one galaxy member with spec-z and Q > 70%. A total of 1075 detections have at least 3 spectroscopic members or an association with Q > 70%. Among these, around 80% of the group candidates have the redshift assigned by AMICO that is consistent with zspec. Here, zspec is defined as the mean spectroscopic redshift of the associated members. The consistency criterion is met when the relative difference between the two redshifts is less than 3%, that is, |Δz|/(1 + zspec) < 0.03, which is around the largest σ of the photo-z uncertainty (see Sect. 2). 
By assigning spectroscopic counterparts to our group members, we were able to exclude the presence of any significant general z-dependent bias between photo-zs and spec-zs, namely between AMICO redshift (based on galaxy photo-zs) and spectroscopic redshift of the group, estimated as the average redshift of the available spectroscopic members. This kind of bias was, for instance, found in the KiDS sample at low redshift and consequently corrected (Maturi et al., in prep.). The mean bias of the sample with more than 3 spectroscopic members is Δz/(1 + zspec) = −0.011 ± 0.051.
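The quality of the spectroscopic association, Q, and the redshift consistency criterion defined above can be written compactly. The Python sketch below is illustrative only, with hypothetical function names and made-up input values.

```python
def spec_quality(probabilities):
    """Q = sum(P_i) / N_gal over the spectroscopic members of a group."""
    return sum(probabilities) / len(probabilities)


def is_consistent(z_amico, spec_redshifts, tolerance=0.03):
    """Check |dz| / (1 + z_spec) < tolerance, with z_spec the mean spec-z."""
    z_spec = sum(spec_redshifts) / len(spec_redshifts)
    return abs(z_amico - z_spec) / (1.0 + z_spec) < tolerance


# Example: three spectroscopic members of one detection.
q = spec_quality([0.9, 0.7, 0.8])             # about 0.8
ok = is_consistent(1.00, [1.02, 1.04, 1.03])  # True: |dz|/(1 + z_spec) ~ 0.015
```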

However, even if a significant mean bias characterizing the sample is not visible, individual detections may show an inconsistency between the redshift assigned by AMICO and the one based on the available spectroscopic members. Therefore, we studied and marked with a dedicated flag all the detections that have a spectroscopic inconsistency of redshifts. For this analysis, we used only groups with at least 4 spectroscopic members or with spectroscopic quality Q > 95%. The flag is assigned to all detections that have relative redshift scatter, |Δz|/(1 + zspec) > 0.15, which is the same limit chosen to compute outlier fractions for galaxies in the COSMOS-Web galaxy catalog (see e.g., Casey et al. 2023; Shuntov et al. 2025). Only 30 detections are affected by spectroscopic mismatch as just described. This confirms the already assessed good photo-z quality of COSMOS-Web data used for this group search (Shuntov et al. 2025; Arango-Toro et al. 2025). In this work, we also leveraged the availability of a large compilation of spec-zs to test the AMICO algorithm on a hybrid galaxy sample using also the spectroscopic redshifts, when available. This application is described in Appendix B.

4.3. Detection flags

Besides flagging the detections in our catalog without spectroscopic members and those with spectroscopic members with redshift not compatible with the photometric one, we added useful flags to help filter the sample depending on the study that has to be performed with this catalog. Our flagging system is based on a list of base-2 flag bits referring to the following properties, ordered starting from the least severe:

  1. lack of spectroscopic members

  2. less than 3 arcmin from a border edge

  3. X-ray projection or proximity flag (see footnote 3)

  4. masked fraction larger than 25%

  5. central X-ray selected AGN (obtained via matching with the X-ray sources in COSMOS-Web or COSMOS2020)

  6. spectroscopic mismatch (see Sect. 4.2)

  7. low intrinsic richness (λ⋆ < 5).

In our detection catalog, we introduced the flag “DETECTION_FLAG”, which represents the sum of all these detection flag bits. Thus, if more than one criterion applies, the flags are summed together; for example, a group without spectroscopic members and with λ⋆ = 3 will be flagged with 65 (1 + 64, the first and seventh bits). According to this flagging system, around 67% of the entire sample down to S/N = 6.0 is flagged with a value smaller than 20, which is an example of how to select robust detections based on their individual properties.
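A short Python sketch shows how such a summed flag can be decoded. The member names are hypothetical, and the bit values (1, 2, 4, ..., 64 in the order of the list above) are an assumption that is consistent with the example value of 65 given in the text.

```python
from enum import IntFlag


class DetectionFlag(IntFlag):
    """Assumed bit values, ordered as in the list of flagged properties."""
    NO_SPEC_MEMBERS = 1
    NEAR_EDGE = 2
    XRAY_PROJECTION = 4
    MASKED_FRACTION = 8
    CENTRAL_AGN = 16
    SPEC_MISMATCH = 32
    LOW_RICHNESS = 64


def decode(value):
    """Return the names of all flag bits set in an integer flag value."""
    return [flag.name for flag in DetectionFlag if value & flag]


# A group without spectroscopic members and with low intrinsic richness:
combined = int(DetectionFlag.NO_SPEC_MEMBERS | DetectionFlag.LOW_RICHNESS)
names = decode(combined)
```

Because the bits are ordered by severity, a simple numerical cut on the summed value excludes every detection carrying one of the more severe flags, while a bitwise test selects or rejects individual properties.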

5. Candidates at z ≥ 2

The depth of this catalog and the availability of high-quality photometric redshifts make this application ideal for testing the AMICO algorithm in the relatively unexplored regime of high-redshift groups, protogroups, and protoclusters.

5.1. COSMOS protocluster compilation

In order to benchmark our results in this challenging regime, we first collected literature about structures already known in the COSMOS field. We created a compilation of all currently known clusters, groups, protoclusters, or protocluster cores in the range 2.0 ≲ z ≲ 4.0, which is the interval we are interested in for this comparison.

The collected detections, with their references and properties such as position and redshift, are summarized in Table 1. It should be noted that for this application of the AMICO algorithm to the COSMOS-Web field, we made use of a cluster model, as described in Sect. 3, so we are relying on our knowledge of how galaxies are distributed in clusters and groups at low redshift, which might not be what we actually see at z > 1.5. According to currently favored scenarios for cluster formation, present-day galaxy clusters and groups are the outcome of the assembly, growth, and maturation of protocluster cores (see e.g., Shimakawa et al. 2018). In this high-z group search, we expect the algorithm to detect cores or possibly virialized substructures of protoclusters, which may extend for several tens of Mpc. For this reason, to perform a consistent comparison of our high-z sample with the known structures in the literature, we considered peaks and substructures inside extended protoclusters as individual objects. For example, the 10 density peaks of Elentari (Forrest et al. 2023) and the 7 peaks of Hyperion (Cucciati et al. 2018) are counted as 17 different objects in our protocluster compilation (see Table 1). In this compilation, we included not only objects detected as overdensities of photometric or spectroscopic redshifts (e.g., Chiang et al. 2014; Diener et al. 2013; Sarron & Conselice 2021), but also objects discovered with several other approaches and methods, using, for example, the emission from distant galaxies or radio-galaxies (e.g., Geach et al. 2012; Castignani et al. 2014; Daddi et al. 2022), the mapping of the Lyman-α forest (e.g., Lee et al. 2014), and the group X-ray emission (e.g., Wang et al. 2016; Gozaliasl et al., in prep.). Since this protocluster compilation might be used as a reference also in other studies beyond the comparison in this work, we included all detections in the COSMOS field, and not only in the COSMOS-Web portion of it.
In case the object falls outside the COSMOS-Web field, we make that explicit in Table 1. Additionally, we did not perform internal matching to discard possible double detections, since it is not straightforward to define the spatial limits of protoclusters. However, if the discovery of a structure is generally attributed to more than one work, we report the corresponding references (see Col. 6 of Table 1).

5.2. Our candidates in the COSMOS-Web field

We ran a three-dimensional matching between our catalog and the compilation, allowing an AMICO detection to have multiple counterparts in the compilation, given that we did not discard objects potentially detected multiple times, as previously mentioned. The matching was run as follows: for each AMICO detection, we assigned a successful match to every known object that lies within a cylinder of radius dr = 1.0 Mpc/h and redshift depth dz = 0.05(1 + z). These values are similar to the tolerances chosen in the literature for matching clusters/groups at very high redshift (see e.g., Sarron & Conselice 2021). In this procedure, we used an a priori sorting of the AMICO catalog by (S/N)nocl. In the last column of Table 1, the matched protoclusters are indicated with a checkmark. Whenever the reference detection is not a single object but a catalog or a structure with multiple peaks, we indicate next to the checkmark the percentage of matched peaks or detections over the total number of objects falling in the available area according to our visibility mask (described in Sect. 2.3). We found 205 of our groups to be compatible with protoclusters or substructures of protoclusters at z ≥ 2 already known in the literature, such as the Hyperion protocluster structure (Cucciati et al. 2018; Casey et al. 2015; Wang et al. 2016), Elentari (Forrest et al. 2023), CC2.2 (Darvish et al. 2020), and others. We identified a total of 316 new high-z objects (see footnote 4) with 2 ≤ z ≤ 3.7. These are interesting candidate protocluster cores and protogroups that do not match any of the structures we collected from the literature in the field. Two examples of new high-z detections are shown in Fig. 13: one object at z = 2.47 and one at z = 3.40.

Fig. 13.

Two examples of high-z detections not known in the literature. Stamps are JWST color-composite images, annotations are the same as described in the caption of Fig. 6. Left: Candidate at z = 2.47 with λ ∼ 11. Right: Candidate at z = 3.40 with λ ∼ 25.

As mentioned above, at such high redshifts, the groups we detected with AMICO might be cores or density peaks of more extended protoclusters. For this reason, we explored the possibility of identifying protoclusters made of multiple AMICO detections by applying a 3D clustering algorithm. We treated our high-z detections as points with 3D coordinates and performed a clustering analysis with the scikit-learn implementation of Density-Based Spatial Clustering of Applications with Noise (DBSCAN; Ester et al. 1996). We used a minimum of two members per cluster, to allow any cluster size down to pairs of detections, and a clustering scale of 0.05 deg as angular separation and 0.02(1 + z) as maximum redshift separation. This analysis resulted in 111 potential large-scale protoclusters, with a maximum size of 14 detections in the same object. We show the largest of these structures in Fig. 14. We named it AmicOne, the first and largest protocluster candidate detected with AMICO. This structure is formed by the clustering of 14 different cores, detected in the range 2.5 ≤ z ≤ 3.09, and it includes two groups consistent with those (IDs 37 and 47 in our compilation, in Table 1) detected by Diener et al. (2013). The structure is about 20 cMpc across on the plane of the sky at z = 2.65 and it extends over a wide range of redshifts, equivalent to ∼600 cMpc.
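With a minimum of two members per cluster (in scikit-learn, `min_samples=2`, where the point itself is counted), every point with at least one neighbor within the linking scale is a core point, so DBSCAN reduces to a friends-of-friends linking. The dependency-free Python sketch below illustrates this special case. It is not the analysis code: the way the angular and redshift scales are combined (both conditions must hold, with the mean redshift of the pair in the (1 + z) factor) and the flat-sky approximation are assumptions made for the example.

```python
import math


def linked(a, b, dtheta=0.05, dz=0.02):
    """Assumed linking criterion for two detections given as (ra, dec, z).

    Angles are in degrees; a flat-sky approximation is used, which is
    adequate only for a small field.
    """
    dec_mid = math.radians(0.5 * (a[1] + b[1]))
    dra = (a[0] - b[0]) * math.cos(dec_mid)
    ddec = a[1] - b[1]
    z_mid = 0.5 * (a[2] + b[2])
    close_on_sky = math.hypot(dra, ddec) <= dtheta
    close_in_z = abs(a[2] - b[2]) <= dz * (1.0 + z_mid)
    return close_on_sky and close_in_z


def friends_of_friends(points):
    """Return clusters (lists of indices) containing at least two points."""
    parent = list(range(len(points)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if linked(points[i], points[j]):
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return [members for members in groups.values() if len(members) >= 2]
```

Since linking is transitive, two detections that are farther apart than the linking scale can still end up in the same structure through intermediate detections, which is how a structure can span a redshift range much wider than 0.02(1 + z).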

Fig. 14.

AmicOne protocluster candidate, consisting of fourteen cores (AMICO detections) with z ∈ [2.5, 3.09] (black points) clustered together in the same large-scale structure, according to our clustering analysis. In the background, the amplitude map returned by AMICO at the redshift slice of the detection marked by the green circle, namely at z = 2.65. The detections (black points) span a range of redshifts, which is why they do not all lie on an amplitude peak in this slice; each corresponds to an amplitude peak at its own redshift. The white crosses mark the position of the protogroups found by Diener et al. (2013). The size of the points is proportional to the redshift.

6. Conclusions

We produced a deep galaxy group catalog based on the new COSMOS-Web photometric galaxy catalog down to F150W < 27.3. The detection procedure was performed with the AMICO algorithm, a widely tested cluster detector currently used in Euclid, KiDS, and other major surveys. The group search was performed over an effective area of 0.45 deg² and up to z = 3.7, covering the transition from protoclusters to virialized objects and their maturing phase. We detected 1678 groups with a signal-to-noise ratio (S/Nnocl) larger than 6.0. For each detection in our catalog, we provided information about position, redshift, signal-to-noise ratio, mass proxies (like amplitude, A, and intrinsic richness, λ⋆), masked fraction, and flags indicating detection features, such as spectroscopic counterparts, presence of AGNs, and spectroscopic confirmation of redshift. In addition to the detection catalog, we also provide a list of members with their association probability and the information needed to estimate statistical membership.

We evaluated the purity of our sample against realistic mocks generated with the SinFoniA algorithm. This method allows for the creation of mock catalogs, capturing the complexity of real data, without relying on any a priori assumption. We found the relationship between purity and S/Nnocl, which is the detection property that yields the most redshift-independent relation with purity. This makes it possible to establish cuts of the original catalog based on the desired purity level. For example, we found that a cut at S/Nnocl ∼ 10 would correspond to selecting a sample of around 670 objects with purity above 90%.

We leveraged the successful application to COSMOS data up to z = 2 presented in Toni et al. (2024) and the good quality of the photometric redshift and depth of the COSMOS-Web survey to create this deep catalog of galaxy groups and to explore the possibility of detecting protocluster cores with AMICO at z ≥ 2. We successfully detected 316 new objects at z ≥ 2 and 205 compatible with being part of protoclusters and high-z groups known in the literature. To perform this comparative analysis, we created a compilation of known objects at z > 2 in COSMOS and matched them with our detections. In total, our catalog contains 509 candidate groups and protocluster cores in the range 2 ≤ z ≤ 3.7, of which around 400 are not isolated but lie within 111 Mpc-scale structures we found by performing a 3D clustering analysis with a minimum cluster size of 2, with the DBSCAN algorithm. Among these structures, the one including the highest number of substructures (14 cores) is a large protocluster candidate found at redshift z ∼ 2.5 − 3.0 and we have assigned it the name AmicOne.

Such a deep and well-characterized sample of groups, extending over a wide range of richness and redshift, is an important resource for the study of group and cluster assembly and evolution, from high-z cores to the massive objects we observe today. Additionally, it makes it possible to study several aspects of galaxy populations in different environments. For instance, it provides information about the numerous but relatively unexplored population of low-mass groups, and about galaxies in the outskirts of groups and clusters.

Data availability

The group and member catalog is available at the CDS via anonymous ftp to cdsarc.cds.unistra.fr (130.79.128.5) or via https://cdsarc.cds.unistra.fr/viz-bin/cat/J/A+A/697/A197


1

Photometric redshift performance is estimated in terms of the σNMAD indicator, that is σNMAD = 1.48 × median[(|Δz − median(Δz)|)/(1 + zspec)], where Δz = zphot − zspec, the difference between photometric and spectroscopic redshift. Galaxies are defined as outliers if |Δz|> 0.15(1 + zspec).

2

However, we allowed matching with masked detections if these fall within the tolerance of the matching.

3

In the same way as it was done in Toni et al. (2024) to clean the sample for the calibration of scaling relations, this flag marks detections within 0.02 deg from the center of a rich detection or lying on their extended X-ray emission.

4

During the revision phase of this paper, a catalog of overdensity peaks was published by Hung et al. (2025). This catalog includes objects compatible with ∼49% of our new detections. The overall matching rate is ∼93%.

Acknowledgments

We acknowledge the contribution of the COSMOS collaboration, consisting of more than 200 scientists. More information about the COSMOS survey can be found at https://cosmos.astro.caltech.edu/. This work was made possible by utilizing the CANDIDE cluster at the Institut d’Astrophysique de Paris. The cluster was funded through grants from the PNCG, CNES, DIM-ACAV, the Euclid Consortium, and the Danish National Research Foundation Cosmic Dawn Center (DNRF140). It is maintained by Stephane Rouberol. LM acknowledges the financial contribution from the PRIN-MUR 2022 20227RNLY3 grant “The concordance cosmological model: stress-tests with galaxy clusters” supported by Next Generation EU and from the grant ASI n. 2024-10-HH.0 “Attività scientifiche per la missione Euclid – fase E”. This research was supported through the Visiting Scientist program of the International Space Science Institute (ISSI) in Bern. MF is supported by the European Union’s Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie grant agreement No 101148925. GC acknowledges the support from the Next Generation EU funds within the National Recovery and Resilience Plan (PNRR), Mission 4 – Education and Research, Component 2 – From Research to Business (M4C2), Investment Line 3.1 – Strengthening and creation of Research Infrastructures, Project IR0000012 – “CTA+ – Cherenkov Telescope Array Plus”. ET acknowledges funding from the HTM (grant TK202), ETAg (grant PRG1006) and the EU Horizon Europe (EXCOSM, grant No. 101159513).

References

  1. Alberts, S., Pope, A., Brodwin, M., et al. 2016, ApJ, 825, 72
  2. Allen, S. W., Evrard, A. E., & Mantz, A. B. 2011, ARA&A, 49, 409
  3. Andreon, S., Newman, A. B., Trinchieri, G., et al. 2014, A&A, 565, A120
  4. Arango-Toro, R. C., Ilbert, O., Ciesla, L., et al. 2025, A&A, 696, A159
  5. Arnouts, S., Moscardini, L., Vanzella, E., et al. 2002, MNRAS, 329, 355
  6. Ata, M., Lee, K.-G., Vecchia, C. D., et al. 2022, Nat. Astron., 6, 857
  7. Balogh, M. L., van der Burg, R. F. J., Muzzin, A., et al. 2021, MNRAS, 500, 358
  8. Bamford, S. P., Nichol, R. C., Baldry, I. K., et al. 2009, MNRAS, 393, 1324
  9. Baxter, D. C., Cooper, M. C., Balogh, M. L., et al. 2023, MNRAS, 526, 3716
  10. Bellagamba, F., Maturi, M., Hamana, T., et al. 2011, MNRAS, 413, 1145
  11. Bellagamba, F., Roncarelli, M., Maturi, M., & Moscardini, L. 2018, MNRAS, 473, 5221
  12. Bertin, E., Schefer, M., Apostolakos, N., et al. 2022, Astrophysics Source Code Library [record ascl:2212.018]
  13. Bianconi, M., Smith, G. P., Haines, C. P., et al. 2018, MNRAS, 473, L79
  14. Bond, J. R., Kofman, L., & Pogosyan, D. 1996, Nature, 380, 603
  15. Boselli, A., & Gavazzi, G. 2006, PASP, 118, 517
  16. Brodwin, M., Stanford, S. A., Gonzalez, A. H., et al. 2013, ApJ, 779, 138
  17. Bruzual, G., & Charlot, S. 2003, MNRAS, 344, 1000
  18. Capak, P., Abraham, R. G., Ellis, R. S., et al. 2007a, ApJS, 172, 284
  19. Capak, P., Aussel, H., Ajiki, M., et al. 2007b, ApJS, 172, 99
  20. Casey, C. M., Cooray, A., Capak, P., et al. 2015, ApJ, 808, L33
  21. Casey, C., Kartaltepe, J., & Cosmos-Web 2022, BAAS, 240, 203.02
  22. Casey, C. M., Kartaltepe, J. S., Drakos, N. E., et al. 2023, ApJ, 954, 31
  23. Castignani, G., Chiaberge, M., Celotti, A., Norman, C., & De Zotti, G. 2014, ApJ, 792, 114
  24. Castignani, G., Combes, F., Salomé, P., et al. 2019, A&A, 623, A48
  25. Catinella, B., Schiminovich, D., Cortese, L., et al. 2013, MNRAS, 436, 34
  26. Chartab, N., Mobasher, B., Shapley, A. E., et al. 2021, ApJ, 908, 120
  27. Chiang, Y.-K., Overzier, R., & Gebhardt, K. 2014, ApJ, 782, L3
  28. Chiang, Y.-K., Overzier, R. A., Gebhardt, K., et al. 2015, ApJ, 808, 37
  29. Civano, F., Marchesi, S., Comastri, A., et al. 2016, ApJ, 819, 62
  30. Coupon, J. 2018, Astrophysics Source Code Library [record ascl:1802.002]
  31. Coupon, J., Czakon, N., Bosch, J., et al. 2018, PASJ, 70, S7
  32. Cucciati, O., Zamorani, G., Lemaux, B. C., et al. 2014, A&A, 570, A16
  33. Cucciati, O., Lemaux, B. C., Zamorani, G., et al. 2018, A&A, 619, A49
  34. Daddi, E., Valentino, F., Rich, R. M., et al. 2021, A&A, 649, A78
  35. Daddi, E., Rich, R. M., Valentino, F., et al. 2022, ApJ, 926, L21
  36. Darvish, B., Sobral, D., Mobasher, B., et al. 2014, ApJ, 796, 51
  37. Darvish, B., Mobasher, B., Sobral, D., et al. 2016, ApJ, 825, 113
  38. Darvish, B., Scoville, N. Z., Martin, C., et al. 2020, ApJ, 892, 8
  39. Despali, G., Giocoli, C., Bonamigo, M., Limousin, M., & Tormen, G. 2017, MNRAS, 466, 181
  40. Diener, C., Lilly, S. J., Knobel, C., et al. 2013, ApJ, 765, 109
  41. Diener, C., Lilly, S. J., Ledoux, C., et al. 2015, ApJ, 802, 31
  42. Dong, C., Lee, K.-G., Ata, M., Horowitz, B., & Momose, R. 2023, ApJ, 945, L28
  43. Dressler, A. 1980, ApJ, 236, 351
  44. Edward, A. H., Balogh, M. L., Bahé, Y. M., et al. 2024, MNRAS, 527, 8598
  45. Eke, V. R., Baugh, C. M., Cole, S., et al. 2004, MNRAS, 348, 866
  46. Ester, M., Kriegel, H. P., Sander, J., & Xu, X. 1996, Second International Conference on Knowledge Discovery and Data Mining (KDD’96), 226
  47. Euclid Collaboration (Adam, R., et al.) 2019, A&A, 627, A23
  48. Finoguenov, A., Guzzo, L., Hasinger, G., et al. 2007, ApJS, 172, 182
  49. Forrest, B., Lemaux, B. C., Shah, E., et al. 2023, MNRAS, 526, L56
  50. Franck, J. R., & McGaugh, S. S. 2016, ApJ, 833, 15
  51. Gaia Collaboration (Brown, A. G. A., et al.) 2018, A&A, 616, A1
  52. Gaspari, M., Brighenti, F., D’Ercole, A., & Melioli, C. 2011, MNRAS, 415, 1549
  53. Geach, J. E., Sobral, D., Hickox, R. C., et al. 2012, MNRAS, 426, 679
  54. George, M. R., Leauthaud, A., Bundy, K., et al. 2011, ApJ, 742, 125
  55. George, M. R., Leauthaud, A., Bundy, K., et al. 2012, ApJ, 757, 2
  56. Giodini, S., Pierini, D., Finoguenov, A., et al. 2009, ApJ, 703, 982
  57. Gozaliasl, G., Finoguenov, A., Tanaka, M., et al. 2019, MNRAS, 483, 3545
  58. Gozaliasl, G., Finoguenov, A., Babul, A., et al. 2024, A&A, 690, A315
  59. Gunn, J. E., & Gott, J. R., III 1972, ApJ, 176, 1
  60. Hasinger, G., Cappelluti, N., Brunner, H., et al. 2007, ApJS, 172, 29
  61. Hasinger, G., Capak, P., Salvato, M., et al. 2018, ApJ, 858, 77
  62. Hausman, M. A., & Ostriker, J. P. 1978, ApJ, 224, 320
  63. Hennig, C., Mohr, J. J., Zenteno, A., et al. 2017, MNRAS, 467, 4015
  64. Hung, D., Lemaux, B. C., Cucciati, O., et al. 2025, ApJ, 980, 155
  65. Ilbert, O., Arnouts, S., McCracken, H. J., et al. 2006, A&A, 457, 841
  66. Ilbert, O., Capak, P., Salvato, M., et al. 2009, ApJ, 690, 1236
  67. Ilbert, O., McCracken, H. J., Le Fèvre, O., et al. 2013, A&A, 556, A55
  68. Ilbert, O., Arnouts, S., Le Floc’h, E., et al. 2015, A&A, 579, A2
  69. Ito, K., Tanaka, M., Valentino, F., et al. 2023, ApJ, 945, L9
  70. Kartaltepe, J. S., Sanders, D. B., Le Floc’h, E., et al. 2010, ApJ, 721, 98
  71. Kartaltepe, J. S., Sanders, D. B., Silverman, J. D., et al. 2015, ApJ, 806, L35
  72. Kashino, D., Silverman, J. D., Sanders, D., et al. 2019, ApJS, 241, 10
  73. Khostovan, A. A., Kartaltepe, J. S., Salvato, M., et al. 2025, ArXiv e-prints [arXiv:2503.00120]
  74. Kiyota, T., Ando, M., Tanaka, M., et al. 2025, ApJ, 980, 104
  75. Knobel, C., Lilly, S. J., Iovino, A., et al. 2012, ApJ, 753, 121
  76. Koekemoer, A. M., Aussel, H., Calzetti, D., et al. 2007, ApJS, 172, 196
  77. Kotulla, R., Fritze, U., Weilbacher, P., & Anders, P. 2009, MNRAS, 396, 462
  78. Koyama, Y., Polletta, M. D. C., Tanaka, I., et al. 2021, MNRAS, 503, L1
  79. Kukstas, E., Balogh, M. L., McCarthy, I. G., et al. 2023, MNRAS, 518, 4782
  80. Laigle, C., McCracken, H. J., Ilbert, O., et al. 2016, ApJS, 224, 24
  81. Laigle, C., Pichon, C., Arnouts, S., et al. 2018, MNRAS, 474, 5437
  82. Laureijs, R., Amiaux, J., Arduini, S., et al. 2011, ArXiv e-prints [arXiv:1110.3193]
  83. Lee, K.-G., Hennawi, J. F., Stark, C., et al. 2014, ApJ, 795, L12
  84. Lee, K.-G., Hennawi, J. F., White, M., et al. 2016, ApJ, 817, 160
  85. Lietzen, H., Tempel, E., Heinämäki, P., et al. 2012, A&A, 545, A104
  86. Lilly, S. J., Le Fèvre, O., Renzini, A., et al. 2007, ApJS, 172, 70
  87. Lilly, S. J., Le Brun, V., Maier, C., et al. 2009, ApJS, 184, 218
  88. Lovisari, L., Ettori, S., Gaspari, M., & Giles, P. A. 2021, Universe, 7, 139
  89. Mandelbaum, R., Seljak, U., Cool, R. J., et al. 2006, MNRAS, 372, 758
  90. Maturi, M., Meneghetti, M., Bartelmann, M., Dolag, K., & Moscardini, L. 2005, A&A, 442, 851
  91. Maturi, M., Bellagamba, F., Radovich, M., et al. 2019, MNRAS, 485, 498
  92. Maturi, M., Finoguenov, A., Lopes, P. A. A., et al. 2023, A&A, 678, A145
  93. McCarthy, I. G., Schaye, J., Ponman, T. J., et al. 2010, MNRAS, 406, 822
  94. McConachie, I., Wilson, G., Forrest, B., et al. 2022, ApJ, 926, 37
  95. McConachie, I., Wilson, G., Forrest, B., et al. 2025, ApJ, 978, 17
  96. McCracken, H. J., Milvang-Jensen, B., Dunlop, J., et al. 2012, A&A, 544, A156
  97. McGee, S. L., Balogh, M. L., Bower, R. G., Font, A. S., & McCarthy, I. G. 2009, MNRAS, 400, 937
  98. McNab, K., Balogh, M. L., van der Burg, R. F. J., et al. 2021, MNRAS, 508, 157
  99. Mellier, Y., Racca, G., & Laureijs, R. 2018, 42nd COSPAR Scientific Assembly, 42, E1.16
  100. Moneti, A., McCracken, H. J., Hudelot, W., et al. 2023, VizieR Online Data Catalog: II/373
  101. Morishita, T., Roberts-Borsani, G., Treu, T., et al. 2023, ApJ, 947, L24
  102. Morishita, T., Liu, Z., Stiavelli, M., et al. 2025, ApJ, 982, 153
  103. Navarro, J. F., Frenk, C. S., & White, S. D. M. 1997, ApJ, 490, 493
  104. Newman, A. B., Rudie, G. C., Blanc, G. A., et al. 2022, Nature, 606, 475
  105. Paul, S., John, R. S., Gupta, P., & Kumar, H. 2017, MNRAS, 471, 2
  106. Polletta, M., Soucail, G., Dole, H., et al. 2021, A&A, 654, A121
  107. Puddu, E., Radovich, M., Sereno, M., et al. 2021, A&A, 645, A9
  108. Ragagnin, A., Saro, A., Singh, P., & Dolag, K. 2021, MNRAS, 500, 5056
  109. Reeves, A. M. M., Balogh, M. L., van der Burg, R. F. J., et al. 2021, MNRAS, 506, 3364
  110. Rieke, M. J., Kelly, D. M., Misselt, K., et al. 2023, PASP, 135, 028001
  111. Salerno, J. M., Martínez, H. J., & Muriel, H. 2019, MNRAS, 484, 2
  112. Sarron, F., & Conselice, C. J. 2021, MNRAS, 506, 2136
  113. Sawicki, M., Arnouts, S., Huang, J., et al. 2019, MNRAS, 489, 5202
  114. Schechter, P. 1976, ApJ, 203, 297
  115. Scoville, N., Aussel, H., Brusa, M., et al. 2007, ApJS, 172, 1
  116. Scoville, N., Arnouts, S., Aussel, H., et al. 2013, ApJS, 206, 3
  117. Sérsic, J. L. 1963, Boletín de la Asociación Argentina de Astronomía, 6, 41
  118. Shimakawa, R., Koyama, Y., Röttgering, H. J. A., et al. 2018, MNRAS, 481, 5630
  119. Shuntov, M., Ilbert, O., Toft, S., et al. 2025, A&A, 695, A20
  120. Sillassen, N. B., Jin, S., Magdis, G. E., et al. 2024, A&A, 690, A55
  121. Silverman, J. D., Kashino, D., Sanders, D., et al. 2015, ApJS, 220, 12
  122. Smolčić, V., Novak, M., Bondi, M., et al. 2017, A&A, 602, A1
  123. Spitler, L. R., Labbé, I., Glazebrook, K., et al. 2012, ApJ, 748, L21
  124. Taamoli, S., Nezhad, N., Mobasher, B., et al. 2024, ApJ, 977, 263
  125. Tanaka, M., Finoguenov, A., Lilly, S. J., et al. 2012, PASJ, 64, 22
  126. Taniguchi, Y., Kajisawa, M., Kobayashi, M. A. R., et al. 2015, PASJ, 67, 104
  127. Toni, G., Maturi, M., Finoguenov, A., Moscardini, L., & Castignani, G. 2024, A&A, 687, A56
  128. Tully, R. B. 1987, ApJ, 321, 280
  129. Vulcani, B., Poggianti, B. M., Jaffé, Y. L., et al. 2018, MNRAS, 480, 3152
  130. Wang, T., Elbaz, D., Daddi, E., et al. 2016, ApJ, 828, 56
  131. Weaver, J. R., Kauffmann, O. B., Ilbert, O., et al. 2022, ApJS, 258, 11
  132. Wilman, D. J., Balogh, M. L., Bower, R. G., et al. 2005, MNRAS, 358, 71
  133. Wright, R. H., Sabatke, D., & Telfer, R. 2022, Proc. SPIE, 12180, 121803P
  134. Yuan, T., Nanayakkara, T., Kacprzak, G. G., et al. 2014, ApJ, 795, L20
  135. Zhang, C., Ramos-Ceja, M. E., Pacaud, F., & Reiprich, T. H. 2020, A&A, 642, A17

Appendix A: Structure of the catalog

We summarize the structure of the group catalog with the description of each column in Table A.1.

Table A.1.

Descriptions of columns for the group and member catalog.

Appendix B: AMICO run with spectroscopic redshifts

The availability of almost 100 spectroscopic surveys covering the COSMOS field and collected in the compilation by Khostovan et al. (2025) makes this an ideal chance to test cluster and group detection with AMICO, including the information coming from spectroscopic redshifts.

In Sect. 4.2, we performed a simple a posteriori association of spectroscopic counterparts to the member galaxies retrieved with the AMICO detection described in Sect. 3, which is entirely based on photometric redshifts. In this Appendix, we instead describe a group search performed with AMICO using not only photometric redshifts but also spectroscopic redshifts when available. First, we associated spectroscopic redshifts with the full cleaned input galaxy catalog. This yielded spectroscopic redshifts for almost 20,000 galaxies, that is, 5% of the initial input catalog. We kept all the remaining galaxies with their photo-z only, thus creating a hybrid input catalog: for galaxies with a spec-z, we replaced the photometric redshift with the spectroscopic value and assigned a 1σ error compatible with the chosen AMICO redshift resolution, namely Δz = 0.01(1 + z). We then ran the AMICO detection on the hybrid catalog with the same parameters described in Sect. 3.

This run resulted in a catalog of 1559 candidate clusters and groups in the range 0.03 ≤ z ≤ 3.7 and with λ > 2. Among these, 595 groups are detected with S/Nnocl > 10.0. We compared the results of this run with those of the run based only on photo-z by matching the two catalogs within dz = 0.05(1 + z) and dr = 1 Mpc/h. For simplicity, we refer to the standard catalog described in Sect. 4 as PhotoCat and to the results of the run with hybrid redshifts as SpecCat. The comparison resulted in 1501 successfully matched candidates in total, of which 1404 had 0.03 ≤ z ≤ 3.7 and λ > 2, that is, ∼90% and ∼83% of SpecCat and PhotoCat, respectively.

We then looked for unmatched detections whose redshift changed significantly after introducing the spec-zs but that are still detected above the minimum signal-to-noise. To do so, we matched the unmatched detections from SpecCat and PhotoCat within a sky separation of 1 arcmin, without using the redshift. There are 62 detections in the considered redshift range that are compatible with having the same position on the sky but a different redshift. We then compared the assigned members to remove random matches. Among the 62 matches, we found only 4 detections that share more than 5 associations with P > 20% and whose redshift changed by more than 0.07(1 + z), so we attribute most of the 62 matches to random matches.

Finally, we analyzed the possibility of redshift fragmentation, namely the possibility that a detection with a sufficient number of members with spectroscopic redshifts and a sufficient number of members with only photometric redshifts is split into its spectroscopic and photometric components, resulting in two distinct detections aligned along the line of sight. This can be studied by looking for correspondences on the sky plane (without redshift information) between a pair of matched detections and an unmatched spectroscopic detection, since this would indicate that the photometric component is still successfully identified but that the spectroscopic galaxies also gave rise to a new detection at a different redshift. Among the 151 sky matches within 1 arcmin, we found 42 detections that share at least three members with P > 50%. However, for the selected pairings, the redshift scatter between the matched pair and the spectroscopic-only detection is smaller than the average photo-z error. Therefore, in our sample, redshift fragmentation does not affect the group detection. Finding several examples of matched pairs associated with a third detection in SpecCat may instead indicate that introducing spectroscopic redshifts helps to reduce over-merging, allowing the algorithm to distinguish more easily between close-by objects.
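As an illustration only, the construction of the hybrid redshifts and the matching criteria described in this Appendix can be sketched as follows. This is not the AMICO implementation: the function names are hypothetical, a missing spec-z is assumed to be flagged with NaN, the (1 + z) factor in the redshift tolerance is assumed to be evaluated at the redshift of the first catalog, and the projected separation in Mpc/h is assumed to be computed elsewhere with the adopted cosmology.

```python
import numpy as np


def build_hybrid_redshifts(z_phot, z_phot_err, z_spec, rel_err=0.01):
    """Use the spec-z where available (NaN = missing, an assumption of this
    sketch) and the photo-z otherwise. Galaxies with a spec-z get a 1-sigma
    error of rel_err * (1 + z), i.e. the quoted AMICO redshift resolution."""
    z_phot = np.asarray(z_phot, dtype=float)
    z_phot_err = np.asarray(z_phot_err, dtype=float)
    z_spec = np.asarray(z_spec, dtype=float)
    has_spec = np.isfinite(z_spec)
    z = np.where(has_spec, z_spec, z_phot)
    z_err = np.where(has_spec, rel_err * (1.0 + z_spec), z_phot_err)
    return z, z_err, has_spec


def is_matched(z_ref, z_other, sep_mpc_h, dz_max=0.05, dr_max=1.0):
    """Catalog matching criterion: |dz| <= dz_max * (1 + z) and projected
    separation <= dr_max [Mpc/h]. Evaluating (1 + z) at z_ref is an
    assumption of this sketch."""
    dz_ok = abs(z_other - z_ref) <= dz_max * (1.0 + z_ref)
    return bool(dz_ok and sep_mpc_h <= dr_max)


def angular_sep_arcmin(ra1, dec1, ra2, dec2):
    """Angular separation in arcmin (haversine formula); inputs in degrees.
    Used for the redshift-independent sky match within 1 arcmin."""
    ra1, dec1, ra2, dec2 = np.radians([ra1, dec1, ra2, dec2])
    h = (np.sin((dec2 - dec1) / 2.0) ** 2
         + np.cos(dec1) * np.cos(dec2) * np.sin((ra2 - ra1) / 2.0) ** 2)
    return np.degrees(2.0 * np.arcsin(np.sqrt(h))) * 60.0


if __name__ == "__main__":
    # Two toy galaxies: the first has a spec-z, the second only a photo-z.
    z, z_err, has_spec = build_hybrid_redshifts(
        [0.50, 1.20], [0.03, 0.08], [0.52, np.nan]
    )
    print(z, z_err, has_spec)
    print(is_matched(1.00, 1.08, sep_mpc_h=0.5))
    print(angular_sep_arcmin(150.0, 0.0, 150.0 + 1.0 / 60.0, 0.0) <= 1.0 + 1e-9)
```

For a full catalog, the pairwise loop implied by `is_matched` would normally be replaced by a tree-based neighbor search; the scalar form is kept here only to make the two tolerances explicit.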

All Tables

Table 1.

Compilation of protoclusters and high-redshift clusters and groups known in the literature for the COSMOS field.

Table A.1.

Descriptions of columns for the group and member catalog.

All Figures

Fig. 1.

JWST RGB (F444W as r, [F150W, F277W] as g, F115W as b) color-composite image of the most massive group in the COSMOS-Web field. The JWST image is overlaid with the extended X-ray emission (pink) from the combined XMM-Newton and Chandra 0.5−2 keV wavelet-filtered image. (Credits: ESA/Webb, NASA & CSA, G. Gozaliasl, K. Virolainen, A. Koekemoer, M. Franco.)

Fig. 2.

Distribution in magnitude (top panel) and redshift up to z = 4 (bottom panel) of the cleaned galaxy catalog used as input for the group search.

Fig. 3.

Magnitude and redshift of the galaxies in the input catalog (orange density contours) and redshift evolution of the characteristic magnitude of the luminosity function, m*, for two models with different formation redshifts and star formation bursts, as shown in the legend. The model with zf = 8 and a past burst, marked by the solid purple line, is the one adopted in this work; it describes the magnitude trend better than the model shown by the solid blue line. The dashed purple line indicates the same model for m* + 1.5, which is the limit used in the definition of the intrinsic richness.

Fig. 4.

Redshift distribution (left panel) and normalized cumulative signal-to-noise distribution (right panel) for all the detections in the COSMOS-Web group catalog. In orange, we show the richest detections, with λ > 10.

Fig. 5.

Intrinsic richness, λ, for the sample of detected groups and its trend with redshift, in three different S/Nnocl bins as indicated in the plot.

Fig. 6.

Four examples of detections in the group catalog at different redshifts. Circles indicate member galaxies, color-coded by membership probability, with their redshift printed next to each circle. Purple-to-yellow contours mark the density of galaxies, and green contours the X-ray emission from the combined XMM-Newton and Chandra mosaic image in the 0.5−2 keV band. White lines delimit masked areas. The center of the group is marked with a red cross. Top left: a group at z = 0.71 with λ ∼ 78 and a central AGN (the same group shown in Fig. 1). Top right: a group at z = 1.29 with λ ∼ 14. Bottom left: a group at z = 2.22 with λ ∼ 24. Bottom right: a group at z = 3.14 with λ ∼ 28, flagged for lying on the extended X-ray emission of a low-z group and for being at the edge of the field, as visible at the top of the stamp. The JWST RGB images use the same filters as in Fig. 1.

Fig. 7.

Cumulative distribution function (CDF) of S/Nnocl in different redshift bins (color bar on the right), used to select the detections on which the mocks are based instead of applying a sharp signal-to-noise cut.

Fig. 8.

Left panel: Distribution of matched detections in the redshift scatter, |Δz|/(1 + z) – radial separation, Δr [Mpc/h] plane for an initial matching with maximum separation dz = 0.05(1 + z) and dr = 0.5 Mpc/h. Most of the matched detections are concentrated in the rectangular area at Δr < 0.2 Mpc/h and at |Δz|/(1 + z) < 0.01. Right panel: Distribution of the redshift scatter for the mock matching (dashed black line) compared to the scatter between group photo-z and group spec-z estimated as the mean of spec-members with P > 90%, in two intervals of redshift (solid purple and green lines). The distributions are normalized to 1.0 to better compare scatters.

Fig. 9.

Relative scatter of the three observables (O, i.e., λ, λ and A, from left to right) between the detected observables, Odet, and the true observables as in the mocks, (Otrue). The scatter is here expressed by ΔO = Odet − Otrue. Different colors mark different redshift bins as indicated in the plot on the right. Points with a scatter larger than 3.0 and the detection in Fig. 1 (which is much richer than the rest of the sample) are not shown for better visualization of scatters and biases.

Fig. 10.

Purity of the group sample evaluated against the mock catalog produced with SinFoniA. The four panels show the redshift dependence of (non-cumulative) purity referred to four different reference detection properties: signal-to-noise ratio without cluster contribution, with cluster contribution, amplitude, and intrinsic richness (from top to bottom).

Fig. 11.

Completeness of the group sample evaluated against the mock catalog produced with SinFoniA. The two panels show the redshift dependence of completeness referred to two different proxies of mass returned by AMICO: amplitude (top) and intrinsic richness, λ (bottom).

Fig. 12.

Relation between minimum S/Nnocl and purity of the sample. In correspondence with some reference values of purity, we report the number of selected detections.

Fig. 13.

Two examples of high-z detections not known in the literature. Stamps are JWST color-composite images, annotations are the same as described in the caption of Fig. 6. Left: Candidate at z = 2.47 with λ ∼ 11. Right: Candidate at z = 3.40 with λ ∼ 25.

Fig. 14.

AmicOne protocluster candidate, consisting of fourteen cores (AMICO detections, black points) with z ∈ [2.5, 3.09] clustered together in the same large-scale structure according to our clustering analysis. The background shows the amplitude map returned by AMICO at the redshift slice of the detection marked by the green circle, namely z = 2.65. Because the detections have different redshifts within this range, they do not all lie on a peak of this map; each one corresponds to an amplitude peak at its own redshift. White crosses mark the positions of the protogroups found by Diener et al. (2013). The size of the points is proportional to the redshift.
