Euclid preparation XXI. Intermediate-redshift contaminants in the search for z > 6 galaxies within the Euclid Deep Survey

Context. The Euclid mission is expected to discover thousands of z > 6 galaxies in three deep ﬁelds, which together will cover a ∼ 50deg 2 area. However, the limited number of Euclid bands (four) and the low availability of ancillary data could make the identiﬁcation of z > 6 galaxies challenging. Aims. In this work we assess the degree of contamination by intermediate-redshift galaxies ( z = 1–5.8) expected for z > 6 galaxies within the Euclid Deep Survey. Methods. This study is based on ∼ 176000 real galaxies at z = 1–8 in a ∼ 0.7deg 2 area selected from the UltraVISTA ultra-deep survey and ∼ 96000 mock galaxies with 25 . 3 ≤ H < 27 . 0, which altogether cover the range of magnitudes to be probed in the Euclid Deep Survey. We simulate Euclid and ancillary photometry from ﬁducial 28-band photometry and ﬁt spectral energy distributions to various combinations of these simulated data. Results. We demonstrate that identifying z > 6 galaxies with Euclid data alone will be very e ﬀ ective, with a z > 6 recovery of 91% (88%) for bright (faint) galaxies. For the UltraVISTA-like bright sample, the percentage of z = 1–5.8 contaminants amongst apparent z > 6 galaxies as observed with Euclid alone is 18%, which is reduced to 4% (13%) by including ultra-deep Rubin ( Spitzer ) photometry. Conversely, for the faint mock sample, the contamination fraction with Euclid alone is considerably higher at 39%, and minimised to 7% when including ultra-deep Rubin data. For UltraVISTA-like bright galaxies, we ﬁnd that Euclid ( I E − Y E ) > 2 . 8 and ( Y E − J E ) < 1 . 4 colour criteria can separate contaminants from true z > 6 galaxies, although these are applicable to only 54% of the contaminants as many have unconstrained ( I E − Y E ) colours. In the best scenario, these cuts reduce the contamination fraction to 1% whilst preserving 81% of the ﬁducial z > 6 sample. For the faint mock sample, colour cuts are infeasible; we ﬁnd instead that a 5 σ detection threshold requirement in at least one of the Euclid near-infrared bands reduces the contamination fraction to 25%.


Introduction
Over more than a decade now, numerous works have investigated the presence of galaxies around the epoch of re-ionisation. In particular, photometric studies of various fields have identified many galaxies at z > 6, mostly through deep Hubble Space Telescope (HST) imaging (e.g. Bouwens et al. 2010;Ellis et al. 2013;Oesch et al. 2014;Atek et al. 2015;Finkelstein et al. 2015;Bowler et al. 2020;Salmon et al. 2020;Roberts-Borsani et al. 2022). These studies provide us with clues regarding the physical nature of the objects present in the early Universe, which is of key importance for constraining the early phases of galaxy evolution.
The number densities of low-luminosity z > 6 galaxies are relatively high, enabling a search for these sources in deep, small-area surveys, such as the Cosmic Assembly Near-infrared Deep Extragalactic Legacy Survey (CANDELS; Grogin et al. 2011;Koekemoer et al. 2011). Conversely, bright z > 6 galaxies (M UV −20.5) are much rarer and, thus, are only likely to be found in wide-area surveys with reasonable depths at optical and near-infrared (NIR) wavelengths. For example, recent work from Bouwens et al. (2021) determined that the number density of z ∼ 6 galaxies with rest-frame magnitudes M 1600 ∼ −21 is 1.4 × 10 −5 Mpc −3 mag −1 , and increases by a factor of ∼ 10 3 for sources with M 1600 ∼ −17. Consequently, only a minor fraction of z > 6 galaxy studies have been devoted to exploring the bright end of the galaxy luminosity function at these high redshifts (e.g. Willott et al. 2013;Duncan et al. 2014;Bowler et al. 2015;Song et al. 2016;Stefanon et al. 2019). It is around these brightest galaxies where re-ionisation was presumably completed first (Pentericci et al. 2014;Castellano et al. 2016).
To date, the necessary combination of area and depth to search for bright z > 6 sources is only available in a few fields (McCracken et al. 2012;Jarvis et al. 2013). However, the forthcoming Euclid mission (Laureijs et al. 2011) will open up a new era in the search of such objects by mapping a large area of the sky at NIR wavelengths. In addition to its main wide survey, Euclid will perform deep observations of three so-called Euclid Deep Fields, which will encompass a total area of ∼50 deg 2 (Euclid Collaboration: Scaramella et al. 2022). Euclid carries four photometric bands: a visible imager (VIS; Cropper et al. 2016) that has an optical band I E 1 at 5500-9000 Å and the Near-Infrared Spectrometer and Photometer (NISP; Maciaszek et al. 2016) that carries three NIR bands, that is, Y E , J E , and H E , which together cover the wavelength range 9000-20 000 Å (Euclid Collaboration: Schirmer et al. 2022). The expected 5σ depths (assuming point-like sources) for the Euclid Deep Fields are I E = 28.2 and Y E J E H E ≈ 26.4 (AB magnitude). With these characteristics, the Euclid Deep Fields are expected to reveal thousands of z > 6 galaxies and therefore enable studies of early galaxy formation and evolution with unprecedented statistical significance.
Since Euclid has a limited number of photometric bands, enormous efforts are being made to provide additional, external coverage of the Euclid Deep Fields, both with ground-based facilities and the Spitzer Space Telescope (e.g. Euclid Collaboration: Moneti et al. 2022;McPartland et al. in prep). In the best-case scenario, Euclid Deep sources will have photometric coverage in at most 10-12 filters, and therefore deriving accurate photometric redshifts and galaxy physical parameters will be challenging. As such, a pre-launch critical assessment of cone-mail: mierlo@astro.rug.nl 1 Originally referred to as the VIS band within the Euclid Consortium but recently renamed the I E band. tamination in Euclid galaxy selections at different redshifts is of utmost importance.
Identifying z > 6 galaxies in particular is challenging for a number of reasons. Extreme emission line galaxies at intermediate redshifts can mimic Lyman-break galaxies due to a combination of large equivalent width emission lines and a faint continuum, therefore contaminating the selection of high-redshift objects (Atek et al. 2011;Huang et al. 2015). A second type of degeneracy arises from the blackbody spectra of cool, brown dwarf stars that have similar NIR colours to z > 6 galaxies (Stern et al. 2007;Bowler et al. 2015;Salmon et al. 2020). Finally, another main source of contamination in the selection of z > 6 sources are intermediate-redshift (z ∼ 1-6) galaxies, for which the 4000 Å break can be misidentified as the Lyman-α break at λ = 1216 Å of a high-redshift object (Vulcani et al. 2017). This latter sort of contamination is the focus of this work. We note that the study of high-redshift contaminants to intermediate-z sources, especially dusty galaxies at z = 4-6, is an interesting complementary problem, but outside the scope of this paper.
Here we make use of galaxies selected from the third data release (DR3) of the UltraVISTA ultra-deep survey (McCracken et al. 2012) and the Spitzer Matching Survey of the UltraVISTA ultra-deep Stripes (SMUVS; Ashby et al. 2018) to assess the degree of contamination produced by intermediate-redshift galaxies in the selection of z > 6 galaxies in the Euclid Deep Survey. UltraVISTA and SMUVS are uniquely suited for this simulation because of their considerable common area (∼0.66 deg 2 ) and depths (∼25.5 AB mag). However, we emphasise that this analysis is only valid for galaxies at z = 6-8 due to the limitations of the fiducial sample, and as such we cannot study the photometric redshift recovery of Euclid z > 8 galaxies.
This paper is organised as follows. In Sect. 2 we briefly describe the datasets used in this work. In Sect. 3 we describe our source catalogue construction and how the Euclid and ancillary photometry were simulated. We present our estimates on the contamination fraction of bright z > 6 galaxies in the Euclid Deep Fields in Sect. 4, together with colour selection criteria to separate intermediate-z interlopers from true z > 6 galaxies. In addition, we analyse the degree of z > 6 contamination and the effectiveness of the colour criteria for a sample of faint (25.3 ≤ H < 27.0) mock galaxies in Sect. 5. In Sect. 6 we comment on the validity of our results and finally present our concluding remarks in Sect. 7. Throughout this paper we adopt a cosmology with H 0 = 70 km s −1 Mpc −1 , Ω m = 0.3, and Ω Λ = 0.7. All magnitudes and fluxes are total, with magnitudes referring to the AB system (Oke & Gunn 1983). Stellar masses correspond to a Chabrier (2003) initial mass function (IMF).

UltraVISTA/SMUVS and non-SMUVS galaxy catalogues
As a basis to simulate Euclid (+ancillary) photometry, we use real, NIR galaxy surveys in the field of the Cosmic Evolution Survey (COSMOS; Scoville et al. 2007). Specifically, the ultradeep UltraVISTA survey (McCracken et al. 2012) has provided Y, J, H, K s images whose depth is relatively similar (∼25-26 mag) to that expected for the Euclid Deep Fields, and therefore constitutes an excellent starting point to simulate Euclid galaxies. However, given that the Euclid Deep Survey will be 1.2 magnitude deeper in the H band than the UltraVISTA survey, we create a complementary catalogue of Euclid-like faint mock galaxies from scaled-down versions of the fiducial UltraVISTA spectral energy distributions (SEDs). This process is described Article number, page 2 of 27 van Mierlo et al.: Intermediate-redshift contaminants of Euclid z > 6 galaxies in detail in Sect. 5; here we discuss the construction of the Ultra-VISTA galaxy catalogue that forms the basis for both the bright UltraVISTA-like sample and the faint mock sample.
In this work we only consider the three (out of four) UltraV-ISTA ultra-deep stripes with ultra-deep Spitzer Space Telescope (Werner et al. 2004) coverage from SMUVS . This programme used the Spitzer Infrared Array Camera (IRAC; Fazio et al. 2004) to map the three UltraVISTA ultra-deep stripes with deepest ancillary data, reaching matching depths in the 3.6 and 4.5 µm bands, over a total area of 0.66 deg 2 . Deshmukh et al. (2018; hereafter D18) presented a photometric catalogue of approximately 300 000 SMUVS sources with multi-wavelength ancillary data in COSMOS, for a total of 28 bands from Canada-France-Hawaii Telescope (CFHT) u through IRAC 4.5 µm. The SMUVS photometry has been obtained using sources detected in the UltraVISTA DR3 HK s stack mosaic as priors, and by requiring that each source has a detection in at least one of the IRAC bands. Given that the SMUVS images suffer from severe source confusion, the IRAC photometry was measured using a point-spread-function (PSF) fitting technique from the Image Reduction and Analysis Facility (IRAF), using empirical images of the PSF as constructed from stars in the field. Using this method, ∼ 95 % of all UltraVISTA sources are detected in at least one IRAC band. In addition, the IRAC photometry of sources with bright IRAC neighbours was not utilised in the SED fitting to prevent contamination in these bands from affecting the best-fit SED.
D18 derived photometric redshifts and stellar masses for all these sources, based on SED analysis, as we explain in Sect. 2.2. We refer the reader to D18 for detailed information about the SMUVS catalogue. Here we use it as a basis to obtain our Euclid simulated data.
The measured fluxes were corrected to total fluxes through point-source aperture corrections, based on the curves of growth of non-saturated stars in the field (as derived by D18). These corrections are consistent with those quoted in McCracken et al. (2012) and Laigle et al. (2016). For the Spitzer photometry, typical aperture corrections have been tabulated by Ashby et al. (2015). These authors have demonstrated that treating Spitzer sources as point-like is valid in virtually all cases at z > 2 and in a large fraction of sources at z = 1-2. Moreover, we note that our Euclid (+ancillary) photometry are simulated directly from our COSMOS photometry and therefore, the recovery fraction and contaminants of high-z sources studied in this paper are not influenced by the use of point-like photometry. This is confirmed by the results shown in Appendix A, where we repeat our z > 6 recovery tests on the COSMOS2020 catalogue (Weaver et al. 2022), which contains independent photometric measurements of the same field.
The total fluxes were subsequently corrected for Galactic extinction using the dust maps from Schlafly & Finkbeiner (2011). We used the masks from D18 to mask regions of contaminated light surrounding the brightest sources. This removes ∼ 6 % of the considered UltraVISTA region; as a consequence, our masked catalogue covers a total area of ∼ 0.7 deg 2 . Following the method outlined in Caputi et al. (2006, Fig . 1), we cleaned the non-SMUVS catalogue for Galactic stars using the Subaru SuprimeCam (i + − z + ) and UltraVISTA (J − K s ) colour diagram. Sources that have an HK s -based stellarity parameter greater than 0.8 and reside in the stellar locus were discarded from our non-SMUVS sample, where the stellar locus refers to sources that have (J − K s ) ≤ 0.1. This approach is slightly different from D18, who used the (J − [3.6]) versus (B − J) colour diagram to identify Galactic stars. Given that no IRAC 3.6 µm photometry is available for the non-SMUVS sources, we utilised this alternative colour diagnostic to clean the non-SMUVS catalogue from Galactic stars.
By including the non-SMUVS sources in our analysis, we gain approximately 19 700 additional sources. The majority of the additional sources (∼ 70 %) resides in the second ultra-deep stripe, as the northern part of it (2.°61 ≤ Dec ≤ 2.°76) is not covered by SMUVS.
Finally, for both the UltraVISTA SMUVS and non-SMUVS catalogues, we updated the UltraVISTA photometry using the DR4 mosaics to increase the sensitivity of our photometry, by running SourceExtractor on the DR4 images and matching the resulting source catalogue with our DR3 catalogue. We therefore emphasise that our final UltraVISTA catalogue consists exclusively of DR3-selected sources, of which the UltraVISTA Y, J, H, and K s bands have been updated with the DR4 photometry. Between DR3 and DR4, the 5σ limiting magnitudes in the ultra-deep stripes increase by 0.1, 0.2, and 0.1 mag in the Y, J, and H bands, respectively, while the K s band depth is unchanged.

Galaxy physical parameters obtained with SED fitting
We derived photometric redshifts and main physical parameters for all the galaxies in the general (SMUVS and non-SMUVS) UltraVISTA catalogue with updated DR4 photometry, following a similar SED fitting methodology to that applied by D18, but with a few important changes more suitable for highredshift sources, as follows. We made use of the χ 2 -fitting routine LePhare (Arnouts et al. 1999;Ilbert et al. 2006), adopting a broader set of star formation histories (SFHs) than D18, that is, a single stellar population, an exponentially declining SFH, and a delayed exponentially declining SFH. For both declining models, we used the same range of star formation timescales τ = 0.01, 0.1, 0.3, 1.0, 3.0, 5.0, 10.0, and 15 Gyr. We considered two metallicities: solar (Z = Z ) and sub-solar (Z = 0.2Z ). In total, we considered 36 templates of different combinations of SFH and metallicity. We also included empirical spectra of L, M, and T stars from the SpeX Prism Library (Burgasser 2014) to minimise the contamination of the high-z galaxy sample by dwarf stars. The effectiveness of this method in removing brown-dwarf contamination was demonstrated by Bowler et al. (2014) and Bowler et al. (2015). Finally, we used the redshift range z = 0-9 for our SED fitting, whereas D18 used the redshift range z = 0-7.
As in D18, each SED template was attenuated with the Calzetti et al. (2000) reddening law and the colour excess was left as a free parameter between E(B − V) = 0 and 1, with a step of 0.1. We ran LePhare with emission lines (the recipe based on simple scaling relations from Kennicutt 1998 between the ultraviolet (UV) luminosity and Oii line; see Ilbert et al. 2009) and multiplied the flux errors by a factor of 1.5 since, as shown by Dahlen et al. (2013), this choice improves photometric redshift estimation. We include a flat prior for the absolute magnitude in Article number, page 3 of 27 A&A proofs: manuscript no. 43950corr the Subaru r band such that −10 < M r < −26. We adopted the same treatment for non-detections as in D18 and Caputi et al. (2015), that is, we substituted them with 3σ flux upper limits in the broad bands and simply ignored them in the intermediateand narrow-band data. We then chose the option in LePhare that rejects any SED template that produces fluxes higher than the 3σ upper limits in the bands with non-detections. In order to improve the quality of the fit, we applied photometric zero-point corrections as in D18. These were derived as follows: after we obtained best-fit SEDs with LePhare, we calculated the mean offset between the observed and template fluxes in each band, which were subsequently applied to the photometric catalogue. We repeated this process until the offsets converged to obtain our final photometric redshifts. Averaged over all 28 bands, the offset between the observed and template fluxes is 0.06 mag.
We cleaned the output redshift catalogue returned by LePhare as follows: first, we removed sources that are best fit by stellar (rather than galaxy) templates. This was achieved by comparing the best-fit galaxy χ 2 ν,gal and stellar χ 2 ν,star values for any source with a HK s -based stellarity parameter greater than 0.8 (as measured with SourceExtractor from the HK s detection image); we removed these sources if χ 2 ν,star < χ 2 ν,gal or if |χ 2 ν,gal − χ 2 ν,star | < 4. Second, for all galaxies at z > 3.6, we checked if the high-redshift solution is compatible with their detection at short wavelengths, that is, we ensure galaxies with high-redshift solutions do not have flux bluewards of the Lyman break. Therefore, following Caputi et al. (2015), we discarded sources with z phot > 3.6 and a > 2σ U-band detection; or with z phot > 4.6 and a > 2σ B-band detection; or with z phot > 5.6 and a > 2σ V-band detection; or with z phot > 6.6 and a > 2σ r-band detection. To further clarify, these conditions are implemented such that any band bluewards of the Lyman break is checked, for instance, we ensured a z phot > 6.6 source also does not have significant detections in the U, B, and V bands. In addition, for z phot > 7 galaxies we do not expect any detection bluewards of the Lyman-α line, due to Lyman series absorption of neutral hydrogen in the intergalactic medium (Inoue et al. 2014). Therefore, we discarded sources with z phot > 7.0 and a > 2σ z ++ -band detection; or with z phot > 8.0 and a > 2σ Y-band detection. Lastly, for all sources with z phot ≥ 6, we performed rigorous visual inspection of their broad-band images and removed all sources that are for example contaminated by bright neighbours or appear artificial (e.g. they are aligned exactly on the diffraction spikes of bright stars).

Simulation of Euclid, Rubin-LSST, H20 survey,
and Spitzer photometry of z = 1-8 galaxies The main goal of this analysis is to assess the identification of z > 6 and z = 1-6 galaxies based on the data that are (or will be) available in the Euclid Deep Fields. Euclid will observe the sky in four photometric bands: the I E , Y E , J E and H E bands, which together cover the wavelength range 5500-20 000 Å (Euclid Collaboration: Schirmer et al. 2022). Given that one of the aims of this research is to investigate how the inclusion of external data improves the photometric redshift of Euclid sources at z > 6, we Fig. 1: Redshift distribution of the UltraVISTA DR4 z = 1-8 galaxies in the UltraVISTA ultra-deep stripes 1, 2, and 3. These galaxies constitute the fiducial intermediate (z = 1-6) and highz (z = 6-8) samples in this work. The intermediate-z and high-z samples consist of 175652 and 315 galaxies, respectively. considered additional Rubin, CFHT and Subaru Hyper Suprime Camera (HSC), and Spitzer photometry. To simulate photometry in the Euclid (+ancillary) bands, we made use of the above described SMUVS/UltraVISTA galaxy catalogue as a basis. In addition, as described in Appendix A, we repeated our analysis based on the COSMOS2020 catalogue (Weaver et al. 2022), for which all photometric measurements have been obtained in an independent manner.
In this paper we consider complementary data from the Vera Rubin C. Observatory, which will sample the two southern Euclid Deep Fields in the Legacy Survey of Space and Time (LSST; Ivezić et al. 2019). The Rubin Observatory will observe in six photometric bands, ugrizy, which span the wavelength range 3200-11 000 Å. Given that the Euclid Deep Field North cannot be observed by Rubin, we also consider the ongoing Hawaii Two-0 Survey (H20), which is currently observing the Euclid Deep Field North and Euclid Deep Field Fornax (McPartland et al., in prep). The H20 survey will consist of deep optical data in the MegaCam u band of the CFHT and the Subaru HSC g,r,i,z bands, and will be available long before the Rubin full depth mosaics. Therefore, we consider both simulated Rubinand H20-like photometry complementary to the Euclid bands. Lastly, Spitzer/IRAC observations of the Euclid Deep Fields were presented in Euclid Collaboration: Moneti et al. (2022), who combined new observations with all relevant archival IRAC data to produce very deep imaging of these fields in all four IRAC bands. Given that these Spitzer mosaics are very similar in depth to SMUVS (5σ mag = 24.8), we directly use the observed SMUVS photometry and therefore only consider the IRAC 3.6 and 4.5 µm bands. We note that our choice for including the H20 survey and IRAC bands is based on the Cosmic Dawn Survey (Toft et al., in prep), which is an ongoing effort to obtain multiwavelength imaging for the Euclid Deep Fields to depths that will match the Euclid data.
In Table 1 we provide an overview of the expected 5σ pointlike source depths, mean wavelengths and filter widths of Euclid (+ancillary) photometric bands considered in this work. Their corresponding transmission curves are shown in Fig. 2. The expected Euclid depths are taken from Euclid Collaboration: Scaramella et al. (2022), assuming that the Euclid Deep Survey will be two magnitudes deeper than the Euclid Wide Survey. For our tests we consider two different scenarios for the Rubin 5σ point source depth: one that is expected after 10 years of observing and one that is representative for the Rubin Deep Drilling Fields (DDF), which are likely to coincide with the two southern Euclid Deep Fields. We assumed approximate Rubin DDF depths from Foley et al. (2018). It is worth noting that the 5σ depths presented in Table 1 are estimates and may vary once all programmes are finalised, with the exception of the already completed Spitzer observations. We considered our UltraVISTA galaxy catalogue with fiducial z > 1 redshifts and simulated their Euclid (+ancillary) photometry by convolving their best-fit SEDs based on the 28-band COSMOS photometry with the Euclid (+ancillary) filter curves, which can be easily done with LePhare. We modelled the flux errors following separate methods for each instrument. For the Euclid and H20 photometry, we followed the method presented in LePhare to simulate magnitude errors: where m 5 is the 5σ point-source depth from Table 1 and σ m 5 is the corresponding magnitude error. In addition, we added a 0.03 systemic magnitude error to σ m in quadrature. For the Rubin photometry, we used the formulae provided in Ivezić et al. (2019). The Rubin total photometric error has both a systematic and a random contribution, with the latter being dependent on the expected 5σ depth in each band. We refer the reader to Ivezić et al. (2019) for a detailed description of the Rubin flux error prescription. For the Spitzer photometry we adopted a strategy where the fluxes are sampled from the fiducial best-fit SED (as was done for the Euclid, H20, and Rubin photometry) and the flux errors are simply set to the observed Spitzer flux errors from SMUVS, as the Spitzer observations of the Euclid Deep Fields are similar in depth to SMUVS. Given that not all sources in our UltraVISTA catalogue are Spitzer-detected, we only simulated Spitzer photometry for galaxies that have a detection in at least one IRAC band. All simulated magnitudes and magnitude errors were subsequently converted to flux space.
In total, we address eight scenarios of different combinations of Euclid, Rubin, H20, and Spitzer photometry, as summarised in Table 2. Throughout this paper, we globally refer to these combinations as Euclid (+ancillary) photometry. The number of final sources in the simulated Euclid photometric catalogues are listed in this table, where the distinction between intermediate-z and high-z galaxies is based on the fiducial redshift. We remind the reader that the number of sources in the catalogues including Spitzer photometry is lower as not all galaxies in our UltraV-ISTA DR4 catalogue are IRAC-detected. For each filter we randomised the simulated photometry of each galaxy by sampling a Gaussian distribution with mean µ equal to the modelled flux from the fiducial best-fit SED and standard deviation σ equal to the flux error, derived as explained above.
Since our simulated fluxes are directly sampled from the fiducial best-fit galaxy template, they are unaffected by exposure time limits, contrary to real, observed photometry. Therefore, to ensure our simulated photometry is realistically deep, we applied a 2σ flux limit to each filter, as derived from their expected 5σ survey depth. For Euclid, Rubin, and H20 bands with fluxes fainter than their 2σ detection limits, we adopted 2σ flux upper limits in the subsequent SED fitting process. For Spitzer fluxes fainter than the corresponding 2σ limits, we did not adopt 2σ upper limits, but rather excluded the band in the SED fitting process. We implemented this criterion because the χ 2 -minimisation technique of the SED fitting naturally has most of its weight at the longest wavelength filters and could be confused rather than helped by the presence of flux upper limits.
Article number, page 5 of 27 A&A proofs: manuscript no. 43950corr In our analysis, we consider a single realisation of the randomly simulated photometry. We produced and analysed a few other realisations, but found no significant differences in the results discussed below.

Photometric redshifts based on Euclid and ancillary data
We repeated the SED fitting of the sources with fiducial z > 1 redshifts, using LePhare again, but now considering only the simulated Euclid (+ancillary) data. We used exactly the same LePhare settings as for the UltraVISTA DR4 catalogue, described in Sect. 2.2 (the flat absolute magnitude prior is now applied to the H E band). Despite the low number of photometric bands, LePhare finds a redshift solution for > 99 % of the sources from Euclid photometry alone. We did not repeat our checks for stellar solutions or compatibility with short wavelengths, but the latter is discussed in Sect. 6.1.
We aim to illustrate how the incorporation of ancillary data improves the performance of the photometric redshift recovery. Therefore, we compare the derived redshifts (to which we refer as simulated redshifts) with the fiducial redshifts obtained from our UltraVISTA and remaining COSMOS photometry (28 bands in total) in three scenarios: Euclid, Euclid+H20, and Eu-clid+Rubin DDF. The results are shown in Fig. 3. In each panel, we identify catastrophic outliers as sources satisfying the condition where z sim is the simulated redshift and z fid the fiducial redshift. We calculated the catastrophic outlier fraction (OLF) in two redshift bins separately, namely z = 1-6 (intermediate-z) and z = 6-8 (high-z). We note that the OLF only quantifies the quality of the photometric redshifts in these fiducial redshift bins; contamination of the intermediate-z bin to the high-z bin is addressed in Sect. 4.2. From Fig. 3 it is evident that the addition of ancillary data improves the redshift estimation for both intermediate-and high-z galaxies. First, we discuss the photometric redshift quality when only the four Euclid bands are utilised. At z fid > 6, the redshifts are already quite accurate, with an OLF of 7.6 %. This is the result of the wavelength range covered by the Euclid bands, as for galaxies at z = 6-8, they sample the rest-frame UV and optical continuum, enabling the identification of z > 6 galaxies with the Lyman-break drop-out technique (Steidel et al. 1996). On the contrary, the redshift recovery at intermediate redshifts is considerably poorer. The majority (> 65 %) of the intermediate-z sample consists of z fid = 1-2 galaxies (see Fig. 1), and so the Euclid filters sample the rest-frame optical continuum redwards of the Lyman-α line. With no constraint on the Lyman break, galaxies with red UV slopes (either from dust attenuation or old age) are easily confused for higher-redshift objects. Simultaneously, young non-dusty sources at z fid = 1-2 that have a mostly flat UV and optical continuum become highly degenerate with z ∼ 5 galaxies, as a flat SED without a strong 4000 Å break can be confused for a UV-bright high-redshift object. Interestingly, only the latter type of degeneracy is clearly present in Fig. 3; z fid = 1-2 galaxies are predominately scattered between z sim = 4 and z sim = 6, with only a few sources at z sim > 7.
We find that the inclusion of deep optical data, either from Rubin observations or the H20 survey, reduces the number of catastrophic outliers, especially at intermediate redshifts. At z = Fig. 3: Comparison of the fiducial photometric redshift to the photometric redshift obtained from three combinations of Euclid and ancillary photometry. In each panel, the catastrophic outlier fractions (OLFs) are reported for two fiducial redshift bins, z = 1-6 and z = 6-8. The OLF represents the fraction of sources with |z sim − z fid |/(1 + z fid ) > 0.15. The outlier identification boundaries are indicated with solid red lines. The data points are presented as 2D histograms with bin size ∆z = 0.1. The colour bar corresponds to the number of galaxies in each bin and is the same for all panels. The solid orange line is the identity curve. Data outside the two solid red lines are identified as catastrophic outliers. Euclid Rubin ( Notes. Summary of the eight scenarios of combinations of Euclid and ancillary data considered in this work and the corresponding 2σ magnitude limits that were applied to the photometry. The number of intermediate-z and high-z galaxies that are inserted in our simulations (based on their fiducial redshift) is indicated in the last two columns.
6-8, this can be explained since the short-wavelength bands sample the spectrum bluewards and redwards of the Lyman-α break, such that the photometric redshift estimation becomes more precise. In addition, the inclusion of optical data rules out a lowredshift nature for nearly all z fid > 6 galaxies. At intermediate redshifts, we see how the inclusion of optical data strongly improves the OLF. Moreover, with data from the Rubin DDF, which constitutes the deepest optical ancillary data considered in this work, the degeneracy between z fid = 4-6 and z sim > 6 galaxies is almost completely lifted in our analysis.

Identification of z > 6 contaminants
For each Euclid (+ancillary) data scenario, we identify three populations from the redshift comparison plots shown in Fig. 3: (i) galaxies with fiducial redshifts z fid = 1-6 that stay in that same redshift bin when the photometric redshift is obtained with Euclid (+ancillary) photometry, to which we refer as the 'stable' intermediate-z galaxy population; (ii) galaxies with fiducial redshifts z fid > 6 that stay at these high redshifts when the photometric redshift is obtained with Euclid (+ancillary) photometry, which are the 'true' z > 6 galaxies; and (iii) galaxies with fiducial redshifts z fid = 1-5.8 and Euclid (+ancillary) redshifts z > 6, which we consider to be the intermediate-redshift contaminants to the high-z galaxy sample. The latter population constitutes the main subject of study in this paper. We set an upper redshift cut at a fiducial redshift z fid = 5.8 for the purpose of defining contaminants to avoid discussing sources that may end up populating the z > 6 regime simply due to a random error scattering of the photometric redshifts. Therefore, galaxies with z fid = 5.8-6 that are falsely recovered at z > 6 are discarded from our analysis, as they would constitute only 6 % the true z > 6 population (Euclid alone) and the majority are recovered at z sim = 6-6.2. Lastly, we acknowledge a fourth population consisting of galaxies with fiducial redshifts z fid > 6 that appear as intermediate-z galaxies when observed with Euclid. These sources constitute 9 % of the fiducial z = 6-8 galaxy sample (Euclid alone). The study of this population is outside the scope of this paper. For each data scenario, in Table 3 we present the number of true z > 6 galaxies, the number of z > 6 contaminants and the following fraction of contaminants amongst the apparent z > 6 galaxy population. By apparent z > 6 galaxies we mean all the galaxies assigned a photometric redshift z > 6 based on the Euclid (+ancillary) simulated photometry, independently of being truly at these redshifts or not. We also report the completeness in Table 3, which represents the percentage of fiducial z fid = 6-8 galaxies that are correctly identified as z sim > 6 sources. The missing galaxies in our reported completeness are those with fiducial z fid > 6 redshifts, but which are falsely recovered at z sim < 6 with the Euclid (+ancillary) photometry. We remind the reader that not all galaxies in our fiducial z = 1-8 galaxy sample are IRAC-detected, explaining the lower numbers of z > 6 contaminants and true z > 6 galaxies in scenarios where Spitzer data are considered.
We calculated the uncertainties in the contamination fraction by producing ten randomised flux catalogues of the z > 6 contaminants and the true z > 6 galaxies, for which we derived the photometric redshifts with LePhare. Subsequently, we assigned a probability of correct identification to each source by counting in how many realisations the source is re-identified as a contaminant, and identically for the true z > 6 sample. Using the average probability of correct identification, we calculated upper and lower limits on both the number of contaminants and the number of true z > 6 galaxies, which were propagated into an upper and lower limit on the contamination fraction. We adopt this approach as a compromise because producing ten realisa-Article number, page 7 of 27 A&A proofs: manuscript no. 43950corr

%
Notes. Number of true z > 6 galaxies (that is, galaxies at fiducial z fid = 6-8 that are recovered at z sim > 6) and z > 6 contaminants (galaxies at fiducial z fid = 1-5.8 recovered at z sim > 6), from various combinations of Euclid and ancillary data. In addition, we report the fraction of contaminants amongst the total apparent z > 6 population, for which uncertainties from ten random realisations of the contaminant and true z > 6 source photometry. We also report the completeness, which represents the percentage of fiducial z = 6-8 galaxies that are correctly identified as z > 6 sources.
tions of the entire z = 1-8 flux catalogue is too computationally expensive.
Our main findings on the contamination fraction are as follows. First, the fraction of contaminants amongst the apparent z > 6 population is already relatively low when only data from the four Euclid bands are available: 18 % of all apparent z > 6 galaxies are actually intermediate-z contaminants. In addition, the z > 6 completeness is very high in all data scenarios, even with Euclid photometry alone.
Second, for sources at the UltraVISTA depth, the inclusion of ancillary optical data produces a negligible effect in the fraction of contaminants and the z > 6 completeness level. This is because the Rubin and H20 surveys are both shallower in the optical regime than the Euclid Deep Survey (see Table 1), and as such their data are of little help in preventing intermediate-z galaxies from being misidentified as z > 6 galaxies. In fact, the contamination fractions from Euclid+Rubin and Euclid+H20 are slightly higher than that from Euclid photometry alone, although the difference is not significant within the uncertainties. We expect the H20 data to perform better in the redshift recovery compared to Rubin, even though the latter includes the additional y band coverage. This can be explained as the H20 data are slightly deeper, especially in the z band.
Only with the ultra-deep photometry from the Rubin DDF does the contamination fraction improve, as the Rubin DDF r and i data will be 0.3 and 0.1 mag deeper than the I E photometry. Clearly, the Rubin DDF data provide such stringent constraints on the photometric depth that even the faintest intermediate-z galaxies in our sample cannot be confused for high-redshift sources. However, we note that this is the most idealised scenario we consider in this paper, and only with these data can the contamination fraction be taken to very low levels. As a safety measure, we tested the scenario where we combine simulated Rubin DDF and Spitzer data, without including Euclid photometry. In this case, we find a contamination fraction of 0.08 and a completeness of 92 %. This demonstrates that whereas the Rubin DDF photometry is incredibly effective at reducing the degree of contamination, Euclid photometry is essential to achieve virtually no contamination.
Third, Spitzer data are moderately helpful in preventing the incidence of intermediate-redshift contaminants to the z > 6 sample. The majority of contaminants are at z fid = 4-6, and produce redder fiducial (H − [3.6]) contaminants than actual z > 6 galaxies. Hence, IRAC detections enable one to distinguish between a red SED slope from intermediate-z interlopers and a flat SED slope that one would observe for young galaxies at high redshifts. However, using Euclid+Spitzer data, the contamination fraction is still 0.13, so the improvement is marginal compared to the Euclid+Rubin DDF scenario. We believe this is mostly due to the typical uncertainty of IRAC fluxes, given that Spitzer sources are severely blended in crowded fields such as COSMOS. In addition, we investigate the signal-to-noise ratio (S/N) in the simulated IRAC bands and conclude that 67 % of the contaminants identified from Euclid+Spitzer data have a F ν /σ F ν ≥ 5 detection in both IRAC bands.
It is possible that the particular de-blending treatment used to obtain IRAC photometry (see Deshmukh et al. 2018 for a detailed overview) leads to a slight underestimation of the flux errors. Given that in this scenario redshifts are based on only six bands, any uncertainties in the IRAC photometry carry more weight in the SED fitting as compared to the fit on the fiducial, 28-band photometry. We note that the limited effectiveness of the IRAC photometry is independent of our specific method for measuring the IRAC fluxes; when we estimate the contamination fractions using the COSMOS2020 catalogue as presented in Appendix A, for which the IRAC photometry was derived in a completely independent manner, we find it has essentially no effect on the contamination fraction.
Finally, combining Euclid photometry with both optical and infrared data yields the best contamination fractions; evidently, with more photometric bands available for the SED fitting, the redshift recovery steadily improves. In the Euclid+Rubin DDF+Spitzer scenario, the deep, 11-band photometry is highly successful at correctly identifying z > 6 galaxies, and so contamination from intermediate-z interlopers is virtually non-existent and the z > 6 completeness is very high at 94 %.
Apart from the eight combinations of Euclid and ancillary bands considered throughout this work, we evaluate a few other scenarios to gain more insight into preventing intermediate-z interlopers.
First, we tested the importance of the Rubin y band for the selection of high-z galaxies, given that Euclid itself will create Article number, page 8 of 27 van Mierlo et al.: Intermediate-redshift contaminants of Euclid z > 6 galaxies very deep imaging in the Y E band (see Fig. 2 for the respective filter transmission curves). Presumably, y-band observations are of key importance to the z > 7 galaxy selection, since at z = 7-8 the Lyman-α break at 1216 Å is sampled by this band. In this paper we have assumed that the Rubin DDF will be 0.1 mag shallower in the y band as compared to the Euclid Deep Fields. We derive the contamination fraction from Euclid+Rubin DDF photometry whilst excluding the y band, and find that it is 0.07 as compared to 0.04 whilst including the y band (see Table 3). Simultaneously, we find that a 0.5 mag increase in the Rubin DDF y-band depth does not improve the contamination fraction any further. Therefore, we conclude that even though the central wavelengths and filter widths of the Rubin y and Euclid Y E band differ, ultra-deep Rubin y-band photometry is only marginally effective when Y E -band imaging is readily available.
Second, we tested how robust our results on the contamination fraction are when we vary the full final depth of the ancillary data. The magnitude limits adopted throughout this paper are, with the exception of the Spitzer data, not definite as the observations have not commenced or are not completed yet. Therefore, an assessment of how the degree of contamination is dependent on the final survey depths is important. We determine that if the H20 survey was 0.5 mag deeper across all five bands, the fraction of contaminants amongst the apparent z > 6 sample would decrease from 0.19 to 0.10. Simultaneously, by making the Rubin DDF 0.5 mag shallower across all six bands, the contamination fraction worsens from 0.04 to 0.08. Clearly, the final depth of the optical data has strong implications for the contamination fraction.
Finally, we investigated which optical band contributes the most to accurate z > 6 galaxy selection, using Euclid+H20 data. We increased the 5σ depth by 0.5 mag for each H20 band individually whilst the photometry in the other bands remains unchanged, creating five different flux catalogues. Subsequently, we derive the contamination fraction for all five realisations, and find the Subaru HSC i band is most important for excluding intermediate-z interlopers, reducing the contamination fraction to 0.14 (as compared to 0.19 from Euclid+H20 with no depth variations). This is unsurprising as the majority of our fiducial z > 6 sample is at z = 6-7 and so the i band provides a strong constraint on the Lyman-α break. The second most important band is the Subaru HSC z band (contamination fraction of 0.15). Conversely, we find that increasing the survey depth in the CFHT u band leaves the contamination fraction unimproved. We emphasise that these results on the importance of individual bands concern the contamination of z > 6 sources. In fact, the CFHT u band is most important for normal galaxies at lower-redshifts, that is, the OLF of stable intermediate-z galaxies as defined in Fig. 3 moderately improves to 11.2 % from increasing the u band depth.
Depending on the research purpose, certain detection threshold requirements may be imposed on potential Euclid highredshift galaxies. Therefore, we explore how a 5σ detection threshold requirement in at least one of the NIR bands for Euclid high-redshift galaxies may result in lower degree of contamination by intermediate-z sources. Considering only Euclid data, the contamination fraction is reduced to 0.12 +0.04 −0.04 with this measure. Generally, we find a moderate improvement in the contamination fraction in each Euclid (+ancillary) data scenario, but no significant differences within the error bars. We further explore the usefulness of S/N cuts in Sect. 4.3. Fig. 4: Median magnitudes of our simulated sources in the Euclid, Rubin, and Spitzer filters. Light brown circles represent stable intermediate-z galaxies, red stars z > 6 contaminants, and blue diamonds true z > 6 galaxies. For each population, the dotted lines indicate the 16 th and 84 th percentiles. For each band, the 2σ flux limit is shown with a green bar. The selection of intermediate-z, contaminants, and z > 6 galaxies is based on Euclid+Rubin+Spitzer data. The median magnitudes for z > 6 galaxies in the Rubin u, g, r bands are equal to −99 (no intrinsic flux) and therefore omitted from this figure.

Separation of contaminants from true z > 6 galaxies
Having quantified the incidence of intermediate-redshift contaminants in the z > 6 sample, now we aim to develop a method to cleanly separate the contaminants from the true z > 6 galaxies, based on the photometry available in the Euclid Deep Fields. To achieve this, we investigate which photometric and SED properties can separate the two populations. Specifically, we investigate the usefulness of colour diagrams. For instance, Bisigello et al. (2020) already showed how Euclid colour-colour selection techniques can effectively separate star-forming and quiescent galaxies at z = 0-3. Figure 4 shows the median observed magnitude in each filter for intermediate-z galaxies, z > 6 contaminants and true z > 6 galaxies identified from Euclid+Rubin+Spitzer photometry. This figure demonstrates that contaminants of z > 6 galaxies are the faintest amongst intermediate-redshift galaxies, that is, falsely identified z > 6 galaxies tend to be much fainter than secure z = 1-6 galaxies. On average, contaminants are ∼ 1.9 mag fainter than stable intermediate-z galaxies in the Euclid and Rubin filters. Only in the IRAC bands are the contaminants similarly bright to the stable intermediate-z sources. The photometry of contaminants is dominated by 2σ flux upper limits. This is in agreement with Vulcani et al. (2017), who have shown that low-z contaminants of drop-out selected z > 5 galaxies are located near the detection limit of galaxy surveys. Because of the numerous upper limits, the SED fitting of the contaminants is poorly constrained, that is, their redshift parameter space becomes highly degenerate.
Moreover, Fig. 4 clearly shows how contaminants differ from true z > 6 galaxies based on their I E , Y E , J E , and H E photometry. True z > 6 galaxies have very red (I E − Y E ) colours in addition to very flat (Y E − J E ) and (J E − H E ) colours. Conversely, z > 6 Article number, page 9 of 27 A&A proofs: manuscript no. 43950corr Colour cut Condition all true z > 6 galaxies with a I E and/or Y E detection included 95 % of true z > 6 galaxies with a I E and/or Y E detection included 90 % of true z > 6 galaxies with a I E and/or Y E detection included all contaminants with a I E and/or Y E detection excluded contaminants show on average a gradual brightening over the same wavelength range, with bluer (I E − Y E ) colours than true z > 6 galaxies. The physical interpretation of this difference is straightforward. At z fid = 6-8, the I E and Y E bands sample the rest-frame spectrum blue-and redwards of Lyman-α line at λ = 1216 Å, resulting in a strong, red (I E −Y E ) colour. On the contrary, at z fid = 1-5.8, the I E and Y E bands sample the UV continuum mostly redwards from the Lyman-α line. At this wavelength range, the fiducial SEDs of the z > 6 contaminants are particularly faint: they are below the assumed Euclid Survey depth and therefore have significantly different (I E − Y E ) colours than true z > 6 galaxies.
For true z > 6 galaxies, we obtain median (Y E − J E ) = 0.14 and (J E − H E ) = 0.01 colours as the Y E , J E , and H E filters sample the rest-frame UV and blue optical continuum. This is the result of our input fiducial z = 6-8 galaxies, which have similarly flat UltraVISTA (Y − J) and (J − H) colours and are typically UV bright (median M 1500Å = −21.7 mag). Additionally, the majority of true z > 6 galaxies have fiducial redshift z fid = 6-7 (77 %), so the median (Y E − J E ) colour is marginally influenced by the  3.4 & 0.9 0.00 58% 0.00 59% 0.00 58% 0.00 59% 0.00 55% 0.00 55% 0.00 55% 0.00 55% Notes. Contamination fraction and completeness for four different (I E − Y E ) & (Y E − J E ) colour cuts, in eight scenarios of Euclid (+ancillary) photometry, once applied to all sources that have a I E and/or Y E flux measurement. The colour cuts are listed in the first column, while the other columns correspond to the different data combinations. In these eight columns, we show the fraction of intermediate-z interlopers amongst the apparent z > 6 galaxy sample after applying the colour cut (on the left) and the percentage of fiducial z > 6 galaxies correctly identified as high-z sources that survive the cut (on the right). The colouring of the rows in this table is equal to how the colour selection criteria are plotted with coloured, dashed lines in Fig. 5. red (Y E − J E ) colour caused by the Lyman-α break of z sim > 7 sources. The flat colour signature is typical for high-redshift Lyman-break galaxies (Salmon et al. 2020), as they have high specific star-formation rates (de Barros et al. 2014) and virtually no dust attenuation , resulting in flat UV spectra because of the dominance of young stellar populations.
Although Fig. 4 is based on Euclid+Rubin+Spitzer data, the observed general trends are present in all eight combinations of photometry. In all scenarios, we find that z > 6 contaminants comprise the faintest intermediate-redshift galaxies, and that contaminants have significantly different (I E − Y E ) and (Y E − J E ) colours from the true z > 6 galaxies.
Article number, page 11 of 27 A&A proofs: manuscript no. 43950corr These results clearly indicate the importance of the Y E band in separating contaminants from the real z > 6 galaxies. Therefore, we construct colour-colour diagrams considering this band, namely (I E − Y E ) versus (Y E − J E ), which are shown in Fig. 5. We emphasise that these colour diagrams can only be constructed for galaxies up to z fid = 8, as sources beyond this redshift are Y E dropouts and as such do not have a meaningful (I E − Y E ) colour. Therefore, the below proposed colour criteria cannot be used to identify Euclid z > 8 galaxies.
Similarly, because the contaminants comprise the faintest galaxies in our sample, many have flux upper limits in both the I E and Y E band, so that their (I E − Y E ) colour is meaningless. Therefore, this analysis considers only true z > 6 galaxies and z > 6 contaminants that have a detection in the I E and/or Y E band. In the scenario of Euclid data alone, this means that 1.4 % and 46 % of the true z > 6 galaxies and z > 6 contaminants are excluded, respectively. These numbers are representative for the other combinations of Euclid and ancillary photometry. Furthermore, the vast majority of true z > 6 galaxies in all scenarios have a detection in the Y E band but a flux upper limit in the I E band; as such, their (I E − Y E ) colour is actually a lower limit and may be even redder in reality.
To separate the contaminants from the true z > 6 galaxies, we present an array of (I E − Y E ) & (Y E − J E ) colour criteria that produce different degrees of contamination and z > 6 completeness. The four colour cuts were derived using only Euclid data and are based on sources with a solid flux measurement in at least the I E or Y E band. The colour criteria and the conditions that they meet are listed in Fig. 5. We subsequently applied these colour cuts to the other Euclid (+ancillary) data scenarios, and derived the completeness and contamination fraction of the surviving galaxy sample. The results are presented in Table 4. The completeness is defined as the fraction of recovered z fid = 6-8 galaxies compared to the entire fiducial z fid = 6-8 sample, consisting of 315 galaxies.
First, we emphasise that the achieved contamination fractions from applying the colour criteria are lower limits, given that generally half of the z > 6 contaminants are not included in this analysis. For the true z > 6 galaxies and contaminants that have a I E and/or Y E detection, the colour criteria are highly successful at preventing the intermediate-z interlopers from entering the high-redshift galaxy sample. Specifically, the colour cut (I E − Y E ) > 2.8 & (Y E − J E ) < 1.4 (in yellow) reduces the contamination fraction to 0.01 while simultaneously preserving 90 % of the true z > 6 galaxies (Euclid alone); the resulting completeness of fiducial z = 6-8 galaxies is 81 %. When additional optical photometry, Spitzer data, or a combination of both are considered, this cut reduces the contamination fraction to ≤ 0.01, whilst still preserving ∼ 80 % of the high-redshift galaxies.
Here we comment on the usability of these Euclid colour diagrams for the selection of z = 6-8 galaxies. In the case of Euclid data alone, 30 out of the 65 contaminants cannot be included in these diagrams because they do not have a well-constrained 4, this means that in the worst-case scenario, 30 additional contaminants could actually survive this selection, and so the contamination fraction would be 0.11 instead of 0.01. Therefore, the purity of the apparent z > 6 sample would still improve from applying colour selection criteria, although possibly not as drastically as presented in Table 4. On the contrary, given that the (I E − Y E ) colour of most true z > 6 galaxies is a lower limit, the recovered completeness with the colour criteria may be higher in reality. In conclusion, the presented colour criteria in this work are useful for selecting a relatively pure sample of z > 6 galaxies whilst maintaining acceptable completeness, although it is not possible to exactly state to which degree.
As mentioned in Sect. 4.2, an alternative strategy that is often used to ensure intermediate-z galaxies do not enter the highredshift galaxy sample is to require a detection with a certain S/N for high-z candidates. Here we explore in depth how imposing a 3-, 5-, and 10σ detection threshold requirement on the apparent Euclid z > 6 sample could reduce the contamination fraction. Figure 6 shows the distribution of the H E -band S/N for contaminants and true z > 6 galaxies, where the three S/N cuts are indicated with dashed lines. We imposed each S/N cut on the apparent z > 6 sample and subsequently recomputed the contamination fraction and z > 6 completeness. Ultimately, we find that even a 10σ S/N requirement merely reduces contamination to 5 % whilst preserving only 30 % of the actual high-z galaxies. For reference, the most stringent colour cut presented in Table 4 is able to maintain 58 % completeness. We find similar results for the Y E and H E bands. We conclude that for the UltraVISTA-like bright sample, the colour cuts are more effective for identifying intermediate-z interlopers than imposing a S/N requirement.
Finally, we explore how effective a combination of a S/N requirement with colour selection criteria would be. First, we applied a 5σ H E -band S/N detection threshold requirement to the apparent z > 6 sample recovered with Euclid data alone. Subsequently, we applied the same colour criteria presented in this paragraph to the restricted sample, and highlight the results from the colour criteria (I E − Y E ) > 2.8 & (Y E − J E ) < 1.4. Imposing these criteria, the contamination fraction is reduced to 0.01 whilst preserving 70 % of the high-redshift sources. However, even with the detection threshold requirement, 5 out of 18 contaminants do not have a constrained (I E − Y E ) colour and, therefore, cannot be included in the colour diagrams. Therefore, if all of these sources survived these colour criteria and continued to populate the recovered high-redshift sample, the contamination fraction would be 0.03. In summary, the combination of a S/N requirement and colour selection criteria is able to recover a high-redshift sample with very high purity, but at the cost of the z > 6 completeness; as such, whether or not this combination should be used will depend on the research purpose.

The nature of the z > 6 contaminants
Now that we have quantified the expected degree of contamination in the Euclid Deep z > 6 galaxy selection and showed how this can be reduced, we aim to understand the nature of the contaminant sources. We do this by inspecting the fiducial physical parameters of these galaxies, as obtained from LePhare in the original run based on COSMOS 28-band photometry. Subsequently, we compare the Euclid-derived SED properties of true z > 6 galaxies and contaminants, in order to investigate if the two populations can be further segregated based on their recovered physical parameters. Figure 7 shows the fiducial and simulated normalised redshift distribution of z > 6 contaminants and true z > 6 galaxies, in each Euclid (+ancillary) data scenario. Between the different photometric scenarios, the shape of the fiducial redshift distribution of the contaminants is roughly similar: we consistently identify broad peaks at z fid ∼ 1-3 and z fid ∼ 4.0-6, and typically very few sources at z fid ∼ 3-4. In addition, for almost all scenarios, the majority of contaminants have z fid ∼ 4.0-6. There- Fig. 6: Histogram of the H E -band S/N. Contaminants and true z > 6 galaxies as identified from Euclid data alone are displayed in red and blue graphs, respectively. The vertical dashed lines indicate the 3σ, 5σ, and 10σ detection cutoffs. The contamination fraction and completeness computed for the three cuts are indicated in the table.

Photometric redshift distributions
fore, we conclude that the underlying galaxy populations that are likely to be misidentified as high-z galaxies are similar, regardless of the external data available to complement Euclid data in the SED fitting analysis. Figure 7 also shows the redshift distribution of z > 6 contaminants once constrained with simulated Euclid (+ancillary) photometry. Here we can see how the contaminants affect different redshift bins at z > 6, which varies according to the considered data combination. Generally we find that without constraints from Spitzer photometry, the contaminants are systematically placed at higher redshifts than in scenarios where Spitzer data are available. From Fig. 7 it is evident that the recovered redshift distributions of true z > 6 galaxies and z > 6 contaminants are different but largely overlapping, and thus the z > 6 contaminants cannot be identified solely based on their recovered redshifts.

Physical parameters
So far, we have established that, regardless of which ancillary data are added to the Euclid photometry, z > 6 contaminants are amongst the faintest sources in our intermediate-z galaxy sample and primarily correspond to galaxies with fiducial redshifts around z fid ∼ 2.5 or z fid ∼ 5. Here we analyse the physical properties of these galaxies, to better understand why specifically these populations are sensitive to the z > 6 degeneracy.
In Fig. 8 we show the rest-frame (u − r) colour against the fiducial redshift for intermediate-z galaxies and z > 6 contaminants, as identified from Euclid photometry. The rest (u − r) colour is derived from the fiducial COSMOS photometry and corresponding best-fit SED, and samples the rest-frame optical continuum at λ ∼ 3800 Å (u) and λ ∼ 6300 Å (r). First, we note that the blue (u − r) colours of contaminants at z fid = 3.0-4.0 are a coincidence rather than a real degeneracy: there are only two galaxies in this bin that are both faint and have poorly constrained fiducial SEDs. Overall, we find that z > 6 contaminants typically have redder (u − r) colours than stable intermediate-z galaxies. Only at z fid = 5.5-6.0 are the rest-frame optical colours of z > 6 contaminants similar to those of stable z fid = 5.5-6.0 galaxies; the former are contaminants simply because they are much fainter compared to the latter. Lastly, for other combinations of Euclid and ancillary photometry, contaminants at z fid = 3.0-4.0 display similarly red colours as those in other fiducial redshift bins. In conclusion, including the result from Fig. 4, the z > 6 contaminants are generally characterised as faint, systematically reddened, intermediate-z galaxies, in good agreement with previous studies (e.g. Oesch et al. 2012;Bowler et al. 2014).
In order to uncover what produces the redder optical colours of the contaminants, we inspect their fiducial stellar masses, dust extinctions and ages as derived from the original (COS-MOS) photometry. We also investigate the Euclid (+ancillary) photometry-derived physical parameters of the contaminants, to see if these sources that appear as z > 6 galaxies occupy a different parameter space than the true z > 6 galaxies.
In Fig. 9 we show the fiducial stellar mass, colour excess, and age distributions of the z > 6 contaminants and the stable intermediate-z galaxies, as derived from the original SED fitting with COSMOS photometry. We show the comparison for two scenarios: one where interlopers are identified from Euclid photometry alone, and one where they are identified from Euclid+H20+Spitzer data. Furthermore, the contaminants are split into three samples based on their fiducial redshift, that is, z fid = 1-3.5, z fid = 3.5-5 and z fid = 5-6, following the typical redshift distinction we observed in Fig. 7, and considering that contaminants at z fid = 3.5-5 and z fid = 5-6 display different (u − r) colours. Apart from the three physical properties investigated in Fig. 9, we also inspect the characteristics of the best-fit SEDs of these sources, that is, their metallicity and SFH.
First and foremost, it is clear from Fig. 9 that z > 6 contaminants and stable intermediate-z galaxies occupy the same parameter space for any property investigated in this paper. Therefore, we conclude from this that the galaxies driving the Euclid z > 6 contamination are not part of some specific population, but are primarily interlopers because of their faintness.
When we investigate the three samples of z > 6 contaminants as identified from Euclid photometry, we find that contaminants at z fid = 1-3.5 are typically moderately massive galaxies that have red optical colours either because they are young with considerable dust attenuation, or because they are old with well established stellar populations. Towards higher redshifts, we observe that contaminants at z fid = 3.5-5 constitute almost solely of young, massive galaxies that are strongly affected by dust. Finally, contaminants at z fid = 5-6 are comparably massive, typically not dusty and of average age. Summarising, the sources that produce similar Euclid colours to actual z > 6 galaxies are either intermediate-z galaxies that become degenerate through typical confusion of the Lyman-α and 4000 Å breaks (either due to strong dust attenuation or old age), or faint galaxies with flat SEDs bordering z ∼ 6 that become degenerate because of the limited measurements available to properly constrain them.
Given that we have demonstrated how the inclusion of ancillary photometry reduces the number of intermediate-z interlopers, we also inspect the physical parameters of contaminants identified from Euclid+H20+Spitzer data. These surviving galaxies constitute the core of the z > 6 contamination, since they produce similar colours as high-redshift galaxies in not just the four Euclid bands, but in seven ancillary optical and infrared filters as well. As shown in the lower panels of Fig. 9, the remainder of z > 6 contaminants show a similar albeit narrower mass distribution compared to the scenario with only Eu-Article number, page 13 of 27 A&A proofs: manuscript no. 43950corr  clid photometry. Conversely, contaminants at z fid = 1-3.5 and z fid = 3.5-5.0 are now solely young and dusty galaxies. Ap-parently, the degeneracy between older, intermediate-z galaxies with well-developed 4000 Å breaks and actual z > 6 galaxies is at least partially broken through the addition of ancillary photometry, whereas the degeneracy between dust-reddened galaxies and z > 6 galaxies mostly remains.
We show the normalised, stacked probability distribution function of the redshift, PDF(z), of the z > 6 contaminants and true z > 6 galaxies in Fig. 10. For clarity, we divide the contaminants into only two redshift bins, namely z fid = 1-3.5 and z fid = 3.5-6. Interestingly, although their fiducial PDF(z) is generally broad considering it was derived from 28-band photometry, the z > 6 contaminants do not show significant probability for secondary redshift solutions at z > 6. When observed with Euclid alone, the PDF(z) of any z > 6 contaminant is highly degenerate. This means that even though the contaminants are falsely identified as high-redshift galaxies with Euclid, one cannot possibly exclude a low-redshift nature based on the PDF(z) that is recovered. Upon further inspection, we find that even with additional H20 and Spitzer photometry, the majority of contaminants produce highly degenerate results for the PDF(z); only with the ultra-deep photometry from the Rubin DDF do we retrieve z > 6 contaminants that have a PDF(z) solely defined at z > 6. Therefore, one would never know from the PDF(z) that these sources are actually misidentified intermediate-z galaxies. Now that we have characterised the galaxy population that drives the z > 6 contamination, we inspect the physical properties as derived with Euclid (+ancillary) photometry of the true z > 6 galaxies and z > 6 contaminants. Since the stellar mass is the most important physical parameter second to the photometric redshift, we first inspect the stellar-mass recovery in Fig. 11. Fig. 9: Fiducial COSMOS photometry-derived stellar mass, colour excess, and age distributions of the stable intermediate-z galaxies (in light brown) and the z > 6 contaminants. The latter are divided into three samples: contaminants with fiducial z = 1-3.5 (in hatched green), contaminants with fiducial z = 3.5-5 (in red) and contaminants with fiducial z = 5-6 (in purple). The median of each distribution is indicated with a vertical line in a corresponding colour. The upper panels show contaminants identified from Euclid data alone, and the lower panels show contaminants identified from Euclid+H20+Spitzer data. Fig. 10: Normalised, stacked probability distribution function of the redshift, PDF(z), of the z > 6 contaminants and true z > 6 galaxies identified from Euclid data. The fiducial PDF(z) derived from COSMOS 28-band photometry is shown in the left panel; the simulated PDF(z) derived from Euclid photometry is shown on the right.
It is clear that in the scenarios of Euclid and Euclid+H20 data, where the true z > 6 galaxies are unconstrained in the NIR, the stellar-mass recovery is poor. Conversely, when IRAC photometry is available, the recovery of the stellar mass from only the six Euclid and Spitzer bands is very efficient. In addition, when we inspect the age and SFH recovery of the true z > 6 galaxies, we see (not shown in this paper) a similar trend. Spitzer photometry is therefore far more effective than optical data for the recovery of these parameters. Without IRAC data, the rest-frame stellar continuum beyond the 4000 Å break of an apparent z > 6 source is completely unconstrained, and so physical parameters directly related to the older stellar population (the stellar mass and age of the galaxy) are unfounded. Figure 11 also shows the stellar-mass recovery of the z > 6 contaminants. Naturally, we do not expect a tight correlation, as the contaminants are by definition misidentified as vastly different galaxies from Euclid (+ancillary) photometry. However, the abundant scatter in the distribution is noteworthy; we have established that distinct populations of galaxies cause the z > 6 contamination, but we see no signs of bimodal behaviour in the stellar-mass recovery. By including Spitzer photometry, the light from the older stellar populations is actually constrained: for 74 % of the contaminants, their recovered stellar mass is higher than their fiducial stellar mass. This is expected, as even though we know the contaminants are relatively faint compared to their stable intermediate-z counterparts, their fluxes can only be attributed to massive z > 6 galaxies since the stellar mass is directly derived from the SED normalisation. Unfortunately, the simulated stellar mass distributions of true z > 6 galaxies and intermediate-z interlopers overlap considerably, and so we cannot further separate these populations based on their stellar masses, nor on their ages or SFH models.
Article number, page 15 of 27 A&A proofs: manuscript no. 43950corr Fig. 11: Fiducial stellar mass versus Euclid (+ancillary) photometry-derived stellar mass for true z > 6 galaxies (upper panels) and z > 6 contaminants (lower panels), for three scenarios of Euclid and ancillary photometry. The number of galaxies shown is indicated in the upper-left corner of each panel, as is the bias and scatter σ f . The bias is defined as mean[(M sim − M fid )/M fid ] and the scatter as rms[(M sim − M fid )/M fid ]. The colour intensity corresponds to the number of galaxies in each bin (darker colours correspond to more sources). Fig. 12: Fiducial colour excess versus Euclid photometryderived colour excess for true z > 6 galaxies (left panel) and z > 6 contaminants (right panel). The colour intensity corresponds to the number of galaxies in each bin (darker colours correspond to more sources). Figure 12 shows the colour excess recovery for z > 6 contaminants and intermediate-z galaxies derived from Euclid data alone. We do not show any other combination of Euclid and ancillary photometry, as the results are universal. The vast majority of true z > 6 galaxies (93 %) have very low fiducial dust extinction, E(B − V) ≤ 0.2. These values are well recovered once observed with Euclid. This is different for the z > 6 contaminants; as expected, dusty intermediate-z galaxies are mistaken for high-redshift sources with considerably less dust attenuation. Unfortunately, the colour excess cannot be employed to separate interlopers from actual z > 6 galaxies; 72 % of the z > 6 contaminants have recovered dust extinction E(B − V) ≤ 0.2.
We employ the two-parameter Kolmogorov-Smirnov test (Smirnov 1939) to compare the parameter distributions of z > 6 contaminants and true z > 6 galaxies. In virtually all scenarios of data availability, the test shows that the stellar mass and colour excess values of contaminants and actual z > 6 galaxies are unlikely to come from the same parent distribution, with pvalues generally below 0.01 (in the scenarios with Rubin DDF photometry we cannot properly compare the distributions due to the low number of contaminants). The age distributions of interlopers and actual z > 6 galaxies are more similar. Still, the parameter spaces of each physical property significantly overlap for the two populations. Therefore, we conclude that there exists no obvious separation between intermediate-z contaminants and true z > 6 galaxies based on their recovered stellar mass, dust extinction and age parameters.

Complementary tests based on a faint mock galaxy sample
In this work we used real data from the COSMOS field to investigate the expected contamination of z > 6 galaxies in the Euclid Deep Fields. However, especially in the optical regime, the Euclid Deep Fields will be considerably deeper than our fiducial UltraVISTA catalogue. In fact, the 5σ limiting magnitude of the UltraVISTA H-band photometry included in our fiducial UltraV-ISTA catalogue is 25.2, whereas the expected 5σ depth in Euclid H E band is 26.4 magnitude. Therefore, our results on the contamination fraction in Table 3 are biased towards the brightest and Article number, page 16 of 27 van Mierlo et al.: Intermediate-redshift contaminants of Euclid z > 6 galaxies most massive galaxies at z fid = 6-8, and therefore possibly too optimistic.
To uncover how successful Euclid data will be at identifying faint high-redshift galaxies, we created a sample of faint mock galaxies from our fiducial UltraVISTA catalogue and repeated our analysis of the contamination fraction with this sample. To create the mocks, we shifted the entire fiducial best-fit SEDs of our UltraVISTA galaxies at z fid = 1-8 by 1.2 magnitude, which is the difference in depth between the UltraVISTA H-and Euclid H E -band images. From these modified SEDs, we selected all galaxies with 25.3 ≤ H < 27.0 magnitude, since sources with H < 25.3 are already discussed in the bright UltraVISTA-like sample and H = 27.0 corresponds to the H-band 3σ flux limit in the Euclid Deep Survey.
To ensure the faint mock sample follows a realistic photometric redshift distribution, especially at z fid = 6-8, we made use of the CANDELS photometric redshift catalogue in the COS-MOS field produced by Nayyeri et al. (2017), which consists of 38 671 sources identified with the HST and contains photometric data in 42 bands. This catalogue is approximately 2.4 magnitudes deeper in the H band than our UltraVISTA catalogue. From this CANDELS catalogue, we selected all galaxies with z = 1-8 and 25.3 ≤ H < 27.0, the latter based on the HST/WFC3 F160W band at 1.6 µm. Subsequently, in ∆z = 0.2 bins, we selected galaxies from our constructed faint mock galaxy sample at random to replicate the re-normalised CANDELS redshift distribution in that bin. Within each redshift bin, we also altered the scaling of the modified SEDs of individual sources such that their H magnitude distribution (in ∆m = 0.1 mag bins) is identical to the re-normalised CANDELS H magnitude distribution in the same redshift bin. This method ensures that our faint mock sample follows quite closely the CANDELS luminosity functions at z = 6-8.
We note that the CANDELS z = 7.8-8.0 redshift distribution cannot be reproduced, as our UltraVISTA catalogue does not contain sources in that redshift bin. However, for simplicity, we globally refer to the faint mock high-redshift sample as mock galaxies with z fid = 6-8 redshifts. Therefore, the final faint mock sample contains 96 084 sources, of which 936 are at z fid = 6-8. Out of this sample, 41 775 sources are detected in at least IRAC channel 1 or channel 2, and 408 of these sources are at fiducial z fid = 6-8. The fiducial photometric redshift and H magnitude distribution of the faint mock sample are shown in Fig. 13, together with the corresponding re-normalised CANDELS distributions.
We sampled the Euclid (+ancillary) fluxes directly from the modified SEDs of the faint mock galaxies, and obtained the final, randomised Euclid (+ancillary) photometry following the methods described in Sect. 3, except for the simulated Spitzer flux errors. Instead, these were not taken directly from the observed UltraVISTA/SMUVS photometry, but scaled along with the fiducial SEDs to preserve the S/N and subsequently adopted as the simulated Spitzer flux errors.
We note that some of the UltraVISTA sources, especially at z fid = 6-8, are included in the faint mock sample multiple times, as there are relatively more high-redshift sources in the CANDELS catalogue than in the scaled-down UltraVISTA catalogue. However, fiducial duplicates can be considered as individual sources for the purpose of this analysis, as the simulated Euclid (+ ancillary) photometry is independently randomised for each instance.
We derive the contamination fraction and z > 6 completeness of the mock sample in all eight Euclid (+ancillary) data scenarios, which are shown in Table 5. The uncertainties on the contamination fraction were derived in the same manner for the UltraVISTA-like bright sample (see Sect. 4.2). First and foremost, the z > 6 completeness ranges from 87 % to 93 % (in the most optimistic scenario Euclid+Rubin DDF+Spitzer), which is lower than the 91-96 % completeness obtained with the bright UltraVISTA-like sample. This is unsurprising, since we have demonstrated how the z fid = 6-8 galaxies are characterised by having very red (I E − Y E ) colours and 2σ flux upper limits in the I E band. By shifting the fiducial SED downwards, the simulated Y E fluxes of z fid = 6-8 are fainter and the (I E − Y E ) colours not as red, which makes identification of these sources more difficult. This is illustrated in Fig. 14, where we show the median magnitude in the Euclid, Rubin and Spitzer bands of the faint mock galaxies. Nonetheless, a completeness of > 80 % for faint highredshift sources identified with Euclid will be excellent for most astronomy science purposes.
Simultaneously, we find that the contamination fraction of the mock sample is 0.39 with Euclid photometry alone, compared to 0.18 with the bright UltraVISTA-like sample. Again, this stark increase is not surprising, as we have demonstrated how intermediate-z interlopers are primarily characterised by their faintness. By shifting the UltraVISTA photometry downwards, intermediate-z galaxies that previously remained at the same redshifts when observed with Euclid are now assigned Euclid (+ancillary) fluxes much closer to their respective 2σ flux limits, and therefore are more likely to be mistaken for z > 6 galaxies.
With the faint mock sample, we find similar trends regarding the usefulness of ancillary optical data; only the ultra-deep Rubin DDF photometry truly improves the contamination fraction, reducing it to 0.07. In addition, we find that including the simulated Spitzer photometry of the mock galaxies barely improves the contamination fraction, contrary to the moderate improvement obtained with the regular UltraVISTA sample. We explain this from the average (Y E − [3.6]) colour of z > 6 contaminants and true z > 6 galaxies. As can be seen in Fig. 14, contaminants identified in the faint mock sample do not produce the extremely red (Y E − [3.6]) colours of contaminants in the bright UltraVISTA-like sample. Because of this, the (Y E −[3.6]) colours of z > 6 contaminants and true z > 6 galaxies are more similar, meaning Spitzer photometry barely helps in distinguishing between these sources.
As was done for the UltraVISTA-like bright sample, we explore what happens to the contamination fraction once we im-Article number, page 17 of 27 A&A proofs: manuscript no. 43950corr

%
Notes. Number of true z > 6 galaxies and z > 6 contaminants, from various combinations of Euclid and ancillary data, based on the faint mock sample. As in Table 3, we report the contamination fraction and completeness for each data availability scenario.
Fig. 14: Median simulated magnitudes of the faint mock sample in the Euclid, Rubin, and Spitzer filters. Light brown circles represent stable intermediate-z galaxies, red stars z > 6 contaminants, and blue diamonds true z > 6 galaxies. For each population, the dotted lines indicate the 16 th and 84 th percentiles. The 2σ flux limits are shown with green bars. The selection of intermediate-z, contaminants, and z > 6 galaxies from the faint mock sample is based on Euclid+Rubin+Spitzer data. The median magnitudes for z > 6 galaxies in the Rubin u and g bands and for contaminants in the Rubin u band are equal to −99 (no intrinsic flux) and therefore omitted from this figure.
pose a NIR detection threshold requirement for the source selection. This should be especially relevant for the faint mock sample, given that many of these sources have flux measurements close to the 2σ flux limits in each band. If we require that faint mock sources have a 5σ detection in at least one of the Euclid NIR bands, 70 831 sources (74 % of the sample) survive, of which 669 galaxies have fiducial z fid = 6-8. We recalculate the contamination fraction and z > 6 completeness on this restricted sample for all eight data combinations, and show the results in Table 6. We emphasise that the completeness is calculated as the fraction of all fiducial z fid = 6-8 galaxies that satisfy the detection threshold requirement; otherwise, variations in the com-pleteness could be simply attributed to the restriction imposed on the sample. From Table 6, it is clear that a NIR detection threshold requirement for faint, apparent z > 6 galaxies considerably reduces contamination from intermediate-z sources. In any data scenario, the z > 6 completeness is comparable to that of the UltraVISTAlike bright sample. This is expected, as the faintest and therefore most poorly constrained high-redshift sources do not survive the detection threshold requirement. Between the different combinations of Euclid and ancillary photometry, we find that including Spitzer photometry does not improve the redshift recovery, nor is it able to distinguish contaminants from true z > 6 sources. Again, the only truly valuable addition to Euclid data is ultradeep photometry from Rubin DDF.
To conclude, the high level of z > 6 galaxy recovery together with the low contamination fraction are only representative of bright galaxies equivalent to those in the UltraVISTA catalogue. As we showed here, the z > 6 completeness level becomes lower and the contamination fraction significantly higher for fainter sources, which will be part of the Euclid Deep Survey as well. Although we have shown that a NIR detection threshold requirement only moderate reduces the degree of contamination in the UltraVISTA-like bright sample, it is very useful for the faint mock sample.

Validity of the simulated high-z solutions
We present numbers of z > 6 contaminants for each scenario of simulated photometry in Table 3. These contaminants are identified from their simulated redshift, with no further checks on the compatibility of the photometry and the best-fit photometric redshifts. Therefore, we explore if additional validation of their redshift could reduce the fraction of contaminants amongst z > 6 galaxies.
We checked if the photometric redshifts are compatible with detections at short wavelengths. At z sim > 6, the Lyman limit at λ = 912 Å is redwards of the Rubin u and g filters, and CFHT u and HSC g filters. Therefore, any apparent z > 6 galaxy with a > 2σ detection limit in these bands can be ruled out. Similarly, at z sim > 6.5 the Lyman limit is redwards of the Rubin r and HSC r bands, so a detection in said bands should not exceed its 2σ limit. Additionally, due to Lyman series absorption of neutral

%
Notes. Number of true z > 6 galaxies and z > 6 contaminants, from various combinations of Euclid and ancillary data, based on the restricted faint mock sample. The restricted faint mock sample consists of sources that have a 5σ flux measurement in at least one the Euclid NIR bands. We report the contamination fraction and z > 6 completeness, where the latter is calculated as the fraction of all fiducial z fid = 6-8 sources that satisfy the detection threshold requirement.
hydrogen in the intergalactic medium (Inoue et al. 2014), the Lyman break of a galaxy spectrum shifts to the Lyman-α line at λ = 1216 Å; as such, sources at z sim > 7 with Rubin r and i, HSC r and i, and I E detections exceeding their 2σ limit are ruled out, as are z sim > 8.1 galaxies with a > 2σ detection in the Y E band. Based on the bright UltraVISTA-like sample, imposing these conditions on the z > 6 contaminant sample has strongly varying consequences depending on the photometric availability: 2.0 % of the Euclid-observed contaminants are discarded; 20,% of the Euclid+Rubin-observed contaminants are discarded; 38 % of the Euclid+Rubin DDF-observed contaminants are discarded; 28 % of the Euclid+H20-observed contaminants are discarded; and 0 % of the Euclid+Spitzer-observed contaminants are discarded. We find considerably lower values for the faint mock sample; between 0.2 and 4 % of contaminants can be discarded from these checks, depending on the combination of Euclid and ancillary data. Clearly, it is useful to compare the best-fit photometric redshifts with their detections in ancillary optical bands, as it will improve the contamination fraction. Nonetheless, we conclude that z > 6 contaminants generally are assigned valid redshifts by the SED fitting based on their simulated photometry. Therefore, the majority of contaminants are not the result of poor SED fitting in which short-wavelength photometry is ignored by the algorithm.

Independent check of contamination fraction with COSMOS2020
We verified our estimates of the contamination amongst Euclid z > 6 galaxies by repeating our analysis of the bright UltraVISTA-like sources using a different photometric catalogue and other SED fitting codes. We used the COSMOS2020 catalogue (Weaver et al. 2022) to simulate Euclid (+ancillary photometry) following the same method presented in Sect. 3. The COSMOS2020 catalogue contains the latest updated data in the COSMOS field (Scoville et al. 2007), with photometry for over 1.7 million sources in 44 optical and infrared bands. In this work we have made use of the so-called COSMOS2020 Classic catalogue, which contains aperture photometry performed on PSFhomogenised images in all bands, except for the IRAC photometry, which was derived from PSF-fitting with the IRACLEAN software (Hsieh et al. 2012). We provide a detailed description on how we selected a sample of z = 1-8 galaxies from this catalogue in Appendix A.
Using the COSMOS2020 Classic photometry, we performed three independent tests to derive the contamination fraction in various Euclid (+ancillary) data scenarios. Here we briefly list our tests for which fitting codes were used (detailed descriptions of the tests are provided in Appendix A): (i) The new C++ version of LePhare (LePhare++), following the two-step SED fitting strategy used in Weaver et al. (2022, see our Appendix A.1). (ii) The conventional LePhare algorithm, adopting the photometric redshifts from Weaver et al. (2022) but deriving their corresponding best-fit templates using the SED fitting parameters from this paper (see Appendix A.2). (iii) The new Python version of the EAZY template fitting algorithm (version 0.5.2.dev7;Brammer et al. 2008; see our Appendix A.3).
Using these photometric redshift codes, we simulated Euclid (+ancillary) photometry from the COSMOS2020 catalogue. For each test, we recomputed the photometric redshifts based on this photometry and tabulated the resulting estimates of the contamination fraction and z > 6 completeness in Appendix A.
Here we reflect on the reproducibility and therefrom resultant validity of our Euclid z > 6 galaxy sample contamination estimates. We find generally good agreement between the contamination fractions derived from our own UltraVISTA catalogue and from the COSMOS2020 catalogue processed by either version of the LePhare code. Therefore, we conclude that regardless of the fiducial photometry, between 15 and 20 % of apparent z > 6 galaxies identified from Euclid photometry with LePhare are actually intermediate-z interlopers. As we have shown in Sect. 5, this number is likely too optimistic, since Euclid will probably detect fainter z > 6 galaxies than what is currently possible with COSMOS. The contamination is neither surprising nor unanticipated, as huge efforts are being made to collect ancillary photometry in all three Euclid Deep Fields. Overall, this result emphasises how we should be careful with Euclid high-redshift candidates for which ancillary photometry is not available.
Between the LePhare strategies, the contamination fractions with Euclid data alone agree well, although those derived from the COSMOS2020 photometry are slightly smaller. Interestingly, we find conflicting results regarding the importance of ancillary data for preventing z > 6 contamination. Overall, Spitzer photometry simulated from COSMOS2020 data Article number, page 19 of 27 A&A proofs: manuscript no. 43950corr barely contributes to reducing the contamination fraction, contrary to the moderate improvement achieved with our own Ultra-VISTA photometry, whereas the COSMOS2020 catalogue actually contains the deepest possible IRAC photometry over the COSMOS field. In addition, ancillary optical photometry from the H20 survey improves the contamination fraction in the test with LePhare++. We believe this is possible given that the COS-MOS2020 catalogue contains very deep optical imaging from the Subaru HSC. However, we do not find the same improvement from the H20 survey data in the test with LePhare. Nonetheless, we observe the same general trend that optical imaging from the Rubin DDF is most effective at distinguishing intermediate-z interlopers from actual z > 6 galaxies, which confirms that the relative depth of the ancillary optical data is of key importance.
We used EAZY to reproduce our results in three data scenarios: Euclid, Euclid+H20, and Euclid+Rubin DDF. These scenarios were chosen as they summarise the different levels of expected depths of the ancillary data. Interestingly we find that the contamination fractions as derived with EAZY are extremely low in all three scenarios. In addition, the redshift recovery of actual z > 6 galaxies is poorer than when using LePhare, with a completeness of 81 % with Euclid alone. Contrary to what is expected, the level of completeness drops considerably from adding ancillary data: in the Euclid+H20 scenario, the z > 6 completeness is only 45 %. In all three scenarios, the missing z fid = 6-8 galaxies are misidentified as z sim = 1-2 and z sim = 5.5-6 galaxies, which is in accordance with the typical redshift degeneracies we identified for the intermediate-z contaminants. These results demonstrate how our EAZY routine appears reluctant to assign high-redshift solutions at all, which is remarkable, as we did not employ an apparent magnitude prior for the SED fitting of the Euclid (+ancillary) photometry. The lower z > 6 completeness may be partially due to the different flux upper-limit strategies between LePhare and EAZY. With the former code, 2σ upper limits in the optical bands impose very strong constraints on the photometric redshift of a true z > 6 galaxy; LePhare will outright reject most intermediate-z solutions as they produce higher fluxes than the upper limits in optical bands. With EAZY, the upper limits are less constraining as they are treated as actual detections with large flux errors that reflect the 2σ detection level. Therefore, false intermediate-z solutions for actual z > 6 galaxies may be unlikely, but not immediately rejected.
In conclusion, our tests using different SED fitting codes on the same dataset demonstrate that the choice of code could have strong implications for both the degree of contamination and the redshift recovery of z > 6 galaxies.
6.3. Identification of z > 6 galaxies from colour-colour diagrams alone We showed how, once the photometric redshifts are determined, a simple (I E − Y E ) & (Y E − J E ) colour cut can weed out z > 6 contaminants from true z > 6 galaxies in all scenarios of photometric availability, but only for sources that have a flux measurement in at least the I E or Y E band. Regardless, it makes sense to wonder to what degree it is possible to use directly this colourcolour plot for a z > 6 galaxy selection in Euclid, without the need for photometric redshifts in the first place. In fact, based on the bright UltraVISTA-like sample, we find that not any z fid = 1-6 galaxy (with a detection in I E and/or Y E ), contaminant or not, populates the (I E − Y E ) > 3.4 and (Y E − J E ) < 0.9 colour box (Euclid data alone). Compared to 184 true z > 6 galaxies that do, this means one can potentially obtain a pure sample whilst preserving 58 % of the fiducial z > 6 galaxies. This suggests that direct selection of z > 6 galaxies may possible from Euclid colours alone, circumventing the need for photometric redshifts.
However, once we take into account the faint mock sample, it becomes clear that photometric redshifts are absolutely necessary. In the scenario of Euclid data alone, 66 % of contaminants in the faint mock sample have an unconstrained (I E − Y E ) colour. If we would apply the colour criteria (I E − Y E ) > 3.4 and (Y E − J E ) < 0.9 to the faint mock sample, virtually none of the contaminants that have a (I E − Y E ) colour survive, but neither do the actual high-redshift galaxies: the completeness drops to 13 %. In addition, considering that more than half of the contaminants cannot be included in this analysis at all, we can conclude that using colour cuts to identify faint high-redshift sources with Euclid is unfeasible.
For the bright UltraVISTA-like sample, we explore if combining the (I E − Y E ) and (Y E − J E ) colours with additional criteria can achieve pure z > 6 galaxy selection with acceptable completeness. Based on the Euclid photometry, we inspect the (Y E − J E ) versus (J E − H E ) colour diagrams for all sources that survive the (I E − Y E ) > 2.8 & (Y E − J E ) < 1.4 colour cut, including all galaxies with an unconstrained (I E − Y E ) colour. We find that the true z > 6 galaxies are well-separated from stable intermediate-z galaxies, but not from contaminants. Additionally, many contaminants that have no (I E − Y E ) colour actually have (Y E − J E ) and (J E − H E ) colours similar to true z > 6 galaxies, so applying a succeeding colour cut in this colour space is mostly unhelpful for weeding out contaminants.
Finally, before we can draw conclusions on the feasibility of Euclid z > 6 galaxy selection from colours alone, we have to address a caveat in our analysis: we have not inspected the colours of fiducial z fid = 0-1 galaxies, given that by definition our intermediate-z sample spans z fid = 1-6. Upon inspection of the Euclid colours of UltraVISTA galaxies at z = 0-1, virtually none of these sources survive the (I E − Y E ) & (Y E − J E ) colour criterion, and therefore contamination of z > 6 galaxies by z = 0-1 galaxies is negligible.
To summarise, our results demonstrate that one should not select a z > 6 galaxy sample from Euclid colours alone, as the colours z > 6 contaminants are poorly constrained, especially in the faint regime. Accurate knowledge of photometric redshifts cannot be circumvented, as even the most stringent colour cut presented in this work is not guaranteed to eliminate intermediate-z contamination, while it strongly sacrifices the z > 6 completeness.
In general, we note that photometric redshift algorithms are by nature more efficient than colours alone for accurate selection of galaxy samples. As an alternative to the proposed colour selection criteria, one could make efforts to fine-tune LePhare (or any other SED fitting tool) to assign more weight to the Y E band or the (Y E − J E ) colour, thereby optimising the algorithm to further lift the degeneracy between intermediate-and high-redshift galaxies. More work is warranted to investigate how to optimise SED fitting codes to improve the discrimination between true z > 6 and intermediate-z interlopers.
In addition, machine learning methods as alternatives to traditional SED fitting are becoming increasingly popular, especially with the advent of large sky surveys such as the Euclid Survey. Recent works using mock Euclid photometry have demonstrated how machine learning approaches typically outperform template-fitting algorithms to retrieve the photometric redshift and galaxy classification at z < 1 (Euclid Collaboration: Desprez et al. 2020;Humphrey et al. 2021 in prep). At higher redshifts, machine learning methods become increasingly unre-liable because extensive spectroscopic training samples are lacking. However, Euclid will obtain spectroscopy and high-quality photometry in several reference fields that already contain deep ancillary data. Therefore, machine learning methods for selecting z > 6 galaxies are likely to become more viable in the future and will possibly reduce contamination of intermediate-z galaxies further.

Possible caveats in this work
As a few final words of caution, a possible caveat in this work is how the Euclid (+ancillary) photometry is simulated from bestfit SED templates, as compared to extracting fluxes from real Euclid (+ancillary) images or even simulated images that emulate the real data more directly. By sampling the photometry directly from models, we do not take into account factors that could degrade the quality of the photometric measurements, such as telescope defects and source confusion. We do note, however, that Euclid will have an excellent (HST-like) angular resolution, and, thus, source blending is unlikely to be a major problem, considering the depths of the Euclid Deep Fields. Nonetheless, the simulated data in this work are idealised to a certain degree, and as such photometric redshift degeneracies could be more prevalent for the real data. Moreover, we used the same template set to simulate the Euclid (+ancillary) photometry and to recompute the photometric redshifts based on these simulated data. The template fluxes were randomised as we described in Sect. 3, but nonetheless, our results on the contamination fractions could be affected by this.
Second, throughout this work we have not distinguished between normal galaxies and galaxies that host an active galactic nucleus (AGN), even though we know that these types of galaxies are present in the COSMOS field (Brusa et al. 2010;Delvecchio et al. 2017;Chang et al. 2017). We cross-correlated our UltraVISTA DR4 catalogue with the C-COSMOS X-ray catalogue (Civano et al. 2016) and identified 996 X-ray AGN sources with host galaxy redshifts between z = 1-5.3. These sources are rare, as they constitute only 0.6 % of the total galaxy population between z = 1-5.3. Moreover, we find that the simulated redshifts of these sources lie below z = 6 in any scenario of Euclid (+ancillary) data. Therefore, we conclude that our X-ray selected AGNs do not contaminate the Euclid high-redshift galaxy sample. However, alternative methods to identify AGNs exist, so more work regarding this AGN contamination of z > 6 galaxies is desired.
Lastly, throughout this work we have assumed the expected full depths for the Euclid and Rubin photometry, finalised once the nominal missions have been completed. With this assumption, we have demonstrated how ancillary photometry is only marginally helpful to prevent intermediate-z contaminants from entering the high-redshift sample, and how z > 6 galaxies are well-recovered from Euclid data alone for both bright and faint galaxies. This does not imply that the ancillary data have an insignificant role in the study of high-z galaxies. The efforts to gather ancillary imaging in the Euclid Deep Fields are very important. For a long time the ancillary data in the optical bands will be deeper than that provided by Euclid I E . In fact, the final I E depth assumed here will not be achieved until late stages of the Euclid operations. Given that the Euclid and Rubin missions will run over many years, interim datasets at intermediate depths will be released. Therefore, ancillary imaging will be very important for all the studies of high-z galaxies performed in the first years of Euclid data analysis. In general, we conclude that more work is required to analyse the contamination fraction estimates at intermediate Euclid and Rubin depths.

Summary and conclusions
We have investigated the contamination from intermediate-z interlopers (z = 1-5.8) of the z > 6 galaxy population as expected for the Euclid Deep Fields. Our tests are based on ∼176 000 real galaxies at z = 1-8 in a ∼0.7 deg 2 area selected from the Ul-traVISTA ultra-deep survey in the COSMOS field and an additional sample of ∼96 000 mock galaxies with 25.3 ≤ H < 27.0 that follow a CANDELS-like photometric redshift and H magnitude distribution. For both datasets, we simulated Euclid and ancillary photometry from fiducial 28-band photometry and subsequently re-derived photometric redshifts for eight scenarios of data availability: (i) Euclid data only; (ii) Euclid and Rubin ugrizy data; (iii) Euclid and ultra-deep Rubin ugrizy data from the Rubin DDF; (iv) Euclid and CFHT u and Subaru HSC griz data from the H20 survey; (v) Euclid and Spitzer IRAC 3.6 and 4.5 µm data; (vi) Euclid, Rubin, and Spitzer data; (vii) Euclid, Rubin DDF, and Spitzer data; and (viii) Euclid, H20, and Spitzer data. We emphasise that the findings presented below are only representative of galaxies up to z = 8 due to the limitations of our fiducial sample and that we cannot assess the photometric redshift recovery of Euclid z > 8 galaxies.

We determined the fraction of intermediate-z contaminants
(fiducial z = 1-5.8) amongst the apparent z > 6 population as identified from Euclid (+ancillary) data. Based on the bright UltraVISTA-like sample, we estimate the contamination fraction of z > 6 galaxies observed with Euclid to be 0.18 +0.07 −0.06 . Contrarily, when we consider the faint mock sample, the contamination fraction of z > 6 galaxies with Euclid is significantly higher at 0.39 +0.10 −0.11 . 2. Based on the bright UltraVISTA-like sample, we find that the contamination fraction is reduced to 0.13 +0.04 −0.05 when including Spitzer IRAC photometry, and to 0.04 +0.03 −0.02 when including Rubin DDF photometry. In our most optimistic scenario, where we combined Euclid, Rubin DDF, and Spitzer photometry, virtually any possible contamination from intermediate-z interlopers is ruled out; the contamination fraction is 0.01 +0.0006 −0.0001 . Conversely, when we consider the faint mock sample, the contamination fraction is only significantly reduced once we include the ultra-deep Rubin DDF photometry, to 0.07 +0.03 −0.03 . 3. We replicated our analysis of the bright UltraVISTA-like sample on the COSMOS2020 galaxy catalogue, which contains independent photometry over the same field, in combination with various SED fitting routines. We find generally good agreement between the contamination fraction estimated from our own photometry and those derived using LePhare/LePhare++ with the COSMOS2020 catalogue. However, we obtain different results using the EAZY code, which returns a lower fraction of contaminants as well as a lower completeness in the recovery of z > 6 galaxies. 4. The contaminants of z > 6 galaxies have distinctly different (I E − Y E ) colours from true z > 6 galaxies, so colour selection criteria can be used to separate these populations a posteriori of obtaining the Euclid-like photometric redshifts. However, many contaminants have unconstrained (I E − Y E ) colours and as such cannot be included in this analysis. Regardless, we have presented a grid of (I E − Y E ) & (Y E − J E ) colour cuts that balance the contamination fraction and z > 6 galaxy completeness. For the bright UltraVISTA-like sample, the colour Article number, page 21 of 27 cut (I E − Y E ) > 2.8 & (Y E − J E ) < 1.4 when applied to the apparent z > 6 galaxies (that have a flux measurement in the I E and/or Y E bands; identified from Euclid data) reduces the contamination fraction to 0.01 while preserving 81 % of the fiducial z > 6 galaxies (i.e. galaxies at fiducial z = 6-8 that are recovered at z > 6 from Euclid photometry). Considering the faint mock sample, these proposed colour selection criteria are not useful, given that the majority of contaminants are undetected in both the I E and Y E bands. 5. Alternatively, we find that imposing a 5σ detection threshold requirement in at least one of the Euclid NIR bands is useful for obtaining a purer apparent z > 6 sample, specifically for the faint mock sample. By employing this detection threshold requirement, the contamination fraction of the faint mock z > 6 galaxies is 0.25 +0.07 −0.07 (Euclid data alone), and we maintain a z > 6 completeness of 91 %. As such, the identification of faint high-redshift sources in the Euclid Deep Fields will be very efficient, with a moderate contamination of intermediate-z galaxies. 6. We investigated if one can select a pure sample of z > 6 galaxies from Euclid colours alone. Based on the bright UltraVISTA-like sample, we have shown how a strict colour cut at (I E − Y E ) > 3.4 & (Y E − J E ) < 0.9 could possibly select a pure sample of high-redshift galaxies, whilst maintaining a z > 6 completeness of 58 %. However, this cannot be achieved for the faint mock sample at all, and therefore we conclude that deriving photometric redshifts cannot be circumvented. 7. We compared the fiducial physical properties (based on 28 bands in COSMOS) of the intermediate-redshift galaxies that are z > 6 contaminants and stable intermediate-z galaxies (i.e. z = 1-6 galaxies that stay in the same redshift bin when observed with Euclid (+ancillary) data). The analysis of the physical parameters is solely based on the bright UltraVISTA-like sample. We find that the z > 6 contaminants reside primarily at fiducial redshifts z fid ∼ 1-3 and z fid ∼ 4.5-6 and are primarily faint, regardless of which ancillary data are added to the Euclid photometry. We identify three distinct populations: (i) moderately reddened galaxies at z fid =1-3.5 that are misidentified through the typical confusion of the Lyman-α break at λ = 1216 Å and the 4000 Å break; (ii) a group of contaminants at z fid = 3.5-5 that are strongly dust-reddened; and (iii) a group at z fid = 5-6 that have typically ill-defined, flat, fiducial SEDs and are mistaken for z ∼ 6 galaxies. 8. Moreover, we compared the physical properties as derived with Euclid (+ancillary) photometry of z > 6 contaminants and true z > 6 galaxies. We have demonstrated how Spitzer photometry is essential for recovering the stellar masses of z > 6 galaxies. The z > 6 contaminants are most separated from true z > 6 galaxies based on their colour excess, as the latter have virtually no dust attenuation. Although the parameter distributions of true z > 6 galaxies and z > 6 are generally statistically different, we find no selection criteria that effectively separate them based on their physical parameters.
Overall, we conclude that the Euclid high-redshift recovery will be excellent for bright z = 6-8 galaxies, and successful for faint galaxies as well. In addition, ultra-deep ancillary photometry is highly effective at reducing contamination from intermediate-z interlopers to the z > 6 galaxy sample.

%
Notes. Number of true z > 6 galaxies and z > 6 contaminants, using the COSMOS2020 Classic catalogue from Weaver et al. (2022). The fiducial photometric redshifts and Euclid (+ancillary) photometry were derived using the methods described in Weaver et al. (2022). Only sources with a 3σ detection in both the J E and H E bands are considered here.

%
Notes. Number of true z > 6 galaxies and z > 6 contaminants, using the COSMOS2020 Classic catalogue from Weaver et al. (2022). The fiducial photometric redshifts and Euclid (+ancillary) photometry were derived using the same method adopted for the UltraVISTA/SMUVS photometry, described in Sect. 2.2. Only sources with a 3σ detection in both the J E and H E bands are considered here.

%
Notes. Number of true z > 6 galaxies and z > 6 contaminants, using the COSMOS2020 Classic catalogue from Weaver et al. (2022). The fiducial photometric redshifts and Euclid (+ ancillary) photometry were derived with EAZY, following the methods outlined in Weaver et al. (2022). Only sources with a 3σ detection in both the J E and H E bands are considered here.