Euclid preparation: XX. The Complete Calibration of the Color–Redshift Relation survey: LBT observations and data release

The Complete Calibration of the Color–Redshift Relation survey (C3R2) is a spectroscopic program designed to empirically calibrate the galaxy color–redshift relation to the Euclid depth (I_E = 24.5), a key ingredient for the success of Stage IV dark energy projects based on weak lensing cosmology. A spectroscopic calibration sample that is as representative as possible of the galaxies in the Euclid weak lensing sample is being collected, selecting galaxies from a self-organizing map (SOM) representation of the galaxy color space. Here, we present the results of a near-infrared H- and K-band spectroscopic campaign carried out using the LUCI instruments at the LBT. For a total of 251 galaxies, we present new highly reliable redshifts in the 1.3 ≤ z ≤ 1.7 and 2 ≤ z ≤ 2.7 ranges. The newly determined redshifts populate 49 SOM cells that previously contained no spectroscopic measurements and almost double the occupation numbers of an additional 153 SOM cells. A final optical ground-based observational effort is needed to calibrate the missing cells, in particular in the redshift range 1.7 ≤ z ≤ 2.7, which lack spectroscopic calibration. In the end, Euclid itself will deliver telluric-free near-IR spectra that can complete the calibration.


Introduction
The Euclid satellite (Laureijs et al. 2011) is scheduled for launch in 2023; it will observe galaxies to z > 2 over 15 000 deg² using two instruments: VIS, an optical imager that will reach an AB magnitude depth of 24.5 (for extended sources at 10σ, see also Euclid Collaboration: Scaramella et al. 2021) with a single broad I_E filter, and NISP, a combined near-infrared imager (in Y_E, J_E, and H_E, see Euclid Collaboration: Schirmer et al. 2022) and slitless spectrograph. The optical imager will determine galaxy shape distortions with unprecedented accuracy. When combined with a precise determination of the true ensemble redshift distribution, this allows the weak lensing effects caused by the distribution of matter along the line of sight to be measured and the cosmological parameters to be constrained (Euclid Collaboration: Blanchard et al. 2020).
The estimated number of weak lensing source galaxies that will be imaged by Euclid makes their systematic spectroscopic follow-up unfeasible; the mission is thus critically dependent upon the determination of accurate photometric redshifts (z_phot). Currently, the precision of photometric redshifts based on multiband optical surveys is of the order of σ_z/(1 + z) = 0.03 to 0.06, and the fraction of catastrophic outliers, defined as objects whose z_phot differs from their spectroscopic redshift (z_spec) by more than 0.15(1 + z), is of the order of a few tens of percent (Ma et al. 2006; Hildebrandt et al. 2010). We expect that the combination of ground-based optical and Euclid near-infrared photometry will deliver the slightly improved performance required by the mission: σ_z/(1 + z) ≤ 0.05, and a fraction of catastrophic outliers of less than 10% (see Laureijs et al. 2011 and Euclid Collaboration: Desprez et al. 2020).
While small changes in z_phot precision per source have a relatively small impact on cosmological parameter estimates, small systematic errors in z_phot can dominate all other uncertainties for these experiments. The aim of the C3R2 project (Masters et al. 2015, 2017, 2019; Euclid Collaboration: Guglielmo et al. 2020; Euclid Collaboration: Stanford et al. 2021) is to calibrate photometric redshifts by measuring accurate spectroscopic redshifts of selected objects sampling a self-organizing map (SOM), a representation of the galaxy color space. The SOM projects the high-dimensional galaxy color space onto a 2D plane. Each galaxy is assigned to a cell in this plane with given coordinates (X, Y). If this plane is sampled with enough cells, the distribution of photometric redshifts of all galaxies belonging to a cell is narrow and there is a well-defined correspondence between the position occupied by a galaxy in the multi-color space and its redshift. We can define z_phot,SOM as the average of the photometric redshifts of all galaxies belonging to the cell. By measuring spectroscopic redshifts of galaxies in each cell it is possible to calibrate the mean of the photometric redshift distribution in an efficient and homogeneous way across the galaxy color space. Regions in color space that are particularly difficult to calibrate because of broad or bimodal photometric redshift distributions can be identified by comparing the measured spectroscopic redshifts with z_phot,SOM and looking for deviations larger than 0.15(1 + z). The minimum calibration requirement is to populate each SOM cell with at least one spectroscopic redshift; problematic cells might require (much) more than this. Ultimately, galaxies belonging to uncalibrated cells may be dropped from the Euclid weak lensing sample; for a detailed discussion see Masters et al. (2015).
In this work we continue this effort by presenting the redshift measurements of z > 1 galaxies collected at the Large Binocular Telescope (LBT) in the COSMOS (Capak et al. 2007; Scoville et al. 2007; Lilly et al. 2007) and VVDS (McCracken et al. 2003; Le Fèvre et al. 2004; Jarvis et al. 2013) fields, using the near-infrared LUCI spectrographs. Further data releases of optical spectra acquired at the VLT and the GRANTECAN telescopes are in preparation. The Euclid science working groups are actively discussing how the final ground-based dataset will be merged with the spectroscopy information delivered by the Euclid mission itself to perform the optimal calibration of the photometric redshifts.
The LBT is an international collaboration among institutions in the United States, Italy, and Germany. The LBT Corporation partners are: LBT Beteiligungsgesellschaft, Germany, representing the Max-Planck Society, the Astrophysical Institute Potsdam, and Heidelberg University; The University of Arizona on behalf of the Arizona university system; Istituto Nazionale di Astrofisica, Italy; The Ohio State University, and The Research Corporation, on behalf of The University of Notre Dame, University of Minnesota, and University of Virginia. e-mail: saglia@mpe.mpg.de
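As a minimal illustration of the SOM-based quantities introduced in this section, the sketch below computes z_phot,SOM as the mean photometric redshift of the galaxies in each cell and flags a spectroscopic redshift that deviates from it by more than 0.15(1 + z). The function names and the toy input layout (a list of cell coordinates paired with photometric redshifts) are assumptions made for this example; this is not the C3R2 code.

```python
from collections import defaultdict
from statistics import mean


def som_cell_means(galaxies):
    """Return {cell: z_phot,SOM} for an iterable of ((X, Y), z_phot) pairs.

    z_phot,SOM is the average photometric redshift of all galaxies
    assigned to the same SOM cell, as defined in the text.
    """
    by_cell = defaultdict(list)
    for cell, z_phot in galaxies:
        by_cell[cell].append(z_phot)
    return {cell: mean(values) for cell, values in by_cell.items()}


def is_discrepant(z_ref, z_spec, threshold=0.15):
    """True if |z_ref - z_spec| / (1 + z_spec) exceeds the threshold."""
    return abs(z_ref - z_spec) / (1.0 + z_spec) > threshold
```

For example, a cell with z_phot,SOM = 1.5 and a measured z_spec = 2.3 gives a normalized deviation of about 0.24 and would be flagged for closer inspection.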
The paper is organized as follows: in Sect. 2 we describe the strategy, target selection, and mask preparation; in Sect. 3 we describe the observations and data reduction; in Sect. 4 we discuss the redshift determination and the attribution of a flagging scheme consistent over the whole C3R2 survey; in Sect. 5 we present the results of the redshift assignments, investigate the bias of the photometric redshifts used in C3R2 and the origin of catastrophic SOM redshifts, and discuss our success rate and SOM cell coverage; finally, we present our conclusions in Sect. 6.

Strategy, target selection, and mask preparation
The LBT consists of two 8 m mirrors mounted on a common structure and pointing at the same position on the sky. The LUCI1 and LUCI2 near-infrared spectrographs (Seifert et al. 2003) are mounted at the front Bent Gregorian f/15 focal stations of the LBT, and can be used in combination with masks (Buschkamp et al. 2010), designed and cut well in advance of the observations, that allow the simultaneous collection of multiple spectra. The field must be observed with identical pointings and rotation angles, but potentially with different masks and instrumental setups on the two sides. The masks cover a 4 × 4 arcmin² field of view, but optimally focused spectra are delivered only for slits placed in a 2.8 arcmin wide central stripe. Moreover, the wavelength range covered depends on the distance of a slit from the middle of a mask. At least three stars are needed to align the masks. Dedicated holes (typically 4 × 4 arcsec² in size) were required until the June 2020 runs. Starting from the July 2020 run, after a change in requirements from the observatory, only one hole was necessary.

Strategy
Given the characteristics of the LBT and of the LUCI spectrographs described above, and following the strategy adopted by Euclid Collaboration: Guglielmo et al. (2020) (hereafter G2020), we collected spectra in the H and K bands. In the H band we can measure spectroscopic redshifts of galaxies between 1.3 and 1.7 by detecting their Hα, [N ii], and [S ii] emission lines, or of galaxies with redshifts between 2 and 2.7 by detecting their Hβ and [O iii] lines. In the K band we can measure spectroscopic redshifts of galaxies between 2 and 2.7 by detecting their Hα, [N ii], and [S ii] emission lines, or (in principle) of galaxies with photometric redshifts less than 1.7 by detecting their Paschen (hereafter Pa) lines. We used the N1.8 camera, which delivers 0.25 × 0.25 arcsec² per pixel, and the G210 gratings, for which 1 arcsec-wide slits give a resolution of R = 2950 in the H band and R = 2500 in the K band. The nominal wavelength coverage for a centered slit is 0.202 µm and 0.328 µm in the H and K band, respectively. The central wavelength can be adjusted in the range 1.55 to 1.75 µm for the H band, and 2.06 to 2.40 µm for the K band. We optimized the central wavelength for each mask separately, based on the photometric redshifts of the observed galaxies, to maximize the likely return.
Fig. 1: COSMOS_M25H mask designed with the lms tool. The white square shows the LUCI field of view (4 × 4 arcmin²), the long-dashed vertical lines bracket the optimal field of view for spectroscopy (2.8 arcmin wide). The green circle at the center of the field allows the mask to be moved. The cyan circles identify the alignment stars. The yellow rectangles show the slits; slit number 25 is positioned on an acquisition star to monitor seeing and transparency during the observations. The square slit number 26, also positioned on an acquisition star, allows verification of the centering of the mask after alignment. At the lower end of the mask, a series of six small holes is present for engineering purposes. Below these, the red rectangle is the area occupied by the identification number cut into the mask.
The typical angular size of the galaxies in the C3R2 catalogues is 1-2 arcsec in diameter. In order to achieve maximum efficiency and avoid separate sky observations, we opted for an on-slit nodding strategy, which sets the default slit length to 10 arcsec and forces all slits to be parallel. Taking into account the minimum allowed distance between slits, and the necessity of cutting holes for acquisition stars, the maximum number of galaxies targeted per mask is ≈ 20. In practice, we never managed to observe more than 18 galaxies per mask. We ranked the list of positions and rotation angles according to the total number of galaxies observable simultaneously in the H and K bands. We observed the VVDS field in October and the COSMOS field in the months from December to May. We ended up with a list of 13 pairs of (H + K) masks in the VVDS field, and 28 pairs of (H + K) masks in the COSMOS field, to each of which at least 20 galaxies in total (H + K) could be assigned, minimizing the number of repeated observations. As in G2020, we aimed to observe each mask for 2 hours, split into 36 exposures of 200 s each, dithering the mask along the slits by 2 arcsec following an ABBA pattern. The actual number of exposures collected for each mask varied according to observational conditions and constraints; see Table A.1.

Target selection and mask preparation
The selection of galaxies to be observed started from the catalogues produced by Masters et al. (2019) and used by G2020, excluding galaxies already observed with KMOS at the VLT. The catalogues adopt the photometric redshifts provided by Ilbert et al. (2006) and Laigle et al. (2016). We considered Priority 1 and Priority 2 galaxies (hereafter primary galaxies), as in G2020, and mapped the number of galaxies assignable to each mask as a function of the coordinates of the field centers and rotation angles, taking into consideration several constraints. At least one suitable guiding star had to be present in the allowed patrol field and within the appropriate magnitude range; at least three acquisition stars (preferably selected to have low proper motions) within the appropriate magnitude range had to be present in the 4 × 4 arcmin² field of view; at least three 4 × 4 arcsec² holes (or just one from October 2020) were cut for this purpose. At least one 10 × 1 arcsec² slit was assigned to a star, to be able to reconstruct empirically the dithering pattern and measure the seeing and the relative transparency during the observations. The galaxies were placed such that the expected Hα line, based on their photometric redshift, fell into the wavelength range computed from the distance to the central stripe of the field and the optimized central wavelength.
The actual design of the masks was performed with the lms tool (https://sites.google.com/a/lbto.org/luci/preparing-to-observe/maskpreparation/lms-install). During this phase, further adjustments of the centers, position angles, and slit assignments were necessary. Guide stars too near the border of the patrol field had to be changed, and some acquisition stars or some slits had to be dropped or changed in length (down to 7 arcsec) because of additional constraints (e.g., one end of the mask has a regular pattern of small holes that must be avoided). Finally, the space available between the slits was filled manually with secondary galaxies, when possible. These are galaxies where the [O iii] lines could possibly appear in the H band, or the Pa lines in the K band. Once the list of Priority 1 and 2 galaxies assigned to each mask was ready, the optimal central wavelength was calculated as the average of the Hα wavelengths, redshifted with the respective photometric redshifts. The gbr files detailing the masks for the cutting machine were passed to the observatory, where the masks were produced and inserted into the cryostatic dewar before each observing run.
An example of a pair of H + K masks is given in Figs. 1 and 2. All the observed masks are listed in Table A.1, where the coordinates of their centers and their rotation angles are given together with the date of the observations. In the end (see Table A.1), on average we extracted 12.3 galaxy spectra per mask, of which on average 9.6 were primary.
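The choice of the central wavelength described above can be sketched as follows. The band limits and nominal coverages are the values quoted in this section; the vacuum rest wavelength of Hα (0.656461 µm) is a standard value. Clipping the average to the allowed range and treating the coverage as symmetric about the central wavelength (i.e., a centered slit) are simplifying assumptions of this example, and the function names are hypothetical.

```python
HALPHA_VAC_UM = 0.656461  # vacuum rest wavelength of H-alpha, in micron

# Central-wavelength limits and nominal coverage (micron) quoted in the text
BANDS = {
    "H": {"cen_min": 1.55, "cen_max": 1.75, "coverage": 0.202},
    "K": {"cen_min": 2.06, "cen_max": 2.40, "coverage": 0.328},
}


def central_wavelength(z_phots, band):
    """Average redshifted H-alpha wavelength of the primary targets.

    The result is clipped to the adjustable range of the band; the
    clipping is an assumption of this sketch, not stated in the text.
    """
    observed = [HALPHA_VAC_UM * (1.0 + z) for z in z_phots]
    lam = sum(observed) / len(observed)
    limits = BANDS[band]
    return min(max(lam, limits["cen_min"]), limits["cen_max"])


def halpha_covered(z_phot, lam_cen, band):
    """True if redshifted H-alpha falls in the nominal coverage of a
    centered slit (off-center slits shift the covered range)."""
    half_range = BANDS[band]["coverage"] / 2.0
    return abs(HALPHA_VAC_UM * (1.0 + z_phot) - lam_cen) <= half_range
```

With three H-band targets at z_phot = 1.4, 1.5, and 1.6, the sketch returns a central wavelength of about 1.641 µm; a galaxy at z_phot = 1.3 in a centered slit would then fall outside the nominal coverage.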

Observations and data reduction
The scripts controlling the telescope and LUCI operations were prepared using the LBTO OT software. Except when one of the two LUCI spectrographs was not available, the paired approach was adopted, allowing simultaneous observation of the same field in the H and K bands. Monocular scripts were prepared for the October 2019 run (when only LUCI1 was available) and for the January 2020 run (when LUCI1 became unavailable during the run). Monocular scripts were also prepared for the observations of telluric standards. Calibration scripts to obtain dark, flat, and arc observations were also prepared in paired mode and performed during the day or at night during periods of bad weather. The observations started in visitor mode (October 2019), and continued in remote observing mode as the COVID-19 emergency made international travel impossible. Staff members from the LBT Observatory operated the instrument from Tucson for all observing runs, with scientists from the Max Planck Institute for Extraterrestrial Physics (M. Fabricius, S. de Nicola, R. Saglia, J. Snigula) and the Landessternwarte (J. Heidt) mainly supervising, but also directly controlling the procedures from Germany. Overall, the VVDS field was observed under good meteorological conditions; in contrast, several nights during which the COSMOS field was observed (in particular the March 2021 run) suffered from cirrus, high winds, and poor seeing.
The observation of a field follows four steps. First, an image is taken without the mask at each telescope, the acquisition stars are identified, and their positions are measured. Second, the best-fitting field rotations and translations are determined separately for the two images and applied, possibly culling the most deviant acquisition stars. Third, a second pair of images is taken through the mask to verify that acquisition and seeing-monitoring stars appear in the appropriate holes and slits. If necessary, a second translation is determined and applied to optimize the centering orthogonal to the slits. The achieved RMS precision of the alignment was typically between 0.1 and 0.3 arcsec. Fourth, the spectroscopic observations start.
After the observation of a field, or before the observation of the next one, a telluric standard was observed separately for the H and K band, putting the standard star sequentially in three slits of each mask, selected to cover the whole probed wavelength range. We observed a total of 88 masks (58 in the COSMOS field and 30 in the VVDS field), 47 in the H band (30 in the COSMOS field and 17 in the VVDS field), and 41 in the K band (28 in the COSMOS field and 13 in the VVDS field). As mentioned above, during the October 2019 run, LUCI2 was unavailable, and during the January 2020 run LUCI1 stopped working.
The data reduction was performed using the IDL Flame pipeline developed by Belli et al. (2018). We used the configuration which relies on the science data for tracing, extraction, and wavelength calibration, exploiting the brightness of the sky background. The steps follow the sequence described in Belli et al. (2018). First, the position of the reference star on each frame is searched for and, if detected, the flux, vertical position, and full width at half maximum (FWHM) are measured. This allows the determination of the nodding and dithering offsets of each A and B frame in the ABBA sequences. However, for many of the K-band pointings, the reference star cannot be detected in single frames and appears visible only after the coaddition is performed. In these cases, the list of the nominal shifts is used. After application of the bad pixel and master pixel maps, a master slit flat is computed and used to identify the slit positions and map their edges. Cutouts of the slits are produced and, after a rough wavelength calibration, individual OH emission lines are identified and fitted with a Gaussian. The relation between pixel position and wavelength is determined as a second-order polynomial for each slit row. The spatial illumination correction is determined from the detected OH lines and applied. A model of the sky is constructed following Kelson (2003) and subtracted. Finally, each sky-subtracted slit frame is rectified and wavelength calibrated before all the A and B frames are stacked together and the A-B and B-A results are produced. Their combination gives the final 2D spectra shown in Figs. 3 and 4. One-dimensional spectra are extracted after having identified the appropriate spatial window, either determining the spatial extent of the emission lines or of the continuum. Examples are shown in Figs. 3 and 4. 
We note that wavelengths refer to the vacuum and do not take into account the barycentric correction, which is applied a posteriori to the measured redshift.

Redshift determination and flagging
Redshifts are measured on the 1D spectra after identifying by eye the signatures of emission lines on the 2D spectra: the presence of a sequence of a negative (black), a positive (white), and a negative (black) blob (see Figs. 3 and 4). Based on the alignment of the predicted positions of the Hα or Pa lines with the peaks of the observed emission lines, an estimate of the redshift of the galaxy is derived, together with the appropriate flag. Following the scheme used in G2020, a Flag = 4 is assigned to redshifts with multiple good S/N line detections; a Flag = 3.5 is given to very good S/N single line detections; a Flag = 3 is given to convincing single line detections. Lower flags (1 and 2) indicate redshifts that are not secure enough for our purposes; Flag = −99 marks non-detections or failed reductions. Examples of Flag = 4, 3.5, 3 spectra and redshifts are given in Figs. 3 and 4.
In a number of slits, more than one object was detected, either as a spatially separate object or on the same line of sight, but with different redshifts. In these cases multiple 1D spectra were extracted and analyzed. The correct identification was attempted by inspection of the relevant images.
The procedure described above was performed independently by two of us (RPS and RZ); discrepant assessments were discussed and resolved. We extracted 1119 spectra, of which 19 were not matched with an object from the photometric parent catalogues. We matched 292 good spectra (Flag ≥ 3): 163 in the COSMOS field, 129 in the VVDS field. The complete statistics are given in Table 1, where we list the numbers referring to primary spectra (targeting the Hα line) and secondary spectra (targeting the [O iii] or Pa lines). Table A.1 (column "Success Rate") gives the number of extracted identified spectra (a total of 1100) in each pointing together with the number of good spectra (Flag ≥ 3) and the number of primary spectra collected. The mean success rate (defined as the ratio of the number of identified good spectra over the total number of collected identified spectra) is 0.27 (0.22 in the COSMOS field and 0.37 in the VVDS field; the lower success rate in the COSMOS field stems from the suboptimal meteorological conditions under which several COSMOS masks were observed). It is higher in the H band (0.32; 0.28 in the COSMOS field and 0.40 in the VVDS field) than in the K band (0.19; 0.13 in the COSMOS field and 0.33 in the VVDS field). The success rates achieved by G2020 are better, which stems from the larger field of view (which allowed an optimized choice of good candidates), the fixed wavelength range, and the IFU available with the KMOS instrument. We investigate the properties of the galaxies with unreliable spectra (Flag < 3) in Sect. 5.
The barycentric corrections z_helcorr = v_helcorr/c appropriate to each mask are listed in Table A.1 together with the Julian date (JD) adopted for the computation. This corresponds to the middle of the sequence of exposures. When two series of exposures observed on different nights are summed together, the JD refers to the first of the two. The barycentric corrections are computed through the python routine pyasl.helcorr from PyAstronomy; we quote the two digits that affect the last quoted digit of the redshifts. Using the relation 1 + z_spec = (1 + z_meas)(1 + z_helcorr), where z_meas are the measured redshifts shown in Figs. 3 and 4, we compute the redshifts z_spec given in Table B.1 using the formula z_spec = z_meas + (1 + z_meas) z_helcorr. In addition, a number of objects were observed multiple times, enabling an empirical determination of the errors on our redshifts and a comparison of the redshifts derived from the K- and H-band spectra. Figure 5 shows that the RMS uncertainty on the redshifts of objects at z < 1.8 with repeat observations is 0.0002, and two times larger for objects with z > 1.8.
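The barycentric correction formula above can be checked with a few lines of code. This is a sketch under the assumption that the velocity correction v_helcorr is already available in km/s (for example as returned by PyAstronomy's pyasl.helcorr); the function name is hypothetical.

```python
C_KMS = 299792.458  # speed of light in km/s


def apply_helcorr(z_meas, v_helcorr_kms):
    """Correct a measured redshift for the observer's motion.

    Implements z_spec = z_meas + (1 + z_meas) * z_helcorr with
    z_helcorr = v_helcorr / c, which follows from
    1 + z_spec = (1 + z_meas) * (1 + z_helcorr).
    """
    z_helcorr = v_helcorr_kms / C_KMS
    return z_meas + (1.0 + z_meas) * z_helcorr
```

For a velocity correction of about 30 km/s (z_helcorr = 0.0001) and z_meas = 1.5, the corrected redshift is 1.50025, a shift comparable to the quoted RMS uncertainty of 0.0002, which is why the correction cannot be neglected.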
One COSMOS object (ID = 481647) was observed by the previous C3R2 releases, providing z_spec = 1.501, identical to z_spec = 1.501 obtained here. A unique redshift z and flag were assigned to each object with repeated observations by averaging the available measurements; the resulting values are reported in Table 2 and used in the following figures.
Table 1: Statistics of the spectra (All = H + K).
                        Total              COSMOS               VVDS
                     All     H     K     All     H     K     All     H     K
Extracted           1119   635   484     766   424   342     353   211   142
  Good               305   212    93     174   127    47     131    85    46
Identified          1100   618   482     751   410   341     349   208   141
  Good               292   200    92     163   117    46     129    83    46
  Primary            851   446   405     597   316   281     254   130   124
    Good             249   158    91     139    94    45     110    64    46
  Secondary          249   172    77     154    94    60      95    78    17
    Good              43    42     1      24    23     1      19    19     0
Unidentified          19    17     2      16    15     1       4     3     1
  Good                13    12     1      11    10     1       2     2     0
  Primary             14    12     2      11    10     1       3     2     1
    Good               8     7     1       7   [remaining entries missing in the source]

Results
We measured good (Flag ≥ 3) spectroscopic redshifts for 253 objects (two of which were already known): 71 with Flag = 4, 62 with Flag = 3.5, and 120 with Flag = 3. Figure 6 compares these to the photometric (left) and the SOM-based (right) redshifts. The photometric redshifts are from Ilbert et al. (2006) and Laigle et al. (2016); the SOM-based redshifts are the averages of the photometric redshifts of the galaxies belonging to the SOM cell. There are no catastrophic failures with |z_phot − z_spec|/(1 + z_spec) ≥ 0.15 and Flag ≥ 3. Seven objects have |z_phot,SOM − z_spec|/(1 + z_spec) ≥ 0.15; we examine the cells they belong to below. The values of σ_NMAD = 1.48 Median(|z − z_spec|/(1 + z_spec)) and Bias = Mean((z − z_spec)/(1 + z_spec)) are comparable to those achieved in previous C3R2 releases; for example, G2020 quote σ_NMAD = 0.03 and Bias = −0.003 when considering z = z_phot, and σ_NMAD = 0.044 and Bias = 0.027 when considering z = z_phot,SOM. Figure 6 shows that the agreement between photometric and spectroscopic redshifts for objects with z_spec < 2 is excellent, with a mean difference of −0.006 and RMS = 0.03. At higher redshift, however, photometric redshifts appear systematically larger, with a mean (z_phot − z_spec)/(1 + z_spec) = 0.02 and similar RMS. We investigate this issue by examining the whole C3R2 dataset published to date (see Fig. 7). When averaged in redshift bins of width 0.1, the mean difference (z_phot − z_spec)/(1 + z_spec) appears slightly negative (≈ −0.01 ± 0.0015) up to redshift 1.7, and slightly positive (≈ +0.01 ± 0.003) at higher redshifts, consistently in all datasets and for both Flag ≥ 3 and Flag = 4 redshifts. Moreover, in each bin up to redshift 1.7 the mean difference, or bias, is well determined, with a signal-to-noise ratio S/N between 3 and 8. This S/N is lower (between 2 and 6) at redshifts higher than 2; the bias is unconstrained (with S/N less than 1) in between.
The S/N improvement in each bin achieved by the new redshifts released here is ≤ 0.5.
This shows that the C3R2 project is able to detect and correct the residual small biases present in the most accurate available photometric redshift samples that can be used as reference in the Euclid mission. Therefore the tight Euclid requirement that the mean redshift in each tomographic bin must be constrained at the level of 0.002(1 + z) is achieved up to redshift 1.7 thanks to the calibration provided by the C3R2 dataset. It is almost achieved for redshifts higher than 2, but remains problematic in between, where the error on the possible residual bias is ≈ 0.005.
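The two summary statistics used in this section, σ_NMAD and Bias, follow directly from their definitions. The sketch below is a plain-Python illustration with hypothetical function names and toy inputs; it is not the code used to produce Figs. 6 and 7.

```python
from statistics import mean, median


def normalized_residuals(z_est, z_spec):
    """(z - z_spec) / (1 + z_spec) for paired lists of redshifts."""
    return [(ze - zs) / (1.0 + zs) for ze, zs in zip(z_est, z_spec)]


def sigma_nmad(z_est, z_spec):
    """sigma_NMAD = 1.48 * Median(|z - z_spec| / (1 + z_spec))."""
    return 1.48 * median([abs(r) for r in normalized_residuals(z_est, z_spec)])


def bias(z_est, z_spec):
    """Bias = Mean((z - z_spec) / (1 + z_spec))."""
    return mean(normalized_residuals(z_est, z_spec))
```

Here z_est stands for either z_phot or z_phot,SOM, matching the two panels of Fig. 6. The median-based scatter is insensitive to a few large outliers, whereas the mean-based bias is not, which is why catastrophic failures are excluded before the bias is averaged in redshift bins.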
How the spectroscopic calibration of single SOM cells will be implemented when building the lensing tomographic bins for Euclid is still under investigation and goes beyond the scope of this paper. Figure 6 (right) shows that six galaxies have |z_phot,SOM − z_spec|/(1 + z_spec) > 0.15. A seventh galaxy at z_phot,SOM = 0.8 (and therefore not visible in the figure) has a similarly discrepant redshift. As noted in G2020, they belong to cells with a large spread in photometric redshifts, probing the second peak of the distribution or its tail (see Fig. 8).
Figure 9 examines the success rate as a function of redshift. As already discussed above, the success rate is lower in the K band and peaks when the redshifted Hα line is around the middle of the redshift range of either band. This is expected since the wavelength coverage depends upon the position of the galaxies on the masks, and we set the central wavelength of the spectrographs to the average of the expected Hα positions. Moreover, the focus of the spectrographs deteriorates for objects that are not within the central 2.8 arcmin stripe of the field. Similarly, the [O iii] lines at z_phot ≤ 2.1 fall outside of the covered wavelength range in the H band if the objects are not in extreme positions on the masks. Overall, the success rate achieved for z_phot ≥ 2.0 objects targeting the Hα line in the K band or the [O iii] lines in the H band is similar. In contrast, we managed to detect the Pa lines only once (with Flag = 3) among the 77 targeted z_phot ≤ 1.7 objects in the K band. However, these objects were selected as fillers when space was available in the masks and no Hα targets were left.
Figure 10 examines the success rate as a function of the apparent total H-band magnitude of the galaxies. The success rate is not a strong function of the H magnitude; it is approximately 35% down to H magnitudes of 23 and declines for fainter objects.
With the same motivation as in G2020, we now examine the success rate in populating SOM cells with spectra. We targeted 518 cells (388 in the COSMOS field, 194 in the VVDS field, with overlap), obtaining reliable spectra (Flag ≥ 3) for 202 (127 in COSMOS, 94 in VVDS, with overlap), translating to a success rate of 39% (33% in COSMOS, 48% in VVDS). Of the 202 cells probed successfully here, 49 were empty before these observations; for the remaining 153 cells with at least one good spectroscopic redshift from previous C3R2 campaigns, on average our observations increased the number of spectroscopic redshifts per cell by 72%. Figure 11 shows the distribution of the SOM cells probed by the current spectroscopic release, together with the cells still empty. Summing up the contributions of all C3R2 releases, we find the following. In the 839 cells with 1.3 < z_phot,SOM < 1.7, there are 52491 galaxies (with I-band magnitudes brighter than 24.5 in the COSMOS and VVDS fields). In 683 of these cells (81%), where 45370 galaxies are found (86% of the total), we collected at least one good spectroscopic redshift. In the 895 cells with 2 < z_phot,SOM < 2.7, there are 40013 galaxies. In 663 of these cells (74%), where 30981 galaxies are found (77% of the total), we collected at least one good spectroscopic redshift.
Fig. 3: One- and two-dimensional spectra with Flag = 4, 3.5, 3 of objects with z_meas < 2. The 2D spectra are 10 arcsec in width, with the vertical scale given in pixels, and show the whole wavelength range (in µm) observed. The 1D spectra show the wavelength range around the relevant emission lines; the black solid lines show the measured flux (in arbitrary units with a 3 pixel smoothing), the dotted lines the flux without smoothing plus its errors, the red lines the sky (scaled down by a factor of 1000 and shifted by −0.2 flux units). The vertical dashed lines indicate the emission lines (with vacuum rest-frame wavelengths in Å) used (when visible) to measure the spectroscopic redshift (not yet corrected to the heliocentric system).
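The coverage fractions for the two redshift ranges can be reproduced from the cell and galaxy counts. The sketch below takes the total numbers of cells and galaxies from this section and the numbers of still-empty cells (156 and 232) from the conclusions; the function name and the dictionary layout are assumptions of this example.

```python
def coverage(n_cells, n_cells_empty, n_gal, n_gal_calibrated):
    """Return (calibrated cells, cell fraction, galaxy fraction)."""
    n_cells_calibrated = n_cells - n_cells_empty
    return (
        n_cells_calibrated,
        n_cells_calibrated / n_cells,
        n_gal_calibrated / n_gal,
    )


# Counts quoted in the text for the two redshift ranges
RANGES = {
    "1.3 < z < 1.7": coverage(839, 156, 52491, 45370),
    "2 < z < 2.7": coverage(895, 232, 40013, 30981),
}

for label, (n_cal, f_cells, f_gal) in RANGES.items():
    print(f"{label}: {n_cal} cells ({f_cells:.0%}), galaxies {f_gal:.0%}")
```

The galaxy fractions are higher than the cell fractions in both ranges, indicating that the cells still lacking a spectroscopic redshift are on average less populated than the calibrated ones.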

Discussion and conclusions
We present the results of the LBT campaign to calibrate photometric redshifts in the redshift range 1.3 to 2.7 using the LUCI near-infrared spectrographs in the framework of the C3R2 project. We observed 88 masks, 58 in the COSMOS field and 30 in the VVDS field, with an average of 12 objects per mask, aiming to detect the Hα line for primary targets, and the [O iii] or Pa lines for secondary objects. We extracted 1119 spectra, 1100 of which were identified as C3R2 galaxies. For 292 of these spectra we were able to measure reliable spectroscopic redshifts. We assessed their precision from repeated measurements to be in the range ∆z ∼ 0.0002 to 0.0004. After averaging repeat measurements, we ended up with reliable spectroscopic redshifts for 253 galaxies, two of which had already-known values.
Comparison with the C3R2 photometric redshifts shows that none of these galaxies are catastrophic outliers. The values of σ_NMAD and the bias are comparable to those reported in the previous C3R2 releases. Analyzing the whole C3R2 spectroscopic database published to date, we detect a small systematic shift (z_phot − z_spec)/(1 + z_spec) on the order of −0.01 ± 0.0015 in the redshift range 1.3 < z_phot < 1.8 and of +0.01 ± 0.003 in the range 2 < z_phot < 2.7. This is relevant for the calibration of the redshifts of SOM cells in these redshift ranges, essentially matching the Euclid requirement of knowing the mean redshift of each tomographic bin to 0.002(1 + z). Our redshift determinations populate 49 SOM cells with no prior spectroscopic measurements, and approximately double the occupation numbers of an additional 153 cells. Seven SOM cells have |z_phot,SOM − z_spec|/(1 + z_spec) ≥ 0.15; they have a large spread in photometric redshifts, and our objects probe the second peak of the distribution or its tail. In the redshift range 1.3 to 1.7 there are still 156 cells (19%) without at least one good spectroscopic redshift; in these cells that lack a spectroscopic calibration we find only 14% of the galaxy population. In the redshift range 2 to 2.7 there are 232 cells (26%) without at least one good spectroscopic redshift; in these cells we find 23% of the galaxy population. In between, in the redshift range 1.7 to 2, where telluric absorption makes the direct observation of Hα impossible, the number of cells without at least one good spectroscopic redshift is 126 (28%); in these cells we find 26% of the galaxy population with redshifts between 1.7 and 2. In this range, the redshift bias (z_phot − z_spec)/(1 + z_spec) is not well constrained, with an error (0.005) larger than the Euclid requirement.
Two questions arise. First, is it worth attempting to calibrate the missing cells in the H and K bands with LUCI at the LBT? Probably not, since their density is too low for the field of view available. Moreover, G2020 found that most of them have low star formation rates, making the detection of emission lines very difficult, and the success rate (around 30%) of the present campaign is not particularly encouraging. Second, is it worth attempting the calibration of the cells missing in the redshift range 1.7 to 2 with ground-based facilities? Detecting the Hγ, Hβ, and [O iii] lines in the J band up to redshift 2, 1.9, and 1.8, respectively, could be possible with an increasing probability of success. Whether the LBT with the LUCI spectrographs should be used for such a campaign depends on the density on the sky of the objects needing spectra. However, the difficult remaining sources at these redshifts are primarily passive galaxies, and so spectroscopic searches for Balmer and [O iii] emission lines are unlikely to be efficient or successful. A costly solution could be ground-based (optical) absorption-line spectroscopy. A release of optical spectra gathered at the VLT and the GRANTECAN telescopes is in preparation. In the end, the natural (if possibly partial) solution will come from Euclid itself: its spectroscopic program will deliver near-infrared spectra in the relevant redshift range free of telluric absorption. The survey is designed to detect emission lines down to $2 \times 10^{-16}$ erg s$^{-1}$ cm$^{-2}$ at 3.5σ in the wide configuration, and to reach up to 2 mag fainter levels in the deep fields (Euclid Collaboration: Scaramella et al. 2021), or 3 to $6 \times 10^{-17}$ erg s$^{-1}$ cm$^{-2}$. Using the Hα fluxes measured by G2020, we estimate that 60 to 90% of the sources will be detected within these limits in the deep fields. Therefore, the mission itself will provide enough spectra to finish the photometric redshift calibration in the near-infrared range to the required precision.
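The deep-field flux limit quoted above follows from the standard relation between magnitude difference and flux ratio, F_2 / F_1 = 10^(−0.4 Δm). The short sketch below reproduces that arithmetic; the function name is hypothetical, and the only input taken from the text is the wide-survey limit.

```python
def fainter_flux_limit(flux_limit: float, delta_mag: float) -> float:
    """Flux limit reached when going delta_mag magnitudes fainter."""
    return flux_limit * 10.0 ** (-0.4 * delta_mag)


# Wide-survey line flux limit from the text, in erg s^-1 cm^-2 (3.5 sigma).
WIDE_LIMIT = 2.0e-16

# "Up to 2 mag fainter" in the deep fields.
deep_limit = fainter_flux_limit(WIDE_LIMIT, 2.0)
print(f"deep-field limit: {deep_limit:.1e} erg s^-1 cm^-2")
```

Two magnitudes correspond to a factor of about 6.3 in flux, which gives roughly 3.2 × 10^−17 erg s^−1 cm^−2, in line with the lower end of the quoted range.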
In particular, a question that needs to be clarified is whether, in a given SOM cell, the galaxies for which we are able or unable to measure a spectroscopic redshift might have different redshift distributions.

Technology, under a contract with NASA. We thank the staff of the LBT observatory for their support during the mask preparation and the execution of the observations.

The values of $\sigma_\mathrm{NMAD} = 1.48\,\mathrm{Median}(|z - z_\mathrm{spec}|/(1 + z_\mathrm{spec}))$ and $\mathrm{Bias} = \mathrm{Mean}((z - z_\mathrm{spec})/(1 + z_\mathrm{spec}))$ are also given, with $z = z_\mathrm{phot}$ in the left plot and $z = z_\mathrm{phot,SOM}$ in the right.

Fig. 7: Mean differences between photometric and spectroscopic redshifts. Left: Mean differences of $(z_\mathrm{phot} - z_\mathrm{spec})/(1 + z_\mathrm{spec})$ of non-catastrophic failures in bins of $z_\mathrm{spec}$ as a function of $z_\mathrm{spec}$ for Flag ≥ 3 redshifts released by the C3R2 project: yellow from Masters et al. (2015, 2017, 2019); magenta from G2020; red from Euclid Collaboration: Stanford et al. (2021); blue, this paper. The black line shows the average in bins of the whole sample, the dotted lines the error range. Right: Same, but for Flag = 4 redshifts.