Faint millimeter NIKA2 dusty star-forming galaxies: finding the high-redshift population

We develop a new framework to constrain the source redshift. The method jointly accounts for the detection/non-detection of spectral lines and the prior information from the photometric redshift and total infrared luminosity from spectral energy distribution analysis. The method uses the estimated total infrared luminosity to predict the line fluxes at given redshifts and generates model spectra. The redshift-dependent spectral models are then compared with the observed spectra to find the redshift. Results. We apply the aforementioned joint redshift analysis method to four high-z dusty star-forming galaxy candidates selected from the NIKA2 observations of the HLSJ091828.6+514223 (HLS) field, and further observed by NOEMA with blind spectral scans. These sources only have SPIRE/Herschel photometry as ancillary data. They were selected because of very faint or no SPIRE counterparts, as to bias the sample towards the highest redshift candidates. The method finds the spectroscopic redshift of 4 in the 5 NOEMA-counterpart detected sources, with z>3. Based on these measurements, we derive the CO/[CI] lines and millimeter continuum fluxes from the NOEMA data and study their ISM and star-formation properties. We find cold dust temperatures in some of the HLS sources compared to the general population of sub-millimeter galaxies, which might be related to the bias introduced by the SPIRE-dropout selection. Our sources, but one, have short gas depletion time of a few hundred Myrs, which is typical among high-z sub-millimeter galaxies. The only exception shows a longer gas depletion time, up to a few Gyrs, comparable to that of main-sequence galaxies at the same redshift. Furthermore, we identify a possible over-density of dusty star-forming galaxies at z=5.2, traced by two sources in our sample, as well as the lensed galaxy HLSJ091828.6+514223. (abridged)


Introduction
It is now clearly established that dusty star-forming galaxies (DSFGs) are critical players in the assembly of galaxy stellar mass and the evolution of massive galaxies at z < 3 (e.g.Madau & Dickinson 2014).At higher redshift, observing the dusty star-formation and its spatial and redshift distribution requires undoubtedly (sub-)mm experiments and is still very challenging.For example, the limited existing estimates on dust-obscured star formation rate densities (SFRD) at z > 4 are still not consistently measured, as shown in the discrepancy between recent studies (e.g., Gruppioni et al. 2020;Dudzevičiūtė et al. 2020;Fudamoto et al. 2021;Zavala et al. 2021;Fujimoto et al. 2023).This is largely due to difficulties in uncovering a large unbiased sample of high-redshift DSFGs in relatively large cosmic volumes.Bright and faint DSFGs at high redshift have been uncovered by SPT (Reuter et al. 2020) and ALMA surveys (Franco et al. 2018;Zavala et al. 2021;Aravena et al. 2020).
However, statistical studies with these sources suffer from the fact that either strongly-lensed DSFG samples are not well statistically defined or covered areas are limited.
It is well known that in the (sub-)millimeter, larger area and relatively deep surveys can efficiently find high-redshift DSFGs (Béthermin et al. 2015b), thanks to the negative k-correction (e.g., Casey et al. 2014), combined with the shape of the luminosity functions.Such large-area deep surveys are conducted with single-dish telescopes, as with the SCUBA2 instrument on the JCMT (Holland et al. 2013) or the NIKA2 instrument on the IRAM 30m (Perotto et al. 2020).The angular resolutions of such single-dish surveys are 13", 11.1" and 17.6", for SCUBA2 at 850 µm, and NIKA2 at 1.2 and 2 mm respectively.This makes it difficult to unambiguously identify the multiwavelength counterparts of the DSFGs and to search for the high-redshift population.As already shown by the follow-ups of SCUBA2 sources with ALMA (e.g., Simpson et al. 2020), the combination of single-dish and interferometer surveys is by far the most efficient way of constraining the dusty star formation at 2 < z < 6.Indeed, the high resolution and sensitivity of (sub-)millimeter interferometers can provide accurate position measurements on DSFGs and thus the identification of their multiwavelength counterparts.However, getting photometric redshift from optical-IR is complicated by the lack of sufficiently deep homogeneous multi-wavelength data to analyze large samples.Moreover, DSFGs are subject to significant optical extinction (some of them are even optically dark, see Franco et al. 2018;Williams et al. 2019;Manning et al. 2022) which impacts the quality and reliability of photometric redshift estimates and prevents optical/near-infrared spectroscopic follow-up.Photometric redshifts from far-IR/mm to radio broad-band photometry have been used in studies on the cosmic evolution of high-z DSFGs since the discovery of DSFGs (Yun & Carilli 2002;Hughes et al. 2002;Negrello et al. 2010).However, these measurements are even more uncertain than the optical-IR photometric redshift, as the spectral energy distributions (SEDs) in the far-IR/mm do not show any spectral features (but a broad peak), and there are often only few data points on the SEDs to constrain the model.In addition, there is a strong degeneracy between dust temperature and redshift in distant dusty galaxy, which limits the usefulness of simple photometric redshifts (e.g., Blain 1999).Finally, in the modelling of the FIR emission, optically thin or thick solutions are heavily degenerate.Indeed, the same SED could arise from either cold and optically thin or from a warmer and optically thicker FIR dust emission with no robust way to discriminate between the two by using continuum observations (Cortzen et al. 2020).This often leads to an overestimate of the FIR photometric redshifts because of an apparent colder dust temperature derived from optically-thin emission in high-redshift, starbursting DSFGs (Jin et al. 2019).
For such galaxies, spectral scans in the millimeter can be the only way of getting the spectroscopic redshift, as shown in e.g.Walter et al. (2012); Fudamoto et al. (2017); Strandet et al. (2017); Zavala et al. (2018).The success rate of measuring the redshift using millimeter spectral scans can be very high, being >70% (Weiß et al. 2013;Strandet et al. 2016) and even up to >90% (Neri et al. 2020).Such a success rate is obtained on large samples in a reasonable amount of telescope time but for bright DSFGs.For example, with a total time of 22.8 hours on 13 DSFGs with average 850µm fluxes of 32 mJy, Neri et al. (2020) measured the redshift of 12/13 sources with NOEMA.Weiß et al. (2013) obtained a ∼90% detection rate for sources with S 1.4 mm > 20 mJy.Obviously, for much fainter objects, obtaining redshifts may become much more difficult (e.g.Jin et al. 2019).
We are currently conducting a deep survey with NIKA2, the NIKA2 Cosmological Legacy Survey (N2CLS), a guaranteed time observation (GTO) large program searching for a large sample of high-z DSFGs (Bing et al. 2022(Bing et al. , 2023)).The observations cover two fields, GOODS-N and COSMOS, and most of the detected DSFGs are sub-mJy sources at 1.2 mm.One of the goals of N2CLS is to put new solid constraints on the obscured SFRD at z > 4. To reach that goal, we need first to obtain the redshift of N2CLS sources.While deep optical-IR data are available in the two fields and have been extensively used to obtain photometric redshifts, a large fraction of the sources currently lack a secure redshift.Given the wealth of ancillary data already available on these two fields, blind millimeter spectral scans is the only solution to measure their spectroscopic redshift.As a pilot program to try to identify the high-redshift population, we selected 4 high-redshift candidates detected at 1.2 and 2 mm by NIKA2.They have been selected from their far-IR/mm SEDs photometric redshift in the HLSJ091828.6+514223field observed with NIKA2 during the Science Verification.This paper presents the redshift identification and source properties based on the spectral scans obtained with NOEMA on these sources.It is organised as follows.In Sect 2, we present the sample and NIKA2 observations.Section 3 describes the NOEMA observations and data reduction, as well as the extraction of continuum fluxes and spectral scans.In Sect.4, we extensively discuss the redshifts.In particular, we develop a new method that combines both far-IR to millimeter photometric data and spectral scans to measure the redshift.Source properties, as their dust mass and temperature, kinematics and excitation of molecular gas, are given in Sect. 5. Section 6 presents the potential discovery of a DSFG over-density at z=5.2 in the HLS field.Conclusions on the main results and the possible implications of our findings in future high-z DSFGs studies are given in Sect.7. Finally, three appendices give more details on the method of redshift measurements and its validation.Throughout the paper, we adopt the standard flat ΛCDM model as our fiducial cosmology, with cosmological parameters H 0 =67.7 km/s/Mpc, Ω m =0.31 and Ω Λ =0.69, as given by Planck Collaboration et al. (2020).

NIKA2 field around HLSJ091828.6+514223
As part of the NIKA2 Science Verification that took place in February 2017, we observed an area of 185 arcmin2 , centered on HLSJ091828.6+514223 , a lensed dusty galaxy at z=5.24 (Combes et al. 2012), for a on source time of about 3.5 hours at the center.This allowed us to reach 1σ sensitivities of about 0.3 mJy at 1.2 mm and 0.1 mJy at 2 mm on HLSJ091828.6+514223.This galaxy is close to the z=0.22 cluster Abell 773, but is likely lensed by a galaxy at z∼0.63.For our NIKA2 sources, the magnification by the galaxy cluster is <10%.Therefore, we do not expect the NIKA2 sources to be highly magnified (E.Jullo, private communication).
The NIKA2 field overlaps almost entirely with Herschel SPIRE observations at 250, 350, 500 µm.On the contrary, the PACS, IRAC and HST images cover only a very small part of the field on the west side (where NIKA2 observations have lower signal-to-noise ratios).Thus only SPIRE data were used to select high-z candidates.The SPIRE fluxes were measured using FASTPHOT1 (Béthermin et al. 2010) through simultaneous PSF fitting, using NIKA2 source positions as priors on the SPIRE maps.
We built a 1.2 and 2 mm catalog using the NIKA2 data reduced using the collaboration pipeline (Ponthieu et al., in prep).A total of 27 sources are detected with S/N>5 in at least one band (1.2 or 2 mm).From this catalog, we selected four sources detected at both 1.2 and 2 mm with high signal to noise ratio (between 5.7 and 9.7) and for which there is a faint (at the level of confusion noise) or no SPIRE counterparts, as to bias the sample towards the highest redshift candidates.Indeed, rough sub-millimeter photometric redshifts, obtained by fitting empirical IR SED templates from Béthermin et al. (2015a) to our SPIRE+NIKA2 data, were z phot ∼ 5 − 7.These sources are named HLS-2, HLS-3, HLS-4, and HLS-22.Their fluxes are between 1.7 and 2.9 mJy at 1.2 mm and 0.28 and 0.60 mJy at 2 mm.The flux measurements and uncertainties are presented

NOEMA observations and data calibration
Follow-up observations were made using NOEMA from 2018 to 2020, with 4 different programs.The 4 sources in the HLS field were all observed by NOEMA with the PolyFiX correlator.They were initially targeted by project W17EL (HLS-2/3/4) and W17FA (HLS-22) using the same setups that continuously cover the spectra from 71 GHz to 102 GHz with the D configuration in band1.HLS-22 were further observed in project W18FA with the A configuration in band1 and HLS-2/3 were further observed in project S20CL with the D/C configuration in band2.The total on-source time of all of the proposals is 44.9 hours.The details of the observations on each source are summarized in Table 2.
NOEMA observations are first calibrated using CLIC and imaged by MAPPING under GILDAS2 .Radio sources 3C454.3,0716+714, 1156+295, 1055+018, 0851+202 and 0355+508 are used for bandpass calibrations during these observations, and the source fluxes are calibrated using LHKA+101 and MWC349.With the calibrated data, we further generate the uv table with the original resolution of 2 MHz.We also produce the continuum uv table of each source by directly compressing all corresponding lower sideband (LSB) and upper sideband (USB) data with the uv_compress function in MAPPING.

NOEMA continuum flux measurement and source identification
We identify the counterparts of our sample in the NOEMA continuum data.We first generate the continuum dirty map and then clean the continuum image of each source with the Clark algorithm within MAPPING.The cleaned image of each source with the highest SNR and(or) the best spatial resolution is shown in Fig. 1.We blindly search for candidate sources by identifying all of the peaks above 4×RMS within the NOEMA primary beam.Their accurate positions are then derived with uv_fit function in MAPPING (with the peak positions as the initial prior and point source as the model), and the continuum fluxes at the other frequencies are estimated with source models fixed to these reference images.
The continuum fluxes are measured using uv_fit and the same models as given in Table 3.The 4 sidebands in the 2 setups of W17EL and W17FA are combined together to generate continuum uv tables centered on 3.6 mm, given the low SNR of the continuum emission at such a long wavelength.When data are available, the continuum fluxes at higher frequencies are measured both sideband by sideband and on the combined LSB+USB uv-table.The continuum fluxes are listed in Table 4.
We detect 5 reliable continuum sources within the primary beam of NOEMA observations as counterparts of 4 NIKA2 HLS sources.We further checked the residual RMS on the map with source models more complex than point sources.This does not improve the level of residuals of three NOEMA sources.For the rest two sources, HLS-2-1 and HLS-3, the favored simplest models are a circular Gaussian model and an elliptical Gaussian model, respectively.The position and preferred models of each source is listed in Table 3, and we note that the position of these sources does not change significantly depending on the model.
We show on Fig. 1 the cleaned images of NOEMA observations.The NIKA2 source HLS-2 is resolved into two continuum sources in our high-resolution NOEMA observation with SNR∼10.The rest of NIKA2 sources are all associated with one single NOEMA source.For these sources (HLS-3, HLS-4 and HLS-22), we compare their positions in the NOEMA and NIKA2 observations.The maximum offset is found in HLS-3 with a value of 1.9 arcsec.The average offset is 0.9 arcsec among these three sources, which suggests a high positional accuracy of NIKA/NIKA2 for locating sources with relatively high SNR.
For HLS-2 and HLS-3, part of our NOEMA observations measure their continuum fluxes at a frequency close to the representative frequencies of NIKA2 2 mm bands.The NOEMA and NIKA2 fluxes are consistent for HLS-3.The total NOEMA fluxes of the 2 components of HLS-2 is 50% higher than that measured by NIKA2, while still being consistent with each other within 3 σ uncertainties.This first comparison is encouraging.A detailed study of NIKA2 and NOEMA fluxes is beyond the scope of this paper and will be conducted with more statistics (e.g. with the NOEMA follow-up of N2CLS sources).

Extraction of NOEMA millimeter spectra
We extract the millimeter spectra of NOEMA continuum sources from the full uv table.The uv tables are first compressed by the uv_compress function in MAPPING, which makes averages within several channels to enhance the efficiency of the line searching with higher SNR per channel and smaller load of data.For observations in band1 we set the number of channels to average to 15 while the observations in band2 and band3 are averaged every 25 channels, which corresponds to channel widths of 107km/s, 100km/s and 59km/s at 84 GHz, 150 GHz and 255 GHz, respectively.Given the typical line width (a few hundred to one thousand km/s) for sub-millimeter galaxies (Spilker et al. 2014), the compression of uv tables could still ensure Nyquist sampling by 2-3 channels on the emission line profiles and preserve the accuracy of line center and redshift measurement.
To extract the spectra, we perform uv_fit on the compressed spectral uv table with the position and source model fixed to the same as those given in Table 3.For the observations of the W17EL002 setup, we flagged the visibilities associated with one antenna significantly deviating from the others.Given the relatively low angular resolution of most of our data on HLS sources   (∼5" in band1 and ∼2" in band2), these galaxies is unlikely to be significantly resolved, thus the uv_fit at fixed position on the uv tables should be able to uncover the majority of their line emission.
We further remove the continuum in the extracted spectra, assuming a fixed spectral index of 4. This is equivalent to a modified black-body spectrum with a fixed emissivity (β) at 2, and is generally consistent with the dust emissivity we derived in Sect.5.2.We use these continuum subtracted NOEMA spectra for the redshift search and the emission line flux measurement (see Sect. 4.2 and Sect. 5.1).The extracted spectra and continuum model to be removed are shown in Fig. 2.

Source redshift from photometric-spectroscopic joint analysis
An accurate redshift is a prerequisite to the accurate estimate of the physical properties of high-z galaxies.However, the optical-IR SED of high-z DSFGs are often much poorer constrained Notes. (*) Data sets and fluxes derived with free parameters on source position and shape in uv_fit.The fluxes at the other frequencies on a specific source are fitted with positions and shapes fixed to the same as marked data set, which are given in Table 3.
than other high-z galaxies due to their faintness at these wavelengths, which poses challenges to the accurate measurement of their photometric redshifts.In far-IR, the degeneracy between colors, dust temperature and redshift could also lead to a highly model dependent estimate of the photometric redshift.In this section, we describe the different methods and summarize the results of redshift estimate on our sample, with both photometric and spectroscopic data described in Sect.3. Specifically, we introduce a new joint analysis framework to determine the red-HLS-2-1 HLS-2-2 HLS-3 HLS-22 HLS-4 Fig. 2: Millimeter spectra of all HLS sources extracted from the uv tables obtained from the NOEMA observations.The continuum models to be subtracted are presented as grey solid lines.The lines identified to determine the spectroscopic redshift of the sources (see Sect. 4 and Fig. 4 for details) are marked by dashed black vertical lines.shifts of NIKA2 sources combining the probability distribution function of photometric redshifts together with the corresponding IR luminosities and blind spectral scans, which helps us identify the low SNR spectral lines in the NOEMA spectra.

Photometric redshifts
The lack of deep optical and infrared data for the HLS field makes it impossible to conduct a full SED modeling of NIKA2 detected sources.However, with the NIKA2 and SPIRE pho-tometry, we fit the far-IR SED of HLS sources with dust emission templates to estimate their redshifts and IR luminosities.Given the poor angular resolution of the FIR data, we are not able to obtain the fluxes of each single component resolved by NOEMA observations on HLS-2.Thus we only fit with the integrated fluxes under the assumption that the 2 components blended within the beam of SPIRE and NIKA2 are located at the same redshift.
B15 templates could be described as a series of empirical dust SEDs of galaxies at different redshifts.The dust SEDs are produced based on the deep observational data from infrared to millimeter.It considers 2 populations of star-forming galaxies, starburst and main-sequence galaxies, and produces the 2 sets of empirical SED templates correspondingly.We fit our photometric data points with the templates of main-sequence galaxies, which consist of 13 SEDs at each redshift.These templates include the average SED and the SEDs within ±3σ uncertainties with steps of 0.5σ.The estimated redshift, as well as the 1σ uncertainties based on the fitting using B15 main sequence and starburst SED templates are listed in Table 5.In the following redshift searching involving B15 templates (see Sect. 4.2 and Sect. 4.3), we will only use and present the output of SED fitting based on main sequence templates.This is mainly because the output results of the SED fitting based on starburst and main sequence templates, as shown in Table 5, are highly consistent considering the uncertainty.Casey (2012) describes the intrinsic FIR dust emission of galaxies using a generalized modified black-body model in far-IR plus a power-law model at mid-IR.For the SED fitting with the Casey (2012) template, we work within the framework of the MMPZ algorithm Casey (2020).It considers the intrinsic variation of dust SED at different IR luminosities, as well as the impact of the rising CMB temperature at high redshift.The default set of IR SEDs fixes the mid-infrared spectral slope to 3 and dust emissivity β to 1.8.The template SED also considers the transition from optically thin to optically thick when going to lower wavelengths, where the wavelength of unity opacity (τ(λ)=1) is fixed to 200 µm.The redshift, the total infrared luminosity and the corresponding wavelength at the peak of IR SED are the main parameters to be considered for the fit.The empirical correlation between the latter two parameters is also taken into account during the fit.
From the analysis and results shown in Fig. 3, we find that the redshifts from MMPZ are systematically lower than those from B15 template fitting, with a typical ∆z/(1 + z) of around 20%.However, the two redshifts are still consistent within their uncertainties.The infrared luminosities returned by MMPZ are also systematically lower by ∼0.3 dex, especially at redshifts beyond 3.
The faintness and large flux uncertainties of our sources in the 3 SPIRE bands make the constraint on the peak of their IR SEDs much worse than for brighter/lensed high-z sources, which leads to large uncertainties on the estimated total IR luminosity.Compared to the template fitting with B15, MMPZ further takes the CMB heating and dimming (da Cunha et al. 2013) into consideration.Although this could affect the dust emissivity index β and, as a result of β-T degeneracy, the dust temperature and IR luminosity, the β values are all fixed to 2 in these 2 templates.Thus we consider that the inclusion of the CMB effect is not the major contributor to the differences between the results of the two template fitting methods.
The difference in the estimated total IR luminosities will be propagated to the joint photometric and spectral analysis on the source redshift in Sect.4.2.

Joint analysis of photometric redshifts and NOEMA spectra
Due to the lack of characteristic spectral features in far-IR, the photometric redshifts of our sample derived from Sect.4.1 still have large uncertainties.Searching for emission lines in the millimeter spectra provides an approach to constrain our redshifts with a significantly better accuracy.To identify the possible emission lines in the spectra, we performed a blind search in the NOEMA spectra.The NOEMA spectra are first convolved by a box kernel of 500 km/s width, which corresponds to the typical molecular line width of bright (sub)millimeter selected galaxies (e.g., Bothwell et al. 2013).To more completely uncover the possible emission lines in these noisy spectra, we list five lines 3.2 100.628 3.0 (if exist) with the highest S/N in the convolved spectra with S/N > 3 in Table 6.We failed to detect any lines for HLS-2 and HLS-4 with S/N > 3 in our observations.For HLS-3 and HLS-22, we identify one and two detections, respectively.The "detection" at 100.628 GHz in the HLS-22 spectrum is likely to be a glitch or noise spike with wrongly estimated uncertainty (see Appendix A and Fig. A.2).With only one significant detection of an emission line, it is not possible to have an unambiguous redshift solution.
To find the redshift solutions, we need to take additional constraints from the broad-band photometry, in particular from the total infrared luminosities at any sampled redshift in the SED fitting.From the output χ 2 and IR luminosities of all models at one given redshift, we could derive the weighted average value of total infrared luminosity of the source at this redshift using Eq. ( 1): where the σ( j) is a weighting term to account for the deviation of the 13 model SEDs from the median of star-forming galaxies at a given redshift in B15.Indeed, at a given redshift, the B15 template includes 1 median SED and 12 SEDs within ±3σ uncertainties with a spacing of 0.5σ.So when deriving the source IR luminosities at given redshifts, the σ( j) terms should be included to account for the probability of the IR template SEDs to deviate from the median in B15 model.The values of σ( j) are thus between -3 and +3 with step of 0.5.When using the output from MMPZ, the σ( j) will be set to 0.
With a series of average IR luminosity over the redshift grid from the SED fitting, we linearly interpolate the IR luminosity at any given redshift.We further use the IR luminosity to constrain the fluxes of strong FIR-millimeter emission lines at any given redshift based on the well defined, almost redshift invariant L FIR -L line relations in the form of Eq. 2: The luminosities and fluxes of the 12 CO lines of J(1-0) to J(12-11), two transitions of [CI], and the [CII] line at 158 µm are predicted based on various scaling relations found in the literature.The detailed information are listed in Table 7 and references therein.With the estimated fluxes of different line species at a given redshift, we generate a model spectrum in the frequency range of the NOEMA spectral scans and compare this model with the observations 3 .When generating the model spectra, we assume the emission lines have Gaussian profiles with a fixed full width half maximum (FWHM) of 500 km/s.We also linearly interpolate the Fig. 3: Results of IR template fitting on our 4 HLS sources with the B15 dust templates and MMPZ method, using the SPIRE, NIKA2 and NOEMA photometry.The plots in the first column show the probability density distribution (normalized by the peak values) of each sources.The second column shows the evolution of the weighted average infrared luminosity with the redshift.The third column shows the best fit SED models with the observations.Sources from the top to the bottom are HLS-2, HLS-3, HLS-4 and HLS-22.
L FIR,med -z relations from the IR template fitting to a finer redshift grid to avoid missing any possible redshift solutions.
The spacing between adjacent redshifts in the resampled grid satisfies Eq. 3, which is equivalent to a fixed spacing in velocity (∆v) between adjacent redshifts: We fixed the ∆v to be 1/3 of the chosen FWHM, making the emission line profile to be Nyquist-sampled by the predicted line centers at the corresponding redshifts in the new grid.This ensures that emission lines in the spectra and their corresponding redshift solutions will not be missed in our analysis due to poor redshift sampling.The goodness of the model prediction at a given redshift is evaluated by log-likelihood ln(L spec (z)) from the χ 2 between the model spectra and the data, as given below in Eq. 4 and Eq.5: In addition to the goodness of match between spectra and models, we further account for the goodness of SED fitting at given redshifts, χ 2 S ED (z), which is defined similarly to Eq. 5.The joint log-likelihood at each sampled redshift reads as: As already pointed out, we assume that the two counterparts for HLS-2 have a similar redshift and share the same FIR SED.Under this assumption, the total infrared luminosity of the two NOEMA sources is thus computed based on their contributions to the total flux at 2 mm, which are later used in deriving their final joint-likelihood of redshift.

Redshifts measurement
The results of the joint log-likelihood of redshift from photom-etry+spectral scan analysis on the five HLS sources are shown in Fig 4. For each source, we normalize the L spec (z) to the peak value, which helps us to compare quantitatively the relative goodness of match between the model predictions and the observed spectra at different redshifts.We select all the peaks in ln(L spec (z)) with an amplitude larger than -10 and a width larger than 3 samples in the redshift grid as the possible redshift solutions of our sources, using the "find_peaks" algorithm in SciPy.Considering the large uncertainties on the total infrared luminosity of HLS sources, we further cross validate their possible redshift solutions by repeating the joint likelihood analysis using the output IR luminosity at different redshifts from MMPZ fitting, and apply the same algorithm to record the possible redshift solutions.Compared to the log-likelihood of redshift with photometric constraints only (see Fig. 3), the joint analysis helps us to highlight significant isolated peaks for the redshift of HLS-2-1, HLS-2-2, HLS-3 and HLS-22 at z=5.241, 5.128, 3.123 and 3.036, respectively.The redshift with maximum log-likelihood value is also not sensitive to the choice of IR templates.As shown in Fig 4, the B15 template and MMPZ find almost the same redshift where the joint log-likelihood value reaches the maximum, which further confirm their redshift solutions as listed above.For HLS-4, despite having the most accurate photometric redshift constrained through template fitting, the absence of emission line detection in the band1 spectral-scan observations results in our analysis being unable to identify significant peaks in the joint log-likelihood.Thus no reliable redshift solution is found for this source.
For HLS-3 and HLS-22 with blindly detected candidate emission lines at S/N>3, our method successfully confirms the candidate lines but for that at 100 GHz in the HLS-22 spectrum.The extremely narrow profile of the candidate detection suggests that this is likely to be a glitch, which is shown and discussed in Appendix A. For HLS-3, MMPZ also reveals a secondary redshift solution at z=2.299 with slightly lower log-likelihood in the analysis.This assigns the most significantly detected emission line at 139.746 GHz to be CO(4-3), while the best solution at z=3.123 assigns it to be CO .If HLS-3 has z=3.123, we also expect to cover the CO(3-2) line in the spectral scan.Although the line is not detected at 3σ (see Sect. 5.1), we could not simply reject any of the two possible redshift solutions due to the the high noise level around the observed frequency.Thus, we consider the redshift of HLS-3 to be less secure than that found for the other sources with at least two lines with S/N>4 (HLS-2-1, HLS-2-2 and HLS-22, see Table 9) and we will provide the estimate of HLS-3 properties based on both redshift solutions in the paper.We also note that HLS-2-2 could have a secondary redshift solution at z=3.385.However, this redshift could not match with any of the two most significant emission lines found in the spectrum, which correspond to CO(4-3) and [CI](2-1) at z=5.128.Thus, we only adopt the z=5.128 solution in the following analysis.
In Table 8, we summarize the redshifts from the joint analysis method (z joint ), as well as the far-IR photometric redshifts based on the two far-IR templates (z B15,ms and z MMPZ ).For sources with ambiguous redshift solutions, we use the two different z f ix for the analysis.The uncertainties of z joint are conservatively given and correspond to 0.5×FWHM of the line (see Table 9) In the following sections, the source properties are estimated at the best solution of redshift of joint analysis, or the most possible photometric redshift when no redshift solution is found in the joint analysis.These choices of redshifts of our sample are listed as z f ix in Table 8.
The first row shows the normalized log-likelihood from SED fitting and the joint log-likelihood for each source after considering the information obtained from the NOEMA spectral scans.The second and third row show the cutout of the spectra around candidate spectral lines at the best redshift solutions.The lines shown in the second row are detected in the earliest band1 spectral scans (W17EL and W17FA) and those in the third row are detected in the additional follow-up observations (W18FA and S20CL).The models generated based on the fit with the B15 dust templates and MMPZ at the most probable redshift are plotted as solid and dashed red lines, respectively.We emphasize again that these models are not coming from parametric fitting but are generated using an estimate on the total infrared luminosity and the L CO (L [CI] )-L IR scaling relations.Given the line profile, the spectra of HLS-3 will be analysed using a double-gaussian model.The rest of the sources will be analysed using a single-gaussian model.

Robustness and self-consistency of the joint analysis method
The analysis on the source properties of our sample, as dust mass, temperature and star-formation rate, largely relies on the estimated redshifts from the joint analysis method, which is subject to assumptions on the line widths and line luminosities.
To test the robustness of the redshift derived from the jointanalysis method, we make tests with model spectra of varying line widths, with NOEMA data of more limited spectral coverage and with different far-IR templates in deriving photometric redshift and predicting IR luminosity.These tests show that the redshifts of our sources from the joint analysis are reliable.Besides, we also check the self-consistency of our redshift solution by comparing our L FIR -L CO correlation with the scaling relations and their scatters.These tests and discussions are presented in Appendix A,B,C and D.

Kinematics and excitation of molecular gas of HLS sources
We find redshift solutions to 4 NOEMA sources associated with 3 NIKA2 sources.Each of these sources has at least one emission line detected with SNR>4 or 2 lines with SNR at ∼3-4 in NOEMA spectral scans.With these redshifts, we further measure the flux and line width of the spectral lines covered by the observations, and derive the corresponding lines luminosity.We start the fitting with a single Gaussian model on the continuumsubtracted spectra of each source.No matter if they are detected with high significance, all CO/[CI] lines falling into the frequency coverage of NOEMA are considered.To make a more robust analysis on the line width, we also force the kinematics of CO and [CI] lines to be the same during the fit.For the emission line at 139.746 GHz of HLS-3, a double Gaussian model results in an Akaike information criterion (AIC) of -17.1, comparing to the AIC of -11.3 using a single Gaussian model.This suggests an improved quality of the fit with the double-Gaussian model and thus we use this as the model for HLS-3.The line widths, fluxes and upper limits of CO/[CI] lines for each source are listed in Table 9.We measure the total flux or flux upper limit of each line by integrating the spectra within ±3σ line around the best-fit line center for the single-peaked lines.For HLS-3 with a double-peaked line profile (noted as the red and blue peak, respectively), we estimate the line fluxes and upper limits by integrating the spectra in the range of [f center,red -3σ red , f center,blue +3σ blue ].The corresponding CO/[CI] line luminosities or 3σ upper limits (L' line and L line ) are also calculated using the following equations (from Solomon et al. 1997): where S line ∆V is the velocity integrated flux in Jy km s −1 , ν rest = ν obs (1 + z) is the rest frequency in GHz, and D L is the luminosity distance in Mpc.
The [CI](1-0) lines of HLS-2-1 and HLS-2-2 are covered by the spectral scan but are located at the noisiest edges of the NOEMA sidebands.This makes their upper limits of little scientific value and thus we discard them from the table.The CO(7-6) line of HLS-2-2 is marginally detected but only partly covered by our observations.For this line, we use the output parameters from the spectral line fitting to constrain its flux using a complete Gaussian profile.
Figure 5 shows the best-fit models for each CO/[CI] line.The line identification in each panel are presented assuming the best redshift solution of each source, as listed in Table 8.The spectra have the same channel width as the ones we used for the joint analysis.From the best-fit parameters we find the line widths are generally consistent with the assumption we made during the redshift search in Sect.4.2, with an average FWHM of 500 km/s.Although previous observations reveal that the integrated [CI] and CO lines from the same high-z galaxies may have different line widths and line profiles (Banerji et al. 2018), fixing or relaxing the velocities and widths of the different lines during our analysis does not significantly change the quality of the best-fit model.
The observations of the 2 sources associated with HLS-2 cover both mid-J (CO(4-3) or CO ) and high-J CO lines (CO(7-6) or CO ), allowing us to roughly estimate the conditions of their molecular gas and compare with other DSFGs at similar redshifts using luminosity ratios (expressed in K km/s pc 2 ).For HLS-2-1, the L' CO(8−7) /L' CO (5−4) is 0.31±0.11,which is consistent with the values found in typical high-z SMGs with low excitation (Bothwell et al. 2013) but lower than the reported value of some starburst galaxies and luminous quasars at similar redshift (Rawle et al. 2014;Li et al. 2020).The L' CO(7−6) /L' CO(4−3) and CO(8-7)/CO(4-3) of HLS-2-2 are <0.17 and <0.12, respectively.These values are even lower than the typical value of high-z SMGs but still consistent with the low excitation ISM found in the "Cosmic Eyelash" (Danielson et al. 2011).
For the rest 2 sources with CO detection of less separated quantum numbers J, our observations find L' CO(5−4) /L' CO(3−2) >0.73 in HLS-3 and L' CO(4−3) /L' CO(3−2) =0.78±0.22 in HLS-22, respectively.The value for HLS-22 is generally consistent with the average CO SLED of high-z SMGs in Bothwell et al. (2013), being similar to the case of HLS-2-1.On the contrary, HLS-3 has a L' CO(5−4) /L' CO(3−2) ratio higher than typical SMGs in Bothwell et al. (2013) and resembles the average of the SPT sample (Spilker et al. 2014) or the local starburst galaxy M82 (Carilli & Walter 2013) with higher excitation.However, the observations on these 2 sources do not cover higher-J CO lines like for HLS-2, which traces warmer and denser components in the molecular gas reservoir.Thus, with these 2 line ratio measurements, it is more difficult to conclude.

Dust mass and dust temperature
The Far-IR continuum emission of star-forming galaxies could be well represented by a single temperature modified black-body model from which the dust temperature (T dust ), dust emissivity index (β) and total dust mass (M dust ) can be derived.At high redshift, the increasing temperature of the cosmic microwave background (CMB) reduces the contrast of star-forming galaxy emissions in the (sub-)millimeter and changes the apparent shape of the spectrum at these frequencies.Considering the impact of the CMB, the observed modified black-body emission of high-z SMG could be expressed as Eq. 9 using the optical-thin assumption (da Cunha et al. 2013) : The dust emissivity κ(ν) in far-IR can be described by a single power law: where k 0 stands for the absorption cross section per unit dust mass at a given specific frequency ν 0 .Here we take k 0,850µm =0.047 m 2 /kg from Draine et al. (2014) (see also Berta et al. 2021).We perform MCMC fitting (using the PyMC3 package) on our far-IR to millimeter photometric data using the model given in Eq. 9.The two sources associated with HLS-2 are fitted using their integrated flux, as they have very similar redshifts and their individual fluxes at the far-IR could not be obtained with the low resolution SPIRE data.We adopt uniform priors for T dust and M dust and a flat prior between 1 and 3 for dust emissivity β.We constrain the temperature to be between T CMB at the given redshifts and 80 K.The redshift values are fixed to z f ix given in Table 8. Figure 6 shows as an example the best-fit modified black-body model for HLS-2, as well as the 1σ and 2σ confidence intervals.For HLS-22, the original fit with free parameters lead to a non-physically low dust temperature at 16 K and a high β larger than 3, which is due to the poor observational constraints at 3 mm and <500 µm.Thus, we perform a constrained modified black-body fit with a fixed β of 1.8, consistent with the average of the other HLS sources.We list the estimated dust temperature, mass and emissivity index in Table 10.
We derive a dust mass of ∼ 10 9 M ⊙ , which is consistent with the dust masses derived for bright SMGs selected from blind single-dish surveys (Santini et al. 2010;Miettinen et al. 2017).The abundant dust indicates that these high-z dusty star-forming galaxies have already experienced a rapid metal enrichment in the first few billion years of the Universe.The dust emissivity β of our sample (excluding HLS-22) has a median value of 1.75, which is also consistent with the values found in a variety of galaxies across the cosmic time.
We also measure the far-IR luminosities (L FIR ) by integrating the model SEDs between 50 and 300 µm at the rest-frame of each source.The L FIR are listed in Table 10.Our observations could not properly constrain and model the mid-IR emission of galaxies.Thus, we extrapolate our L FIR (50-300 µm) to the total infrared luminosity (L IR , 3-1000 µm) by multiplying L FIR by a factor of 1.3, based on the calibrations given in Graciá-Carpio et al. (2008).We further derive the star formation rates, SFR, based on the standard scaling relations from (Kennicutt & Evans 2012).The corresponding results are also listed in Table 10.
The dust temperature of our sample varies from 18 to 41 K. Fig. 7 shows the comparison between the dust temperature of our sample with other DSFGs and star-forming galaxies (Roseboom et al. 2013;Riechers et al. 2014Riechers et al. , 2017;;Pavesi et al. 2018;Béthermin et al. 2020;Faisst et al. 2020;Neri et al. 2020;Bakx et al. 2021;Sugahara et al. 2021) at different redshifts.Similar to our analysis, the literature dust temperatures here for comparison are all derived under the optically-thin assumption.The average dust temperature of normal star-forming and starburst galaxies from Béthermin et al. (2015a) and Schreiber et al. (2018) are also shown as baselines of comparison at different redshifts.We find large scatters in the dust temperature of our sample with respect to these empirical T dust -(z) scaling relations on star-forming/starburst galaxies.Among our HLS and literature sample, HLS-3 shows one of the lowest dust temperatures of 23(18) K, while the other three sources at higher redshifts have higher dust temperature not distinctive to literature DS-FGs and the average dust temperature of normal star-forming galaxies.DSFGs with apparently cold temperatures have been reported by some studies in recent years (Jin et al. 2019;Neri et al. 2020).Similarly to these galaxies, the redder/colder far-IR 0.9 +0.9 −0.5 ×10 2 HLS-4 4.3 8.9 +0.3 2.9 +2.0 −1.7 ×10 2 HLS-22 3.0 8.9 +0.1 SED of HLS-3 could resemble normal DSFGs at much higher redshift in SED modeling, which explains the significant deviation of its far-IR photometric redshift from its spectroscopic redshift based on the Béthermin et al. (2015a) SED template.At fixed L IR , we expect galaxies with colder dust temperature to be brighter at 1.2 mm and thus these galaxies are more favored by the selection of candidate high-z DSFGs based on red far-IR to millimeter colors.

Molecular gas mass
Both the CO emission lines and dust continuum in the Rayleigh-Jeans tail of the far-IR SED have been widely used to estimate the amount of molecular gas in galaxies (Carilli & Walter 2013;Hodge & da Cunha 2020).In this section we measure the molecular gas mass for our sample and cross-validate the results with various methods.
The detections/constraints on CO emission lines enable the estimate of the molecular gas mass using the CO luminosity to 10 3 obs (GHz) We also overlaid the average T dust -z relation of main sequence galaxies derived by Schreiber et al. (2018) and Béthermin et al. (2015a) based on observational data.For HLS-4 with only photometric redshift, we show with the dash-dotted grey line the degeneracy between dust temperature and redshift.The T dust of HLS-3 at the two possible redshifts are both plotted and connected by the red solid line.
molecular gas mass conversion factor α CO .Robust estimations of the conversion factor are mostly made on the lowest J transition, CO(1-0), while CO detections in our sample start from CO(3-2) to CO .Thus, we need to convert the luminosities of the lowest-J CO line detected in our observations to CO(1-0), in addition to the assumptions on the α CO conversion factor between CO(1-0) luminosity to molecular gas mass.In our case, we take the advantage of multiple line detections/flux upper limits in the NOEMA spectra of each source to find the matched cases in the literature and roughly estimate the CO(1-0) luminosity and molecular gas mass.As described in Sect.5.1, for each source, we compare their CO luminosity ratios with literature results and find the cases with CO SLEDs that could reproduce the observed values.For HLS-2-1 and HLS-22, we apply L' CO(5−4) /L' CO(1−0) =0.32 and L' CO(3−2) /L' CO(1−0) =0.52 from the average SLED of unlensed SMGs in Bothwell et al. (2013).
Having the L' CO(1−0) (see Table .11), we then estimate the total molecular gas mass with a fixed conversion factor α CO .As the derived star formation rates do not reveal solid evidence of ongoing starburst in our sample, we adopt the typical Milky Way value of α CO =4.36M ⊙ (K km/s pc 2 ) −1 .The estimated molecular gas are also listed in Table .11.
For all sources, we also provide an estimate on their molecular gas mass using their continuum emission in the Rayleigh-Jeans tail.Following the calibration described in Scoville et al. (2016), we derive the luminosity of dust emission at rest-frame 850µm.We use the series of optically-thin modified black-body models generated by the combinations of parameters  11.We find the differences between the molecular gas masses estimated by the two methods are within a factor of 2. Our galaxies have molecular gas mass of 1-3×10 11 M ⊙ , suggesting a gas-rich nature.For HLS-4, we derive a similarly massive molecular gas reservoir as that of the 4 sources with CO detections.
One of the primary source the uncertainty of molecular gas mass measurement comes from the CO-to-H2 conversion factor (α CO ).We applied a typical Milky-way α CO value of 4.36 to the CO(1-0) luminosities.The alternative estimate, based on (Scoville et al. 2016), has a equivalent α CO of 6.5.However, previous studies find starburst galaxies could have much lower α CO compared to the Milky Way like values typical for normal starforming galaxies (e.g Downes & Solomon 1998;Tacconi et al. 2008).The exact value of α CO at high redshift is still highly uncertain.Although there are evidence for α CO as large as the Milky Way in high-z SMGs, starburst-like values are also prevalently used in previous studies.This would introduce differences of a factor of 5-7 in molecular gas mass measurements.The impact of α CO is accounted for in the values of molecular gas mass and gas depletion time given in Table 11.
Combining the measurements from Tables 10 and 11, we derive the gas-to-dust ratios of the 4 sources (HLS-2-1, HLS-2-2, HLS-3 and HLS-22) with relatively secure spectroscopic redshifts and dust-independent molecular gas mass measurements from CO lines.With the assumption of a Milky Way like α CO , our analysis yields an average gas-to-dust ratio of 113, which is in line with values found in local and high-z massive galaxies (Santini et al. 2010;Rémy-Ruyer et al. 2014;De Vis et al. 2019;Rujopakarn et al. 2019) and consistent with the values expected at solar metallicity (Leroy et al. 2011;Magdis et al. 2012;Shapley et al. 2020).A lower, starburst-like α CO will lead to an average gas-to-dust ratio 5 to 8 times lower, which is also consistent with the results of Rowlands et al. (2014) under similar assumptions but still at the extreme values.Such abundant dust in ISM could be difficult to explain unless the sources are already enriched to super-solar metallicity at z = 3 − 5 (Chen et al. 2013;Santini et al. 2014) or (and) they are undergoing vigorous merger+starburst events (Silverman et al. 2018).
We finally derive the depletion time of the molecular gas in each galaxy using the molecular gas mass and the SFR.For the molecular gas mass, we use M gas,S 16 to keep the measurement consistent among all galaxies with or without CO line detections.The results are also given in the last column of Table 11.Fig 8 shows the gas depletion time of HLS sources compared to high-z main sequence galaxies Tacconi et al. (2020) and SMGs (Dunne et al. 2022).Considering the uncertainties of α CO , the plot marks the τ dep in rectangles, with the upper and lower bounds at the τ dep derived using α CO values of 6.5 and 0.8, respectively.Fig. 8 shows that most of HLS sources have short gas depletion time of a few hundred Myrs, which is typical among high-z SMGs in (Dunne et al. 2022).The only exception, HLS-3, shows possibly long gas depletion time up to a few Gyrs and being comparable to main sequence galaxies at the same redshift.Remarkably, HLS-3 also has the largest millimeter continuum size (2.8"×0.7",see Table 3) and the lowest dust temperature (23 +7 −6 K at z=3.123 or 18 +6 −5 K at z=2.299, see Table 10).These atypical properties among SMGs, in addition to its long gas depletion time, suggest that HLS-3 is more likely a massive main sequence galaxy under secular evolution.As reported in Table 3, HLS-2-1 and HLS-3, are already partially resolved in dust continuum with compact NOEMA configurations.This could further suggests extended distributions of the molecular gas reservoir/disk.

A possible over-density of DSFGs at z=5.2
It is found that HLS 2-1 and HLS 2-2, separated by 12 arcsec on the map, have both a redshift of ∼5.2.They are also located within 2 arcmin from HLSJ091828.6+514223 a bright lensed DSFG firstly found by the Herschel Lensing Survey at a similar redshift of z=5.243 (Egami et al. 2010;Combes et al. 2012;Rawle et al. 2014).At this redshift, their projected separation in the sky corresponds to a physical transverse distance of ∼800 kiloparsec.
Given the close spectroscopic redshifts of HLS-2-1 and HLSJ091828.6+514223 the physical distance between these two sources is given by their transverse distance D t =796 kpc, computed following D t = D A × θ sep .The physical distance between HLS-2-1 (or HLSJ091828.6+514223 ) to HLS-2-2, which are separated in both redshift and sky coordinates, is approximately estimated using the following equations: We derive physical distances (D) of ∼9.4 Mpc between HLS-2-1/HLSJ091828.6+514223 to HLS-2- HLSJ091828.6+514223 and HLS-2-1 corresponds to 5.0 comoving Mpc, which is comparable to the scale of the z∼5 overdensities found in COSMOS and GOODS-N associated with SMG/DSFGs (Mitsuhashi et al. 2021;Herard-Demanche et al. 2023).When assuming the core of the possible structure has the same redshift as HLS-2-1 and HLSJ091828.6+514223, the deviation of the redshift of HLS-2-2 would correspond to a line-ofsight comoving distance of 58 Mpc.This is an order of magnitude larger than the scale of the SMG over-density in COSMOS, while still being comparable to the proto-clusters traced by Ly-α emitters at z=5-6 (Jiang et al. 2018;Calvi et al. 2021).Although the stochasticity of star formation makes SMG a unreliable tracer of the most massive halos at intermediate redshift, the high SFR of the 3 sources at such high redshift could only be produced by the most massive galaxies tracing the densest environments in the early Universe (Miller et al. 2015).A more complete redshift survey on the other NIKA2 sources in the HLS field, as well as deep optical-IR observation in the same region, could possibly reveal more members of this possible galaxy over-density to confirm its nature and understand its fate of cosmic evolution.

Summary and conclusions
We present the study on 4 DSFGs selected from the early science verification observations of NIKA2, the KIDs camera installed on the IRAM 30m telescope.
We develop a new framework to determine the redshift of sources with the joint analysis of multi-wavelength photometry and millimeter spectral scans.Accounting for the additional constraints on IR luminosity from the SED modeling, we predict the flux of the strongest emission lines from CO, [CI] and [CII], generate the model spectra at given redshifts accordingly, estimate the goodness of match between the broad-band SEDs, models and the observed millimeter spectra altogether and quantitatively find the most probable redshift solutions based on all this information.
Based on the prior selection on red far-IR to millimeter colors, we identify a sample of 4 millimeter NIKA2 sources of mJy fluxes in the HLS field with possible high redshifts, at z = 3 − 7. We conducted deep NOEMA observations on these sources, and resolve them into 5 individual sources.With the NOEMA spectral scans and the newly developed joint-analysis method, we obtain their redshift and confirm they all have z > 3. Our analysis reveals that most of their properties, such as star formation rate, dust temperature and gas depletion time are normal compared to typical high-z DSFGs with very active star formation.However, we also find that one of our source (HLS-3) shows significantly low dust temperature and long gas depletion time, resembling the properties of secularly-evolved main sequence star-forming galaxies.Furthermore, we find two sources at z=5.2 that are separated by only 5 comoving Mpc, possibly linked to a third source lying at a distance comparable to the proto-cluster size as traced by Ly-α emitters at z = 5 − 6.This could be the hint of an interesting high-z structure in this field.
We demonstrate that our method to constrain the redshift, applied to millimeter selected DSFGs with only far-IR to mm photometry and blind spectral scans, could determine the true redshift accurately.Such accuracy of redshift determination with multiple low SNR emission lines shows promising potential in blind redshift searching on large sample of high-z millimeter-faint DSFGs, even in the absence of accurate optical-IR photometric redshifts.The method is especially expected to improve the design and efficiency of blind redshift search on candidate high-z DSFGs detected by the NIKA2 Cosmological Legacy Survey (N2CLS).Indeed, most of N2CLS sources are fainter (sub-mJy) than the 4 sources discussed here.The new tool we developed will allow us to mitigate the increase of NOEMA or ALMA time that will be needed for these faint DSFGs.
The joint analysis methods also provide possible implications to the strategy to obtain accurate redshift and cosmic evolution of high-z DSFGs.The next generation single-dish telescopes/instruments, such as the CCAT-prime (CCAT-Prime Collaboration et al. 2023) and LMT TolTEC (Wilson et al. 2020), are planned to devote a substantial fraction of observing times in wide area deep blind surveys.With thousands of DSFGs expected to be detected, these surveys aim to reveal the role of DS-FGs in the formation and evolution of massive galaxies through their cosmic evolution and environment/clustering.However, comparing to the existing deep millimeter surveys, the majority of these planned surveys are not expected to be completely covered by deep surveys in near IR at >2µ (Wang et al. 2019;Williams et al. 2019;Fudamoto et al. 2021;Xiao et al. 2023).The lack of the wide and deep near IR surveys like COSMOS-Web (Casey et al. 2023) could make it difficult to identify the counterpart of high-z DSFGs, which further prevent the application of optical-IR SED modeling for efficient and accurate redshift measurements.Our practice on the HLS sources under similar conditions, however, demonstrate that the joint constraints of photometric redshift, IR luminosity and millimeter spectra from far-IR SED and blind spectral scans could also provide a promising accuracy and robustness in efficient redshift searching of high-z DSFGs.Further improvement following this strategy, including the application of this method to the redshift identification of a larger sample of DSFGs discovered by the NIKA2 Cosmological Legacy Survey (Bing et al. 2023), is expected to benefit the key scientific objectives of these future wide area (sub)millimeter surveys.
Appendix A: Robustness of the joint-likelihood method using different line widths One of the key assumption is the width of the emission lines, which we fixed to 500km/s.Previous studies reveal a correlation between total IR/line luminosity and far-IR to millimeter line width, possibly originating from the regulation of gaseous disk rotation by gravity or (and) feedback from star formation or AGN (Bothwell et al. 2013;Goto & Toft 2015).The assumed line width generally matches the average of DSFGs with ULIRG-HyLIRG luminosities in infrared, which is similar to the derived IR luminosity of our sample.However, observations also show significant scatters among IR luminous DSFGs.Our sample could be a typical example of the variety of line width of luminous DSFGs, which have line FWHMs ranges from ∼250km/s (HLS-2-1) to ∼750km/s (HLS-3).Besides, the assumption of Gaussian line profile generally holds for most of our source, but HLS-3, as described in Sect.5.1, has a significant double-peak feature in the detected emission line in band2.
The impact on the joint analysis result from the mismatch between real and assumed line width and profile are an uncertain prior.Thus, we make the following tests to check if and how the results of this joint analysis could change with the assumption on different line widths.In addition to the default setting of 500 km/s FWHM Gaussian line profile, we further perform the joint analysis with line profiles FWHMs of 300km/s and 800 km/s, using the redshift and infrared luminosity derived from the fit with Béthermin et al. (2015a).We identify the redshift solutions of the 4 HLS sources with at least 1 line detected with these 2 different assumptions, using the same method and criteria described as in Sect.4.2.The results are listed in Table A.1.
From the results in Table A.1, we conclude that the redshift solutions from the joint analysis method are generally not sensitive to the assumptions on emission line widths.For all sources but HLS-22, we find little variation in the redshift solutions using different line width assumptions.The differences of ∆z ∼ 0.003, as shown in Fig. A.1, are mostly originating from the changes of the peak intensity of emission lines, which could lead to slight variations of χ 2 (z).However, such little difference is still within the width of the emission lines, and will neither cause false identification of emission line, nor affect the analysis on line fluxes and kinematics in Sect.5.1 with the corresponding central frequency as an initial guess.The only case of significant inconsistency in redshift from the test is HLS-22, where the procedure using 300km/s line width strongly favors a redshift solution at z = 2.436.Given the frequency of the 2 emission lines detected with high SNR, we are confident about the redshift solution at z = 3.036 from the analysis with the model spectra of 500km/s line width.Therefore, We checked the 300km/s model and the data at z=2.436 and find that the mis-identification is caused by a strong noise spike at 100.64 GHz, as shown in  spectrum with the decrease of model line width.In our calculation of χ 2 spec (z) and correspondingly, joint log-likelihood, their variation with redshift are dominated by the goodness of match between model and data within the range of model line profiles.With a narrower line width in the model, the number of data points that dominate the variation of χ 2 spec (z) will be smaller compared to the cases with wider line width.This will make the analysis with narrow line width more sensitive to single spurious data points, like the noise spike in HLS-22 spectra, and lead to the mis-identification in Fig. A.2.A less aggressive spectral binning along frequency and a pre-processing with sigma clipping could probably reduce such false identification in practice.

Appendix B: Robustness of the joint-likelihood method with narrower frequency coverage
Millimeter spectral scans made by interferometers are widely used to blindly search for the emission lines from candidate high redshift DSFGs, determine their spectroscopic redshift and study the conditions of their cold ISM (Strandet et al. 2016;Fudamoto et al. 2017;Jin et al. 2019;Neri et al. 2020;Reuter et al. 2020).These spectral scan observations are designed to cover a continuous frequency range with several spectral setups.For ALMA and NOEMA, the default setup of the blind spectral scans at their current lowest frequency band covers ∼ 31 GHz.The earliest observations in 2018 on HLS sources blindly and continuously cover the spectra of HLS sources between 71 GHz and 102 GHz in NOEMA band1 with 2 setups, which follows this basic strategy of blind redshift search.To test the joint analysis method under more realistic conditions in large DSFG redshift survey projects, we apply the method to analyse these band1 spectra and compare their resulting redshifts with the ones in Sect.From the comparison between the redshift analysis using early and full datasets, we find that the best redshift solutions of HLS-2-1 and HLS-2-2 remain stable, while the results of HLS-22 and HLS-3 are affected by the narrowed spectral coverage.Such a difference in robustness under different spectral coverage could be explained as follows.The joint analysis method works equivalently to the automatic alignment and stacking of 2 or more lines in the spectra.If it is at the correct redshift and with multiple lines covered, this method could numerically boost the stacked SNR of emission lines, even if none of the single lines are detected with high significance.At a fixed coverage in frequency, we could expect that sources with higher redshift could have more CO lines to be covered.Taking our sample as an example, although the lines of HLS-2-1 and HLS-2-2 at z∼ 5.2 are only tentatively detected, their relatively high redshifts ensure that at least two CO/[CI] lines are covered by the spectral scan.On the contrary, the narrowed spectral coverage leaves only one CO line in the spectral coverage of early observations of HLS-3 and HLS-22, leading to ambiguous redshift solutions no matter if the line is detected at high significance.With the comparison of the redshift robustness of these two groups of sources under different spectral coverage, we also emphasize that wide spectral coverage covering at least two strong molecular/atomic lines could be even more crucial in the redshift identification of DS-FGs compared to reaching high sensitivity.

Appendix C: Cross-validation and tension between different SED modeling
The application of the joint-analysis framework on HLS sources largely relies on the current knowledge on the far-IR SED of high-z galaxies.However, although Herschel provides estimates on the mean far-IR SEDs and the redshift evolution of the main population of star-forming galaxies, these results are also limited by significant source confusion, especially in SPIRE data at longer wavelengths.Moreover, current studies reveal some DS-FGs with apparently low-dust temperature (Jin et al. 2019), as well as a significant warm-dust contribution in some starburst galaxies (Eisenhardt et al. 2012;Wu et al. 2012;Fan et al. 2016).These results suggest that a large variation in far-IR SEDs could exist in high-z DSFG populations.
Our choices of template could not be free from these issues, and this is the reason why we adjust our joint-analysis framework to the results of 2 different far-IR SED templates and modeling framework, and make the cross-validation between the results of the two.The analysis with Béthermin et al. (2015a) templates and MMPZ mostly shows consistent redshift solutions.This suggests the relative stability of the joint-analysis method with input information from different SED fitting results.
However, some discrepancies on HLS-22 when using typical blind spectral scan conditions in Appendix B are also found, which leads us to have an additional check on its origin.From the comparison on derived IR luminosity in Fig. 3, and the comparison between model and data in Fig. 4 and Fig. B.1, we notice that the estimated infrared luminosity and line fluxes from MMPZ are systematically lower than those from Béthermin et al. (2015a).At the correct redshift, the predicted line fluxes of Béthermin et al. (2015a) match better with the observed line fluxes compare to MMPZ.On the contrary, MMPZ generally returns more accurate photometric redshifts, especially on HLS-3, where the photometric redshift from Béthermin et al. (2015a) significantly deviates from the spectroscopic redshift.However, as indicated by the low dust temperatures of the HLS sample, it is possible that the properties of far-IR emission of these galaxies are not representative among high redshift star-forming galaxies.Thus, we decide not to make any preference on the choice of dust template and far-IR SED fitting in our framework, and we recommend a cross-validation between the redshift solutions from various method in application.
Besides, the faint emission of HLS sources in SPIRE bands introduce large uncertainties on the constraints on source SEDs around the peak of far-IR emission.This, as a result, could contribute to the difference in IR luminosities derived from methods with different prior constraints (Casey 2020).These issues also further suggest the importance of matching observation at ALMA band 8-10 frequencies in properly reconstructing the far-IR SED, as well as estimating the IR luminosity and star formation rate of high-z DSFGs selected by millimeter surveys.

Scaling Relation
The redshift from the joint analysis is derived based on the goodness of match between the emission line model and observed spectra.As mentioned in Sect.4.2, in this approach we predict the expected fluxes of spectral lines based on the L FIR from the SED template fitting using the best-fit scaling relation between L FIR -L line from literature (Greve et al. 2014;Liu et al. 2015;Valentino et al. 2018).However, these scaling relations are subject to substantial scatter up to a factor of a few in observations.To test the possible impact of these scatters on our analysis and the robustness of the joint analysis method against them, we first checked the output best redshift solution after adding a systematic offset to all L FIR -L line scaling relations when generating the predicted spectra models.In this test, we shift the predicted CO/[CI] line fluxes by four different systematical offsets corresponding to ±0.5 and ±1.0 times of the 1σ scatter of the scaling relations.The exact values of the 1σ scatters for the considered lines are given in Table .7. The best redshift solutions for HLS sources (except for HLS-4) after applying these four different offsets in L FIR -L line conversion are listed in Table D.1.We find that all offsets in line flux, but the +1.0σ, result in best redshift solutions similar to the analysis using the median (zero offset) scaling relations.This suggests good robustness of our joint analysis method against the existing scatter of L FIR -L line scaling relations in observations.As for the test with +1.0σ offset, we further checked the reason leading to the discrepancy in best redshift solution.For HLS-22 and HLS-2-2, the application of the +1.0σ offset leads to mismatches between the model and the data due to glitches or noise spikes in the spectra (see Fig. A.2 as an example, where the noise spike is matched with CO  at the best redshift solution of HLS-22 here).For HLS-3, we find this unlikely low redshift solution after applying the +1.0σ offset as the code assigns the only strong emission line in the spectral scan at 139.746 GHz to be CO .This suggests the stronger demand of having spectral scan wide enough to cover more than one strong spectral line in the redshift confirmation of DSFGs with moderate redshift (i.e z∼2-3).To test the self-consistency of the joint analysis method, we put the measured CO line fluxes and far-IR luminosities of HLS sources at their best redshift solutions (the z best,med in Table D.1) in the corresponding L FIR -L CO diagrams to check if they follow the scaling relations used for line flux predictions.The results are shown in Fig. D.1.The plotted CO and far-IR luminosities are derived using the Gaussian-fitted line fluxes and modified black-body fitting (see Table. 9 and Table. 10).Apart from the average L FIR -L CO correlations, we also show the samples from Cañameras et al. (2018) with multiple line transitions from the same sources as a comparison to our sources.
From the Fig. D.1, we find that most of our sources fall within +-2σ of the scaling relation used in our analysis, which is also consistent with the regions occupied by bright submillimetre galaxies in Cañameras et al. (2018).The only exception is HLS-3, which is also highlighted in Fig. D.1.On either L FIR -L ′ CO (5−4) or L FIR -L ′ CO(4−3) diagram, HLS-3 falls well below the scaling relation even if we consider the scatter of these scaling relations.However, we also note that it has one of the poorest SPIRE photometry among all of the four HLS sources.As the SPIRE bands close to the peak wavelength of SED predominantly constrain the IR luminosity, it is likely that the IR luminosity of HLS-3 is much less constrained than the rest HLS sources, especially compared to HLS-2 and HLS-4.The comparison between L FIR -L CO correlations of different transitions (Greve et al. 2014;Liu et al. 2015) and our sample based on measurements from our observations at the best redshift solutions.The sources with upper limits on line luminosities are presented as leftward triangles.

Fig. 1 :
Fig. 1: Cleaned images of NOEMA observation on our four NIKA2 sources.The effective beam size and shape of each map is shown in the bottom right of each panel.The contour levels from orange to dark red correspond to -4, 4, 8 and 12× RMS of each map, respectively.The red crosses mark the position of detected NOEMA sources from the uv_fit.The two resolved sources associated with HLS-2 are also labeled separately (HLS-2-1 and HLS-2-2).The scale bars in the maps (upper left) correspond to 5 arcseconds in the sky.The frequency of the continuum data are given in the lower left corner of each panel.

Fig. 5 :
Fig. 5: Observations and best fits on spectral lines, including both detections and upper limits.The best-fit model of each line is shown with the red solid line.For HLS-3, we show the two lines assuming z=3.123.The luminosity of the CO(3-2) line has a best-fit value consistent with zero.Table 10: Dust properties of the HLS sources from optical-thin modified black-body fitting.Source z f ix log(M dust,MBB /M ⊙ ) T dust,MBB β L FIR (50-300µm) SFR K 10 12 L ⊙ M ⊙ /yr HLS-2 5.2 9.1 +0.1 −0.1

Fig. 6 :Fig. 7 .
Fig. 6: Example of a modified black-body fitting on the far-IR to millimeter photometric data for one of our galaxy.(a) Photometric data and best-fit model.±1σ and ±2σ uncertainties of the model are shown with blue shades of different transparency.(b) Corner plot of the posterior distribution of the 3 parameters.The contours correspond to 1, 1.5 and 2 σ in the 2D histogram.

Fig. 8 .
Fig.8.The gas depletion time of HLS sample based on the molecular mass from Rayleigh-Jeans dust emission and the star formation rate from far-IR luminosities.The red dashed line shows the redshift evolution of gas depletion time of main sequence galaxies fromTacconi et al. (2020).The grey dots show the gas depletion time of z> 2 SMGs based on the data summarized inDunne et al. (2022).
Fig. A.2.The false identification suggests an increased sensitivity to narrow spikes in the

Fig. A. 1 : 2 :
Fig. A.1: Upper row: Joint log-likelihood of 4 HLS sources with models spectra with line widths of 300km/s, 500km/s and 800km/s.Lower row: Comparison between the models of these 3 line widths at the corresponding redshift solution and the observed source spectra.

Fig. B. 1 :
Fig. B.1:The result of joint-analysis on the 4 HLS sources with spectroscopic redshifts derived in Sect.4.2, using only the 31 GHz NOEMA spectral scans observed in 2018.First row shows the likelihood from SED fittings and joint log-likelihood of photometric and spectroscopic data, using the SED fitting outputs withBéthermin et al. (2015a) templates and MMPZ.The comparison between observed spectra and the model spectra predicted by the IR luminosities from the 2 SED fitting results are shown in the second and third row.
Appendix D: Impact of the Scatter of L FIR -L CO Fig. D.1:The comparison between L FIR -L CO correlations of different transitions(Greve et al. 2014;Liu et al. 2015) and our sample based on measurements from our observations at the best redshift solutions.The sources with upper limits on line luminosities are presented as leftward triangles.

Table 2 :
Information on NOEMA follow-up observations.

Table 3 :
NOEMA continuum source positions and best-fit sizes.

Table 6 :
S/N>3 lines blindly detected in the NOEMA spectra.

Table 7 :
Parameters of the log-linear L FIR -L line in our analysis.

Table 8 :
Summary on the joint-analyzed redshifts of NOEMA sources

Table B .
1: Redshift of HLS sources from the joint-analysis with only the 31 GHz continuous spectra observed in 2018.
4.2.The results from this analysis are presented in Table B.1 and Fig. B.1.

Table D .
1: Redshift of HLS sources from the joint-analysis when using different amount of offsets for all L FIR -L line scaling relations.