Measuring the gas reservoirs in 10 8 < M ⋆ < 10 11 M ⊙ galaxies at 1 ≤ z ≤ 3

Context. Understanding the gas content in galaxies, its consumption and replenishment, remains


Introduction
Cold molecular gas is the material that fuels the galaxy machinery that works to form stars. Knowing the amount of gas available in galaxies, how efficiently it is converted into stars, as well as how it is replenished is crucial to understanding their evolutionary pathways.The cosmic history of the gas mass density resembles that of the star formation rate density (Decarli et al. 2019, Riechers et al. 2019, Magnelli et al. 2020, Walter et al. 2020), peaking at z ∼ 2 and steadily decreasing until now.The gas mass (M gas ) content in galaxies at a fixed stellar mass (M ⋆ ) increases with redshift at least at 0 < z < 3.At a fixed redshift, the gas fraction (f gas = M gas /(M ⋆ +M gas )) decreases with M ⋆ (Genzel et al. 2010, Béthermin et al. 2015, Morokuma-Matsui & Baba 2015, Dessauges-Zavadsky et al. 2020, Tacconi et al. 2020, Magnelli et al. 2020, Wang et al. 2022).
The relation between M gas and M ⋆ at different redshifts has been quantified by a variety of studies (e.g., Scoville et al. 2016, Tacconi et al. 2018, Liu et al. 2019b, Tacconi et al. 2020, Kokorev et al. 2021, Wang et al. 2022), covering 0 < z < 6.It is typically parameterized according to cosmic time or redshift, and the distance from the galaxies to the main sequence (MS) of star-forming galaxies (SFGs).The term MS refers to the tight correlation that exists between the SFR and the M ⋆ (e.g., Noeske et al. 2007, Elbaz et al. 2011, Whitaker et al. 2012, Speagle et al. 2014, Tomczak et al. 2016, Santini et al. 2017, Pearson et al. 2018, Barro et al. 2019), which is seen to be present at least at 0 < z < 6.
The cold molecular gas can be studied directly using rotational lines of molecular hydrogen, H 2 .However, the transition probabilities are very small, line emission is weak, and transitions are sufficiently excited only in radiation or shock-warmed molecular gas-like photodissociation regions and outflows (Parmar et al. 1991, Richter et al. 1995).Common alternatives to study the gas content in distant galaxies include the use of the low-transition 12 CO millimeter rotational lines and dust continuum measurements.
For the first approach, it is typically assumed that the CO lower rotational lines are optically thick and the CO line luminosity is proportional to the total molecular gas mass (M H 2 ), using an empirical conversion factor (Dickman et al. 1986, Solomon et al. 1987, Bolatto et al. 2013).
For the second approach, the M gas can be derived based on the dust content, converting the dust mass (M dust ) obtained by fitting the infrared (IR) spectral energy distribution (SED) (Draine & Li 2007) to M gas , for which it is typically assumed a metallicity-dependent gas-to-dust ratio (δ GDR ; e.g.Magdis et al. 2012, Genzel et al. 2015).One can also use the photometry measured in the Rayleigh-Jeans (RJ) tail of the SED (e.g.Scoville et al. 2016, Hughes et al. 2017).The Scoville et al. (2016) method (S16 hereafter) works similarly to the previous one, assuming a constant δ GDR with a mass-weighted dust temperature (T dust ) of 25 K.These approaches assume that zero-point calibrations based on z = 0 measurements are also valid at higher redshifts.
These methods have been previously used in other works to study the gas content in the local and distant universe (e.g.Saintonge et al. 2017, Decarli et al. 2019, Freundlich et al. 2019, Sanders et al. 2023 derived M gas using the 12 CO rotational lines; Schinnerer et al. 2016, Wiklind et al. 2019, Kokorev et al. 2021 used the dust emission;and Liu et al. 2019b, Aravena et al. 2020, and Birkin et al. 2021 used both methods).However, despite the increasing number of studies in the field, most of the efforts so far focus on individual (sub-)millimeter detections of massive objects (> 10 10−11 M ⊙ ).For instance, the Schinnerer et al. 2016 sample is made up of ALMA detections at 240 GHz, with M ⋆ > 10 10.7 M ⊙ at z ∼ 3.2.In Freundlich et al. (2019), they include CO emitters with 10 10−11.8M ⊙ within 0.5 < z < 3. The Liu et al. 2019b sample contains galaxies at 0.3 < z < 6 that show high-confidence ALMA detections, with median M ⋆ = 10 10.7 M ⊙ .
Alternatively, other studies sought to extend this analysis to fainter galaxies and improve the completeness of the data-sets by stacking the emission of similar sources, not imposing a flux criterion on the (sub-)millimeter emission of the sources.In Tacconi et al. (2020), part of their sample is based on this strategy, made up of stacks of Herschel far infrared (FIR) spectra.Their data-set also includes individual CO emitters though.In Magnelli et al. (2020), they measure the cosmic density of dust and gas by stacking H-band selected galaxies above a certain M ⋆ .In Garratt et al. (2021), they study the evolution of the H 2 mass density back to z ≈ 2.5 measuring the average observed 850 µm flux density of near-infrared selected galaxies.In Wang et al. (2022), they employ stacking to derive the mean mass and extent of the molecular gas of a mass-complete sample down to 10 10 M ⊙ .In the latter study, they obtain M H 2 which are generally lower than previous estimates, based on individual detections.
In this work, we use the emission at 1.1 mm measured with observations obtained by the Atacama Large Millimeter/submillimeter Array (ALMA) to infer the content of gas present in a mass-complete sample of galaxies at 1.0 < z < 3.0, analyzing stacked ALMA images for subsamples in different redshift ranges and M ⋆ bins.Taking advantage of the galaxy catalog provided by the Cosmic Assembly Near-infrared Deep Extragalactic Legacy Survey (CANDELS; Grogin et al. 2011, Koekemoer et al. 2011) in The Great Observatories Origins Deep Survey (GOODS; Dickinson et al. 2003, Giavalisco et al. 2004), specifically in GOODS-S (Guo et al. 2013, G13 hereafter), we probe the 10 10−11 M ⊙ stellar mass regime with a complete sample whose 80% completeness level reaches down to 10 8.6 M ⊙ at z = 1 (10 9.2 M ⊙ at z = 3.0) (Barro et al. 2019).Our analysis aims at removing potential biases at the high-mass end when based on detections of individual galaxies.We aspire to check whether faint sources in ALMA follow the same scaling relations derived from brighter sources or, on the contrary, present a distinct molecular gas content than that prescribed for their stellar masses.Moreover, this sample gives us the chance to explore the gas reservoirs of less massive galaxies, ∼ 10 9−10 M ⊙ , for which previous scaling relations are still not well calibrated.
The structure of the paper is as follows.In Section 2 we present the data and sample selection.We then describe the physical properties of the sample and compare them with other catalogs in Section 3. In Section 4 we present our stacking and flux measurement methodology applied to the ALMA data.In Sections 5 and 6, we present and discuss our results regarding the gas reservoirs of our sample, comparing them with previous scaling relations.The conclusions are summarized in Section 7.

Data
We base this work on the images provided by the GOODS-ALMA 1.1 mm galaxy survey (Franco et al. 2018, Gómez-Guijarro et al. 2022a) in GOODS-S, carried out in ALMA Band 6.This survey extends over a continuous area of 72.42 arcmin 2 (primary beam response level ≥ 20%) with a homogeneous average sensitivity.It is the result of two different array configurations.Cycle 3 observations (program 2015.1.00543.S; PI: D. Elbaz) were based on a more extended array configuration that provided a high angular resolution data-set (Franco et al. 2018).Cycle 5 observations (program 2017.1.00755.S; PI: D. Elbaz) were based on a more compact array configuration, which resulted in a lower angular resolution data-set (Gómez-Guijarro et al. 2022a).In this work, we use the low-resolution data-set, which has a sensitivity of 95.2 µJy beam −1 and an angular resolution of 1 ′′ .330 × 0 ′′ .935 (synthesized beam full width at half maximum (FWHM) along the major×minor axis).This choice is motivated by our interest in detections and flux measurements, as opposed to resolving the extent of the sources.

Sample selection
In this research, we use the source catalog provided by CAN-DELS in GOODS-S (G13), that includes the redshifts, M ⋆ , SFR, and other SED-derived parameters for the galaxies.We select sources in the redshift range 1.0 ≤ z ≤ 3.0, where G13 cataloged 18,459 galaxies (out of the full sample of 34,930 galaxies).While the lower limit is chosen given our interest in high redshift Article number, page 2 of 18 galaxies, especially at cosmic noon, where the gas mass density reaches its peak, the upper limit is chosen considering that the completeness at 3 < z < 4 may be compromised.As pointed out in Mérida et al. (2023) (M23 hereafter), H-band-based catalogs can be affected by significant selection effects when the H-band photometric point (i.e. the flux within the F160W HST filter) lies bluewards of the Balmer break.This feature is shifted to 320 nm at z = 3 (400 nm at z = 4).
We focus on galaxies with M ⋆ > 10 8 M ⊙ , taking into account the G13 mass completeness limits.G13 compute this limit by looking for the most massive galaxies whose flux is equal to the faint limit of the sample, which evolves with redshift (Fontana et al. 2004, Grazian et al. 2015).The limits we report were obtained considering a 90% completeness limit in the flux of H = 26 mag, which corresponds to the limit computed for the shallow part of GOODS-S.Additionally, we only keep in our sample those sources with M ⋆ < 10 11 M ⊙ .The number density of the G13 sample sharply decreases towards >10 11 M ⊙ , with only 14 objects within our redshift range.They all have M ⋆ slightly above ∼ 10 11 M ⊙ , right at the limit, so they cannot trace a higher mass bin than the one considered in this work (10 10−11 M ⊙ , see Sec 4 for more information on the divisions in redshift and M ⋆ in further analysis).Additionally, considering the uncertainty in M ⋆ for these galaxies, 8 out of the 14 sources could fall into the 10 10−11 M ⊙ range.If we include these 14 galaxies in the highest mass bin considered in this work, the f gas barely change, showing only a variation of the order of a hundredth.
We, moreover, restrict the sample to SFGs as indicated by the UV J diagram (Whitaker et al. 2011), which allows us to classify galaxies in quiescent or star-forming according to their rest-frame colors.This UV J selection guarantees that the M gas we derive, based on stacking galaxies, is not biased to lower values because of the contribution of quiescent galaxies.In Fig. 1, we show the UV J diagram depicting the galaxies that enter the selection and those that are discarded.All these criteria leave us with a sample of 15,236 sources.
Finally, we discard any source lying outside the GOODS-ALMA map coverage.This coverage is defined as the area where the noise is uniform across the map, excluding the edges of the outermost pointing, where there is no pointing overlap.Our final sample is thus composed of 5,530 star-forming objects located at 1.0 ≤ z ≤ 3.0, with stellar masses ranging 10 8−11 M ⊙ .This sample shows typical uncertainties in the SED-derived parameters of 0.11 for the redshifts, 0.07 dex for the M ⋆ , and 0.05 dex for the SFRs.
Within this sample, we looked for the ALMA counterparts of our galaxies, as well as for galaxies in the vicinity of sources showing an ALMA counterpart, using the source catalog listed in Gómez-Guijarro et al. (2022a), GG22 hereafter.We select those galaxies closer than 5 ′′ to any object included in GG22.This 5 ′′ radius is chosen considering the growth curve of the low-resolution ALMA map point spread function (PSF) and the trade-off between the number of objects in the sample and the possible contamination of ALMA detected galaxies.This condition affects just ∼3% of the objects in the sample.The effect of the exclusion of these individually detected sources and their neighbors in the photometry and further f gas is discussed in Sec. 4 and 5.1.We will refer to the subsample obtained when excluding these galaxies as the undetected data-set hereafter.
Additionally, we also looked for counterparts of these sources in other ALMA-based catalogs, namely the ALMA twenty-six arcmin 2 survey of GOODS-S at one millimeter (ASAGAO; Hatsukade et al. 2018) and the ALMA Hubble Ultra Deep Field (ALMA-HUDF; Dunlop et al. 2017, Hill et al. 2023).In Yamaguchi et al. (2020), they include a list of those ASAGAO sources that have an optical counterpart in the Fourstar Galaxy Evolution Survey (ZFOURGE; Straatman et al. 2016), and in Hill et al. (2023) they include the optical counterparts in G13 of the sources from the ALMA-HUDF catalog.
Only 4 sources from the undetected data-set coincide with objects from ASAGAO and another 4 with sources from ALMA-HUDF.
We also investigated if any of these sources belonging to the undetected data-set shows significant emission at (sub-)millimeter wavelengths.We measured the photometry of each of these objects using the aperture photometry method provided by photutils from Python, selecting an aperture radius r = 0 ′′ .8.Only 18 out of the undetected sources show a signalto-noise ratio SNR>3.Only 3 out of these 18 galaxies show an SNR>3.5.When considering the SNR at the peak, only 1% of the galaxies from the undetected data-set show an SNR peak > 3.5 (including the latter 18 sources).Based on the analysis carried out in GG22, none of the sources is massive enough for the ALMA emission excess to be regarded as real and, therefore, they are indistinguishable from random noise fluctuations (see GG22 for more details).Given the low SNR of our sources, in order to look into the gas reservoirs of these galaxies we need to analyze stacked data.

Properties of the sample and comparison with other catalogs
In this section, we compare the properties of the galaxies from our sample with those from other catalogs that were used by previous studies that also aimed at inferring the gas content of galaxies.In particular, we will refer to (i) the sources from the "super-deblended" catalogs, performed in GOODS-N and in the Cosmic Evolution Survey (COSMOS; Scoville et al. 2007), that were used to derive the Kokorev et al. (2021)  their scaling relation; (iv) the galaxies from the COSMOS2020 catalog that lie in the A 3 COSMOS footprint (COSMOS2020 * hereafter), which is the sample the Wang et al. (2022) scaling relation is based on and, finally, (v) the sources from GG22.All these comparison samples are cut to only include galaxies within the same M ⋆ and redshift intervals that we are considering in this work (see Sec. 2).Later, in Sec.5.2, we will refer to these same data-sets in the context of scaling relations.In Fig. 2, we show different diagrams that highlight the properties of the listed samples together with our data-set, based on G13.In Table 1, we summarize the information contained in Fig. 2. Given that some of these works do not report the values of the rest-frame colors of their sources, the i − H and b − i colors allow us to build a diagram that works similarly to an UV J, but using apparent magnitudes.In Fig. 2, for the panel showing this color vs. color diagram, we use the photometry measured within the F435W (b), the F775W (i), and the F160W (H) bands from HST for our sample and for GG22.For A 3 COSMOS and COSMOS2020 * , we use the photometry measured within the Subaru Prime Focus Camera (Suprime-Cam) b band, the Hyper Suprime-Cam (HSC) i band, and the UltraVISTA H band.For the super-deblended catalogs and the Tacconi et al. (2020) dataset we also use the HST photometry, together with the Canada-France-Hawaii Telescope (CFHT) and Subaru observations in the absence of HST data.
Fig. 2 also compares the position of the samples in the SFR vs M ⋆ plane.In that panel, we include the M23 fit, defined for 1.5 < z < 2.0, up to 10 10 M ⊙ , and the Barro et al. (2019) MS fit, B19 hereafter, above 10 10 M ⊙ .The distance of each point to the MS is re-scaled to its corresponding redshift.In the fourth panel, we show the difference with respect to the MS for each galaxy in these samples, ∆MS (∆MS = log SFR−log SFR MS , with ∆MS = 0 being equivalent to δMS = SFR/SFR MS = 1).We use M23 to calculate the ∆MS of galaxies with M ⋆ /M ⊙ < 10 10 and B19 for sources with higher stellar masses, given that M23 focuses on the low-mass end of the MS.
It is important to mention that for our G13-based sample, Tacconi et al. (2020), and COSMOS2020 * , the SFR were computed following the ladder technique (Wuyts et al. 2011), which combines SFR indicators at UV, mid-infrared (MIR) and FIR.For the galaxies from the super-deblended catalogs, the SFR was computed from the integrated IR luminosity (LIR) using the Daddi et al. (2010) relation.The SFRs for A 3 COSMOS were computed from the IR luminosity using the Kennicutt (1998) calibration.For the GG22 galaxies, the SFRs were calculated as the sum of the SFR IR (using the Kennicutt 1998 calibration) and the SFR UV (using the Daddi et al. 2004 calibration).We checked that only 219 of our galaxies (4%, with only 10 galaxies with M ⋆ > 10 10 M ⊙ ) are detected in the MIR and/or FIR using Spitzer MIPS and Herschel PACS and SPIRE.This means that the SFRs of our sample come mostly from the UV emission.

This work: the G13-based data-set
The sample evenly populates the redshift range considered in this work (first panel of the first row in Fig. 2), with median and quartiles z = 1.9 2.4 1.5 .The low-mass coverage of G13 allows us to reach down to 10 8−9 M ⊙ (log (M ⋆ /M ⊙ )=8.6 9.1 8.3 ).In terms of the optical colors (second panel of the first row), our sample shows typical values of 0.81 1.31  0.29 mag for i − H and 0.12 0.35 −0.04 mag for b − i.
The position of our galaxies in the SFR vs M ⋆ plane (first panel of the second row) is compatible with the MS, with only a minor population of galaxies above or below three times the typical scatter (∼ 0.04% and ∼ 1% of the galaxies above/below 3σ, respectively, with σ being ∼ 0.3 dex according to Speagle et al. 2014).The median ∆MS (second panel of the second row) of our galaxies is ∆MS = −0.030.17 −0.25 dex.If we check the position of our galaxies with respect to the MS according to M ⋆ (third row), we see that this typical ∆MS is maintained over the whole M ⋆ range, including the high-mass end, where most of our comparison samples are located, thus making it difficult to see our galaxies in the SFR vs M ⋆ plane above 10 9.5 M ⊙ .

Comparison data-sets
-"Super-deblended" catalogs The "super-deblended" catalogs (Liu et al. 2018, Jin et al. 2018), performed in GOODS-N and COSMOS and constructed using FIR and sub-millimeter images, use the prior positions of sources from deep Spitzer/IRAC and Very Large Array (VLA) 20 cm observations to obtain the photometry of blended FIR/sub-millimeter sources.They also employ the SED information from shorter wavelength photometry as a prior to subtract lower redshift objects.In the case of the COSMOS superdeblended catalog, the authors additionally select a highly complete sample of priors in the K s-band from the UltraVista catalogs.Apart from selecting those galaxies satisfying our redshift and M ⋆ cuts, we only keep those galaxies showing an SNR > 3 in at least 3 FIR to sub-millimeter bands from 100 µm to 1.2 mm, following Kokorev et al. (2021).The optical photometry of these galaxies is obtained by looking for possible optical counterparts in the CANDELS catalog performed in GOODS-N (B19) and in the COSMOS2020 catalog (Weaver et al. 2022).
The median redshift and quartiles of the galaxies satisfying our selection criteria are z = 1.6 2.0 1.1 , with a higher concentration of lower redshift galaxies compared to our sample, and in line with T20.This data-set is biased towards more massive galaxies than our sample (log M ⋆ /M ⊙ = 10.4 10.7 10.1 ), ∼ 1.8 dex more massive than our galaxies.Their optical i − H and b − i colors are redder than the ones traced by our sources (i − H = 1.85 2.34  1.35 mag and b−i = 1.24 1.71 0.88 mag, respectively), ∼ 1 mag redder in both colors.In terms of the position of these galaxies with respect to the MS, these sources show ∆MS values compatible with being MS galaxies, showing ∆MS = 0.34 0.55 0.14 dex.This value corresponds though to a more star-forming data-set, more compatible with the upper envelope of the MS, given the typical scatter.6% of the galaxies show values >3σ.If we examine the evolution of ∆MS with M ⋆ , we see that below 10 10 M ⊙ , the sample gets increasingly star-forming, with most of the sample below 10 9 M ⊙ surpassing ∆MS=1.
-A 3 COSMOS The A 3 COSMOS dataset (Liu et al. 2019a) contains ∼700 galaxies (0.3 < z < 6) with high-confidence ALMA detections in the (sub-)millimeter continuum.It consists of a blind extraction, imposing an SNR peak > 5.40, and on a prior-based extraction, using the known positions of sources in the COSMOS field, cutting the final sample to SNR peak > 4.35.We extract the photometry of these sources from the COSMOS2020 catalog.
The A 3 COSMOS galaxies with redshifts and M ⋆ in common with this work are mainly located at higher redshifts (z = 2.11 2.64  1.75 ) compared to our galaxies, and are also biased towards more massive objects (log (M ⋆ /M ⊙ ) = 10.7 10.9 10.5 ), ∼ 2 dex more Article number, page 4 of 18   Kokorev et al. (2021), and L19 to that of Liu et al. (2019b).The T20 data-set was used in that same work to obtain their scaling relation.

Mass-complete samples
This work 1.9 2.4 1.5 8.6 9.1 8.3 0.81 1.31 0.29 0.12 0.35 −0.04 −0.03 0.17 According to their position in the SFR vs. M ⋆ plane, these objects are also compatible with the MS, but, as well as the galaxies from the super-deblended catalogs, T20, and GG22, they are located in the upper envelope, showing values nearly 2 times the typical scatter (∆MS = 0.58 0.76 0.36 dex, with 13% of the galaxies above 3σ).
- Tacconi et al. (2020) The Tacconi et al. (2020) sample (T20 hereafter) is based on the existing literature and ALMA archive detections for individual galaxies and stacks.It consists of 2,052 SFGs.858 of the measurements are based on CO detections, 724 on FIR dust measurements, and 470 on ∼1 mm dust measurements.We extract their photometry looking for the counterparts of the individual objects in the CANDELS catalogs performed in the different cosmological fields, using the catalogs already specified together with the Stefanon et al. 2017 catalog for the Extended Groth Strip (EGS; Davis et al. 2007), and Galametz et al. 2013 for the Ultra Deep Survey (UDS; Lawrence et al. 2007, Cirasuolo et al. 2007).It is however true that, since part of their sample is based on stacking, our results regarding the colors will only reflect the nature of the individual detections that make up the sample.
We see that the Tacconi et al. (2020) galaxies meeting our redshift and M ⋆ criteria are centered at z = 1.4 2.0 1.1 , in line with the super-deblended sample.In terms of M ⋆ , this data-set is made up mostly of massive objects (log (M ⋆ /M ⊙ ) = 10.6 10.8 10.4 ), 2 dex more massive than our sample.According to the optical colors, this sample traces redder values of i − H and b − i, typically 1.87 2.36  1.45 mag and 0.95 1.40 0.76 mag for each of these colors.This is more than 1 mag redder in i − H and ∼ 0.8 mag redder in b − i.
These galaxies are more star-forming than our sources, showing ∆MS = 0.33 0.67 0.02 dex, which is compatible with them being in the upper envelope of the MS (13% of the galaxies are located above 3σ).
-COSMOS2020 * The COSMOS2020 catalog comprises 1.7 million sources across the 2 deg 2 of the COSMOS field, ∼966,000 of them measured with all available broad-band data.Compared to COS-MOS2015 (Laigle et al. 2016), it reaches the same photometric redshift precision at almost one magnitude deeper.It goes down to 10 8.43 M ⊙ at z = 1 with 70% completeness (10 9.03 M ⊙ at z = 3).We keep those galaxies that lie within the A 3 COSMOS footprint, which we will call COSMOS2020 * , consisting of 207,129 objects.This sample is not biased towards ALMAdetected galaxies, CO emitters, or high-mass systems, which makes it more similar to our data sample.
The median redshift of the galaxies within our redshift and M ⋆ intervals is z = 1.77 2.24  1.31 , comparable to the values we retrieve for our sample.The COSMOS2020 * data-set shows a typical M ⋆ of log (M ⋆ /M ⊙ ) = 9.0 9.6 8.6 , ∼0.5 dex more massive than our sample.According to the optical colors, these sources show similar i − H colors (0.85 1.21  0.51 mag) to our galaxies, and around ∼ 0.50 mag redder colors of b − i (0.59 0.89 0.37 mag).These objects are located well within the MS typical scatter, with a ∆MS very similar to the one we obtain for our data-set (∆MS = −0.060.28 −0.39 dex, with 4% of the galaxies above 3σ and 7% of the galaxies below 3σ).
-GG22 GG22 presented an ALMA blind survey at 1.1 mm and built a bona fide sample of 88 sources, comprising mostly massive dusty star-forming galaxies.Half of them are detected with a purity of 100% with a SNR peak > 5 and half of them with 3.5 ≤ SNR peak ≤ 5, aided by the Spitzer/IRAC and the VLA prior positions.We retrieve the optical fluxes of the GG22 ALMAselected galaxies from ZFOURGE.
The GG22 sources compatible with our redshifts and M ⋆ cuts are also biased towards high redshifts compared to our sample, similarly to A 3 COSMOS (z = 2.15 2.67 1.91 ).As well as the super-deblended data-set, T20, and A 3 COSMOS, GG22 is mainly made up of massive galaxies, ∼2 dex more massive than our objects (log (M ⋆ /M ⊙ ) = 10.5 10.7 10.3 ).Their optical colors are also redder than the ones showed by our sample, with median and quartiles being 1.98 2.54 1.30 mag for i − H (∼1 mag redder) and 0.82 1.36  0.41 mag for b − i (0.7 mag redder).These galaxies are also MS objects but, as well as the sources from the super-deblended data-set, T20, and A 3 COSMOS, they are more star-forming than our sources (∆MS = 0.46 0.83 0.22 dex), located above the typical scatter of the MS (20% of the galaxies above 3σ).

Comparison remarks
The main differences between our data-set and the comparison samples, with the exception of COSMOS2020 * , are the blue optical colors of our galaxies, their low M ⋆ coverage, and their closer proximity to the MS.However, the latter results concerning the blue optical colors of our galaxies can be a consequence of mixing different redshifts and M ⋆ when producing the colorcolor diagram.We thus decided to cut it in 0.5 redshift bins and select galaxies with > 10 10 M ⊙ , which allows a direct comparison with the other catalogs.Below 10 10 M ⊙ , as highlighted in different panels in Fig. 2, we lack sources to compare.The red colors of the comparison samples are thus due to > 10 10 M ⊙ systems.
In Fig. 3, we show the color-color diagram included in Fig. 2 divided into different redshift bins.We only show the superdeblended, A 3 COSMOS and COSMOS2020 * galaxies as comparison data-sets since the number of objects in each redshift bin included in these catalogs still provides the means to obtain meaningful number statistics to compare with.
When restricting our sample to galaxies with > 10 10 M ⊙ , the difference in i − H diminishes and we retrieve similar values to those obtained for the comparison samples.We get 2.23 2.66 1.71 mag at 1.0 ≤ z < 1.5, and 2.25 2.84  1.51 mag at 2.5 ≤ z ≤ 3.0 for our data-set.
For the b − i color, we trace bluer values than the superdeblended catalogs and A 3 COSMOS while getting similar results to COSMOS2020 * .The difference between the set of our sample and COSMOS2020 * , and the super-deblended catalogs and A 3 COSMOS increases with redshift, and the color gets bluer as well.We obtain 1.07 1.32  0.82 mag at 1.0 ≤ z < 1.5 (0.25 0.42 0.12 mag at 2.5 ≤ z ≤ 3.0) according to our data-set, compared to 1.56 1.98  1.19 mag at 1.0 ≤ z < 1.5 (1.20 1.48 0.99 mag at 2.5 ≤ z ≤ 3.0) for the super-deblended catalogs, and 1.54 1.76  1.02 mag at 1.0 ≤ z < 1.5 (1.36 1.61  1.01 mag at 2.5 ≤ z ≤ 3.0) for A 3 COSMOS.The similarity with COSMOS2020 * and the discrepancy with the superdeblended catalogs and A 3 COSMOS in this color are expected.The COSMOS2020 * includes all the galaxies at these M ⋆ , regardless of their flux at (sub-)millimeter wavelengths, hence being mass-complete at 10 10 M ⊙ , similarly to our sample.On the contrary, the super-deblended catalogs use prior positions from deep Spitzer/IRAC and VLA observations, and the A 3 COSMOS only considers sources with high-confidence ALMA detections, which is translated to redder colors of b − i and higher dust obscurations.Our galaxies show median optical extinctions, A(V), ranging from 1.03-1.71mag, smaller as we increase in redshift, whereas these numbers are 2.08-2.28mag for A 3 COSMOS.

Stacking analysis and flux measurements
In order to study the gas content of our galaxies, we stack the emission of objects similar to each other.We group galaxies according to (1) redshift and (2) log M ⋆ .We distinguish: -4 redshift bins: 1.0 ≤ z < 1.5, 1.5 ≤ z < 2.0, 2.0 ≤ z < 2.5, and 2.5 These divisions in redshift and stellar mass are chosen as a result of an estimation used to evaluate and maximize the probability of obtaining detections according to different combinations of redshift and M ⋆ intervals.The estimation is based on the depth of the observations and the previous knowledge about the gas reservoirs in galaxies as given by the scaling relations derived in other works (see Sec. 5.2).If we consider the expected gas fractions provided by these relations and use the δ GDR approach (see Sec. 1 and 5.1), we can calculate the typical flux density that corresponds to those gas fractions and we can roughly infer the number of objects necessary to obtain a measurement with SNR > 3.For this, we quantify the relation σ ∝ 1/ √ N (with σ being the resulting noise in the stacked map and N the number of objects) considering different combinations of redshift and M ⋆ bins and obtain that, for getting a measurement (SNR > 3) at 10 ≤ log M ⋆ /M ⊙ ≤ 11, just a few objects (< 10) are required.For the 9 ≤ log M ⋆ /M ⊙ < 10 bin, we require a number of objects of the order of hundreds.Finally, for the 8 ≤ log M ⋆ /M ⊙ < 9 bin, we would need tens of thousands of objects to reach the necessary depth according to our current knowledge of the gas reservoirs in galaxies.We check that the adopted redshift division guarantees these numbers for the 9 ≤ log M ⋆ /M ⊙ < 10 and 10 ≤ log M ⋆ /M ⊙ ≤ 11 mass bins, while for the 8 ≤ log M ⋆ /M ⊙ < 9 bin, we lack objects, regardless of how we divide in redshift, what already warns that the probability to obtain a measurement in this M ⋆ bin is very low.This estimation does not, however, ensure that we are obtaining measurements for the two remaining bins, given that scaling relations are not calibrated for the kind of objects we are considering in this work, but still can be used as a starting point.
After defining the bins, we stack 50×50 arcsec 2 (1,000×1,000 pixel 2 ) cutouts within the low-resolution ALMA mosaic, centered at each source and using the coordinates of the centroids provided by G13.Before the stacking, we corrected these centroids for a known offset between the HST and ALMA data, reported in different studies (e.g.Dunlop et al. 2017, Franco et al. 2018).We apply the correction from Whitaker et al. (2019), that corresponds to δR.A.(deg) = (0.011±0.08)/3600 and δdecl.(deg)= (−0.26± 0.10)/3600.We opted for median stacking galaxies instead of mean stacking them.This choice is motivated by our aim to provide an estimate of the gas reservoirs of the bulk of the SFG population, not biased towards bright sources.Additionally, this method allows us to get closer to the detection threshold in the case of the 10 9−10 M ⊙ bin; the use of mean stacking reports lower SNR in 3 of the 4 redshift bins.
Despite our choice, we computed the fluxes and further physical parameters from both, mean stacking and median stacking measurements.In Sec.5.1 we discuss quantitatively the effects introduced by mean or median stacking the galaxies in the gas content, and in Appendix A we include analogs of some of the figures and tables appearing in this paper showing the results obtained using mean stacking.
We checked that the centroids computed using the stacked emission in ALMA are compatible with those provided by G13, based on the HST imaging, within 0 ′′ .06.Following GG22, the photometry is calculated within an aperture of r = 0 ′′ .8.This radius provides the optimal trade-off between total flux retrieval and total SNR for the GG22 sample.We then apply the corresponding aperture correction by dividing this flux density by that enclosed within the synthesized dirty beam (normalized to its maximum value) using the same aperture radius (see Gómez-Guijarro et al. 2022a for more details).This aperture correction is ∼1.67 for r = 0 ′′ .8.
When the SNR < 3, we calculate an upper limit for the flux density based on the surrounding sky emission in the stacked image by placing 10,000 r = 0 ′′ .8 apertures at random positions across a 20×20 arcsec 2 cutout centered at the stacked source.We measure the photometry within each of these apertures and produce a histogram with all the values, fitting the resulting Gaussian distribution leftwards to the peak, to skip the possible emission of the source.We compute the upper limit as 3 times the standard deviation of the fit.We also checked that this approach is compatible with the standard deviation obtained by iteratively drawing N (equal to the number of sources in the stack) empty positions along the mosaic, stacking them, and measuring the flux within a r = 0 ′′ .8 aperture.The compatibility of the two methods is the result of the noise uniformity along the map.
If SNR > 3 within the aperture we repeat the measurement using an aperture radius r = 1 ′′ .We checked that this larger Article number, page 7 of 18 log M /M 10 This work Super-deblended 2.5 z 3.0 radius allows us to optimize the flux/SNR gain/loss, recovering ∼7% more flux.The aperture correction for r = 1 ′′ is ∼1.28.The uncertainty associated with the measurements is the result of the combination of the error of the stacked data (which is used to compute the SNR) and the uncertainty linked to the underlying distribution of sources that contribute to the stack.The former component is calculated by placing 10,000 r = 1 ′′ apertures at random positions across the 50×50 arcsec 2 stacked cutout.We measure the photometry within each aperture and fit the histogram leftwards to the peak, as done in the calculation of the upper limits in the previous case.The standard deviation provided by this fit is taken as the uncertainty.The uncertainty associated with the underlying distribution is computed via bootstrapping, considering the standard deviation of the bootstrap samples.
In Fig. 4 we show an image of all the stacks.In Table 2 we list the flux densities we measure, together with the derived uncertainties.In both, Fig. 4 and Table 2, we also include the results for the undetected data-set, defined in Sec. 2. In Sec.5.1, we discuss the effects of the inclusion or exclusion of the GG22 sources and their neighbors in the f gas .For both, the whole sample and the undetected data-set, we obtain SNR > 3 flux density measurements for 10 10−11 M ⊙ (high-mass bin) at all redshifts.The flux density enclosed within this M ⋆ bin increases towards higher redshifts.For the intermediate-mass (10 9−10 M ⊙ ) and the low-mass bins (10 8−9 M ⊙ ), we provide 3σ upper limits.In the case of the intermediate-mass bin, we obtain a signal close to our SNR threshold at 1.5 ≤ z < 2.0, with an SNR = 2.6 (SNR = 2.5 for the undetected data-set), and again at 2.5 ≤ z ≤ 3.0, with an SNR = 2.2 (SNR = 2.1 for the undetected data-set).Given   5. Flux ratio between a point source emission and a modeled galaxy profile versus the effective radius.The blue line shows the profile for a Sérsic index, n = 0.5, the orange one for n = 1, the green one for n = 1.5, and the red one for n=4.The vertical solid line shows the typical size of the dust component according to Gómez-Guijarro et al. (2022a) for a z = 1.9, log M ⋆ /M ⊙ ∼ 10.5 galaxy.The vertical dashed line shows the typical size according to the median Sérsic parameters extracted from G13 for the M ⋆ ≥ 10 10 M ⊙ galaxies in our sample.
the lack of detection in the lower mass bins, we tested whether regrouping the galaxies, stacking all sources within 10 8−10 M ⊙ , would allow the threshold to be exceeded.We did this for each redshift bin, and also including all galaxies at 1 ≤ z ≤ 2 and 2 < z ≤ 3.However, these tests do not report any detections; the galaxies in the low-mass bin dominate the emission of the stack.
The use of a certain aperture radius in our measurements, in this case, r = 1.0 ′′ , involves some flux loss.Departure from a point-like source may involve an additional flux correction based on the galaxy morphology (see Blánquez-Sesé et al. 2023).We consider two size estimations: the size of the dust component as prescribed by GG22 and the size of the stellar component as measured and reported in G13, based on H-band data.
As pointed out in several studies, the dust component is usually more concentrated than the stellar one (e.g.Kaasinen et al. 2020, Tadaki et al. 2020, Gómez-Guijarro et al. 2022a, Liu et al. 2023).However, it is currently uncertain if our stacks, based on a mass-complete sample including faint objects, follow the latter statement, given that previous size estimations of the dust component rely on individual detections of bright objects at (sub-)millimeter wavelengths.Due to this, we also include a size estimation based on the stellar component.
According to GG22, the effective radius R eff (the radius that contains half of the total light) of the dust component of a source with z = 1.9 and M ⋆ = 10 10.5 M ⊙ is 0 ′′ .10.At HST H-band resolution our galaxies are fitted by a Sérsic profile characterized by a median Sérsic index n = 1.36 and a median effective radius R eff = 0 ′′ .36.In Fig. 5, we show the flux correction factor one should take for our measurements (i.e., at 10 10−11 M ⊙ ) versus the R eff .Focusing on the size estimation provided by GG22, the flux correction associated with our measurements is negligible.According to the size of the stellar component, for a Sérsic index n = 1.0-1.5, this correction ranges from 1.17-1.22.If the size of the dust component resembled that of the stellar component, this ∼20% correction would translate to 0.08 dex larger M gas than those reported in Table 3 and Fig. 6.

Observed evolution of the gas reservoir of our sample
We calculate the gas content of our sample following two approaches.The first is based on the computation of a δ GDR using a mass metallicity relation (MZR), and the second on the RJ dust continuum emission (see Sec. 1).
For the first one, we produce synthetic spectra of the dust emission of our galaxies, according to their median redshift and ∆MS, using the Schreiber et al. ( 2018) IR SED template library1 .This library contains 300 templates, divided into two classes: 150 dust continuum templates due to the effect of big dust grains, and 150 templates that include the MIR features due to polycyclic aromatic hydrocarbon molecules (PAHs).These templates, which can be co-added, correspond to the luminosity that is emitted by a dust cloud with a mass equal to 1 M ⊙ .After scaling each template to the measured flux density of the stacked galaxy at 1.1 mm, we obtain the LIR by integrating the rest-frame template flux between 8 and 1000µm.This luminosity is then translated to M dust by multiplying the intrinsic M dust /LIR of the template by the LIR that corresponds to the measured flux density.Schreiber et al. (2018) models use different dust grain composition and emissivity yielding lower M dust by a factor of two on average when compared to the more widely used Draine & Li (2007) models.Therefore, in order to have comparable M dust with the literature studies and prescriptions needed to convert them into M gas , we re-scale the results based on the Schreiber et al. ( 2018) by an appropriate factor at each source redshift (Leroy et al. in prep).M gas is then obtained through the dust emission using the δ GDR -Z relation derived by Magdis et al. (2012), assuming the MZR from Genzel et al. (2015), using the median M ⋆ and z of the corresponding bin.The M gas that we get using this approach corresponds to the total gas budget of the galaxies, including the molecular and atomic phases.As explained in Gómez-Guijarro et al. (2022b) and references therein, the molecular gas dominates over the atomic one within the physical scales probed by the dust continuum observations at this wavelength.It is worth noting that this statement has been tested within the angular scales probed by dust continuum observations, but the HI may dominate at larger scales (Chowdhury et al. 2020).
Let us note that this approach assumes that the emissivity index (β) adopted in the Schreiber et al. ( 2018) templates (∼1.5, the average value for local dwarf galaxies; Lyu et al. 2016) is accurate for our galaxies since we do not have FIR data to better constrain this parameter.Leroy et al. (in prep) perform stacking using ALMA and Herschel data to obtain the SED of typical MS galaxies.They obtain this β by SED fitting, getting values that are compatible with the β = 1.5 assumed in the Schreiber et al. ( 2018) models.In Shivaei et al. (2022), they use stacks of Spitzer, Herschel, and ALMA photometry to examine the IR SED of high-z subsolar metallicity (∼0.5 Z⊙) luminous IR galaxies (LIRGs), adopting β = 1.5 for their sample.In this paper, they also discuss other possible values of this parameter, but still, they perform the analysis using β = 1.5.
For the second approach, we follow S16, using the corrected version of equation 16 from that paper.In this paper, they affirm that the luminosity-to-mass ratio at 850 µm is relatively constant under a wide range of conditions in normal star-forming and starburst galaxies, at low and high redshifts.We can thus use the measurements of the RJ flux density, derive the luminosity, and estimate M gas .They note that this approach is equiva-Table 2. Summary of the photometry derived in this work.We show the flux densities for our sample, and a subsample excluding the Gómez-Guijarro et al. (2022a) sources and those from Guo et al. (2013) which are closer than 5 ′′ to them, as explained in Sec. 2. We split the total uncertainty into its two components: the error due to the stacked data and the uncertainty linked to the underlying distribution of the individual galaxies.The total uncertainty is obtained by summing the previous error contributions in cuadrature.The absence of uncertainty denotes an upper limit at the 3σ level.lent to a constant δ GDR for high stellar mass galaxies.This approach is justified if the variation of the mass-weighted T dust on galactic scales is small.This is true for galaxies in the vicinity of the MS, as reported in Magnelli et al. (2014), that uses stacked Herschel data up to z ∼ 2. However, higher and a wider range of temperatures are observed in systems further away from the MS (see Clements et al. 2010, Cochrane et al. 2022).The fact that the mass-weighted T dust keeps constant at these redshifts for MS galaxies is also supported by simulations.Liang et al. (2019), using the high-resolution cosmological simulations from the Feedback in Realistic Environments (FIRE) project, report that the mass-weighted T dust does not strongly evolve with redshift over z = 2 − 6 at fixed IR luminosity.At a fixed redshift, it is however tightly correlated with LIR, hence sources with very high LIR, normally starburst objects, show higher massweighted T dust than 25 K.We do not expect such high LIR values for our galaxies.
It is also important to take into account that the RJ tail methods can be safely applied if λ rest >> hc/(k B T dust ), where k B is the Boltzmann constant and T dust refers to the mass-weighted T dust .For a mass-weighted T dust of 25 K, this requires λ rest >> 580 µm.At higher redshifts and for lower mass-weighted T dust , the restframe wavelength probed by ALMA band-6 observations is far from the RJ regime (Cochrane et al. 2022).Results at z ∼ 3 should therefore be interpreted with caution.
S16 insist that the calibration samples that they use are intentionally restricted to objects with high stellar mass (M ⋆ > 5 × 10 10 M ⊙ ), hence not probing lower metallicity systems.As a consequence, we only use this approach for the calculation of the M gas in the high-mass bin.On the other hand, for the low-and intermediate-mass bins, the M gas is only computed following the δ GDR method.We discuss the effect of S16 and other prescriptions in the calculation of the gas content in lower mass galaxies in Sec. 6.
An offset between both approaches, S16 and δ GDR , is expected though at high stellar masses, as reported in Gómez-Guijarro et al. (2022b), where they find a median relative difference (M RJ gas -M δ GDR gas )/M δ GDR gas = 0.23 ± 0.84 between both measurements.
The errors are estimated through 10,000 Monte-Carlo simulations perturbing the photometry randomly within the total uncertainties and assuming an error of 0.20 dex for the metallicity (Magdis et al. 2012).All these values are included in Table 3.As done previously for the fluxes, we also computed the error due to the underlying distribution for the median log M ⋆ , z, and ∆MS using bootstrapping.When propagating these uncertainties (of the order of the hundredth) to derive the error in the metallicity we obtain values lower than 0.20 dex, and hence lower uncertainties for the metallicity-derived parameters.As a consequence, we decided to keep the 0.2 dex value for the metallicity uncertainty, which is more conservative.
To fully understand and check the consistency of our measurements, we include two additional calculations of the gas fractions for the high-mass bin: one for the undetected data-set, and another considering only the GG22 galaxies.This is to both compare the measurement of the gas fractions that we obtain Fig. 6.Gas fractions versus stellar masses derived for each redshift range.Squares represent the gas fractions obtained for our sample, circles show the gas fractions derived for the undetected data-set.Diamonds represent the gas fractions calculated for the GG22 galaxies only.The differences between samples highlight the effect of the exclusion and inclusion of individually detected galaxies in the gas fractions.Gas fractions from the colored filled markers are computed following the gas-to-dust ratio prescription whereas colored empty markers with a cross within represent the gas fractions as calculated using the S16 method (only for the high-mass bin, upper limits are only computed following the gas-to-dust ratio method).Both values are included in Table 3. 3σ upper limits at lower stellar masses are shown with smaller squares and vertical arrows.Uncertainties are also included for the measurements.We also show the Liu et al. (2019b) 2022) (W22) scaling relations in green, gray, pink, and black, respectively.For each line, there is a dashed and a solid part.The solid part represents the mass range for which these relations are calibrated, whereas the dashed one shows an extrapolation of these relations for lower stellar masses.For the T20 relation, we also include the uncertainty as a shaded gray region.The scaling relations are computed using the median redshift of each bin and a ∆MS = 0.The distance from each of the points to these scaling relations is re-scaled to their corresponding redshifts and ∆MS using T20.
when considering the full mass-complete sample with the results excluding individually detected sources and to check what is the contribution of these bright individually-detected galaxies in the stacked measurements.All these values are also included in Table 3.
In Fig. 6, we show the evolution with M ⋆ of the gas fractions derived for each redshift bin.As mentioned in Sec. 4, we provide 3σ upper limits for the low-and intermediate-mass bins and measurements for the high-mass bin.For the low-mass bin we get f gas < 0.97 − 0.98 along 1 ≤ z ≤ 3.These numbers are f gas < 0.69 − 0.77 for the intermediate-mass bin.For the high-mass bin, focusing first on the δ GDR results, we obtain f gas = 0.32 − 0.48.Looking at the two additional cases, we check that removing the GG22 galaxies and their neighbors from our sample (i.e., considering only the undetected data-set) slightly drops the measurements (f gas = 0.30 − 0.45).The number of detected objects is much lower compared to the contribution of the undetected galaxies (see Table 3), which dominate the emission from the stack.If we only consider the detected galaxies from GG22 and stack them following the procedure described along Sec. 4, we get, as expected, much higher gas fractions (f gas = 0.46 − 0.66).Taking the individual gas fractions of the GG22 objects, provided in Gómez-Guijarro et al. (2022b), and calculating the mean for each redshift bin, we get values of f gas = 0.53 − 0.69.If we turn to the gas fractions as derived using the S16 relation, we see that the latter provides higher values of the gas fractions but both results are compatible within the uncertainties.
If we compare the values of f gas in terms of the kind of stacking method chosen, we see that mean stacking (see Appendix A) provides slightly higher values of the gas content at 10 10−11 M ⊙ (f gas = 0.34 − 0.54) getting closer to the W22 and T20 relations.Overall, this variation is compatible within the uncertainties derived for f gas .

Scaling relations framework and comparison with our sample
Based on galaxy samples such as the ones described in Sec. 3, several works provide us with different scaling relations that allow us to obtain the gas content of galaxies given the redshift, the ∆MS, and the M ⋆ .Some of these are The L19 relation is based on the A 3 COSMOS project, already introduced in Sec. 3, together with ∼1,000 CO-observed galaxies at 0 < z < 4 (75% of them at z < 0.1).The galaxies from the A 3 COSMOS project probe the log M ⋆ /M ⊙ ∼ 11 − 12 MS.Complementary sources (most of them belonging to Tacconi et al. 2018 andKaasinen et al. 2019) sample the log M ⋆ /M ⊙ ∼ 10 − 11 MS at z > 1.For z < 0.03, the complementary sample also covers the log M ⋆ /M ⊙ ∼ 9 − 10 MS, but they insist that the metallicity-dependent CO-to-H 2 conversion factor α CO might be more uncertain, and so the estimated M gas .
T20 is based on individually detected objects plus stacks of fainter galaxies, as pointed out in Sec. 3.This relation is an expansion of the results obtained in Tacconi et al. (2018).
K21 uses ∼5,000 SFGs at z < 4.5, drawn from the superdeblended catalogs introduced in Sec. 3. The median redshift of the sample K21 is based on is z ∼ 0.90, with a median M ⋆ ∼ 4.07×10 10 M ⊙ .The low-mass and low-redshift part of their sample is restricted to galaxies that lie above the MS.Nevertheless, 69% of the galaxies qualify as MS, 26% are classified as starbursts and 5% qualify as passive galaxies.
W22 is based on the COSMOS2015 galaxy catalog (in Sec. 3 we referred to COSMOS2020, which is an updated version of COSMOS2015).They select star-forming MS galaxies with ALMA band 6 or 7 coverage in the A 3 COSMOS database and well within the ALMA primary beam, obtaining a final sample of 3,037 sources.They stack galaxies, binning in redshift and M ⋆ , in the uv domain, covering the M ⋆ range 10 10−12 M ⊙ .They do not select galaxies according to a certain SNR threshold at Article number, page 12 of 18 (sub-) millimeter wavelengths, so the sample includes both, detected and undetected ALMA sources.
In Fig. 6, the values of the scaling relations there represented are derived using the median redshift of the bin and a ∆MS = 0.
If we compare our results with the previous scaling relations, for the high-mass bin, we see that the measurements for our sample are more compatible with the W22 scaling relation than with any of the latter relations.These measurements are also within the uncertainties defined by T20 but below L19 and K21.In the case of the intermediate-mass bin, the upper limits lie on the level established by the W22 and T20 extrapolations, and well below L19 and K21.In the low-mass bin, upper limits are poorly constraining and lie above most of the scaling relations.

Discussion
Pushing the limit to bluer, less dusty, more MS-like, and more mass-complete samples yields lower levels of gas than those prescribed by literature scaling relations based on redder and less complete data-sets.The super-deblended catalogs and the A 3 COSMOS samples are highly dust-obscured, showing red optical colors, and are more star-forming than our galaxies.As a consequence, the L19 and K21 scaling relations yield f gas −M ⋆ relations with a higher normalization.Going to mass-complete samples, like the one used in the W22 relation, leads to the inclusion of blue objects with low obscurations and SFRs compatible with being right on top of the MS.As a result, the W22 relation exhibits a lower normalization, better matching our results.T20 lies between the two regimes, presumably because it is based on a combination of individually detected red, dust-obscured objects, complemented by stacks of bluer and fainter galaxies.
Regarding our results for the high-mass bin, our low values of f gas could still be contaminated to a certain extent by the presence of post-starburst galaxies in their way to quiescence.These galaxies can have passed the UV J screening due to their blue U − V color and can be pulling the f gas to lower values.It is true, however, that this effect gains importance at z > 3 (D' Eugenio et al. 2020, Forrest et al. 2020, Valentino et al. 2020), out of the redshift range considered in this work.In Antwi-Danso et al. (2023), they test the performance of the UV J diagram selecting quiescent galaxies, including post-starbursts.According to their results based on the Prospector-α SED modeling, the UV J selection reaches the ∼90% completeness at z ≤ 4. They define this completeness as the number of quiescent galaxies that are selected divided by the total number of quiescent galaxies in the sample, with quiescence being defined as a specific SFR below the threshold of the green valley.
To quantify the effect of the possible contamination due to these post-starburst objects, we produce mock sources, based on sky positions where no galaxies have been cataloged, and introduce them in the stack, checking their imprint on the resulting f gas .Considering the UV J selection to be 90% complete at these redshifts, we see that after introducing these mock sources we obtain between 5%-7% less f gas .This difference is smaller than the uncertainty we derive for this parameter.
Additionally, the fact that we are comparing our values of f gas , obtained using a certain method, with the results provided by scaling relations whose measurements of f gas come from different conversion prescriptions, might be another source of discrepancy.
The L19 and W22 scaling relations rely on the RJ-tail continuum method of Hughes et al. (2017).Using this prescription to compute the f gas of our sample, adopting α CO =6.5 (K km s −1 pc 2 ) −1 , we get 7% larger values, similarly to what we get using S16.T20 use the Leroy et al. (2011) δ GDR together with the Genzel et al. (2015) MZR.Using this prescription, we obtain 3% larger values of f gas .K21 is based on the Magdis et al. (2012) δ GDR prescription together with the Mannucci et al. (2010) fundamental metallicity relation (FMR) calibrated for Kewley & Dopita (2002) that they convert to the Pettini & Pagel (2004) (PP04 N2) scale following Kewley & Ellison (2008).Using this method, we get 5% larger values of f gas .
The discrepancy between some of the scaling relations and our data is therefore not a consequence of the methodology or other factors that might be artificially pulling down our values, but simply the end result of considering a mass-complete sample that includes bluer less-dusty objects in comparison with other samples progressively more and more biased to redder dustier galaxies.
Concerning our findings for the intermediate-mass bin, it is important to take into account that at low M ⋆ the link between metallicity and the α CO or the δ GDR is still not well constrained and can lead to a bad estimation of the gas content.Up to date, there is very little information about the gas content of galaxies with ∼ 10 9 M ⊙ at high redshifts.Most efforts so far focused on galaxies at z ∼ 0 (e.g., Jiang et al. 2015, Cao et al. 2017, Saintonge et al. 2017, Madden et al. 2020, Leroy et al. 2021).According to T20 and references therein, it is hard or impossible to detect low-mass galaxies with substantially subsolar metallicity and to determine their gas content quantitatively.They suggest that there might be an interstellar medium component that might be missed or overlooked with the current techniques, such as gas/dust at very low temperatures.Deeper observations would be required to provide a better constraint on the f gas of these systems.
We also test the effect of using different prescriptions to compute the f gas in the intermediate-mass bin.Using RJ continuum methods such as S16 or Hughes et al. (2017) yields ∼ 10% lower values than the ones we report.Discrepancies are expected since these methods are calibrated for more massive galaxies.S16 relies on a sample of 0.2−4×10 11 M ⊙ galaxies whereas the Hughes et al. (2017) sample comprises M ⋆ ranging from 6−11×10 10 M ⊙ .On the other hand, the Leroy et al. (2011) prescription provides values which are compatible with our results (they differ in less than 1%), whereas the use of the Magdis et al. (2012) prescription using the Mannucci et al. (2010) FMR yields similar f gas at z < 2 but starts differing at higher redshifts, where this approach reports 8% less f gas .This difference is compatible with the uncertainties but still could reflect that the metallicity of low-mass galaxies at z > 2 deviates from that observed for local galaxies, contrary to what is seen in higher-mass systems, whose metallicity does not evolve with redshift until z > 2.5 (Mannucci et al. 2010).This highlights the need to re-calibrate these relations for less massive objects compatible with being MS galaxies.In most cases, the low-mass sample of this kind of studies are mainly made up of galaxies showing very high SFRs.

Summary and conclusions
Taking advantage of the CANDELS mass-complete catalog performed in GOODS-S (Guo et al. 2013), we are able to explore the gas content of galaxies in ALMA, using Band-6 observations at 1.1 mm (Gómez-Guijarro et al. 2022a).Our sample is composed of 5,530 star forming blue (< b − i >∼ 0.12 mag, < i − H >∼0.81 mag) galaxies at 1.0 ≤ z ≤ 3.0, located in the main sequence.It allows us to explore the gas content of 10 10−11 M ⊙ star-forming galaxies regardless of their emission Article number, page 13 of 18 at (sub-)millimeter wavelengths.Additionally, and thanks to the stellar mass coverage and completeness of the sample, we can provide an upper limit of the gas content of lower mass galaxies at ∼ 10 9−10 M ⊙ .We report measurements at 10 10−11 M ⊙ and 3σ upper limits for the gas fraction at 10 8−10 M ⊙ .
At 10 10−11 M ⊙ , we are tracing lower gas fractions, f gas = 0.32 − 0.48, than those derived from other scaling relations that use samples of redder and dustier objects on average, being biased towards individually-detected sources at (sub-)millimeter wavelengths, more subject to higher attenuations and also more star-forming than our galaxies.Relations based on more general mass-complete samples show more compatible values to the ones we report.
At 10 8−9 M ⊙ , the values we retrieve lie well above the scaling relations extrapolation, whereas at 10 9−10 M ⊙ the upper limits, ranging from 0.69 to 0.77, are located well within the region defined by the Wang et al. (2022) and Tacconi et al. (2020) scaling relations.The position of the upper limits at these intermediate masses supports the idea that the extrapolation derived from these scaling relations is representative of the upper bound of the underlying f gas −M ⋆ relation as traced by the bulk of star-forming galaxies.

Fig. 1 .
Fig. 1.Rest-frame U − V versus V − J color for the G13 sample within 1 ≤ z ≤ 3 and 10 8−11 M ⊙ .In red we show the sources classified as quiescent according to this diagram.The star-forming sources are depicted in blue.

Fig. 2 .
Fig.2.From left to right and up and down we show: stellar mass vs. redshift, a color vs. color diagram based on i − H and b − i, the star formation rate vs. stellar mass, histograms showing the distance to the main sequence in log scale, ∆MS, and ∆MS vs. stellar mass.We cut the comparison samples to only include galaxies within 1.0 < z < 3.0 and having 10 8−11 M ⊙ .In blue, we represent our sample.The GG22 galaxies are identified in yellow.TheTacconi et al. (2020) and A 3 COSMOS samples are shown in gray and green, respectively.The COSMOS2020 * sample is represented in magenta and the galaxies from the super-deblended catalogs are displayed in maroon.In the stellar mass vs. redshift plot, the blue contours showing our sample enclose roughly 20%, 50%, 60%, 70%, 80%, and 90% of the data.In the color-color diagram, the contours roughly enclose the 10%, 20%, 40%, 60%, 80%, and 90% of the data.In these two panels, histograms of the quantities there represented are also included, following the same color code.Quartiles are represented as horizontal segments in all the histograms.In the z-histograms shown in the first panel, we artificially elevate the baselines for the sake of clarity.TheMérida et al. (2023) main sequence up to 10 10 M ⊙ and theBarro et al. (2019) main sequence above 10 10 M ⊙ are shown in the third panel as a solid black line.The dashed lines in the third, fourth, and fifth panels show the area enclosed within 3σ with respect to the main sequence, based on the typical scatter reported inSpeagle et al. (2014) (∼0.30dex).The last panel, showing ∆MS vs. the stellar mass, is split for the sake of clarity, distinguishing between the super-deblended catalogs,Tacconi et al. (2020), A 3 COSMOS, and GG22 (top), and our sample and COSMOS2020 * (bottom).The typical uncertainties for the redshifts, stellar mass, and SFRs of our galaxies are small, ∼0.11, 0.07 dex, and 0.05 dex, respectively.In the case of the i − H and b − i colors, these are 0.14 mag and 0.20 mag, respectively.Article number, page 5 of 18

Fig. 3 .
Fig. 3. Color vs. color diagram based on i − H and b − i in different redshift bins.We only include galaxies with M ⋆ ≥ 10 10 M ⊙ within our data-set, the super-deblended catalogs, A 3 COSMOS, and COSMOS2020 * .See Fig. 2 for the description of the color codes and markers here shown.

Fig. 4 .
Fig. 4. Cutouts of 7×7 arcsec 2 of the ALMA low-resolution map showing the median stacked galaxies in each redshift and mass bin.For each bin, we include cutouts that correspond to the whole sample and the undetected data-set, as defined in Sec. 2. The apertures used to measure the photometry are displayed in green.The size of the beam is shown in the first panel of the figure in gray.The flux densities and the corresponding uncertainties for each stacked galaxy are included in Table2.Article number, page 9 of 18 Fig.6.Gas fractions versus stellar masses derived for each redshift range.Squares represent the gas fractions obtained for our sample, circles show the gas fractions derived for the undetected data-set.Diamonds represent the gas fractions calculated for the GG22 galaxies only.The differences between samples highlight the effect of the exclusion and inclusion of individually detected galaxies in the gas fractions.Gas fractions from the colored filled markers are computed following the gas-to-dust ratio prescription whereas colored empty markers with a cross within represent the gas fractions as calculated using the S16 method (only for the high-mass bin, upper limits are only computed following the gas-to-dust ratio method).Both values are included in Table3.3σ upper limits at lower stellar masses are shown with smaller squares and vertical arrows.Uncertainties are also included for the measurements.We also show theLiu et  al. (2019b) (L19), Tacconi et al. (2020) (T20), Kokorev et al. (2021) (K21), and Wang et al. (2022) (W22) scaling relations in green, gray, pink, and black, respectively.For each line, there is a dashed and a solid part.The solid part represents the mass range for which these relations are calibrated, whereas the dashed one shows an extrapolation of these relations for lower stellar masses.For the T20 relation, we also include the uncertainty as a shaded gray region.The scaling relations are computed using the median redshift of each bin and a ∆MS = 0.The distance from each of the points to these scaling relations is re-scaled to their corresponding redshifts and ∆MS using T20.
Fig. A.1.Cutouts of 7×7 arcsec 2 of the ALMA low-resolution map showing the mean stacked galaxies in each redshift and mass bin (see Fig. 4) The flux densities and the corresponding uncertainties for each stacked galaxy are included in Table A.1.

Table 1 .
Wang et al. (2022)perties of our data-set and the comparison samples shown in Fig.2, and described along Sec. 3. The comparison samples are limited to the redshifts and stellar masses studied in this work.The values here shown correspond to the median and first and third quartiles.The names between brackets refer to the scaling relations that are based on each data-set (see Sec. 5.2).W22 refers to theWang et al. (2022)relation, K21 to the relation from