The ALPINE-ALMA [CII] Survey: Obscured Star Formation Rate Density and Main Sequence of star-forming galaxies at z>4

Star formation rate (SFR) measurements at z>4 have relied mostly on rest-frame far-ultraviolet (FUV) observations. The corrections for dust attenuation based on IRX-β relation are highly uncertain and are still debated in the literature. Hence, rest-frame far-infrared (FIR) observations are necessary to constrain the dust-obscured component of the SFR. In this paper, we exploit the rest-frame FIR continuum observations collected by the ALMA Large Program to INvestigate [CII] at Early times (ALPINE) to directly constrain the obscured SFR in galaxies at 4.4<z<5.9. We use stacks of continuum images to measure average infrared (IR) luminosities taking into account both detected and undetected sources. Based on these measurements, we measure the position of the main sequence of star-forming galaxies and the specific SFR (sSFR) at z ∼ 4.5 and z ∼ 5.5. We find that the main sequence and sSFR do not evolve significantly between z ∼ 4.5 and z ∼ 5.5, as opposed to lower redshifts. We develop a method to derive the obscured SFR density (SFRD) using the stellar masses or FUV-magnitudes as a proxy of FIR fluxes measured on the stacks and combining them with the galaxy stellar mass functions and FUV luminosity functions from the literature. We obtain consistent results independent of the chosen proxy. We find that the obscured fraction of SFRD is decreasing with increasing redshift but even at z ∼ 5.5 it constitutes around 61% of the total SFRD.


Star formation rate density at z>4
The evolution of the star formation rate density (SFRD) is one of the fundamental elements for properly describing the history of the Universe. This measure provides information about the in situ growth of galaxies, provides an inference for the number of ionizing photons produced by galaxies, and, hence, can help to place constraints on the history of the reionization (e.g. Madau & Dickinson 2014;Bouwens et al. 2015a;Robertson et al. 2015).
The SFRD is generally measured by converting the dust-corrected FUV luminosity density of galaxies (see e.g. Madau & Dickinson 2014). Since a fraction of the farultraviolet (FUV) light emitted by massive short-lived stars can be absorbed by dust, when present, the total SFRD consists of two components: the SF RD F U V,uncorr measured from the rest-frame FUV luminosity density, and the dusthidden component SF RD IR measured from the rest-frame infrared (IR) luminosity density.
The ratio between these two components changes with time and, hence, redshift. The obscured fraction, SF RD IR , dominates at low redshifts; it is 3 to 10 times higher than SF RD F U V,uncorr at z<3 (e.g., Sanders et al. 2003;Takeuchi et al. 2003;Magnelli et al. 2011Magnelli et al. , 2013Burgarella et al. 2013). At higher redshift, the SF RD IR is poorly con-strained. However, since metals are produced by stars, the first galaxies should have very low dust content, slightly enriched only by Population III stars. Therefore, in these galaxies, SF RD F U V,uncorr should be well in excess of the dust-hidden fraction SF RD IR . As star formation continues, stars internal to the galaxy begin to generate more dust. While the survival of dust grains is regulated by a series of constructive and destructive processes (Calura et al. 2017), the net result is an increase of the SF RD IR in respect to the SF RD F U V,uncorr , but it is still unknown how fast the SF RD IR increases at early times and when it overcomes SF RD F U V,uncorr , mainly due to the lack of large, statistically-representative galaxy populations observed and selected both in the FUV and far-infrared (FIR) at high redshift.
Only a few attempts to constrain the SF RD IR at high redshifts exist in literature and their results remain inconclusive. Gruppioni et al. (2013) measured SF RD IR up to z∼4 using Herschel selected galaxies and found that the SF RD IR decreases rapidly at z>3. Rowan-Robinson et al. (2016) used the 500 µm sources from Herschel and found that the SF RD IR at z=3-6 is even higher than the estimates from the rest-frame UV, although these results still have significant uncertainty. The measurements of Koprowski et al. (2017) using SCUBA-2 data at redshift up to z∼5, on the other hand, suggest a very steep decrease of SF RD IR , steeper than Gruppioni et al. (2013) (see discussion in Gruppioni & Pozzi 2019), so that SF RD IR at z>4 is even lower than SF RD F U V,uncorr . This may be difficult to reconcile with Rowan-Robinson et al. (2016) estimates as well as with individual detections of very dusty galaxies at redshifts z 7 (Strandet et al. 2017;Bowler et al. 2018), z = 8.312 (Tamura et al. 2019) and z = 8.38 (Laporte et al. 2017), where the IR contribution should be extremely low if the SF RD IR continues to decrease rapidly with an increasing redshift. As the discoveries of individual highly dust-obscured objects are not from volumecomplete samples, it is difficult to determine whether such objects are common in the early Universe. However, the recent study using a volume-complete sample from CO Luminosity Density at High Redshift (COLDz) survey by Riechers et al. (2020) resulted in detection of three massive dusty starburst galaxies over ∼ 200, 000 Mpc 3 , indicating a high space density of such dusty objects at z>5. Since this study was limited to luminous starburst galaxies, this is only a lower limit and less luminous galaxies can contribute even more to SF RD IR . On the other hand, the previously mentioned studies based on measuring IR LFs Rowan-Robinson et al. 2016;Koprowski et al. 2017) suffered from several deficiencies at the high-redshift end (3.0 < z < 4.2). At these redshifts, the observations were found to be considerably incomplete, and, furthermore, those galaxies that were observed were mostly brighter than typical galaxies at these redshifts. As such, conclusions related to the relative contribution of unobscured and obscured star formation at these epochs were limited in nature.
To overcome the difficulty of computing IR LFs at z>4, the measurements of the cosmic infrared background (CIB) produced by dusty galaxies can be used together with the modeling of dusty galaxy clustering, from which the SF RD IR can be then derived. Maniyar et al. (2018) did this analysis using the CIB anisotropy measurements and their cross-correlation with cosmic microwave background (CMB) lensing (see also Planck Collaboration et al. 2014). The results turned out to be consistent with a shallower evolution of the SF RD IR at high redshifts compared to the results by Gruppioni et al. (2013) and Koprowski et al. (2017). At z>4, the SF RD IR does not drop below SF RD F U V,uncorr but approaches it up to z∼6.
Due to the difficulty in constraining the SF RD IR at redshifts z>4, the measurements of the total SFRD rely mostly on the rest-frame FUV part of the spectrum. To account for the obscured fraction, the relation between the infrared excess (IRX) and the UV spectral slope (β) (Meurer et al. 1999) is often used. However, stacking analysis (Fudamoto et al. 2017), as well as individual observations of high redshift galaxies with ALMA revealed significant scatter of dust properties. Most galaxies show a deficit of FIR flux compared to local galaxies with the same β slopes (Capak et al. 2015;Bouwens et al. 2016;Fudamoto et al. 2020), which can be partially explained by higher dust temperatures (Faisst et al. 2017). The measurements of dust temperatures are scarce but indeed higher than in local galaxies (30-43 K at z∼5) . Since measurements of the SFRs at high redshifts are based on dust corrections coming from the IRX-β relation, the scatter in the relation induces uncertainties on these measurements. For this reason, rest-frame FIR observations at z>4 remain important to constrain the IR luminosities independently.

Main sequence and sSFR at z>4
It is well established in the literature that star-forming galaxies form a so-called main sequence, a relationship between the amount of stellar mass in situ (M * ) and the SFR (Noeske et al. 2007;Daddi et al. 2007;Whitaker et al. 2012;Speagle et al. 2014;Salmon et al. 2015;Tasca et al. 2015;Schreiber et al. 2015;Tomczak et al. 2016;Santini et al. 2017;Pearson et al. 2018;Khusanova et al. 2020). However, the measurements of the SFRs of individual galaxies at z>4 suffer from the uncertainty of the IRX-β relation and/or the dust temperature of the galaxies (see Sect. 1.1), hence, lower number of detection in the rest-frame FIR than expected assuming local IRX-β (e.g., Capak et al. 2015;Bouwens et al. 2016). Besides, getting a wide volume in the rest-frame FIR at z>4 is expensive observationally. Therefore, the constraints on the main sequence of galaxies are limited and are mostly based on the rest-frame FUV data.
Only a few attempts have been made to constrain the main sequence using sub-mm observations, mainly using data from Herschel (Schreiber et al. 2015;Pearson et al. 2018). At z>4, however, the measurements are limited by confusion noise (Nguyen et al. 2010), which makes it difficult to observe individual sources without careful and wellinformed de-blending schemes. Therefore, these works primarily are limited to only the high mass end of the main sequence (log(M * /M ) > 10.5). Star-forming galaxies with such high masses are rather rare, and it remains unclear if the slope of the main sequence remains the same over the whole range of masses.
The main sequence evolution with redshift is linked to the sSFR evolution defined as SF R/M * , although the sSFR evolution can vary depending on the mass if the slope of the main sequence is not unity, as claimed in several studies (e.g., Salmon et al. 2015;Schreiber et al. 2015;Tasca et al. 2015;Khusanova et al. 2020). In that case, the measurements from UV selected samples, which probe galaxies with lower stellar masses (log(M * /M ) 10) are not comparable with FIR samples probing higher stellar masses (Schreiber et al. 2015). The sSFR constraints at high redshifts are inconclusive: while many studies find a shallower evolution at high redshifts (Bouwens et al. 2012;González et al. 2014;Tasca et al. 2015;Faisst et al. 2016;Santini et al. 2017), some find higher sSFR, consistent with steeper evolution, as at lower redshifts (Stark et al. 2013;de Barros et al. 2014).
The ALMA Large Program to INvestigate [CII] at Early times (ALPINE) was designed with a goal of measuring the dust-hidden SFR in galaxies at z>4 from a representative UV selected sample Le Fèvre et al. 2019;Faisst et al. 2020). In this paper, we use a large sample of 118 galaxies observed in the rest-frame FIR with ALMA to determine the SF RD IR at z>4 from ALPINE. We also determine their average and individual SFRs and constrain the main sequence over the stellar mass range 9 log(M * /M ) 11. The combination of the rest-frame FUV and FIR data allows estimating SFRs without relying on IRX-β relation and by directly measuring the total infrared luminosities. We present the ALPINE data in Sect. 2. In Sect. 3, we describe the stacking code and the methods to measure the SFRD. In Sect 4, we present and discuss our SFRD measurements. Sect. 5 is dedicated to a discussion about the main sequence and the sSFR evolution with redshift. In Sect. 6 we summarize our results.
Throughout the paper, we use a ΛCDM cosmology with Ω Λ = 0.70, Ω m = 0.30 and h = 0.7. All magnitudes are given in the AB system. We assume a Chabrier (2003) initial mass function (IMF) to convert luminosities to SFRs.

Data
ALPINE was designed to observe [CII] emission and the rest-frame FIR continuum of a representative sample of star-forming galaxies in the redshift range 4.4 < z < 5.8 with a gap at redshifts 4.65<z<5.05, where the [CII] 158 µm emission line falls into the low transmission window (Le Fèvre et al. 2019). The selection of galaxies covers a wide range of stellar masses 9 log(M * /M ) 11 and star formation rates derived from the rest-frame FUV with SED fitting 1 log(SF R[M /yr]) 3 . The targets are "normal" star-forming galaxies and, based on observations in the rest-frame FUV, populate the main sequence. We use stacking of both detections and nondetections to derive the average properties of the sample (see Sect. 3). In this way, we avoid bias towards Luminous Infrared Galaxies (LIRGs) and Ultra Luminous Infrared Galaxies (ULIRGs), compared to other observations of samples selected in the rest-frame FIR at these redshifts.
In total, 118 galaxies were observed in the period from May 2018 to January 2019 (see Bethermin et al. 2020). The targets were primarily selected from two large spectroscopic surveys: VIMOS UltraDeep Survey (VUDS, Le Fèvre et al. 2015;Tasca et al. 2017) and DEep Imaging Multi-Object Spectrograph (DEIMOS, Faber et al. 2003) 10K Spectroscopic Survey (Hasinger et al. 2018). All targets have secure spectroscopic redshifts from the rest-frame UV spectra and are imaged with both space and ground-based instruments (Grogin et al. 2011;Koekemoer et al. 2011;Laigle et al. 2016;Skelton et al. 2014). The secure spectroscopic redshift ensures that [CII], if present, will fall within the relatively narrow observed ALMA bandpass.
The ALMA data were calibrated with the standard ALMA pipeline. The CASA software was used for the data reduction (McMullin et al. 2007). To produce continuum maps, we excluded all the channels that correspond to the position of the [CII] line. In case of non-detections of the [CII] emission, we used spectroscopic redshifts from VUDS or DEIMOS 10K and excluded data within 500 km/s around the expected position of [CII] line. For galaxies with detected [CII] emission, we manually defined the window around the detected emission line. We then produced the continuum maps, using the task tclean. We used natural weighting to maximize SNR. More detail on data reduction can be found in Bethermin et al. (2020).
The surveys was optimized to detect [CII] line emission. Therefore, the expected SNR for the rest-frame FIR continuum varies in the range from 0.5 to 3 for most targets based on the SED models of Béthermin et al. (2017). We reached a sensitivity of 50 µJy/beam at 4.4<z<4.6 and 28µJy/beam at 5.1<z<5.8. We found that the 95% purity is reached with a SNR=3.5 cut for targeted galaxies. Only 23 out of 118 (19%) of the ALPINE targets are detected with such significance. However, given the known positions of the targets, we can extract the average continuum flux of non-detected galaxies using stacking, as described in Sect. 3.1.
The galaxies in ALPINE are chosen in the two wellstudied fields, Cosmic Evolution Survey (COSMOS, 89% of  Giacconi et al. 2002) that is covered by CANDELS (Koekemoer et al. 2011;Grogin et al. 2011 analyzed the multiwavelength ancillary data in both fields and derived the physical properties of galaxies using SED fitting (stellar masses, SFRs, rest-frame FUV luminosities, etc). The SED fitting is performed with the LePhare package (Arnouts et al. 1999;Ilbert et al. 2006), using Bruzual & Charlot (2003) models. The model parameters are outlined in Table 1. The photometry from the public COSMOS catalog (Laigle et al. 2016) was used for targets in the COSMOS field, and the 3D-HST catalog (Skelton et al. 2014) was used for the ECDFS. The fitting was performed in flux density space and emission lines are included. More detail about the SED fitting of ALPINE galaxies can be found in Faisst et al. (2020).

Stacking of the rest-frame FIR continuum images
As mentioned above, for ∼81% of the ALPINE targets, the observed continuum flux cannot be measured since it falls below the 3.5σ limits. Therefore, we used stacking to measure the average flux of galaxies with similar physical properties. We use all the targeted galaxies including both detections and non-detections, since our goal is to get the average flux of the whole population of galaxies in each bin.
Here we use the mean stacks of continuum images. We only use the targeted galaxies and their UV-positions for alignment. Since the targets are not perfectly aligned on stacks due to uncertainties in astrometry or real physical offsets, the flux is measured using an aperture of 3 arcsec diameter. Bethermin et al. (2020) has tested three other methods of continuum flux measurements in addition to aperture photometry: Gaussian fitting on a continuum map and on the UV plane, and peak flux. They all were proven to be consistent with each other, except for the peak flux. This is because the sources are not pointlike. The other methods give equivalent results. However, on the stacks only aperture photometry can be used, since we cannot align the non-detected sources and their sizes are possible also different. Therefore, their mean profile will not be Gaussian. The aperture photometry, on the other hand, makes it possible to measure the flux without any assumption on the profile shape. Since the beam sizes in ALPINE do not exceed 1.6 arcsec, the 3 arcsec aperture is large enough to include all the flux from the targets if the sources are significantly extended. Fujimoto et al. (2020), examined the radial surface brightness profiles of ALPINE targets and found that the sources are compact in the restframe FIR.
We use bootstrapping analysis to ensure the robustness of the measurements of the flux and to estimate the uncertainties coming from both the noise and the sample variance. Each stack is produced from a set of continuum images. To measure the average flux and its uncertainty, we randomly draw a new sample of images without withdrawal from the initial set and produce a new stack image, from which the continuum flux is remeasured. We repeat the procedure 1000 times and find the average and standard deviation of the measurementσ boot .
We also measure the flux in the field with the same aperture at each step. We mask the positions of the serendipitously detected objects and the center beforehand, to ensure that the flux measured in the field is not contaminated. The apertures are then placed at random positions, away from serendipitous sources. The variance of the field flux is an estimate of the photometric noise on the stacked image and is on average σ phot = 0.05 mJy for the typical number of stacked images (8). The average flux in the field is compatible with zero as expected (F f ield = 0.1 ± 50.0 µJy).
As discussed in Béthermin et al. (2012), the bootstrap uncertainty σ boot and the photometric noise σ phot are related as: where N stack is the number of images in the stack and σ pop is the intrinsic population variance. The uncertainties derived using the bootstrap method are thus including both the photometric noise and the sample variance coming from the finite size and the possible large heterogeneity of the stacked samples. In order to test reliability of the flux measurement on the stacks, we generated a number of fake sources on the images avoiding the center with fluxes drawn from a Gaussian distribution with a known mean F in and variance σ in .
The sources are Gaussian and their FWHM were allowed to vary from 0.8 to 1.6 arcsec (as the beam sizes in ALPINE, Bethermin et al. 2020). We used the procedure described above to stack images with the fake sources, but in order to simulate position uncertainties of real targets, we added random offsets sampled from a Gaussian with σ = 0.12 arcsec (as the average astrometric offset of ALPINE targets, Faisst et al. 2020). Then we measured the average flux F out and the bootstrap uncertainty σ out . We performed the procedure 250 times, varying the F in flux value in the range from 0.01 to 0.18 mJy or ∼0.25-3.5σ phot and the number of images from 6 to 13 (as we used later for the stacks). Fig. 1 shows the distribution of ∆F/σ out = (F out − F in )/σ out . The mean of the distribution is -0.03 and the standard deviation is 1.13. The mean is compatible with zero. The standard deviation of the distribution can be reduced to ∼ 1, if we limit the analysis to stacks with a high number of images (>20), however, this is not feasible given the number of objects in the sample. We, therefore, make stacks with the number of objects in range from 6 to 13, but divide them in smaller bins to be able to constrain the In conclusion, the average fluxes on stacks are well recovered within the error bars if the sources are not significantly extended or offset. This is the case for the detected targets , and stacking, therefore, can be safely used to recover the average continuum fluxes.

SFRD measurement: method description
The total SFRD can be defined as (see e.g. Madau & Dickinson 2014): where SF RD F U V,uncorr is derived from the FUV luminosity density (ρ F U V ) not corrected for dust attenuation, and SF RD IR is derived from the IR luminosity density (ρ IR ). The conversion factors are κ F U V = 1.5 * 10 −10 M year −1 L −1 and κ IR = 10 −10 M year −1 L −1 for Chabrier (2003) IMF (Kennicutt 1998).

Measuring IR luminosity density
Since ALPINE is not a volume-limited sample, it is rather difficult to determine the IR LF from the target sample (see Gruppioni et al. 2020, for IR LF measurement based on non-target galaxies found in ALPINE). We use, therefore, a different approach to obtain the IR luminosity density, which can then be converted to the SF RD IR . The method is based on using some physical property as a proxy of the infrared luminosity -L IR and convolving the volume density distribution of this proxy with the mean LIR-proxy relation as: where x is a proxy (M F U V or M * , see Sect. 3.2.2), L IR is the total infrared luminosity and φ(x) is either the FUV luminosity function (UVLF) or the galaxy stellar mass function (GSMF). The total IR luminosity is derived from the SED templates scaled to the 158 µm flux measured on stacks and then integrated from 8 to 1000 µm. We use average conversion factors f temp,z∼4.5 = 0.13 ± 0.02 and f temp,z∼5.5 = 0.12 ± 0.03 derived from a set of templates from the literature, which are consistent with the stacked Herschel data (see Appendix A for more detail on the templates and conversion factor). A summary of the method is presented in Fig. 2. The volume density distribution can be derived, for example, from the parent surveys, from which ALPINE targets were selected (VUDS or DEIMOS 10K) or from the literature measurements of the proxy density functions (UVLF or GSMF in our case).

The choice of proxy
The method described above relies on the assumption that there is a relation between the proxy and the IR luminosity. To first order, this is true for stellar masses (M * , we will refer to "stellar mass" as "mass" hereafter) of star-forming galaxies, which sit on the main sequence. Their SFR increases with the mass (Whitaker et al. 2012Tasca et al. 2015;Salmon et al. 2015;Santini et al. 2017;Pearson et al. 2018;Khusanova et al. 2020). While it is still unclear what fraction of SFR is hidden by dust, we expect that there is a relation between the masses and the IR dust emission (see Sect. 5 for more discussion on the main sequence). Also the FUV-magnitudes are linked to SFR to first order. If we assume that more star-forming galaxies have higher IR luminosities, the FUV-magnitudes should correlate with the L IR . Each proxy, however, has its own sources of uncertainty, which are discussed below. Therefore, we use both of them and compare the results.
The accuracy of the mass is limited, since it is derived from SED fitting and it is, therefore, model-dependent.
The caveat in using the mass as a proxy is the fact that the sample of galaxies in ALPINE is selected to have FUV luminosities above 0.6L * . Hence, the galaxies selected at lower masses are not representative of typical galaxies at these masses, since they are biased towards higher UV luminosities at fixed mass. We show in Fig. 3 the median values of the M * −M F U V relation from Salmon et al. (2015) and the fit of this relation. This relation agrees well with the M * and M F U V measurements for ALPINE targets. We find the mass at which the 1σ envelope around the best fit crosses the FUV luminosity selection limit, where σ is the scatter around the fit. As shown in Fig. 3, the mass limit is log M lim /M = 9.6 at both redshifts. Below this limit we are likely to miss galaxies with faint FUV magnitudes, biasing SFR towards higher values.
The FUV magnitudes are not affected by this bias. Since the spectroscopic redshifts are known, they are not modeldependent and can be derived with high accuracy given well-measured photometry ). We do not correct the derived FUV magnitudes for dust attenuation in order to keep them model-independent. However, the more dust is in the galaxy, the more FUV radiation is absorbed for a given SFR/IMF combination. Therefore, even if majority of faint in FUV galaxies have low dust content, very dust rich galaxies can contaminate FUV faint bins. In that case, the population will be heterogeneous in each bin and the relation may have larger scatter. This will make it more difficult to constrain the relation between FUV magnitude and IR luminosity. We determine the population variance with the bootstrapping procedure described in Sect. 3.1 (see Eq. 1). The impact of the population variance is then propagated to the uncertainties on the SFRD. However, if the population variance is intrinsically too large, the use of FUV magnitudes as a proxy can be challenging to analyze small samples.

Integration limits
ALPINE probes only a range of masses (9 log(M * /M ) 11) or FUV-magnitudes (−19 log(M * /M ) −23), used as a proxy (this range is schematically shown as shaded region in Fig. 2). In this range, we have measurements of the rest-frame FIR flux directly from the stacks of ALPINE images. The proxy density is known from observations. The SF RD IR measured from integration in this range is, hence, the most robust lower limit of the dust-hidden fraction of SFRD. However, more infrared luminosity can be contained in FUV faint or low mass galaxies. While these galaxies are expected to have low dust content, they can still affect the results due to their high density in the Universe. Galaxies with high masses or bright in FUV are rare, but could also contribute to the total SFRD budget. Therefore, we need to extrapolate the observed in ALPINE L IR − proxy relation and the proxy density functions to obtain the estimate of the total SF RD IR .
We determine, first, the lower limit (SF RD IR,min ) by integrating the Eq. 3 within the observed in ALPINE mass of FUV-magnitude range. Then, for the extrapolation, we use the integration limits corresponding to 0.03L * 1 and 100L * . These are M F U V = −17 and M F U V = −26, respectively. In order to define integration limits for masses, we convert these limits to SFRs. Then we use the relation between SFR and mass from the main sequence fit in Khusanova et al. (2020) to find the corresponding masses: log(M * /M ) = 6.0 for the lower limit and log(M * /M ) = 12.4 for the upper limit. With these limits, we can make a meaningful comparison between the SF RD F U V,uncorr derived in the literature from UV LFs and the SF RD IR measured here.
We note that even after extrapolation, we can still be missing galaxies with high IR luminosity, but too faint in the rest-frame FUV to be detected in current photometric surveys (e.g., Caputi et al. 2014;Franco et al. 2018;Williams et al. 2019;Wang et al. 2019;Gruppioni et al. 2020). Their contribution will be discussed in the Sect. 4.3.3.

Proxy number density
First, we need to define the proxy number density: UVLF or GSMF. Their measurements are widely available in the literature (e.g., Bouwens et al. 2015b;Finkelstein et al. 2015;Ono et al. 2018;Pelló et al. 2018;Khusanova et al. 2020;Song et al. 2016;Davidzon et al. 2017;Wright et al. 2018) but are still affected by many uncertainties. In particular, the faint and low mass end slopes are still not well constrained. Another uncertainty is the shape of the UVLF.   While most authors assume the Schechter function form, some recent works found that at high redshift the double power law (DPL) is preferable form (Bowler et al. 2015;Ono et al. 2018;Khusanova et al. 2020). We do not favor any particular UVLF or GSMF, but use several of them to show how the uncertainties on their parameters and functional form propagate into our SFRD estimates.  2018) used a joint analysis of GAMA, COSMOS, and 3D-HST data. We note that the physical parameters derived for ALPINE galaxies in COSMOS field were compared with the COSMOS15 catalogue and found to be consistent (see Faisst et al. 2020). Therefore, the uncertainties of mass measurements, which are more affected by the choice of the IMF and SFH in SED fitting compared to FUV magnitudes, should not have impact on our results, when we combine the COSMOS-based GSMF with LIR-mass relation from ALPINE.

L IR -proxy relation
As discussed in Sect. 3.2.2, we use two proxies in this work: masses and the FUV magnitudes derived using the SED fitting as described in detail in Faisst et al. (2020). The bin sizes are chosen to be small enough to reduce the heterogeneity of the underlying population but to include enough galaxies (6)(7)(8)(9)(10)(11)(12)(13) in the bin to have robust average flux measurements. As was discussed in Sect. 3.1, even faint fluxes on the stacks give us information on the average flux of the underlying population of galaxies. Therefore, we use 3.5σ phot upper limits only when the measured fluxes of the stack have negative values. In all the remaining cases, we use the average flux measured with bootstrapping (the SNR in these cases is >0.5σ phot , see Fig. 4). When stacking by mass bins, we consider the results as upper limits in bins below log M lim /M = 9.6, for the reasons discussed in Sect. 3.2.2 (bias towards higher SFR because of the FUV selection of ALPINE). The upper limits are the sum of the flux measured on stacks and 3σ. We converted the average rest-frame 160 µm fluxes to L IR and found the L IR -proxy relation. We show the resulting relations in Fig. 4. The L IR is clearly well correlated with both proxies. We fit this relation with a power law using the MCMC method with emcee package in python and checking at each step that the fit does not go above upper limits. The results are shown in Fig. 4. The slope is well constrained, and the normalization lies within the error bars. We use the MCMC chains later when deriving the SF RD IR . This allows us to derive the uncertainty of the SF RD IR measurements, which relies on the precision of our L IR -proxy relation.
In Fig. 4, we show the photometric errors and population variance computed as described in Sect. 3 (see Eq. 1) and SNR in each stack. The population variance at z ∼ 4.5 is ∼ 0.91 dex on average and at z ∼ 5.5 it is ∼ 0.47 dex, almost two times lower. As discussed in Sect. 3.2.2, if the underlying population is heterogeneous, the scatter of its FIR properties will be large. This is what we observe at z∼4.5. This makes the L IR − M F U V relation particularly difficult to constrain at z ∼ 4.5.
We note that the two points below M lim , which were excluded from the fit due to a possible bias (see Sect. 3.2.2), are in a very good agreement with the power law fit to the points above M lim . Therefore, we trust these two points and use them later, when we derive the main sequence.
Interestingly, we observe a stronger evolution with redshift of the L IR −M * relation than the L IR −M F U V relation. As the redshift decreases, the L IR on average increases for galaxies with the same mass, which means the dust attenuation increases in these galaxies.

Infrared contribution to SFRD
We present the results of our measurements of SF RD IR in Fig. 5 and 6 for redshifts z∼4.5 and z∼5.5 respectively, and in Table 2. Although the results depend on the choice of the proxy density parameters, in most cases they agree with each other within the error bars.
The choice of the proxy also plays a role in the constraining power of our method. At redshift z∼4.5, the population variance in FUV bins is larger as we noted before. Hence, although we find a fit to the measured points, these constraints should be interpreted with caution. At z∼5.5, on the contrary, we have weaker constraints when using the mass as a proxy. Due to the bias towards higher F U V magnitudes in low mass bins, we had to exclude them from the fitting of the L IR − M * relation. The remaining points have larger uncertainties at z∼5.5. Hence, the fit of the L IR −M * relation has larger uncertainties, which propagate into uncertainties of the SFRD estimates.
The measurements obtained with FUV as well as M * as a proxy agree well with each other within the errors in both redshift bins. The fact that the measurements obtained with different proxies are consistent with each other gives us on robustness of our constraints of the IR contribution to the SFRD. In the next section, we discuss our measurements in the context of the SFRD evolution and describe the results in more detail.

The lower limit
First, we obtained the lower limit on SF RD IR . For that, we integrated the L IR -proxy relation with the proxy density function in the range where both of them are known from the observations. Hence, this estimate is free from the uncertainties induced by extrapolations. The integra-  As described in Sect. 4.1 and 4.2, each power law function describing L IR -proxy relation is fitted with the MCMC (see Eq. 3). To derive the SF RD IR and its uncertainty, we use the MCMC chains of fit parameters to produce a resulting SF RD IR chain. In this way, we obtain a probability distribution function (PDF) of SF RD IR , which is a normalized distribution of individual SF RD IR measurements at each step. These PDFs are shown in the left panel of Fig. 5 and 6 for both proxies. Depending on the choice of the proxy density function, the PDFs differ but remain consistent within the error bars for each proxy. In Fig. 8, we show the lowest value minus 1σ as our lower limits for the IR contribution to SFRD.
These are the most robust lower limits of the SF RD IR as they are obtained with a large and representative sample of galaxies at 4.4 < z < 5.8 and only using the proxy range, where actual observations are available. Given this lower limit, the infrared contribution to the total SFRD is already at least 8-13% at z > 4 as compared to the total SFRD from the Madau & Dickinson (2014) curve. While this is seemingly a small amount, these results are in severe disagreement with the results by Koprowski et al. (2017). They fitted the evolution of SF RD IR with redshift by the Gaussian function. The extrapolation of this curve gives log(SF RD IR M yr −1 M pc −3 ) = −3.95 at z∼5.5, which is much lower than the ALPINE lower limit at this redshift log(SF RD IR M yr −1 M pc −3 ) = −2.67. Although, the agreement between ALPINE lower limits and SF RD IR measurements by Koprowski et al. (2017) could be reached by a shallower function at z>4, we note that we only considered a small range of masses or FUV magnitudes for computing the SF RD IR and our sample is UV-selected. Hence, the real values at both redshifts are even higher and in even stronger tension with Koprowski et al. (2017). Due to the fact that the work of Koprowski et al. (2017) is based on the infrared luminosity functions and in the high redshift bins, the observations barely cover the faint end slope of the IR LF, whose uncertainty could be one of the source of this discrepancy.

Extrapolated SF RD IR
The total SF RD IR is higher than the lower limit derived above. To derive the total SF RD IR , we assume that the L IR − proxy relation is the same at all FUV-magnitudes and masses and extrapolate the L IR − proxy relation as well as the proxy number density function to the integration limits, corresponding to 0.03L * and 100L * or log(M * /M ) = 6.0 and log(M * /M ) = 12.4 for masses, as mentioned in Sect. 3.2. We obtain the PDFs in the same way as above and show the results on the right panels in Fig. 5 and 6. To place the results in the context of the SFRD evolution, we combine the PDFs obtained with different proxy density functions from the literature. Since the lengths are the same for all chains, combining them does not favor any particular proxy density function. We obtain an average estimate of the SFRD and our uncertainties naturally include the systematics coming from the GSMF and UVLF. Fig. 7 illustrates the evolution with redshift of SF RD IR and the uncorrected for dust estimate from FUV luminosity density -SF RD F U V,uncorr . We observe a gradual decrease of the SF RD IR , which only starts to intersect with the SF RD F U V,uncorr at z>5. Even at such high redshifts, the SF RD F U V,uncorr does not overcome the SF RD IR . It remains an open question, however, at which redshift the SF RD F U V,uncorr overcomes the SF RD IR , since at z>4 the SF RD IR is only gradually approaching the SF RD F U V,uncorr . If the evolution is flatter at higher redshifts, then the dust contribution becomes negligible compared to SF RD F U V,uncorr only at very high redshifts. This trend is consistent with the extrapolation of the Maniyar et al. (2018) model of SF RD IR evolution, which is based on CIB anisotropies. Our results agree well with this model at both redshifts. However, if the contribution from ultradusty objects, which are missing from the optical and nearinfrared (NIR) catalogs is significant, the SF RD IR evolution can be even flatter.
In this paper, we only use the target sources from ALPINE. Recently, Gruppioni et al. (2020) determined the SF RD IR using the non-target ALPINE sources. Our results are in good agreement with the results by Gruppioni et al. (2020) being higher. This is not surprising, since these sources were not selected from the surveys probing restframe UV. Therefore, they are better suited for tracing the the star formation contribution coming from ultra-dusty galaxies.

The total SFRD
We now define the total SFRD at z∼4.5 and z∼5.5 by summing the SF RD U V,uncorr with the SF RD IR from ALPINE. We plot our results in Fig. 8. The results are in agreement with the FUV based estimates corrected for dust attenuation from the literature within the error bars at both redshifts. They are also in agreement with the measurements based on non-target sources from ALPINE  and with the measurement based on the [CII] luminosity function of non-target sources of the field sample ). However, our results obtained by a combination of two proxies (see Table 2) are 0.5 dex and 0.3 dex higher than the fit of SFRD evolution by Madau & Dickinson (2014) or in 1.4σ and 1.2σ tension at z∼4.5 and z∼5.5 respectively. Although, given the un-Article number, page 9 of 19 A&A proofs: manuscript no. ALPINE_SFRD_and_MS The SF RDIR and SF RDF U V,uncorr evolution with redshift. The red symbols are the SF RDIR, derived in this paper. Small offsets of ±0.08 are applied to their redshift for clarity. The triangles are the lower limits for SF RDIR derived in Sect. 4.3.1. The blue symbols are SF RDF U V,uncorr derived from the FUV data (Cucciati et al. 2012;Dahlen et al. 2007;Reddy & Steidel 2009;Bouwens et al. 2012;Schenker et al. 2013;Bouwens et al. 2015b;Pelló et al. 2018;Khusanova et al. 2020), orange symbols are SF RDIR from Magnelli et al. (2013); Gruppioni et al. (2013); Koprowski et al. (2017) and Gruppioni et al. (2020). The magenta point is from [CII] luminosity function based on the field sample of Loiacono et al. (2020). Orange lines are the fits of SF RDIR evolution from Koprowski et al. (2017) and Maniyar et al. (2018). Fig. 8. The evolution of the total SFRD (Eq. 2) with redshift. The red symbols are the total SFRDs, derived in this paper. Small offsets of ±0.08 are applied to their redshift for clarity. The triangles are the lower limits for SF RDIR derived in Sect. 4.3.1. The blue symbols are dust-corrected SFRD derived from the FUV data (Cucciati et al. 2012;Dahlen et al. 2007;Reddy & Steidel 2009;Bouwens et al. 2012;Schenker et al. 2013;Bouwens et al. 2015b;Pelló et al. 2018;Khusanova et al. 2020), orange symbols are derived from reradiated dust emission from forming stars as measured from the IR (Magnelli et al. 2013;Gruppioni et al. 2013;Koprowski et al. 2017). The magenta point is from [CII] luminosity function based on the field sample of Loiacono et al. (2020). The dotted line is the fit of SFRD evolution from Madau & Dickinson (2014).
certainty on the SF RD IR measurement, it is not possible to reach firm conclusions, the evolution of the total SFRD at z>4 is possibly shallower than the best fit in Madau & Dickinson (2014).
In Fig. 9, we show the evolution of the dust obscured fraction of SFRD with redshift. We use all the points from the literature together with our measurements shown in Fig. 7 and 8 to calculate the average SF RD IR and SF RD U V,uncorr in each redshift bin and the ratio of SF RD IR to the total SFRD. We find the infrared contribution to the total SFRD equal to 68 +18 −25 % and 61 +20 −25 % at z∼4.5 and z∼5.5 respectively. Although significant uncertainties remain, our results show that the dust obscured fraction of SFRD is significant even at z>4. Our results are consistent with semi-analytical model (SAM) by Cousin et al. (2019), which predict that ∼ 70% of UV radiation is Fig. 9: The evolution of the fraction of the dust obscured SF RD IR as a function of redshift. The arrows show the lower limits obtained using the lower limits discussed in Sect. 4.3.1. The points at z<4 are obtained using a compilation of the literature from Fig. 8. The points at z>4 are obtained using the UVLF+GSMF combined measurements from Table 2. absorbed by dust at z=5, and with the best fit model by Zavala et al. (2018), which predicts 35%-85% at z∼4-5.
Since we used a sample of galaxies detected in FUV, we did not take into account the contribution from highly obscured and purely IR galaxies, which remains unknown at these redshifts, mainly due to the difficulties in detecting such galaxies. The low number counts of the most massive galaxies makes it difficult to make a statistical census. No sources with z>5 were found by the deep ALMA observations of the Hubble Ultra-Deep Field (HUDF; Aravena et al. 2016;Dunlop et al. 2017;González-López et al. 2019). Only recently, a few sources with no optical or NIR counterparts were discovered with ALMA at z∼4 and z∼5 (Franco et al. 2018;Williams et al. 2019) and even z∼6.9 (Strandet et al. 2017;Marrone et al. 2018). The largest sample up to date was assembled by Wang et al. (2019). It contains 39 Hband dropouts at z>3 observed with ALMA. Given their results, such galaxies could have an additional contribution of ∼ 11% to the total SFRD at z > 4. Since their sample is restricted to massive H-band dropouts, this is only a lower limit. The lower mass H-band dropouts could contribute to it significantly. Indeed, given the most recent results by Riechers et al. (2020), the dust obscured SFRD is log(SF RD IR M yr −1 M pc −3 ) = −2.02 or 22% of the total SFRD (based on the space density of two sources detected by CO (J=2→1) and not detected in the rest-frame UV and optical).
Another uncertainty comes from the fact that the conversion from the 160 µm flux to L IR in this work is based only on one point in the infrared SED. Therefore, our measurement of the total IR luminosity are dependent on the choice of the dust SED template. The infrared SED in high redshift galaxies is uncertain due to poorly constrained dust temperatures (Pavesi et al. 2016;Bouwens et al. 2016;Faisst et al. 2017Faisst et al. , 2020. We tested a number of templates against the stacking of Herschel data and only used the ones, which are compatible with the Herschel stacks at red-shift range of ALPINE (see Appendix). The best fit of modified black body to these stacks gives the dust temperature T dust = 41 ± 1 K at z ∼ 4.5 and T dust = 43 ± 5 K at z ∼ 5.5 consistent with most recent measurements by Faisst et al. (2020) (T dust = 38 ± 8 at z ∼ 5.5). Although some uncertainties on the dust temperature and the rest-frame FIR part of the SED remain, we incorporate them into the uncertainty of our SFRD measurement by using an average conversion factor between the templates compatible with the stacks of Heschel data and taking into account the uncertainty of the conversion factor. Nevertheless, more observations are needed to better constrain the dust temperatures and SED of galaxies at z>4.

Star-forming main sequence and sSFR evolution
In Sect. 4.2, we derived the L IR − M * relation. Here we use this relation to derive the main sequence, i.e. the relation between the mass and SFR for these galaxies. In each mass bin, we derive average SFR as SF R = κ F U V L 1600 + κ IR L IR . The average L 1600 is computed from FUV absolute magnitudes, determined with the SED fitting, and the average L IR from stacks. As discussed in Sect. 3.2.2, we consider the results from the stacks as upper limits at masses below the completeness limits.
We derived the main sequence in the two redshifts bins: 4.4<z<4.7 and 5.1<z<5.8. We show the results in Fig. 10. The main sequence is well-defined and is in very good agreement with previous results from the literature (Santini et al. 2017;Tasca et al. 2015;Heinis et al. 2013;Khusanova et al. 2020). We fitted the relation between SFR and mass with a power law and found a slope α = 0.99 ± 0.08 at z∼4.5. We find an excellent agreement with the Schreiber et al. (2015) parametrization of the main sequence evolution at z∼4.5 based on Herschel data, even if we measure the SFRs at lower masses compared to Schreiber et al. (2015).
At z∼5.5, we obtain a shallower slope α = 0.66 ± 0.21. The difference with the slope at z∼4.5 is only 1.5σ, providing no evidence for an evolution of the slope between the two redshift bins. We note that at z∼5.5, we could only provide FIR flux upper limits at lower masses due to the selection effects (see Sect. 3.2.2). Therefore, we only used three points to fit our data and the fit is subject to larger uncertainties. The slope of the main sequence found with Herschel data (Pearson et al. 2018) at the same redshift but at larger masses is steeper than found with ALPINE data, but still consistent with our data on stacks within the error bars.
In Fig. 10, we also plot the individual measurements for galaxies with continuum detections. As expected, most of them are above the main sequence since they represent the brightest galaxies with high SFRs. This confirms that it is necessary to use stacking to obtain unbiased measurements of the average SFR as a function of M * and fit the main sequence.
We measured the sSFR in the mass range 9.6 < log(M * /M ) < 9.8 as in Santini et al. (2017). The average sSFR is log sSF R(Gyr −1 ) = −8.4 ± 0.18 and log sSF R(Gyr −1 ) = −8.45 ± 0.16 at z∼4.5 and z∼5.5 respectively. We present a comparison with results from the literature in Fig. 11. We find good agreement within the error bars with the results from Khusanova et al. (2020) at z>5 and with Herschel data at z∼4 (Béthermin et al. 2015;Schreiber et al. 2015), although we note that these Article number, page 11 of 19 A&A proofs: manuscript no. ALPINE_SFRD_and_MS  Results of various observations and numerical simulations are shown (gray symbols are based on FUV data and orange symbols are based on the rest-frame FIR data). The orange stars show this paper's results. The blue stars are the results obtained with VUDS data in Khusanova et al. (2020). The solid line shows the fit to the previous results from VUDS , the shaded area and the dashed line show the results coming from the observations of EW(Hα) in COS-MOS (Faisst et al. 2016). works only probe the high mass end (log(M * /M ) > 10.0) at z∼4. The results are clearly against a steep increase of sSFR at z>4 and even indicate a possible decrease in sSFR at high redshift. More observations at higher redshifts are necessary to explore this trend.
If the growth of galaxies is regulated mainly by gas accretion through cold streams, the sSFR evolves as ∼ (1 + z) 2.25 (Dekel et al. 2009;Davé et al. 2011;Sparre et al. 2015). This is not the case at high redshifts. Therefore, another mechanism may play a role in the growth of galaxies, such as major and minor mergers (see, e.g., Faisst et al. 2016). Another possible explanation could be the suppression of star formation by stronger stellar feedback at high redshifts and reduced star formation efficiency (see, e.g., Weinmann et al. 2011). Future studies will place constraints on the role of different mechanisms regulating star formation and the growth of galaxies at high redshifts.

Conclusions
In this paper, we explored the SFRD, sSFR and main sequence evolution at z>4 with ALPINE. Using the stacking technique, we constrained the average properties of galaxies even with low number of individual detections.
We determined the robust lower limits log SF RD IR (M yr −1 M pc −3 ) = −2.44 and log SF RD IR (M yr −1 M pc −3 ) = −2.67 at z∼4.5 and 5.5 respectively, which indicate that the dust plays an important role at high redshift and that dust build-up happens quite rapidly during and just after epoch of reionization.
We then extrapolated the density functions and found the total dust-hidden IR contribution to SFRD using either masses or FUV magnitudes as a proxy of infrared luminosity. Using the masses, we obtain log SF RD IR,tot (M yr −1 M pc −3 ) = −1.07 +0.28 −0.23 and log SF RD IR,tot (M yr −1 M pc −3 ) = −1.48 +0.24 −0.25 at z∼4.5 and 5.5 respectively. Using the FUV magnitudes, we obtain log SF RD IR,tot (M yr −1 M pc −3 ) = −1.50 +0.38 −0.39 and log SF RD IR,tot (M yr −1 M pc −3 ) = −1.64 +0.23 −0.24 at z∼4.5 and 5.5 respectively. The results agree with each other within the error bars and suggest that 68 +18 −25 % and 61 +20 −25 % of the star formation is hidden by dust at z∼4.5 and z∼5.5 respectively. While at z<4 the SF RD IR component dominates over the SF RD F U V,uncorr , at z>4 their contribution is comparable. The SF RD IR approaches and possibly crosses the SF RD F U V,uncorr at z>5 but this is yet to be confirmed with future observations. The evolution of the total SFRD at redshifts higher than the peak of SFRD at z∼2.5 may be shallower than previously measured. Further observations leading to reduced uncertainties are needed to confirm this conclusion.
We also provided robust measurements of SFRs of galaxies at z>4 and used them to determine the main sequence at z∼4.5 and z∼5.5. We found that the observed main sequence is in good agreement with the previous results based on the rest-frame UV and optical data, as well as the IR data for the most massive galaxies. The main sequence has a slope α = 0.99 ± 0.08 at z∼4.5 and α = 0.66 ± 0.21 at z∼5.5, and no signs of turn-over at high masses. No significant evolution is observed from z∼4.5 to z∼5.5.
We used the SFR and mass measurements to determine the average sSFR of galaxies at z>4. The results support a shallower or non-existent sSFR evolution at high redshifts than predicted from models of cold gas accretion, or no evolution. Other mechanisms should play an important role in governing the growth of galaxies.

Appendix A: Conversion of FIR flux to IR luminosity
To convert the rest-frame 160 µm flux to IR luminosity, we scale our flux measurement to the dust SED template and then integrate it from 8 to 1000 µm. Since we use stacks of galaxies, which are not on the same redshift, we use the mean redshift of galaxies in stacks for that conversion. Since we only obtain a measurement of one point in SED, our conversion to IR luminosity will depend on dust SED templates.
There are numerous dust SED templates available in the literature (e.g., Magdis et al. 2012;Béthermin et al. 2017;Álvarez-Márquez et al. 2016;Casey et al. 2018;Schreiber et al. 2018;De Rossi et al. 2018;Álvarez-Márquez et al. 2019). Some of them were already discussed and tested by Bethermin et al. (2020). Bethermin et al. (2020) made stacks of Herschel data for galaxies at z>4 selected via photometric redshifts. They then compared the SED templates to these stacks. We define the conversion factor f temp for each template and the χ 2 from comparison to Herschel stacks at both redshifts (see Table A.1). The conversion factor is defined as f temp = νL ν=158µm /L IR . We discard the templates with reduced χ 2 > 1.5 in at least one of the redshift bins (see Fig. A.3). We define the mean conversion factors from the remaining templates f temp,z∼4.5 = 0.13 ± 0.02 and f temp,z∼5.5 = 0.12 ± 0.03 for redshift bins 4<z<5 and 5<z<6 respectively. These are the values we use to convert the monochromatic flux to the infrared luminosity and propagate their uncertainties into our SFRD measurements.
We note that although the De Rossi et al. (2018) template passes our χ 2 criteria, it goes above the upper limits in redshift bin 4<z<5, but is perfectly consistent at 5<z<6. Therefore, we use this template only for 5<z<6 redshift bin.