Are LGRBs biased tracers of star formation? Clues from the host galaxies of the $Swift$/BAT6 complete sample of bright LGRBs III: Stellar masses, star formation rates and metallicities at $z>1$

(Abridged) Long gamma-ray bursts (LGRB) have been suggested as promising tracers of star formation owing to their association with the core-collapse of massive stars. The goal of this work is to characterise the population of host galaxies of LGRBs at 1<z<2, investigate the conditions in which LGRBs form at these redshifts and assess their use as tracers of star formation. We perform a spectro-photometric analysis to determine the stellar mass, star formation rate, specific star formation rate and metallicity of the complete, unbiased host galaxy sample of the Swift/BAT6 LGRB sample at 1<z<2. We compare the distribution of these properties to the ones of typical star-forming galaxies from the MOSDEF and COSMOS2015 Ultra Deep surveys, within the same redshift range. We find that, similarly to z<1, LGRBs do not directly trace star formation at 1<z<2, and they tend to avoid high-mass, high-metallicity host galaxies. We also find evidence for an enhanced fraction of starbursts among the LGRB host sample with respect to the star-forming population of galaxies. Nonetheless we demonstrate that the driving factor ruling the LGRB efficiency is metallicity. The LGRB host distributions can be reconciled with the ones expected from galaxy surveys by imposing a metallicity upper limit of 12+logOH ~ 8.55. Metallicity rules the LGRB production efficiency, which is stifled at Z>0.7 Zsun. Under this hypothesis we can expect LGRBs to trace star formation at z>3, once the bulk of the star forming galaxy population are characterised by metallicities below this limit. The moderately high metallicity threshold found is in agreement with the conditions necessary to rapidly produce a fast-rotating Wolf-Rayet star a in close binary system, and could be accommodated by single star models under chemically homogeneous mixing with very rapid rotation and weak magnetic coupling.


Introduction
Long duration gamma-ray bursts (LGRBs, prompt emission duration longer than 2s) have been shown to be connected to the end of life of massive stars (Woosley 1993;Woosley & Bloom  1: Stellar mass, star formation rate and metallicity for the hosts of the BAT6 LGRB sample at 1 < z < 2. References are : 1) Krühler et al. (2015); 2) this work; 3) Perley et al. (2016b Salvaterra et al. 2009Salvaterra et al. , 2013Tanvir et al. 2009). Indeed, in addition to their bright afterglows, even at high redshift (Lamb & Reichart 2000), the detection of LGRBs in the soft γ-ray domain of the electromagnetic spectrum is largely unaffected by dust. Various authors have tried to use LGRBs to estimate the SFR density at high redshift (e.g. Kistler et al. 2008;Robertson & Ellis 2012), however these studies used intrinsically biased and incomplete samples. The importance of using a carefully selected, unbiased and complete sample of LGRBs and their host galaxies has since been recognised and various samples have been designed to address this issue, such as TOUGH (Hjorth et al. 2012), Swift/BAT6 (Salvaterra et al. 2012) and SHOALS (Perley et al. 2016a).
Different studies using the host galaxies of these samples have tried to obtain information on the LGRB efficiency, that is the relation between the LGRB rate and the SFR, fundamental for using LGRBs as tracers of the SFR density. Factors that can impact this relation can be related to the conditions needed for the progenitor star to produce an LGRB. Metallicity is the most commonly invoked factor, as most single-star progenitor models of LGRBs require low metallicity to expel the hydrogen envelope while keeping enough angular momentum, necessary for the production of the GRB jet (e.g. Woosley & Heger 2006;Yoon & Langer 2006). Due to the cosmological origin of the majority of LGRBs it is not possible to study directly the progenitor stars, their environment and their remnants. Therefore current studies focus on the properties of the LGRB host galaxies to gather information on the LGRB efficiency. The results obtained to date using complete unbiased samples of LGRB host galaxies (Vergani et al. 2015;Perley et al. 2016b;Japelj et al. 2016a), agree on the fact that there is a preference for LGRBs to explode in sub-solar metallicity host galaxies (see also Bignone et al. 2017 on results using the Illustris simulation). Nonetheless extremely low metallicities are not required, and host galaxies having super-solar metallicities are not excluded (see e.g. Savaglio et al. 2012), even if much rarer than expected from a direct relation between LGRB rate and SFR.
The results obtained from the studies above are based on the comparison of the properties of LGRB host galaxies with those of representative star-forming galaxies sampled through galaxy surveys. Due to the faintness of a considerable fraction of the LGRB host galaxies, to date such a comparison, especially when involving spectroscopically-derived properties (SFR, metallicity), has been performed in detail only at z < 1 (Krühler et al. 2015;Japelj et al. 2016a). Improvements of existing photometric surveys (e.g. COSMOS2015, Laigle et al. 2016), and the emergence of deep spectroscopic surveys (e.g. VUDS, Le Fèvre et al. 2015) with access to the near-infrared (e.g. MOSFIRE Deep Evolution Field, i.e., MOSDEF survey, Kriek et al. 2015) allow us now to investigate the LGRB efficiency by comparing the properties of complete samples of LGRB hosts to samples of typical star-forming field galaxies in detail also at z > 1. This paper is organised as follows. In Section 2 we present our sample selection, the observations and analysis of our LGRB hosts, and characterise their properties and the evolution of these properties with redshift. In Section 3 we compare our sample with surveys of field galaxies. We discuss our results in more detail in Section 4 and our conclusions are summarised in Section 5.
2.6 ph cm −2 s −1 ) LGRBs with favourable observing conditions for optical follow-up (Jakobsson et al. 2006). This selection results in 58 LGRBs with a 97% redshift completeness, extending up to z ∼ 6. No correlations have been found between the prompt emission properties (peak energy, luminosity) of LGRBs and their host galaxies' properties (see Levesque et al. 2010;Japelj et al. 2016a, and Fig A.1 of the Appendix for our sample up to z = 2). Therefore, by construction, our sample is statistically representative of the whole LGRB host galaxy population (including dark LGRBs, Greiner et al. 2011;Melandri et al. 2012). For the purpose of this paper, we restrict ourselves to the redshift range 1 < z < 2 (see Table 1), building on the previous papers of Vergani et al. (2015) and Japelj et al. (2016a) that considered the z < 1 range.

Stellar mass
To determine the host galaxy stellar masses we used photometric measurements (typically covering the visible to near-infrared wavelength range) from the literature, complemented with new values that we measured from archival data for GRB 061007, GRB 100615A and GRB 090201. All of the values and references are reported in the appendix (Tables A.1

and A.2).
We fit the available observational constraints (excluding nondetections 1 ) on the emission-line fluxes and broad-band photometry of the galaxies in our sample using the Bayesian spectral interpretation tool beagle (Chevallard & Charlot 2016;version 11.3). The version of beagle we use relies on the models of Gutkin et al. (2016), who follow the prescription of Charlot & Longhetti (2001) to describe the emission from stars and the interstellar gas. In particular, the models are computed combining the latest version of the Bruzual & Charlot (2003) stellar population synthesis model with the standard photoionisation code cloudy (Ferland et al. 2013). We use three parametrisations for the star formation histories of model galaxies in beagle constant star-formation, an exponentially declining function ψ(t) ∝ exp(−t/τ SFR ) and an exponentially delayed function ψ(t) ∝ t exp(−t/τ SFR ). For the exponentially declining and exponentially delayed functions, we let the star formation timescale and the star-formation freely vary in the ranges 7 ≤ log(τ SFR /yr) ≤ 11.5 and −4 ≤ log(S FR/M yr −1 ) ≤ 4. Finally, we superpose on the exponentially delayed function a current burst with a variable duration of 6 ≤ log(t current /yr) ≤ 9. For the three star formation histories, we let the age of the galaxy vary in the range 6.0 ≤ log(age/yr) ≤ 10.15 and we adopt a standard Chabrier (2003) initial mass function. We further adopt the same metallicity for stars and star-forming gas (Z = Z IS M ) and assume that all stars in a galaxy have the same metallicity, in the range −2.2 ≤ log(Z/Z ) ≤ 0.25. Finally, we let the stellar mass vary in the range 4 ≤ log(M * /M ) ≤ 12.
The stellar mass values reported in Table 1 are the median of the probability distribution functions from the best-fitting SFH/attenuation prescription which were chosen as having the lowest χ 2 while also having predicted SFRs and metallicities consistent with the ones measured from spectroscopy. The best fit SED for each fitted host are shown in Appendix A. 1 However we verified that the results from the SED fitting do not violate the limits imposed by the photometric non-detectrions.
In general, the stellar masses found are consistent within errors independently of the SFH or dust attenuation chosen (the only debated case is GRB 061121 for which the stellar mass spans values from 7 × 10 8 to 2 × 10 10 M , we chose the stellar mass corresponding to the SFH prescription that yields SFR and metallicity values consistent with the ones derived by spectroscopy; we note that using the stellar mass value of log(M * /M )∼ 10 would not change the results of our study). The largest dispersion between the stellar mass values obtained from the different SFH and dust attenuation prescriptions is ∼ 0.5 dex. We cross-checked the stellar mass values with the CIGALE SED code (Noll et al. 2009), and values from Kruehler & Schady (2017), derived using the LePhare SED code (Arnouts et al. 1999;Ilbert et al. 2006). Even if a detailed analysis on the different SED codes to determine stellar masses is far beyond the scope of this paper, we stress that the stellar mass values found are consistent within the errors, and that the overall results of this study would remain unchanged independently of the choice of the aforementioned codes.
We noticed a discrepancy (see also Corre et al. 2018;Arabsalmani et al. 2017;Heintz et al. 2017) when computing stellar masses from SED fitting compared to values based on the rest-frame near-infrared (NIR) magnitude only (e.g. from Perley et al. 2016b, used also in Vergani et al. 2017. These stellar mass values are mostly overestimated compared to the values derived by SED fitting. This effect is known, especially at lower stellar masses due to the variations in the mass-to-light ratio as a function of stellar mass (Ilbert et al. 2010). It should be noted however that Perley et al. (2016b) tried to correct for this effect by using a mass-to-light conversion factor that is not simply a linear factor but is a function of z and galaxy luminosity, fitted based on a template model of galaxy evolution. Due to the lack of wide photometric coverage, for 4 of our 15 hosts at 1 < z < 2 (GRB 091208B, GRB 050318, GRB 050802, GRB 060908) it was not possible to perform a SED fitting, therefore the stellar masses are computed with the method described in Perley et al. (2016b), with the aforementioned caveats. These values are considered as upper limits in the analysis. However, as explained later (see Sect. 3), they are discarded when performing the statistical test of Sect. 3 as they do not comply with the limits of the surveys.
The resulting stellar mass cumulative distribution for the hosts of the BAT6 sample is shown in Fig 1, in the top panel. There is an evolution towards higher median mass between z < 1 and 1 < z < 2. As LGRB host galaxies are selected only by the fact that they host an LGRB explosions, and as we are considering an unbiased and complete sample of LGRB host galaxies, the stellar mass evolution we find is not a selection effect and is intrinsic to the properties of LGRB host galaxies. Nevertheless, we anticipate that higher stellar mass values would be expected considering the SFR determined in Sect. 2.3 and the relation found between stellar mass and SFR in SF galaxies (e.g. Shivaei et al. 2015).
We also plot the distribution of the stellar masses or limits for the BAT6 LGRB host galaxies at 2 < z < 3 (see Tab. A.3). Those were determined from rest-frame NIR observations only, and (with the exception of GRB 090201) published by Perley et al. (2016b). The distribution at 2 < z < 3 is riddled with upper limits, and given the different methodology used for the stellar mass determination (and its caveats), we can only tenta-  tively conclude that the median stellar mass does not seem to increase significantly with respect to the one at 1 < z < 2.

Star Formation Rate and Metallicity
SFRs and metallicities were determined using the host galaxy spectra. The data at z > 1 come from the VLT/X-Shooter spectrograph (Vernet et al. 2011), and the spectra have already been presented in Krühler et al. (2015) and Vergani et al. (2017). The large wavelength coverage (3000 to 25000 Å) and sensitivity of X-Shooter allow us to detect the strongest rest-frame optical emission lines up to z = 2, ensuring a homogeneous methodology for the determination of star formation rates and metallicities.
We performed a new data reduction and analysis of the data, with the standard Esoreflex pipeline (version 2.7.3, Modigliani et al. 2010) using the nodding recipe. The spatial width of the 2D to 1D spectrum extraction was scaled according to the spatial width of the detected emission lines to maximise the signal to noise ratio. The flux calibration was cross-checked with the host photometry when available, or otherwise with a telluric standard star taken at similar airmass and seeing, to account for any slit loss or absolute calibration inconsistencies (see Japelj et al. 2016a). Emission lines were measured using IRAF 2 by fitting a one (or more when relevant) component Gaussian function and cross-checked by comparing to the flux resulting from direct integration under the line profile. The resulting fluxes are compiled in Table A.4. In case of a non-detection, a 3σ upper limit is quoted. The measurements are consistent within the errors with the values reported by Krühler et al. (2015) and Vergani et al. (2017).
The measured emission line fluxes were corrected for Galactic extinction using the extinction curve of Pei (1992) and the extinction map of Schlafly & Finkbeiner (2011). The Balmer line fluxes were not corrected for Balmer absorption due to the absence of a detectable continuum in most hosts and its weakness in LGRB hosts as expected from their low stellar masses (Zahid et al. 2011). The fluxes were also corrected for the host intrinsic extinction, with the A V measured using the Balmer decrement (assuming case B recombination, Osterbrock 1989) and an SMC extinction curve following the findings of for example Japelj et al. 2015.
SFRs were determined using the dust-corrected Hα luminosities, following Kennicutt (1998) scaled to the IMF of Chabrier (2003). In the few cases where it was not possible to correct for dust extinction, the SFRs are reported as lower limits. As shown in Figure 1, panel (b), the median SFR increases from ∼ 1.3 +0.9 −0.7 M yr −1 at z < 1, to ∼ 24 +24 −14 M yr −1 at 1 < z < 2, in agreement with Krühler et al. (2015).
Gas phase metallicities are notoriously hard to determine at high redshift by direct electron temperature methods due to the weakness of the [OIII]λ4363 line. Instead, alternative methods based on the calibration of strong line ratios are commonly used. Each calibrator has its own relative scale (see Kewley & Ellison 2008 for more details). It is therefore important to use the same method to determine metallicity for all the host galaxy in our sample. Here we infer the metallicity from the method developed by Maiolino et al. (2008) (referred to as M08) which relies on multiple calibrators simultaneously, taking advantage of all the emission lines detected. The bottom panel of Figure 1 indicates that, contrary to stellar mass and star formation, the metallicity distribution of LGRB hosts does not seem to evolve (see also Krühler et al. 2015). This provides a first clue suggesting that metallicity is a regulatory factor in the production of LGRBs, which is in line with previous studies (Vergani et al. 2015;Perley et al. 2016b).

Comparison with the star-forming galaxy population
If we assume that LGRBs are direct tracers of SF, then more SF equates to a higher chance of producing an LGRB (for a fixed stellar IMF). Hence from a statistical point of view we expect the various distributions of the properties of LGRB hosts to follow the ones of the general population of star-forming galaxies weighted by their SFR. The lack of agreement between these distributions can be an indication of a factor regulating the production of LGRBs. Vergani et al. (2015) and Japelj et al. (2016a) have already shown discrepancies between the distributions of the Swift/BAT6 LGRB hosts properties and the SFR-weighted ones of star-forming galaxies at z < 1, (see also Krühler et al. 2015, Perley et al. 2016band Schulze et al. 2015 tackling the same issue using other samples). Here we aim to extend this analysis to higher redshift. Owing to a low number of objects and, in some cases, limits or large errors, we employ a Bayesian approach to provide robust statistical estimates, which we describe in Sect 3.2.

COSMOS 2015 Ultra Deep
The COSMOS2015 (Laigle et al. 2016) is a deep (K s ≤ 24.7) photometric survey of half a million galaxies at z < 6, with wavelength coverage from the near-UV to the infrared. Within this catalogue we selected the star-forming galaxies of the COS-MOS2015 Ultra Deep stripes (COSMOS2015UD) from the ESO phase 3 archive system 3 . The advantage of COSMOS2015UD relies in the large number of objects (∼ 10 4−5 ) with available stellar masses and accurate photometric redshifts. These stellar masses were determined by SED fitting with the LePhare code using a Chabrier (2003) IMF (see Ilbert et al. 2015 for more details). While comparing the properties of the BAT6 LGRB host galaxies with COSMOS2015UD, we take into account its redshift-dependent mass completeness and remove the LGRB hosts with stellar masses below this limit, resulting in a comparison sub-sample of ten hosts at 1 < z < 2. The COSMOS2015UD SFR are dust-corrected and obtained from SED fitting without the Infrared photometry (http://www.eso.org/rm/api/v1/ public/releaseDescriptions/100).

The MOSDEF survey
The LGRB hosts at 1 < z < 2, we make use of MOSDEF galaxies in the lowest of these three redshift ranges, at z ∼ 1.5. Galaxies were targeted down to fixed restoptical (observed H-band) magnitudes (H AB ≤ 24.0 at z ∼ 1.5). We select galaxies with detections of both Hα and Hβ at S/N≥3 such that reddening-corrected SFR can be determined. Requiring detections of both Hα and Hβ does not significantly bias the MOSDEF sample above log(M * /M ) ∼ 9.5 (Shivaei et al. 2015;Sanders et al. 2018). AGN were excluded following the prescriptions described in Shivaei et al. (2015) and references therein. This selection results in a MOSDEF comparison sample of 133 galaxies ranging in redshift from 1.37 to 1.73 with z med = 1.53.
SFRs were calculated based on reddening-corrected Hα luminosity using the Kennicutt (1998) calibration with the Chabrier (2003) IMF, the measured Balmer decrement (Hα/Hβ), and the Cardelli et al. (1989) Milky Way extinction curve. The MOSDEF stellar masses (see Sanders et al. 2018) were estimated by fitting flexible stellar population synthesis models (Conroy et al. 2009) to photometry spanning the observed optical to mid-infrared using the SED fitting code FAST (Kriek et al. 2009). Solar metallicity, delayed star formation histories, the Calzetti et al. (2000) dust reddening curve, and the Chabrier (2003) IMF were assumed for the SED fitting. For comparison with the LGRB host galaxies, SFR(Hα) and stellar mass values were calculated assuming the Planck Collaboration et al. (2014) cosmology 4 , the same as for the BAT6 host sample.
The MOSDEF metallicities used in this paper were determined using the M08 method, in the same way as for the hosts of the BAT6 sample (see Sect 2.3). Of the 133 galaxies in the MOS-DEF comparison sample, 127 have sufficient emission line information to calculate metallicities using the M08 method. When comparing this MOSDEF sample with the BAT6 LGRB host galaxies, we excluded from the comparison 6 LGRB hosts with log(M * /M ) < 9.3 because they fall in a stellar mass range in which the MOSDEF sample is significantly incomplete. This results in a BAT6 comparison sub-sample of 9 LGRB hosts. We note that the SFRs of the LGRB host galaxies in the BAT6 comparison sub-sample fall within the SFR range of the MOSDEF comparison sample.

Bayesian framework
Our calculations rely on the assumption that the probability distribution function (PDF) for our data can be reasonably well described by an asymmetric Gaussian distribution for which the scale parameter is given by the asymmetric errors and the location parameter is given by the value quoted in our table. For example, the PDF of a quantity µ +σ p −σ m is given by: where A is the normalisation given by: In the event of upper limits on the stellar mass of our galaxies, we use a uniform distribution (uninformative prior) between log(M * /M ) = 7 and the upper limit for the comparison of BAT6 sample at different redshifts; when comparing with the COS-MOS2015UD and MOSDEF surveys, the lower stellar mass limit is set to the mass completeness of the survey 5 . For lower 4 The papers published previously in the MOSDEF collaboration used a different cosmology. 5 The mass completeness of the COSMOS2015UD survey varies with redshift; the value used as lower limit is the mass completeness at the redshift of the host. limits on the SFR or the specific SFR (sSFR) (objects for which no extinction could be derived), we use a uniform distribution between the limit and a maximum SFR calculated by assuming an A V of 4. We then estimate the median and 95% confidence bounds on our cumulative distribution functions (CDFs), by computing 10000 Monte Carlo realisations of our data sampling from the aforementioned PDFs, this confidence interval is represented as a shaded area in the figures showing CDFs. In a similar fashion, we computed 10000 realisations of the K-S test for each individual CDF when comparing with the MOSDEF and COSMOS2015UD samples 6 .

Stellar mass
The top left panel of Figure 2 shows the stellar mass cumulative distribution of the hosts of the BAT6 sample compared to the SFR-weighted distribution of the star-forming field galaxies of the COSMOS2015UD at 1 < z < 2. The distribution of Dstatistic and p-values from 10000 Monte Carlo realisations of the 2 sample K-S test are shown in the bottom panels, indicating that the vast majority of realisations exclude the null hypothesis that the two samples are drawn from the same distribution at the 95% confidence level. It should be noted (see Sect. 3.1.1) that the SFR used to weight the COSMOS2015UD distribution are obtained from SED fitting. It has been shown that SFRs determined in such way can be underestimated at SFRs higher than ∼ 50 M yr −1 (e.g. Reddy et al. 2015;Lee et al. 2015). This corresponds to ∼ 12% of the COSMOS2015UD SF galaxies. Considering that high SFR values are normally associated with high stellar mass galaxies, this underestimation would have the effect of increasing the discrepancy between the two distributions. However, we note also that there is a good consistency between the COSMOS2015UD and MOSDEF (see below) SFR-weighted distributions. These considerations are also valid for the SFRweighted SFR and sSFR distributions presented in the following sections.
In the right panels of Figure 2, the same comparison is performed with the star-forming galaxies of the MOSDEF survey. In this case the SFR used is that determined from the dustcorrected Hα luminosities. We computed 10000 MC realisations of both the BAT6 and the MOSDEF sample with the assumptions described in Sect. 3.2, with the difference that each galaxy in the MOSDEF sample is weighted by the realisation of its SFR. For each realisation, we compute the 2 sample K-S test which yields a distribution of p-values firmly excluding the possibility that LGRB hosts are drawn from the same stellar mass distribution as that of MOSDEF galaxies weighted by their SFR.
Another way to look at the discrepancy of the distributions and have some information on the behaviour of the LGRB efficiency as a function of the stellar mass is to use the method presented by Boissier et al. (2013), and used also in Vergani et al. (2015). In the present work, instead of using galaxy models, in Fig. 3 we compare the LGRB host galaxies directly with the MOSDEF star-forming galaxies. The efficiency here is defined as the fraction of LGRB hosts divided by the fraction of MOS- DEF galaxies in a given stellar mass or metallicity bin. The results are normalised to the first bin value. We apply this method also for the galaxy properties presented in the following sections (SFR, sSFR, and metallicity; see Fig. 3).
We also investigated the evolution of the median stellar mass with redshift for the BAT6 hosts compared to the SFR-weighted COSMOS2015UD sample, presented in Fig. 4. The discrepancy between the BAT6 hosts and the SFR-weighted field galaxies is most notable at low redshift and decreases up to z = 3 as is shown in the bottom panel, although the last redshift bin is to be taken cautiously due the low number of hosts within it. Additionally, the stellar masses of the LGRB hosts in the last redshift bin (2 < z < 3) are derived using a different methodology (see 2.2). With these caveats in mind, this trend is consistent with the observations of Perley et al. 2016b (see also Hunt et al. 2014).

Star formation rate
The top panels of Figure 5 show the SFR cumulative distribution of hosts of the BAT6 sample compared to SFR-weighted distribution of star-forming field galaxies of COSMOS2015UD and MOSDEF at 1 < z < 2. As confirmed by the p-value distribution, there is good agreement between the two distributions.
The top panels of Figure 6 show the specific SFR (sSFR, defined as SFR/M * ) cumulative distribution of hosts of the BAT6 sample compared to the SFR-weighted distribution of star-forming field galaxies of COSMOS2015UD and MOSDEF at 1 < z < 2. The p-value distribution in the bottom panels indicates that in the majority of cases we cannot exclude the null hypothesis that the two samples are drawn from the same distribution for the COSMOS2015UD sample, while for the MOS-DEF sample, it is less definitive since the p-value distribution peaks around 0.05. In ∼ 40% of cases, we cannot discard the null hypothesis at the 95% confidence.
We note that we could not determine the SFR for three host galaxies. Nonetheless their stellar masses were lower than the stellar mass completeness of the surveys. Therefore, when comparing with surveys our sample is still complete.
In Figs. 7 and 8 we plot the BAT6 host galaxies and the MOSDEF star-forming galaxies in the SFR, sSFR vs stellar mass plane, respectively. We fit the SFR vs stellar mass relation (socalled Main Sequence, e.g. Whitaker et al. 2012) for the MOS-DEF sample of star-forming galaxies at 1 < z < 2 following Shivaei et al. (2015). We derived the fraction of galaxies above the 1-sigma intrinsic scatter (see Japelj et al. 2016b) of the relation within the MOSDEF sample to be 27±5%. Excluding the 3 hosts falling in the low mass region, sparsely populated by the MOSDEF sample, the fraction of LGRB host galaxies showing such an enhancement of SFR, with respect to the MOSDEF 1 < z < 2 relation, is 66±22%.

Metallicity
The MOSDEF survey allows us also to perform the comparison of the metallicity distribution, within the same redshift range and using the same calibrator (M08). Fig. 9 shows the cumulative distribution of the metallicity of hosts of the BAT6 sample compared to the SFR-weighted distribution of star-forming galaxies of the MOSDEF at 1 < z < 2. The distribution of p-values in the bottom right panel indicates we can reject the hypothesis that the MOSDEF star-forming galaxy sample weighted by SFR and the BAT6 sample are drawn from the same distribution at the 95% confidence level. We note that we could not determine the metallicity for four host galaxies. Nonetheless their stellar masses were lower than the stellar mass completeness of the MOSDEF sample. Therefore, our sample is still complete with respect to the comparison with the MOSDEF galaxies. Figure 10 shows the mass-metallicity relation (MZR) for the BAT6 hosts and the MOSDEF sample, using the M08 calibrator. We see that the LGRB hosts are consistent with the star-forming field galaxies at low mass and low metallicity but there is a clear dearth of high mass and high metallicity LGRB host galaxies 7 . Indeed there is only one host (which has very large errors) above 7 In Vergani et al. (2017) the authors also present the MZR based on the same BAT6 sample but the stellar masses are revised in this work (see Sect. 2.2). 12 + log(O/H) ∼ 8.7, whereas the area of stellar masses above ∼ 10 10 M and 12 + log(O/H) ∼ 8.6 is well populated by the star-forming galaxies of MOSDEF.
We also computed the Fundamental Metallicity Relation (FMR) as defined by Mannucci et al. (2011), represented in Fig. 11. This relation is supposed to be redshift independent. Nonetheless, as Sanders et al. (2015Sanders et al. ( , 2018) find a redshift dependence of the FMR built with the MOSDEF sample, we prefer to plot here only the BAT6 hosts at 1 < z < 2, omitting hosts at z < 1.
In Vergani et al. (2017) the authors noted a discrepancy between the region occupied by the LGRB hosts and the FMR (explained by a metallicity threshold for LGRB production).   Fig. 7: SFR versus stellar mass plot for the BAT6 sample at 1 < z < 2 (squares). The circles are from the MOSDEF sample at 1 < z < 2. The red line is the best fit to the MOSDEF data and the dotted lines represent the intrinsic scatter, following the method of Shivaei et al. (2015). The points are coloured by metallicity in the M08 calibrator. Galaxies where no metallicity could be measured are coloured in grey with a cross. The grey circles are from the MOSDEF sample at 1 < z < 2.
Here, it appears the LGRB hosts occupy mostly the low µ 8 area (whereas roughly half of the MOSDEF sample lies at µ > 9.7), and, in this region, they are consistent with the MOSDEF points. However, at those µ values, both the MOSDEF sample and the LGRB hosts seem to have lower metallicities with respect to the FMR predictions. A complete analysis of this discrepancy is beyond the scope of this paper. Here we can point out that this could be due to an underestimation of the FMR slope at low µ, or to an evolution of the relation in redshift (as found by Sanders et al. 2018), as we are comparing galaxies at 1 < z < 2 (our LGRB and MOSDEF samples) with the FMR built mainly with low-redshift galaxies. Indeed, different works showed that evolving physical conditions of ionised gas in HII regions may lead to evolution in the relationships between emission-line ratios and metallicity (e.g. Steidel et al. 2014;Shapley et al. 2015;Sanders et al. 2016). This is not an issue when comparing the metallicities of the BAT6 sample and the MOSDEF one as we selected the same redshift range 1 < z < 2, unless the physical  conditions in LGRB hosts are significantly different from those in typical SF MOSDEF galaxies.

Discussion
The analysis presented in the previous sections clearly shows that the stellar mass and metallicity CDFs of the LGRB hosts do not follow those of typical star-forming galaxies weighted by SFR. This implies that, due to some factors affecting the LGRB production efficiency, at 1 < z < 2 the LGRB rate cannot be used to directly trace star formation. As found in previous work (Vergani et al. 2015;Perley et al. 2016a;Japelj et al. 2016a;Vergani et al. 2017), it seems that metallicity is the main factor involved: LGRBs explode preferentially in sub-solar metallicity environments. Indeed, as we will discuss in more detail later in this section, in the commonly used LGRB collapsar progenitor model (Woosley 1993) a dependence of the LGRB production on metallicity is expected. In this context, the discrepancies in the stellar mass distribution are a direct consequence of the relation between stellar mass and metallicity (lower metallicities correspond to lower stellar masses).
Nonetheless, in our analysis there seems to be evidence also for an enhancement of sSFR among LGRB host galaxies compared to star-forming galaxies found in galaxy surveys. In the literature there are indications that starburst galaxies are generally characterised by lower metallicity than non-starburst ones (e.g. Sanders et al. 2018). It is therefore necessary to investigate which is the real driving factor affecting the LGRB efficiency, i.e. if it is the preference for galaxies with enhanced SFR that has as a consequence the preference for sub-solar metallicities, or the opposite. Fig. 12 shows that MOSDEF host galaxies with high sSFR have on average lower metallicity than those with lower sSFR values. Nonetheless, within the sSFR range covered by the MOSDEF galaxies considered in this work, for a fixed sSFR the fraction of MOSDEF star-forming galaxies having metallicities larger than 12 + log(O/H)∼ 8.5 is much higher than that of LGRB hosts. Stronger evidence that a possible preference for enhanced SFR would not be the only factor at play comes from the lack of LGRB host galaxies in the high stellar mass -high SFR region of Fig. 7, compared to MOSDEF galaxies. Indeed, if enhanced SFR is the driving factor, we should find LGRB host galaxies with enhanced SFR also at stellar masses larger than ∼ 10 10 M .
All the results point towards metallicity as the main driving factor. In order to further test this hypothesis, we apply a stepfunction metallicity cut on the MOSDEF sample and perform the comparison with our LGRB hosts again. We impose different metallicity thresholds. As the metallicity threshold value decreases the BAT6 and MOSDEF CDFs become more and more consistent until the majority of the p-values indicate we cannot confidently discard the null hypothesis that LGRB hosts and MOSDEF star-forming galaxies are drawn form the same population. Using a metallicity cut of 12 + log(O/H)= 8.55, the SFR-weighted CDFs of MOSDEF come into agreement with the ones of the BAT6 sample, as is shown in Figure 13. The twosample K-S test results in a distribution of p-values consistent with the null hypothesis that the two samples are drawn from the same underlying distribution in the majority of MC realisations. This implies that the discrepancies observed for the stellar mass and metallicity CDFs can be explained by a simple threshold on the metallicity, without the need for a contribution from a preference for starburst galaxies. This also naturally explains the trend observed in Fig. 4. Following the redshift evolution of the MZR, as the redshift increases, to a given stellar mass corresponds a lower metallicity. The metallicity threshold is therefore fulfilled by galaxies more and more massive. This explains the evolution of the median stellar mass of the LGRB host galaxies reaching the agreement with that of SF galaxies at z ∼ 3 (see also Section 5).
To verify that enhanced star formation is not the main driving factor affecting the LGRB efficiency, we perform the same analysis above, but applying a cut only on the sSFR this time. As shown in Fig. 14, even a sSFR cut of MOSDEF galaxies at log(sSFR) ≥ −8.7, (comparable with the sSFR of LGRB host galaxies) is not able to reconcile the stellar mass and metallicity distributions.
In general, we cannot exclude that a preference for galaxies with enhanced star formation (or starbursts) is also at play, but we can affirm that this is not the major factor driving the LGRB efficiency (see also Graham & Fruchter 2017). We tested also the effect of various sSFR cuts on top of a metallicity cut. The impact is very mild and results in a slightly better agreement of the distributions for a metallicity cut between 12 + log(O/H)= 8.55 and 12 + log(O/H)= 8.7. In Kelly et al. (2014) a preference for LGRB to explode in more compact galaxies (smaller half-light radii, higher SFR density and stellar mass density) compared to the SDSS star-forming galaxies is found, in addition to the preference for low-metallicities. However, considering the redshift The CDFs match more closely, and the 2 sample K-S tests suggest we can not discard the null hypothesis in the majority of realisations.
range and low stellar-masses of our study, a morphological analysis cannot be performed.
The results obtained can be interpreted in terms of the conditions necessary for a massive star undergoing a collapse to form an LGRB. A high metallicity would create too much wind-loss in the final stages of the progenitor's life, causing a loss of angular momentum that is necessary for the formation of an ultrarelativistic jet. However, the threshold we find (corresponding to 0.7 Z in the M08 scale) is higher than the 0.1−0.3 Z metallicity upper limit values predicted by most single-star progenitor models (e.g. Yoon & Langer 2006;Woosley & Heger 2006). Some studies pointed out that the Kewley & Dopita (2002) photoionization models on which the M08 method is based may overestimate oxygen abundances by 0.2-0.5 dex compared to the metallicity derived using the so-called direct T e method (see e.g., Kennicutt et al. 2003;Yin et al. 2007). On the other hand it should also be noted that the oxygen abundances determined using temperatures derived from collisional-excited lines could be underestimated by 0.2-0.3 dex (see e.g. López-Sánchez et al. 2012;Nicholls et al. 2012). A way to accommodate single star pro-genitors models with environments characterised by the higher metallicity values found in our works is to invoke chemically homogeneous mixing with very rapid rotation (Brott et al. 2011) and weak magnetic coupling (Georgy et al. 2012;Martins et al. 2013). In such cases LGRB could be produced also up to solar metallicities, but it is still not clear whether their rates would correspond to the LGRB observed rates.
Another possibility to be considered is an enhancement of the [O/Fe] in LGRB host galaxies. Indeed oxygen overabundances have been found in young and/or starburst galaxies (e.g. Vink et al. 2000 and references therein;Izotov et al. 2006) due to the longer time scale needed to produce type Ia SNe, that are the main producer of iron, compared to type II SNe where oxygen is produced. This was also pointed out by Steidel et al. (2016) as an explanation of the higher stellar metallicity compared to the nebular one found for galaxies at z > 2. Indications of low iron abundances compared to oxygen have been found by Hashimoto et al. (2018) in the host galaxies of two very lowredshift LGRBs: GRB 980425 and GRB 080517. At the Z values we find, iron is the main driver of the wind mass-loss of WR stars (Vink & de Koter 2005). If the LGRB environment is characterised by oxygen overabundance, a [O/Fe] 0.5 would imply iron metallicities in agreement with most single star LGRB models.
Binary channels where the progenitor star is tidally spunup by its companion (de Mink et al. 2009;Podsiadlowski et al. 2010) must also be considered. The evolution of massive stars in binaries is more complex to model than as single stars (e.g. Fryer & Heger 2005;Yoon 2015). A few studies on evolutionary models of binary stars have started to investigate the effects of rotation and metallicity (e.g. de Mink et al. 2009;Eldridge et al. 2017). In Song et al. (2016) the evolution of single and close binary stellar models (before any mass transfer) with strong core-envelope coupling is compared. Rotating massive stars in binary systems do not significantly lose their surface velocity, independent of the metallicity. Interestingly, the surface velocity increases with the initial stellar mass and the metallicity, and homogeneous evolution is more favoured at metallicities Z 0.5 Z than at lower metallicities. The avoidance of the Roche lobe overflow phase during the main sequence phase is favoured in high-mass star models at metallicities Z 0.5 Z . In the proposed scenario the primary star can enter the WR phase at an early stage of its evolution keeping fast rotation and high angular momentum. Even if the final stages of this evolution still need to be studied, this could be a channel for the formation of LGRBs also at moderately high metallicity.
More in general, it must be pointed out that the effect of metallicity goes beyond the final stages of the progenitor's life, and could also possibly affect the IMF of stars. The universality of the IMF is still debated, and different works pointed out the possibility of a metallicity dependence of the IMF, where a larger fraction of massive stars is produced at lower metallicity (e.g. Marks et al. 2012;Martín-Navarro et al. 2015).
It is worth noting that the metallicities derived in this paper are integrated over the entire galaxy. The possibility that the LGRB production site is situated in a low-metallicity pocket of a higher metallicity host should be considered. While this can not be excluded, various authors have shown that LGRB hosts are small and compact (Lyman et al. 2017), and when possible to resolve, little metallicity variation is found throughout the hosts (Levesque et al. 2011;Krühler et al. 2017;Izzo et al. 2017; see however Niino et al. 2017;Bignone et al. 2017 for considerations on metallicity variations). We stress also that we used a simple step-function for the metallicity threshold because our small statistics do not allow us to constrain the shape of this function, however, in reality, it is more likely to be a smooth function of decreasing probability of hosting an LGRB with increasing metallicity.
Based on the fact that the hosts of the BAT6 LGRB sample represent a statistically complete sample of LGRB hosts, we can estimate the fraction of super-solar metallicity hosts (in the M08 scale). With the conservative assumption that hosts without a metallicity measurement are super-solar (very unlikely, as they are mostly low mass galaxies), that fraction is less than 31±15% at z < 1 and 33±13% at 1 < z < 2 (15±15% and 13±13%, respectively, if the host without metallicity measurement are subsolar).

Conclusions
Using a complete and unbiased sample, we showed that the properties of LGRB host galaxies evolve between z < 1 and 1 < z < 2. Their median stellar mass increases from log(M * /M ) = 9.0 +0.1 −0.2 to 9.4 +0.2 −0.3 , their median star formation rate increases from SFR = 1.3 +0.9 −0.7 to 24 +24 −14 M yr −1 , while their median metallicity remains constant at 12 + log(O/H) ∼ 8.45 +0.1 −0.1 . Based on the SF galaxy relation between SFR and stellar mass, the stellar mass evolution we found for LGRB host galaxies is weaker than that expected following their SFR evolution. If LGRB prefer to explode in environments for which the metallicity is below a certain threshold, such a (weaker) evolution is expected. In fact a fixed metallicity threshold would stifle LGRBs from exploding in high stellar mass galaxies, and at the same time would correspond to a higher stellar mass at higher redshift as the massmetallicity relation evolves towards lower metallicities at fixed mass, or equivalently higher mass at fixed metallicity.
While performing the analysis of LGRB host galaxy properties, we revised some stellar mass values reported in the literature with proper SED fitting, confirming that the use of NIR photometry only can lead to overestimations of the stellar masses. We looked at the LGRB FMR with the revised stellar masses, showing that there is still a shift with respect to the relation found by Mannucci et al. (2011), at lower µ, but that our sample is consistent with the MOSDEF star-forming galaxy sample. This could be due to an underestimation of the FMR slope at low µ or to the current systematic uncertainties regarding evolution of metallicity calibrations with redshift.
We tested the hypothesis that LGRBs are pure tracers of star formation (i.e., the probability of forming an LGRB is proportional to the SFR) by comparing the cumulative distributions of stellar mass, SFR, sSFR and metallicity of our sample with the ones of the COSOMOS2015UD (excluding metallicity) and MOSDEF representative surveys of star-forming galaxies at 1 < z < 2. Even if there is evidence for a preference of LGRB to explode in galaxies with enhanced star formation, we demonstrated that the major factor explaining the discrepancy between the mass and metallicity CDFs is a decrease of LGRB production in galaxies with metallicities above 12 + log(O/H) ∼ 8.55 in the M08 calibrator, although this threshold is to be cautiously treated as an indication rather than an absolute value due to statistics and calibrator robustness. A lower LGRB production efficiency in higher metallicity environments can be understood in terms of the conditions necessary for the progenitor star to form a LGRB. The values found in this study invoke peculiar conditions of massive single star evolutionary models, and may be in better agreement with evolution in binary systems.
If this metallicity threshold is the only factor regulating the LGRB production efficiency, we expect LGRB to trace star formation in an unbiased manner once the bulk of the star-forming population of field galaxies is below this threshold. Assuming a threshold value of Z th = 0.7 Z , following the prescription of Langer & Norman (2006), and assuming that the LGRB luminosity function and density do not vary with redshift, this will happen for z > 3. This scenario is in agreement with the findings of Greiner et al. (2015) and Perley et al. (2016b). It is also supported by the decrease towards z ∼ 3 of the discrepancy of the stellar mass of the LGRB hosts and that of star-forming galaxies in surveys weighted by SFR. The collection of larger sample of high-z GRBs with future dedicated satellites as the THESEUS mission (Amati et al. 2018) will provide a viable way to probe the star formation history up to z = 10 and beyond.