Gamma-ray bursts as probes of high-redshift Lyman-alpha emitters and radiative transfer models

We present the updated census and statistics of Lyman-$\alpha$ emitting long gamma-ray bursts host galaxies (LAE-LGRBs). We investigate the properties of a sub-sample of LAE-LGRBs and test the shell model commonly used to fit Lyman-$\alpha$ (Ly$\alpha$) emission line spectra. Among the LAE-LGRBs detected to date, we select a golden sample of four LAE-LGRBs allowing us to retrieve information on the host galaxy properties and of its interstellar medium gas. We fit their Ly$\alpha$ spectra using the shell model, and constrain its parameters with the observed values. From the comparison of the statistics and properties of LAE-LGRBs to those of LAE samples in the literature, we find evidences of Ly$\alpha$ suppression in dusty systems, and a fraction of LAE-LGRBs among the overall LGRB hosts lower than that found for Lyman-break galaxy (LBG) samples at similar redshift range. However, we find that LAE-LGRBs are representative of Ly$\alpha$ emission from the bulk of UV-selected galaxies at z~2. We find that the golden sample of LAE-LGRBs are complex systems characterized by multiple emission blobs and by signs of possible galaxy interactions. The fitting procedure fails in recovering the HI column densities (NHI) measured from the afterglow spectra, and the other properties described by the shell-model parameters in the cases with very high NHI. The afterglows of most LGRBs and LAE-LGRBs show high NHI, implying that statistically the bulk of Ly$\alpha$ photons expected to be produced by massive stars in the star-forming region hosting the GRB will be surrounded by such opaque lines of sight. We interpret our results in the context of more sophisticated models and of different dominant Ly$\alpha$ emitting regions. We also compare LAE-LGRBs to LAE Lyman continuum (LyC) leakers in the literature in terms of properties identified as possible indirect indicators of LyC leakage. [Abridged]


Introduction
Due to its brightness and rest-frame wavelength, the Lyman-α (Lyα) emission line is one of the most used features to detect high-redshift galaxies (e.g., Ouchi et al. 2009;Sobral et al. 2015;Zitrin et al. 2015;Bagley et al. 2017). The natural connection of this line with the UV emission from star-forming regions makes it an interesting proxy to study the escape of Lyman continuum Hubble fellow (LyC; <912 Å). Indeed, recent studies, such as those of Verhamme et al. (2015Verhamme et al. ( , 2017, show that this line is one of the most reliable indirect indicators of ionizing photon leakage. To escape a galaxy, the Lyα photons produced in starforming regions have to pass through the gas where they are embedded. As this radiation resonantly scatters in the presence of neutral hydrogen and is easily absorbed by dust, the journey of photons in the interstellar and circumgalactic medium (ISM and CGM, respectively) can be complex. Nevertheless, different properties can favour their escape such as low neutral-hydrogen (HI) column densities, low dust content or suitable ISM geometries and kinematics (e.g., Kunth et al. 1998;Shapley et al. 2003;Verhamme et al. 2008;Wofford et al. 2013;Henry et al. 2015;Rivera-Thorsen et al. 2015). As a consequence, the Lyα line flux and profile reflect the signatures of the physical and dynamical properties of the gas and dust content of the Lyα emitters and their surrounding environment. Interpreting the line is complex, and radiative transfer models which take into account the different sources of distortion of the intrinsic profile are necessary to recover the physical meaning of the observed line. A simple and successful model, commonly used to reproduce the Lyα shape, is the shell model (e.g., Ahn 2004;Verhamme et al. 2006;Schaerer et al. 2011;Gronke et al. 2015). It consists of an homogeneous expanding shell of neutral hydrogen and dust surrounding a central emitting source.
While successful in reproducing the line profile (see e.g., Verhamme et al. 2008;Lidman et al. 2012;Yang et al. 2017;Gronke 2017), it is important to test whether the best-fit parameters values of the shell model correspond to the real characteristics of the Lyα-emitting galaxies. Orlitová et al. (2018) highlighted discrepancies between modelling results and observed double-peaked Lyα lines, by constraining independently 5 out of the 7 shell-model parameters with ancillary data, for twelve Green Pea (GP) galaxies at z ∼ 0.2. In particular, the constrained model neither reproduces the observed blue peak of the line correctly nor, in half of the cases, the red peak. For the prediction of the parameters in the unconstrained case, the main discrepant values are the redshift, the FWHM i (Lyα) and the velocity expansion of the shell. Similar discrepancies for the FWHM i (Lyα) were also found by Hashimoto et al. (2015) for double peak Lyα profiles of galaxies at z ∼ 2.2. This work emphasises that the use of the shell model to interpret the Lyα line and retrieve physical properties, such as N HI , must proceed cautiously to avoid misinterpretation. It also suggests that considering an homogeneous shell to describe star-forming regions and their surrounding gas could be too simplistic.
The simultaneous availability of the information needed to constrain the model parameters is rare, especially at high redshift. Two individual studies of lensed galaxies at redshift z = 2.7 allowed to interpret the Lyα line using partially constrained shell model Dessauges-Zavadsky et al. 2010). The fitting of the Lyα line is in good agreement with the observation in the study of Dessauges-Zavadsky et al. (2010), while it requires different expansion velocities for the front and the back of the modelled shell in .
Gamma-ray bursts (GRBs) can be a useful additional tool to investigate Lyα emission and test the shell model at high redshift. GRBs are the most extreme cosmic electromagnetic phenomena (see Gehrels & Razzaque 2013 for a review). Their brightness makes them powerful probes through the cosmic history since they can be detected up to the highest redshifts (the spectroscopic record holder is GRB 080423 at z = 8.2; Salvaterra et al. 2009;Tanvir et al. 2009). In the case of long GRBs (LGRBs), the energy powering the bursts is released during the core-collapse of massive stars (e.g., Hjorth et al. 2003). In addition, several studies indicate that LGRBs have the tendency to occur in dwarf galaxies with high specific star-formation rates and prefer low-metallicity environments, typically subsolar (e.g., Perley et al. 2016b;Japelj et al. 2016;Graham & Fruchter 2017;Vergani et al. 2017;Palmerio et al. 2019). This makes LGRB hosts likely representative of the common galaxies at high redshift, including during the epoch of reionization (Salvaterra et al. 2011(Salvaterra et al. , 2013Tanvir et al. 2019).
The bright afterglows associated with LGRBs provide ideal background lights to probe systematically and at any redshift the ISM, CGM and inter-galactic medium (IGM) along the line of sight of this population of faint galaxies. The absorption present in the afterglow spectra directly traces the environment of the star-forming regions and also outflows or inflows even for the faintest objects. Once the afterglow has faded, the host galaxy can be directly observed through photometry and spectroscopy. This offers the interesting possibility of combining information on the cold and warm gas with the emission properties of the GRB host galaxy (e.g., Vergani et al. 2011;Chen 2012;Friis et al. 2015;Wiseman et al. 2017;Arabsalmani et al. 2018).
In this work, (i) we update the statistics of Lyα-emitting (LAE 1 ) LGRB host galaxies and compare their properties with those of LGRB hosts in general, and of LAEs and LyC leakers in the literature; (ii) we use a golden sample of four LAE-LGRBs at 2 < z < 3.2 allowing the combination of the emission properties of the host galaxies with the information on the ISM probed by the afterglow, to investigate the properties of such systems and test the Lyα radiative transfer modelling.
The paper is organized as follows. In Section 2, we present a statistical study on LAEs in LGRB hosts. We describe the physical properties of the host galaxies of our golden sample in Section 3 and the Lyα radiative transfer model results in Section 4. In section 5, we discuss the differences between model predictions and observations, and we compare LAE-LGRBs to LGRB hosts in general and to LAEs and LyC-leaker galaxies in the literature. We draw our conclusions in Section 6.

Previous studies and approach
The early studies of high-redshift GRB host galaxies, based on the first sample of five objects, Fynbo et al. 2002;Møller et al. 2002;Fynbo et al. 2003;Vreeswijk et al. 2004), seemed to indicate that all GRB hosts were LAEs. A subsequent systematic study, based on the larger TOUGH sample ; sixty-nine LGRB host galaxies), has been carried out by Milvang-Jensen et al. (2012). They targeted a subsample of twenty LGRB hosts in the redshift range z = 1.8 − 4.5 for VLT/FORS1 (Appenzeller et al. 1998) spectral observations. They found seven LGRB hosts with significant Lyα emission (3σ detection), corresponding to 35% LAEs.
The first step of our work is to update the census of LAE-LGRBs. To this end we considered two approaches: (i) a determination of the LAE statistics considering spectroscopic samples of LGRB host galaxies or afterglows; (ii) the search for LAE-LGRBs in the literature and from the host-galaxy spectra available in the ESO archive 2 . In both cases, we consider a minimum value of z = 1.6 for the LGRBs, which corresponds to the atmospheric UV cutoff at 310 nm and is the lower-redshift limit allowing the detection of the Lyα line in the VLT/X-shooter spectra 3 . The samples considered for point (i) are the TOUGH, the X-shooter host-galaxy sample , XHG in the following) and the X-shooter afterglow sample (Selsing et al. 2019; XAFT in the following).
We stress that the census presented in the following sections (especially in Sect. 2.5) may not reflect the general statistics of LAEs among LGRB host galaxies. In addition to issues concerning the completeness of the samples and of the observations, the spectra are generally not homogeneous in terms of exposure times, instruments and observing conditions. Nonetheless, sometimes it is possible to define a common flux limit, as in the case of Milvang-Jensen et al. (2012) sample. The case of afterglow spectra is even more complex as, in addition to inhomogeneous follow-up and target brightness, the Lyα absorption along the GRB line of sight could affect the Lyα emission detection and profile.

Data reduction
To perform the census, we reduced archival data of several Xshooter spectra of GRB host galaxies. We describe here the method applied for the data reduction. All observations of the host galaxies, the telluric stars, and the spectrophotometric standards were reduced in the same way using the version 2.8.5 of the X-shooter data reduction pipeline (Modigliani et al. 2010). Before processing the spectra through the pipeline, the cosmicray hits and bad-pixels were removed following the method of van Dokkum (2001). Then, we subtracted the bias from all raw frames and divided them by the master flat field. We traced the echelle orders and calibrated the data in spatial and wavelength units using arc-line lamps. The flux calibration was done using spectrophotometric standards (Vernet et al. 2009) and a correction for flexure was applied. Lastly, the sky-subtraction and the rectification and merging of the orders was done to obtain the final two-dimensional spectra. Additionally, the spectra were corrected for the Milky Way (MW) extinction using the extinction curve from Pei (1992). The A V values are obtained from the NASA Extragalactic Database (NED) and correspond to the extinction map of Schlafly & Finkbeiner (2011). The wavelengths of the extracted 1D-spectra were converted in the vacuum reference and corrected for the Earth's rotation and revolution around the Sun (heliocentric correction). To optimally select the extraction regions we chose the spatial extension of the brightest emission line and applied this 1D extraction throughout the whole spectrum. When the Lyα line is detected, we selected the extraction region according to the spatial extension of this line which can be larger than Balmer lines due to resonant scattering. Emission line fluxes were determined by fitting a Gaussian function to the data, setting the continuum flux density in a region close to the emission line. We also numerically integrated the flux over the line width as a comparison to control the consistency of the values and uncertainties. For the asymmetric line-profile of the Lyα line, we use a skewed Gaussian parametrized as described in Vielfaure et al. (2020). When lines of interest were not detected, we estimated a 3σ upper limit. For upper limits of nebular emissions, we use a FWHM in agreement with other nebular lines detected in the spectrum. For Lyα line, similarly to Milvang-Jensen et al. (2012), we select the same width for all upper limits which is 900 km s −1 centered at 300 km s −1 . The fluxes have been corrected for slit loss by calculating the flux difference between the observation of a telluric star (close in time and space to the observation of the GRB host, and with the same instrumental setup) to the tabulated values 4 expected to be measured.

The TOUGH sample
We first focus on the TOUGH sample of 69 GRBs. The TOUGH Lyα study ) targeted the 20 GRBs with a redshift in the range z = 1.8 − 4.5, as known at the time of the Lyα observing campaign. Subsequently as part of other TOUGH campaigns Krühler et al. 2012) and later work  the redshift completeness of TOUGH has increased to 60/69 . With respect to Milvang-Jensen et al. (2012), this includes adding redshifts for eleven TOUGH GRBs in the range z = 1.8 − 4.5 (GRB: 050714B, 050819, 050915A, 051001, 060805A, 060814, 070103, 070129, 070224, 070328 and 070419B); all these hosts have VLT/X-shooter spectra.
We reduced the X-shooter data, as described in Sect. 2.2, to look for the detection of Lyα emission. We find no LAEs among these eleven additional host galaxies. The new statistic of LAEs among LGRB host galaxies of the TOUGH sample is therefore 23% ± 7%.
Taking into account the fact that the spectra have different flux limits (see Fig. 1 and 2), we determine the statistics of Lyα detection above a flux cut of 1.12 × 10 −17 erg s −1 cm −2 (corresponding to the lowest Lyα detection of the TOUGH sample), a luminosity cut of 5.6 × 10 41 erg s −1 (corresponding to the flux cut at the median redshift of the TOUGH sample, z = 2.45), and the fraction of LAEs among the sample with a rest-frame Lyα EW > 20 Å. To estimate the uncertainty on these statistics, based on the sample sizes, we perform a bootstrap method employing 10 6 random resamples with replacement of the number of LAEs among the LGRBs considered for each cut. The results are reported in Table 1.
In principle, we can look for Lyα emission also at z > 4.5. This would add three further objects from the TOUGH sample. They lack host galaxy spectral observations but these have afterglow spectra available in the literature (GRB 050904: Totani et al. 2006;GRB 060522: Tanvir et al. 2019;GRB 060927: Fynbo et al. 2009). They show no Lyα emission but formal limits have not been determined. Krühler et al. (2015) presented the UVB-arm spectra for only three objects at z > 1.6 of their XHG sample. Therefore, we reduce all the UVB spectra of the sample (following the procedure described in Sect. 2.2) and inspect them to look for Lyα emission. We find three LAEs among the 37 host galaxy spectra at z > 1.6, corresponding to 8% ± 6% (see also Fig. 1). They are GRB 060926, GRB 070110 and GRB 100424A (also reported in Malesani et al. 2013). We will focus in more details on GRB 060926 and GRB 070110 in Sections 3.3 and 3.4. The unpublished 1D and 2D spectra of the host of GRB 100424A, showing its Lyα detection, are reported in Fig. 3. The flux of the line corrected for Galactic extinction and slit loss is F Lyα = (3.4 ± 0.5) × 10 −17 erg s −1 cm −2 . The flux limits are much less homogeneous than for the TOUGH sample. We apply the same cuts as for the TOUGH sample and summarize the statistics in Table 1.

The XHG and XAFT samples
Finally, focusing on the XAFT sample, Selsing et al. (2019) report the detection of four LAEs (GRBs 121201A, 150915A, 151021A, 170202A) among their X-shooter afterglow sample of 41 LGRBs, corresponding to 10% ± 5%. We report the fluxes retrieved from the literature or determined in this paper (see the tables in Appendix A). For the TOUGH and XHG samples, we plot also the upper limits of the host galaxies with no Lyα emission detection (empty symbols). Bottom Panel: Lyα luminosity of the LAE-LGRBs as a function of rest-frame Lyα equivalent width (EW 0 (Lyα)). The sample and symbols used are the same as in the top panel.

Overall detections
The results of the three samples presented above are detailed in the tables of Appendix A. By merging them and removing the overlapping cases, we find 14 LAEs out of 84 LGRB host and afterglow spectra. The statistics on the LAE-LGRBs (restricted to the host galaxy spectra only) are summarized in Table 1.
To complete the census, we should add to this merged sample the complementary LAE-LGRBs individually reported in the lit-  erature. To this purpose, we selected from two GRB databases 5 , maintained by D. Perley and J. Greiner, all LGRBs detected until December 1, 2020, with a spectroscopic redshift z > 1.6. This results in 234 LGRBs (see Fig. 4) with 193 having a spectrum covering the Lyα line and for which it is possible for us to verify whether the Lyα emission is detected. For those not included in the sample presented in the previous sections, we searched the literature for articles or Gamma-ray Coordinates Network (GCN 6 ) Circulars claiming a detection of Lyα line. If unpublished spectra were available in the ESO archive, we retrieved and reduced them.

Comparison of LAE-LGRB properties to
LGRBs that are not LAEs and LAEs that are not LGRB hosts LGRB host galaxies are selected in principle only based on the fact that they host a LGRB explosion. Therefore, they are starforming systems, with a young star formation, and preferentially with sub-solar metallicity (see e.g., Lyman et al. 2017;Palmerio et al. 2019). In the following, we look at LAE-LGRB statistics and compare it to those of LAEs in galaxy surveys at a similar redshift range to see if these systems have an enhanced or suppressed Lyα emission compared to the general population of star-forming galaxies. We also compare the properties of the LAE-LGRBs to those of LGRB host galaxies not showing Lyα emission (Fig. 5) to try to identify possible driving factors of Lyα escape. We stress that in some cases only a small fraction of objects in our samples have available information to be used for such comparison. We refer the reader to Tables in Appendix A for the values and references of properties discussed in the following and shown in Fig. 5. In the survey of galaxies at z ∼ 2 presented by Steidel et al. (2004), 40% of galaxies with available spectra show the presence of Lyα emission (are LAEs) and 10% have EW 0 (Lyα) > 20 Å (Erb et al. 2014). Reddy et al. (2008) found 65% LAEs for LBGs at z ∼ 3, and 23% with EW 0 (Lyα) > 20 Å. Pentericci et al. (2010) found a fraction of 50% LAEs in their selected GOOD-MUSIC LBG sample at z ∼ 3, and 18% with EW 0 (Lyα) > 20 Å. The LAE statistics of the TOUGH and XHG samples presented in previous sub-sections of Sect. 2 is lower than that of those  studies (see Table 1, column f tot ). However, the selection criteria of the parent sample are not the same and may contribute to this difference. Most of the above-mentioned studies are based on galaxies with magnitude brighter than R = 25.5 mag. This corresponds to the median value of the TOUGH sample, and about a half of TOUGH LAEs are fainter galaxies. If we apply this selection to the TOUGH sample (hence reduced to 17 objects) we obtain a fraction of 22% ± 15% LAEs and 7% ± 7% LAEs having EW 0 (Lyα) > 20 Å. Those numbers should be considered as lower limits as the Lyα flux limit reached by the Steidel et al. (2003Steidel et al. ( , 2004 spectroscopic survey is deeper than ours (typically ∼ 10 −18 erg s −1 cm −2 instead of ∼ 10 −17 erg s −1 cm −2 ). The fraction of LAEs in the TOUGH sample could be as high as 47%, assuming that all the objects with flux limit values above that of the weakest detected Lyα are all LAEs. Therefore, the lower statistics found for LGRB samples than for galaxy surveys could be explained by the different Lyα flux limits. The magnitude selection of Pentericci et al. (2010) reaches R = 26 mag. Applying the same cut to the TOUGH sample would not change so much the results compared to the R = 25.5 mag cut. We do not find any particular correlation of the LGRB host galaxy properties (stellar mass, SFR, metallicity, UV magnitude, dust extinction or HI column density along LGRB line of sight) with Lyα luminosity or EW 0 , but our results are limited by the poor statistics. As shown in Fig. 5, whenever the information on the SFR of LAE-LGRBs is available, its value is below 30 M yr −1 , except for GRB 080602 which has a lower limit of 48 M yr −1 (Palmerio et al. 2019). If we focus on stellar masses, we see that LAE-LGRBs are associated preferentially with stellar masses M < 10 10 M , as also found by Pentericci et al. (2010) for LBG samples at z ∼ 3. However, due to the poor statistics we cannot conclude if this is an intrinsic properties of LAEs or a consequence of the characteristics of the parent samples.
The low fraction of LAEs in the XHG sample is likely due to the difference in the M UV and stellar mass range covered by the TOUGH and XHG samples. Indeed, by construction, the latter is biased towards LGRBs with highly extinguished afterglows , that on average may originate in more massive and dusty galaxies. Over the 30 objects of the XHG sample having information on E(B-V) (even if with large errors), 70% have E(B-V) values higher than the most extinguished LAE-LGRB in the sample. This is consistent with the fact that dust is considered having a significant impact on Lyα suppression (Neufeld 1990;Laursen et al. 2009) and that LAEs are preferentially found in systems with low extinction (e.g., Shapley et al. 2003;Pentericci et al. 2007Pentericci et al. , 2010Matthee et al. 2016). The metallicities of the LAE-LGRBs also support this trend as their values are among the lowest values of the TOUGH and XHG sample (see Appendix A).  Matthee et al. (2021) predicted that the bulk of Lyα population of UV galaxies with M 1500 ∼ −20 ± 1 at z = 2 should be below ∼ 2×10 42 erg s −1 , and the fraction of LAEs with EW 0 > 25 Å should be of 10 − 30%. We can see in Fig. 6 that most of the THOUGH unbiased sample of LGRB hosts is below this limit. This is due to the fact that LGRB samples have no pre-selection except for the LGRB explosions. Therefore, LAE-LGRBs are representative of Lyα emission from the bulk of UV-selected galaxies at z ∼ 2. Furthermore, the fraction of LAEs with EW 0 GRB 011211 GRB 021004 GRB 070110 GRB 060926  > 25 Å among the TOUGH sample is of ∼ 25%, in agreement with Matthee et al. (2021) expectations for UV-selected galaxies.

The golden sample
One of the strengths of GRBs is that they give us the possibility to combine the information on the host galaxy gas retrieved from the afterglow spectra with that obtained through direct host galaxy observations, once the afterglow faded. This advantage can be useful to characterize LAE-LGRBs and to test Lyα models.
For this purpose, we selected from the census presented in Sect. 2 a golden sample of LAE-LGRBs fulfilling the following criteria: (i) Availability of both afterglow and host galaxy spectra; (ii) Lyα detection in the afterglow or host galaxy spectra with spectral resolution R>1000, as lower resolution can lead to a misinterpretation of the line profile (e.g., Verhamme et al. 2015;Gronke et al. 2015); (iii) Detection of other host galaxy emission lines in order to de-termine the systemic redshift of the galaxy, the intrinsic Lyα and galaxy properties.
We stress the importance of X-shooter observations to fulfill point (iii), thanks to its wide spectral coverage from ∼ 300 nm to 2500 nm.
We find only four objects fulfilling the above-mentioned criteria: GRBs 011211, 021004 (reported independently in the literature), 060926 (part of the XHG sample), and 070110 (part of both the XHG and TOUGH sample). The images of their fields, with the slits used to obtain the afterglow and host-galaxy spectra, are shown in Fig. 7. We reduced the X-shooter spectra of the four GRB hosts (see log of the observations in Table 2) and measured the emission line fluxes (reported in Table 3 and 4) accordingly to the procedure described in Sect. 2.2. These fluxes only represent the Lyα photons falling in the slit, without applying any correction to take into account the two-dimensional spatial extension of the Lyα emission. If we consider a 2D-Gaussian approximation, the resulting fluxes would be on average a factor ∼ 2 higher. The Lyα fluxes of these host galaxies have previously been reported in Milvang-Jensen et al. (2012). Our measurements using X-shooter spectra provide lower values but consistent within the uncertainties (at 2 or 3 σ). This is a good agreement, taking into account that different observing techniques and slit position angles may cover different parts of the diffuse Lyα emission.
For each host galaxy we determined the star-formation rate (SFR) (or put limits on it), and the Lyα properties by the following method. We used the Hα flux (measured or converted from the Hβ flux, assuming Hα = 2.86 × Hβ; Osterbrock 1989) to derive the Hα luminosity corrected for Milky Way extinction (L(Hα MWcor obs )). We converted the Hα luminosity into SFR using the relation from Kennicutt (1998) (1) We cannot retrieve information on the dust extinction of the four host galaxies from the emission lines, due to the detection of only one of the Balmer lines (Hα or Hβ) in the spectra. Therefore, the SFR values are determined without applying any hostgalaxy dust correction and should be considered as lower limits.
Under the assumption that the Lyα and the Balmer lines originate from the same regions and are produced by the same recombination process, it is possible to infer the intrinsic properties of the Lyα emission line from the Hα or Hβ lines. We determined the FWHM i (Lyα) from a Gaussian fit of the Hα (or Hβ) line profile that we corrected for instrumental dispersion. In case B recombination, the theoretical ratio between the Lyα and the Hα lines is ∼ 8.7 (Brocklehurst 1971). Using this value and assuming no extinction, we converted the Balmer flux to intrinsic Lyα flux (F i (Lyα)). We determined the f esc (Lyα) as the ratio of the observed Lyα flux to the intrinsic one. Following the consideration on Lyα fluxes reported above, the f esc (Lyα) values should be considered as lower limits. The rest-frame Lyα equivalent width (EW 0 (Lyα)) is determined by dividing the rest-frame Lyα flux by the Lyα continuum level.
We will detail more on the analysis of the data of each GRB in the following sections. The fluxes of the identified lines, the SFR, and the Lyα emission characteristics are reported in Tables  3 and 4. In general, we remark that the Lyα emission line profile and velocity spread of our sample (see next sections) are typical of LAEs at z ∼ 2. Similarly to those of the XLS-z2 sample of LAEs presented in Matthee et al. (2021), they show an asymmetric profile with a prominent peak redshifted from the systemic redshift of the host by 200 km s −1 , and extending over 500 km s −1 . However, the velocity shift of the ISM absorption lines detected in the afterglow spectra with respect to the systemic velocity of the galaxies (∼ −100 km s −1 ) are not as high as the the XLS-z2 sample (∼ −260 km s −1 ), likely testifying of a more static environment.

GRB 011211
The photometry of the afterglow and host galaxy of GRB 011211 has been published in Jakobsson et al. (2003). The detection of Lyα emission through narrow-band filters was reported by Fynbo et al. (2003). The host galaxy of GRB 011211 has a multicomponent morphology Fynbo et al. 2003). The GRB site is in the southeast part of the system, while the Lyα emission peaks in the central-north part of the system and extends over the entire system.
We present here the X-shooter observation of the host galaxy (Prog. ID: 084.A-0631; PI: S. Piranomonte, see Table 2), not previously published. At the afterglow position, we clearly identify the [OIII]λλ4959, 5007 doublet and Lyα line (see Fig. 8). From the [OIII]λλ4959, 5007, we derive a redshift of 2.1434 ± 0.0001. The Lyα peak is redshifted by 280 ± 30 km s −1 compared to the host galaxy redshift. The observed Lyα line properties are reported in Table 4. The spatial extension of the Lyα line in the 2D spectrum is 2. 5, compared to 1. 8 for the brightest nebular emission line [OIII]λ5007. From the 3σ upper limit on the Hα flux we determine a SFR < 2.0 M yr −1 . Perley et al. (2013) obtained an average dust attenuation of A V = 0.19 +0.70 −0.00 mag from the host galaxy Spectral Energy Distribution (SED) fitting.
The afterglow spectrum shows the detection of the GRB Damped Lyα system (DLA) with log(N HI /cm −2 ) = 20.4 ± 0.2 , with many associated absorption lines. From the low-ionization state lines (LIS) redshift reported by Vreeswijk et al. (2006) we determine the velocity of the ISM with respect to the systemic redshift of the host galaxy, V LIS = −160 ± 180 km s −1 , with important uncertainties due to the low resolution of VLT/FORS2 spectra. The Lyα emission peak would fall in the Lyα absorption trough but is not detected in the afterglow spectra, likely due to a combination of line faintness, spectral resolution and noisy spectral region. From the power-law fitting of the afterglow, Jakobsson et al. (2003) determined A V = 0.08 ± 0.08 mag along the line of sight.

GRB 021004
The afterglow and host galaxy of GRB 021004 have been intensively observed (see Fynbo et al. 2005 and references therein). Møller et al. (2002), showed the presence of Lyα emission in the optical spectrum of the GRB afterglow, which was later confirmed by many other works (e.g., Mirabal et al. 2003;Starling et al. 2005). The host galaxy of GRB 021004 has a compact core with a faint second component and the GRB site is located at the center of its host . The X-shooter observation of the host galaxy (Prog. ID: 084.A-0631; PI: S. Piranomonte, see Table 2) has previously been published in Vergani et al. (2011). In Fig. 7, we show the image of GRB 021004 field with superimposed the slits used to observe the host galaxy and the afterglow. Vergani et al. (2011) report the detection of [OIII]λ5007 emission line, at the same redshift as the GRB, from the galaxy eastward of the GRB host in the spectrum obtained with the slit position angle PA=92 • . Its proximity to the GRB host (projected distance of 14 kpc) may indicate a possible interaction between this galaxy and the GRB host.
For our analysis we used the X-shooter spectrum obtained with slit PA of 41 • that, due to its larger slit width, minimizes the loss of Lyα flux. We reduced the spectrum following the procedure described in Sect. 2, and identify the [OIII]λλ4959, 5007 doublet, Hα and Lyα emission lines (see Fig. 8). Hβ falls exactly on a sky line and the [OII]λ3727 doublet is not detected (3σ upper limit of 1.7 × 10 −17 erg s −1 cm −2 ). We derive a redshift of z = 2.3298 ± 0.0001 for the host galaxy. We estimate the line fluxes and SFR following the procedure described in Sect. 3. We find a SFR = 6.7 ± 0.8 M yr −1 (without host extinction correction). The Lyα emission from the host galaxy is an asymmetric line redshifted by 180±20 km s −1 with an EW 0 = 105±10 Å. The spatial extension of the Lyα line in the 2D spectrum is 4. 7, com-pared to 2. 0 for the brightest nebular emission line [OIII]λ5007. The SED fitting performed by de Ugarte Postigo et al. (2005) determined A V = 0.06 +0.08 −0.06 mag, whereas Perley et al. (2013) report A V = 0.42 +0.09 −0.07 mag. The Lyα emission is also detected in the afterglow spectrum at the same velocity and with a similar shape profile as that detected in the host galaxy spectrum. The afterglow spectrum shows also the Lyα absorption of the GRB sub-DLA system, with log(N HI /cm −2 ) = 19.5 ± 0.5 .
The VLT/UVES afterglow spectrum of GRB 021004 presents a plethora of absorption lines with complex velocity structures (Fiore et al. 2005;Chen et al. 2007;Castro-Tirado et al. 2010), spanning thousands of km s −1 . Although a progenitor-star wind origin was claimed, this was firmly excluded (at least for some of these absorbing systems) by the detection of low-ionization transitions (see Chen et al. 2007). Here we focus only on the lowest velocity component, associated with the GRB sub-DLA, therefore likely representing the cold and warm ISM gas of the GRB host galaxy. Using the shift of the AlIIλ1670 and CIIλ1334 LIS from the systemic redshift of the galaxy, we determine a ISM velocity V LIS = −80 ± 20 km s −1 . From the spectral flux distribution of the afterglow at several epochs, de Ugarte Postigo et al. (2005) determined an average A V = 0.20 ± 0.08 mag along the GRB line of sight.

GRB 060926
In the VLT/FORS1 image (Prog. ID: 079.A-0253(A); P.I.: P. Jakobsson) shown in Fig. 7, we see an offset between the GRB position and the brightest part of the system. Both are covered by the slit used for the host spectroscopic observations. In the X-shooter host galaxy observations obtained under program 085.A-0795(A) (presented here for the first time; PI: H. Flores; see Table 2), we identify the [OIII]λ5007, [OII]λ3726, Hβ and Lyα lines (see Fig. 8). These lines, except Lyα, are used to derive a redshift of z = 3.2090 ± 0.0001 for the host galaxy which is consistent with the redshift derived in Krühler et al. (2015) with a different data set. Even with seeing conditions of ∼0. 8, from the emission lines it is not possible to separate the different parts of the host galaxy systems. The spatial extension of the Lyα line in the 2D spectrum is 2. 7, compared to 1. 4 for the brightest nebular emission line [OIII]λ5007.
Due to the high redshift of the host, the wavelength coverage of X-shooter does not allow us to observe the Hα line. We determine a SFR = 20.2 ± 4.0 M yr −1 , assuming Hα = 2.86 × Hβ (Osterbrock 1989). As we do not have any information on the host extinction, the value inferred above should be considered as a lower limit of the SFR. The Lyα emission from the host galaxy is an asymmetric line redshifted by 250±20 km s −1 from the systemic redshift. The observed Lyα equivalent width and intrinsic properties are reported in Table 4.
The afterglow spectrum of GRB 060926 has been published in Fynbo et al. (2009). The Lyα line is clearly detected in the trough of the DLA. From the SiIIλ1526, AlIIλ1670 and CIIλ1334 LIS redshift, we determine a ISM velocity of V LIS = −30 ± 180 km s −1 . The HI column density derived from the fit of the Lyα absorption ) is among the highest values probed by GRB afterglow, log(N HI /cm −2 ) = 22.6 ± 0.15.

GRB 070110
The near-infrared X-shooter spectrum of GRB 070110 host galaxy has been studied in Krühler et al. (2015), but the UVB spectrum has not been previously published. The [OIII]λλ4959, 5007, [OII]λ3726, Hα and Lyα lines are identified in the host galaxy spectrum (see Fig. 8), allowing a redshift determination of z = 2.3523 ± 0.0001. The measured Hα flux corresponds to a SFR = 1.9 ± 0.6 M yr −1 . The Lyα emission from the host galaxy shows an asymmetric profile and is redshifted by 285 ± 40 km s −1 from the systemic redshift (see also  Table 4 for other Lyα properties). The spatial extension of the Lyα line in the 2D spectrum is 2. 1, compared to 2. 0 for the brightest nebular emission line [OIII]λ5007.
The afterglow spectrum of GRB 070110 has been published in Fynbo et al. (2009). They report the detection of the Lyα emission line in the trough of the GRB-DLA. From the SiIIλ1526, AlIIλ1670 and CIIλ1334 LIS redshift, we determine an ISM velocity V LIS = −20 ± 50 km s −1 . The HI column density derived from the fit of the Lyα absorption ) is log(N HI /cm −2 ) = 21.7 ± 0.10. Troja et al. (2007) determined A V = 0.08 ± 0.08 mag along the GRB afterglow line of sight.

Description of the model
Following Vielfaure et al. (2020), we model the observed Lyα line with the "shell-model" fitting pipeline described in Gronke et al. (2015). This simplistic model consists of an expanding, homogeneous, spherical shell composed of uniformly mixed neutral hydrogen and dust (Ahn et al. 2003;Verhamme et al. 2006). The expanding geometry was motivated by the HI outflows which seem to be ubiquitous at both low-and high-redshifts (e.g., Shapley et al. 2003;Wofford et al. 2013;Rivera-Thorsen et al. 2015;Chisholm et al. 2015). The Lyα emitting source is placed at the centre of the sphere filled with ionized gas and surrounded by a neutral and dusty shell. This source would correspond to the star-forming region hosting the LGRB. Such assumption is reasonable considering that LGRBs are produced by massive stars and are found in the central star-forming region of their host (typically at ∼ 1 kpc from the center, Lyman et al. e.g., 2017). The photons are collected in all directions after their scattering through the neutral shell.
The shell model is defined by seven parameters that we briefly detail here. Four parameters describe the shell properties: the radial expansion velocity (V exp ). This parameter is physically linked to the broadening and the peak shift of the Lyα line. For outflowing shell, the velocity expansion regulates the blue peak intensity by scattering the photons in the red wing of the line. A higher velocity of the gas increases these effects but suppresses the interaction of the photons with the neutral gas. See for example Verhamme et al. (2006) for more details on this effect. The V exp of the neutral gas can be linked to the velocity shift of the LIS lines (V LIS ) when available. -The HI column density (N HI ). The effect of increasing N HI is to broaden the line by shifting the Lyα photons from the line center. -The dust optical depth (τ d ) at wavelengths in the vicinity of Lyα. The dust grains absorb and scatter the Lyα photons. An increasing dust content will favor this process and affect the intensity of the line but can also affect its shape. An increasing N HI boosts the interaction of the photons with dust so both parameters are linked. -The effective temperature of the gas (T). Its effect is complex and its impact on the width of the line and the position of the peak depends on the other parameter values. The distribution of the residuals is projected in the right bottom panels. We also superimpose a Gaussian distribution with µ = 0 and σ = 1 (dotted black line) as a reference for comparison. We test the deviation of the normalized residual distribution from this Gaussian distribution by using the Shapiro-Wilk test and we provide the p-value of the test in the bottom panels. A p-value above 0.05 indicates that it is no longer possible to reject the null hypothesis that the normalized residual distribution is drawn from a Gaussian distribution at a 95% confidence level. Notes. ∆z = (z fit − z host ) × c, where z fit is the Lyα redshift of the best fit, z host is the redshift derived from the X-shooter emission lines (Table 6), and c is the speed of light; log(N HI /cm −2 ) is the HI column density of the shell; V exp is the shell expansion velocity; log(T/K) is the temperature of the gas; τ d is the dust optical depth; FWHM i (Lyα) and EW i (Lyα) are the intrinsic FWHM and equivalent width of the Lyα line, respectively.
We consider a Gaussian profile for the intrinsic Lyα emission and an adjacent flat UV continuum, therefore three additional parameters are used: the redshift of the emitter (z) corresponding to the center of the intrinsic Lyα line; the intrinsic equivalent width of the Lyα line (EW i (Lyα)); the intrinsic FWHM of the Lyα line (FWHM i (Lyα)).
We refer the reader to Verhamme et al. (2006) and Gronke et al. (2015) for a more detailed description on the model and parameters. The observation of LGRB host galaxy and after-glow allows us to constrain these parameters even for faint starforming galaxies. However, while host galaxy observations provide information for the overall galaxy, the afterglow constrains parameters along the LGRB line of sight. Therefore, by using the shell model with afterglow observations we assume that the LGRB region is the main source of Lyα emission in the host, and that the LGRB line of sight probes gas properties characteristic of the environment surrounding the LGRB region. In the following, to differentiate between the parameters constrained by these two types of observation, we annotate with X OA the parameters related to optical afterglow observations and X HG the host galaxy ones. As a first step, we fit the observed Lyα lines without con-A&A proofs: manuscript no. main Notes. We refer to Sect. 3 for the value determinations, except for the dust optical depth, τ d , that is converted from A V by considering a dust albedo of A = 0.5 (see Verhamme et al. 2006;Gronke et al. 2015). "*" is for value determined from the FWHM of the [OIII]λ5007 line. " OA " is for the parameters related to optical afterglow observations and " HG " for the host galaxy ones. straints, as it would typically be done for the majority of high-z LAE observations, where most of the parameters can hardly be constrained observationally. Then, we fit the Lyα profiles using the observational constraints, discussing as well the effects of constraining the parameters with only the host galaxy or afterglow observations.

Unconstrained Lyα profile fitting
To evaluate the predictions of the shell model, we fit the observed Lyα lines of the four LGRBs of the golden sample using an improved model grid and process originally described in Gronke et al. (2015), without considering any prior. The automatic fitting succeeds in reproducing the line profile satisfactory in all four cases. In Fig. 9, we present for each spectrum the best fit corresponding to the lowest χ 2 . In all cases but GRB 011211 the models fit a blue peak which is not clearly seen in the spectra but consistent with the uncertainties. The distributions of normalized residuals presented at the bottom of each spectrum are consistent with a unit Gaussian distribution, indicating a good agreement between the models and the observations. While the unconstrained shell model succeeds in reproducing the observed profiles, the comparison between the best shell parameters returned by the fitting (Table 5) and the corresponding values determined from the observations (Table 6) reveals important discrepancies for GRB 060926 and GRB 070110. Their observed HI column density is higher than the fit results, and the FWHM i (Lyα) fitting values are largely overestimated.
The Lyα escape fraction ( f esc (Lyα); see Table 7) is also a byproduct of the shell model and is found to be consistent (within the uncertainties) with the value determined from the observations (see Table 4) only in the case of GRB 070110. Particularly, f esc (Lyα) is discrepant for GRB 011211 and 021004 whereas the fit and best-fitting parameters are in agreement with the observations. The disagreement on f esc (Lyα) implies also that the Hα flux and subsequent SFR obtained by the fit are inconsistent with the values determined from the observations (see Tables 3 and 7). For GRB 011211 the Hα flux is less constrained (only 3σ upper limit) but a lower Hα flux would imply an increased difference. We caution, however, that the Lyα escape fractions derived from the spectral fitting procedure is highly uncertain as it depends on the fitted value of τ d which mostly has only minor effects on the spectral shape (see discussion in Gronke et al. 2015). The disagreement could also come from the lack of extinction correction of the observed Hα flux in our calculation of f esc (Lyα) and SFR (see Sect.3). An A V ≈ 0.6 mag for both GRB 011211 and 021004 (considering an SMC extinction curve; Japelj et al. 2015) would result in consistent Hα and f esc (Lyα) values within 1σ.

Constrained Lyα profile fitting
We then proceeded to fit the Lyα using the observational constraints described in Sect. 3 and listed in Table 6, as priors. Using the afterglow and host galaxy spectra, as described in Sect. 3, we are able to constrain up to 6 out of the 7 parameters taken as input by the model. The host galaxy observations provide the information on the observed Lyα line and the Balmer lines used to constrain the intrinsic Lyα emission (FWHM HG i (Lyα) and EW HG i (Lyα)). They also provide the nebular emission lines used to constrain the redshift (z HG ). The afterglow observations constrain the gas properties: the hydrogen column density (N OA HI ), the velocity of the gas (V OA LIS ) and the information on the dust content (τ OA d ). It was not possible to constrain the HI temperature. The effect of this parameter on the final profile is complex and usually it is degenerated with the other parameters (Gronke et al. 2015). We let this parameter free for the 4 cases. For GRB 060926 we lack information about the dust extinction, and for GRB 011211 we have only upper limits for Hα. To be consistent with the constraints on the N OA HI values derived from the afterglow, the dust extinctions are taken from the afterglow SED fitting reported in Sect. 3. The grid we use does not extend up to log(N HI /cm −2 ) = 22.6, so this value is not firmly constrained for GRB 060926 but constrained to the highest value of the grid which is 21.8. We report the results of the fitting in Table 8. In Fig. 10, the intrinsic Lyα derived from the model and the best fit obtained with the lowest χ 2 are superimposed to the data. Notes. Same as Table 5 for models constrained with z, N HI , V exp , τ d , FWHM i (Lyα) and EW i (Lyα) Fig. 10. Same as Fig. 9 but with the shell model parameters z, N HI , V exp , τ d , FWHM i (Lyα) and EW i (Lyα) constrained by the values determined from the observations.
Not surprisingly, we find a good agreement between the fitting and the observations of the two hosts with lower N OA HI (GRB 011211 and 021004). As shown by the residuals at the bottom of Fig. 10, the fit is worse than in the unconstrained case for GRB 070110 and fails to reproduce the Lyα profile and parameters for GRB 060926, which has an extreme log(N OA HI /cm −2 ) value of 22.6. In these two cases the models tend to fit a double peak profile with higher red peak velocity shift (∼ 400 km s −1 ) than observed (∼ 280 km s −1 ) and quasi-static HI shell (3 ± 1 km s −1 ) which results in a double peak profile. By relaxing the constraints, without considering priors for FWHM i (Lyα) HG , we manage to fit the profile of GRB 070110 in a better way but the best-fitting FWHM i (Lyα) becomes extremely narrow with an unrealistic value of 8 +6 −3 km s −1 . We also performed two sets of fitting constraining separately the parameter values determined from the host galaxy and the afterglow observations. We find that the profiles of the fitting constrained with the host galaxy observations (z HG , FWHM HG i (Lyα) and EW HG i (Lyα)) are very similar to the unconstrained cases, well reproducing the data (see Appendix C). Differently from the unconstrained case, the blue peak is not present, which is due to the constraint on FWHM HG i (Lyα). The predictions for the properties of the gas are also similar to the unconstrained cases. Especially, the N HI values are lower than observed through LGRB line of sight. The fitting constrained with the afterglow observa-Article number, page 13 of 28 A&A proofs: manuscript no. main tions (z HG , N OA HI , V OA LIS and τ OA d ) only gives very similar results to the fully constrained case (see Appendix D). The models successfully reproduce the low N HI cases but fail to reproduce the two cases with high N OA HI values. These results are in agreement with previous finding that the most constraining parameters are N HI and V exp (e.g., Gronke et al. 2015).

Implications of the shell-model fits
The unconstrained fitting succeeds in reproducing all Lyα profiles of the golden sample (and GRB 191004B; see Vielfaure et al. 2020). However, if we compare the parameter values resulting from the fitting with those observed, we find that for GRB 060926 and GRB 070110 there are important discrepancies. A fit performed constraining the parameters values with the observed ones does not succeed in reproducing the observed line profile of these two cases.
The FWHM i (Lyα) provided by the unconstrained models for GRB 060926 and 070110 are very large and disagree with the FWHM i (Lyα) HG we determined assuming that the Balmer photons and Lyα photons are produced by recombination in the same regions. Such high values are usually returned by the shell model fitting to reproduce the line profile of the emitters having Lyα peaks and wings substantially shifted from the systemic redshift. A similar discrepancy has also been found by Hashimoto et al. (2015) in the study of z ∼ 2 LAEs and in the Yang et al. (2016) and Orlitová et al. (2018) studies on low-redshift GP galaxies. The latter also reports a discrepancy between the systemic redshift derived from the models and the observed ones, whereas they agree in our case. Hashimoto et al. (2015) propose that other sources of Lyα emission than star formation, such as gravitational cooling, could account for the large intrinsic width of the Lyα emission line. Orlitová et al. (2018) try to explain the FWHM discrepancy considering complex kinematic structures (as testified by broad Hα and Hβ wings observed in their GP spectra) but with negative results. They conclude that the discrepancy is a model failure and not a physical effect. From our data we cannot recover possible kinematic structures. A likely cause for the mismatch is a widening of the "intrinsic" Lyα spectrum due to radiative transfer effects (as discussed, e.g., in Yang et al. 2016). In fact, the study of Gronke et al. (2018) supports such a scenario (also see the recent study by Li et al. 2020, where 100 km s −1 observed spectra were modeled with narrow intrinsic spectra). There, the authors post-process a hydrodynamic simulation of a galactic disk using Lyα radiative transfer, and show that the spectrum is indeed widened compared to Hα due to ISM turbulence (cf. their figure 1 where the intrinsic spectrum of width ∼100 km s −1 is widened to ∼400 km s −1 due to radiative transfer effects within the gaseous disk) 7 . This could in principle explain the effect seen here where the intrinsic width obtained from the shell-model fitting is much wider than the one inferred from the observations. Another important discrepant value is N HI . The fit returns lower values than those determined from the afterglow spectroscopy in half of our sample. The two GRBs of the discrepant 7 In Gronke et al. (2018), the subsequent effect of the "shell" can be taken quite literally as they consider cosmic-ray driven outflows which are likely smooth and cold (e.g., Girichidis et al. 2018) -like a shell. However, in general this can also be due to CGM effects, or, a lack of numerical resolution leading to a "shell" by smoothing out otherwise multiphase structure ). cases have much higher N OA HI than the other GRBs in the golden sample, with log(N OA HI /cm −2 ) > 21.5 as opposed to values of log(N OA HI /cm −2 ) ∼ 20. The Lyα emission corresponds to a resonant transition making its escape unfavorable when the photons pass through a static high-HI density medium. In the shell model, N HI impacts the Lyα peak shift, with low N HI values corresponding to small shifts and vice-versa. Indeed, when we constrain its value, the model is not able to reproduce the line profile for GRB 060926 and 070110, and results in a peak shifted to larger velocities (∼400 km s −1 ).
When considering all the LGRBs with a measured N OA HI (reported in Tanvir et al. 2019), we see (Fig. 11) that such high N OA HI values are common among LAE-LGRBs. Even if the fraction of LAE-LGRBs from sub-DLA/Lyman-limit systems (log(N OA HI /cm −2 ) < 20.3; 22% ± 11%) seems to be higher (considering also the poor statistics) than that from DLA (log(N OA HI /cm −2 ) > 20.3; 15% ± 4%), there is no strong evidence for Lyα suppression associated with high N OA HI . Also, the Lyα luminosity seems not to depend on N OA HI (see Fig. 11, right panel). The discrepancy among the observed N OA HI values and those found by the fitting, as well as the high N OA HI values found among LAE-LGRBs, could be explained by the difference between the medium probed by Lyα photons and the afterglow emission.
LGRB lines of sight may go through regions with high N OA HI , while Lyα photons may have escaped through lower N HI lines of sight from the GRB young star-forming region. This would imply an anisotropic environment. The theoretical studies of resonant line transfer through simplified anisotropic geometries (e.g., Dijkstra & Kramer 2012;Behrens et al. 2014;Eide et al. 2018) as well as turbulent medium with ionized channels (e.g., Kimm et al. 2019;Kakiichi & Gronke 2019) show that Lyα spectra are shaped by the lowest density pathways. This predicts that in general, the column density probed by Lyα is less or equal to the one observed along the line of sight, in agreement to what we find for the LAE-LGRB modelling presented here. To put constraints on the HI anisotropy is important also to the consequences this can have on the ionizing escape fraction of galaxies (e.g., Vielfaure et al. 2020). Already from this work where we find log(N OA HI (LGRB)/N HI (Lyα)) ∼ 2-3 in two of the cases, we can speculate that it is difficult to explain this large anisotropy by purely turbulent driving. For reasonable Mach numbers M 5 studies show only ∼ 1 dex difference (for ∼ 10% of the sightlines; Safarzadeh & Scannapieco 2016) which suggests the need of radiative or mechanical feedback causing larger anisotropies (Kimm et al. 2019;Kakiichi & Gronke 2019;Cen 2020).
The study of LGRB lines of sight shows that, from a statistical point of view, low-density channels surrounding star-forming regions should be rare (Tanvir et al. 2019). This is consistent with LGRB progenitors being massive stars typically formed in dense molecular clouds and residing in gas-rich star-forming galaxies (e.g., Jakobsson et al. 2006). The observation of the timevariability of fine-structure transitions, in LGRB afterglow spectra, has shown that the distance of the dominant absorbing clouds ranges from ∼50 pc to 1 kpc (e.g., Vreeswijk et al. 2013). The neutral gas probed by LGRB afterglows should therefore be connected to the star-forming regions where LGRBs explode. Since these regions are also expected to be the main birthplaces of Lyα photons, the properties of LGRB lines of sight could represent the average conditions of the environment surrounding Lyα emission regions. If we consider the proportion of sub-DLA LGRBs as representative on average of the low-density channels through which the gas has favourable conditions for the Lyα ra- diation to escape, they would correspond to 12% of all the possible lines of sight 8 .
Even if LGRBs sites pinpoint regions with high starformation activity (and therefore a significant production of Lyα photons), another possibility to explain the N HI discrepancy is that most of the Lyα photons that succeed to escape from the galaxy may not originate from the young star-forming regions probed by GRBs (as also proposed by Vreeswijk et al. 2004 for GRB 030323). One way to investigate both scenarios would be through galaxy simulations, studying the global, the line of sight and the Lyα emission properties and comparing them to those determined by LGRB afterglow and host observations. This is beyond the scope of this paper and will be the object of a further paper.
The situation is even more complex if we look at the morphological properties of the host galaxies in more detail. The images of the host galaxies reveal a patchy and irregular morphology (see Fig. 7). The HST images obtained for GRB 021004  shows that the GRB position is superimposed to the strongest UV emitting region. This is not the case for GRB 011211 , where the Lyα emission envelops the entire galaxy but the peak of emission is clearly separated from the GRB position. Interestingly, this does not prevent the match between the shell model and the observations. A complex morphology is also found for other LAE-LGRB hosts (GRB 000926 Fynbo et al. 2002, GRB 050315 andGRB 061222A Blanchard et al. 2016). Such irregular and clumpy systems may represent mergers (e.g., Conselice et al. 2003) or just clumps of star formation. The presence of a companion galaxy ∼15 kpc away from the GRB 021004 host, and the high velocity absorption found in its afterglow spectra, may suggest interaction of the two galaxies and the presence of outflowing gas. In this light, the success of a very simple configuration as the one of the shell model in reproducing the Lyα profile and the observed parameters is maybe surprising.

Comparison to LyC leakers
In Fig. 12, we compare properties of LAE-LGRBs to those of known LyC leakers in the literature. For this comparison, we consider the LyC leakers at z ∼ 3.1 reported by Fletcher et al. (2019) and Ion2 (z ∼ 3.2, de Vanzella et al. 2016). We also consider those detected in the local Universe analysed in Verhamme et al. (2017) as well as the green peas studied in Izotov et al. (2018a,b). The properties compared are the Lyα escape fraction, the rest-frame Lyα equivalent width, the Lyα red peak velocity (v peak ) and also the [OIII]/[OII] ratio (O 32 ). The last three properties have been proposed in the literature to correlate with ionizing and Lyα photons leakage. Verhamme et al. (2017) have pointed out rest-frame EW(Lyα) > 70 Å, f esc (Lyα) > 0.3, v peak < 150 km s −1 and O 32 > 4 as good indirect indicators of high LyC leakage ( f esc (LyC) >5%). The O 32 ratio is a proxy of the ionization state of the gas in the star forming regions and has been proposed as a marker of densitybounded HII regions (Jaskot & Oey 2014;Nakajima et al. 2013;Nakajima & Ouchi 2014;Stasińska et al. 2015). Its positive correlation with the escape fraction of ionizing and Lyα photons leakage have been shown in several studies for local GPs or highredshift LAEs (e.g., Nakajima & Ouchi 2014;Izotov et al. 2016Izotov et al. , 2018ade Barros et al. 2016;Verhamme et al. 2017;Fletcher et al. 2019).
From the comparisons we see that, except for GRB 060707 and GRB 060605 that show very high v peak , the LAE-LGRBs fall in the same parameter space as LyC leakers and follow the correlations between the indirect indicators found by Verhamme et al. (2017). Following the study of Verhamme et al. (2017) on low-redshift LyC leakers, we can cut these plots in two regions corresponding to strong LyC leakers ( f esc (LyC) > 5%, red rectangle) and weak LyC leakers ( f esc (LyC) < 5%, blue rectangle). All LAE-LGRBs fall in the category of the weak LyC leakers except GRB 021004 which appears systematically in good agreement with the region of the strong leakers. In panel (b) of Fig.  12, we also superimposed the distribution of [OIII]/[OII] ratio for the GRB sample of Krühler et al. (2015) and GRB 081121 reported here. We see that for the majority of the GRBs this ratio is around two, with seven cases at O 32 > 4. GRB 021004 has a high value of O 32 > 10 and is the strongest LAE of our  golden sample with a substantial f esc (Lyα) of 60%, in agreement with potentially high f esc (LyC). Nevertheless, the Lyα profile of GRB 021004 is a single peak with no residual flux at the Lyα line center. This is not the typical line shape observed for confirmed LyC emitters which have the tendency to show double-or triple-peak profile (Verhamme et al. 2017;Rivera-Thorsen et al. 2017;Vanzella et al. 2020). Also our shell-model fitting suggests the column densities to be too large to allow LyC photons to escape. However, this model already predicted similar HI column density (log(N HI /cm −2 ) ≈ 19 − 20) for four Green Pea galaxies out of the five with detected LyC emission reported in Yang et al. (2017), suggesting that LyC emission can escape through holes in the ISM even if the Lyα photons probe denser neutral gas. The only LAE-LGRB for which LyC leakage has been detected along the LGRB line of sight ( f esc (LyC) =0.35 +0.10 −0.11 ) is GRB 191004B (Vielfaure et al. 2020). However, in Fig. 12 (panel (c)) we can see that it does not fall in the high escape fraction region but in the lower left area. The reasons could be that (i) f esc (LyC) is lower at the scale of the galaxy than along the LGRB line of sight, and (ii) the indicators of strong LyC leakage evolves with redshift. As a comparison, the LyC emitters from Fletcher et al. (2019) are found out of the high escape region (red rectangle). They show lower rest-frame EW(Lyα) and higher v peak than the local LyC emitters whereas their escape fraction of ionising photons is significantly higher ( f esc (LyC) = 15 − 60%). This could also suggest that strong LyC leakers span a wider parameters space than predicted by the study of local LyC emitters. Overall it is clear that this kind of studies are still limited by the poor statistic and the current results show the difficulty of characterizing LyC leakage based only on these properties.

Conclusions
In this paper we have studied the Lyα emission of LGRB host galaxies. First, we provide a new census of LAEs among LGRB host galaxies. To date, there are 29 LAE-LGRBs. The fraction of LAEs among LGRB hosts varies from ∼10% to 40% depending on the sample and threshold considered. Such statistics are lower than those found for LBG samples at similar redshift range, but they become comparable when taking into account only LAEs with rest-frame EW(Lyα) > 20 Å. These results can be explained by the different selection criteria of the parent samples and by the shallower spectral observations of LGRB samples compared to LBG ones. We compared the properties of LAE-LGRBs to those of LGRB hosts in general and find evidences of Lyα emission suppression in dusty hosts. We showed that, since LGRB hosts are not selected following usual criteria and techniques used for Lyα emission searches in star forming galaxies, they probe the regime of Lyα emission of the bulk of the UV-selected galaxy population at intermediate (and possibly high) redshifts.
We then selected a sub-sample of four LGRBs that allowed the combination of the emission properties of the host galaxies with the information on the ISM probed by the afterglow. We fit the Lyα emission of these galaxies using the shell model and we use the ancillary observational properties to constrain and test the model. We find that, without priors on the parameters, the shell model succeeds in reproducing the four profiles. However, the properties predicted by the model differ from the observed ones for the two cases (GRB 060926 and 070110) having a high N OA HI . Similarly, constraining the model parameters using the values determined from the observations, the shell model succeeds in reproducing the two lower N OA HI cases but fails for the two highest N OA HI ones. These results may be due to the fact that (i) the N HI of the gas probed by the LGRB afterglow (N OA HI (LGRB)) is not representative of that encountered by the Lyα photons (N HI (Lyα)); specifically we find N HI (Lyα) ≤ N OA HI (LGRB), consistent with the picture that Lyα photons escape preferentially through low-column density channels and (ii) as known, the shell model relies on a too simplistic gas configuration. A consistent scenario would be provided by a turbulent medium widening the intrinsic Lyα spectrum, and an anisotropic HI distribution allowing the escape of Lyα photons through lower density channels than probed by the LGRB sightline. Such a description is consistent with mechanical and radiative feedback mechanisms produced by starburst and supernova that clear out channels from the active regions within the galaxies. Nevertheless, we stress that the afterglows of most LGRBs and LAE-LGRBs show log(N OA HI /cm −2 ) > 20.3, implying that likely the Lyα photons produced by massive stars will predominantly be surrounded by such high-density gas, and low-density channels should be rare (∼ 10%). Another possibility is that, even if dense star-forming regions produce a high fraction of Lyα photons, most of Lyα escaping photons come from less dense regions overall the galaxy. We will investigate further both scenarios in the future taking advantage of galaxy simulations.
Finally, we compared the properties of the LAE-LGRBs with those of LyC leakers in the literature. Specifically, we considered the commonly used Lyα indirect indicators of LyC leakage at low redshift and find that only one LAE-LGRB (GRB 021004) has the corresponding values for being a strong LyC leaker. However (strong) LyC reported in the literature at z ∼ 3, span a wider range of parameter values than those commonly considered to point towards strong LyC leakage, questioning the use of such indicators at least at high redshift.
The analysis presented in this paper needs larger statistics. To this purpose, we have been awarded X-shooter time to observe eleven more LAE-LGRBs reported in the census. In the future it will be possible to extend these studies to higher redshift. Indeed, new space GRB missions like Gamow Explorer (White 2020) and THESEUS 9 (Amati et al. 2018; pre-selected as M5 European Space Agency mission), in synergy with ELT and JWST observations, will enable us to perform similar studies at z > 3.5, and with better statistics.

Notes.
For GRB 060714 the values reported correspond to the Lyα detection from the afterglow spectrum ) but the upper limits derived from the host galaxy  are given between brackets.
Article number, page 20 of 28