Modeling the Statistics of the Cold Neutral Medium in Absorption-selected High-redshift Galaxies

We present a statistical model of the selection function of cold neutral gas in high-redshift (z~2.5) absorption systems. The model is based on the canonical two-phase model of the neutral gas in the interstellar medium and contains only one parameter for which we do not have direct observational priors: namely the central pressure (P*) of an L* halo at z=2.5. Using observations of the fraction of cold gas absorption in strong HI-selected absorbers, we are able to constrain P*. The model simultaneously reproduces the column density distributions of HI and H$_2$, and we derive an expected total incidence of cold gas at z~2.5 of $l_{CNM} = 12\times 10^{-3}$. Compared to recent measurements of the incidence of CI-selected absorbers (EW$_{CI\,1560}$>0.4 {\AA}), the value of $l_{CNM}$ from our model indicates that only ~15% of the total cold gas would lead to strong CI absorption (EW>0.4 {\AA}). Nevertheless, CI lines are extremely useful probes of the cold gas as they are relatively easy to detect and provide direct constraints on the physical conditions. Lastly, our model self-consistently reproduces the fraction of cold gas absorbers as a function of N(HI).


Introduction
Our understanding of star formation throughout cosmic time is intimately linked to our ability to observe and constrain the physical properties of the gas in and around galaxies. The neutral gas is of particular interest and tends to split into a warm, diffuse (T ∼ 10 4 K, n ∼ 0.5 cm −3 ), and a cold, dense (T ∼ 100 K, n ∼ 50 cm −3 ) phase (Field, Goldsmith, & Habing 1969), the latter being more inclined to Jeans instability and subsequent star formation. Wolfire et al. (1995) described the two neutral phases as a result of the balance of heating and cooling mechanisms which, for a range in external pressure, exist in equilibrium. The minimum pressure required for a stable cold neutral medium (CNM) to exist, P min , depends on several factors, out of which metallicity, Z, and ambient ionizing flux, I uv , play central roles. However, while the canonical two-phase description has been intensively investigated in nearby environments, these all have rather similar Z and I uv . In order to test the current theoretical framework over an increased range of parameter space, it is convenient to look to high-redshift galaxies as the average I uv is higher (Khaire & Srianand 2019) and the average metallicity is lower (e.g., De Cia et al. 2018).
One powerful way of studying the neutral gas at high redshift is through damped Lyα absorption systems (DLAs) observed in the spectra of distant quasars (see review by Wolfe et al. 2005). However, these high column density absorbers (N H i > 2×10 20 cm −2 ) predominantly probe the warm and diffuse gas phase (e.g., Srianand et al. 2012;Neeleman et al. 2015).
Instead, the cold gas phase can be studied directly by choosing appropriate tracers of the CNM (e.g., molecular hydrogen, H 2 , neutral carbon, C i, or absorption from H i at 21-cm, since the 21-cm optical depth depends inversely on the temperature). Yet, such cold gas absorbers (hereafter referred to as CNM absorbers) are rare and only a few hundred -compared to tens of thousands of DLAs -have been identified in large-scale spectroscopic surveys Ledoux et al. 2015;Srianand et al. 2012;Kanekar et al. 2014). These CNM absorbers are extremely powerful probes of the physical conditions of the gas (density, temperature, and I uv ) through atomic fine-structure and molecular transitions (e.g., Srianand et al. 2005;Noterdaeme et al. 2007b). It is thus possible to constrain the interstellar medium (ISM) conditions in galaxies out to high redshift z ∼ 3 (e.g., Balashev et al. 2019). Such direct observational constraints are crucial in order to test theoretical models and numerical simulations of star formation.
Recent numerical simulations are able to follow the detailed evolution of H 2 directly and resolve the small-scale CNM (e.g., Nickerson, Teyssier, & Rosdahl 2019;Bellomi et al. 2020). However, the CNM cross section depends heavily on resolution and on the poorly constrained feedback mechanisms from star formation and supernovae. Nevertheless, the bulk of the CNM cross section arises in a centrally concentrated region with a uniform covering factor.
In this letter, we model the CNM cross section at high redshift using a simple analytical description of the two-phase ISM. Our approach is based on the work by Krogager et al. (2020, hereafter K20) who present an effective model for DLAs building on the ideas of Fynbo et al. (2008). Here we include a simple pressure-based prescription of the two-phase medium to model the total CNM incidence and the fraction of DLAs exhibiting CNM absorption. This way we are able to quantify the fraction of CNM absorbers that were selected using C i absorption lines.

Literature data
We have compiled a sample of known CNM absorbers in the literature at z abs > 1.5 with measurements of both metallicity and N H i . The observable quantities used in this analysis are the column densities of neutral and molecular hydrogen, the gasphase metallicity, and the thermal gas pressure. We furthermore included impact parameters for the seven absorption systems where an emission counterpart has been identified. The collected data are presented in Table 1. Only absorption systems with detections of C i and/or H 2 were included in the compilation. The metallicities were calculated based on the elements Zn and S since these do not suffer from strong dust-depletion effects (De Cia et al. 2016).

Modeling neutral gas absorption
The modeling of the neutral gas of high redshift galaxies followed the framework for DLA absorption by K20 and Fynbo et al. (2008). Here we provide a brief summary of the model. Galaxies were drawn randomly from the UV luminosity function φ(L), which is described by a Schechter function: φ(L) = φ 0 (L/L * ) α exp −L/L * , with α = −1.7. For each randomly drawn galaxy, we assigned an average projected extent of neutral gas (R dla ), a central metallicity (Z 0 ), and a radial metallicity gradient (γ Z ) based on empirically derived scaling relations, for details see K20. An impact parameter (b) was assigned with a probability proportional to the area of a circular annulus at a given projected radius. The absorption metallicity was then calculated at the given impact parameter following the assumed metallicity gradient. Based on the absorption metallicity and the N H i value along the line of sight (see Sect. 3.2), we calculated the optical extinction (A V ) following Zafar & Møller (2019). We included a mock optical selection similar to large spectroscopic surveys by probabilistically rejecting sightlines with large A V (Krogager et al. 2019). For details regarding the implementation, we refer readers to K20.
In this work, we further model the radial pressure of the halo in order to calculate at what radial scales the ISM is able to support the CNM. We do this by assigning a halo mass to a given luminosity from which we can then calculate a halo pressure and its radial dependence. Since we are including a prescription for the pressure in this work, we are able to model N H i more accurately than in our previous model (K20). The details of the pressure-based model are presented in what follows.

Modeling the cold neutral medium
Following Elmegreen & Parravano (1994), we assume that P tot ∝ Σ 2 . The radial dependence of the stellar surface density Σ then leads to a radial dependence on the pressure of the form: The effective radius r e scales with luminosity as: where the characteristic scale length of an L * galaxy, r * e , is taken to be 3 kpc at z = 2.5 (van der Wel et al. 2014), and the powerlaw index t e has a value of 0.3 (Brooks et al. 2011). The central pressure, P 0 , is calculated as a function of halo mass, M h , assuming that the central pressure scales with the virial pressure: h . The halo mass is assigned based on the luminosity according to the luminosity-M h relation by Mason et al. (2015). The central pressure of a halo is then calculated as: where M * h = 5 × 10 12 M at z = 2 − 3 (Mason et al. 2015). The central pressure of an L * galaxy, P * 0 , is not constrained directly by observations. Instead, we explore a range of 10 5 −10 7 K cm −3 .
We approximate P min following eq. (33) of Wolfire et al. (2003): where I uv is the ambient UV field in units of the Draine field (Draine 1978), Z d is the dust abundance, and ζ cr is the cosmic ray ionization rate. In the above expression, P * min refers to the minimum pressure at Solar metallicity with I uv = 1. We assume a fiducial value of P * min = 10 4 K cm −3 (Wolfire et al. 2003). We further assume that I uv /ζ cr is constant as both the cosmic ray ionization rate and the ambient UV field depend on the star formation activity. We model Z d /Z as a broken power-law following Bialy & Sternberg (2019, their eq. 5).
The I uv is calculated based on the star formation rate surface density: The star formation rate, ψ , is taken to be directly proportional to the UV luminosity, whereas the disk scale length is given above in Eq.
(2). The total I uv is given as the sum of the extragalactic UV background and the UV field related to on-going star formation. Hence, combining equations (5) and (2), assuming that the ambient UV field of an L * galaxy is one in units of the Draine field, results in an effective scaling between I uv and L of the form: where I 0 = 0.16 is the extragalactic UV background integrated over 6 − 13.6 eV at z = 2.5 calculated by Khaire & Srianand (2019) in units of the Draine field. The radial dependence of the total pressure leads to a characteristic radial scale, R cnm , within which the total pressure is greater than P min , and hence the ISM is able to support a stable CNM. This radial scale is determined as: For the ensemble of random sightlines drawn in the model described above, we then refer to sightlines as CNM absorbers if the impact parameter is less than R cnm , that is to say we implicitly assume a covering factor of one for the CNM for r < R cnm . The validity of this assumption is discussed in Sect. 5.

Atomic and molecular hydrogen
Given the inclusion of the halo pressure as a function of radius, we were able to model the neutral hydrogen in more detail than the average radial profile assumed by K20. We started out by modeling the total hydrogen column, N H , as a function of radius. Motivated by the universality of the exponential form as observed by Bigiel & Blitz (2012), we used a central value of log N H (r = 0) = 21.5 in units of cm −2 for all galaxies. The exponential scale length was then matched to reproduce log N H i = 20.3 at the radius R dla , which was set to match the observed incidence of DLAs (see K20). In doing so, we assumed that N H is dominated by H i at these large radii, which is consistent with Bigiel & Blitz (2012).
We subsequently split the total hydrogen column density into separate atomic and molecular column densities using the pressure-based prescription by Blitz & Rosolowsky (2006): with P = 4 × 10 4 K cm −3 . Here, P tot was calculated at the position of the random impact parameter following Eq. (1). In order to properly reproduce the distribution function of N H i , we included a random scatter on N H of 0.4 dex, mimicking the scatter in the observed radial profiles (Bigiel & Blitz 2012). The molecular fraction of a given absorber was furthermore given a random scatter of 0.1 dex according to the observations (Bigiel & Blitz 2012). We verified that this modeling of N H i and N H 2 reproduces the distribution function of N H i . The agreement is only slightly worse than the fit by K20.

Results
Based on a stacking experiment performed on SDSS spectra, Balashev & Noterdaeme (2018, hereafter BN18) infer the average covering fraction of self-shielded H 2 -bearing gas among H i-selected absorption systems of 4.0 ± 0.5 stat ± 1.0 sys % and 37 ± 5 stat ± 4 sys %, for log(N H i / cm −2 ) > 20.3 and > 21.7, respectively. Here, we consider this molecular covering fraction as a proxy for the CNM fraction, that is to say the fraction of sightlines showing CNM absorption.
The only a priori unknown parameter in our CNM model, P * 0 , was constrained by matching the CNM fractions by BN18 using χ 2 -minimization. For this purpose, we used an effective uncertainty on the observed CNM fractions, combining the systematic and statistical uncertainties in quadrature. The best-fit value is log(P * 0 /k B / K cm −3 ) = 5.95 ± 0.11, for which we find the following average CNM fractions of 4.5% and 34.8% for log(N H i / cm −2 ) > 20.3 and > 21.7, respectively. This value of P * 0 is consistent with the central pressure inferred for the Milky Way of log(P 0 /K cm −3 ) ≈ 6 given the local pressure measured in the Solar neighborhood of log(P r=R /K cm −3 ) = 3.6 (Jenkins & Tripp 2011), assuming R = 8 kpc and r h = 3 kpc.
In order to check the resulting model distribution of thermal pressures, we compared our model to the observed thermal pressures, see Table 1. The observed average thermal pressure for CNM absorbers is log(P th /K cm −3 ) = 4.0 ± 0.1, and our model predicts a value of log(P th /K cm −3 ) = 3.8. Taking into account the caveat that the observations are neither homogeneous nor representative, the average pressure predicted by our model agrees well with the observations. The predicted column density distribution functions f (N, X) of H i and H 2 are presented in Fig. 1. Overall, our model simultaneously reproduces the observed statistics of N H i and N H 2 . The power-law inferred from the stacking experiment by BN18 represents the average shape of f (N, X) and it is not sensitive to the "knee" at higher column densities. log(N H 2 ) ∼ 22 is therefore not surprising. We note that the location of the knee depends on the adopted central value of N H , which is motivated by observations by Bigiel & Blitz (2012). The expected incidence of CNM absorbers was calculated as the integral: where we adopted L min = 10 −4 L * following K20; σ(L) denotes the luminosity dependent absorption cross section, and the Hubble parameter is given as: The CNM absorption cross section was calculated based on the luminosity-dependent R cnm : σ(L) = πR 2 cnm , which yields an incidence of l cnm = 11.3 × 10 −3 at z = 2.5 taking the dust obscuration bias into account.
The incidence of C i-selected absorbers inferred by Ledoux et al. (2015) at similar redshifts is l Ci = 1.5 ± 0.5 × 10 −3 , which was corrected for incompleteness due to the limiting equivalent width. In order to compare our predicted l cnm with that inferred by Ledoux et al. (2015), we restricted our calculation to model points with log(Z abs /Z ) > −0.6, which corresponds to the minimum metallicity in the C i-selected sample by Ledoux et al. (2015). This yields an expected C i incidence of l Ci = 4.8 × 10 −3 . A comparison between the overall expected incidence of cold gas and that traced by C i absorption is discussed in Sect. 5.
Lastly, in Fig. 2 we show the distribution of impact parameters as a function of N H i and metallicity for the modeled CNM absorbers compared to the overall absorber population. In this figure, we show the seven CNM absorbers for which emission counterparts have been identified (see Table 1). For comparison purposes, we show the compilation of emission counterparts of H i-selected absorbers by . As mentioned in the discussion by Krogager et al. (2018), the average projected radial extent of CNM gas at high redshift is expected to be roughly a factor of ten smaller than the typical extent of DLAs. This is in good agreement with our model where the median impact parameter of CNM absorbers is a factor 11 smaller than the median for DLAs.

Discussion
An implicit assumption in our analysis is that the strong H 2 lines (N H 2 > 10 18 cm −2 ) analyzed by BN18 are complete tracers of the CNM in our model. One independent way to probe the CNM gas in DLAs is through H i 21-cm absorption lines. The detection rate of 21-cm absorption in DLAs at z > 2.2 has been constrained to be 16 +17 −9 % (Kanekar et al. 2014). At face-value, this indicates that the CNM fraction of DLAs is higher than the 4 % obtained from H 2 statistics and assuming CNM always bears self-shielded H 2 (BN18). However, given the low-number statistics of high-redshift 21-cm absorbers, this fraction should be interpreted with great care. The CNM fraction derived from 21-cm absorbers depends strongly on the redshift criteria, for example, for z > 2.5, which is more comparable to the sample analyzed by BN18, and the fraction drops to ∼7 % (1/14; see also Srianand et al. 2012). The assumption that strong H 2 absorption systems trace the bulk of the CNM is therefore consistent with 21-cm absorption statistics.
Absorption from C i is a good tracer of CNM gas at high metallicity due to the increasing detectability with an increasing metal column and due to the more efficient cooling and dustshielding from UV photons in high-metallicity gas. A comparison between our model prediction and the incidence derived for strong C i absorption lines with W C i,1560 > 0.4 Å by Ledoux et al. (2015) shows that ∼30 % of the CNM gas at log(Z abs /Z ) > −0.6 1 was identified by selecting based on strong C i absorption. However, the simplifying assumptions in our model about the CNM covering factor and geometry affect the estimated fraction of the CNM identified using C i-selection.
We assume that the covering factor of CNM within R cnm is unity. While this may be an over-simplification, it is consistent with observations at lower redshifts (e.g., Wiklind et al. 2018) and the observation of several C i absorption components towards both lines of sight in the lensed quasar observed by Krogager et al. (2018). We also assume a spherical distribution of the cross section similar to K20. While this may work well for the more extended warm gas phase dominating the strong H i absorbers, the CNM on smaller scales may be distributed in a more flattened disk-like structure. Taking such a flattening into account would decrease the cross section of the CNM by roughly a factor of two. Accounting for such a geometrical effect would bring the model expectation for l cnm of metal-rich systems into closer agreement with the observed l Ci . Nonetheless, some part of the cold, molecular gas may arise outside a flattened disk, for example, in accretion streams or material carried out in outflows. This may decrease the projected ellipticity on average, thereby lending more support to our spherical approximation.
Taking our model at face value, we find that strong C iselected absorbers represent only a fraction of the total observable CNM (i.e., not obscured by dust). The inferred total l cnm of 11.3 × 10 −3 yields a fraction of 13 ± 5 % of the total CNM detectable through strong C i absorption in current spectroscopic surveys of quasars.
As alluded to above, a non-negligible part of the CNM has a dust obscuration that is too high in order to be identified in optical surveys. Based on our model, we find that the fractional completeness of l cnm is ∼95 % due to dust obscuration, giving a total unobscured CNM incidence of l cnm, tot ≈ 12 × 10 −3 .
We have so far only considered data from intervening quasar absorption systems. Another powerful probe of high-redshift galaxies is absorption systems observed in γ-ray burst afterglow spectra at the redshift of the host galaxy, the so-called GRB-DLAs. The GRB-DLAs probe very central regions of their host galaxies with impact parameters less than 5 kpc on average (Lyman et al. 2017). For comparison purposes, in Fig. 2 we show a sample of GRB-DLAs. The small impact parameters overlap with the expected impact parameters of CNM sightlines from our model. This is consistent with the high fraction ( 25 %) of GRB-DLAs showing absorption from H 2 or C i (Bolmer et al. 2019;Heintz et al. 2019). The GRB sightlines, however, tend to probe higher N H i , which may be a result of GRBs arising in regions of recent star formation where the gas column could be higher than the average galactic environment. A formal analysis of the GRB statistics in the context of our model is beyond the scope of this letter, yet we highlight that the similarities of ES-DLAs (log(N H i / cm −2 ) > 21.7; Noterdaeme et al. 2014;Ranjan et al. 2020) and GRB-DLAs in terms of the high fraction of them showing signs of CNM and their small impact parameters are in agreement with our model.

Summary
We have presented a simple model to reproduce the statistics of high-redshift absorption systems featuring either H 2 or C i absorption lines as tracers of the CNM. The aim of the model is to test the canonical two-phase model of the neutral ISM at high redshifts where the environment is significantly different from local galaxies (mainly in terms of metallicity and ambient ionizing flux). The main principle of the model relies on a pressure gradient throughout the galactic halo which determines at which point the neutral medium is able to support a stable CNM. The criterion for CNM to exist is determined by the minimum pressure, P min , calculated following theoretical considerations by Wolfire et al. (2003). This pressure-based prescription was incorporated into the model framework by K20 in order to predict the H i and Z abs of CNM absorption systems.
We were able to constrain the single a priori unknown model parameter by matching the fraction of high-redshift (z ≈ 2.5) DLAs hosting CNM gas (as traced by H 2 ; BN18). Our model then simultaneously reproduces the distribution functions of N H i and N H 2 . Furthermore, the distributions of metallicity, N H i , and impact parameters are all in qualitative agreement with observations. However, due to the heterogeneous sample selection and limited statistics, it was not possible to formally quantify the goodness of fit.
We find that the fraction of CNM identified by strong C i absorption lines (with equivalent widths in excess of 0.4 Å) amounts to ∼15 % of the total CNM observable in opticallyselected quasar surveys, and that the total incidence of CNM absorption is underestimated by ∼5 % due to dust obscuration.
Our model corroborates a simple picture for the neutral gas inside and around high-redshift galaxies. The mostly warm, neutral gas as probed by DLAs is extended over scales on the order of 10 kpc, whereas the cold neutral medium clouds are distributed on much smaller galactic scales, of the order 1 kpc, where the pressure is high enough.