Physical properties of CO-dark molecular gas traced by C$^+$

Neither HI nor CO emission can reveal a significant quantity of so-called dark gas in the interstellar medium (ISM). It is considered that CO-dark molecular gas (DMG), the molecular gas with no or weak CO emission, dominates dark gas. We identified 36 DMG clouds with C$^+$ emission (data from the Galactic Observations of Terahertz C+ (GOT C+) project) and HINSA features. Based on uncertainty analysis, an HI optical depth $\tau\rm_{HI}$ of 1 is a reasonable value for most clouds. With the assumption of $\tau\rm_{HI}=1$, these clouds were characterized by excitation temperatures in a range of 20 K to 92 K with a median value of 55 K and volume densities in the range of $6.2\times10^1$ cm$^{-3}$ to $1.2\times 10^3$ cm$^{-3}$ with a median value of $2.3\times 10^2$ cm$^{-3}$. The fraction of DMG column density in the cloud ($f\rm_{DMG}$) decreases with increasing excitation temperature following an empirical relation $f\rm_{DMG}=-2.1\times 10^{-3}\,T_{\rm ex}(\tau\rm_{HI}=1)+1.0$. The relation between $f\rm_{DMG}$ and total hydrogen column density $N_H$ is given by $f\rm_{DMG}=1.0-3.7\times 10^{20}/N_H$. The values of $f\rm_{DMG}$ in the clouds of the low extinction group ($A\rm_V \le 2.7$ mag) are consistent with the results of the time-dependent, chemical evolutionary model at the age of ~ 10 Myr. Our empirical relation cannot be explained by the chemical evolutionary model for clouds in the high extinction group ($A\rm_V>2.7$ mag). Compared to clouds in the low extinction group ($A\rm_V \le 2.7$ mag), clouds in the high extinction group ($A\rm_V>2.7$ mag) have comparable volume densities but excitation temperatures that are 1.5 times lower. Moreover, CO abundances in clouds of the high extinction group ($A\rm_V>2.7$ mag) are $6.6\times 10^2$ times smaller than the canonical value in the Milky Way. #[Full version of abstract is shown in the text.]#


Introduction
The interstellar medium (ISM) is one of the fundamental baryon components of galaxies. The ISM hosts star formation. Determining the composition of the ISM will improve understanding of the lifecycle of ISM and the evolution of galaxies.
The 21-cm hyperfine line of atomic hydrogen has been used to trace the neutral medium. The linear relation between HI column density and visual extinction, $N({\rm HI})/A_{\rm V} = 1.9\times10^{21}$ cm$^{-2}$ mag$^{-1}$ (Bohlin, Savage, & Drake 1978), is valid for $A_{\rm V} < 4.7$ mag. Molecular hydrogen, H$_2$, the main component of the ISM, lacks a permanent dipole moment and does not have rotational radio transitions in the cold ISM. CO and its isotopologues have been used as the main tracers of dense, well-shielded H$_2$ gas. At Galactic scales, the H$_2$ column density is derived by multiplying the integrated CO intensity $W({\rm CO})$ by an $X_{\rm CO}$ factor of $2\times10^{20}$ cm$^{-2}$ (K km s$^{-1}$)$^{-1}$, with ±30% uncertainty in the Milky Way disk (Bolatto, Wolfire, & Leroy 2013). Though mean values of $X_{\rm CO}$ are similar for CO-detected molecular gas in diffuse and dense environments (Liszt et al. 2010), the volume density of CO-detected molecular gas is an order of magnitude greater than typical values of diffuse atomic gas shown in Heiles & Troland (2003).
A&A proofs: manuscript no. 28055
The transition from diffuse atomic hydrogen to dense CO molecular gas is not well understood. Dust is assumed to be well mixed with gas. Infrared emission of dust has been used as a tracer of total hydrogen column density. Results from the all-sky infrared surveys by the Infrared Astronomical Satellite (IRAS), the Cosmic Background Explorer (COBE), and the Planck satellite revealed an excess of dust emission, implying additional gas that cannot be accounted for by HI and CO alone (Reach, Koo, & Heiles 1994; Reach, Wall, & Odegard 1998; Hauser et al. 1998; Planck Collaboration 2011). Furthermore, gamma-ray observations from COS-B (Bloemen et al. 1986) and the Energetic Gamma-Ray Experiment Telescope (EGRET; Strong & Mattox 1996; Grenier, Casandjian, & Terrier 2005) also implied an extra gas component with a mass comparable to that of the gas traced by $N({\rm HI})+2X_{\rm CO}W({\rm CO})$ in the Milky Way. This excess component of the ISM, which cannot be fully traced by the usual HI 21-cm or CO 2.6-mm transitions, is termed dark gas.
The mainstream view considers dark gas to be unobserved molecular gas due to lack of corresponding CO emission. Most of the direct detections of molecular gas were made with CO emission. There are, however, examples of interstellar molecules detected toward lines of sight without corresponding CO emissions (Wannier et al. 1993;Magnani & Onello 1995;Allen et al. 2015). If the nondetection of CO is taken as a sign of missing molecular gas, the fraction of dark gas varies from 12% to 100% for individual components in Liszt & Pety (2012). The existence of unresolved molecular gas by CO is also supported by photodissociation region (PDR) model (e.g., van Dishoeck et al. 1988). H 2 can exist outside the CO region in an illuminated cloud because the self-shielding threshold of H 2 is smaller than that of CO. The gas in the transition layer between the outer H 2 region and the CO region is CO-dark molecular gas (DMG).
The DMG can be associated with HI self-absorption (HISA), which is caused by a foreground HI cloud that is colder than the HI background at the same radial velocity (e.g., Knapp et al. 1974). The Canadian Galactic Plane Survey (CGPS; Gibson et al. 2000, Taylor et al. 2003) and the Southern Galactic Plane Survey (SGPS; McClure-Griffiths et al. 2005) revealed that HISA is correlated with molecular emission in space and velocity (Gibson et al. 2005b, Kavars et al. 2005), although HISA without H$_2$ can exist (Knee & Brunt 2001). It is now widely accepted that a large portion of the cold neutral medium (CNM) is colder (Heiles & Troland 2003) than the predictions of the three-phase ISM model (McKee & Ostriker 1977). When HISA does contain H$_2$, it is dubbed HI narrow self-absorption (HINSA; Li & Goldsmith 2003). In normal molecular clouds, HINSA can be easily identified through its correlation with $^{13}$CO. Without CO emission as a clear comparison mark, distinguishing between HISA and HINSA relies on the empirical threshold of $\delta V \sim 1.5$ km s$^{-1}$, which seems to be applicable in most diffuse regions but can be subjective. We henceforth adopt the term HINSA because of our focus on DMG.
The total H$_2$ column density can be measured directly through ultraviolet (UV) absorption of H$_2$ toward stars. Observations taken by the Copernicus satellite (Savage et al. 1977) and the Far Ultraviolet Spectroscopic Explorer (FUSE) satellite (Rachford et al. 2002, 2009) revealed a weak inverse correlation between rotational temperature and reddening, as well as an increasing trend of molecular fraction with reddening. These kinds of observations are limited to strong UV background stars with low extinction (< 3 mag), and cannot resolve a Galactic cloud because of the coarse spectral resolution (> 10 km s$^{-1}$) at UV bands (Snow & McCall 2006). The C$^+$ 158 µm emission, a fine-structure transition $^2P_{3/2} \rightarrow {}^2P_{1/2}$, can be used as a probe of molecular gas in PDRs. Based on C$^+$ spectra obtained from a Herschel Open Time Key Program, Galactic Observations of Terahertz C+ (GOT C+), Langer et al. (2014; hereafter L14) found that the DMG mass fraction varies from ∼ 75% in diffuse molecular clouds without corresponding CO emission to ∼ 20% for dense molecular clouds with CO emission.
There are two critical challenges in quantifying the DMG environment: the determination of the kinetic temperature, T k , and the determination of the column and volume densities of Hi and H 2 . Analysis of dust emission and extinction can aid in meeting these challenges. When looking into the Galactic plane, however, analysis of dust is muddied by source confusion. In previous studies, Kavars et al. (2005) attempted to constrain T k and volume density n based on an analysis of Hi absorption. Because of the lack of an effective tracer of total hydrogen gas, these authors had to rely on the Galactic thermal pressure distribution in Wolfire (2003) to estimate the molecular fraction. The study L14 introduced C + emission as an effective tracer of total hydrogen gas. The study L14 assumed an overall temperature of 70 K and Galactic thermal pressure distribution to calculate total volume density. They analyzed C + excitation to determine molecular abundance. Because C + emission is sensitive to kinetic temperature and volume density, the lack of direct measurements of excitation temperature and volume density in L14 introduced uncertainties, especially for single clouds at low Galactic latitudes. Moreover, L14 obtained Hi intensity by integrating velocity width defined by the C + (or 13 CO) line. This overestimated Hi column density as widespread background Hi emission is included. Additionally, the optically thin assumption for the 21-cm line adopted in L14 results in an uncertainty of 20% for optical depth between 0.5 and 1 as discussed in their paper. Considering the caveats above, it is of great importance to inspect the effects of kinetic temperature, volume density, and Hi optical depth.
To improve constraints on physical properties of DMG, we adopted here the HINSA method in Li & Goldsmith (2003) to obtain an independent measure of T ex (Hi), N(Hi), and n(Hi) = N(Hi)/L Hi , where L Hi is the linear dimension of HINSA cloud. H 2 volume density, n(H 2 ), and H 2 column density, N(H 2 ), can be related as n(H 2 ) = N(H 2 )/L H 2 , where L H 2 is the linear dimension of H 2 region of the HINSA cloud. According to the PDR model T6 (model parameters are proton density of 10 3 cm −3 , temperature of 15 K, UV intensity of 1, and total visual extinction of 5.1 mag) in van Dishoeck & Black (1988), the outer layer with pure Hi (L Hi −L H 2 ∼ 0.03 pc) is relatively thin compared to the cloud with A V = 1 mag (∼ 0.9 pc). N(H 2 ) and n(H 2 ) can then be determined through C + excitation analysis after adopting the ratio L H 2 /L Hi = 1. The uncertainty caused by the value of L H 2 /L Hi is discussed in detail in Section 4.3.
This paper is organized as follows. In Section 2, we describe our observations and data. In Section 3, we present our procedure to identify DMG clouds. In Section 4, we present derived spatial distribution, Hi excitation temperatures, column and volume densities of Hi and H 2 of identified DMG clouds from Hi and C + analysis. In Section 5, we present derived DMG cloud properties. The discussion and summary are presented in Section 6 and Section 7, respectively.

C +
The Herschel Open Time Key Program, GOT C+ observed C + -158 µm line toward 452 lines of sight toward the Galactic plane (e.g., Langer et al. 2010). Most of the lines of sight are within 1 degree of the Galactic plane in latitude except for a small fraction of lines of sight in the outer Galaxy that are within 2 degrees. Longitude distribution of all lines of sight can be found in Figure 1 of L14. We obtained public C + data from Herschel Science Archive (HSA) with the kind aid of J. Pineda. The angular resolution of the C + observations is 12 ′′ . The data have already been smoothed to a channel width of 0.8 km s −1 with an average root mean square (rms) of 0.1 K. A detailed description of the GOT C+ program and the data can be found in Pineda et al. (2013) and L14.

CO
For lines of sight of GOT C+ in the Galactic longitude -175.5 • ≤ l ≤ 56.8 • , J = 1 → 0 transitions of 12 CO, 13 CO, and C 18 O were observed with the ATNF Mopra Telescope (see Pineda et al. 2013 and L14 for details). The Mopra data have an angular resolution of 33 ′′ . Two channels of CO spectrum were smoothed into one to derive a comparable velocity resolution to that of Hi spectra. The typical rms values are 0.44 K for 12 CO per 0.7 km s −1 , 0.18 K for 13 CO per 0.74 km s −1 , and 0.21 K for C 18 O per 0.73 km s −1 .
For those GOT C+ sightlines (56.8 • < l < 184.5 • ) that are out of the Mopra sky coverage, we obtained J = 1 → 0 transitions of 12 CO, 13 CO, C 18 O with the Delingha 13.7 m telescope. Full width at half power of Delingha telescope is about 60 ′′ . The observations were made between May 9 and 14 2014, using the configuration of 1 GHz bandwidth and 61 kHz channel resolution (velocity resolution of ∼ 0.16 km s −1 ). The data were reduced with GILDAS/CLASS 1 data analysis software and were smoothed to ∼0.8 km s −1 to be consistent with velocity resolution of Hi spectra. The derived rms values are 0.16 K for 12 CO per 0.79 km s −1 and 0.09 K for both 13 CO and C 18 O per 0.83 km s −1 .

Radio continuum
To calculate the excitation temperature from the HINSA features, the background continuum temperature $T_{\rm c}$ is needed. The Milky Way background continuum temperature is estimated to be ∼ 0.8 K in the L band (e.g., Winnberg et al. 1980). The total $T_{\rm c}$, containing contributions from the cosmic microwave background (2.7 K; Fixsen 2009) and the Milky Way, is estimated to be 3.5 K. However, $T_{\rm c}$ = 3.5 K is only valid for lines of sight toward high Galactic latitudes; $T_{\rm c}$ in the Galactic plane is seriously affected by continuum sources, e.g., H II regions. We adopted 1.4 GHz continuum data from the Continuum HI Parkes All-Sky Survey (CHIPASS; Calabretta, Staveley-Smith, & Barnes 2014) with an angular resolution of 14.4′ and a sensitivity of 40 mK to derive $T_{\rm c}$. The CHIPASS covers the sky south of declination +25°, which corresponds to −180° < l < 68° in the Galactic plane. In 68° < l < 175°, continuum data from CGPS with an rms of ∼ 0.3 mJy beam$^{-1}$ at 1420 MHz were utilized.
1 http://www.iram.fr/IRAMFR/GILDAS

Procedures for HINSA identification and Gaussian fitting
As shown in Figure 1, the relations between HI, C$^+$, and CO are complicated. For example, the cloud at $V_{\rm lsr}$ of −52 km s$^{-1}$ toward G337.0+0.5 has HI, $^{12}$CO, and C$^+$ emission. In contrast, the cloud at −44 km s$^{-1}$ toward G337.0+0.5 has only HI and $^{12}$CO, but no C$^+$ emission. Our focus in this study is DMG clouds that have C$^+$ emission with corresponding HINSA features, but without CO emission. The first step is to identify DMG-HINSA candidates showing C$^+$ emission and HI depressions but no obvious CO emission. By eye, we found 377 such candidates toward 243 sightlines out of a total of 452 in the GOT C+ program. The candidates were further filtered by the following procedures:
1. Depression features are common in Galactic HI spectra. They can be caused by temperature fluctuations, gap effects between multiple emission lines, absorption toward continuum sources, or cooling through collision with H$_2$ (HINSA) as described in Section 1. We checked the HI channel maps around the depression velocity to ascertain whether an HI depression feature is HINSA. A HINSA cloud should appear as a region colder than its surroundings in the HI channel map at its absorption velocity. Moreover, the colder region should be visible in maps of adjacent velocity channels (≥ 2). Checking the channel maps is necessary because non-HINSA features with an obvious HI spectral depression are common. Examples of HINSA and non-HINSA features are shown in Figure 2. We rejected more than half of the HI depression features as fake HINSA features after this inspection.
2. After visual inspection, we employed a quantitative inspection of the absorption to weed out confusion originating from temperature fluctuations. The HI spectrum toward the GOT C+ sightline was labeled the ON spectrum. Background HI emission arising from behind the foreground absorption cloud was derived by averaging spectra of nearby positions around the absorption cloud and was labeled the OFF spectrum. The nearby positions were selected from regions with HI emission contiguous with the ON position and at about 5 arcmin from the cloud boundary. An absorption signal in the ON spectrum is seen as an emission feature in the OFF−ON spectrum. The component in the residual OFF−ON spectrum is contributed by the foreground cold HI cloud (e.g., around −50 km s$^{-1}$ toward G132.5-1.0 in Figure 1). HI ON spectra in velocity ranges where Galactic HI emission is absent (e.g., V ≥ 60 km s$^{-1}$ or V ≤ −20 km s$^{-1}$ in the HI spectrum of G207.2-1.0) were chosen to calculate the 1σ rms. HI OFF−ON signals with signal-to-noise ratio (S/N) greater than 3.0 were identified as absorption lines.
3. The rms values in different C$^+$ spectra vary owing to different integration times. Spectral ranges without obvious signals were chosen to calculate the 1σ rms. The typical 1σ rms of C$^+$ is listed in Table 1. The rms values of different C$^+$ spectra vary by as much as a factor of 1.5. Those C$^+$ signals with S/N greater than 2.5 were identified as C$^+$ emission lines, considering the generally weaker C$^+$ emission for clouds without CO emission.
The Gaussian fitting is sensitive to initial inputs, especially the number of Gaussian components and the central velocities of individual components. We developed an IDL code to do the Gaussian decomposition. In the code, the number of Gaussian components was automatically determined by the method presented in Lindner et al. (2015). The key is the solution of derivatives of the spectra, for which a regularization method is introduced. It is difficult to define a suitable regularization parameter, which controls the smoothness of the derivatives of the spectra. We chose a coarse regularization parameter that may introduce extraneous components. A visual check of all components was performed to remove obviously unreliable components.
The estimated parameters of the Gaussian components were input as initial conditions into the Gaussian fitting procedure gfit.pro, which was adopted from the Millennium Arecibo 21 cm absorption-line survey (Heiles & Troland 2003), to give the final fitting parameters of the decomposed HI and C$^+$ components.
We first fitted Gaussians to the Hi OFF-ON spectra around HINSA velocity because the HINSA components are easily recognized. In most cases, Hi OFF-ON spectra can be fitted with only one Gaussian component. In other cases, two components were used, and no case of three components was needed. The derived Hi parameters were used as initial conditions for C + emission fitting. Examples of Gaussian decomposition for Hi and C + spectra are shown in Figure 1.
The derived components were further filtered based on line widths. We required that an emission line should have at least two channels, corresponding to 1.6 km s −1 in C + and Hi spectra.
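The S/N and line-width cuts described above can be illustrated with a minimal Python sketch. It assumes a simple per-channel threshold, which is a simplification of the selection in the text (there the S/N cut applies to the identified signal and the width cut to the fitted component). The function names and the toy spectrum are hypothetical and are not taken from the authors' IDL code.

```python
import math

def rms(values):
    """Root mean square of the given channel values."""
    return math.sqrt(sum(v * v for v in values) / len(values))

def detected_runs(spectrum, noise_channels, snr_threshold, min_channels=2):
    """Return (first, last) channel indices of contiguous runs above threshold.

    spectrum       : brightness temperatures per channel (e.g., HI OFF-ON or C+)
    noise_channels : indices of signal-free channels used for the 1-sigma rms
    snr_threshold  : 3.0 for HI OFF-ON and 2.5 for C+ in the text
    min_channels   : minimum run length (two channels in the text)
    """
    sigma = rms([spectrum[i] for i in noise_channels])
    runs, start = [], None
    for i, value in enumerate(spectrum):
        if value > snr_threshold * sigma:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_channels:
                runs.append((start, i - 1))
            start = None
    if start is not None and len(spectrum) - start >= min_channels:
        runs.append((start, len(spectrum) - 1))
    return runs

# Toy OFF-ON spectrum (hypothetical values, in K); channels 0-3 are signal free.
toy = [0.1, -0.1, 0.1, -0.1, 1.0, 1.2, 0.1, 0.9, -0.1]
print(detected_runs(toy, range(4), 3.0))
```

In this toy case the two-channel feature is kept, while the isolated single-channel spike is rejected by the width cut.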
A final check was necessary to determine whether the observed Hi gas can produce the observed C + emission alone; details are in Section 4.3. Finally, we ended up with 36 DMG clouds with relatively clearly visible HINSA features and Gaussian components.

Galactic spatial distribution
Kinematic distance was derived based on the Milky Way rotation curve (Brand & Blitz 1993). The galactocentric radius, $R$, for a cloud with Galactic longitude, $l$, latitude, $b$, and radial velocity along the line of sight, $V_{\rm los}$, is given by
$$R = R_\odot \sin(l)\cos(b)\,\frac{V_R}{V_{\rm los} + V_\odot \sin(l)\cos(b)}, \qquad (1)$$
where $V_R$ is the orbital velocity at $R$. $V_\odot = 220$ km s$^{-1}$ is the local standard of rest (LSR) orbital velocity of the Sun at $R_\odot$ of 8.5 kpc as recommended by the International Astronomical Union (IAU); $V_R/V_\odot = a_1 (R/R_\odot)^{a_2} + a_3$ with $a_1 = 1.00767$, $a_2 = 0.0394$, and $a_3 = 0.00712$ (Brand & Blitz 1993). Then the distance to the cloud, $d$, can be expressed as a function of $R$.
In the outer Galaxy ($R > R_\odot$), the solution is unique,
$$d = \frac{R_\odot \cos(l) + \sqrt{R^2 - R_\odot^2 \sin^2(l)}}{\cos(b)}.$$
In the inner Galaxy ($R < R_\odot$), there exists a kinematic distance ambiguity (KDA) with two simultaneous solutions for a velocity along a line of sight,
$$d = \frac{R_\odot \cos(l) \pm \sqrt{R^2 - R_\odot^2 \sin^2(l)}}{\cos(b)}.$$
There are three main resolutions of the KDA: (1) HI absorption against bright pulsars (Koribalski et al. 1995) or against H II regions with well-known distances (Kolpak et al. 2003); (2) judgement of the different angular extent of the cloud at the near and far kinematic distances (e.g., Clemens et al. 1988); and (3) the HINSA method. Clouds at the near distance tend to show HINSA features while clouds at the far distance do not because of the lack of absorption background (Roman-Duval et al. 2009, Wienen et al. 2015). A comparison of the optical image with the $^{13}$CO distribution for GRSMC 45.6+0.3 supports this premise. While solutions 1 and 2 are limited to sources satisfying specific conditions, solution 3 can be applied to more sources. To test the validity of our distance calculation, we compared our calculated kinematic distances with maser trigonometric parallax distances for four sources listed in Table 3 of Roman-Duval et al. (2009). The two kinds of distances are consistent to within 5%. We took the near distance value for our sources located in the inner Galaxy. The distance thus derived was used to calculate the background HI fraction $p$ in Equation 2 in Section 4.2.
The above distance estimates have the following caveats: (1) There may exist enough background for a cloud to show HINSA, even at the far distance; for example, such a background can be provided by spiral density waves (Gibson et al. 2002, 2005a). (2) The existence of cloud-to-cloud velocity dispersion of about 3 km s$^{-1}$ (Clemens 1985) adds uncertainty to the one-to-one mapping of distance to velocity. Streaming motions of 3 km s$^{-1}$ will introduce an uncertainty of $\lesssim 220$ pc for a cloud with $(l, b) = (45^\circ, 0^\circ)$ and an LSR velocity of 40 km s$^{-1}$.
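The distance calculation can be sketched in Python as follows. This is a minimal illustration, not the authors' code: it assumes the standard kinematic relation $V_{\rm los} = (R_\odot V_R/R - V_\odot)\sin l \cos b$ together with the Brand & Blitz (1993) rotation-curve coefficients quoted above, and it solves for $R$ by fixed-point iteration (an implementation choice made here). All function names are hypothetical.

```python
import math

R_SUN = 8.5      # kpc, IAU value quoted in the text
V_SUN = 220.0    # km/s, IAU value quoted in the text
A1, A2, A3 = 1.00767, 0.0394, 0.00712   # Brand & Blitz (1993) coefficients

def rotation_speed(r_kpc):
    """Orbital velocity V_R (km/s) at galactocentric radius r_kpc."""
    return V_SUN * (A1 * (r_kpc / R_SUN) ** A2 + A3)

def galactocentric_radius(l_deg, b_deg, v_los, n_iter=50):
    """Solve the kinematic relation for R (kpc) by fixed-point iteration."""
    s = math.sin(math.radians(l_deg)) * math.cos(math.radians(b_deg))
    r = R_SUN  # starting guess; the curve is nearly flat, so this converges fast
    for _ in range(n_iter):
        r = R_SUN * s * rotation_speed(r) / (v_los + V_SUN * s)
    return r

def kinematic_distances(l_deg, b_deg, v_los):
    """Return (near, far) distances in kpc.

    For R > R_SUN the 'near' root is negative and only 'far' is physical.
    """
    r = galactocentric_radius(l_deg, b_deg, v_los)
    l = math.radians(l_deg)
    disc = r * r - (R_SUN * math.sin(l)) ** 2
    if disc < 0.0:
        raise ValueError("velocity exceeds the tangent-point velocity")
    root = math.sqrt(disc)
    cos_b = math.cos(math.radians(b_deg))
    return (R_SUN * math.cos(l) - root) / cos_b, (R_SUN * math.cos(l) + root) / cos_b

near_40, far_40 = kinematic_distances(45.0, 0.0, 40.0)
near_43, _ = kinematic_distances(45.0, 0.0, 43.0)
print(near_40, far_40, near_43 - near_40)
```

For the example sightline in the text, $(l, b) = (45^\circ, 0^\circ)$ and 40 km s$^{-1}$, a 3 km s$^{-1}$ velocity shift moves the near distance by roughly 0.2 kpc in this sketch, in line with the uncertainty quoted above.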
Figure 3 shows the spatial distribution of 36 DMG clouds in the Galactic plane. Four Galactic spiral arms revealed by distributions of star-forming complexes in Russeil (2003) are also drawn. It can be seen that most clouds are located between 311 • and 55 • in Galactic longitude. The two ends of the longitude range correspond to tangent directions along Scutum-Crux Arm and Sagittarius Arm, respectively. Selection effect may contribute to this. Foreground clouds preferentially exhibit HINSA features when they are backlit by warmer Hi emerging from the Galactic bar and spiral arms.

Analysis of HINSA
The excitation temperature of the cold HI absorption cloud can be derived as (Li & Goldsmith 2003)
$$T_{\rm ex} = T_{\rm c} + \frac{1}{1-\tau_f}\left[p\,T_{\rm HI} - \frac{T_{\rm ab}}{1-e^{-\tau_{\rm HI}}}\right], \qquad (2)$$
where $T_{\rm c}$ is the background continuum temperature derived from CHIPASS and CGPS continuum data (Section 2.4); $p$ is the HI fraction behind the foreground cold cloud; $T_{\rm HI}$ is the reconstructed background HI brightness temperature without absorption of the foreground cold cloud; and $T_{\rm ab}$ is the absorption brightness temperature. The temperatures $T_{\rm HI}$ and $T_{\rm ab}$ are shown in the spectra toward G207.2-1.0 in Figure 1; $\tau_f$, the foreground HI optical depth, was adopted as 0.1; and $\tau_{\rm HI}$ is the optical depth of HI in the cold cloud. Infinite $\tau_{\rm HI}$ results in an upper limit of the excitation temperature, $T^{\rm upp}_{\rm ex}$. Kolpak et al. (2002) showed an average optical depth of 1 for clouds in the spatial range between Galactic radii of 4 and 8 kpc. As seen from Figure 3, most of our clouds are located in that spatial range. Thus it is reasonable to assume $\tau_{\rm HI} = 1$ for our clouds. The uncertainties of adopting different $\tau_{\rm HI}$ are discussed further in Section 5.1.
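A short Python sketch of the excitation-temperature calculation follows. It assumes the absorption relation of Li & Goldsmith (2003) in the form $T_{\rm ab} = [p\,T_{\rm HI} + (T_{\rm c} - T_{\rm ex})(1-\tau_f)](1 - e^{-\tau_{\rm HI}})$, solved for $T_{\rm ex}$. The function name and the input values are hypothetical and chosen only for illustration.

```python
import math

def excitation_temperature(t_c, p, t_hi, t_ab, tau_hi, tau_f=0.1):
    """T_ex (K) of the cold HI cloud.

    t_c    : background continuum temperature (K)
    p      : fraction of HI emission arising behind the cloud
    t_hi   : reconstructed background HI brightness temperature (K)
    t_ab   : absorption brightness temperature (K)
    tau_hi : HI optical depth of the cold cloud (math.inf gives the upper limit)
    tau_f  : foreground HI optical depth (0.1 adopted in the text)
    """
    return t_c + (p * t_hi - t_ab / (1.0 - math.exp(-tau_hi))) / (1.0 - tau_f)

# Illustrative inputs only (not values from Table 2).
t_tau1 = excitation_temperature(3.5, 0.86, 80.0, 20.0, 1.0)
t_upper = excitation_temperature(3.5, 0.86, 80.0, 20.0, math.inf)
print(t_tau1, t_upper)
```

Under this relation a larger $\tau_{\rm HI}$ gives a higher $T_{\rm ex}$ for fixed absorption, so the infinite-depth case is the upper limit, and a smaller $p$ (far distance) lowers $T_{\rm ex}$.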
The Galactic HI spatial distribution and the positions of the DMG clouds are necessary for calculating $p$. The Galaxy was divided into a set of concentric rings, with a galactocentric radius $R$ and radius width $\Delta R = 1$ kpc. The HI surface density $\Sigma(r)$ of each concentric ring was assumed to be constant and distributed as in Figure 10 of Nakanishi & Sofue (2003). The maximum galactocentric radius of the Galaxy was chosen as 25 kpc. The spatial information derived in Section 4.1 was applied here. The HI fraction behind the foreground cold cloud is
$$p = \frac{\int_{\rm behind\ cloud} \Sigma(r)\,dr}{\int_{\rm entire\ sightline} \Sigma(r)\,dr},$$
where $\int_{\rm entire\ sightline} \Sigma(r)\,dr$ is the total integrated HI surface density along a sightline and $\int_{\rm behind\ cloud} \Sigma(r)\,dr$ is the integrated HI surface density behind the cloud.
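The ring integration for $p$ can be sketched as below. This is an illustration under stated assumptions: the sightline lies in the Galactic plane ($b = 0$), the integral is a midpoint sum along the sightline, and the ring surface densities are supplied by the caller. The Nakanishi & Sofue (2003) values are not reproduced here, so the uniform disk in the example is purely hypothetical.

```python
import math

R_SUN = 8.5  # kpc

def background_fraction(l_deg, d_cloud, sigma_rings, ring_width=1.0,
                        r_max=25.0, step=0.01):
    """Fraction p of the HI surface density lying behind a cloud at d_cloud (kpc).

    sigma_rings : HI surface density per concentric ring (any consistent unit),
                  ring i covering [i*ring_width, (i+1)*ring_width) kpc
    """
    cos_l = math.cos(math.radians(l_deg))
    behind = total = 0.0
    s = 0.5 * step                      # midpoint of the first path element
    while True:
        r = math.sqrt(R_SUN ** 2 + s ** 2 - 2.0 * R_SUN * s * cos_l)
        if r >= r_max:                  # sightline has left the adopted disk
            break
        ring = min(int(r / ring_width), len(sigma_rings) - 1)
        total += sigma_rings[ring] * step
        if s > d_cloud:
            behind += sigma_rings[ring] * step
        s += step
    return behind / total

# Hypothetical uniform disk: p reduces to the path-length fraction behind the cloud.
uniform = [1.0] * 25
print(background_fraction(180.0, 1.65, uniform))
```

For a uniform disk toward the anticenter the path length is 16.5 kpc, so a cloud at 1.65 kpc gives p close to 0.9, which is a convenient sanity check before real ring values are substituted.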
Derived T ex , (τ Hi = 1) and T upp ex are shown in column (4) and (5) of Table 2, respectively. Excitation temperature distributions of DMG are shown in Figure 4. T ex , (τ Hi = 1) ranges from 20 to 92 K, with a median value of 55 K. This median value is comparable to the observed median temperature of 48 K for 143 components of cold neutral medium (Heiles & Troland (2003)), which were decomposed from emission/absorption spectra toward 48 continuum sources. Moreover, this median value is consistent with the calculated temperature range of ∼ 50 − 80 K in the CO-dark H 2 transition zone in Wolfire et al. (2010). The derived lowest T ex , (τ Hi = 1) is 20.3 K for G028.7-1.0.
The uncertainties of $T_{\rm ex}$ that result from $p$ are associated with two aspects. The first is our adoption of the average HI surface density in each concentric ring, which is idealized for two reasons. Firstly, and probably more importantly, there is localized HI structure, some of which is associated with the very dark gas we are studying; excess HI associated with this structure can lie in front of or behind the HINSA. Secondly, such a smooth HI distribution on large scales neglects features such as spiral structure. The second aspect is the distance ambiguity of the cloud, which may introduce an uncertainty of about a factor of two. For instance, the near and far distances of G025.2+0.0 are 2.4 kpc and 13.0 kpc. The corresponding values of $p$ are 0.86 and 0.59, resulting in $T_{\rm ex}$ of 52.9 K and 28.7 K, respectively. As we discussed in Section 4.1, we prefer the near distance because of the HI absorption feature in our sources. Thus the derived HI excitation temperature is an upper limit because of our adoption of the near distance.
N.Y. Tang et al.: Physical properties of CO-dark molecular gas traced by C$^+$
With the condition of $h\nu/kT_{\rm ex} \ll 1$, the HI column density $N({\rm HI})$ is related to the HI optical depth $\tau_{\rm HI}$ and excitation temperature $T_{\rm ex}$ through
$$N({\rm HI}) = 1.82\times10^{18}\int T_{\rm ex}\,\tau_{\rm HI}\,d\upsilon\ \ {\rm cm^{-2}}. \qquad (3)$$
We derived $N({\rm HI})$ by adopting $T_{\rm ex}(\tau_{\rm HI}=1)$ and $\tau_{\rm HI} = 1$, where $T_{\rm ex}$ is the excitation temperature of the cloud. The values of $N({\rm HI})$ are shown in column (8) of Table 2, assuming $\tau_{\rm HI} = 1$. The median value of $N({\rm HI})$ is $3.1\times10^{20}$ cm$^{-2}$. As seen in Equation 2, $T_{\rm ex}$ depends on $\tau_{\rm HI}$. The uncertainty in $\tau_{\rm HI}$ would strongly affect $N({\rm HI})$ and the DMG fraction, as seen in Section 5.1.
The HINSA angular scale, ∆θ, can be measured from Hi channel maps. Though most HINSA have a complex nonspherical structure, we used a geometric radius to model the HINSA region in Hi channel map. For a cylinder structure, we chose the width as cloud diameter. For some HINSA clouds without a clear boundary, there may exist larger uncertainties. Combining with the calculated distance d in Section 4.1, we can determine the spatial scale of cloud L Hi =∆θ · d. Hi volume density can then be calculated through n(Hi) = N(Hi)/L Hi . The derived n(Hi) are shown in column (6) of Table 2 with a median value of 34 cm −3 , which is consistent with the typical CNM volume density, n(Hi) CNM ∼ 56 cm −3 (Heiles & Troland 2003).
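As an illustration of how the HI column and volume densities follow from these relations, the sketch below evaluates $N({\rm HI}) = 1.82\times10^{18}\int T_{\rm ex}\tau_{\rm HI}\,d\upsilon$ and $n({\rm HI}) = N({\rm HI})/(\Delta\theta\cdot d)$. It assumes a Gaussian optical-depth profile with peak $\tau_{\rm HI}$ and a constant $T_{\rm ex}$ across the line; the 3 km s$^{-1}$ line width, the 10 arcmin angular size, and the 3 kpc distance are hypothetical inputs, not values from Table 2.

```python
import math

# Integral of a unit-peak Gaussian in units of its FWHM (about 1.0645).
GAUSS_AREA = math.sqrt(math.pi / (4.0 * math.log(2.0)))
PC_CM = 3.0857e18  # centimetres per parsec

def hi_column_density(t_ex, tau_peak, fwhm_kms):
    """N(HI) in cm^-2 for a Gaussian tau profile and constant T_ex (K)."""
    return 1.82e18 * t_ex * tau_peak * GAUSS_AREA * fwhm_kms

def cloud_size_cm(theta_arcmin, d_kpc):
    """Linear size L = theta * d in cm (small-angle approximation)."""
    return math.radians(theta_arcmin / 60.0) * d_kpc * 1.0e3 * PC_CM

def hi_volume_density(n_col, theta_arcmin, d_kpc):
    """n(HI) = N(HI) / L in cm^-3."""
    return n_col / cloud_size_cm(theta_arcmin, d_kpc)

n_col = hi_column_density(55.0, 1.0, 3.0)         # median T_ex, tau_HI = 1
print(n_col, hi_volume_density(3.1e20, 10.0, 3.0))
```

With the median $T_{\rm ex}$ of 55 K and $\tau_{\rm HI} = 1$, a 3 km s$^{-1}$ line gives a column density of about $3\times10^{20}$ cm$^{-2}$, the same order as the median quoted above.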

Analysis of C +
C + is one of the main gaseous forms of carbon elements in the Galactic ISM. It exists in ionized medium, diffuse atomic clouds, and diffuse/translucent molecular gas regions where the phase transition between atomic and molecular gas happens (e.g., Pineda et al. 2013). The C + 158 µm line intensity, a major cooling line of the CNM, is sensitive to physical conditions. This line is an important tool for tracing star formation activity and ISM properties in the Milky Way and galaxies (e.g., Bosellietal 2002, Stacey et al. 2010).
The C$^+$ 158 µm line is mainly excited by collisions with electrons, atomic hydrogen, and molecular hydrogen. The collisional rate coefficient (s$^{-1}$/cm$^{-3}$) with electrons is ∼ 100 times larger than that with atomic and molecular hydrogen because of the advantage of Coulomb focusing (Goldsmith et al. 2012). The C$^+$ emission from ionized gas contributes only 4% of the total C$^+$ 158 µm flux in the Milky Way (Pineda et al. 2013). Our selected clouds have $T_{\rm ex}$ less than 100 K and should be cold neutral medium (CNM) without a high percentage of ionization. In neutral regions, HI and H$_2$ dominate collisions with C$^+$. The C$^+$ intensity is given by Equation 4 (Goldsmith et al. 2012), where $\chi_{\rm H}({\rm C^+})$ is the C$^+$ abundance relative to hydrogen; $\chi_{\rm H}({\rm C^+}) = 5.51\times10^{-4}\exp(-R_{\rm gal}/6.2)$ is valid for the spatial range 3 kpc < $R_{\rm gal}$ < 18 kpc (Wolfire et al. 2003), and $\chi_{\rm H}({\rm C^+}) = 1.5\times10^{-4}$ was adopted outside that range. The parameters $n_{\rm cr}({\rm HI})$ and $n_{\rm cr}({\rm H_2})$ are the critical densities of HI and H$_2$, respectively; $n_{\rm cr}({\rm HI}) = 5.75\times10^{4}/(16 + 0.35T^{0.5} + 48T^{-1})$ cm$^{-3}$ and $n_{\rm cr}({\rm H_2}) = 2n_{\rm cr}({\rm HI})$ were adopted from Goldsmith et al. (2010); $n_{\rm HI}$ and $n_{\rm H_2}$ are the volume densities of HI and H$_2$, respectively; $\Delta E/k$, the transition temperature between $^2P_{3/2}$ and $^2P_{1/2}$ of C$^+$, is 91.26 K; and $T_{\rm k}$ is the gas kinetic temperature. It is equivalent to $T_{\rm ex}$ of HI because HI 21 cm emission is always in local thermodynamic equilibrium (LTE) in gas with density $\gtrsim 10$ cm$^{-3}$ owing to the low HI critical density (∼ $10^{-5}$ cm$^{-3}$). We estimated $n({\rm H_2}) = N({\rm H_2})/L_{\rm H_2}$, where $L_{\rm H_2}$ is the diameter of the H$_2$ layer in the cloud and $L_{\rm H_2} = L_{\rm HI}$ was adopted as already discussed in Section 1. Thus $N({\rm H_2})$ and $n({\rm H_2})$ can be determined from Equation 4, and the results are shown in columns (7) and (9) of Table 2, respectively. The median value of $n({\rm H_2})$ is $2.3\times10^{2}$ cm$^{-3}$. The median value of $N({\rm H_2})$ is $2.1\times10^{21}$ cm$^{-2}$.
[Figure caption fragment: Red rectangles indicate median visual extinctions for excitation temperature bins of 10 K. The physical widths and heights of the rectangles are 10 K and 1 mag, respectively.]
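The temperature and density dependence of the C$^+$ excitation can be illustrated with a short sketch. The critical-density fit and $\Delta E/k = 91.26$ K are taken from the text. The level-population factor $[1 + 0.5\,(1 + n_{\rm cr}/n)\,e^{\Delta E/kT_{\rm k}}]^{-1}$ is the standard optically thin two-level result for statistical weights 4 and 2 with the background radiation neglected; that it enters the intensity relation in exactly this form, and the omission of the overall numerical prefactor, are assumptions of this sketch. Function names are hypothetical.

```python
import math

DELTA_E_K = 91.26  # K, energy of the C+ 2P3/2 - 2P1/2 transition (from the text)

def n_crit_hi(t_k):
    """Critical density for collisions with HI (cm^-3), fit quoted in the text."""
    return 5.75e4 / (16.0 + 0.35 * math.sqrt(t_k) + 48.0 / t_k)

def n_crit_h2(t_k):
    """Critical density for collisions with H2 (cm^-3), taken as 2 n_cr(HI)."""
    return 2.0 * n_crit_hi(t_k)

def upper_level_fraction(t_k, n, n_crit):
    """Fraction of C+ in the upper level for a single collision partner.

    Assumes an optically thin two-level system with g_u/g_l = 2 and
    negligible background radiation.
    """
    return 1.0 / (1.0 + 0.5 * (1.0 + n_crit / n) * math.exp(DELTA_E_K / t_k))

# Median conditions quoted in the text: T_k = 55 K, n(H2) = 230 cm^-3.
print(n_crit_hi(55.0), upper_level_fraction(55.0, 230.0, n_crit_h2(55.0)))
```

At the median conditions only about one percent of C$^+$ ions are in the upper level, which shows why the inferred H$_2$ column density is sensitive to the adopted temperature and density.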
Visual extinction is connected with the total proton column density through $A_{\rm V} = 5.35\times10^{-22}\,[N({\rm HI}) + 2N({\rm H_2})]$ mag, assuming a standard Galactic interstellar extinction curve of $R_{\rm V} = A_{\rm V}/E(B-V) = 3.1$ (Bohlin, Savage, & Drake 1978). The corresponding visual extinction values toward each source are shown in column (11) of Table 2. In Figure 5, we plot $A_{\rm V}$ as a function of $T_{\rm ex}$. It is clear that $A_{\rm V}$ decreases as $T_{\rm ex}$ increases.
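A one-function sketch of the extinction conversion is given below. The inputs are the sample medians of $N({\rm HI})$ and $N({\rm H_2})$ quoted in the text, combined only for illustration, since the medians need not belong to the same cloud. The function name is hypothetical.

```python
def visual_extinction(n_hi, n_h2):
    """A_V in mag from HI and H2 column densities in cm^-2 (R_V = 3.1 assumed)."""
    return 5.35e-22 * (n_hi + 2.0 * n_h2)

# Median column densities quoted in the text.
print(visual_extinction(3.1e20, 2.1e21))
```

The result, about 2.4 mag, lies just below the 2.7 mag boundary that separates the low and high extinction groups.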
The ratio between L H 2 and L Hi is a key relation during the above calculation, but may vary for clouds with different visual extinction and different PDR models. We took another value L H 2 /L Hi = 0.8, which is the possible lower value of PDR with A V < 0.2mag, to estimate the uncertainty. With ratios of 1.0 and 0.8, the maximum differences of N(H 2 ), A V , and DMG fraction (Section 5.1) are 10%, 10%, and 5%, respectively. Thus the value of the ratio L H 2 /L Hi does not affect the physical parameters associated with H 2 too much.

Observed properties of dark gas clouds
Physical properties and spatial distribution of DMG are fundamental quantities that affect our understanding of the transition between diffuse atomic clouds and dense molecular clouds. If extra H 2 not traced by CO is needed to explain the observed C + intensity, the cloud is considered a DMG cloud.
Following Equation (7) in L14, the mass fraction of DMG in the cloud is defined as
$$f_{\rm DMG} = \frac{2N(\mbox{CO-dark H}_2)}{N({\rm HI}) + 2N({\rm H_2})}, \qquad (5)$$
where $N({\rm H_2}) = N(\mbox{CO-dark H}_2) + N(\mbox{CO-traced H}_2)$. In this paper, $N(\mbox{CO-traced H}_2)$ is set to 0 because of the absence of CO detections in our sample.
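The DMG fraction can be evaluated with a few lines of Python. This sketch assumes the definition $f_{\rm DMG} = 2N(\mbox{CO-dark H}_2)/[N({\rm HI}) + 2N({\rm H_2})]$ with $N({\rm H_2})$ the sum of the CO-dark and CO-traced parts. The function name is hypothetical, and the inputs are the sample medians quoted in the text, used only as an illustration.

```python
def dmg_fraction(n_hi, n_h2_dark, n_h2_co=0.0):
    """Mass fraction of CO-dark molecular gas.

    n_hi      : HI column density (cm^-2)
    n_h2_dark : CO-dark H2 column density (cm^-2)
    n_h2_co   : CO-traced H2 column density (cm^-2); zero for this sample
    """
    return 2.0 * n_h2_dark / (n_hi + 2.0 * (n_h2_dark + n_h2_co))

print(dmg_fraction(3.1e20, 2.1e21))
```

With the median column densities, DMG makes up more than 90% of the gas mass, consistent with the high fractions reported for this sample.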
The uncertainty of $f_{\rm DMG}$ comes from two aspects. The first is the measurement and fitting of the HI and C$^+$ spectra, which were estimated to contribute less than ∼ 10% for all the sources. The second is the uncertainty in the adopted $\tau_{\rm HI}$. As seen in Section 4.2, $\tau_{\rm HI}$ greatly affects the HI column density, and thus $f_{\rm DMG}$. It is necessary to investigate the available parameter space. The parameters are constrained by the following three conditions: total Galactic dust extinction along the sightline. We adopted extinction values from the all-sky dust extinction database (Schlafly & Finkbeiner 2011), in which dust extinction was derived by analyzing the colors of stars, E(B−V), from the Sloan Digital Sky Survey with a reddening ratio
The relations between $f_{\rm DMG}$, $T_{\rm ex}$, and $\tau_{\rm HI}$ for the 36 sources are shown in Figure 6. It is worthwhile to note that the upper values of $\tau_{\rm HI}$ are overestimated and the lower values of $\tau_{\rm HI}$ are underestimated, as $A_{\rm V}({\rm dust})$ is the total value along the sightline of each source. The parameter $A_{\rm V}({\rm dust})$ contains contributions from CO-traced molecular gas at other velocities besides those with dark gas. According to Equation (2), a higher $T_{\rm ex}$ is required to produce a fixed absorption strength when $\tau_{\rm HI}$ increases. This is reflected in Figure 6(d). According to Equation (3), a larger $\tau_{\rm HI}$ produces a larger $N({\rm HI})$.
In Figure 6(b), we present $f_{\rm DMG}$ versus $\tau_{\rm HI}$. When $\tau_{\rm HI}$ increases, $f_{\rm DMG}$ decreases to a nonzero minimum value. This can be understood as follows. C$^+$ is mainly excited by collisions with HI and H$_2$. According to Equation (4), for a fixed C$^+$ intensity, increasing $N({\rm HI})$ and $T_{\rm k}$ ($T_{\rm k} = T_{\rm ex}$ was adopted) increases the contribution of HI collisions to the C$^+$ emission. The contribution required from H$_2$ collisions therefore decreases, which lowers the H$_2$ column density and the H$_2$ fraction in the cloud. The lower limits of $\tau_{\rm HI}$ are less than 1.0 for all sources except G132.5-1.0 and G207.2-1.0, which have $\tau_{\rm HI}$ ranges of (1.7, 9.0) and (2.3, 6.2), respectively. For these two sources, we apply the median values $\tau_{\rm HI} = 5.4$ and 4.3, respectively. As seen in Figure 6(c) and 6(d), this selection does not affect $T_{\rm ex}$ and $f_{\rm DMG}$ much, as they lie in narrow ranges: (23.0, 28.2) K and (0.83, 0.98) for G132.5-1.0, and (37.7, 39.8) K and (0.86, 0.95) for G207.2-1.0. We apply $\tau_{\rm HI} = 0.5$ for G347.4+1.0 because its upper limit is 0.85. For the other 33 sources, we apply $\tau_{\rm HI} = 1.0$. Although this selection is arbitrary, we argue that it is reasonable for two reasons. First, the averaged HI optical depth between Galactic radii of 4 and 8 kpc is around 1.0 (Kolpak et al. 2002). Second, changing $\tau_{\rm HI}$ from 0.5 to 1.5 strongly affects the $f_{\rm DMG}$ value for only three sources. For the other sources, $f_{\rm DMG}$ has a minimum of $\ge 0.6$ in this $\tau_{\rm HI}$ range, implying a weak dependence on $\tau_{\rm HI}$ in the range [0.5, 1.5]. Thus we take a $\tau_{\rm HI}$ range of [0.5, 1.5] to represent the total uncertainty, since the uncertainties from $\tau_{\rm HI}$ are much greater than the measurement and fitting uncertainties.
The relation between $f_{\rm DMG}$ and $T_{{\rm ex},(\tau_{\rm HI}=1)}$ is shown in Figure 7. The relation between DMG fraction and gas excitation temperature is well described by the empirical relation
$$f_{\rm DMG} = -2.1\times10^{-3}\,T_{{\rm ex},(\tau_{\rm HI}=1)} + 1.0. \qquad (6)$$
The decreasing trend of $f_{\rm DMG}$ toward increasing $T_{{\rm ex},(\tau_{\rm HI}=1)}$ is clear. This result is consistent with Figure 7 of Rachford et al. (2009). With the FUSE telescope, Rachford et al. (2009) derived the total molecular hydrogen column density $N({\rm H_2})$ and rotational temperature $T_{01}$ directly through UV absorption of H$_2$ toward bright stars. These authors found that the molecular fraction $f = 2N({\rm H_2})/(2N({\rm H_2}) + N({\rm HI}))$ decreases from $\sim 0.8$ at $T_{01} = 45$ K to $\sim 0.0$ at $T_{01} = 110$ K with relatively large scatter. Although both results show a decreasing trend, $f_{\rm DMG}$ is still as high as 0.77 at $T_{\rm ex} = 110$ K in Equation 6, implying a flatter slope than that in Rachford et al. (2009). Our results are more physically meaningful because $N({\rm HI})$ and $T_{01}$ in Rachford et al. (2009) are values averaged along a line of sight.
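As a quick numerical check of the linear relation, the following minimal Python sketch evaluates $f_{\rm DMG} = -2.1\times10^{-3}\,T_{\rm ex} + 1.0$ at the extremes and median of the derived excitation temperatures and at 110 K. The function name `f_dmg_from_tex` is introduced here for illustration only and is not part of the original analysis.

```python
def f_dmg_from_tex(t_ex_k: float) -> float:
    """Empirical DMG fraction for tau_HI = 1: f = -2.1e-3 * T_ex + 1.0."""
    return -2.1e-3 * t_ex_k + 1.0


# Range (20-92 K) and median (55 K) of T_ex in the sample, plus the
# 110 K comparison point used for Rachford et al. (2009).
for t_ex in (20.0, 55.0, 92.0, 110.0):
    print(f"T_ex = {t_ex:5.1f} K -> f_DMG = {f_dmg_from_tex(t_ex):.2f}")
```

At 110 K the relation gives about 0.77, the value used in the comparison with Rachford et al. (2009).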
The relation between $f_{\rm DMG}$ and $N_{\rm H}$ is shown in Figure 8. It describes how the DMG fraction varies with extinction and is investigated in most theoretical papers. The data are fitted with the empirical relation
$$f_{\rm DMG} = 1.0 - 3.7\times10^{20}/N_{\rm H}. \qquad (7)$$
We compared this result with the cloud evolutionary model of Lee et al. (1996), who incorporated time-dependent chemistry and shielding of CO and H$_2$ in photodissociation clouds. Lee et al. (1996) split the cloud into 43 slabs. We applied their model through the following procedure. First, we calculated the total hydrogen column density $N_{\rm H}$ and total H$_2$ column density $N({\rm H_2})$ at the 43 slabs. Then the CO-traced H$_2$ was calculated through $N(\text{CO-traced H}_2) = N({\rm CO})/Z({\rm CO})$, where $Z({\rm CO})$ is the CO abundance relative to molecular hydrogen. We derived the DMG column density through $N({\rm DMG}) = N({\rm H_2}) - N(\text{CO-traced H}_2)$. Finally, the DMG fraction is $f_{\rm DMG} = 2N({\rm DMG})/N_{\rm H}$. $Z({\rm CO})$ varies significantly between environments, as shown in chemical models (e.g., Lee et al. 1996) and in observations toward diffuse gas clouds (Liszt & Pety 2012). In the calculation we adopted a constant $Z({\rm CO}) = 3.2\times10^{-4}$ (Sofia et al. 2004), which is an upper limit in the ISM. This leads to an upper limit on the DMG fraction from the model. We adopted model 1 of Lee et al. (1996), in which all hydrogen was originally in the atomic phase. The DMG fractions as a function of hydrogen column density at ages of $10^5$, $10^6$, $10^7$, and $10^8$ yr are shown in Figure 8 with dashed lines. Our results are consistent with the model results at an age of $10^7$ yr when $N_{\rm H} \le 5\times10^{21}$ cm$^{-2}$ ($A_{\rm V} \le 2.7$ mag). When $N_{\rm H} > 5\times10^{21}$ cm$^{-2}$, $f_{\rm DMG}$ decreases according to the modeled chemical evolution but still increases in our results. This difference persists even when we consider the data uncertainties. The 36 clouds were thus divided into two groups: a low extinction group with $N_{\rm H} \le 5\times10^{21}$ cm$^{-2}$ and a high extinction group with $N_{\rm H} > 5\times10^{21}$ cm$^{-2}$.
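The bookkeeping used to turn model column densities into a DMG fraction, together with the empirical relation $f_{\rm DMG} = 1.0 - 3.7\times10^{20}/N_{\rm H}$, can be sketched in a few lines of Python. The function names and the slab values in the example are hypothetical placeholders chosen for illustration; they are not taken from Lee et al. (1996).

```python
Z_CO = 3.2e-4  # constant CO abundance relative to H2 adopted in the text


def f_dmg_model(n_h_total, n_h2, n_co, z_co=Z_CO):
    """DMG fraction from model column densities (all in cm^-2)."""
    n_h2_co_traced = n_co / z_co      # N(CO-traced H2) = N(CO) / Z(CO)
    n_dmg = n_h2 - n_h2_co_traced     # N(DMG) = N(H2) - N(CO-traced H2)
    return 2.0 * n_dmg / n_h_total    # f_DMG = 2 N(DMG) / N_H


def f_dmg_empirical(n_h_total):
    """Empirical fit to the observed clouds: f = 1.0 - 3.7e20 / N_H."""
    return 1.0 - 3.7e20 / n_h_total


# Hypothetical slab: N_H = 5e21, N(H2) = 2e21, N(CO) = 1e16 cm^-2.
print(f"model slab : f_DMG = {f_dmg_model(5e21, 2e21, 1e16):.3f}")
print(f"empirical  : f_DMG = {f_dmg_empirical(5e21):.3f}")
```

Because $Z({\rm CO})$ is held at its upper limit, `f_dmg_model` returns an upper limit on the model DMG fraction, in line with the argument in the text.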
Planck Collaboration (2011) found an apparent excess of dust optical depth $\tau_{\rm dust}$ compared to the simulated $\tau^{\rm mod}_{\rm dust}$ in the $A_{\rm V}$ range of [0.37, 2.5] mag. The $A_{\rm V}$ values of $\sim 0.37$ mag and $\sim 2.5$ mag correspond to the threshold extinction of H$_2$ self-shielding and the threshold extinction of dust shielding for CO, respectively. When $A_{\rm V} > 2.5$ mag, the CO abundance increases, resulting in a decreasing DMG fraction, as expected from the chemical evolutionary model predictions at the age of $10^7$ yr in Figure 8. If the CO luminosity is too weak to be observed, this would lead to an increasing curve when $A_{\rm V} > 2.5$ mag. Indeed, Liszt & Pety (2012) found patchy CO emission in DMG regions with higher CO sensitivity. In order to estimate CO abundance limits in the high extinction group ($A_{\rm V} > 2.7$ mag) clouds, we assumed that CO is optically thin and in LTE. These two assumptions are reasonable given the nondetection of CO and $T_{\rm ex}/5.56\,{\rm K} \gg 1$. $^{12}$CO column densities were derived through $N(^{12}{\rm CO}) = 4.8\times10^{14}\int T_{\rm b}\,{\rm d}\upsilon$ cm$^{-2}$. We used an rms of $T_{\rm b} = 0.6$ K and a velocity resolution of 0.35 km s$^{-1}$ in our CO spectra. An upper limit on the $^{12}$CO column density of $N({\rm CO}) = 1.0\times10^{14}$ cm$^{-2}$ implies an upper limit on the CO abundance relative to H$_2$ of $Z^{\rm upp}_{\rm CO} = N({\rm CO})/N({\rm H_2}) = 2.1\times10^{-6}$ for $A_{\rm V} > 2.7$ mag; $Z^{\rm upp}_{\rm CO}$ is $6.6\times10^{2}$ times smaller than the canonical value of $3.2\times10^{-4}$ in the Milky Way (Sofia et al. 2004).
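The quoted upper limit on the CO column density appears to follow from the optically thin relation when the integrated intensity is taken as one rms over a single velocity channel. That reading is an assumption of this sketch rather than a statement from the text, and the function name `n_co_thin` is hypothetical.

```python
def n_co_thin(integrated_tb_k_kms: float) -> float:
    """Optically thin, LTE 12CO column density in cm^-2.

    Uses N(12CO) = 4.8e14 * integral(T_b dv), with the integral in K km/s.
    """
    return 4.8e14 * integrated_tb_k_kms


rms_tb_k = 0.6       # rms brightness temperature of the CO spectra (K)
channel_kms = 0.35   # velocity resolution (km/s)

# Assumed: the upper limit corresponds to one rms across one channel.
n_co_upper = n_co_thin(rms_tb_k * channel_kms)
print(f"N(CO) upper limit ~ {n_co_upper:.1e} cm^-2")
```

Under this assumption the result is close to the $1.0\times10^{14}$ cm$^{-2}$ given in the text.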
Our assumption of optically thin emission of CO in low $A_{\rm V}$ clouds is mostly empirical. This assumption can be quantified as follows. We smoothed the data to an rms of 0.44 K per 0.7 km s$^{-1}$. For a cloud with modest opacity, with $T_{\rm ex} = 10$ K, $T_{\rm bg} = 2.7$ K, and $\tau({\rm CO}) = 1$, the expected antenna temperature (for 50% main beam efficiency) is 2.1 K, which is well above our rms threshold.
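The antenna temperature estimate can be reproduced with the standard radiative transfer solution for a uniform slab, $T_{\rm mb} = [J(T_{\rm ex}) - J(T_{\rm bg})](1 - e^{-\tau})$ with $J(T) = T_0/(e^{T_0/T} - 1)$. The sketch below assumes $T_0 = h\nu/k \approx 5.53$ K for the CO $J=1$-0 line, a value that is not quoted in the text, and uses hypothetical function names.

```python
import math

T0_CO10_K = 5.53  # assumed h*nu/k for CO J=1-0 at 115.27 GHz (K)


def j_nu(t_k: float, t0_k: float = T0_CO10_K) -> float:
    """Rayleigh-Jeans equivalent temperature J(T) = T0 / (exp(T0/T) - 1)."""
    return t0_k / math.expm1(t0_k / t_k)


def antenna_temperature(t_ex_k, t_bg_k, tau, beam_eff):
    """Antenna temperature of a uniform slab seen against a background."""
    t_mb = (j_nu(t_ex_k) - j_nu(t_bg_k)) * (1.0 - math.exp(-tau))
    return beam_eff * t_mb


# Values from the text: T_ex = 10 K, T_bg = 2.7 K, tau = 1, 50% efficiency.
t_a = antenna_temperature(10.0, 2.7, 1.0, 0.5)
rms_k = 0.44  # rms per 0.7 km/s channel after smoothing
print(f"T_A = {t_a:.1f} K, about {t_a / rms_k:.1f} times the rms")
```

With these inputs the sketch returns roughly 2.1 K, several times the 0.44 K rms, so CO with $\tau \sim 1$ would have been detected.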
Thus we conclude that clouds in the high extinction group ($A_{\rm V} > 2.7$ mag) are CO-poor molecular clouds. The formation of these clouds is discussed in Section 6.1.
Fig. 8. ... Lee et al. (1996). Vertical dotted red line represents $N_{\rm H} = 5\times10^{21}$ cm$^{-2}$.
Fig. 9. Relation between volume density and excitation temperature. Blue, green, and red lines represent pressure $P/k$ of 6000, $1.4\times10^{4}$, and $4\times10^{4}$ K cm$^{-3}$, respectively.

Comparison between clouds in low and high extinction groups
We plotted the total gas volume density $n_{\rm gas} = n_{\rm HI} + n_{\rm H_2}$ as a function of $T_{\rm ex}$ for the 36 sources in Figure 9. The typical thermal pressure $P_{\rm th}$ of $6\times10^{3}$ K cm$^{-3}$ at a Galactic radius of 5 kpc (Wolfire et al. 2003), $P_{\rm th}$ of $1.4\times10^{4}$ K cm$^{-3}$ near the Galactic center (Wolfire et al. 2003), and an auxiliary $P_{\rm th}$ of $4\times10^{4}$ K cm$^{-3}$ are also shown. The median densities for the low extinction ($A_{\rm V} \le 2.7$ mag) and high extinction ($A_{\rm V} > 2.7$ mag) groups are 212.1 and 231.5 cm$^{-3}$, respectively. The median excitation temperatures for the low and high extinction groups are 64.8 and 41.9 K, respectively. Densities in the two groups are comparable, but excitation temperatures are lower in the high extinction group, resulting in lower thermal pressures in this group. We discuss the implication for cloud formation in Section 6.1.
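To make the pressure comparison concrete, the sketch below evaluates $P/k = n_{\rm gas} T_{\rm ex}$ from the group medians quoted above. This is an illustrative estimate only: it treats the gas as ideal, counts only $n_{\rm HI} + n_{\rm H_2}$, and multiplies medians of density and temperature, which is not the same as the median pressure of the individual clouds.

```python
def thermal_pressure_over_k(n_cm3: float, t_k: float) -> float:
    """Ideal-gas thermal pressure P/k = n * T in K cm^-3."""
    return n_cm3 * t_k


# (median n_gas in cm^-3, median T_ex in K) for each group, from the text.
groups = {
    "low extinction (A_V <= 2.7 mag)": (212.1, 64.8),
    "high extinction (A_V > 2.7 mag)": (231.5, 41.9),
}

pressures = {name: thermal_pressure_over_k(n, t) for name, (n, t) in groups.items()}
for name, p in pressures.items():
    print(f"{name}: P/k ~ {p:.2e} K cm^-3")
```

Both values lie between the $6\times10^{3}$ and $4\times10^{4}$ K cm$^{-3}$ reference lines of Figure 9, with the high extinction group lower.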

Assembly of molecular clouds
Molecular clouds can be formed either directly from the neutral medium or by assembling pre-existing, cold molecular clumps. The first scenario is commonly accepted (e.g., Hollenbach & McKee 1979). The second scenario is outlined by Pringle et al. (2001), who proposed that clouds form out of pre-existing, CO-dark, molecular gas. Compared to the first scenario, the second scenario allows for fast cloud formation in a few Myr, which was suggested by observations (Beichman et al. 1986; Lee, Myers & Tafalla 1999; Hartmann et al. 2001). The key problem for the second scenario is how molecular gas can exist before it is collected together. Pringle et al. (2001) argued that the pre-existing molecular gas should be cold ($< 10$ K) and shielded from photodissociation by patchy dust with $A_{\rm V}$ of $\sim 0.5$ mag (White et al. 1996; Berlind et al. 1997; Gonzalez et al. 1998; Keel & White 2001). Under $A_{\rm V}$ of $\sim 0.5$ mag, H$_2$ should exist in substantial amounts while CO has a very low abundance. This is because self-shielding thresholds of $A_{\rm V} = 0.02$ and 0.5 mag (Wolfire et al. 2010) are widely considered to be the conditions for maintaining stable populations of abundant H$_2$ and CO gas, respectively. Liszt et al. (2012) detected strong CO(1-0) emission (4-5 K) in regions with equivalent visual extinction less than 0.5 mag. There are two obvious possibilities. (1) The CO gas is transient. Such gas may also have been seen by Goldsmith et al. (2008), who found 40% of the total CO mass in Taurus in regions with low to intermediate $A_{\rm V}$ (mask 0 and 1 in their terminology). (2) The CO gas lies in a highly clumpy medium, whose apparent averaged extinction is lower because photons travel through the lower density interclump medium. When small agglomerations of molecular gas are compressed and heated by shocks, e.g., in a spiral arm, they become detectable. This scenario is supported by observations of GMC formation in spiral arms (Dobbs et al. 2008) and simulations of molecular cloud formation (e.g., Clark et al. 2012).
In Section 5.1, we showed that clouds in the high extinction group ($A_{\rm V} > 2.7$ mag) are not consistent with the chemical evolutionary model of the first scenario. The upper limit of CO abundance in this group is $6.6\times10^{2}$ times smaller than the typical value in the Milky Way. We suggest that this CO-poor feature can be explained if clouds in the high extinction group are formed through coagulation of pre-existing molecular, CO-poor clumps. The clouds should be in an early stage of formation. According to the chemical evolutionary model, CO can reach an abundance of $2\times10^{-5}$ in $10^{5}$ yr at $A_{\rm V} = 2.7$ mag if all hydrogen is locked in H$_2$ before the cloud formation (Lee et al. 1996). Thus the cloud age may be constrained to be less than $1.0\times10^{5}$ yr after cloud assembly.
Moreover, the obvious differences in the linewidth-scale relation, excitation temperature distribution, and nonthermal/thermal ratio relations between clouds in the low extinction group ($A_{\rm V} \le 2.7$ mag) and the high extinction group ($A_{\rm V} > 2.7$ mag) in Section 5.2 are possible pieces of evidence supporting cloud formation under the second scenario.

Hi contributes little in explaining dark gas
Dark gas is the gas component that is not detected in either HI or CO emission but is clearly seen from the excess of $A_{\rm V}$ compared to $N_{\rm H}$ (Reach et al. 1994). We focus on DMG, but as pointed out in Planck Collaboration (2011), $N({\rm HI})$ could be underestimated with the optically thin approximation and an excitation temperature that is too high. Atomic HI may contribute as much as 50% of the mass of the excess according to their estimate. Fukui et al. (2015) investigated the HI optical depth $\tau_{\rm HI}$ and reanalyzed all-sky Planck/IRAS dust data at high Galactic latitudes ($|b| > 15^{\circ}$). They derived HI densities 2-2.5 times higher than those obtained with the optically thin assumption, and suggested that optically thick cold HI gas may dominate dark gas in the Milky Way.
In this paper, we introduced HINSA as an effective tool to constrain $\tau_{\rm HI}$. Although $\tau_{\rm HI} = 1$ is applied for 33 clouds, this choice has little effect on the conclusion that DMG dominates the cloud mass for $0.5 \le \tau_{\rm HI} \le 1.5$.
Another objection raised by Fukui et al. (2015) against H$_2$ dominating dark gas is that the crossing timescale of local clouds, $\le 1$ Myr, is an order of magnitude smaller than the H$_2$ formation timescale of $2.6\times10^{9}/n_{\rm HI}$ yr (Hollenbach & Natta 1995; Goldsmith & Li 2005) for typical clouds ($n \sim 100$ cm$^{-3}$). This is not a problem if we adopt the assumption in Section 6.1 that molecular clouds are formed by assembling pre-existing molecular gas.
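For reference, the timescale argument can be put into numbers with a short Python sketch. It simply evaluates the quoted scaling $2.6\times10^{9}/n_{\rm HI}$ yr at the typical density of 100 cm$^{-3}$ and compares it with a 1 Myr crossing time; the function name is hypothetical.

```python
def h2_formation_time_yr(n_hi_cm3: float) -> float:
    """H2 formation timescale in yr, using the scaling 2.6e9 / n_HI."""
    return 2.6e9 / n_hi_cm3


crossing_time_yr = 1.0e6                   # upper bound for local clouds
t_form_yr = h2_formation_time_yr(100.0)    # typical cloud density

print(f"H2 formation time : {t_form_yr / 1e6:.0f} Myr")
print(f"ratio to crossing : {t_form_yr / crossing_time_yr:.0f}")
```

At this density the formation time exceeds the crossing time by more than an order of magnitude, which is the tension that pre-existing molecular gas would remove.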
Within our sample of clouds with C$^+$ emission and HI self-absorption, the molecular gas seems to be the dominant component regardless of the individual excitation temperatures, optical depths, and lack of CO emission. Our conclusion is in line with the direct observational result in Perseus by Lee et al. (2015). Grenier et al. (2005) indicated that the dark gas mass is comparable to that of CO-traced molecular gas in the Milky Way. Our results suggest that H$_2$ dominates the dark gas.
In a previous study, L14 obtained the HI intensity by integrating over a velocity range centered around the $V_{\rm LSR}$ defined by the C$^+$ (or $^{13}$CO) line. Moreover, they adopted an optically thin assumption and a constant kinetic temperature of 70 K. To compare with L14, we applied these treatments to the DMG clouds in this study. The results from these treatments differ from our results by an average factor of $0.55^{+4.61}_{-0.83}$ for total visual extinction $A_{\rm V}$ and by an average factor of $0.04^{+0.89}_{-0.21}$ for DMG fraction. The symbol "+" means the maximum value of underestimate and "−" means the maximum value of overestimate. The DMG content derived with the previous treatments and with the method used here may therefore differ slightly.

Galactic impact
The detections of DMG in this study are limited for two reasons. First, DMG clouds without HINSA features are common in the Milky Way. Second, C$^+$ emission in some DMG clouds may be hard to identify.
We quantitatively estimated the detection limits of this study. For a HINSA feature to be detected at the sensitivities of this study, the excitation temperature must be lower than the background emission temperature. The detection requirement for the C$^+$ brightness temperature is 0.25 K ($2.5\sigma$). As seen in Equation 4, the C$^+$ intensity depends strongly on the kinetic temperature $T_{\rm k}$. Producing a C$^+$ intensity of $3.2\times10^{-1}$ K km s$^{-1}$ ($T^{\rm peak}_{\rm b} = 0.3$ K and FWHM of 1.0 km s$^{-1}$) requires $N({\rm H_2}) = 9.0\times10^{19}$ cm$^{-2}$ at $T_{\rm k} = 70.0$ K and $N({\rm H_2}) = 2.2\times10^{21}$ cm$^{-2}$ at $T_{\rm k} = 20.0$ K, assuming $n_{\rm H_2} = 1.0\times10^{3}$ cm$^{-3}$. Thus a large fraction of cold, diffuse DMG clouds in the Milky Way may be undetectable in C$^+$ emission under the conditions specified in this paper.
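The quoted C$^+$ intensity is consistent with the integral of a Gaussian line, $\int T_{\rm b}\,{\rm d}\upsilon = \sqrt{\pi/(4\ln 2)}\;T^{\rm peak}_{\rm b}\,{\rm FWHM} \approx 1.064\,T^{\rm peak}_{\rm b}\,{\rm FWHM}$. The sketch below assumes a Gaussian profile, which the text does not state explicitly; the function name is hypothetical.

```python
import math


def gaussian_integrated_intensity(t_peak_k: float, fwhm_kms: float) -> float:
    """Integrated intensity (K km/s) of a Gaussian line profile."""
    return math.sqrt(math.pi / (4.0 * math.log(2.0))) * t_peak_k * fwhm_kms


# Peak brightness temperature and line width used in the detection estimate.
i_cplus = gaussian_integrated_intensity(0.3, 1.0)
print(f"C+ integrated intensity ~ {i_cplus:.2f} K km/s")
```

Under this assumption the integrated intensity is about 0.32 K km s$^{-1}$, matching the value used in the text.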

Summary
In this paper, we have carried out a study of the DMG properties in the Galactic plane by combining physical properties derived from the Herschel C$^+$ survey, international HI surveys, and CO surveys. The HINSA method was used to determine the HI excitation temperature, which was assumed to be constant in previous works (e.g., Langer et al. 2014). Our conclusions are as follows.
1. Most DMG clouds are distributed between the Sagittarius arm and Centaurus arm in the Milky Way. We argue that this is caused by sample selection with HINSA features, which can be produced only when the background temperature is higher than the excitation temperature of the foreground cloud.
2. HI excitation temperatures of DMG clouds vary in a range between 20 and 92 K with a median value of 55 K, which is lower than the 70 K assumed in Langer et al. (2014). Gas densities vary from $6.2\times10^{1}$ to $1.2\times10^{3}$ cm$^{-3}$ with a median value of $2.3\times10^{2}$ cm$^{-3}$.
3. DMG dominates dark gas in a wide range of HI optical depth $\tau_{\rm HI}$ and excitation temperature $T_{\rm ex}$.
4. The HI optical depth $\tau_{\rm HI}$ can vary over a wide parameter range without significantly affecting the global relations between DMG fraction, HI column density, and HI excitation temperature.
5. Under the constraint of the $^{12}$CO sensitivity of 0.44 K per 0.7 km s$^{-1}$ in this paper, the relation between $f_{\rm DMG}$ and excitation temperature can be described by a linear function, $f_{\rm DMG} = -2.1\times10^{-3}\,T_{\rm ex} + 1.0$, assuming an HI optical depth of 1.0.
6. The relation between $f_{\rm DMG}$ and total hydrogen column density $N_{\rm H}$ can be described by $f_{\rm DMG} = 1 - 3.7\times10^{20}/N_{\rm H}$. When $N_{\rm H} \le 5.0\times10^{21}$ cm$^{-2}$, this curve is consistent with the time-dependent chemical evolutionary model at the age of $\sim 10$ Myr. The consistency between the data and the chemical evolutionary model breaks down when $N_{\rm H} > 5.0\times10^{21}$ cm$^{-2}$.
7. We discovered a group of clouds with high extinction ($A_{\rm V} > 2.7$ mag), in which the upper limit on the CO abundance of $2.1\times10^{-6}$ relative to H$_2$ is two orders of magnitude smaller than the canonical value in the Milky Way. This population of clouds cannot be explained by the chemical evolutionary model. They may be formed through the agglomeration of pre-existing molecular gas in the Milky Way.
It is worthwhile to note that the definition of DMG strongly depends on the sensitivity of the CO data. In this paper, this value is 0.44 K per 0.7 km s$^{-1}$ for $^{12}$CO emission. More sensitive data of CO as well as other molecular tracers, e.g., OH, toward these clouds are necessary to constrain the CO abundance further and to investigate the physical properties of molecular gas in these clouds.
N.Y. Tang et al.: Physical properties of CO-dark molecular gas traced by C$^+$
Column (3) is the full width at half maximum of HI. Column (4) is the excitation temperature with the assumption of an optical depth of 1. Column (5) is the upper limit of the excitation temperature assuming infinite optical depth. Column (6) is the HI volume density. Column (7) is the H$_2$ volume density. Column (8) is the HI column density. Column (9) is the H$_2$ column density. Column (10) is the DMG fraction relative to total hydrogen. Column (11) is the total visual extinction.