Water vapor detection in the transmission spectra of HD 209458 b with the CARMENES NIR channel

Aims: We aim at detecting H$_2$O in the atmosphere of the hot Jupiter HD 209458 b and perform a multi-band study in the near infrared with CARMENES. Methods: The H$_2$O absorption lines from the planet's atmosphere are Doppler-shifted due to the large change in its radial velocity during transit. This shift is of the order of tens of km s$^{-1}$, whilst the Earth's telluric and the stellar lines can be considered quasi-static. We took advantage of this to remove the telluric and stellar lines using SYSREM, a principal component analysis algorithm. The residual spectra contain the signal from thousands of planetary molecular lines well below the noise level. We retrieve this information by cross-correlating the spectra with models of the atmospheric absorption. Results: We find evidence of H$_2$O in HD 209458 b with a signal-to-noise ratio (S/N) of 6.4. The signal is blueshifted by --5.2 $^{+2.6}_{-1.3}$ km s$^{-1}$, which, despite the error bars, is a firm indication of day-to-night winds at the terminator of this hot Jupiter. Additionally, we performed a multi-band study for the detection of H$_2$O individually from the three NIR bands covered by CARMENES. We detect H$_2$O from its 1.0 $\mu$m band with a S/N of 5.8, and also find hints from the 1.15 $\mu$m band, with a low S/N of 2.8. No clear planetary signal is found from the 1.4 $\mu$m band. Conclusions: Our significant signal from the 1.0 $\mu$m band in HD 209458 b represents the first detection of H$_2$O from this band, the bluest one to date. The unfavorable observational conditions might be the reason for the inconclusive detection from the stronger 1.15 and 1.4 $\mu$m bands. H$_2$O is detected from the 1.0 $\mu$m band in HD 209458 b, but hardly in HD 189733 b, which supports a stronger aerosol extinction in the latter.


Introduction
Transmission spectroscopy during the primary transit of exoplanets has proven to be extremely useful for characterizing their atmospheres (Seager & Sasselov 2000;Brown 2001;Hubbard et al. 2001). As the planet passes in front of its host star from the perspective of the Earth, part of the stellar light is blocked by the planetary disk, opaque at all wavelengths. In addition, part of the light is absorbed and scattered by the constituents of the planet's atmosphere, which is seen as an annulus from the point of view of the observer. Each compound absorbs in specific spectral regions and with different intensities, which leaves weak but unique fingerprints in the transmitted light. Taking advantage of this, the first detection of an exoplanet atmosphere was achieved by Charbonneau et al. (2002), who reported a detection of sodium in the atmosphere of HD 209458 b using the Space Telescope Imaging Spectrograph on the Hubble Space Telescope (HS T /STIS). Only a year later, Vidal-Madjar et al. (2003) detected the absorption of hydrogen in Lyman-α in the atmosphere of the same exoplanet, also using HS T /STIS. Later, Redfield et al. (2008) and Snellen et al. (2008) accomplished the first detections of planetary sodium with ground-based instrumentation at 8-10 m class telescopes.
The signatures produced by molecular species in the planet's atmosphere are also orders of magnitude weaker than those of the telluric and stellar lines. This greatly hinders their detection from the ground since their spectral features cannot be seen directly in the spectra, but are buried well below the noise level. To overcome this difficulty, Snellen et al. (2010) merged the information of thousands of CO rotational-vibrational (rovibrational) lines in HD 209458 b transit spectra by applying a cross-correlation technique to high-resolution spectroscopy with the Cryogenic InfraRed Échelle Spectrograph (CRIRES) at the Very Large Telescope. Snellen et al. (2010) took advantage of the Doppler shift between the stellar light and the planetary atmospheric absorption caused by the planet's movement around its host star. This is possible because the semi-amplitude of the planet's orbital velocity can be well beyond 100 km s −1 and the relative velocity with respect to the Earth changes by tens of km s −1 during the transit. On the contrary, the Earth's telluric lines and the stellar lines can be considered quasi-static. Moreover, Snellen et al. (2010) measured an excess Doppler-shift of 2 ± 1 km s −1 , which was explained by invoking high-altitude winds in the atmosphere. Moreover, it constituted the first evidence of high-resolution instrumentation being sensitive to the atmospheric dynamics.
The cross-correlation technique has also been successfully applied to mid-and high-resolution (R ≥ 50 000) highlystabilized instruments at 4-m class telescopes for detecting molecular signatures. Thus, water vapor was detected in HD 189733 b by Brogi et al. (2018) using GIANO at the Telescopio Nazionale Galileo. They reported a blueshift in the signal of −1.6 km s −1 , indicating the presence of winds in the atmosphere. Moreover, Alonso-Floriano et al. (2019) detected H 2 O in the same exoplanet for the first time using the 1.15 and 1.4 µm bands individually with the near-infrared (NIR) channel of CARMENES (Calar Alto high-Resolution search for M dwarfs with Exoearths with Near-infrared and optical Échelle Spectrographs; Quirrenbach et al. 2016Quirrenbach et al. , 2018. One of the best-studied planetary atmospheres to date is that of HD 209458 b, which orbits a bright Sun-like star (G0 V, G ≈ 7.5 mag) with a period of 3.5 days (Charbonneau et al. 2000;Henry et al. 2000). This planet has been studied by analyzing its transit (Charbonneau et al. 2002;Vidal-Madjar et al. 2003;Deming et al. 2013;Madhusudhan et al. 2014a) and dayside spectra (Line et al. 2016;Hawker et al. 2018), combining low-and highresolution spectra , and by investigating its exosphere and mass-loss rate (Vidal-Madjar et al. 2004;Linsky et al. 2010).
Comparative studies of several hot Jupiters using HS T and the Spitzer Space Telescope have shown water vapor spectral features in HD 209458 b in the 0.3-5.0 µm spectral region (Sing et al. 2016). These signatures were found to be weaker than expected in a solar composition cloud-free atmosphere, and have been attributed to a possible oxygen depletion or a partial coverage of clouds or hazes (Madhusudhan et al. 2014b). Later, MacDonald et al. (2017 found sub-solar abundances in the atmosphere of HD 209458 b, with inhomogeneous clouds covering the terminator. In addition, they reported the first detection of nitrogen chemistry, with NH 3 (or HCN) being detected in HS T transmission spectra in the 1.1-1.7 µm region. Moreover, Hawker et al. (2018) also detected HCN, H 2 O, and CO in this planet by using CRIRES measurements in the 2.29-2.35 µm and 3.18-3.27 µm spectral ranges.
In this paper we analyze transit spectra of HD 209458 b obtained with the near infrared (NIR) channel of CARMENES (0.96-1.71 µm). After the multi-band detection of water vapor in HD 189733 b with this instrument (Alonso-Floriano et al. 2019), we have extended this analysis to HD 209458 b, where H 2 O has not been detected from the ground so far in the NIR. Detections of water vapor from bands located at different spectral regions can help to characterize aerosols in hot Jupiters (Stevenson 2016; Pino et al. 2018b). This kind of multi-band studies is of particular interest for HD 189733 b and HD 209458 b since space observations suggest that the former is rather hazy whilst the latter has a more gentle Rayleigh scattering slope (Sing et al. 2016). Moreover, an additional measurement of the winds in the upper atmosphere of HD 209458 b is also of high interest for characterizing the dynamics of hot Jupiter atmospheres. This paper is organized as follows. In Sect. 2 we describe the main features of the CARMENES instrument, our observations of HD 209458 b and the atmospheric conditions during the night. In Sect. 3 we present the data reduction procedure and in Sect. 4 the methodology for identifying the molecular signatures. In Sect. 5 we elaborate on the main results of this work regarding the H 2 O detection in the different bands, the atmospheric winds, and a comparison with the results for HD 189733 b. Finally, in Sect. 6 we present the main conclusions.

Observations
We observed a transit of the hot Jupiter HD 209458 b (Table 1) on the night of 5 September 2018 using CARMENES. CARMENES is located in the coudé room of the 3.5 m telescope at the Calar Alto Observatory. The instrument consists of two fiber-fed high-resolution spectrographs, the visible (VIS) channel covering optical wavelengths, ∆λ = 520-960 nm in 55 orders (R = 94 600), and the NIR channel observing the near-infrared Article number, page 2 of 11 A. Sánchez-López, et al.: Water vapor detection in the transmission spectra of HD 209458 b with the CARMENES NIR channel wavelength range ∆λ = 960-1710 nm in 28 orders (R = 80 400).
In this work we analyze the data gathered with the NIR channel of CARMENES only. We used the two fibers of the instrument. Fiber A was devoted to the target, and fiber B to the sky in order to identify sky emission lines. The observations covered the pre-, in-, and post-transit of HD 209458 b, and are composed of 91 exposures of 198 s from 21:39 UT to 03:47 UT (see Fig. 1). Due to a brief failure in the guiding system, eight spectra were missed between the planetary orbital phases φ = −0.029 and φ = −0.024. We thus had to discard also the eight pre-transit spectra gathered before them in order to avoid sharp discontinuities in the light curve of each pixel, which would hamper our telluric corrections (see Sect. 3.1). The typical signal-to-noise ratio (S/N) per pixel of the raw spectra showed a strong spectral dependence (see bottom panel in Fig. 1 and Fig. 2). The mean S/N was of ∼ 85 for the bands at ∼ 1 µm and ∼ 1.15 µm and of ∼ 65 for the band at ∼ 1.4 µm in the first half of the observations. However, it dropped below 60 towards the end for the bands at ∼ 1 µm and ∼ 1.15 µm, and below 50 for the band at ∼ 1.4 µm. Consecutively, we also discarded the last eight out-of-transit spectra to avoid noisy data. Additionally, one in-transit spectrum presented an anomalous spectral behavior and was also excluded from the analysis. The spectra analyzed in this work thus spanned the exoplanet orbital phase interval −0.023 < φ < 0.031. The airmass range was 1.06 < sec (z) < 1.79 (see Fig. 1) and the mean seeing was 1.5 arcsec.

Data reduction
The raw spectra were processed by the CARMENES data reduction pipeline, Caracal v2.20 (Zechmeister et al. 2014;Caballero et al. 2016). As in Alonso-Floriano et al. (2019), we used the European Southern Observatory Molecfit tool (Smette et al. 2015) to retrieve the column depth of precipitable water vapor (PWV)  towards the target, which dropped from 10.3 mm to 3.8 mm during the observations (see Fig. 1, top panel). The mean value of the PWV was ∼ 5.8 mm, being the average PWV over Calar Alto of ∼ 7 mm. Even with the relatively low mean PWV, there was a strong telluric absorption in the centers of the two strongest H 2 O bands, which resulted in a very low flux. We thus excluded from the analysis the 1.12-1.16 µm (orders 54-53) and 1.34-1.47 µm (orders 45-42) spectral regions (see the a priori masks in dark gray in Fig. 3). Three water vapor bands are observable with the NIR channel of CARMENES, each one covered by a different number of orders. Firstly, the spectral region 0.96-1.05 µm (the 1.0 µm band from now on) was covered by five useful spectral orders (63-59). Secondly, the spectral interval at 1.05-1.26 µm (1.15 µm band in the following) was covered by eight useful or-A&A proofs: manuscript no. hd20_v2  Brogi et al. (2017) and an H 2 O volume mixing ratio of 10 −5 . The model is shown at the same wavelengths of the data. The CARMENES line spread function for the NIR channel has been applied. The 1.0 µm, 1.15 µm and 1.4 µm bands are shown in blue, green, and red, respectively. The orders in dark gray represent the a priori masks. The orders in light gray were masked a posteriori.
ders . And, thirdly, the spectral region of 1.26-1.71 µm (the 1.4 µm band in the following), which was covered by nine spectral orders . In addition to the orders initially discarded by the a priori masks, we found orders 59 -57 (1.03-1.08 µm) and 36 (1.68-1.71 µm) to present strong uncorrected telluric residuals in the analysis presented in Sect. 4.2. Therefore, we also discarded them in the analysis (see the a posteriori masks in Fig. 3). The 1.0 µm, 1.15 µm and 1.4 µm bands were then covered by four, six, and eight useful orders, respectively.
Next, we removed from the spectra the 5 σ outliers, which were likely produced by cosmic rays. To do so, we fit an outlierresistant third-order polynomial to the time evolution of the pixel and assigned the value of the fit to the affected frame. In addition, we derived the instrument drift during the night by measuring the shifts of the strong telluric lines (as in Alonso-Floriano et al. 2019), and found a value of ∼ 0.023 pixel (∼ 30 m s −1 ). If large, sub-pixel drifts have been found to impact the data analysis of this technique. For instance, this was the case of Brogi et al. (2013) and Birkby et al. (2017), who found ∼0.7 pixel (∼1 kms −1 ) drifts for detector 1 in CRIRES. Thus, the authors performed a correction by aligning spectra to a common wavelength solution. As discussed in Brogi et al. (2016), this correction had an error of 0.03 pixel (∼ 50 ms −1 ) in the absence of noise. Our drift is roughly two orders of magnitude lower than what was reported for detector 1 in CRIRES and, also, lower than the ideal alignment error computed in previous works. Therefore, we do not expect a drift correction to significantly affect our results.
The resulting spectra were stored as 18 matrices (one per order) of dimension 68 × 4080, i.e., the number of spectra times the number of wavelength bins of each order. This allowed us to analyze each order independently by following the methodologies presented in previous publications (Brogi et al. 2016(Brogi et al. , 2018Birkby et al. 2017;Hawker et al. 2018;Alonso-Floriano et al. 2019) and discussed below.

Telluric and stellar signal removal with Sysrem
Changing conditions in the Earth's atmosphere during the night (e.g., airmass variations) induce a variable pseudo-continuum level in the observed spectra (see Fig. 2). In order to have the same baseline in all spectra, we normalized them order by order by fitting a quadratic polynomial to the pseudo-continuum. In this process, we excluded from the fit the spectral points corresponding to the sky emission lines identified by using fiber B, which are common in the reddest orders (see Fig. 2).
Next, we masked the spectra at the wavelengths where the Earth's atmospheric absorption was larger than 80% of the flux continuum. Thereby, we eliminated the spectral ranges where there is almost no flux. In our previous work (Alonso-Floriano et al. 2019) we searched for the wavelength bins that satisfied the previous condition in the largest S/N spectrum of the night. This was a useful approach since the column depth of PWV during that night was stable and airmass did not change very much during the observations. However, we found this procedure not to be suitable for the night analyzed here due to the rather large changes in the PWV, airmass, and S/N of the observed spectra (see Fig. 1). For this reason, instead of selecting the spectrum with the largest S/N for masking, we chose that taken at the largest telluric water vapor absorption, which is more conservative. The masked regions represented ∼6% of the data of the spectral orders included in the analysis.
The normalized and masked 68 × 4080 matrices of spectra were at this point dominated by the quasi-static telluric and stellar contributions. The Doppler-shifted excess absorption from the planetary atmosphere is expected to be orders of magnitude lower than the telluric and stellar absorptions. We removed these telluric and stellar signals by using Sysrem (Tamuz et al. 2005;Mazeh et al. 2007), a principal component analysis algorithm that has been successfully applied to detect molecular signatures in the atmosphere of several exoplanets (Birkby et al. 2017;Nugroho et al. 2017;Hawker et al. 2018;Alonso-Floriano et al. 2019). This algorithm treats the temporal evolution of each wavelength bin as a light curve and removes systematic features common to all of them, while allowing each wavelength bin to be weighted by its uncertainty. In particular, the CARMENES pipeline provides a measurement of the photon noise and the read out noise for each pixel at each spectrum, which constitutes the input uncertainty matrix. By iterating Sysrem in each order individually, we removed most of the stellar and telluric features in the spectra and also the variations induced by airmass and PWV changes. As was pointed out previously (see, e.g., Nugroho et al. 2017), Sysrem does not really fit airmass nor other atmospheric parameters. Instead, this algorithm performs linear fits that only approximate the non-linear evolution of the telluric absorption with time. Because of this, some telluric residuals are still expected to be left after each iteration. The planetary features also create a low-order effect at the sub-pixel level (Birkby et al. 2017). Thus, after a certain number of Sysrem iterations, these features are perceptible by the algorithm. At this point, Sysrem starts focusing on fitting and removing the planetary signal, which is an unwanted effect. Also, as the intensity of the telluric and stellar contamination varies from one order to another, the optimum number of Sysrem iterations is also expected to differ. This has been widely examined in the past (Birkby et al. 2017;Nugroho et al. 2017;Hawker et al. 2018;Alonso-Floriano et al. 2019). Following these earlier studies, we analyzed the effect of Sysrem order by order by injecting a model planetary signal (computed as described in Sect. 4.1) into the observations before removing the telluric contamination. In order to mimic  Fig. 4. Evolution of the retrieved S/N of the injected signal with the number of Sysrem iterations for the 1.0 µm band (left), 1.15 µm band (middle), and 1.4 µm band (right). The model is injected at 5× the nominal strength. For the spectral orders where a very small signal is retrieved (i.e., S/N < 3), we injected a stronger model at 10× (for orders 49 and 38) and at 12× (for order 60) the nominal strength so as to ensure the injected signal is clearly measured (i.e., S/N ≥ 3). The stars represent the selected number of iterations for each spectral order at the peak of the recovered S/N (see text). the real signal of the planet, we Doppler-shifted the model planetary signal according to the expected orbital velocities of the planet at the times of the observed spectra. In other words, we injected the model at the expected velocities given by where sys is the systemic velocity of the HD 209458 system, bary (t) is the barycentric velocity during the observations, K P is the semi-amplitude of the orbital motion of the planet, and φ(t) is the orbital phase of the planet. In addition, possible atmospheric winds could be accounted for by considering an additional term, wind , in the same manner as Alonso-Floriano et al. (2019). We used the cross-correlation technique (see Secs. 4.2 and 4.3) to study the evolution of the S/N of the retrieved injected signal with the number of Sysrem iterations. That is, we determined the optimum number of Sysrem iterations for each spectral order by maximizing the significance of the injected crosscorrelation function (CCF) peak (Birkby et al. 2017;Nugroho et al. 2017;Hawker et al. 2018;Cabot et al. 2019). In order to characterize this behavior, we visually inspected the retrieved S/N evolution for each spectral order (see Fig. 4). The different performances can be broadly grouped into the following categories. Firstly, Sysrem performed well for some spectral orders where a few iterations (< 10) were enough to reach the maximum recovery of the injected signal, and the significance then dropped (e.g., orders 61,55,52,49,46,38,37). For some spectral orders, the peak S/N was reached more slowly (≥10 iterations, see, e.g., orders 60, 56, 48). Secondly, we observed a less efficient performance in some spectral orders where a maximum is achieved, but the behavior is almost flat (e.g., orders 63 and 39). In these cases, since the S/N dropped afterwards, we kept the found maximum. And thirdly, for some other orders the S/N improved slightly after the first maximum (e.g., orders 62, 51,50,47,41,40). In this latter case, we kept the iteration with the first maximum of the S/N if it did not improve significantly (changes smaller than 5%) in subsequent iterations. We found this behavior for orders 62, 47, and 41.
We explored the removal of the telluric absorption by using Molecfit instead of Sysrem. However, we did not find any planetary signal when applying the procedure described in Sect. 4.2 to the spectra obtained with this telluric correction. Recent works have applied this tool to CRIRES data and have found the scatter of the residuals to be between 3% and 7% (Ulmer-Moll et al. 2019). Therefore, the fit uncertainties over the entire wavelength coverage of the CARMENES NIR channel seem to be too large for attempting a detection of the weak H 2 O features of this planet.

Planetary atmospheric signal extraction
We used the cross-correlation technique to extract the atmospheric signal. The basic idea is to accumulate the information of the thousands of absorbing ro-vibrational lines, which is usually expressed as the amplitude of a CCF. To do so, we computed high-resolution absorption models of the atmosphere for the primary transit that served as a template for the CCF. Water vapor (H 2 O), methane (CH 4 ), ammonia (NH 3 ), hydrogen cyanide (HCN), and carbon monoxide (CO) are the main absorbent species in the wavelength range covered by the NIR channel of CARMENES for this planet. We have correlated the measured spectra with exo-atmospheric transmission models including the absorption of these molecules.

High-resolution absorption models
We computed high-resolution spectral transmission models of HD 209458 b in its primary transit by using the Karlsruhe Optimized and Precise Radiative Transfer Algorithm (Kopra; Stiller et al. 2002). This code was originally developed for the Earth's atmosphere and has been afterwards adapted to other planetary atmospheres in the Solar System (García-Comas et al. 2011;López-Puertas et al. 2018), and more recently also to exoplanet's atmospheres (Alonso-Floriano et al. 2019). Kopra is a welltested general purpose line-by-line radiative transfer model that includes all known relevant processes for computing those transmittances. In particular, we used a Voigt line shape for the rovibrational lines. The code uses an adaptive scheme for including or rejecting spectral lines at a specified strength and a given absorption accuracy (see Stiller et al. 2002). This is particularly useful in our case given the large number of ro-vibrational lines of H 2 O at high temperatures. The spectroscopic data were taken from the HITEMP 2010 compilation (Rothman et al. 2010  We computed the transmission models at a very high resolution (R ∼ 4·10 7 ) and afterwards convolved them with the CARMENES NIR line spread function. We used the pressuretemperature profile retrieved by Brogi et al. (2017) by combining low and high-resolution measurements for HD 209458 b. In addition, we chose H 2 O, CH 4 , NH 3 , HCN, and CO volume mixing ratios (VMR) of 10 −5 , 10 −8 , 10 −8 , 10 −6 , and 10 −4 , respectively, which are very similar to the most probable values retrieved in that work. In Fig. 3 we show the computed transmission model for water vapor where the major absorbing bands near 1, 1.15, and 1.4 µm are identified.
In the same way as was done for the observed spectra, we normalized the modeled transmission. Therefore, the study presented in the following is not sensitive to the absolute absorption level of the lines, but to their relative depth with respect to the continuum.

Cross-correlation
We cross-correlated the residual matrices with the transmission model (template) Doppler-shifted in the range of -200 to 200 km s −1 with respect to the Earth's rest-frame. The step size was 1.3 km s −1 , which was calculated by averaging the velocity step-size of the pixels in the NIR channel. In each step, the template was linearly interpolated to the corresponding Dopplershifted wavelength grid. We obtained a CCF for each of the 68 spectra, thus forming cross-correlation matrices, CCFs, of dimensions 201 × 68. These matrices were calculated independently for each spectral order and for each Sysrem iteration. Additionally, we subtracted the median value from each row (i.e., 201 spectral velocity bins) of the matrices in order to remove broad variations arising from the differences between the spectra and the template.
We further enhanced the signal by co-adding the information in the CCFs from the different spectral orders. The resulting total CCF in the Earth's rest-frame is shown in Fig. 5. The trace of the planet is expected to appear during transit (i.e., between the horizontal dashed lines) as positive cross-correlation values along the planetary velocities P calculated by Eq. 1. During the observing night, these velocities were expected to range roughly between −34 and 15 km s −1 (see tilted dashed lines).
Next, we aligned the rows of the cross-correlation matrices to the rest frame of the planet. In this process we assumed K P in Eq. 1 as unknown and performed the shift for a K P grid from -280 to 280 km s −1 . As in earlier studies, this allowed us to directly measure K P , although with rather large uncertainties. This is mainly because the change in the exoplanet radial velocity during the transit, of a few tens of km s −1 , is too small to accurately measure it with this technique (Brogi et al. 2018;Alonso-Floriano et al. 2019). Also, we were able to check for strong telluric residuals that might appear close to the expected exoplanet velocities or around zero or negative K P values. Consecutively, we co-added the cross-correlation values over time and obtained the total CCF. We excluded from the co-adding the last seven in-transit spectra at 0.012 < φ < 0.018, for which the velocity of the planet with respect to the Earth was very small (i.e., between ± 2.5 km s −1 ) and hence, the absorption lines from the planetary atmosphere nearly overlapped with the telluric lines. At this point, if the atmospheric molecular signal were strong enough, we would expect to find a CCF peak around the zero velocity and the expected K P = 140 km s −1 .

Measurement of the significance of the signal
We obtained the significance of the signal by applying the same methodology as in Birkby et al. (2017), Brogi et al. (2018), andAlonso-Floriano et al. (2019). First, we computed a S/N map (see top panel in Fig. 6), where the S/N values were calculated at each K P by dividing the CCF values by the CCF standard deviation obtained from the whole velocity interval but excluding the ± 15.6 km s −1 interval around the CCF value. As a result, the S/N thus depends on the width of the velocity interval. The presence of a planetary signal in the map should appear then as a region of large S/N at a K P compatible with that of HD 209458 b, which is observed in the top panel of Fig. 6 at around K P = 150 km s −1 .
We further examined the significance of the signal by using the in-transit cross-correlation matrix in the rest-frame of the planet. Here we defined two distributions: the "in-trail" distribution, covering the velocities from -1.3 km s −1 to 1.3 km s −1 (i.e., three pixels), where we expected the exoplanet signal, and the "out-of-trail" distribution comprising the rest of the velocities but excluding the ± 15.6 km s −1 region. The out-of-traildistribution should contain non-correlated noise, hence being normally distributed with zero mean. We compared the two distributions by performing a Welch's t-test (Welch 1947), following earlier studies (Birkby et al. 2017;Alonso-Floriano et al. 2019;Cabot et al. 2019). To do so, we set a null hypothesis, H 0 , of the in-and out-of-trail samples having the same mean. The test returns the value of the t-statistic, which is expected to be larger the stronger the evidence against H 0 is. Each t-value can be directly converted into a p−value that represents the probability of obtaining a p-value as large as observed or larger if H 0 was true. Subsequently, we expressed the p-values in terms of standard deviations (σ-values). In this analysis, the presence of a signal of atmospheric origin in our data would yield significantly different means for the two distributions. The in-trail cross-correlation values would then contain information not consistent with uncorrelated noise, hence rejecting the null hypothesis.
In Fig. 7 we show both distributions for the pairs of (K P , wind ) that yield the largest σ-value. The means of the intrail and out-of-trail distributions are different, with the latter following a Gaussian distribution. Thus, this test also provides evidence for the presence of a planetary signal in our transit data. In order to better characterize this signal, we computed a map of σ-values for the same K P grid (see bottom panel of Fig. 6) and found a region of maximum σ-values consistent with the S/N results (top panel).

Results and discussion
We have studied the detectability of water vapor in HD 209458 b transit spectra with the NIR channel of CARMENES. We computed the total CCF when including all the useful NIR spectral orders (see black curve in Fig. 8) and detected H 2 O with a maximum S/N = 6.4 and a σ-value of 8.1, which was larger than its S/N. Similar results were found by Cabot et al. (2019), indicating that the Welch's t-test might overestimate the significance of the signal more than the S/N calculation. However, the noise in the S/N calculations might contain contributions from the auto- correlation function of the template (i.e., aliases) or, in general, correlated noise that still averages to zero, but inflates the standard deviation. In this case, the S/N calculation might be an underestimation of the potential signal in the data.
The S/N and the σ maps for the explored K P grid are depicted in the top and bottom panels of Fig. 6, respectively. We found the maximum signal at a K P = 150 +28 −25 km s −1 , which is in line with the value derived by Hawker et al. (2018) from CRIRES observations, within the uncertainties. The cross-correlation with models including CH 4 , NH 3 , HCN, or CO individually did not yield any CCF peak compatible with a planetary origin. Consequently, the models including H 2 O+CH 4 , H 2 O+NH 3 , H 2 O+HCN, or H 2 O+CO showed noisier CCFs, and lower S/N than that obtained by including water vapor alone. However, the strongest absorption features of CH 4 , NH 3 , HCN, and CO occur at the reddest NIR orders of CARMENES, which presented the lowest S/N and the least efficient telluric removal in our data. Therefore, our results for these molecules are not conclusive and should be validated with future observations.

Multi-band analysis
We have investigated the possibility of detecting water vapor by using separately the data in the three bands covered by the CARMENES NIR channel (see Figs. 3 and 9 and Table 2). We detected H 2 O from the 1.0 µm band with a maximum S/N of 5.8 (σ-value of 7.4) at a K P = 176 +30 −38 km s −1 (see green curve in Fig. 8 and left panels in Fig. 9). However, H 2 O was not detected using this band, covered only partially by the NIR channel, in HD 189733 b (Alonso-Floriano et al. 2019). Actually the detection in HD 209458 b represents the first detection of H 2 O from this band in any exoplanet so far.
Regarding the 1.15 µm band, we found the S/N map to be noisier than for the 1.0 µm band, possibly due to a less efficient telluric correction in this spectral region (see middle panel in Fig. 9). We obtained the maximum significance CCF peak in the map with S/N of 2.8 (σ-value of 6.9) at a K P = 157 +33 −47 km s −1 (see blue curve in Fig. 8), which is consistent with the expected Article number, page 7 of 11 A&A proofs: manuscript no. hd20_v2 velocity of the planet. However, this S/N is rather low and some contamination can be seen in the form of other similar peaks. In particular, the contamination at the negative K P space is carried over when including this band in all the useful NIR orders (see negative K P space in the S/N and σ maps in Fig. 6). Therefore, the maximum CCF peak of the 1.15 µm band should be interpreted only as a hint of a signal, since the telluric removal in these orders was less efficient. The results are worse for the 1.4 µm band, for which different maxima appear (see right panel in Fig. 9), being the maximum significance peak of the map located at the negative K P space. The maximum significance CCF peak in the positive K P space is found with a S/N = 3.6 (σ-value of 5.3) at a K P = 119 +43 −45 km s −1 (see orange curve in Fig. 8), which is consistent with the expected value within the uncertainties. However, this result is largely insufficient for claiming a detection from this band individually.
Overall, these results indicate that most of our signal arose from the 1.0 µm band, and partially from 1.15 µm band. However, including this spectral region in the analysis we obtained a slightly larger signal. It is surprising though that we obtained the largest contribution to the signal from the weakest band rather than from the two strongest 1.15 µm and 1.4 µm bands. It is true that, for the strongest bands, some orders near the center of the bands were discarded because of the absence of flux. Additionally, the masking of the strongest absorption lines (i.e., absorbing more than 80 % of the flux) within each order rejects many spectral points and hence reduces the planetary signal. But this is in contrast with what we would expect (see, e.g., Alonso-Floriano et al. 2019) where we obtained the largest signals in the stronger bands. Several reasons might explain this. Firstly, the mean S/N per pixel of the observed spectra in the second half of the observations fall rapidly (see Fig. 1, bottom panel). In fact, the last spectra have around half of the S/N of the first ones. Additionally, the S/N within each spectrum decreases towards longer wavelengths with typical values in the range 20 -60 at the reddest orders (see Fig. 2). This was possibly due to the large airmasses in the second half of the observations, between 1.3 and 1.8 (Fig. 1, middle panel). Secondly, the drastic drop of the column depth of PWV (Fig. 1, top panel) that occurred during the Notes. a Maximum significance CCF peak measured in the positive K P space. b Combination of the 18 useful NIR spectral orders.
observations possibly added other trends to the data that make more difficult the telluric removal. These effects combined might have hampered an efficient signal retrieval from the 1.15 µm and 1.4 µm bands.

Winds in the atmosphere of HD 209458 b
The CCF peaks showed a net blueshift that was − 6.5 +2.6 −1.3 km s −1 for the 1 µm band, −3.9 +3.9 −2.6 km s −1 for the 1.15 µm band, and −5.2 +2.6 −1.3 km s −1 when considering all the useful NIR spectral orders. This could be caused by strong winds flowing from the dayto the night-side at the terminator of this hot Jupiter. The different absolute wind velocities retrieved from both bands could be related to the different pressure levels at which the bulk of the spectrum is formed for each wavelength interval. However, we would expect the weaker 1.0 µm band to have its largest contribution at slightly higher pressure levels (i.e., lower altitudes) than the stronger 1.15 µm band and hence, to have a lower wind according to global circulation models (e.g., see Fig. 3 in Rauscher & Menou 2012, which is opposite to our results. Hence, this is likely not the reason for the differences. Nevertheless, our large error bars make both measurements fully compatible. In addition, the cross-correlation signal of the 1.15 µm band is rather weak and, thus, the significance of its wind measurement is low. Snellen et al. (2010) reported, from observations of a transit of this planet in the 2-µm spectral band of carbon monoxide, an overall blueshift of −2±1 km s −1 . Our best measurement of the net blueshift obtained when considering all useful orders is larger, although our uncertainties also exceed theirs and, overall, both determinations are within the errors.
The model of Showman et al. (2013) predicted that the atmosphere of HD 209458 b, which is exposed to a large stellar insolation, is expected to have a circulation dominated by high-altitude, day-to-night winds, leading to a predominantly blueshifted Doppler signature at the terminator (i.e., when probed during transit observations). The same model predicted for this planet a stronger high-altitude day-to-night circulation than for HD 189733 b, which should manifest in a larger net blueshift. In particular, it predicted a blueshifted signal with a peak near −3 km s −1 for a fraction of ∼40% of the full terminator for HD 189733 b at a pressure level of 0.1 mbar, whilst for HD 209458 b the velocity is around −6 km s −1 and covering near 60% of the terminator (see their Figs. 7c and 7d). These supersonic wind velocities were also obtained in the models of Rauscher & Menou (2012) and Amundsen et al. (2016). Rauscher & Menou (2013), however, reported slightly slower winds because they included the effects of magnetic drag, which reduces the wind velocity in the upper atmosphere of HD 209458 b to about 3 kms −1 . When considering all  1.0 µm a 3.7 (2.2) 179 +27 −34 (153) -3.9 ± 2.6 (−3.9) 1.15 µm 4.6 (4.9) 116 +65 −31 (146 ± 46) -5.2 +2.6 −1.3 (-3.9 +1.3 −2.6 ) 1.4 µm 5.1 (4.4) 150 ± 29 (156 +39 −31 ) −3.9 ± 2.6 (-3.9 ± 1.3) Useful orders b 7.7 (6.6) 147 +33 −28 (160 +45 −33 ) -3.9 ± 1.3 (−3.9 ± 1.3) Notes. Values in parenthesis are from Alonso-Floriano et al. the useful NIR spectral orders, we obtained a blue velocity of −5.2 +2.6 −1.3 km s −1 for HD 209458 b. This value is in good agreement with the predictions of Rauscher & Menou (2012), Showman et al. (2013), and Amundsen et al. (2016). In addition, the absolute value is larger than what we derived for HD 189733 b (−3.9 ±1.3 kms −1 ; Alonso-Floriano et al. 2019), which is in line with the model predictions of Showman et al. (2013). However, the rather large error bars in both measurements prevented us from confirming the prediction of Showman et al. (2013) of stronger winds in the upper atmosphere of HD 209458 b.

Comparison with the CARMENES H 2 O detection in HD 189733 b
We present a qualitative comparison of the results discussed above for HD 209458 b with those obtained by Alonso-Floriano et al. (2019) for the hot Jupiter HD 189733 b using also CARMENES transit data (mean S/N of the continuum ∼ 150). In order to make it meaningful, though, we have reanalyzed the HD 189733 b dataset using the same procedure used in this work. The multi-band results are shown in Fig. 10 and Table 3. Starting with the 1.0 µm band, water vapor was not detected in HD 189733 b from this band by Alonso-Floriano et al. (2019). In this reanalysis (see left panel in Fig. 10) we observe a strong contamination at low K P values but also hints of a moderate signal of S/N = 3.7 at K P = 179 +27 −34 km s −1 , which is consistent with the expected velocities for HD 189733 b. Regarding the 1.15 µm and 1.4 µm bands, the results of the reanalysis (see Table 3) are very similar to those presented by Alonso-Floriano et al. (2019), with confident detections of H 2 O in both bands (see middle and right panels in Fig. 10, respectively). When using all the useful NIR orders from Alonso-Floriano et al. (2019), we detected H 2 O in HD 189733 b with a S/N = 7.7 at K P = 147 +33 −28 km s −1 , which represents a small improvement with respect to the signal obtained in that work of S/N = 6.6. This is due to the optimization of the number of Sysrem iterations required for each spectral order, principally for the 1.0 µm band.
Compared to the results obtained for HD 209458 b, the hint of a water vapor detection using the 1.0 µm band in HD 189733 b is much less significant, despite the larger S/N of the gathered