Free Access
Issue
A&A
Volume 659, March 2022
Article Number A121
Number of page(s) 26
Section Planets and planetary systems
DOI https://doi.org/10.1051/0004-6361/202142314
Published online 29 March 2022

© ESO 2022

1 Introduction

The use of high resolution cross-correlation spectroscopy (HRCCS, resolving power R > 100 000) has proven to be a powerful technique in the characterisation of giant exoplanet atmospheres using ground-based telescopes. The technique can be applied to transiting and non-transiting planets alike, and has already delivered studies of both transmission and thermal emission spectroscopy, revealing compositions, atmospheric structures, rotation, and day-to-night winds (e.g. Snellen et al. 2010; Brogi et al. 2016). Moreover, it holds great promise for identifying biosignatures in the nearest rocky exoplanet atmospheres using the large photon collecting power of the upcoming extremely large telescopes (ELTs) (see e.g. Snellen et al. 2015; Hawker & Parry 2019; Serindag & Snellen 2019). However, HRCCS relies on the noise being photon-dominated, which limits its effectiveness from the ground at wavelengths longer than ≳5 μm due to the sky thermal background (Snellen et al. 2015). Thus, any search for biosignatures with HRCCS at the ELTs will likely need to focus on optical and near-infrared wavelengths.

One of the key, although not definitive, biosignatures is oxygen (Meadows et al. 2018). It has strong optical spectral features (~760 nm) that are accessible to HRCCS in transmission (Snellen et al. 2013). However, the nearest rocky exoplanets, such as Proxima b, are non-transiting and have very low thermal emission at optical wavelengths. This leaves only the reflected optical spectrum as an avenue to search for oxygen in their atmospheres with HRCCS.

Shortly after the discovery of 51 Pegasi b (Mayor & Queloz 1995), attempts to use HRCCS to detect reflected light began (Charbonneau et al. 1999; Collier Cameron et al. 1999). However, despite multiple attempts, only upper limits or contested detections have been made, and the success of the method remains inconclusive (see e.g. Charbonneau et al. 1999; Collier Cameron et al. 1999; Leigh et al. 2003; Rodler et al. 2010, 2013; Hoeijmakers et al. 2018a; Scandariato et al. 2021; Martins et al. 2015; Borra & Deschatelets 2018). Although hot Jupiters have advantageous inflated radii and short orbital periods, their close proximity to their host stars is thought to leave many of them bereft of a reflective cloud deck (e.g. Rowe et al. 2008; Cowan & Agol 2011; Heng & Demory 2013), resulting in low expected geometric albedo (Ag ≲ 0.1), making them very dark with low planet-to-star contrast ratios that are difficult to detect (e.g. Hoeijmakers et al. 2018a). This is supported by observations of photometric optical phase curves from Kepler, K2, TESS and CHEOPS (Coughlin & López-Morales 2012; Angerhausen et al. 2015; Esteves et al. 2015; Wong et al. 2020, 2021; Hooton et al. 2022). There are, however, a few notable exceptions, including Kepler-7 b with Ag ~ 0.3 suggesting the presenceof a reflective cloud deck over at least some of the planet (Kipping & Bakos 2011; Demory et al. 2011, 2013; Heng et al. 2022), and HD 189733 b with an increasing albedo towards bluer wavelengths as observed with low resolution Hubble Space Telescope spectra (Evans et al. 2013; Barstow et al. 2014). Thus, the literature shows that, whilst the majority of hot Jupiters have a very low Ag, there are sufficient exceptions that the diversity in albedo warrants investigation. In addition to the challenge of low albedo, for reflection HRCCS, the stellar and planetary spectra are highly correlated because the targeted planets reflect the spectrum of their host star. Thus, reflection HRCCS must overcome the substantial additional challenge of successfully disentangling two very similar signals.

With these challenges in mind, in this paper we present our study of optical reflected light from 51 Pegasi b using HRCCS, where we remove stellar and telluric contaminants directly from the spectra rather than in cross-correlation space, adapting processes used previously for infrared HRCCS analysis techniques. In this respect, we were aided by the superior stability of the HARPS-N spectrograph at the TNG used to obtain our data. In Sect. 2.1, we present a brief summary of the previous studies of reflected light with HRCCS, with a focus on studies of 51 Pegasi b. In Sect. 2.2, we describe the adaptations necessary to use HRCCS in reflected light. This includes the important consideration of rotational broadening due to the apparent rotation of the star as seen from the planet during its orbit. This has so far been neglected in previous studies of 51 Pegasi b, but has the potential to impact how well a planet spectrum can be recovered by HRCCS and the interpretation of its contrast ratio (see Sect. 2.3). We describeour observations in Sect. 3, and Sect. 4 details our post-processing methods, including how we remove the host star spectrum contaminant. Our results and upper limits are shown in Sect. 5 and we assess our adaptations for reflection HRCCS in Sect. 6 along with an in depth comparison to previous studies of 51 Pegasi b. We conclude our study in Sect. 7.

2 Reflected light high resolution cross-correlation spectroscopy (HRCCS)

2.1 Previous studies of reflected light with HRCCS

Several previous works have attempted reflection HRCCS, with some studies resulting in planet-to-star contrast upper limits, for example at the 1.5 × 10−5 level for τ Boö b (Hoeijmakers et al. 2018a). Given its bright host star magnitude, four previous works have sought the reflected light from 51 Pegasi b (Martins et al. 2015; Borra & Deschatelets 2018; Di Marcantonio et al. 2019; Scandariato et al. 2021). We briefly summarise the status of these 51 Pegasi b investigations here as motivation for this work, as they have presented conflicting conclusions.

The first, Martins et al. (2015, hereafter M15), use a set of sparse and randomly sampled archival HARPS-N spectra, and follow the method outlined in Martins et al. (2013), which removes the contaminating host star spectrum in cross-correlation space. They generated cross-correlation functions (CCFs) using the weighted G2 binary mask in the HARPS-N data reduction pipeline (Pepe et al. 2002), first for star-only spectra when the planet day side is hidden, and then for each star + planet spectrum. They divide out the star-only CCF from the star + planet CCFs to remove the host star spectrum, in order to obtain a planet-only CCF. They then use the height of the peak of the planet-only CCF as a metric for the strength of the planet detection, which they refer to as the signal amplitude. M15 used their measured signal amplitude to suggest evidence of a detection of the reflected light from 51 Pegasi b at 3σ level, corresponding to a very bright planet-to-star contrast ratio of 1.2 ×10−4. This would correspond to a very inflatedplanet radius of Rp = 1.9RJ with a bright geometric albedo (Ag = 0.5). They also noted significant broadening of the signal, but demonstrate that this is introduced by their processing methods in the low signal-to-noise (S/N) regime.

Borra & Deschatelets (2018) use the same archival data as M15, and a similar method to obtain a planet-only CCF. However, they generated their CCFs by using the observed spectra themselves as the cross-correlation templates, rather than using the G2 weighted mask. These so-called autocorrelation functions (ACFs) have the advantage that they are always perfectly aligned to the stellar rest frame1.

Borra & Deschatelets (2018) find an even deeper signal amplitude, which they interpret as a detection at the 5.5σ level. The width of their planet-only ACF is broader than their stellar ACF, but to a lesser extent than seen in the CCFs of M15. Borra & Deschatelets (2018) propose that 51 Pegasi b is tidally locked, and that the apparent broadening is due to noise. They do not make inferences about the resultant planet properties, such as the contrast ratio or Ag and Rp, but the deeper amplitude suggests an even brighter or larger planet.

Di Marcantonio et al. (2019) again used the same archival data as M15, but use instead the method of Independent Component Analysis (ICA). In contrast to M15 and Borra & Deschatelets (2018), they do not recover a reflected light signal from 51 Pegasi b, which they ascribe to the insufficient S/N of the dataset.

Finally, Scandariato et al. (2021, hereafter S21) present an analysis following the method of Borra & Deschatelets (2018), but use an augmented archival dataset which only includes long dedicated observing sequences from individual nights. S21 do not recover a reflected light signal from 51 Pegasi b, reaching instead a 3σ upper limit on the contrast ratio of 10−5, which they use to predict that 51 Pegasi b is an average-sized hot Jupiter (0.9 ≤ Rp ≤ 1.5RJ) with low albedo (Ag < 0.1). They also show that no significant broadening was imparted by their processing techniques. Their result is in stark contrast to the previous analyses presented in M15 and Borra & Deschatelets (2018). S21 suggest these previous claims of reflected light from 51 Pegasi b may be affected by an unlucky combination of false positives.

2.2 Adaptations for reflection HRCCS

First, let us consider the theory behind HRCSS, as applies for transmission, emission and reflection HRCSS. High-resolution spectroscopy resolves molecular bands into a dense forest of thousands of spectral lines, which each adhere to a unique pattern in wavelength space that is difficult to reproduce with random noise. For short orbit exoplanets, the success of HRCCS relies on using the large Doppler shift (~km s−1) of the exoplanet during its orbit to separate its spectrum from its host star. The spectral lines of the exoplanet change considerably in wavelength during even a fraction of its orbit, while its host star and Earth’s telluric lines appear quasi-static. Consequently, any spectral feature that does not change in wavelength over time can be identified and removed, typically using algorithms such as Principle Component Analysis (PCA) or SYSREM (Tamuz et al. 2005; Mazeh et al. 2007), leaving behind the Doppler-shifting exoplanet spectrum. However, there remains the challenge of the very high contrast ratio between the exoplanet and its host star, which varies from 10−3 for some ultra-hot Jupiters in the near infrared to only 10−10 for an Earth-analogue orbiting a Sun-like star in the optical. In order to amplify any signal from the exoplanet, its individual spectral lines are combined by cross-correlating the residuals (that is, the observed spectra after the time invariant features are removed) with a model of the exoplanet atmosphere. In an idealised case, where each line is assumed to have equal depth and the data is photon-noise dominated, this increases the S/N of the planet by (following e.g. Snellen et al. 2015; Birkby 2018): S/ N p =( F p F ) S/N N lines , \begin{equation*}{S/N}_{\textrm{p}} = \left(\frac{F_{\textrm{p}}}{F_{\star}}\right)\ \textrm{S/N}_{\star}\sqrt{N_{\textrm{lines}}},\end{equation*}(1)

where Fp is the flux from the planet, F is the flux from the star, S/N is the total S/N of the observed spectra and Nlines is the number of spectral lines.

HRCCS was developed in the near infrared, in order to study exoplanets in transmission and thermal emission (e.g. Snellen et al. 2010; Brogi et al. 2012, 2016; Birkby et al. 2013, 2017; Casasayas-Barris et al. 2019; Webb et al. 2020; Boucher et al. 2021), and it has only recently been applied in the optical for ultra-hot Jupiters such as Kelt-9 b (e.g. Yan & Henning 2018; Hoeijmakers et al. 2018b, 2019; Pino et al. 2020) and Wasp-76 b (e.g. Ehrenreich et al. 2020; Deibert et al. 2021). In all these cases, the exoplanet spectrum is typically very distinct from its host star. A host star may contain (as an example) CO or H2O lines but these are typically modified in strength compared to those in the exoplanet spectrum due to the different abundances, structures, and temperatures of their respective atmospheres.

In reflected light, however, the exoplanet reflection spectrum (hereafter planet spectrum, Fp (λ)) is a replica of the host star’s spectrum (hereafter stellar spectrum, F(λ)), with the exoplanet’s geometric albedo2 as function of wavelength (Ag(λ)) imprinted on top, as follows: F p (λ)= F (λ)g(α) A g (λ) ( R p a ) 2 , \begin{equation*}F_{\textrm{p}}(\lambda) = F_{\star}(\lambda)\ g(\alpha)\ A_{\textrm{g}}(\lambda)\ \left(\frac{R_{\textrm{p}}}{a}\right){}^2,\end{equation*}(2)

where g(α) is the phase function, α is the phase angle, Rp is the planet radius and a is the semi-major axis. For more detail on the phase function, see Sect. 4.3.

Thus, for reflection HRCCS the cross-correlation template will match with not only the exoplanet’s spectrum but also with any residual host star spectrum that was not removed by the cleaning algorithms, potentially to a level that prohibits reaching the required contrast ratios to detect the planet. Unlike near infrared spectra, where removing telluric contamination is prioritised, optical spectra are comparatively free of strong tellurics, and removal of the host star spectrum is instead the priority.

From the reference frame of the Earth, the exoplanet’s velocity Vp varies with the orbital phase, ϕ, according to: V p (ϕ)= K p sin(2πϕ)+ v bary + v sys , \begin{equation*}V_{\textrm{p}}(\phi)= K_{\textrm{p}}\sin(2\pi\phi) + v_{\textrm{bary}} + v_{\textrm{sys}},\end{equation*}(3)

where the first term is its radial velocity (RVp) and Kp is the RV semi-amplitude of the exoplanet, the second term vbary is its apparent motion due to the Earth orbiting the barycentre of the Solar system, and the last term vsys is the constant systemic velocity of the star–planet system through the galaxy with respect to the barycentre of the Solar System. The same equation applies to the host star to find V(ϕ), by substituting Kp with − K. The intra-night variation in vbary can be sufficiently large that an ultra-stable spectrograph will measure the host star spectrum moving during the half-night of observations, offsetting it from the less dominant, but not completely negligible telluric lines, and negating the time-invariant assumptionsmade in the cleaning algorithms. Consequently, we need to operate in the rest frame of the star rather than the barycentric frame to use the cleaning algorithms designed for HRCCS, albeit at the expense of the quality of telluric removal. We discuss and illustrate this issue further in our data analysis in Sect. 4.2.3.

As the cross-correlation template needs to be predominantly a replica of the host star, it could be created from the observations themselves, by making a high S/N reference spectrum of the host star. However, there are potential issues with this approach. At the contrast ratios we aim to achieve, even small instrumental systematics and tellurics could act to obscure the planet signal in the cross-correlation (see Sect. 6.1). But importantly, as described below in Sect. 2.3, the planet may broaden the stellar lines it reflects, which necessitates changing the shape of the stellar lines in the cross-correlation template to achieve maximum correlation. Broadening the observed stellar spectrum also broadens any telluric contamination or instrumental systematics, which could then obscure any planet signal in the cross-correlation. The lines in the planet’s albedo function are also neglected when using only an observed stellar spectrum. Consequently, high resolution spectral libraries offer a useful alternative for providing a cross-correlation template.

2.3 The impact of rotational broadening on reflected light

The light emitted from and reflected by an exoplanet is subject to rotational broadening. From the perspective of an observer in the Earth or barycentric reference frame (Observer A), the light emitted and reflected from an exoplanet is Doppler broadened due to that exoplanet’s rotation on its own axis. This is the exoplanet’s projected rotational velocity, as seen from the Earth’s line of sight, vproj,p. In the case of reflected light, however, there is an additional source of broadening to consider due to the rotation of the exoplanet’s host star and the planet’s orbital period. Let us imagine another observer (Observer B) situated on the exoplanet itself. In some systems, such as τ Boö b (Rodler et al. 2010; Hoeijmakers et al. 2018a), the planet’s orbit and the stellar rotation are fully synchronised, such that it would appear to Observer B that the star was not rotating. However, for most hot Jupiters – such as 51 Pegasi b and other tidally locked planets – this is not the case. The planet keeps the same hemisphere facing the star, but the star does not maintain the same hemisphere facing the planet throughout its orbit. Thus, from the perspective of Observer B, the star will appear to be rotating at a velocity that depends on the difference between the true stellar rotation period Prot,⋆ and planet orbital period Porb (see Eq. (5)). Let us call this the star’s rotational velocity in the reference frame of the exoplanet v⋆,p. In order to understand intuitively why v⋆,p has an impact on the rotational broadening, consider that due to v⋆,p, each line in the stellar spectrum originated at different velocities across the stellar disk as observed by Observer B. However, Observer B cannot observe parts of the stellar disk individually, and thus the star appears as an integrated disc. Therefore, from Observer B’s perspective, each stellar line will be a composite of the Doppler shifts from the integrated velocities, causing each line to appear broadened.

Thus, from the perspective of Observer A in the Earth or barycentric reference frame, the disk-integrated light reflected from the exoplanet will be rotationally broadened according to two factors: vproj,p and v⋆,p. In order to quantify the extent of the rotational broadening of the light reflected from 51 Pegasi b, we referred to Eqs. (9) – (11) in Rodler et al. (2010).

We first assumed that 51 Pegasi b is tidally locked on a circular orbit, and thus Prot,p = Porb = 4.230784 ± 0.000004 days (Scandariato et al. 2021). Then, vproj,p follows as: v proj,p =2πsini R p P rot,p =1.46±0.12km s 1 , \begin{equation*}v_{\textrm{proj,p}} = 2 \pi \sin{i} \frac{R_{\textrm{p}}}{P_{\textrm{rot,p}}} = 1.46 \pm0.12 \,\textrm{km}\,\textrm{s}^{-1},\end{equation*}(4)

where i is the inclination of the planet’s orbit relative to the Earth, which we set to i = 80.9 ± 1.3 ° (Brogi et al. 2013), and Rp is the exoplanet radius, which for this non-transiting planet, for the purposes of constraining the rotational broadening, we have set to be 1.2 ±0.1 RJ3.

The v⋆,p is calculated as (assuming no misalignment of the star–planet spin–orbit axes): v ,p =2π R ( 1 P rot, 1 P orb )=11.93±0.45km s 1 , \begin{equation*}v_{\star,\textrm{p}} = 2 \pi R_{\star} \left(\frac{1}{P_{\textrm{rot},\star}} - \frac{1}{P_{\textrm{orb}}}\right) = -11.93\pm0.45\,\textrm{km}\,\textrm{s}^{-1},\end{equation*}(5)

where R = 1.237 ± 0.047 R is the stellar radius (van Belle & von Braun 2009), and Prot,⋆ is the stellar rotation period, for which we used Prot,⋆ = 21.9 ± 0.4 days (Simpson et al. 2010). The negative value of v⋆,p here indicates that to Observer B, the star would appear to be rotating counter to its true direction of orbit.

Finally, the two components of the rotational broadening, vproj,p and v⋆,p, are added in quadrature to give: v refl = v proj,p 2 + v ,p 2 =12.02±0.45km s 1 , \begin{equation*}v_{\textrm{refl}} = \sqrt{v_{\textrm{proj,p}}^2 + v_{\star,\textrm{p}}^2} = 12.02\pm0.45\,\textrm{km}\,\textrm{s}^{-1},\end{equation*}(6)

where vrefl is the total rotational broadening. Thus, whenever we accounted for rotational broadening in this work, we set vrefl = 12.0 km s−1. This important value is based on an approximation that Rp = 1.2 ± 0.1 RJ. For 51 Pegasi b the radius value has a minimal impact on the value of vrefl as the largest contributionto vrefl comes from v⋆,p. For example, a value of Rp = 0.6 RJ gives a vrefl = 11.95 km s−1 and a value of Rp = 1.8 RJ gives a vrefl = 12.13 km s−1.

Figure 1 shows the impact of broadening on the depth of the spectral lines, making them shallower and more challenging to detect. These broadening effects are discussed extensively in Strachan & Anglada-Escudé (2020), who show that the amplitude of the spectral lines is deepest in a fully synchronised system (Prot,⋆ = Porb), and gets shallower as the two periods differ. The effect plateaus for slower rotating host stars, but the line depth amplitude rapidly decreases when the star is rotating faster than the planetary orbit. However, in cases where the cross-correlation template for the planet is significantly broader than the lines in the host star spectrum, this can in fact help to mitigate contamination of the CCF by residual host star features as the line shapes do not match (see Sect. 4.5).

The star’s projected velocity (vproj,⋆, calculated using an adapted Eq. (4) such that vproj,⋆ = 2πsini(RProt,⋆)) could affect the discrepancy between the broadening of the stellar and planet lines for some systems, as if a star rotates sufficiently rapidly its stellar lines will also be be broadened. However, we do not account for vproj,⋆ in this work, as for vproj,⋆ = 2.86 km s−1 for 51 Pegasi, which is only just above the velocity per resolution elements of HARPS-N (Δv= 2.6 km s−1) and HARPS (Δv = 2.5 km s−1). Thus, the impact of vproj,⋆ will be negligible, and the stellar lines will be effectively unbroadened. See Sect. 3 for more detail on the velocity per resolution element of HARPS-N and HARPS.

thumbnail Fig. 1

Model spectrum of reflected light for 51 Pegasi b based on a continuum normalised synthetic spectrum of its host star, at the spectral resolution of HARPS-N (Fmod, norm, R = 115 000, blue line in left-hand panel). The black solid line shows the effect of broadening the model to vrefl = 12.0 km s−1 via convolution with the rotation kernel shown in the right-hand panel. It is approximately equivalent to R ~ 25 000. Many of the narrow spectral features are blended and shallower in the broadened model, which reduces the last term of Eq. (1) and lowers S/Np.

Table 1

Summary of the eight nights of observations.

thumbnail Fig. 2

Phases, ϕ, covered during the eight nights. The open grey circles show all observations and the solid coloured circles show the frames used in the final analysis (see Table 1) from each night. Discarding unusable data from Night 4 left a gap in the phase coverage. The smallest phase observed (ϕ = 0.36), the largest phase observed (ϕ = 0.64), and the point of inferior conjunction (ϕ = 0.5), are marked.

3 Observations

In this work, made use of both dedicated and archival observations of the main-sequence star 51 Pegasi, a bright star of spectral type G2 IV and apparent magnitude V = 5.46 mag. We ran our dedicated observation programme at the TNG using HARPS-N, while the archival data was collected at both HARPS-N and HARPS. Together, these datasets almost precisely replicate the data analysed by S21, enabling comparison between our results. The data is summarised in Table 1. We used the time of inferior conjunction (Tc) in the orbital solution of S21 to calculate the orbital phases (ϕ) of these time-critical observations, which in total spanned 0.395 < ϕ < 0.612. The total change in the radial velocity of the planet (RVp) during the observations was ~ΔRVp = −206.5 km s−1. We note that while Nights 2, 4, 7, and 8 have the highest average mean S/N in our dataset (after the removal of poor quality frames) the overlapping RV of the planet and stellar lines during the phases obtained in Night 2, the fewer usable frames obtained in Night 4, and the relative distance from ϕ = 0.5 for Nights 7 and 8 result in Night 1 having the best sensitivity for extracting the planet spectrum. Details of our dedicated observing programme and the archival data we used are presented below in Sects. 3.1 and 3.2.

3.1 Dedicated HARPS-N observing programme

We observed 51 Pegasi b continuously during four half-nights between 2015–2016, under Programmes CAT15B_146, CAT16B_146 and CAT16B_143 (hereafter Night 1, 2, 3 and 4, see Table 1). We collected a total of 274 exposures spanning 0.43< ϕ < 0.56, chosen to coincide with the passage of 51 Pegasi b when its illuminated day side hemisphere was orientated towards the Earth (see Fig. 2) for maximum signal. However, some of the exposures were collected when the radial velocity of the planet (RVp) overlapped with the radial velocity of the star (RV) that is at ϕ = 0.5. While not ideal due to increased contamination by the host star (see Sect. 4.5.3), this only affected a small number of frames.

The observations were carried out in visitor mode using HARPS-N (Cosentino et al. 2012), a fibre-fed echelle spectrograph on the Nasmyth B Focus of the 3.6-TNG telescope, located at the Observatorio del Roque de Los Muchachos, La Palma, Spain. There are two HARPS-N fibres, one for the object and one for calibration (sky or Th–Ar lamp), each with an aperture of 1 arcsec. Each fibre’s input is re-imaged by the spectrograph optics onto the two 4096 × 2048 pixel CCDs, designed for enhanced response in the optical regime, specifically 385 < λ < 691 nm. This results in a resolving power R ~ 115 000, with a velocity per resolution element of cR= 300 000∕115 000 = 2.6 km s−1, which is sampled by the HARPS-N full width at half maximum FWHM = 3.2 pixels, resulting in a velocity resolution per pixel of 2.6∕3.2 = 0.8 km s−1. For each frame, a spectrum comprising 69 echelle orders is formed per fibre.

The lamp calibration frames were all taken prior to the start of science observations, and during science observations the reference fibre was put on the sky. Simultaneous reference lamps were not used concurrent with science observations in order to minimise the possibility of cross-contamination. Thus, only lamp calibration frames were available to determine the wavelength solution. Wenote that, given the high stability of the temperature and pressure conditions in the chamber containing HARPS-N, the wavelength solution obtained from the calibration frames at the beginning of the night will be sufficiently precise for our purposes (Nugroho et al. 2020a). We do not need to flux calibrate the spectra and thus did not observe standard stars.

The observing conditions across the nights were variable. Despite this, the length of each exposure was kept at 200 s, in order to maintain consistent instrumental noise properties. Nights 2 and 3 included periods of dome closure due to poor weather, and Night 3 was further disrupted by tracking problems, resulting in fewer frames collected on these nights. Adverse observing conditions at the end of Night 4 also substantially reduced the S/N of these spectra, hence we have discarded 35 frames from our analysis, which left a short gap in the phase coverage of the dataset (see Fig. 2). S21 similarly cut 31 frames from this night of data. The additional four frames that we cut from Night 4 is the only difference between the datasets used in this work and S21.

3.2 Archival HARPS-N and HARPS data

Nights 6 and 7 are archival HARPS-N spectra from GAPS programme (see Table 1). We used only the optical data associated with this program as reflected light from 51 Pegasi b is strongest at these wavelengths. This programme used the same 200 s exposure times as our dedicated program and we used all 159 spectra collected across the two nights.

Nights 5 and 8 are archival HARPS spectra collected with the ESO La Silla 3.6 m telescope under programmes 091.C-0271 and 0101.C-0106(A), respectively. While these programmes contain spectra obtain sporadically and sparsely across multiple nights each, we only use spectra obtained during long continuous observing sequences during any single night in order to match the observing strategy in our dedicated observing program, and the analysis of S21. Night 5 corresponds to the data used in the previous analyses of reflected light from 51 Pegasi b (M15; Borra & Deschatelets 2018; Di Marcantonio et al. 2019). Both nights have different exposure times to the HARPS-N datasets (see Table 1), but have photon-dominated noise, and so we opted to use them to remain consistent with S21.

4 Method

The light reflected from 51 Pegasi b is much fainter than its host star. Therefore, to reveal the planet spectrum, we needed to remove stellar contamination from the observed spectra, ideally to the photon noise level, before we could search for the planet signal using the cross-correlation method. Because we rely on the Doppler-shift of the planet spectrum to isolate it from the star, we aimed to remove all features that were stationary in wavelength over time.

For ease of understanding and replication, in this section, we have first summarised the function of the HARPS-N data reduction software (DRS, Lovis & Pepe 2007, see Sect. 4.1) that is used by the TNG archive from which we obtained the extracted spectra. The further post-processing that we carried out is detailed in Sect. 4.2. Section 4.3 describes: i) the process for injecting a synthetic stellar spectrum into the data, which we used to test the efficacy of the pipeline and to obtain an upper limit on the planet-to-star contrast ratio; and ii) how we accounted for the rotational broadening of the planet signal in the pipeline. In Sect. 4.4, we describe our method for removing stellar lines from the dataset, including the use of the SYSREM algorithm. Section 4.5 shows how we performed the cross-correlation. Each night of data was processed separately until after the cross-correlation was performed.

4.1 HARPS-N DRS and the e2ds format

Every exposure collected by HARPS-N is processed by the HARPS-N DRS, taking the data from raw 4096 × 4096 pixel detector format to extracted spectra and radial velocity drift (RVdrift) measurements. For our dataset, the DRS corrected for bad columns on the CCD and the bias, then extracted the 69 individual orders in each frame using the Horne optimum extraction method (Horne 1986), and flat-fielded.

The DRS provides data in two formats: extracted two-dimensional spectra (e2ds) and extracted one-dimensional spectra (s1d, see top two panels, Fig. 3). The e2ds format provides a two-dimensional matrix for each frame, where each row represents a single order (i.e. 4096 × 69). Thus each row starts at a different wavelength value and spans a different wavelength range. The unique wavelength solution for each order is calculated by the DRS. There is a significant overlap in wavelength range between each order. The e2ds spectra remain in the Earth rest frame. Conversely, s1d format provides a single one-dimensional ‘stitched’ spectrum for each frame. To make this, the 69 orders are first blaze-corrected, then merged together and re-binned onto a regularly spaced wavelength grid by the DRS, and are then interpolated onto the barycentric reference frame. The s1d format has been the chosen format for several publications(see e.g. Hoeijmakers et al. 2018a; S21); however, the additional steps of merging the orders and rebinning the data risks increasing the systematics (Pino et al. 2020), hence we elected to use the less processed e2ds data.

The DRS provides a wavelength solution (a third order polynomial) per order for every spectrum, which is in pixel spacing for the e2ds format (i.e. an irregular wavelength grid). For each night, we used the wavelength solutions obtained from the lamp calibration frames (see Sect. 3). This wavelength solution was sufficiently precise to determine the ~km s−1 variations of the planet radial velocity.

4.2 Post-processing the data

4.2.1 Separating the orders

To facilitate data handling and the removal of stellar lines, we first reorganised the extracted spectral matrices. For each night, we rearranged the e2ds spectra into 69 two-dimensional matrices, one for each spectral order, where each column corresponded to a single wavelength channel (detector pixel) and each row showed the spectrum at a given time (i.e. a 69 matrices of 4096 × Nf). This served to highlight variations in time and wavelength, such as changes in flux due to clouds coverage or seeing effects, as well as barycentric motion and long term radial velocity drift. Accounting for the latter helps especially in the removal of the stellar lines (see Sect. 4.2.3).

thumbnail Fig. 3

Extracted spectra (Fobs) in both e2ds (top panel) and s1d (middle panel) formats, and the synthetic stellar spectrum (Fmod, bottom panel) used in the cross-correlation. The units e are photoelectrons. A zoom around wavelengths 538.2–533.2 nm (order 43) is shown in blue and the right-hand panels. These plots show the dense forest of spectral lines in our dataset. The continuum variation in the e2ds format is the uncorrected blaze function, while in the s1d format we see the residual low order instrument response in our non-flux calibrated data.

4.2.2 Outlier removal

Despite bad pixel correction by the DRS, some outliers still remain that propagate into the extracted spectra. These outliers can adversely affect the removal of stellar lines (see e.g. Birkby et al. 2013). To highlight them, for each matrix, we first continuum normalised each row (see Sect. 4.4 for a description of our continuum normalisation method), then subtracted the mean column value from each wavelength channel. To determine if a pixel was an outlier, we calculated the median and standard deviation of its six nearest neighbour pixels in the same row and if it was more than six standard deviations above or below this median, it was replaced by the linear interpolation of these neighbours in the e2ds spectra. The choice of the 6σ threshold led to approximately 0.3% of pixels being identified as an outlier across the dataset, but was sufficiently high that variation due to any potential planet signal would not be mistakenly removed.

thumbnail Fig. 4

Section of order 35 from Night 1, aligned to different rest frames, with the corresponding standard deviation shown in the top left of each panel. For each alignment, the matrix has been continuum normalised and then the mean of each column has been subtracted. This exposes the characteristic “X” pattern caused by misalignment of the stellar lines. The noise is reduced closest to the photon noise in the host star’s true rest frame, reducing the standard deviation by a factor of 1.8.

4.2.3 Alignment to the stellar rest frame

The optical spectra of the 51 Pegasi system were dominated by the stellar lines, and were far less significantly contaminated by the telluric lines that plague high resolution spectra in the infrared regime (e.g. Birkby et al. 2017). Here, we demonstrate that the ideal rest frame to remove this major stellar contaminant is the rest frame of the star, which departs from previous infrared studies with HRCCS that aligned to the telluric rest frame (e.g. Birkby et al. 2013). This is visualised in Fig. 4, where each extracted spectrum has been continuum normalised (see Sect. 4.4), and then the mean of each wavelength channel has been subtracted to highlight any spectral features that are not stationary in wavelength over time. Any under- or over-subtraction by the mean column values results in a characteristic ‘X’ pattern in the residuals, indicating misalignment of the stellar lines over time. Previous works have shifted the spectra to the barycentric reference frame to mitigate this effect for stellar lines. The middle panel of Fig. 4 highlights the ~33% reduction in the standard deviation of the residuals when the spectra are first aligned to the barycentric reference frame. The value for the barycentric earth radial velocity (BERV) was taken from the DRS. We then further aligned our spectra to the true rest frame of the host star using the RVdrift values calculated by the DRS. We note that the RVdrift exceeded the expected values due to the influence of 51 Pegasi b alone. In this true stellar rest frame, we achieve an additional ~11% reduction in the standard deviation of the residuals, resulting in a ~44% total reduction in comparison with the residuals in the Earth’s rest frame, indicating that the stellar rest frame is the best rest frame to work in for removing stellar contamination (see the bottom panel of Fig. 4).

We used linear interpolation to shift the extracted spectra into the true stellar rest frame. A side effect was that any telluric lines were then misaligned, and were therefore removed less effectively (see Sect. 4.4). We prioritised the optimal removal of the host star’s spectrum rather than the tellurics as discussed in Sect. 1, noting also that only the reddest spectral orders of HARPS-N were significantly affected by telluric lines. However, in the case of seeking the reflected spectrum from an Earth-like planet, one would need to carefully consider both sources of contamination, including micro-tellurics (Cunha et al. 2014).

4.3 Injecting a model

We consider four separate cases throughout this work:

  1. No model injected, assume any planet signal is rotationally broadened according to vrefl = 12.0 km s−1;

  2. A model injected that is broadened to match HARPS-N’s spectral resolution, and rotationally broadened according to vrefl = 12.0 km s−1;

  3. No model injected;

  4. A model injected that is broadened to HARPS-N’s spectral resolution only.

We inject a model planet into our data in order to determine an upper limit on the star–planet contrast ratio (see Sect. 5.4). Case 2 enables us to determine the impact of rotational broadening on our ability to recover a realistically broadened signal from 51 Pegasi b, whilst case 4 allows us to find the maximum sensitivity of the observations in the case where we are limited by the instrument resolution. Figure 1 shows the model planet spectrum in both cases. As discussed in Sect. 1, we use a synthetic spectrum of the host star as our reflected light model in order to avoid any systematics and tellurics that might otherwise contaminate a reference made from the observed data, and to correctly broaden the spectral lines. In this work, for simplicity we assume that Ag (λ) = Ag is constant with wavelength. This approximation is typical in previous HRCCS studies, whilst more detailed models of Ag (λ) are reserved for high S/N detections.

4.3.1 Synthetic model planet spectra

We obtained the synthetic stellar spectrum from the Coelho (2014) stellar library, with Teff = 5750 K, log (g) = 4.5, [Fe/H] = 0.2, and [α∕H] = 0.0 at a wavelength resolution of R = 300 000. We use the continuum normalised spectrum as the model (Fmod, norm(λ′), where λ′ is in the laboratory rest frame).

In case 4, where we only consider broadening due to the instrument resolution of HARPS-N, we simply convolve the spectrum with a Gaussian to match the HARPS-N spectral resolution (R = 115 000). In case 2, to account for the rotational broadening in the reflected light from 51 Pegasi b, we convolve the model with a rotation kernel (see right panel, Fig. 1) using vrefl = 12.0 km s−1 (as calculated for 51 Pegasi b in Sect. 2.3) with spacing, vstep = 0.8 km s−1 corresponding to HARPS-N’s velocity resolution per pixel4.

4.3.2 Adding the model to the data

When injecting a model, we assume that the observed spectra Fobs(λ) contain only light from the host star, that is Fobs(λ) = F(λ). The sequence of injecting a model planet into the observed spectra is as follows: i) broaden the model according to case 2 or 4; ii) scale the model according tog(α), Ag, Rp and a (see Eqs. (2), (7) and (8)); iii) Doppler-shift the model according to the orbital phase at the time of observation; iv) multiply the Doppler-shifted model, Fmod, norm(λ), by the continuum of the observed spectra, Fobs, cont(λ); and finally v) add the model to the data. We include step iv because our observed spectra are not flux calibrated, so multiplying by Fobs, cont ensures that the model is injected at the correct contrast ratio.

The Ag factor assumes that all the light reflected from 51 Pegasi b’s dayside is directed towards Earth, but of course this is not the case. In order to account for this, the phase function, g(α), outputs a factor between 0 and 1 that indicates the proportion of the light that is reflected towards Earth’s line of sight. The phase function is determined by the phase angle α, which is calculated by: cosα=sinicos(2πϕ). \begin{equation*}\cos{\alpha}\,=\,-\sin{i}\,\cos{(2\pi\phi)}.\end{equation*}(7)

Following S21, if we assume that 51 Pegasi b follows Lambert’s scattering law, then the phase function is calculated by: g(α)= sinα+(πα)cosα π . \begin{equation*}g{(\alpha)}\,=\,\frac{\sin{\alpha}\,+\,(\pi - \alpha)\cos{\alpha}}{\pi}.\end{equation*}(8)

The phase function is maximised to 1 in the case where an exoplanet has i = 90° and if at inferior conjunction (ϕ = 0.5, although in the case of 51 Pegasi b the planet would then be blocked from the Earth’s line of sight by its host star). The phase function is minimised when the exoplanet is in superior conjunction (ϕ = 0.0).

To account for the Doppler shift associated with each observed spectrum, we offset Fmod, norm(λ′) by Δλ = RVp(Kp, ϕ)∕c, where the planet RVp semi-amplitude Kp = 133 km s−1 was taken from literature (e.g. Brogi et al. 2013; Birkby et al. 2017) and the phase was calculated as in Sect. 4.2. This resulted in a new wavelength grid for each frame of λ = λ′ (1 + Δλ), and we interpolated Fmod, norm(λ′) onto each new wavelength grid, generating Fmod, norm(λ) in Eq. (9) below at each time step. In all instances, the model is injected after the alignment of the observed spectra to the rest frame of the host star, but prior to the removal of the stellar lines.

The sequence of model injection is described by the following equation, where Fmod, norm(λ, vrefl) is the appropriately broadened and Doppler-shifted model: F (λ)+ F p (λ) = F obs (λ)+[ F mod,norm (λ, v refl ) F obs,cont (λ)g(α) A g ( R p a ) 2 ]. \begin{multline*}F_{\star}(\lambda)+F_{\textrm{p}}(\lambda)\\=F_{\textrm{obs}}(\lambda)+\left[F_{\textrm{mod},\,\textrm{norm}}(\lambda,v_{\textrm{refl}})\,F_{\textrm{obs},\,\textrm{cont}}(\lambda)\,g(\alpha)\,A_{\textrm{g}}\left(\frac{R_{\textrm{p}}}{a}\right){}^2\right].\end{multline*}(9)

thumbnail Fig. 5

Left panels: sections of the order 35, from Night 1, shown at various stages of post-processing, demonstrating the removal of the stellar residuals. Right panels: sections of frame 38 from order 35, demonstrating the removal of the stellar residuals from a single spectrum. Row 1: observed spectra after bad pixel correction and alignment to the true stellar rest frame; row 2: after the removal of each order’s continuum; row 3: after division by a master spectrum; row 4: after masking the columns with over 1.4× the mean column variance, followed by one iteration of the  SYSREM algorithm; row 5: after weighting columns by their variance; bottom row: same stage as for Row 5, but with a model planet (Rp = 1.9RJ, Ag = 0.5, vrefl, mod = 0.0 km s−1) injected at 100× nominal strength after the alignment to the stellar rest frame. At each stage of post-processing, the standard deviation of the entireorder decreases.

4.4 Removal of stellar and telluric lines

51 Pegasi b orbits a G2 IV star that has abundant spectral features at optical wavelengths. As discussed in Sect. 1, these same lines appear in the planet spectrum, but Doppler-shifted by the planet’s radial velocity, and thus their multiplicity aides the recovery of the planet spectrum in the cross-correlation (boosting the S/N by N lines $\sqrt{N_{\textrm{lines}}}$). However, it also means the cross-correlation template will match with any residual host star spectrum left in the data. Hence, the removal of the host star spectrum is key to extracting the faint planet spectrum. In infrared HRCCS where tellurics dominate, stellar lines have been removed by modelling them directly (Brogi et al. 2016; Chiavassa & Brogi 2019). However, given that they are numerous and the main source of contamination in our optical spectra, and that we expect systematic effects (e.g. air mass variations) to impact the spectra as well, we opt instead for a data-driven approach. With the spectra now in the true stellar rest frame, the stellar lines are stationary in wavelength over time and we can remove the host star spectrum, as described below and illustrated for Night 1 in Fig. 5 (similar for the other nights can be found in Fig. B.1).

We first removed the stellar continuum from each spectrum. To model the continuum, we divided each residual spectrum into 20 sections and assigned the median of each as the continuum value at the central wavelength of each section. We interpolatedthese values onto the wavelength grid for their order, then divided each residual spectrum by the continuum model.

We then created a master spectrum for each individual order by taking the mean of all the frames. We then divided each spectrum in the order by the master spectrum. This successfully removed the weaker stellar lines, however, at this stage, there were still traces of the deeper stellar lines in the data and other high order systematic effects (see Fig. 5). We opted to mask the strongest residuals before any further cleaning. To make the mask, we first calculated the variance of the residuals in each column, and then calculated the median column variance. We divided each column’s variance by this value and applied a sigma clip. For cases 1 and 2 (see Sect. 4.3) we used a lower 1.4 sigma clip, resulting in a more aggressive stellar mask. This minimised the number of times that the cleaning algorithm SYSREM (see below) needed to be run, as we found that broadened planet signals are susceptible to removal by SYSREM (see Sect. 4.4.1). For cases 3 and 4 we used a higher 1.6 sigma clip, resulting in a less of the data being masked and available for use in the cross correlation. A full investigation of the optimal masking approach for HRCCS is beyond the scope of this work, and deferred to future study.

4.4.1 SYSREM

We next used the SYSREM algorithm (Tamuz et al. 2005; Mazeh et al. 2007) to further remove remaining high order variance. SYSREM is similar to PCA but allows unique uncertainties to be assigned to the data points. It was developed to identify and subtract systematic trends from multiple light curves. For a set of M light curves, {l0, l1, ..., lx, ...lM}, each with N points, each iteration of SYSREM will converge on two trends: the a-values, {a0, a1, ..., ax, ..., aN} (see Fig. 6), which correspond to the number of points in each light curve, (i.e. Nf); and the c-values, {c0, c1, ..., cx, ..., cM}, which correspond to the number of light curves, M (i.e. the wavelengths channels). Once these trends have converged, each light curve lx is de-trended by subtracting its c-value multiplied by the a-values, {cx a0, cxa1, ..., cxax, ..., cxaN}. This provides a new set of de-trended light curves, and completes a single iteration of SYSREM. By treating each spectral order as a set of multiple light curves, and thus each individual wavelength channel of an order as a single light curve, SYSREM is readily adaptable to high-resolution spectroscopic datasets. Because the planet is Doppler-shifting, it in principle does not present as a trend common to any of the light curves. SYSREM has been used previously, with an emphasis on the removal of telluric features from high-resolution spectroscopic data (see e.g. Birkby et al. 2013, 2017; Merritt et al. 2020; Gibson et al. 2020; Yan et al. 2020; Kesseli et al. 2020; Nugroho et al. 2020a,b, 2021). For the individual uncertainties, we took the inverse square-root of each data-point prior to any removal of the stellar lines. This gave data-points with a higher S/N a lower associated uncertainty, and vice versa. We set the uncertainties for the masked regions to 1 × 108, namely a very high number so that SYSREM down weights them in its calculations.

SYSREM cannot be allowed to run indefinitely, as it will begin to converge on random patterns in the noise (see Fig. 6). Subtracting these would simply distort the noise and run the risk of obscuring any planet signal. Thus, the optimal stopping point for the algorithm must be determined, which will vary per dataset. In principle, the systematic trends could vary per order per night, as conditions are not uniform, and therefore optimal results could be obtained by using different numbers of SYSREM iterations per order, per night. However, some methods for optimising per order SYSREM have the potential to constructively add noise at the planet’s velocity, and in the limit of large datasets, this constructively compounded noise could create the appearance of a planet detection. We defer a full investigation into the optimal application of SYSREM beyond that of Cabot et al. (2019) to future work, as it merits its own study. In this work, we cautiously adopt the same approach that others have in recent literature of applying the same number of SYSREM iterations for each order across all the nights to avoid possible biasing (Cabot et al. 2019; Merritt et al. 2020; Gibson et al. 2020; Yan et al. 2020; Nugroho et al. 2020a,b, 2021).

To determine when to stop SYSREM, we used a variation of the ΔCCF method that is described in detail below in Sect. 4.5.2, and further justified in Sect. 6.4. In brief, the ΔCCF is simply the difference of the cross correlation functions (CCFs) created by Cases 1 and 2, or by Cases 3 and 4, that is the model-injected minus observed CCFs, where the injected model corresponded to the Rp and Ag as suggested by M15. These ΔCCFs are created for successive iterations of SYSREM until it is clear that SYSREM begins to remove the planet signal.

Some of the trends that SYSREM identified reflected a physical quantity, and other trends did not (see Fig. 6). For the reddest spectral orders, the trends identified by SYSREM are highly correlated with the variation in air mass, likely due to the higher concentration of now misaligned telluric lines in this region.

For the realistic cases, with a rotational broadening (cases 1 and 2, see Sect. 4.3), we assessed the results of using zero, one, two, three, and four iterations of SYSREM. We found that wings of the broadened model spectral lines were susceptible to being removed by SYSREM. One iteration of SYSREM provided the optimal balance between removing stellar and telluric contamination, without removing the injected broadened model (see top panel, Fig. 8). Thus, when running our pipeline for cases 1 or 2 we applied one iteration of SYSREM to each order, across all the nights. This did not adequately remove the stellar contamination, and thus we decided to use a more aggressive stellar mask prior to running SYSREM for cases 1 and 2.

For the most sensitive case, without rotational broadening (cases 3 and 4, see Sect. 4.3), we assessed the results of using up to eight iterations of SYSREM. Six iteration of SYSREM provided the optimal result (see bottom banel, Fig. 8). Thus, when running our pipeline we applied six iterations of SYSREM to each order, across all the nights. This meant we were able to use a less aggressive stellar mask prior to running SYSREM for cases 3 and 4.

thumbnail Fig. 6

Sets of SYSREM a-values from Night 1of data that has been masked with a 1.4σ clip (blue crosses, top three rows, see Sect. 4.4), 2nd degree polynomials fitted to detect any trends (black lines, top three rows), and possible causes of the observed  SYSREM trends (purple crosses, bottom row). Top three rows: a-values subtracted during the first, second, third and fourth iteration of  SYSREM for orders 1, 35 and 69, respectively. There is a large variety in the shape of the trend removed during each iteration, and little similarity between the trends, indicating a strong wavelength dependence. Bottom row: air mass, primary mirror channel temperature, local air pressure and humidity during Night 1 of observations. The air mass values tightly correlate with the a-values for order 69 where tellurics were strongest. The other physical trends do not correlate with any of the sets of a-values.

thumbnail Fig. 7

Section of order 69, from Night 1 of observations, after zero, one, two, three and four iterations of the SYSREM algorithm respectively. After each iteration, the standard deviation of the order is reduced.

4.4.2 Weighting residuals by variance

Following the literature (see e.g. Snellen et al. 2010; Birkby et al. 2017; Hoeijmakers et al. 2018a), we normalised our data after running SYSREM by its S/N. Each column was divided by its variance, and multiplied by the average variance per order. This served to down weight noisy wavelength channels (columns), such as any remaining stellar or telluric contamination, and the noisy fringes as the edges of each order.

thumbnail Fig. 8

Optimising the S/N of the ΔCCFs at Δv = 0 km−1 in the planet rest frame, for all orders and all the nights combined, optimised for cases 1 and 2 (top panel) and for cases 3 and 4 (bottom panel). More SYSREM iterations improves the model recovery when rotational broadening is not included, indicating that SYSREM is likely removing the wings of broadened signals and thus should be used with less iterations.

4.5 Cross-correlation

With the host star spectrum removed as optimally as possible, we can now proceed to extract the faint planet spectrum by cross-correlating with a model template. In each frame, the stellar light reflected by 51 Pegasi b is Doppler-shifted according to Eq. (3). Although we expect 51 Pegasi b to appear at Kp = 133 km s−1 and Δv = 0 in the planet rest frame (Brogi et al. 2013; Birkby et al. 2017), we explored a range of Δv ± 180 km s−1 in steps of the HARPS-N velocity resolution per pixel (vstep = 0.8 km s−1), and 40 ≤ Kp (km s−1) ≤ 200 in increments of 1 km s−1. This was to explore the noise properties of the surrounding velocity space and to search for velocity offsets caused by astrophysical effects such as bright spots. For our cross-correlation template, we used the same synthetic spectrum of 51 Pegasi, Fmod, norm, that we used in the model injection as described in Sect. 4.3.

Before each cross-correlation, the wavelength grid of the model was offset to the required Doppler-shift in wavelength space by Δv using λ = λ′(1 + (Δvc)), then interpolated onto the wavelength grid of the data. We then cross-correlated the residuals with Fmod, norm(λ) by calculating the Pearson correlation coefficient, ρ, as: ρ X,Y = C X,Y C X,X C Y,Y , \begin{equation*}\rho_{X,Y} = \frac{C_{X,Y}}{\sqrt{C_{X,X} \cdot C_{Y,Y}}},\end{equation*}(10)

where ρX,Y is the Pearson correlation coefficient between two matrices X and Y, and C is the covariance5. In our case, X was the residuals spectrum and Y was the cross-correlation template Fmod, norm(λ). This created a cross-correlation function (CCF) for each frame of each individual order, which we arranged as a CCF matrix for each order with one CCF per row and one vstep per column.

thumbnail Fig. 9

Comparison of the summed (all orders combined) CCFs for Night 1 with a model (mod) injected with parameters Rp = 1.9RJ and Ag = 0.5, when the cross-correlation template (c. c. temp) is broadened (solid lines) and is not broadened (dashed lines). Mismatched broadening (green and brown lines) results in lower S/N in the CCF peak compared to matched broadening (purple and grey lines).

4.5.1 Matching the cross-correlation template to the planet’s rotational broadening

Cross-correlation is insensitive to absolute values hence no scaling was applied to the continuum normalised synthetic spectrum when making the cross-correlation templates. However, it is sensitive to the shape of the line, and thus to the rotational broadening. As we were searching for a planet signal that was considerably rotationally broadened, we needed to establish if our cross-correlation template also needed to be broadened to achieve maximum S/N. In precision radial velocity surveys, a non-broadened template (or even a binary mask) is used as this results in a CCF with a very precisely known peak position, but as we are interested in the S/N of the peak of the CCF, we may wish to optimise CCF peak strength instead. To test this, using Night 1 of our data, we calculated the CCFs for the four different scenarios arising when injecting a model with and without broadening, and cross-correlating using a template with or without broadening. The results are shown in Fig. 9, and indicate that like-for-like scenarios result in the best cross correlation, namely when the cross-correlation template matches the broadening of the injected model. Therefore, we match our cross-correlation templates to the expected broadening of the planet spectrum in question. For 51 Pegasi b, we calculated this in Sect. 2.3 to be vrefl = 12.0 km s−1.

4.5.2 Weighting the cross-correlation functions

Once we had a CCF matrix for each order per night, we proceeded to weight and sum the orders from each night to maximise the strength of the planet signal. Each order spans a unique number of reflected stellar spectral lines, and the HARPS-N detector is not uniformly sensitive in wavelength. Consequently, certain orders contributed more to the planet signal than others. To account for this, we weighted every order of each night between 0 and 1. To determine the weights, we follow a similar approach to Hoeijmakers et al. (2018a) and calculated a Δ CCF for each order on each night. ΔCCF = ΔCCFmod, inj − ΔCCFobs is the difference between the CCF obtained after injecting a model into every spectrum6, and the CCF obtained from the observed spectra. The S/N of the peak of the Δ CCF demonstrates the effectiveness of each order on each night in recovering the injected planet spectrum (see Fig. 10). The weight (w) for each i-th order was then calculated as: w i = S/N i S/N min S/N max S/N min , \begin{equation*}w_i = \frac{\textrm{S/N}_{i} - \textrm{S/N}_{\textrm{min}}}{\textrm{S/N}_{\textrm{max}} - \textrm{S/N}_{\textrm{min}}},\end{equation*}(11)

where S/Nmin is the minimum peak SN obtained in an order on that night, and similarly for S/Nmax. Each ith order was multiplied by its weight wi. There was aclose similarity between the weights across all eight nights (see Fig. 11). The weights are primarily driven by the S/N of each order, but countered at the bluer ends by the number of stellar lines per order.

Once weighted, the orders were summed to create a single CCF matrix per night. The CCFs in these nightly matrices were then collected and ordered by phase to create the final all-nights CCF matrix shown in Fig. 12. The phase increments are not evenly spaced due to gaps in the phase coverage (see Fig. 2).

thumbnail Fig. 10

ΔCCFs from different orders from Night 1 (see Sect. 4.5.2) demonstrating the sensitivity of each order in recovering an injected model.

4.5.3 Stellar residual features in the CCFs

For cases 3 and 4, when the cross-correlation template is not rotationally broadened (vrefl, c.c. temp. = 0.0 km s−1), there is a noticeable vertical feature at Δv = 0 km s−1 in the stellar rest frame (for example, see the top right panels of Fig. 12).

Because these CCFs were created without rotationally broadening the cross-correlation template, the template will match well with the narrower lines of any residual host star spectrum. Thus, this vertical feature is most likely a stellar residual feature. This stellar residual feature will add noise to the Doppler-shifting trail of the planet, which overlaps with it when RVp ~ RV at ϕ ~ 0.5. This could potentially provide a false boost to a recovered planet signal. However, for cases 1 and 2, where the cross-correlation template is rotationally broadened (vrefl, c.c. temp. = 12.0 km s−1), the vertical feature is substantially reduced. We postulate that this robustness against the stellar residuals largely stems from the poorer match between the broadened lines of the planet’s cross-correlation template and the narrower lines of the host star spectrum (see Fig. 9).

We note that, in the special case of τ Boö b and other fully synchronised systems where the planet spectrum is not broadened by v⋆,p (see Eq. (5)), it would be pertinent to mask these stellar residual features in the CCFs (as was done by Hoeijmakers et al. 2018a). However, in the case of 51 Pegasi b and most other hot Jupiters where only the planet is tidally-locked, we do not need to mask the CCFs.

thumbnail Fig. 11

Top two panels: CCF weights for each night of data, for cases 1 and 2 (top panel) and for cases 3 and 4 (second panel). Third panel: S/N per order, for each night of data. The S/N increases towards the red showing the sensitivity of both HARPS detectors. Bottom panel: number of stellar lines per order, which is the same for each night of data.

5 Results

5.1 Kp– Δ v S/N analogue maps

To maximise the recovery of the planet signal, we interpolated the CCF matrices into the planet’s rest frame, which aligns any planet signal as a vertical trail at Δv = 0 km s−1 (the verticaltrail due to the model injection is visible in the bottom rightmost panel of Fig. 12). We then summed the CCF matrices into a single 1D CCF. However, as discussed in Sect. 4.5, we assume that we do not know the Kp a priori that would align the planet signal, and we also want to explore any offset in the planet signal from Δv = 0 km s−1. Thus, prior to summing, we align the CCF matrices for a range of Kp values, 40 < Kp < 200 km s−1, and then create a summed 1D CCF for each Kp. As a final step, we calculate the S/N of each data point in each summed 1D CCF. To do this, we first calculated the standard deviation in each summed 1D CCF, (excluding data Δv = ±40 km s−1 to avoid contamination from any planet signal, then divided the entire summed 1D CCF by this value to get the S/N (see Fig. 13). This metric is sometimes referred to in literature as the cross-correlation significance (σ), the S/N, or the S/N ratio (sometimes abbreviated as SNR) (see e.g. Snellen et al. 2010; Birkby et al. 2013; Yan et al. 2020; Kesseli et al. 2020, where occasionally the standard deviation is calculated instead using all summed 1D CCFs). We refer to our approach as the S/N analogue, or S/N for brevity. The Kp – Δv map format serves to highlight the tight contours that form when Kp and Δv approach the values that where a planet signal is present.

5.2 Non-detection of 51 Pegasi b

As described at the beginning of Sect. 4.3, we analysed our observed data for cases 1 and 3, where no model was injected. This allowed us to determine whether we had detected reflected light from 51 Pegasi b. Figure 13 shows the distribution of S/N values in the observed summed 1D CCFs, which gives deviations of SN ~ ±3. Similar structure is seen by Hoeijmakers et al. (2018a); Cabot et al. (2019) and elsewhere in the literature. Thus, we adopt SN = 3 as our threshold for detection.

We ordered the 1D summed CCFs by Kp to create a 2D Kp –Δv map in terms of S/N, as described in Sect. 5.1. For case 1 (with rotational broadening), at 51 Pegasi b’s expected Kp = 133 km s−1 and Δv = 0 km s−1 the SN = −0.16 in the all-nights matrix. For case 3 (without rotational broadening), the SN = −0.30 at Kp = 133 km s−1 and Δv = 0 km s−1. There is also no value SN > 3 anywhere in the region of Kp = 133 km s−1 and Δv = 0 km s−1, for the case with or without the broadening of the cross-correlation template. Consequently, we do not claim a detection of 51 Pegasi b in reflected light. The Kp –Δv maps for the results from all the nights combined are displayed in Fig. 14. The Kp –Δv maps for the individual nights for the broadened case can be seen in Fig. B.2.

5.3 Impact of rotational broadening

In order both to demonstrate the impact of rotational broadening on our ability to recover the reflected light spectrum of the planet, and to determine the maximum sensitivity of our data, we injected two different models into our dataset (case 2 and case 4, see Sect. 4.3). Both models were injected with parameters Ag = 0.5, Rp = 1.9RJ at Kp = 133 km s−1 and Δv = 0 km s−1 to simulate the scenario proposed by M15. The Kp –Δv maps for the results from all the nights combined are displayed in Fig. 15.

For case 2, the physically realistic case with rotational broadening, the model is recovered with the SN = 5.50 at the injected Kp = 133 km s−1 and Δv = 0 km s−1 location, while for case 4, without rotational broadening, the model is recovered with very high SN = 18.82. The negative values of the noise distribution reach approximately –4 and –8 for the two cases, respectively. This is mainly due to the negative wings or “shadows”, either side of the detections. These shadows appear due to the column-wise nature of the removal of the stellar and telluric contamination, which can subtract the wings of the planet spectral lines as they cross each column, resulting in anti-correlation with the template at this final stage.

The discrepancy between our recovery of the models with and without broadening make clear that rotational broadening has a substantial impact on the ability of HRCCS to recover the planet’s reflected light spectrum. Our recovery of the non-broadened model at high S/N is consistent with S21, as discussed further in Sect. 6.1.

thumbnail Fig. 12

All-nights CCF matrices for cases 1, 2, 3 and 4, with the CCFs from all eight nights ordered by phase. The top panels are in the rest frame of the star and the bottom panels are in the rest frame of the planet. The cases where the cross-correlation template is rotationally broadened (cases 1 and 2, vrefl, c.c. temp = 12.0 km s−1, left two columns) and is not broadened (cases 3 and 4, vrefl, c.c. temp = 0.0 km s−1, right two columns) are shown. Cases 2 and 4, where a model (mod) following that proposed by M15 (Rp = 1.9, Ag = 0.5) has been injected are indicated by ‘With model’. For case 2, the injected model returns a very weak trail, as the CCF peak is spread out over more Δv increments. For case 4, the injected model is clearly visible to the eye as a blue-shifting trail in the stellar rest frame and as a vertical stationary trail in the planet rest frame. The phase increments are not evenly spaced due to uneven phase coverage (see Fig. 2). The faint vertical trails at Δv = 0.0 km s−1 in the top panels is due to residual stellar lines. The stellar residuals are much more pronounced for cases 3 and 4, because the stellar residuals match with the non-broadened cross-correlation template.

thumbnail Fig. 13

Summed 1D CCFs of the observed data in terms of S/N, where each lines corresponds to a different Kp in 40 ≤ Kp ≤ 200 km s−1, for cases 1 (left) and 3 (right), with cross-correlation templates with and without broadening (vrefl, c.c. temp = 12.0 km s−1 and 0.0 km s−1 respectively). The dashed light blue line and the dotted purple line mark the SN = ±3 and SN = ±4 thresholds. The cross-correlation template without rotational broadening results in stronger noise spikes due to matching with narrower lines in the residual host star spectrum.

thumbnail Fig. 14

Kp – Δv maps showing the S/N of the CCFs for a range of Kp, for all the nights combined. The red dashed lines mark the expected location of any signal from 51 Pegasi b. The plots showscase 1 (left) and case 3 (right), with and without a rotationally broadened cross-correlation template (vrefl, c.c. temp = 12.0 km s−1 and 0.0 km s−1 respectively). No planet signal is recovered, with SN = −0.16 and SN = −0.30 at Kp = 133 km s−1 and Δv= 0 km s−1 for the left and right plots respectively. The colour bars are calibrated to peak at the highest value in each Kp – Δv map, and descend in increments of 0.5.

thumbnail Fig. 15

As for Fig. 14, Kp – Δv maps showing the S/N of the CCFs for all the nights combined, but with a model injected with parameters Ag = 0.5, Rp = 1.9RJ (M15). The plots show case 2 (left) and case 4 (right), with and without rotational broadening (vrefl, mod = 12.0 km s−1 and 0.0 km s−1 respectively). The cross-correlation templates match the broadening of the model in each case. The broadened model is weakly recovered with SN = 5.50 at Kp = 133 km s−1 and Δv= 0 km s−1. Without broadening the model is recovered very strongly with SN = 18.82 at Kp = 133 km s−1 and Δv= 0 km s−1. The colour bars are calibrated to peak at the highest value in each Kp–Δv map, and descend in increments of 1.0.

thumbnail Fig. 16

S/N recovered by our pipeline for a suite of injected reflected light models at Kp = 133 km s−1 and Δv= 0 km s−1, as a functionof geometric albedo (Ag) and planet radius (Rp), for models with (left) and without (right) rotational broadening. The white and grey contours are above the threshold for detection, while blue regions cover statistically viable scenarios for 51 Pegasi b, given the upper limits on the FpF contrast that we derive in Sect. 5.4. The black dashed contours mark out the FpF contrast values. The red dashed lines mark the Ag = 0.5 and Rp= 1.9RJ parameters proposed by M15 for 51 Pegasi b. The blue to grey colour scale is not evenly spaced, but increases in increments of 2 up to SN ≤ 10, in increments of 4 when 10 < SN ≤ 40 and in increments of 8 when SN > 40. The colour bars begin at −1 because the S/N value where no model is injected is less than zero for the cases with and without broadening. The colour scale is the same for both plots.

5.4 Upper limit on the radius and albedo of 51 Pegasi b

As there was no statistically significant recovery of light reflected from 51 Pegasi b in the observed spectra, we proceeded to calculate upper limits on its radius and albedo. We considered both the physical scenario with rotational broadening (cases 1 and 2), and the scenario without rotational broadening (cases 3 and 4), with the latter being a test of the maximum sensitivity of our data and pipeline. We did this by injecting a series of models for 0.0 ≤ Ag ≤ 1.0 (increment of 0.1 between each model injection) and 0.0 ≤ Rp ≤ 3.0RJ (increment of 0.3 between each model injection) values using Eq. (9). As for cases 2 and 4 the model was injected at 51 Pegasi b’s expectedKp = 133 km s−1 and Δv = 0.0 km s−1, and we calculated the S/N of the recovery. The full results of our model injections are shown in Fig. 16 with (left panel) and without (right panel) rotational broadening, where the S/N value is taken at Kp = 133 km s−1 and Δv = 0 km s−1.

As shown in Sect. 5.2, our threshold for detection is SN > 3. We adopt this SN = 3 upper limit cut7 as the threshold for when our pipeline can sufficiently recover a model and thus reject its parameters as a possible physical scenario for 51 Pegasi b. Any model recovered with SN < 3 was kept as a statistically plausible model for 51 Pegasi b.

For the physical scenario with rotational broadening (case 1), we find a SN = 3 upper limit on the contrast ratio of F p F <7.60× 10 5 $\frac{F_{\textrm{p}}}{F_{\star}}<7.60\times10^{-5}$, resulting in relatively loose constraints Ag and Rp. For example, a typical hot Jupiter with the same mass of 51 Pegasi b8 has Rp ~ 1.2RJ, which corresponds to Ag < 0.62.

For the scenario without rotational broadening (case 3), which represents the limit of the sensitivity of our data and pipeline for a planet orbiting 51 Pegasi b in a fully synchronised orbit at a = 0.052 au, we find a SN = 3 upper limit of F p F <2.40× 10 5 $\frac{F_{\textrm{p}}}{F_{\star}}<2.40\times10^{-5}$, corresponding to a much more tightly constrained Ag < 0.20 for Rp ~ 1.2 RJ. This is comparable to the upper limit on the contrast ratio found by Hoeijmakers et al. (2018a) with archival data for the fully synchronised τ Boö b hot Jupiter ( F p F <1.5× 10 5 $\frac{F_{\textrm{p}}}{F_{\star}}<1.5\times10^{-5}$), and the upper limit of S21 for 51 Pegasi b ( F p F ~ 10 5 $\frac{F_{\textrm{p}}}{F_{\star}}\sim10^{-5}$). If 51 Pegasi b had similar albedo to HD 189733b (i.e. Ag = 0.40; Evans et al. 2013), our results would give constraints of Rp < 1.50 RJ with rotational broadening, and Rp < 0.84 RJ without rotational broadening.

It is evident that our analysis very strongly rules out the radius and albedo parameters suggested by M15 for 51 Pegasi b (Ag = 0.5, Rp = 1.9 RJ). For the case they consider (our case 4), we recover the corresponding injected model at SN = 18.82, so it is clearly recovered. Even adding rotational broadening (our case 2), we still recover the model at SN = 5.50, while the observed data alone reveal no signals greater than SN = 3.

5.5 Confirming the source of the broadening

M15 showedwith model injections that their processing techniques acted to broaden the CCF of signals in the data. To confirm that broadening in our analysis is purely astrophysical in origin and not introduced erroneously by any of the steps in our pipeline, we compared the full width at half maximum (FWHM) of the autocorrelation function of our normalised synthetic stellar spectrum (Fmod, norm), with the CCFs obtained after injecting a model with and without rotational broadening (cases 2 and 4). The autocorrelation function for the broadened Fmod, norm (vrefl, mod = 12.0 km s−1) had FWHM = 34.4 km s−1, and without broadening Fmod, norm (vrefl, mod = 0.0 km s−1) had FWHM = 10.4 km s−1. These relatively large FWHMs are in part due to broadening of spectral features in the star due to, for example, pressure or natural broadening, and saturation of certain lines in the optical range of 51 Pegasi’s spectrum (e.g. the Ca II H and K, Mg I, and Hα lines). We compared these FWHM of the autocorrelation functions to the FWHM of the CCFs obtained from the observed spectra, when injecting a model with and without broadening and using matching cross-correlation templates. The results are shown in Fig. 17. For both cases 2 and 4, the FWHM of the CCF is narrower than the model autocorrelation function. We propose that this is because some of the wings of the broadest and shallowest lines of the injected model are removed by the post-processing and stellar line removal. For neither case 2 or 4 does our pipeline broaden the recovered model. Therefore, we can be confident that any broadening is due to astrophysical broadening of the planet signal, not due to our pipeline.

We note that our pipeline was able to recover the real signal of HD 189733b at the same FWHM as found by Birkby et al. (2013) and Cabot et al. (2019) (see Appendix A).

thumbnail Fig. 17

Comparison of the autocorrelation function of the model, with (solid) and without (dashed) broadening, to the CCFs obtained after injecting models into the observed data (purple and grey lines), to assess if post-processing affects the FWHM of the signal. All are normalised with respect to the black dashed line for visual purposes. In the autocorrelation functions, one spectrum has the residuals mask from Sect. 4.4 applied beforehand to better match the steps in the analysis that exclude the more saturated stellar features. The FWHM values of the autocorrelation functions are 34.4 km s−1 and 10.4 km s−1, respectively with (blue solid line) and without (black dashed line) broadening. For the injected models, the FWHM of the CCFs are 29.6 km s−1 (purple solidline), and 7.2 km s−1 (grey dashed line), respectively with and without broadening. The grey and purple lines correspond to their un-normalised counterparts in Fig. 9.

6 Discussion

6.1 Rotational broadening and comparison with previous studies

Our results indicate that very low albedo is entirely plausible for 51 Pegasi b, in line with the large body of literature that predicts that hot Jupiters will have extremely low optical albedo, because their high temperatures mean that they lack a reflective cloud deck (see e.g. Rowe et al. 2008; Cowan & Agol 2011; Heng & Demory 2013). However, we cannot confirm that 51 Pegasi b has a low albedo, because rotational broadening due to v⋆,p considerably reduces the sensitivity of our data (see Fig. 16).

As discussed in Sect. 2.1, four previous works sought reflected light from 51 Pegasi b, with conflicting conclusion on the radius and albedo of the planet (M15; Borra & Deschatelets 2018; Di Marcantonio et al. 2019; S21). While each of these studies acknowledge that the host star rotation period is very long in comparison to orbital period (21.9 days vs. 4.23 days respectively), each neglect the effect of the apparent rotational broadening due to the orbital motion of the planet, v⋆,p (see Eq. (5) and Strachan & Anglada-Escudé 2020), and instead assume that the star would not appear broadened from the planet’s perspective. In contrast, we have shown here that v⋆,p has a significant impact on the rotational broadening of the planet’s reflected light spectrum (vrefl = 12.0 km s−1) and the strength with which it can be detected. As a result, Eq. (1) for the S/N of the detection does not hold true for broadened spectra, requiring modification to account for the reducing effect of broadening on the depth of the spectral lines and hence the number of detectable lines in the spectrum. Only in the special case of a fully synchronised system (i.e. Prot,⋆ = Porb, like for τ Boö b), can this effect be ignored. An immediate consequence is that the ratio of stellar and planet CCF amplitudes cannot be used directly to calculate contrast and thus Rp and Ag as they have been muted by the broadening. Instead, the CCF would need to be deconvolved to remove the muting effect of the rotational broadening, implying that the parameters proposed by M15 are underestimated, and thus the albedo and radius for 51 Pegasi b would be even brighter or larger, despite their already high values when compared to the radius of a typical hot Jupiter with the same mass and temperature as 51 Pegasi b.

In the case of a detection, an alternative approach to determining Rp and Ag would be to inject negative versions of the appropriately broadened model into the data until the signals cancelled out (as similarly done with thermal emission detections in the infrared, see e.g. Brogi et al. 2012; Birkby et al. 2017).

We can attempt a comparison of our work with these previous studies if we temporarily ignore the broadening and muting effects of v⋆,p, in order to assess the efficacy of the different processing techniques. For this, we consider the results shown in our figures with vrefl = 0.0 km s−1. This is closeto, but not exactly the scenarios presented in the other four studies, as our cross-correlation templates are generated in different ways. As discussed in Sect. 6.1, we find that if we inject a model with vrefl, mod = 0.0 km s−1 with Ag and Rp as proposed by M15 then it is recovered with a SN = 18.82 (see right plot, Fig. 15). Our CCF template has matched broadening unlike M15, and thus recovers slightly improved CCF peak values (see Fig. 9). However, even if we inject a model with vrefl, mod = 12.0 km s−1, a weak signal is recovered with SN = 5.50 (see left plot, Fig. 15). Consequently, we can rule out the high Ag and Rp proposed by M15 for 51 Pegasi b. This corroborates the results of S21, who rule out contrast ratios ≳ 10−5 for cases without broadening, including the 1.2 × 10−4 contrast ratio suggested by M15. Given that the upper limits reached by S21 and our work are derived from a dataset that contains over five times the amount of spectra (of comparable S/N) used in M15, it is unsurprising they yield a superior sensitivity.

It is interesting then to compare our results solely to S21 given that our datasets are the same, but our model templates and cleaning processes are different, particularly the process of removing the stellar line contamination. Our work operates directly on the spectra, while S21 remove stellar contamination from the CCFs instead. Our closest comparable case to S21 is our case 3 (with no rotational broadening and thus the most sensitive) where we obtained an upper limit on the contrast ratio of F p F <2.40× 10 5 $\frac{F_{\textrm{p}}}{F_{\star}}<2.40\times10^{-5}$ which is comparable to their F p F 10 5 $\frac{F_{\textrm{p}}}{F_{\star}}\lesssim10^{-5}$. Our recovery of the M15 proposed model yielded SN ~ 19, while S21 recover SN ~ 22 for a similar contrast ratio. These are both very high S/N detections, but we do not assign significance here to their difference. A strict comparison is not possible as the exact details of the injection and recovery scheme are not fully available in S21, hence there may be differences in, for example, the exact S/N metric used, and the noise distribution of the S/N metric. However, given we injected our similar models at the same location within Kp ± 1 km s−1 and i ± 1°, and we both incorporated the phase function, we conclude broadly that both approaches to removing stellar lines are effective. We do, however, note some pertinent differences between our spectra-focused approach and their CCF approach to removing stellar lines. S21 do not use 88 frames near superior conjunction to avoid stellar contamination in the CCF, whereas we only needed to remove four frames for this, allowing more data to be used. We were also able to use more spectral orders than the CCF approach which has to exclude those with broadened features in the host star spectrum. This ability to use extra data may be useful for less abundant datasets, enabling greater phase coverage. We cannot clearly state if our use of synthetic templates is superior to observed templates. While the observed templates in the CCF approach of S21 allows for excellent alignment of the stellar lines, we note that synthetic templates have the advantage of being i) completely free of telluric lines such that they outperform observed templates in this respect, even when great care is taken to correct the tellurics, as done by S21 with MOLECFIT (Smette et al. 2015; Kausch et al. 2015), ii) are more straightforward to appropriately broaden without also broadening the planet’s albedo function, and iii) are easily combined with models of planet’s high resolution albedo function. The latter is particularly important if, for example, abundances are to be derived from high resolution exoplanet reflection spectra. Nonetheless, there remains scope for improvement in both approaches to removing stellar contamination.

Our post-processing and cross-correlation approach differed greatly from that presented in S21, and is most similar to Hoeijmakers et al. (2018a) who studied τ Boö b in reflected light. Our model injections without rotational broadening are most directly comparable to this work, as τ Boö is a fully synchronised system. Both our studies operate on the spectra directly to remove the host star, including PCA-like systematics removal, and we both used a model from a stellar library to perform the cross-correlation. Hoeijmakers et al. (2018a), however, used a very large repository of archival spectra from multiple spectrographs, obtained relatively sporadically over 15 years, containing a mix of exposure times and phase coverage. While HARPS-N contributes some of their dataset, other spectrographs with lower radial velocity precision and stability are also included. They used a total of 2160 frames with an average S/N of 500–1000 per pixel to obtain an upper limit on the contrast of F p F <1.5× 10 5 $\frac{F_{\textrm{p}}}{F_{\star}} < 1.5\times10^{-5}$. In contrast, the smaller collection of 484 frames used in this work, which were comprised of eight more consistent observing sequences, led to an upper limit on the contrast ratio in case 1 (without rotational broadening) of F p F <2.40× 10 5 $\frac{F_{\textrm{p}}}{F_{\star}} < 2.40\times10^{-5}$. This is comparable to the result presented in Hoeijmakers et al. (2018a), despite our much smaller dataset. Unlike for the analysis presented in this paper, Hoeijmakers et al. (2018a) additionally applied a mask to their CCFs to remove remaining stellar residuals. We also see these residuals in Fig. 12 (especially in the top right panels), but do not mask them, to avoid interfering with the noise distribution of the Kp –Δv maps. A one-to-one comparison is not fully possible due to the different orbital parameters of the system (i.e. Kp is different), thus the full impact of stellar residuals is not the same between our studies. However, we note that due to the different instruments used in Hoeijmakers et al. (2018a), their alignment to the stellar rest frame is based on fitting the positions of common strong stellar features across the dataset, unlike the work presented here where we have relied on the precision radial velocity andstability of HARPS-N to precisely determine the true rest frame of the star and align the data accordingly. It is possible that this instrumental precision and stability has therefore enabled a better alignment and therefore subtraction of the host star spectrum and hence a comparable upper limit, despite the much smaller dataset in our study.

6.2 Stellar residuals and stellar variability

The removal of stellar lines, however, is not perfect in either study. One potential reason for this is that neither study accounts for stellar variability. Within the timescale of a single observing night, stellar surface processes including surface granulation, coronal flares and mass ejections could introduce variability (see e.g. Cegla et al. 2013; Meunier et al. 2017). On a longer timescale, stellar spots and faculae could introduce variability between observing nights (see e.g. Meunier et al. 2010; Aigrain et al. 2012; Dumusque et al. 2014; Haywood et al. 2014, 2016; Milbourne et al. 2019). Stellar variability is recognised as a serious challenge to the success of extreme precision radial velocity (EPRV) studies (e.g. Fischer et al. 2016; Crass et al. 2021), and has been more recently recognised as a challenge to transmission and thermal day side HRCCS (van Sluijs et al. 2019; Guilluy et al. 2020; Nugroho et al. 2020b). For reflection HRCCS, small changes in line shape or position due to stellar variability may present as small flux variations at < 10−4 levels, similar to that expected for reflected light, which are revealed when the time-averaged spectrum is divided out from the aligned spectra, and when nights are combined. Its impact appears greatest for observations taken close to times of superior or inferior conjunction. Future studies could aim to mitigate the effect with modelling or observational strategies.

6.3 Strengths and limits of the S/N analogue

The S/N analogue used in this work and throughout the literature provides a straightforward and intuitive approach to understanding the extent to which a recovered signal, if any, stands above the noise, and allows us to set a threshold for detection or upper limit. However, the method is based upon the supposition that, for a cross-correlation array that has been co-added in time, a recovered signal will be concentrated at one Δv value. This situation is often the case for thermal emission signals in the infrared regime from tidally-locked systems, where any exoplanet or model signal is typically recovered almost entirely at Δv = 0 km s−1 (see e.g. Birkby et al. 2017). In this case, the resulting S/N at Δv = 0 km s−1 will be representative of the strength of the exoplanet or model signal. However, in the case where a signal is broadened, the S/N analogue rapidly becomes unrepresentative of the total signal because it is spread across a range of Δv values (see Fig. 9). This impacts the CCFs from spectra of fast rotating exoplanets, for example, β Pic b (Snellen et al. 2014), as well as reflected light spectra due to v⋆,p. We also note that in the case of 51 Pegasi b, its host star spectrum contains near-saturated spectral features that inherently broaden the CCFs too (see Fig. 17). This renders the S/N analogue a less suitable metric for determining the significance of a signal when using HRCCS for reflected light and other broadened signals.

An additional issue is that the S/N analogue in principle requires one to know where the planet signal will be so that it can be excluded in the calculation of the standard deviation. This makes it less suitable for a blind search for the planet signal, which is pertinent for non-transiting planets where Kp is not known a priori. On a similar note, the standard deviation depends on the choice of Δv range to explore. Finally, in some cases the standard deviation is calculated per Kp value (as was done for this work), but sometimes it is calculated only once across the entire Kp – Δv map of values (see e.g. Nugroho et al. 2021). This inconsistency in the literature makes comparison of results difficult. Alternative metrics have been used, for example, Welch’s T-test (see e.g. Birkby et al. 2017) or log-likelihood (see e.g. Brogi & Line 2019), but these come with other complicating issues. In short, the S/N analogue approach has the benefit of simplicity, but it is not ideally suited to reflected light detections, especially where there is significant broadening. Thus, we acknowledge the need to evaluate the precise meaning of S/N, σ, and other significance metrics in relation to HRCCS, including those that can be reasonably adapted in the case of significant rotational broadening, which will be investigated in future work.

6.4 Considerations on the optimisation of SYSREM

As stated in Sect. 4.4.1, there is precedent for optimising the SYSREM algorithm per spectral order or per night because each can suffer from different conditions and systematics (see e.g. Birkby et al. 2013, 2017; Nugroho et al. 2017; Cabot et al. 2019; Gibson et al. 2020; Kesseli et al. 2020). However, the methods used for determining this optimisation vary, making the outcomes challenging to compare. Typically, a model is injected and recovered per order, with the iteration that gave the highest S/N recovery of the model being used for that order when assessing the real data (see e.g. de Kok et al. 2013; Birkby et al. 2013, 2017; Nugroho et al. 2017). These previous works used a small number of spectral orders compared to HARPS-N, which has 69 orders. Using this approach, we found that the large number of orders acted to give a false positive due to optimised constructive addition of noise at the planet’s expected velocity. Furthermore, Cabot et al. (2019) stressed the need for caution when optimising the number of SYSREM iterations, and found that the significance of detections could be over-estimated if the algorithm was over-fitted. Therefore, in this work we followed other papers (e.g. Gibson et al. 2020; Merritt et al. 2020; Nugroho et al. 2021), opting to use a uniform numbers of SYSREM iterations across all orders and nights, as detailed in Sect. 4.4, to reduce this effect. A detailed assessment of how to optimally apply SYSREM, including with consideration of rotational broadening, is beyond the scope of this paper, but will be further explored in future work to reach the deepest contrasts.

6.5 Rotational broadening in different systems

The rotational broadening of the reflected light from 51 Pegasi b made a significant impact on the ability of HRCCS to extract its spectrum. We visualise the equations used to calculated vrefl for gas giant planets (see Eqs. (4), (5), and (6)) in Fig. 18 to highlight the effect of different parameters. It is evident that v⋆,p contributes considerably to the value of vrefl. The magnitude of v⋆,p is determined by the rotation of the star on its own axis (Prot,⋆) and the orbit of the planet (Porb,p), and increases as these periods differ. Due to vproj,p, which is determined by the ratio between the planet’s Rp and Prot,p, vrefl never equals zero, even when the star and planet system is fully synchronised (see Eq. (4)). vrefl is greatest for large, rapidly rotating planets where there is a large difference between Prot,⋆ and Porb. However, vrefl can still be significant for combinations in between, and we can use these equations to predict its impact on the rotational broadening on different star–planet systems. This is of particular interest to Earth-like exoplanets that orbit M-dwarfs, as these are considered to be amongst the most likely targets for the successful detection of biosignatures in reflected light (e.g. Snellen et al. 2015; Lovis et al. 2017). As examples, we consider Proxima b, and two different hypothetical scenarios for an Earth-like exoplanet in orbit in the habitable zone of two different M-dwarfs. We use the scaling relation a(HZ, au)= ( T /5770 ) 2 ( R / R ) $a({\textrm{HZ},~\textrm{au}}) = \left(T_{\star}/5770\right){}^2 \left(R_{\star}/R_{\odot}\right)$ from Traub & Cutri (2008) to approximate a semi-major axis of a planet in the habitable zone, and a scaled version of Kepler’s 3rd law to find the orbital period of the planet in the habitable zone, P(HZ,days)=365.25 a 3/2 M 1/2 $P(\textrm{HZ, days}) = 365.25\, a^{3/2} M_{\star}^{-1/2}$. For all scenarios, we assume tidallocking, so that Prot,p = Porb. The results are presented in Table 2.

Consequently, for Proxima b, rotational broadening of its reflected light spectrum is unlikely to have significant impact on HRCCS in extracting is spectrum as the broadening is below the resolution limit of most spectrographs. However, for other scenarios (e.g. scenarios B and C in Table 2) the broadening does exceed the typical velocity per resolution element of Δv = 3 km s−1 for R = 100 000 spectrographs. The high vrefl in these scenarios is due to the rapid stellar rotation, which is typical to some populations of M-dwarfs (e.g. McQuillan et al. 2013; Rappaport et al. 2014. Thus, the rotational broadening must be carefully considered when inferring the retrieved properties of the planet, and the S/N required to detect the planet’s reflected light spectrum.

Table 2

Rotational broadening, vrefl, of different systems, calculated using Eqs. (4), (5), and (6).

6.6 Considering the impact of rotational broadening on the optimal choice of instrument

When considering observational plans for HRCCS, one could broadly approximate the impact of rotational broadening to a reduction in resolving power. In the case of 51 Pegasi b, this would be R ~ c∕Δv ~ 3 × 105∕12 = 25 000. This raises the question of whether it is necessary or beneficial to use very high resolution spectrometers for HRCCS for systems where we expect a significant amount of rotational broadening. There are advantages to being able to use lower resolution spectrometers: higher S/N due to a greater number of photons per wavelength channel; a greater range of available instruments; and the possibility of using space telescopes. However, using a lower resolution spectrometer for reflection HRCCS would lose one of the scant advantages that the rotational broadening of the reflected light offers, namely that that the planet spectrum is subject to additional broadening (see Eq. (6)), which helps distinguish it from the stellar spectrum. This advantage would be lost at lower resolution. Nonetheless, it is worth bearing in mind the possibility of employing a wider range of instruments when searching for reflected light from a system with considerable rotational broadening.

thumbnail Fig. 18

vrefl values, in kms−1 calculated using the equations defined in Rodler et al. (2010, see Eqs. (5), (4) and (6)), for a range of exoplanet and stellar parameters. Each row of plots shows vrefl values for different values of stellar radii (R) and exoplanet radii (Rp). The y-axes correspond to the period of the exoplanet’s rotation on its own axis (Prot,p). The x-axes correspond to the period of the exoplanet’s orbit around its host star and extends down to Porb = 0.1 days). The dashed black diagonal lines show where Prot,p = Porb, i.e. the exoplanet is always showing the same face to its host star (as is the case for 51 Pegasi b). Each individual plot corresponds to a different period of the star’s rotation on its own axis (Prot,⋆). The dotted black vertical lines show where Prot,⋆ = Porb, i.e. the star is always showing the same face to the exoplanet. The point where the dashed and dotted black lines cross mark where Prot,p = Prot,⋆ = Porb, i.e. the star-exoplanet system is in a full tidal lock.

7 Conclusions

We have presented a search for the reflected light spectrum of 51 Pegasi b using optical high resolution cross-correlation spectroscopy obtained during a dedicated observing programme with the stable, precision radial velocity spectrograph HARPS-N at the TNG, in combination with archival data collected from HARPS-N and HARPS. In contrast to previous studies of 51 Pegasi b, we removed the contaminating host star spectrum directly from the spectra rather than in cross-correlation space. We found that the precise stellar RVs from HARPS-N allowed better alignment of the spectra to the true stellar rest frame which subsequently allowed us to remove the stellar lines more effectively using SYSREM. However, stellar residuals remained, possibly due to stellar variability during the course of the observationsthat need to be mitigated in future studies. We also highlighted a lack of consistency in the application of SYSREM in the literature. There remains space for improvement in both the removal of stellar lines and telluric lines in our approach and others. Importantly, we found that apparent rotational broadening, due to the difference between thelong host star rotation period and the short orbital period of 51 Pegasi b, is significant and strongly impacts the sensitivity of HRCCS in extracting reflected light spectra. We did not make a significant detection of reflected light from 51 Pegasi b, and for the physical scenario with rotational broadening (case 1), we found a SN = 3 upper limit on the contrast ratio F p F <7.60× 10 5 $\frac{F_{\textrm{p}}}{F_{\star}}<7.60\times10^{-5}$, which corresponds to a weakly constrained Ag < 0.62 for a typical hotJupiter with Rp ~ 1.2 RJ. Without broadening (case 3), the sensitivity of our data and pipeline results in a SN = 3 upper limit of F p F <2.40× 10 5 $\frac{F_{\textrm{p}}}{F_{\star}}<2.40\times10^{-5}$, which would corresponds to a much more tightly constrained Ag < 0.20 for Rp ~ 1.2 RJ if the 51 Pegasi system was fully synchronised (Prot,⋆ = Porb). In both cases, we were able to reject the proposed Ag and Rp in M15, and we stress that our demonstration of the impact of vrefl shows it must be accounted for when inferring Rp and Ag from contrast ratios measured with HRCCS, else they will be underestimated. For rocky planets, we found that Proxima b-like systems experience broadening below the velocity resolution of most spectrographs, but other possible M-dwarf habitable zones planets can experience detectable broadening and should be carefully considered when aiming to detect the spectra of these planets. Finally, we found that the S/N analogue used here and in the literature is not an ideal metric for assessing the significance of rotationally broadened signal recovered with HRCCS, because S/N only considers the height of the CCF peak and not its breadth. Other metrics, such as the log-Likelihood (e.g. Brogi & Line 2019) which account for the shape of the signal, may be more successful and warrant further separate study.

Acknowledgements

The authors would like to thank P. Uttley for valuable discussions on statistical methods, and I. Snellen and G. Anglada-Escudé for helpful insights on cleaning the host star spectrum from the data. We further thank J. Hoeijmakers, M. Brogi, F. Rodler, L. Buchhave, J. M. Désert, and D. Charbonneau for helpful discussions that improved the analysis in this paper. We thank Eike Gunther, Hristo Stoev and the TNG operators for all their help in obtaining the data in this work, as well as the wonderful staff at the Observatorio del Roque de Los Muchachos, La Palma. J.L.B. and M.E.Y. acknowledge funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme under grant agreement no. 805445. This work was performed in part under contract with the Jet Propulsion Laboratory (JPL) funded by NASA through the Sagan Fellowship Program executed by the NASA Exoplanet Science Institute. R.A. acknowledges support from the Spanish MINECO grant ESP2014-57495-C2-1-R. S.H. acknowledges CNES funding through the grant 837319. P.C. acknowledges support from Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq) under grant 310041/2018-0. This article is based on observations made with the Italian Telescopio Nazionale Galileo (TNG) operated on the island of La Palma by the Fundación Galileo Galilei of the INAF (Istituto Nazionale di Astrofisica) at the Spanish Observatorio del Roque de los Muchachos of the Instituto de Astrofisica de Canarias. This research used the facilities of the Italian Center for Astronomical Archive (IA2) operated by INAF at the Astronomical Observatory of Trieste. This research has made use of the NASA Exoplanet Archive, which is operated by the California Institute of Technology, under contract with the National Aeronautics and Space Administration under the Exoplanet Exploration Program, and NASA’s Astrophysics Data System Bibliographic Services. The following software and packages were used in this work: the HARPS-N Data Reduction Software (DRS, Lovis & Pepe 2007); Python v3.7; Python packages Astropy (Astropy Collaboration 2013), NumPy (Oliphant 2006; van der Walt et al. 2011), and Matplotlib (Hunter 2007). The authors would like to thank the anonymous referee for their insightful comments and suggestions.

Appendix A Confirming the efficacy of the pipeline

The pipeline we present in this work is an adaptation of that used for infrared high resolution spectroscopy of exoplanet atmospheres where the telluric lines are the dominant contaminant (e.g. Birkby et al. 2013, 2017; Brogi et al. 2012). In this work, we make adjustments such as aligning to the true stellar rest frame to improve removal of the stellar lines. To confirm that our pipeline was able to recover previously confirmed signals, we ran it on 3.2 μm CRIRES/VLT observations of HD 189773 b as presented in Birkby et al. (2013) and Cabot et al. (2019) who both detected water in the planet’s atmosphere. The contaminating tellurics (and few stellar) lines were removed following the method laid out in Sect. 4.4, and SYSREM was iterated eight times. Using the same cross-correlation template presented in Birkby et al. (2013), we detect a signal of water absorption at Kp = 152 km s−1, Δv = 0 km s−1 (where vsys = 2.361 km s−1) using the pipeline presented in this paper. The detection is shown in Figure A.1. The total S/N=4.66 of our detection is slightly lower than Birkby et al. (2013, S/N=5.1) and Cabot et al. (2019, S/N=4.8), likely due to the difference in the number of SYSREM iterations used to clean the data.

thumbnail Fig. A.1

Detection of water vapour in the atmosphere of HD 189733b found using the pipeline described in this paper. The data are from CRIRES/VLT at 3.2 μm. This serves to verify the efficacy of our pipeline in recovering known detections (Birkby et al. 2013; Cabot et al. 2019). Axes as for Figures 14 and 15.

Appendix B Additional figures

We present the equivalent plots displayed in Figure 5, for the remaining individuals nights of data (see Figure B.1), and the equivalent plot as displayed in the left plot of Figure 14, but for the individual nights, as opposed to all the nights combined (see Figure B.2).

thumbnail Fig. B.1

As for Figure 5, but for Nights 2, 3, 4, 5, 6, 7 and 8. Continued on the next page. Note that the wavelength range 497.52 nm - 498.70 nm is covered by order 35 from HARPS-N data and order 39 from HARPS data.

thumbnail Fig. B.2

As for the left plot in Figure 14, but displaying the individual results from Nights 1, 2, 3, 5, 6, 7 and 8, as opposed to the combined results. Some features appear at the S/N=-5 level, but do not propagate to the all-nights matrix, indicating that these are probably due to residual stellar contamination.

References

  1. Aigrain, S., Pont, F., & Zucker, S. 2012, MNRAS, 419, 3147 [NASA ADS] [CrossRef] [Google Scholar]
  2. Angerhausen, D., DeLarme, E., & Morse, J. A. 2015, PASP, 127, 1113 [NASA ADS] [CrossRef] [Google Scholar]
  3. Astropy Collaboration (Robitaille, T. P., et al.) 2013, A&A, 558, A33 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  4. Barstow, J. K., Aigrain, S., Irwin, P. G. J., et al. 2014, ApJ, 786, 154 [NASA ADS] [CrossRef] [Google Scholar]
  5. Birkby, J. L. 2018, Exoplanet Atmospheres at High Spectral Resolution [Google Scholar]
  6. Birkby, J. L., de Kok, R. J., Brogi, M., et al. 2013, MNRAS, 436, L35 [Google Scholar]
  7. Birkby, J. L., de Kok, R. J., Brogi, M., Schwarz, H., & Snellen, I. A. G. 2017, AJ, 153, 138 [NASA ADS] [CrossRef] [Google Scholar]
  8. Borra, E. F., & Deschatelets, D. 2018, MNRAS, 481, 4841 [NASA ADS] [CrossRef] [Google Scholar]
  9. Boucher, A., Darveau-Bernier, A., Pelletier, S., et al. 2021, AJ, 162, 233 [NASA ADS] [CrossRef] [Google Scholar]
  10. Brogi, M., & Line, M. R. 2019, AJ, 157, 114 [Google Scholar]
  11. Brogi, M., Snellen, I. A. G., de Kok, R. J., et al. 2012, Nature, 486, 502 [Google Scholar]
  12. Brogi, M., Snellen, I. A. G., de Kok, R. J., et al. 2013, ApJ, 767, 27 [NASA ADS] [CrossRef] [Google Scholar]
  13. Brogi, M., de Kok, R. J., Albrecht, S., et al. 2016, ApJ, 817, 106 [Google Scholar]
  14. Cabot, S. H. C., Madhusudhan, N., Hawker, G. A., & Gandhi, S. 2019, MNRAS, 482, 4422 [NASA ADS] [CrossRef] [Google Scholar]
  15. Casasayas-Barris, N., Pallé, E., Yan, F., et al. 2019, A&A, 628, A9 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  16. Cegla, H. M., Shelyag, S., Watson, C. A., & Mathioudakis, M. 2013, ApJ, 763, 95 [Google Scholar]
  17. Charbonneau, D., Noyes, R. W., Korzennik, S. G., et al. 1999, ApJ, 522, L145 [CrossRef] [Google Scholar]
  18. Chiavassa, A., & Brogi, M. 2019, A&A, 631, A100 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  19. Coelho, P. R. T. 2014, MNRAS, 440, 1027 [Google Scholar]
  20. Collier Cameron, A., Horne, K., Penny, A., & James, D. 1999, Nature, 402, 751 [NASA ADS] [CrossRef] [Google Scholar]
  21. Cosentino, R., Lovis, C., Pepe, F., et al. 2012, in Proc. SPIE, 8446, Ground-based and Airborne Instrumentation for Astronomy IV, 84461V [NASA ADS] [CrossRef] [Google Scholar]
  22. Coughlin, J. L., & López-Morales, M. 2012, AJ, 143, 39 [NASA ADS] [CrossRef] [Google Scholar]
  23. Cowan, N. B., & Agol, E. 2011, ApJ, 729, 54 [Google Scholar]
  24. Crass, J., Gaudi, B. S., Leifer, S., et al. 2021, Extreme Precision Radial Velocity Working Group Final Report [Google Scholar]
  25. Cunha, D., Santos, N. C., Figueira, P., et al. 2014, A&A, 568, A35 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  26. de Kok, R. J., Brogi, M., Snellen, I. A. G., et al. 2013, A&A, 554, A82 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  27. Deibert, E. K., de Mooij, E. J. W., Jayawardhana, R., et al. 2021, ApJ, 919, L15 [NASA ADS] [CrossRef] [Google Scholar]
  28. Demory, B.-O., Seager, S., Madhusudhan, N., et al. 2011, ApJ, 735, L12 [Google Scholar]
  29. Demory, B.-O., de Wit, J., Lewis, N., et al. 2013, ApJ, 776, L25 [Google Scholar]
  30. Di Marcantonio, P., Morossi, C., Franchini, M., & Lehmann, H. 2019, AJ, 158, 161 [CrossRef] [Google Scholar]
  31. Dumusque, X., Boisse, I., & Santos, N. C. 2014, ApJ, 796, 132 [NASA ADS] [CrossRef] [Google Scholar]
  32. Ehrenreich, D., Lovis, C., Allart, R., et al. 2020, Nature, 580, 597–601 [NASA ADS] [CrossRef] [Google Scholar]
  33. Esteves, L. J., Mooij, E. J. W. D., & Jayawardhana, R. 2015, ApJ, 804, 150 [NASA ADS] [CrossRef] [Google Scholar]
  34. Evans, T. M., Pont, F., Sing, D. K., et al. 2013, ApJ, 772, L16 [NASA ADS] [CrossRef] [Google Scholar]
  35. Fischer, D. A., Anglada-Escude, G., Arriagada, P., et al. 2016, PASP, 128, 066001 [Google Scholar]
  36. Gibson, N. P., Merritt, S., Nugroho, S. K., et al. 2020, MNRAS, 493, 2215 [Google Scholar]
  37. Guilluy, G., Andretta, V., Borsa, F., et al. 2020, A&A, 639, A49 [EDP Sciences] [Google Scholar]
  38. Hawker, G. A., & Parry, I. R. 2019, MNRAS, 484, 4855 [NASA ADS] [CrossRef] [Google Scholar]
  39. Haywood, R. D., Collier Cameron, A., Queloz, D., et al. 2014, MNRAS, 443, 2517 [Google Scholar]
  40. Haywood, R. D., Collier Cameron, A., Unruh, Y. C., et al. 2016, MNRAS, 457, 3637 [Google Scholar]
  41. Heng, K., & Demory, B.-O. 2013, ApJ, 777, 100 [NASA ADS] [CrossRef] [Google Scholar]
  42. Heng, K., Morris, B. M., & Kitzmann, D. 2022, Nat. Astron. [Google Scholar]
  43. Hoeijmakers, H. J., Snellen, I. A. G., & van Terwisga, S. E. 2018a, A&A, 610, A47 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  44. Hoeijmakers, H. J., Ehrenreich, D., Heng, K., et al. 2018b, Nature, 560, 453–5 [CrossRef] [Google Scholar]
  45. Hoeijmakers, H. J., Ehrenreich, D., Kitzmann, D., et al. 2019, A&A, 627, A165 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  46. Hooton, M. J., Hoyer, S., Kitzmann, D., et al. 2022, A&A, 658, A75 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  47. Horne, K. 1986, PASP, 98, 609 [Google Scholar]
  48. Hunter, J. D. 2007, Comput. Sci. Eng., 9, 90 [NASA ADS] [CrossRef] [Google Scholar]
  49. Kausch, W., Noll, S., Smette, A., et al. 2015, A&A, 576, A78 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  50. Kesseli, A. Y., Snellen, I. A. G., Alonso-Floriano, F. J., Mollière, P., & Serindag, D. B. 2020, AJ, 160, 228 [NASA ADS] [CrossRef] [Google Scholar]
  51. Kipping, D., & Bakos, G. 2011, ApJ, 730, 50 [NASA ADS] [CrossRef] [Google Scholar]
  52. Leigh, C., Collier Cameron, A., Horne, K., Penny, A., & James, D. 2003, MNRAS, 344, 1271 [NASA ADS] [CrossRef] [Google Scholar]
  53. Lovis, C., & Pepe, F. 2007, A&A, 468, 1115 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  54. Lovis, C., Snellen, I., Mouillet, D., et al. 2017, A&A, 599, A16 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  55. Martins, J. H. C., Figueira, P., Santos, N. C., & Lovis, C. 2013, MNRAS, 436, 1215–24 [NASA ADS] [CrossRef] [Google Scholar]
  56. Martins, J. H. C., Santos, N. C., Figueira, P., et al. 2015, A&A, 576, A134 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  57. Mayor, M., & Queloz, D. 1995, Nature, 378, 355 [Google Scholar]
  58. Mazeh, T., Tamuz, O., & Zucker, S. 2007, in Transiting Extrapolar Planets Workshop, eds. C. Afonso, D. Weldrake, & T. Henning, ASP Conf. Ser., 366, 119 [NASA ADS] [Google Scholar]
  59. McQuillan, A., Aigrain, S., & Mazeh, T. 2013, MNRAS, 432, 1203 [Google Scholar]
  60. Meadows, V. S., Reinhard, C. T., Arney, G. N., et al. 2018, Astrobiology, 18, 630 [Google Scholar]
  61. Merritt, S. R., Gibson, N. P., Nugroho, S. K., et al. 2020, A&A, 636, A117 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  62. Meunier, N., Desort, M., & Lagrange, A. M. 2010, A&A, 512, A39 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  63. Meunier, N., Lagrange, A. M., Mbemba Kabuiku, L., et al. 2017, A&A, 597, A52 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  64. Milbourne, T. W., Haywood, R. D., Phillips, D. F., et al. 2019, ApJ, 874, 107 [Google Scholar]
  65. Nugroho, S. K., Kawahara, H., Masuda, K., et al. 2017, AJ, 154, 221 [Google Scholar]
  66. Nugroho, S. K., Gibson, N. P., de Mooij, E. J. W., et al. 2020a, ApJ, 898, L31 [Google Scholar]
  67. Nugroho, S. K., Gibson, N. P., de Mooij, E. J. W., et al. 2020b, MNRAS, 496, 504 [Google Scholar]
  68. Nugroho, S. K., Kawahara, H., Gibson, N. P., et al. 2021, ApJ, 910, L9 [CrossRef] [Google Scholar]
  69. Oliphant, T. 2006, Guide to NumPy (Trelgol Publishing USA) [Google Scholar]
  70. Pepe, F., Mayor, M., Galland, F., et al. 2002, A&A, 388, 632 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  71. Pino, L., Désert, J.-M., Brogi, M., et al. 2020, ApJ, 894, L27 [Google Scholar]
  72. Rappaport, S., Swift, J., Levine, A., et al. 2014, ApJ, 788, 114 [Google Scholar]
  73. Rodler, F., Kürster, M., & Henning, T. 2010, A&A, 514, A23 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  74. Rodler, F., Kürster, M., & Barnes, J. R. 2013, MNRAS, 432, 1980 [NASA ADS] [CrossRef] [Google Scholar]
  75. Rowe, J. F., Matthews, J. M., Seager, S., et al. 2008, ApJ, 689, 1345 [Google Scholar]
  76. Scandariato, G., Borsa, F., Sicilia, D., et al. 2021, A&A, 646, A159 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  77. Serindag, D. B., & Snellen, I. A. G. 2019, ApJ, 871, L7 [Google Scholar]
  78. Simpson, E. K., Baliunas, S. L., Henry, G. W., & Watson, C. A. 2010, MNRAS, 408, 1666 [NASA ADS] [CrossRef] [Google Scholar]
  79. Smette, A., Sana, H., Noll, S., et al. 2015, A&A, 576, A77 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  80. Snellen, I. A. G., de Kok, R. J., de Mooij, E. J. W., & Albrecht, S. 2010, Nature, 465, 1049 [Google Scholar]
  81. Snellen, I. A. G., de Kok, R. J., le Poole, R., Brogi, M., & Birkby, J. 2013, ApJ, 764, 182 [Google Scholar]
  82. Snellen, I. A. G., Brandl, B. R., de Kok, R. J., et al. 2014, Nature, 509, 63–5 [NASA ADS] [CrossRef] [Google Scholar]
  83. Snellen, I., de Kok, R., Birkby, J. L., et al. 2015, A&AS, 576, A59 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  84. Strachan, J. B. P., & Anglada-Escudé, G. 2020, MNRAS, 493, 1596 [Google Scholar]
  85. Tamuz, O., Mazeh, T., & Zucker, S. 2005, MNRAS, 356, 1466 [Google Scholar]
  86. Traub, W. A., & Cutri, R. 2008, in ASP Conf. Ser., 398, Extreme Solar Systems, eds. D. Fischer, F. A. Rasio, S. E. Thorsett, & A. Wolszczan, 475 [NASA ADS] [Google Scholar]
  87. van Belle, G. T., & von Braun, K. 2009, ApJ, 694, 1085 [NASA ADS] [CrossRef] [Google Scholar]
  88. van der Walt, S., Colbert, S. C., & Varoquaux, G. 2011, Comput. Sci. Eng., 13, 22 [Google Scholar]
  89. van Sluijs, L., de Mooij, E., Kenworthy, M., et al. 2019, A&A, 626, A97 [EDP Sciences] [Google Scholar]
  90. Webb, R. K., Brogi, M., Gandhi, S., et al. 2020, MNRAS, 494, 108–19 [NASA ADS] [CrossRef] [Google Scholar]
  91. Wong, I., Shporer, A., Daylan, T., et al. 2020, AJ, 160, 155 [Google Scholar]
  92. Wong, I., Kitzmann, D., Shporer, A., et al. 2021, ApJ, 162, 127 [CrossRef] [Google Scholar]
  93. Yan, F., & Henning, T. 2018, Nat. Astron., 2, 714–8 [NASA ADS] [CrossRef] [Google Scholar]
  94. Yan, F., Pallé, E., Reiners, A., et al. 2020, A&A, 640, L5 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

1

As noted by Scandariato et al. (2021, hereafter S21), the general definition of the autocorrelation function is of a signal cross-correlated with a copy of itself. The autocorrelation function (ACF) as defined by Borra & Deschatelets (2018), however, refers to a spectrum that is cross-correlated with a stellar template. To avoid confusion in this work, we use the acronym “ACF” to refer to the Borra & Deschatelets (2018) method, and the term ‘autocorrelation function’ retains its standard meaning.

2

i.e. its brightness at full illumination compared to a Lambertian disk.

3

Based on the data currently available in the NASA Exoplanet Archive, 1.2 ± 0.1 RJ is the approximate mean radius value for short-orbit transiting exoplanets (1 < Porb < 10 days) with similar masses to 51 Pegasi b (0.3 < Mp < 0.6 MJ). The true mass of 51 Pegasi b has been measured directly to be 0.46 ± 0.02 MJ (Brogi et al. 2013).

4

To ensure that our results from all eight nights were possible to combine, we used HARPS-N’s velocity resolution throughout our analysis.

5

ρX,Y = 1 indicates perfect correlation, 0 no correlation, and -1 perfect anti-correlation.

6

The model had HARPS-N resolution with Rp = 1.9RJ, Ag = 0.5 (M15).

7

The literature sometimes calls this cut a 3σ upper limit see e.g. Hoeijmakers et al. (2018a), which would be the case in pure Gaussian noise.

8

1.2 RJ is the approximate mean radius value for short-orbit transiting exoplanets with similar masses to 51 Pegasi b in the NASA Exoplanet Archive.

All Tables

Table 1

Summary of the eight nights of observations.

Table 2

Rotational broadening, vrefl, of different systems, calculated using Eqs. (4), (5), and (6).

All Figures

thumbnail Fig. 1

Model spectrum of reflected light for 51 Pegasi b based on a continuum normalised synthetic spectrum of its host star, at the spectral resolution of HARPS-N (Fmod, norm, R = 115 000, blue line in left-hand panel). The black solid line shows the effect of broadening the model to vrefl = 12.0 km s−1 via convolution with the rotation kernel shown in the right-hand panel. It is approximately equivalent to R ~ 25 000. Many of the narrow spectral features are blended and shallower in the broadened model, which reduces the last term of Eq. (1) and lowers S/Np.

In the text
thumbnail Fig. 2

Phases, ϕ, covered during the eight nights. The open grey circles show all observations and the solid coloured circles show the frames used in the final analysis (see Table 1) from each night. Discarding unusable data from Night 4 left a gap in the phase coverage. The smallest phase observed (ϕ = 0.36), the largest phase observed (ϕ = 0.64), and the point of inferior conjunction (ϕ = 0.5), are marked.

In the text
thumbnail Fig. 3

Extracted spectra (Fobs) in both e2ds (top panel) and s1d (middle panel) formats, and the synthetic stellar spectrum (Fmod, bottom panel) used in the cross-correlation. The units e are photoelectrons. A zoom around wavelengths 538.2–533.2 nm (order 43) is shown in blue and the right-hand panels. These plots show the dense forest of spectral lines in our dataset. The continuum variation in the e2ds format is the uncorrected blaze function, while in the s1d format we see the residual low order instrument response in our non-flux calibrated data.

In the text
thumbnail Fig. 4

Section of order 35 from Night 1, aligned to different rest frames, with the corresponding standard deviation shown in the top left of each panel. For each alignment, the matrix has been continuum normalised and then the mean of each column has been subtracted. This exposes the characteristic “X” pattern caused by misalignment of the stellar lines. The noise is reduced closest to the photon noise in the host star’s true rest frame, reducing the standard deviation by a factor of 1.8.

In the text
thumbnail Fig. 5

Left panels: sections of the order 35, from Night 1, shown at various stages of post-processing, demonstrating the removal of the stellar residuals. Right panels: sections of frame 38 from order 35, demonstrating the removal of the stellar residuals from a single spectrum. Row 1: observed spectra after bad pixel correction and alignment to the true stellar rest frame; row 2: after the removal of each order’s continuum; row 3: after division by a master spectrum; row 4: after masking the columns with over 1.4× the mean column variance, followed by one iteration of the  SYSREM algorithm; row 5: after weighting columns by their variance; bottom row: same stage as for Row 5, but with a model planet (Rp = 1.9RJ, Ag = 0.5, vrefl, mod = 0.0 km s−1) injected at 100× nominal strength after the alignment to the stellar rest frame. At each stage of post-processing, the standard deviation of the entireorder decreases.

In the text
thumbnail Fig. 6

Sets of SYSREM a-values from Night 1of data that has been masked with a 1.4σ clip (blue crosses, top three rows, see Sect. 4.4), 2nd degree polynomials fitted to detect any trends (black lines, top three rows), and possible causes of the observed  SYSREM trends (purple crosses, bottom row). Top three rows: a-values subtracted during the first, second, third and fourth iteration of  SYSREM for orders 1, 35 and 69, respectively. There is a large variety in the shape of the trend removed during each iteration, and little similarity between the trends, indicating a strong wavelength dependence. Bottom row: air mass, primary mirror channel temperature, local air pressure and humidity during Night 1 of observations. The air mass values tightly correlate with the a-values for order 69 where tellurics were strongest. The other physical trends do not correlate with any of the sets of a-values.

In the text
thumbnail Fig. 7

Section of order 69, from Night 1 of observations, after zero, one, two, three and four iterations of the SYSREM algorithm respectively. After each iteration, the standard deviation of the order is reduced.

In the text
thumbnail Fig. 8

Optimising the S/N of the ΔCCFs at Δv = 0 km−1 in the planet rest frame, for all orders and all the nights combined, optimised for cases 1 and 2 (top panel) and for cases 3 and 4 (bottom panel). More SYSREM iterations improves the model recovery when rotational broadening is not included, indicating that SYSREM is likely removing the wings of broadened signals and thus should be used with less iterations.

In the text
thumbnail Fig. 9

Comparison of the summed (all orders combined) CCFs for Night 1 with a model (mod) injected with parameters Rp = 1.9RJ and Ag = 0.5, when the cross-correlation template (c. c. temp) is broadened (solid lines) and is not broadened (dashed lines). Mismatched broadening (green and brown lines) results in lower S/N in the CCF peak compared to matched broadening (purple and grey lines).

In the text
thumbnail Fig. 10

ΔCCFs from different orders from Night 1 (see Sect. 4.5.2) demonstrating the sensitivity of each order in recovering an injected model.

In the text
thumbnail Fig. 11

Top two panels: CCF weights for each night of data, for cases 1 and 2 (top panel) and for cases 3 and 4 (second panel). Third panel: S/N per order, for each night of data. The S/N increases towards the red showing the sensitivity of both HARPS detectors. Bottom panel: number of stellar lines per order, which is the same for each night of data.

In the text
thumbnail Fig. 12

All-nights CCF matrices for cases 1, 2, 3 and 4, with the CCFs from all eight nights ordered by phase. The top panels are in the rest frame of the star and the bottom panels are in the rest frame of the planet. The cases where the cross-correlation template is rotationally broadened (cases 1 and 2, vrefl, c.c. temp = 12.0 km s−1, left two columns) and is not broadened (cases 3 and 4, vrefl, c.c. temp = 0.0 km s−1, right two columns) are shown. Cases 2 and 4, where a model (mod) following that proposed by M15 (Rp = 1.9, Ag = 0.5) has been injected are indicated by ‘With model’. For case 2, the injected model returns a very weak trail, as the CCF peak is spread out over more Δv increments. For case 4, the injected model is clearly visible to the eye as a blue-shifting trail in the stellar rest frame and as a vertical stationary trail in the planet rest frame. The phase increments are not evenly spaced due to uneven phase coverage (see Fig. 2). The faint vertical trails at Δv = 0.0 km s−1 in the top panels is due to residual stellar lines. The stellar residuals are much more pronounced for cases 3 and 4, because the stellar residuals match with the non-broadened cross-correlation template.

In the text
thumbnail Fig. 13

Summed 1D CCFs of the observed data in terms of S/N, where each lines corresponds to a different Kp in 40 ≤ Kp ≤ 200 km s−1, for cases 1 (left) and 3 (right), with cross-correlation templates with and without broadening (vrefl, c.c. temp = 12.0 km s−1 and 0.0 km s−1 respectively). The dashed light blue line and the dotted purple line mark the SN = ±3 and SN = ±4 thresholds. The cross-correlation template without rotational broadening results in stronger noise spikes due to matching with narrower lines in the residual host star spectrum.

In the text
thumbnail Fig. 14

Kp – Δv maps showing the S/N of the CCFs for a range of Kp, for all the nights combined. The red dashed lines mark the expected location of any signal from 51 Pegasi b. The plots showscase 1 (left) and case 3 (right), with and without a rotationally broadened cross-correlation template (vrefl, c.c. temp = 12.0 km s−1 and 0.0 km s−1 respectively). No planet signal is recovered, with SN = −0.16 and SN = −0.30 at Kp = 133 km s−1 and Δv= 0 km s−1 for the left and right plots respectively. The colour bars are calibrated to peak at the highest value in each Kp – Δv map, and descend in increments of 0.5.

In the text
thumbnail Fig. 15

As for Fig. 14, Kp – Δv maps showing the S/N of the CCFs for all the nights combined, but with a model injected with parameters Ag = 0.5, Rp = 1.9RJ (M15). The plots show case 2 (left) and case 4 (right), with and without rotational broadening (vrefl, mod = 12.0 km s−1 and 0.0 km s−1 respectively). The cross-correlation templates match the broadening of the model in each case. The broadened model is weakly recovered with SN = 5.50 at Kp = 133 km s−1 and Δv= 0 km s−1. Without broadening the model is recovered very strongly with SN = 18.82 at Kp = 133 km s−1 and Δv= 0 km s−1. The colour bars are calibrated to peak at the highest value in each Kp–Δv map, and descend in increments of 1.0.

In the text
thumbnail Fig. 16

S/N recovered by our pipeline for a suite of injected reflected light models at Kp = 133 km s−1 and Δv= 0 km s−1, as a functionof geometric albedo (Ag) and planet radius (Rp), for models with (left) and without (right) rotational broadening. The white and grey contours are above the threshold for detection, while blue regions cover statistically viable scenarios for 51 Pegasi b, given the upper limits on the FpF contrast that we derive in Sect. 5.4. The black dashed contours mark out the FpF contrast values. The red dashed lines mark the Ag = 0.5 and Rp= 1.9RJ parameters proposed by M15 for 51 Pegasi b. The blue to grey colour scale is not evenly spaced, but increases in increments of 2 up to SN ≤ 10, in increments of 4 when 10 < SN ≤ 40 and in increments of 8 when SN > 40. The colour bars begin at −1 because the S/N value where no model is injected is less than zero for the cases with and without broadening. The colour scale is the same for both plots.

In the text
thumbnail Fig. 17

Comparison of the autocorrelation function of the model, with (solid) and without (dashed) broadening, to the CCFs obtained after injecting models into the observed data (purple and grey lines), to assess if post-processing affects the FWHM of the signal. All are normalised with respect to the black dashed line for visual purposes. In the autocorrelation functions, one spectrum has the residuals mask from Sect. 4.4 applied beforehand to better match the steps in the analysis that exclude the more saturated stellar features. The FWHM values of the autocorrelation functions are 34.4 km s−1 and 10.4 km s−1, respectively with (blue solid line) and without (black dashed line) broadening. For the injected models, the FWHM of the CCFs are 29.6 km s−1 (purple solidline), and 7.2 km s−1 (grey dashed line), respectively with and without broadening. The grey and purple lines correspond to their un-normalised counterparts in Fig. 9.

In the text
thumbnail Fig. 18

vrefl values, in kms−1 calculated using the equations defined in Rodler et al. (2010, see Eqs. (5), (4) and (6)), for a range of exoplanet and stellar parameters. Each row of plots shows vrefl values for different values of stellar radii (R) and exoplanet radii (Rp). The y-axes correspond to the period of the exoplanet’s rotation on its own axis (Prot,p). The x-axes correspond to the period of the exoplanet’s orbit around its host star and extends down to Porb = 0.1 days). The dashed black diagonal lines show where Prot,p = Porb, i.e. the exoplanet is always showing the same face to its host star (as is the case for 51 Pegasi b). Each individual plot corresponds to a different period of the star’s rotation on its own axis (Prot,⋆). The dotted black vertical lines show where Prot,⋆ = Porb, i.e. the star is always showing the same face to the exoplanet. The point where the dashed and dotted black lines cross mark where Prot,p = Prot,⋆ = Porb, i.e. the star-exoplanet system is in a full tidal lock.

In the text
thumbnail Fig. A.1

Detection of water vapour in the atmosphere of HD 189733b found using the pipeline described in this paper. The data are from CRIRES/VLT at 3.2 μm. This serves to verify the efficacy of our pipeline in recovering known detections (Birkby et al. 2013; Cabot et al. 2019). Axes as for Figures 14 and 15.

In the text
thumbnail Fig. B.1

As for Figure 5, but for Nights 2, 3, 4, 5, 6, 7 and 8. Continued on the next page. Note that the wavelength range 497.52 nm - 498.70 nm is covered by order 35 from HARPS-N data and order 39 from HARPS data.

In the text
thumbnail Fig. B.2

As for the left plot in Figure 14, but displaying the individual results from Nights 1, 2, 3, 5, 6, 7 and 8, as opposed to the combined results. Some features appear at the S/N=-5 level, but do not propagate to the all-nights matrix, indicating that these are probably due to residual stellar contamination.

In the text

Current usage metrics show cumulative count of Article Views (full-text article views including HTML views, PDF and ePub downloads, according to the available data) and Abstracts Views on Vision4Press platform.

Data correspond to usage on the plateform after 2015. The current usage metrics is available 48-96 hours after online publication and is updated daily on week days.

Initial download of the metrics may take a while.