Evidence for at least three planet candidates orbiting HD 20794

F. Feng; M. Tuomi; H. R. A. Jones

doi:10.1051/0004-6361/201730406

Home

All issues

Volume 605 (September 2017)

A&A, 605 (2017) A103

Full HTML

Free Access

Issue		A&A Volume 605, September 2017


Article Number		A103
Number of page(s)		11
Section		Planets and planetary systems
DOI		https://doi.org/10.1051/0004-6361/201730406
Published online		18 September 2017

A&A 605, A103 (2017)

Evidence for at least three planet candidates orbiting HD 20794^⋆

F. Feng, M. Tuomi and H. R. A. Jones

Centre for Astrophysics Research, School of Physics, Astronomy and Mathematics, University of Hertfordshire, College Lane, Hatfield, AL10 9AB, UK
e-mail: f.feng@herts.ac.uk, fengfabo@gmail.com

Received: 6 January 2017
Accepted: 14 May 2017

Abstract

Aims. We explore the feasibility of detecting Earth analogs around Sun-like stars using the radial velocity method by investigating one of the largest radial velocities data sets for the one of the most stable radial-velocity stars HD 20794.

Methods. We proceed by disentangling the Keplerian signals from correlated noise and activity-induced variability. We diagnose the noise using the differences between radial velocities measured at different wavelength ranges, so-called “differential radial velocities”.

Results. We apply this method to the radial velocities measured by HARPS, and identify four signals at 18, 89, 147 and 330 d. The two signals at periods of 18 and 89 d are previously reported and are better quantified in this work. The signal at a period of about 147 d is reported for the first time, and corresponds to a super-Earth with a minimum mass of 4.59 Earth mass located 0.51 AU from HD 20794. We also find a significant signal at a period of about 330 d corresponding to a super-Earth or Neptune in the habitable zone. Since this signal is close to the annual sampling period and significant periodogram power in some noise proxies are found close to this signal, further observations and analyses are required to confirm it. The analyses of the eccentricity and consistency of signals provide weak evidence for the existence of the previously reported 43 d signal and a new signal at a period of about 11.9 d with a semi-amplitude of 0.4 m/s.

Conclusions. We find that the detection of a number of signals with radial velocity variations around 0.5 m/s that are likely caused by low mass planet candidates demonstrates the important role of noise modeling in searching for Earth analogs.

Key words: methods: statistical / methods: numerical / techniques: radial velocities / stars: individual: HD 20794

^⋆

Radial velocity tables are available at the CDS via anonymous ftp to cdsarc.u-strasbg.fr (130.79.128.5) or via http://cdsarc.u-strasbg.fr/viz-bin/qcat?J/A+A/605/A103

© ESO, 2017

1. Introduction

High precision spectrometers enable us to find super-Earths by measuring the Doppler shift of the stellar spectra caused by the periodic perturbations of exoplanets orbiting the target stars. Unlike the transit method, which relies on a particular orbital orientation, the radial velocity (RV) method can provide useful information about planets around all stars. The recent detection of the nearest habitable-zone planet candidate Proxima Centauri b demonstrates the important role of the RV technique in detecting Earth analogs (Anglada-Escudé et al. 2016).

However, the stellar spectrum is contaminated by various noise sources, such as stellar activity, instrumental noise, and non-perfect data reduction. An analysis of high-cadence spectroscopy of M dwarfs shows that the noise floor (about 1 m/s) of current Doppler surveys has to a large extent an instrumental origin (Berdiñas et al. 2016). Our analysis is also concerned with demonstration of the strong dependence of activity-induced noise on wavelength. The minimization of both instrumental and stellar noise requires a noise model accounting for noise correlated in time and wavelength.

Considering that complex models typically lead to false negatives while simple models lead to false positives (Feng et al. 2016), we follow the Goldilocks principle introduced in Feng et al. (2016) to select the best noise model for a given RV data set. In other words, we devise a noise model framework to diagnose the noise in a given data set. In this work, we apply this noise model framework to the HARPS RV measurements of HD 20794.

HD 20794 is a high-velocity and metal-deficient G8 star with a mass of $0.81 3_{-0.012}^{+ 0.018}$ $\hbox{$0.813^{+0.018}_{-0.012}$}$ M_⊙ (Ramírez et al. 2013). It has been reported to host at least three planets and a dust disk (Pepe et al. 2011; Kennedy et al. 2015). The postulated planets are likely super-Earths with orbital periods of 18.1, 40.1, and 90.3 d Pepe et al. (2011; hereafter P11). These results of P11 were obtained via periodogram analysis, which assumes white RV noise and circular planetary orbits. Although HD 20794 is known to have planets, it has long been chosen as a special RV reference target due to its notable stability over long time baselines found by the southern radial velocity programs of the Anglo-Australian Planet Search (e.g., Fig. 2 of Butler et al. 2001) and its continuous presence on the HARPS guaranteed time observation list¹.

This paper is structured as follows. We introduce the RV data in Sect. 2. In Sect. 3, we introduce differential RVs to model wavelength-dependent noise and choose the so-called Goldilocks noise model in the Bayesian framework. In Sect. 4, we apply our models to the data and report the results. Finally, we discuss and conclude in Sect. 5.

2. HARPS Doppler measurements for HD 20794

In the European Southern Observatory archive, there are 5150 publicly available RVs measured by HARPS from September 2003 to September 2013 for HD 20794 from programs Mayor, 60.A-9036, 072.C-0488, 083.C-1001, 084.C-0229, 086.C-0230, 087.C-0990, 088.C-0011, 089.C-0050, 090.C-0849, and 091.C-0936. We processed the Doppler measurements with the TERRA algorithm (Anglada-Escudé & Butler 2012), which generates RVs for 72 echelle orders. The stellar activity is recorded by the following indices: the S-index measured from the Ca II H&K emission, line bisector shapes (BIS), and the width of the spectral lines (FWHM). We excluded the points with RVs or activity indices, which deviate from the mean by more than 5σ. We do this iteratively to omit any visible outliers. Moreover, we also removed the data obtained before JD 2 453 500 due to the large dispersion in the activity indices, FWHM in particular. Such a feature is also observed in the τ Ceti data (Feng et al. 2017b), although the cause for this may be different. Finally, we obtain a data set of 4882 RVs together with activity indices. These data and the discarded epochs are shown in Fig. 1. We see a great variation of FWHM before JD 2 453 500, and this feature is also found in the data measured for τ Ceti (Feng et al. 2017b)².

Fig. 1

Measurements of RV, S-index, BIS and FWHM by HARPS for HD 20794. The dashed line denotes JD 2 453 500 before which the data is probably affected by instrumental effects.

We used the RV differences between the 72 echelle orders to remove wavelength-dependent noise. We divided the 72 spectral orders into groups and averaged the RVs in each group to generate the so-called “aperture data sets”. For n groups of spectral orders, we generated nAPj aperture data sets, where j is a natural number not larger than n. We defined the RV differences between aperture data sets as differential RVs. We denoted them by nAPj − i, where n is the number of groups of spectral orders, and j and i denote different aperture data sets. For n independent aperture data sets, there are n − 1 independent differential RVs in total. The schematic of this data reduction process is shown in Fig. 2.

Fig. 2

Schematic diagram of data reduction and modeling process.

3. Noise model framework and Bayesian inference

3.1. Wavelength-dependent noise models

The RV variation induced by stellar activity and rotation is typically wavelength-dependent (Tuomi et al. 2013a). Thus the average of RVs measured at different wavelength ranges (or spectral orders) is biased because previous studies have ignored the influence of such a dependence. To remove such bias, we introduced “differential RVs” to weight the spectral orders a posteriori. Differential RVs are RV differences between different spectral orders (or wavelength ranges) and thus do not contain Keplerian variation since Keplerian variation is color independent. Following the Goldilocks principle of noise modeling introduced by Feng et al. (2016), we reduced the number of different RVs by deriving them from averaged spectral orders, so-called aperture data sets (see Fig. 2).

The differential RVs are then $D (t_{i}, λ_{j}) \equiv v (t_{i}, λ_{j + 1}) - v (t_{i}, λ_{j}) = Ψ (t_{i}, λ_{j + 1}) - Ψ (t_{i}, λ_{j}),$ $\begin{equation} D(t_i,\lambda_j)\equiv v(t_i,\lambda_{j+1})-v(t_i,\lambda_j)=\Psi(t_i,\lambda_{j+1})-\Psi(t_i,\lambda_j), \end{equation}$ (1)where v(t_i,λ_j) is the RV measured at time t_i within a wavelength range centered at λ_j. We define the noise component of the RV model as $Ψ̂ (t_{i}, λ_{j}) = a (λ_{j}) t_{i} + b (λ_{j}) + \sum_{k} c_{k} (λ_{j}) I_{k} + \sum_{m = 1}^{N_{λ} - 1} d_{m} D (t_{i}, λ_{m}),$ $\begin{equation} \hat{\Psi}(t_i,\lambda_j) = a(\lambda_j)\,t_i+b(\lambda_j)+\sum_{k}c_k(\lambda_j)I_k+\sum_{m=1}^{N_\lambda-1}d_mD(t_i,\lambda_m), \label{eqn:psi} \end{equation}$ (2)where N_λ is the number of wavelength ranges or aperture divisions and thus N_D ≡ N_λ − 1 is the number of independent differential RVs, d_m characterizes the linear dependence of RV noise on the mth differential RV, a is the linear acceleration caused by stellar companions or activity cycles and b is the reference velocity. The linear dependence of RVs on activity index I_k is parameterized by constant c_k. We consider that a and b can be wavelength dependent, because besides fitting for trends due to companions, those parameters can also absorb any drift due to long-term activity not correlated with activity indices or long-term instrumental noise variation. We linearly combine $\hbox{$\hat{\Psi}(t_i,\lambda_j)$}$ with n Keplerian components to define the basic RV model, which is $\begin{matrix} v̂ b (t_{i}, λ_{j}) & = & \sum_{k = 1}^{n} f_{k} (t_{i}) + Ψ̂ (t_{i}, λ_{j}), \\ f_{k} (t_{i}) & = \end{matrix}$ $\begin{eqnarray} \hat{v}_b(t_i, \lambda_j)&=&\sum_{k=1}^{n} f_k(t_i)+\hat{\Psi}( t_i,\lambda_j),\\[2mm]\nonumber f_k(t_i)&=&K_k [\cos(\omega_k + \nu_k(t_i))+e_k\cos(\omega_k)], \label{eqn:basic} \end{eqnarray}$ (3)where f_k(t_i) is the kth Keplerian component, and $\hbox{$\hat{\Psi}$}$ is the noise component. In the Keplerian function, K_k, ω_k, ν_k, and e_k are the amplitude, longitude of periastron, true anomaly, and eccentricity of the kth Keplerian signal.

Since the moving average (MA) model can efficient account for red noise in RVs and avoid false negatives (Feng et al. 2016), we model the red noise using a general MA model with exponential smoothing which is $v̂ (t_{i}, λ_{j}) = v̂ b (t_{i}, λ_{j}) + \sum_{k} w_{k} (λ_{j}) \exp [- | t_{i} - t_{i - k} | / τ (λ_{j})] ϵ_{i - k},$ $\begin{equation} \hat{v}(t_i,\lambda_j)=\hat{v}_b(t_i,\lambda_j)+\sum_k w_k(\lambda_j) \exp[-|t_i-t_{i-k}|/\tau(\lambda_j)] \epsilon_{i-k}, \label{eqn:full} \end{equation}$ (4)where w_k and τ are the amplitude and timescale of the moving average, and ϵ_{i − k} is $\hbox{$v(t_{i-k},\lambda_{j})-\hat{v}_b(t_{i-k}, \lambda_j)$}$ . Hereafter, we define “nP+MA(q)+mD” as the n-planet model combined with qth-order moving average and m differential RVs deriving from m + 1 aperture data sets (see Fig. 2). The white noise model is $\hbox{$\hat{\Psi}(t_i,\lambda_j)$}$ , denoted by MA(0).

We also account for a white excess noise with an amplitude of s_J in the likelihood which is $\begin{matrix} ℒ_{j} & \equiv & P (v_{j} | θ,M) = \end{matrix}$ $\begin{eqnarray} \mathcal{L}_j&\equiv& P(v_j|\boldsymbol{\theta},M)=\notag \\ &&\prod_i\frac{1}{\sqrt{2\pi[\sigma_{i,j}^2+s_{\rm J}(\lambda_j)^2]}}\exp\left[-\frac{(\hat{v} (t_i,\lambda_j)-v_{i,j})^2}{2(\sigma_{i,j}^2+s_{\rm J}(\lambda_j)^2)}\right], \label{eqn:like} \end{eqnarray}$ (5)where σ_i,j is the measurement noise at time t_i in the jth aperture data set with wavelength range centered at λ_j, s_J(λ_j) is the jitter level, and v_i,j is the RV measured at time t_i in the jth aperture data set. Following Feng et al. (2016), we adopt a Gaussian prior distribution centered at zero and with a standard deviation of 0.1 for eccentricity, logarithmic uniform distributions for period P and correlation timescale τ, and uniform distributions for other parameters. The specific prior distributions are given in Table 1.

Table 1

Prior distributions of model parameters.

In almost all previous research, only the averaged data or 1AP1 is analyzed without accounting for wavelength-dependent noise. We justify the usage of differential RVs in the removal of wavelength-dependent noise by applying the model defined in Eqs. (4) and (5) to the 3AP1, 3AP2, 3AP3 and 1AP1 data sets. We model these data sets using the noise models of 0P+MA(1)+0D, 0P+MA(1)+2D, 0P+MA(0)+0D, and 0P+MA(0)+2D. We calculate the generalized periodogram with floating trend (GLST; Feng et al. 2017a) for each data set and each model to visualize the consistency of signals in wavelength. Following Cumming et al. (1999), the GLST is normalized with respect to the residual variance and the FAP is calculated accordingly.

The GLSTs for the 3AP1, 3AP2, 3AP3, and 1AP1 data sets and their residuals after subtracting the best-fitting noise model are shown in Fig. 3. The interpretation of specific peaks in plotted GLSTs is not important. Rather, our concern is the consistency between GLSTs of RVs and their residuals measured at different wavelengths. The FAPs are calculated assuming the null hypothesis of white noise and thus are not used to measure the significance of signals in the data contaminated by correlated noise. In the figure, we observe significant differences between GLSTs for 3AP1, 3AP2, 3AP3, and 1AP1, indicating significant wavelength-dependent noise in the RV data. On the contrary, the residuals of all RV data sets have similar GLSTs after subtracting the best-fitting 0P+MA(0)+2D or 0P+MA(1)+2D from the data. Such consistency is not sensitive to the change of time-correlated noise component (i.e., MA(0) and MA(1)), as seen in the rightmost two columns. Therefore differential RVs are essential in removing wavelength-dependent noise.

Fig. 3

Illustration of the existence of wavelength-dependent noise and the necessity of removing it using differential RVs. We show a series of GLSTs across the page. Each row of plots is given to a different aperture data set labeled 3AP1, 3AP2, 3AP3, and 1AP1. The following columns show GLSTs for the corresponding residuals after subtracting the best-fitting models of 0P+MA(0)+0D (white noise), 0P+MA(1)+0D (moving average), and 0P+MA(0)+2D (white noise with differential RVs), 0P+MA(1)+2D (moving average with differential RVs). The logarithm BF of a model with respect to the MA(0) model are shown for the residuals of each data set after subtracting the best model prediction. In each panel, the red dotted line denotes the period with the highest power. The false alarm probability (FAP) threshold of 0.1, 0.01 and 0.001 are shown by dashed lines.

In the following sections, we focus our analysis on the 1AP1 data set because it has a higher signal-to-noise ratio with respect to the other aperture data sets. As seen in Fig. 3, the wavelength-dependent noise in 1AP1 is properly removed by differential RVs a posteriori.

Table 2

Logarithmic BIC-estimated BFs for noise models with respect to the model without any MA components or differential RVs.

Table 3

Logarithmic BIC-estimated BFs for models with various numbers of planets for Keplerian and circular solutions.

3.2. Goldilocks noise model

We applied the adaptive Metropolis Markov chain Monte Carlo (AM) introduced by Haario et al. (2001) to sample the posterior. Specifically, we used tempered (or hot) AM chains to identify potential signals, and then used untempered (or cold) chains to constrain signals. Similar methods have been introduced by Gregory (2011). The hot chain is defined as a chain moving according to the modified posterior $\hbox{$P({\mathcal D}|\theta,M)^{\beta}P(\theta|M)$}$ with 0 <β ≤ 1, where $\hbox{$P({\mathcal D}|\theta,M)$}$ and P(θ | M) are the likelihood and prior of parameters θ for model M and data $\hbox{${\mathcal D}$}$ . We used four parallel chains for inference if the posterior samples drawn by these chains converged to a stationary distribution. In other words, the ratio of variances between and within these chains should be less than 1.1, which is the so-called “Gelman-Rubin criteria” (Gelman & Rubin 1992). We typically use the AM algorithm to draw a few million posterior samples for model and/or parameter inference. The detailed results are available online at http://star-www.herts.ac.uk/~ffeng/HD20794_supplementary/results

From the likelihoods of the posterior samples drawn by cold chains, we estimated the Bayes factor (BF) using the Bayesian information criterion (BIC). Following Feng et al. (2016), we regard model A as favored over model B if the logarithmic BF of the former with respect to the latter is larger than 5. To confirm a signal, we also adopted the criteria used by Tuomi (2012) that the period should be constrained from above and below in the posterior distribution of the period. In other words, $\hbox{$P(P_k|{\mathcal D},~M)$}$ converges to a stationary distribution.

Following the Goldilocks principle introduced by Feng et al. (2016), we applied the BF criterion to determine which noise model to use for removing noise. We generated a set of noise models by setting q ∈ { 0,1,2,3,4,5,6 } and N_D ∈ { 0,2,5,8 }, and apply them to the data. The results of the model comparison are shown in Table 2. We find that the fourth order MA combined with five differential RVs is the best choice. Apart from q and N_D, all other model parameters are free. Specifically there are 2 parameters for the trend, 1 for jitter, 3 for activity indices, 5 for differential RVs and 5 for the MA model. There are 16 free parameters in total in the 0P+MA(4)+5D model. In the following section, we analyze the HARPS data using this model.

4. Keplerian candidates

We applied both Keplerian and sinusoidal functions to model the RV variations caused by planets, which are called the Keplerian solution and circular solution, respectively. We identified six signals in both solutions, and report the BFs for them in Table 3. Although the ~43 d signal identified in the Keplerian solution does not pass the BF threshold of 150, it satisfies the criteria for the circular solution. Moreover, without including the 43 d signal into the model, the eccentricity of the 89 d signal would be as high as 0.4. The 43 d and 89 d signals are approximately in 1:2 resonance, which is not rare for exoplanets (Steffen & Hwang 2015). Although the signal at a period of 11 d does not pass the BF threshold in both solutions, it is strong in the Keplerian solution, and increases the BF by at least one order of magnitude. In addition, it is visible in the periodograms for all observation seasons, as we see later. The periods in the circular solution are different from those in the Keplerian solution because of the assumption of periods in the circular solution.

To test whether the identified signals are caused by stellar and instrumental noise, we show the GLSTs for the data, the activity indices and differential RVs in Fig. 4. In the figure, we find that the RVs are strongly correlated with the FWHM and differential RVs, demonstrating the essential role of these noise proxies in removing correlated noise. We also find that periods around the ~330 d signal are strong in noise proxies such as FWHM and 6AP5-4, although these periods typically deviate from 330 d by more than 20 d. Such approximate overlaps are expected since the 330 d signal is close to the annual sampling frequency. Moreover, the probability of a random overlap between a signal and significant powers in the GLSTs of noise proxies increases with the number of proxies. Considering the minimization of correlated noise using noise proxies in the model, the 330 d signal is too strong to be caused by pure noise.

Fig. 4

GLSTs for the 1AP1 data set, various noise proxies, and the window function. The names of the times series are shown at the top left corner. The red and blue dotted lines denote the signals quantified for the 6-planet model in the Keplerian and circular solutions respectively and reported in Table 3. The blue and red lines are usually too close to be distinguished for signals at periods larger than 12 d. We truncate the period range for optimal visualization of signals. We do not show the FAPs because the highly correlated noise in the data is not accounted for by white noise FAPs.

We also calculated the GLSTs for the data subtracted by the optimal prediction of models with different numbers of Keplerian components and show these periodograms in Fig. 5. We see that all identified signals correspond to certain peaks in the periodogram for the original RVs. The subtraction of the noise components makes some signals more significant. We find that the signal identified by posterior sampling is not always the most significant one shown in the residual periodogram. Moreover, the 43 and 11 d signals are not strong in the corresponding residual periodograms, indicating the limitation of residual periodograms in identifying Keplerian signals. The residuals after subtracting the predictions of n-planet model are shown in Fig. A.1 in the appendix. The correlation between the RV residuals and noise proxies are shown in Fig. A.2.

Fig. 5

GLSTs for the original data and data subtracted by the optimal prediction of n-planet model. The value of n is shown in the upper right corner. The new signal identified by the n + 1-planet model is shown by red dotted lines. The other elements are similar to those in Fig. 4.

To show the consistency of the detected signals, we make a Bayesian periodogram for data within a moving time window. To make this periodogram, we model the RVs using a combination of sinusoidal functions and a linear trend and analytically marginalize the likelihood over the amplitudes of sinusoidal functions and the trend parameters, which are a and b in Eq. (2). Hence this new periodogram is called marginalized likelihood periodogram (MLP; Feng et al. 2017a). In addition to trend parameter a, the MLP also optimizes b, and thus extends the Bayesian generalized Lomb-Scargle periodogram (Mortier et al. 2015)³. To make a 2D periodogram, we move the time window 100 steps to cover the whole data set, and compute the MLP for each step to construct a moving periodogram. We apply this method to the RVs from which the noise component in the six-planet model prediction for the Keplerian solution has been subtracted. The moving periodograms with 1000 d and 2000 d time windows together with the data are shown in Fig. 6. Since more data is needed to cover the phase of long period signals, we recommend the 2000 d time window for testing the consistency of the 330 d and 147 d signals. We calculated the MLP for each time step and scale the logarithmic marginalized likelihood (ML) to be $RML \equiv (ML - ML) / ({ML}_{\max} - ML)$ $\hbox{${\rm RML}\equiv ({\rm ML}-\overline{\rm ML})/({\rm ML}_{\rm max}-\overline{\rm ML})$}$ , where $ML$ $\hbox{$\overline{\rm ML}$}$ and ML_max are the mean and maximum ML. The RMLs are encoded by colors to show the consistency of signals in time rather than to estimate the significance of signals.

As we can see, the 11 d signal together with its aliases, although weak, are visible over the whole time span. The 147 and 330 d signals are consistently identified in the 2000 d moving periodogram. Notably the significance of these signals may vary with time because long period signals are sensitive to data sampling and size, which typically vary with time. There are also strong powers around the harmonics of the 330 d signal at a period of about 680 d and the annual alias of the 147 d signal at a period of about 250 d. Considering that the 330 d signal is also strong in some noise proxies, further observations and analyses are required to explore the nature of this long period signal.

Fig. 6

Noise-subtracted RVs (top panels) and the moving MLP-based periodograms (bottom panels) made within the 1000 d (left panels) and 2000 d (right panels) moving window. In each moving periodogram, the bottom left panel shows the periodogram for periods ranging from 10 d to 1000 d while the bottom right panel is focused on visualization of periods below 30 d. The colors encode the RML which is truncated to optimize the visualization of signals. The periods of signals identified in the 6-planet Keplerian solution are shown by horizontal dashed lines. The logarithmic ML is truncated to optimize the visualization of signals.

The 147 d signal is strong in many observation seasons despite the slight variation of its period. The high periodogram power around 110 d is probably a combination of the aliases of 89 and 147 d since the subtraction of the 89 d signal would reduce its significance. The signals around 89 d, 43 d and 18 d are also consistently identified over the whole time span. These signals are previously found by P11 in the circular solutions. Notably the periods of ~43 and 147 d slightly deviate from the optimized values shown in the periodogram because the periodogram assumes zero eccentricity. In addition to the identified signals, the periodogram maps in Fig. 6 also show annual alias of 330 d around 174 d, and annual aliases of 89 d around 113 and 70 d. The annual aliases of 43, 18 and 11 d are also visible. Therefore we interpret all strong signals based on the moving periodograms which visualize the genuine signals as well as their aliases.

Based on the above analyses, we interpret the signals around 18, 89, 147 and 330 d as planet candidates, although investigation of the 330 d signal is necessary to fully confirm its Keplerian nature. Since there is only weak evidence for the signals at periods of 11 and 43 d, more data and further analyses are required to confirm these potential candidates. We show the phase curves for these six signals in Fig. 7. We see that the six-planet model well fits the data, supporting the existence of 6 signals despite considerable eccentricity for signals around 18, 43 and 147 d.

Fig. 7

Phase-folded data and planet model prediction. For the signal shown in each panel, the noise components and other signals are subtracted from the data. For each panel, the phase-folded times are original observation times mod the optimal period with respect to the minimum observation time.

In Table 4, we report the parameters of the noise model and the four planet candidates, which are estimated in the 6-planet Keplerian solution. In the table, we see high MAP values of linear parameters such as c₃ and d₃⁴, suggesting the existence of correlated noise as high as 1 m/s. This correlated noise would prevent any reliable detections of submeter signals if not minimized through the application of noise proxies such as activity indices and differential RVs. However, the application of noise proxies may introduce extra noise in the fitting (Feng et al. 2017b) due to an oversimplification of the complex/nonlinear dependence of RVs on proxies by linear functions. This is also part of the reason why we do not interpret the weak signals at periods of 11 and 43 d as planet candidates. On the other hand, the MA model plays an important role in removing time-correlated noise, which is probably intra-night RV variability caused by instrumental or reduction-process effects (Berdiñas et al. 2016).

Table 4

Maximum a posteriori (MAP) estimation of the parameters for the noise model (lower rows) and the five signals (upper rows) detected in the TERRA-reduced HARPS data for HD 20794.

In Table 4, the signal at a period of 147 d corresponds to a super-Earth perturbing HD 20794 with a mean semi-amplitude⁵ of about 0.5 m/s. The candidate with 89 d period has a minimum mass lower than the candidate around 90 d reported by P11 most likely because we accounted for correlated noise that might have been interpreted as signals by P11. The 11 d signal caused a radial velocity variation of about 0.35 m/s. Regardless of whether this signal is Keplerian or not, the detection of such a weak signal demonstrates the important role of noise modeling in detecting low mass planets using the radial velocity technique.

According to the results of RV challenge, the MA model in combination of Bayesian methods can identify signals with a K/N ratio as low as 5 without announcing a false positive (Dumusque et al. 2017). The K/N ratio for signal with semi-amplitude of K is defined as $K / N \equiv K / R V_{rms} \times \sqrt{N_{obs}}$ $\hbox{$K/N\equiv K/RV_{\rm rms}\times \sqrt{N_{\rm obs}}$}$ , where RV_rms is the standard deviation of RVs after removing the best-fitting trend and correlation with noise proxies, and N_obs is the number of observations. If we consider measurements in a 15 min bin as an independent observation (Mayor et al. 2003; O’Toole et al. 2008)⁶, we get 713 independent observations and RV_rms = 1.61 m/s. Then the K/N ratios are 12, 10, 10, 19, 6.7, and 9.6 for signals at periods of 18, 89, 147, 330, 11, and 43 d. Hence the submeter signals we identified for HD 20794 b, d, and e are well above the K/N = 7.5 threshold of reliable detections of RV signals. Even the smallest signal at 11 d is close to this threshold and is higher than the K/N = 5 limit reached by our team.

However, the nominal eccentricities of HD 20794 b, d, and e are different from zero, which probably lead to dynamical instability of the system according to the Lagrange stability analyses (Barnes & Greenberg 2006). But we find that the system is stable if lower eccentricity within the uncertainty interval is adopted for HD 20794 b, d and e. The high eccentricity reported in Table 4 might arise from instrumental noise (Feng et al. 2017b). Specifically, the instrumental noise may cause short-term RV variations that favor high eccentricity solutions. The high eccentricity may also be caused by the intrinsic bias in estimating non-negative eccentricity (Zakamska et al. 2011). Therefore the proposed candidates probably have orbits that are more circular than those reported in this work.

Only the candidate with a period of about 330 d is located in the habitable zone (Kopparapu et al. 2013), although the confirmation of this candidate requires further observations and analyses. Its habitability would be influenced by the frequent impacts of objects from the massive circumstellar disk (Kennedy et al. 2015). All the other candidates are located too close to the host star to allow the existence of liquid water, and thus cannot be considered habitable.

5. Discussions and conclusions

We analyze the HARPS data of HD 20794 in the Bayesian framework. We find strong dependence of the RV noise on wavelengths. This wavelength-dependent noise cannot be removed by averaging all spectral orders as previous studies did. To deal with this noise, we use differential RVs to weight the spectral orders a posteriori. We apply this method to data sets measured within different wavelength ranges, and find that differential RVs efficiently remove wavelength-dependent noise. Therefore we propose a combination of the MA model and differential RVs to remove the time and wavelength-dependent noise in RV data sets.

By modeling the RV noise correlated in time and wavelength for HD 20794, we identify three firm Keplerian candidates at periods of about 18, 89 and 147 d. The signal at a period of 330 d is probably Keplerian, although similar periods have strong powers in some proxies. There is also weak evidence for the existence of signals at periods of 43 and 11 d. While the 43 d signal does not pass the Bayes factor criterion, the inclusion of this signal in the model can reduce the eccentricity of the 89 d signal. Although the 11 d signal only increases the Bayes factor by a factor of about 20, it consistently appears in different data chunks.

Thus by applying noise modeling to the P11 and newer HARPS data, we confirm the two candidates at periods of 18 and 90 d reported by P11 and quantify them better. But we are suspicious about the existence of the reported 40 d candidate because it does not pass the signal detection criteria. We also find a new planet candidate, HD 20794 e, at a period of about 147 d, corresponding to a super-Earth. The signal at a period of about 330 d probably correspond to a Neptune located in the habitable zone of HD 20794. The estimated orbital eccentricity of HD 20794 b, d and e is larger than 0.2.

This considerable eccentricity is probably caused by instrumental noise and the fitting bias, as concluded by (Feng et al. 2017b) and Zakamska et al. (2011). Notably such eccentric solutions are also found in the HARPS data of τ Ceti (Feng et al. 2017b). This indicates a connection between high eccentricity and significant red noise in large RV data sets of bright stars that are observed with high cadence. Thus it is a viable hypothesis that the four planet candidates have eccentricities that are low enough to form a dynamically stable system.

Our detection of signals with semi-amplitudes to below 0.5 m/s demonstrates the ability of the RV method to find Earth analogs orbiting Sun-like stars. This work supports the conclusion of Dumusque et al. (2017) that the combination of Bayesian methods with the modeling of correlated noise is essential to finding small planets. In particular, the noise model framework developed by Tuomi et al. (2013b) and Feng et al. (2017b) and applied in this work plays a key role in disentangling small signals from noise.

¹

E.g. http://www.eso.org/sci/observing/teles-alloc/gto.html

²

The data are available in http://star-www.herts.ac.uk/~ffeng/HD20794_supplementary/data

³

The relevant code for MLP and GLST is available in Github: https://github.com/phillippro/agatha, based on which a web app is also developed and linked at http://www.agatha.herts.ac.uk

⁴

These parameters measure the contribution of noise RV variations and are in units of m/s because the proxies are normalized. On the other hand, the Pearson coefficients shown in Fig. A.2 measure the correlation between RVs and proxies.

⁵

The mean semi-amplitude is the semi-amplitude K multiplied by 1 − e.

⁶

The RVs were measured with high cadence and are thus highly correlated in time due to stellar oscillations and systematic errors (Teixeira et al. 2009). It is necessary to bin the data to calculate the number of independent observations.

Acknowledgments

The authors are supported by the Leverhulme Trust (RPG-2014-281) and the Science and Technology Facilities Council (ST/M001008/1).We used the ESO Science Archive Facility to collect radial velocity data sets. The authors gratefully acknowledge the HARPS-ESO team’s continuous improvement of the instrument, data and data reduction pipelines that made this work possible. Finally, the authors thank the referee, Xavier Dumusque, for his valuable comments that helped improve the manuscript significantly.

References

Anglada-Escudé, G., & Butler, R. P. 2012, ApJS, 200, 15 [NASA ADS] [CrossRef] [Google Scholar]
Anglada-Escudé, G., Amado, P. J., Barnes, J., et al. 2016, Nature, 536, 437 [NASA ADS] [CrossRef] [PubMed] [Google Scholar]
Barnes, R., & Greenberg, R. 2006, ApJ, 647, L163 [CrossRef] [Google Scholar]
Berdiñas, Z. M., Amado, P. J., Anglada-Escudé, G., Rodríguez-López, C., & Barnes, J. 2016, MNRAS, 459, 3551 [NASA ADS] [CrossRef] [Google Scholar]
Butler, R. P., Tinney, C. G., Marcy, G. W., et al. 2001, ApJ, 555, 410 [NASA ADS] [CrossRef] [Google Scholar]
Cumming, A., Marcy, G. W., & Butler, R. P. 1999, ApJ, 526, 890 [NASA ADS] [CrossRef] [Google Scholar]
Dumusque, X., Borsa, F., Damasso, M., et al. 2017, A&A, 598, A133 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Feng, F., Tuomi, M., Jones, H. R. A., Butler, R. P., & Vogt, S. 2016, MNRAS, 461, 2440 [NASA ADS] [CrossRef] [Google Scholar]
Feng, F., Tuomi, M., & Jones, H. R. A. 2017a, MNRAS, 470, 4794 [NASA ADS] [CrossRef] [Google Scholar]
Feng, F., Tuomi, M., Jones, H. R. A., et al. 2017b, AJ, submitted [Google Scholar]
Gelman, A., & Rubin, D. B. 1992, Stat. Sci., 457 [Google Scholar]
Gregory, P. C. 2011, MNRAS, 410, 94 [NASA ADS] [CrossRef] [Google Scholar]
Haario, H., Saksman, E., & Tamminen, J. 2001, Bernoulli, 223 [Google Scholar]
Kennedy, G. M., Matrà, L., Marmier, M., et al. 2015, MNRAS, 449, 3121 [NASA ADS] [CrossRef] [Google Scholar]
Kopparapu, R. K., Ramirez, R., Kasting, J. F., et al. 2013, ApJ, 765, 131 [NASA ADS] [CrossRef] [Google Scholar]
Mayor, M., Pepe, F., Queloz, D., et al. 2003, The Messenger, 114, 20 [NASA ADS] [Google Scholar]
Mortier, A., Faria, J. P., Correia, C. M., Santerne, A., & Santos, N. C. 2015, A&A, 573, A101 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
O’Toole, S. J., Tinney, C. G., & Jones, H. R. A. 2008, MNRAS, 386, 516 [NASA ADS] [CrossRef] [Google Scholar]
Pepe, F., Lovis, C., Ségransan, D., et al. 2011, A&A, 534, A58 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Ramírez, I., Allen de Prieto, C., & Lambert, D. L. 2013, ApJ, 764, 78 [NASA ADS] [CrossRef] [Google Scholar]
Steffen, J. H., & Hwang, J. A. 2015, MNRAS, 448, 1956 [NASA ADS] [CrossRef] [Google Scholar]
Teixeira, T. C., Kjeldsen, H., Bedding, T. R., et al. 2009, A&A, 494, 237 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Tuomi, M. 2012, A&A, 543, A52 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Tuomi, M., Anglada-Escudé, G., Gerlach, E., et al. 2013a, A&A, 549, A48 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Tuomi, M., Jones, H. R. A., Jenkins, J. S., et al. 2013b, A&A, 551, A79 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Zakamska, N. L., Pan, M., & Ford, E. B. 2011, MNRAS, 410, 1895 [NASA ADS] [Google Scholar]

Appendix A: Residual visualization

Fig. A.1

Signals and residuals for n-planet model. In the left panels, the original data, noise-subtracted data, and signals are shown by black points, gray points, and red curve, respectively. The residuals after subtracting signals and the noise model predictions are shown in the middle panels with the same RV range as in the left panels. The distributions of residuals are shown in the right panels. The mean (μ), standard deviation (σ), skewness (μ³), and kurtosis (μ⁴) are shown for each distribution.

Fig. A.2

Correlation between RV residuals and noise proxies. The RV residuals for a given proxy are calculated by subtracting the best-fitting six-planet model from the 1AP1 data set, where the linear parameter corresponding to the proxy is set to zero. The Pearson correlation coefficient is shown in the top right corner.

All Tables

Table 1

Prior distributions of model parameters.

In the text

Table 2

Logarithmic BIC-estimated BFs for noise models with respect to the model without any MA components or differential RVs.

In the text

Table 3

Logarithmic BIC-estimated BFs for models with various numbers of planets for Keplerian and circular solutions.

In the text

Table 4

Maximum a posteriori (MAP) estimation of the parameters for the noise model (lower rows) and the five signals (upper rows) detected in the TERRA-reduced HARPS data for HD 20794.

In the text

All Figures

	Fig. 1 Measurements of RV, S-index, BIS and FWHM by HARPS for HD 20794. The dashed line denotes JD 2 453 500 before which the data is probably affected by instrumental effects.
In the text

	Fig. 2 Schematic diagram of data reduction and modeling process.
In the text

Fig. 3

Illustration of the existence of wavelength-dependent noise and the necessity of removing it using differential RVs. We show a series of GLSTs across the page. Each row of plots is given to a different aperture data set labeled 3AP1, 3AP2, 3AP3, and 1AP1. The following columns show GLSTs for the corresponding residuals after subtracting the best-fitting models of 0P+MA(0)+0D (white noise), 0P+MA(1)+0D (moving average), and 0P+MA(0)+2D (white noise with differential RVs), 0P+MA(1)+2D (moving average with differential RVs). The logarithm BF of a model with respect to the MA(0) model are shown for the residuals of each data set after subtracting the best model prediction. In each panel, the red dotted line denotes the period with the highest power. The false alarm probability (FAP) threshold of 0.1, 0.01 and 0.001 are shown by dashed lines.

In the text

Fig. 4

GLSTs for the 1AP1 data set, various noise proxies, and the window function. The names of the times series are shown at the top left corner. The red and blue dotted lines denote the signals quantified for the 6-planet model in the Keplerian and circular solutions respectively and reported in Table 3. The blue and red lines are usually too close to be distinguished for signals at periods larger than 12 d. We truncate the period range for optimal visualization of signals. We do not show the FAPs because the highly correlated noise in the data is not accounted for by white noise FAPs.

In the text

	Fig. 5 GLSTs for the original data and data subtracted by the optimal prediction of n-planet model. The value of n is shown in the upper right corner. The new signal identified by the n + 1-planet model is shown by red dotted lines. The other elements are similar to those in Fig. 4.
In the text

Fig. 6

Noise-subtracted RVs (top panels) and the moving MLP-based periodograms (bottom panels) made within the 1000 d (left panels) and 2000 d (right panels) moving window. In each moving periodogram, the bottom left panel shows the periodogram for periods ranging from 10 d to 1000 d while the bottom right panel is focused on visualization of periods below 30 d. The colors encode the RML which is truncated to optimize the visualization of signals. The periods of signals identified in the 6-planet Keplerian solution are shown by horizontal dashed lines. The logarithmic ML is truncated to optimize the visualization of signals.

In the text

	Fig. 7 Phase-folded data and planet model prediction. For the signal shown in each panel, the noise components and other signals are subtracted from the data. For each panel, the phase-folded times are original observation times mod the optimal period with respect to the minimum observation time.
In the text

Fig. A.1

Signals and residuals for n-planet model. In the left panels, the original data, noise-subtracted data, and signals are shown by black points, gray points, and red curve, respectively. The residuals after subtracting signals and the noise model predictions are shown in the middle panels with the same RV range as in the left panels. The distributions of residuals are shown in the right panels. The mean (μ), standard deviation (σ), skewness (μ³), and kurtosis (μ⁴) are shown for each distribution.

In the text

	Fig. A.2 Correlation between RV residuals and noise proxies. The RV residuals for a given proxy are calculated by subtracting the best-fitting six-planet model from the 1AP1 data set, where the linear parameter corresponding to the proxy is set to zero. The Pearson correlation coefficient is shown in the top right corner.
In the text

Current usage metrics show cumulative count of Article Views (full-text article views including HTML views, PDF and ePub downloads, according to the available data) and Abstracts Views on Vision4Press platform.

Data correspond to usage on the plateform after 2015. The current usage metrics is available 48-96 hours after online publication and is updated daily on week days.

Initial download of the metrics may take a while.

[1] Anglada-Escudé, G., & Butler, R. P. 2012, ApJS, 200, 15 [NASA ADS] [CrossRef] [Google Scholar]

[2] Anglada-Escudé, G., Amado, P. J., Barnes, J., et al. 2016, Nature, 536, 437 [NASA ADS] [CrossRef] [PubMed] [Google Scholar]

[3] Barnes, R., & Greenberg, R. 2006, ApJ, 647, L163 [CrossRef] [Google Scholar]

[4] Berdiñas, Z. M., Amado, P. J., Anglada-Escudé, G., Rodríguez-López, C., & Barnes, J. 2016, MNRAS, 459, 3551 [NASA ADS] [CrossRef] [Google Scholar]

[5] Butler, R. P., Tinney, C. G., Marcy, G. W., et al. 2001, ApJ, 555, 410 [NASA ADS] [CrossRef] [Google Scholar]

[6] Cumming, A., Marcy, G. W., & Butler, R. P. 1999, ApJ, 526, 890 [NASA ADS] [CrossRef] [Google Scholar]

[7] Dumusque, X., Borsa, F., Damasso, M., et al. 2017, A&A, 598, A133 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[8] Feng, F., Tuomi, M., Jones, H. R. A., Butler, R. P., & Vogt, S. 2016, MNRAS, 461, 2440 [NASA ADS] [CrossRef] [Google Scholar]

[9] Feng, F., Tuomi, M., & Jones, H. R. A. 2017a, MNRAS, 470, 4794 [NASA ADS] [CrossRef] [Google Scholar]

[10] Feng, F., Tuomi, M., Jones, H. R. A., et al. 2017b, AJ, submitted [Google Scholar]

[11] Gelman, A., & Rubin, D. B. 1992, Stat. Sci., 457 [Google Scholar]

[12] Gregory, P. C. 2011, MNRAS, 410, 94 [NASA ADS] [CrossRef] [Google Scholar]

[13] Haario, H., Saksman, E., & Tamminen, J. 2001, Bernoulli, 223 [Google Scholar]

[14] Kennedy, G. M., Matrà, L., Marmier, M., et al. 2015, MNRAS, 449, 3121 [NASA ADS] [CrossRef] [Google Scholar]

[15] Kopparapu, R. K., Ramirez, R., Kasting, J. F., et al. 2013, ApJ, 765, 131 [NASA ADS] [CrossRef] [Google Scholar]

[16] Mayor, M., Pepe, F., Queloz, D., et al. 2003, The Messenger, 114, 20 [NASA ADS] [Google Scholar]

[17] Mortier, A., Faria, J. P., Correia, C. M., Santerne, A., & Santos, N. C. 2015, A&A, 573, A101 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[18] O’Toole, S. J., Tinney, C. G., & Jones, H. R. A. 2008, MNRAS, 386, 516 [NASA ADS] [CrossRef] [Google Scholar]

[19] Pepe, F., Lovis, C., Ségransan, D., et al. 2011, A&A, 534, A58 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[20] Ramírez, I., Allen de Prieto, C., & Lambert, D. L. 2013, ApJ, 764, 78 [NASA ADS] [CrossRef] [Google Scholar]

[21] Steffen, J. H., & Hwang, J. A. 2015, MNRAS, 448, 1956 [NASA ADS] [CrossRef] [Google Scholar]

[22] Teixeira, T. C., Kjeldsen, H., Bedding, T. R., et al. 2009, A&A, 494, 237 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[23] Tuomi, M. 2012, A&A, 543, A52 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[24] Tuomi, M., Anglada-Escudé, G., Gerlach, E., et al. 2013a, A&A, 549, A48 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[25] Tuomi, M., Jones, H. R. A., Jenkins, J. S., et al. 2013b, A&A, 551, A79 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[26] Zakamska, N. L., Pan, M., & Ford, E. B. 2011, MNRAS, 410, 1895 [NASA ADS] [Google Scholar]

Evidence for at least three planet candidates orbiting HD 20794⋆

1. Introduction

2. HARPS Doppler measurements for HD 20794

3. Noise model framework and Bayesian inference

3.1. Wavelength-dependent noise models

3.2. Goldilocks noise model

4. Keplerian candidates

5. Discussions and conclusions

Acknowledgments

References

Appendix A: Residual visualization

All Tables

All Figures

Evidence for at least three planet candidates orbiting HD 20794^⋆