Open Access
Issue
A&A
Volume 642, October 2020
Article Number A157
Number of page(s) 31
Section Planets and planetary systems
DOI https://doi.org/10.1051/0004-6361/202038376
Published online 15 October 2020

© N. Meunier and A.-M. Lagrange 2020

Licence Creative Commons
Open Access article, published by EDP Sciences, under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

1 Introduction

A large number of exoplanets have been detected using indirect techniques for over 20 yr. However, because these techniques are indirect, they are very sensitive to stellar variability. The radial velocity (RV) technique is particularly sensitive to activity that is due to both magnetic and dynamical processes at different temporal scales. Many studies have focussed on stellar magnetic activity (recognised early on by Saar & Donahue 1997) based on simulations of simple spot configurations (e.g. Desort et al. 2007; Boisse et al. 2012; Dumusque et al. 2012) as well as more complex patterns (e.g. Lagrange et al. 2010; Meunier et al. 2010a,b, 2019a,b; Borgniet et al. 2015; Santos et al. 2015; Dumusque 2016; Herrero et al. 2016; Dumusque et al. 2017; Meunier & Lagrange 2019a). Flows on different spatial and temporal scales also play an important role: in addition to large-scale flows such as meridional circulation (Makarov et al. 2010; Meunier & Lagrange 2020), oscillations, granulation, and supergranulation also affect RV time series.

The properties of these small-scale flows and the mitigation techniques used to remove them (mostly averaging techniques) have been studied in several works (e.g. Dumusque et al. 2011; Cegla et al. 2013, 2015, 2018, 2019; Meunier et al. 2015; Sulis et al. 2016, 2017a; Meunier & Lagrange 2019b; Chaplin et al. 2019) for the Sun and other stars. More details can be found in the review by Cegla (2019). The impact of granulation on the use of standard statistical tools has been pointed out by Sulis et al. (2017b), who proposed a new method (based on periodogram standardisation) to improve these tools, so far for a solar-type star. The RV jitter associated with granulation has also been studied for chromospherically quiet stars covering a large range in spectral types and evolutionary stages by Bastien et al. (2014).

Granulation and supergranulation are challenging because of the shape of their power spectrum, which is flat (instead of decreasing, as in the case of oscillations) at low frequencies (Harvey 1984), and because they are not related to usual activity indicators. Furthermore, in Meunier & Lagrange (2019b), hereafter referred to as Paper I, we showed that for the Sun, the effect of supergranulation was unexpectedly strong and more problematic than the granulation signal. Here, we perform a similar analysis (with the addition of more complete blind tests) for main sequence stars extending over a large range of spectral types, that is, from F6 to K4 as in our magnetic activity simulations (Meunier et al. 2019a, hereafter referred to as Paper II), where this contribution was added to the activity signal to build more realistic long-term time series. In the present paper, we aim to study granulation and supergranulation contributions to RVs for stars with various spectral types and to perform a detailed analysis of the false positive levels from different points of view (theoretical and observational) and their effect on exoplanet detection rates.

We adopt a systematic approach to study and quantify these effects for different conditions, including different spectral types, numbers of observations, and samplings. We consider exoplanet detectability using RV techniques, but also the mass characterisation which can be made using RV in transit follow-ups: when the planet has been detected and validated using transits, its radius is known (relative to the stellar radius) along with other parameters (orbital period, phase), but only the RV techniques can currently provide a mass estimate, which, in turn, allows us to estimate its density, thus giving us a hint of its composition. We focus on Earth-like planets in the habitable zone of their host star. Such a systematic approach is also very important because few stars other than the Sun (Collier Cameron et al. 2019) currently have a very large number of observations (in the 500–1000 regime or above); tests on observations are therefore limited, all the more so because these stars could host undetected planets.

The outline of the paper is as follows. In Sect. 2, we present the synthetic time series and the approaches we implemented to analyse them, as well as, in particular, how we define theoretical levels of false positives. In Sect. 3, we analyse these time series using true false positive levels (i.e. assuming a perfect knowledge of the properties of the signal) to derive detection rates and mass detection limits. In Sect. 4, we focus on the observational point of view by comparing usual false alarm probability levels with the true false positive levels and characterising the detection limits proposed in Meunier et al. (2012) for this type of signal. Then we estimate the uncertainty on the mass estimation in transit follow-ups. We implement blind tests to fully characterise the performance in terms of detectability and false positive levels when a classical tool is used to evaluate detections. Finally, we test complementary samplings in Sect. 5 and present our conclusions in Sect. 6.

2 Model and analysis

In this section, we describe the time series and how we extrapolate data from solar parameters (Meunier et al. 2015; Meunier & Lagrange 2019b) to build stellar time series. Then we present the different approaches to analyse these synthetic time series and, in particular, we discuss how we determine false positive levels.

2.1 Time series of oscillations, granulation, and supergranulation

Our reference time series are solar ones: we first provide the amplitudes we consider for the Sun and apply those to G2 stars. Then we describe our assumptions for other stars.

2.1.1 Solar amplitudes

We first define the solar values we consider in this study. The time series are derived from power spectra following Harvey (1984) for granulation and supergranulation and following the shape of the envelope of the oscillations from Kallinger et al. (2014), as in Papers I and II. This method has the advantage of allowing us to produce a large number of very long time series. We showed in Meunier et al. (2015) that the shape proposed by Harvey (1984) was well adapted, even at low frequencies: therefore, we use the parameters found in Meunier et al. (2015). The choice of a 1-h binning is similar to what we chose in Paper I and corresponds to the timescales where the RV jitter due to granulation reaches an inflexion point (Meunier et al. 2015): binning over a longer duration is not efficient enough to reduce this jitter further, so this binning time is used to filter out granulation as well as possible.

For granulation, in the majority of our study, we use an rms (root-mean-square) of 0.83 m s−1 before averaging (i.e. 0.39 m s−1 after averaging over 1 h), hereafter GRAhigh, which stands as our reference value, provided by our simulations of about 1 million granules on the disk at any given time in Meunier et al. (2015). As discussed in Paper I, such simulations were based on realistic properties of granules (derived from hydrodynamical simulations of Rieutord et al. 2002), which are known to reproduce realistic line profiles (Asplund et al. 2000). However, lower values were derived from the observation of two specific spectral lines: about 0.32 m s−1 by Elsworth et al. (1994) from the potassium line at 770 nm and 0.46 m s−1 from the sodium doublet at 589 nm by Pallé et al. (1999). More recently, the residuals on timescales shorter than one day obtained by Collier Cameron et al. (2019) on solar integrated RV time series covering the whole spectrum obtained by HARPS-N are also of the order of 0.40 m s−1 (when averaging over typically 5 min). Similar amplitudes have been obtained by Sulis et al. (2020) using MHD simulations. The difference between these estimates and the results of Meunier et al. (2015) may be due to some subtle effects in the centre-to-limb dependence which are not taken into account in Meunier et al. (2015), but also to the fact that observations were made in single lines which may not be representative of the whole spectrum. For that reason, a level twice as low as our reference level (hereafter GRAlow) will also be considered in mass characterisations and blind tests in Sects. 4.3 and 4.4. We note that Cegla et al. (2019) obtained a very low rms RV for granulation, around 0.1 m s−1, using a reconstruction based on MHD simulations of the solar surface. The reason for this discrepancy is not clear at this stage, although it may be due to the fact that strong vertical magnetic fields were used.

Concerning supergranulation, Meunier et al. (2015) provide a large range of possible values based on our current knowledge of these flows. Here, we consider two values, their median level (0.7 m s−1, hereafter SGmed), and their lower level (0.27 m s−1, hereafter SGlow), as in Meunier & Lagrange (2019b): these are in agreement with typical amplitudes obtained for a few stars by Dumusque et al. (2011). The median level is also close to the rms found by Pallé et al. (1999) for the Sun, with 0.78 m s−1 for the sodium doublet lines. Because of the longer timescales of supergranulation, the rms RV is almost the same after the 1 h averaging. The amplitude of the oscillations is derived from Davies et al. (2014), as in Paper I. The time scale is the one obtained in Meunier et al. (2015), as in Paper I, that is, 1.1 × 10⁶ s.
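As a rough illustration of how a time series can be drawn from a power spectrum that is flat at low frequencies, the sketch below sums sinusoids with random phases whose amplitudes follow a Harvey-like profile, then rescales the result to a target rms. The profile 1/(1 + (2πντ)^α) with α = 2, the frequency grid, the rescaling step, and the function names harvey_psd and synth_series are all assumptions made for this example; this is not the procedure of Meunier et al. (2015).

```python
import math
import random

def harvey_psd(nu, tau, alpha=2.0):
    """Unnormalised Harvey-like profile (assumed form), flat at low frequency."""
    return 1.0 / (1.0 + (2.0 * math.pi * nu * tau) ** alpha)

def synth_series(times, tau, target_rms, n_freq=40, seed=0):
    """Sum of random-phase sinusoids shaped by harvey_psd, rescaled to target_rms."""
    rng = random.Random(seed)
    span = max(times) - min(times)
    d_nu = 1.0 / span  # frequency resolution set by the total time span
    freqs = [(k + 1) * d_nu for k in range(n_freq)]
    amps = [math.sqrt(harvey_psd(nu, tau)) for nu in freqs]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in freqs]
    series = [
        sum(a * math.cos(2.0 * math.pi * nu * t + p)
            for a, nu, p in zip(amps, freqs, phases))
        for t in times
    ]
    mean = sum(series) / len(series)
    rms = math.sqrt(sum((v - mean) ** 2 for v in series) / len(series))
    # Hypothetical normalisation: force the sample rms to the requested value
    return [(v - mean) * target_rms / rms for v in series]

# Illustrative use: one point per night over 100 nights, with the
# supergranulation time scale and SGmed rms quoted in the text.
times_s = [night * 86400.0 for night in range(100)]
rv = synth_series(times_s, tau=1.1e6, target_rms=0.7)
```

Forcing the sample rms removes the realisation-to-realisation scatter of the rms, so a real simulation would more likely normalise the spectrum itself rather than each series.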

We mainly use five types of time series throughout the paper: high level of granulation alone (GRAhigh), supergranulation alone (SGmed, median level, and SGlow, low level), all contributions for oscillations, a high level of granulation, and median supergranulation (ALLGRAhigh,SGmed) or low supergranulation (ALLGRAhigh,SGlow). In the following, ALL always represents the superposition of oscillations, granulation, and supergranulation. The other three configurations (GRAlow alone, ALLGRAlow,SGmed, ALLGRAlow,SGlow) are mostly considered for the mass characterisation and blind tests to provide a complete view. The configuration ALLGRAhigh,SGmed was used in combination with magnetic activity in Paper II. The contribution attributed to any of these combinations is referred to as the OGS (for oscillations, granulation, supergranulation) signal in the following. The oscillations are not studied alone here because we consider 1-h averages and they are well averaged out (Chaplin et al. 2019) at such timescales: they did not prevent us from obtaining excellent detection rates when considered independently (Paper I).

2.1.2 Stellar time series

We considered seven spectral types covering the F6-K4 range, that is, F6, F9, G2, G5, G8, K1, and K4. The amplitudes of the different components were scaled from the G2 values (i.e. the solar values from the previous section) as in Paper II. We recall the scalings here in brief. Granulation parameters are scaled from G2 stars to other spectral types using results from Beeck et al. (2013). Oscillation parameters are scaled using laws from Kjeldsen & Bedding (1995), Samadi et al. (2007), Bedding & Kjeldsen (2003), Kippenhahn & Weigert (1990), and Belkacem et al. (2013)¹. Supergranulation is scaled following the granulation scaling, assuming supergranulation is strongly related to granulation properties (Rieutord et al. 2000; Roudier et al. 2016), including the time scale, which can differ by up to about 20%, so that the impact should be small.

All time series were produced for a 10-yr duration with a 30-s time step and were then binned over 1 h. We then selected one such point per night. Examples of time series (subsets over short periods) are shown in Appendix A for F6, G2, and K4 stars, as well as examples of the power functions versus frequency. In addition to this full sample of 3650 nights, we consider several other configurations with a gap of four months per year to simulate the fact that a star can usually not be observed all year long. Then Nobs nights were randomly selected out of the remaining nights over the 10-yr duration. Each realisation of this selection for a given value of Nobs corresponds to a different sampling. We use Nobs = 180, 542, 904, 1266, 1628, 1990, and 2352 nights (with the 4-month gap each year), and 3650 nights (no gap), leading to a total of eight configurations. In Paper I, we found that using a random selection or considering packs of adjacent nights did not lead to significant differences. In addition, Burt et al. (2018) tested different ways of building the sampling for magnetic activity time series and found that the random sampling was optimal (although the uniform sampling was not extremely different). Testing of additional sampling configurations is presented in Sect. 5.
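The gapped sampling described above can be sketched as follows. The placement of the four-month gap (here the last 120 days of each 365-day year) and the function name select_nights are assumptions for illustration only; the text does not specify where the gap falls within the year.

```python
import random

def select_nights(n_obs, n_years=10, gap_days=120, seed=0):
    """Pick n_obs distinct nights over n_years, skipping a yearly gap.

    Assumption: the gap covers the last gap_days days of each 365-day year.
    """
    rng = random.Random(seed)
    available = [
        year * 365 + day
        for year in range(n_years)
        for day in range(365 - gap_days)
    ]
    # Random selection without repetition among the observable nights
    return sorted(rng.sample(available, n_obs))

# One realisation of the sampling for Nobs = 542
nights = select_nights(542)
```

With a 120-day gap, 245 nights per year remain observable, that is, 2450 nights over 10 yr, which accommodates the largest gapped configuration (Nobs = 2352). Changing the seed gives a different realisation of the sampling.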

Figure 1 summarises the rms RV versus spectral type for the eight configurations of OGS time series used in this paper. We note a general decrease towards lower mass stars. When considering all components and spectral types, the RV jitter varies between 0.28 and 0.9 m s−1 typically when considering GRAhigh. The dashed lines show the levels when the granulation level is divided by 2 (GRAlow). In this case, the granulation rms varies between 0.22 and 0.1 m s−1, and when combined with the low level of supergranulation it varies between 0.2 and 0.37 m s−1. We note that even for such a large number of points, there is little dispersion in RV jitter from one realisation to the next.

2.2 Principle of the analysis

Here, we describe the planet properties considered in this paper and then discuss issues related to detectability as well as mass characterisation in transit follow-ups.

Fig. 1

Rms RV vs. spectral type for GRAhigh (orange), SGmed (red), SGlow (brown), ALLGRAhigh,SGmed (green), and ALLGRAhigh,SGlow (blue), for the best sampling (3650 points, no gaps). The dashed lines correspond to the configurations including GRAlow (same colour code). Individual values are shown as stars.


2.2.1 Planets

We focus our analysis on low-mass planets orbiting in the habitable zone of their host stars. We define the limits of the habitable zone as a function of spectral type as in Meunier & Lagrange (2019a), following Kasting et al. (1993), Jones et al. (2006), and Zaninetti (2008). We consider three typical orbital periods, corresponding to the inner side (PHZin), the middle (PHZmed), and the outer side (PHZout) of the habitable zone: the resulting orbital periods vary between 409 and 1174 days for F6 stars to 179–501 days for K4 stars. Furthermore, we consider only circular orbits, for simplicity.

Most of the computations are carried out with projected masses of 1 and 2 MEarth. For inclinations higher than 40–50°, the performance obtained with these masses is representative of this whole range of inclinations, while for lower inclinations the performance should be significantly worse than the one presented in this paper. Therefore, additional blind tests, presented in Sect. 4.4, are also performed, considering a distribution of inclinations between 0 and 90° when building the data set and with the assumption that the orbital plane is the same as the stellar equatorial plane. In the case of a transit follow-up using RV to characterise the mass, however, the projected mass can be considered to be the true mass.
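To give an order of magnitude for the planetary signal that must be recovered, the sketch below evaluates the standard RV semi-amplitude for a circular orbit, K = (2πG/P)^(1/3) m sin i / (M⋆ + m)^(2/3). The physical constants are rounded, the function name rv_semi_amplitude is hypothetical, and a solar-mass host is used as an example because the stellar masses adopted for each spectral type are not listed in this section.

```python
import math

G = 6.674e-11       # gravitational constant, m^3 kg^-1 s^-2 (rounded)
M_SUN = 1.989e30    # kg (rounded)
M_EARTH = 5.972e24  # kg (rounded)
DAY = 86400.0       # s

def rv_semi_amplitude(m_sini_earth, period_days, m_star_sun=1.0):
    """RV semi-amplitude in m/s for a planet on a circular orbit."""
    m_p = m_sini_earth * M_EARTH
    m_s = m_star_sun * M_SUN
    period = period_days * DAY
    return ((2.0 * math.pi * G / period) ** (1.0 / 3.0)
            * m_p / (m_s + m_p) ** (2.0 / 3.0))

# Example: 1 MEarth at 365.25 days around a solar-mass star
k_earth = rv_semi_amplitude(1.0, 365.25)  # roughly 0.09 m/s
```

This amplitude is several times smaller than the rms values quoted for the OGS signal in Sect. 2.1, which is why a large number of points is needed.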

2.2.2 Detectability

In subsequent sections, the analysis of the time series is made using two complementary approaches (i.e. two test statistics), which are then compared. The steps are as follows: (i) we analyse the periodograms² of the time series and compute the maximum amplitude around the considered PHZ (frequential analysis, computed in the 0.9–1.1 PHZ range); (ii) we fit the planetary signal, using as a period guess either the period of this peak with maximum amplitude or the period of interest (PHZ), depending on the case (temporal analysis). This fit is made using a χ² minimisation.
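The two test statistics can be sketched as follows. This example assumes a classical Lomb-Scargle periodogram and an equal-weight least-squares fit of a sinusoid at a fixed period; both are stand-ins chosen for illustration (the periodogram actually used and the fit with a free period are not reproduced here), and all function names are hypothetical.

```python
import math
import random

def lomb_scargle_power(times, values, period):
    """Classical Lomb-Scargle power at one period (values are mean-subtracted)."""
    mean = sum(values) / len(values)
    y = [v - mean for v in values]
    w = 2.0 * math.pi / period
    tau = math.atan2(sum(math.sin(2 * w * t) for t in times),
                     sum(math.cos(2 * w * t) for t in times)) / (2 * w)
    c = [math.cos(w * (t - tau)) for t in times]
    s = [math.sin(w * (t - tau)) for t in times]
    yc = sum(a * b for a, b in zip(y, c))
    ys = sum(a * b for a, b in zip(y, s))
    return 0.5 * (yc ** 2 / sum(a * a for a in c) + ys ** 2 / sum(a * a for a in s))

def peak_in_window(times, values, p_hz, n_grid=201):
    """Period and power of the highest peak in the 0.9-1.1 P_HZ window."""
    grid = [p_hz * (0.9 + 0.2 * k / (n_grid - 1)) for k in range(n_grid)]
    powers = [lomb_scargle_power(times, values, p) for p in grid]
    best = max(range(n_grid), key=powers.__getitem__)
    return grid[best], powers[best]

def fit_amplitude(times, values, period):
    """Least-squares sinusoid amplitude at a fixed period (equal weights assumed)."""
    mean = sum(values) / len(values)
    y = [v - mean for v in values]
    w = 2.0 * math.pi / period
    c = [math.cos(w * t) for t in times]
    s = [math.sin(w * t) for t in times]
    scc = sum(a * a for a in c)
    sss = sum(a * a for a in s)
    scs = sum(a * b for a, b in zip(c, s))
    syc = sum(a * b for a, b in zip(y, c))
    sys_ = sum(a * b for a, b in zip(y, s))
    det = scc * sss - scs * scs
    a_cos = (syc * sss - sys_ * scs) / det  # solution of the 2x2 normal equations
    b_sin = (sys_ * scc - syc * scs) / det
    return math.hypot(a_cos, b_sin)

# Toy check on a noise-free sinusoid (period 300 days, amplitude 0.5 m/s)
rng = random.Random(1)
times = sorted(rng.sample(range(3650), 200))
values = [0.5 * math.sin(2 * math.pi * t / 300.0 + 0.3) for t in times]
best_period, best_power = peak_in_window(times, values, 300.0)
amplitude = fit_amplitude(times, values, best_period)
```

The fitted amplitude would then be converted into a mass through the semi-amplitude relation for the considered star and period.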

We first consider the detectability of such exoplanets in the presence of the stellar contribution defined in the previous section. Because we consider synthetic time series, we can study them with the certainty that there is no planet present in the signal. As a consequence, we can estimate a true level of false positives (FP) for a given test statistic (frequential or temporal analysis) and for a given probability (e.g. 1%), and it is then possible to compute detection rates for a given planet (on the time series where the planet has been added), considering this level of false positives. The method we apply to determine the FP is described in Sect. 2.3. Once we have determined a true FP level corresponding to a certain percentage of false positives, and a detection rate for a given mass, we can also determine which mass corresponds to a good detection rate (e.g. 95%), which provides a detection limit. This approach is explored in Sect. 3.

From the point of view of the observer, however, the determination of the true level of false positives due to a given stellar contribution is not possible because it is not possible to know if it includes other, additional signals (of a planet for instance) and because we have only one realisation of the signal. This is why the analysis of observed time series always relies on other methods, such as the use of false alarm probability levels using bootstrap analysis, although this approach makes assumptions on the signal which may not be correct, as pointed out by Sulis et al. (2017a). In a second step, we therefore test this type of approach and compare it with the one based on the true false positive level. We also compare the detection limits based on the periodogram analysis proposed by Meunier et al. (2012), the local power analysis (LPA) method, with the true detection limits. A blind test is implemented to estimate the detection rates and false positive levels and compare them with the true ones. This approach is explored in Sect. 4.

2.2.3 Mass characterisation

The latter issue, also studied in Sect. 4, concerns the performance with regard to mass characterisations of planets detected by transit in photometric light curves. In this case, we consider that the planet presence is confirmed, meaning that the transits do not require any validation using RV observations. There is, therefore, no issue with false positives in this case and we also know its orbital period and phase with very good precision from the transit. We can then fit the RV amplitude due to the planet (temporal analysis) at this orbital period to determine the precision for the mass characterisation.

2.3 False positives from synthetic time series

Here, we describe how we estimate the false positive (FP) level at the 1% level, both in mass (temporal analysis) and power (frequential analysis). This level corresponds to the behaviour in frequency of the OGS signal alone for a given test statistic (here, the power at the period we are interested in or the fitted mass; see previous section), since it is computed from a large number of time series of the OGS signal alone. This is done with no correction of the signal (apart from the 1 h binning).

To estimate the FP from our time series, we produce 1000 realisations of the OGS signal and sampling (for a given spectral type and number of points Nobs) as described in Sect. 2.1. For each of the three orbital periods corresponding to the habitable zone (Sect. 2.2), we fit a planetary signal at this period, which provides 1000 values of the mass. The period used as a guess before minimisation is the period of the peak with maximum power in the periodogram around the period we are interested in (namely in the 0.9–1.1 PHZ range as above). The 1% false positive level fpM is defined as the mass such that 1% of the 1000 values are higher. This level is therefore estimated for each spectral type, Nobs and PHZ.
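The definition of the 1% level amounts to taking an upper quantile of the planet-free values. A minimal sketch follows, with a hypothetical function name and a toy list of distinct values standing in for the 1000 fitted masses:

```python
def false_positive_level(values, fraction=0.01):
    """Threshold such that `fraction` of the planet-free values lie above it."""
    ordered = sorted(values)
    n_above = int(round(fraction * len(ordered)))
    return ordered[len(ordered) - n_above - 1]

# Toy stand-in for 1000 fitted masses (a permutation of 0.00, 0.01, ..., 9.99)
values = [((7 * k) % 1000) / 100.0 for k in range(1000)]
fp_m = false_positive_level(values)
```

With 1000 values, the threshold is the 990th smallest value, so exactly ten values (1%) lie above it when all values are distinct. The same function applies unchanged to peak powers to obtain a level in power.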

To ascertain that a planet has been detected, we compare the fitted mass (temporal analysis) to fpM: if it is higher than fpM, we consider the planet as detected. Figure 2 shows fpM versus spectral type and Nobs, for PHZmed. The values of fpM decrease towards lower mass stars and with higher values of Nobs. There are many configurations where fpM is higher than 1 MEarth, with values as high as several MEarth for F6 stars, but below 1 MEarth for K4 stars. For a given spectral type and OGS configuration, fpM decreases as Nobs increases, but the variation is not linear: after a sharp decrease at low Nobs, the level does not change much, as shown in Fig. 2. More details about the dependence on Nobs are shown in Sect. 5.

For each of these 1000 realisations, we also compute the periodogram and the maximum peak amplitude in two ways: between 100 and 1000 days (which includes most of our PHZ values) and in ten ranges of 100 days between 0 and 1000 days, to check whether the FP depends on the period. As before, the 1% level, fpP, is computed out of each series of 1000 values. The results are shown in Fig. 3. There is a clear trend with period, and the value computed over the whole range corresponds roughly to that of the lowest periods. In the following, we consider fpP computed for the different period ranges to take this trend into account.

Fig. 2

False positive level in mass fpM vs. spectral type for different numbers of points Nobs: 180 (yellow), 542 (orange), 904 (red), 1266 (brown), 1628 (green), 1990 (blue), 2352 (pink), and 3650 (purple), and for different OGS configurations (from top to bottom). Values of fpM correspond to 1% of false positives and PHZmed. The two horizontal lines indicate the 1 MEarth (solid line) and 2 MEarth (dashed line) levels for comparison.

Fig. 3

False positive level in power fpP vs. period for highest number of points, G2 stars, and five OGS configurations (GRAhigh in orange, SGmed in red, SGlow in brown, ALLGRAhigh,SGmed in green, and ALLGRAhigh,SGlow in blue). The solid lines represent fpP computed in 100-day ranges, while the dashed horizontal lines correspond to the single value of fpP computed over 100–1000 days.


3 Simulated detection rates of Earth-mass planets in the habitable zone

In this section, we consider the synthetic time series produced in the previous section and add planets with different masses at different orbital periods to estimate the effect of the OGS signal on exoplanet detectability. We use the level of false positives (corresponding to 1% in the following) defined in Sect. 2.3, both in mass for the temporal analysis (fpM) and in power for the frequential analysis (fpP). We then compute detection rates for various masses and detection limits corresponding to these well identified detection rates. We use only GRAhigh in this section (with five OGS configurations).

3.1 Detection rates for Earth-mass planets

We consider planets with projected masses of 1 MEarth and 2 MEarth (see Sect. 2.2.1 for a discussion) on circular orbits and at three positions in the habitable zone of each spectral type as described in Sect. 2. The signal due to such planets (with a random phase) is added to each of the 1000 realisations of the OGS signals and sampling for each spectral type and Nobs. For the frequential analysis, we use the amplitude of the peak at the orbital period we are interested in. For the temporal analysis, the fit is made with an initial guess for the period corresponding to this orbital period, which leads to a certain mass. The 1000 values of the peak amplitude and of the mass are then compared to the corresponding values of fpP and fpM, respectively, at the considered period (see Sect. 2.3). The false positives are computed around the targeted periodicities, which biases them since there may be other false positives at other periods as well: this will be studied in Sect. 4. The percentage of those 1000 values above the true false positive level is the detection rate, associated with the chosen level of false positives (here 1%). An example of the distributions of the 1000 values is shown in Fig. 4 to illustrate the procedure. Values beyond the vertical lines correspond to detections. For the same configuration, the detection rate seems slightly larger for the frequential analysis compared to the temporal analysis, meaning that the frequential analysis is more robust for obtaining good detection rates.
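The detection rate defined above is simply the fraction of the planet-injected realisations whose statistic exceeds the false positive level. A minimal sketch, with a hypothetical function name and made-up toy numbers rather than values from the simulations:

```python
def detection_rate(values_with_planet, fp_level):
    """Fraction of realisations whose statistic exceeds the false positive level."""
    hits = sum(1 for v in values_with_planet if v > fp_level)
    return hits / len(values_with_planet)

# Toy example: five fitted masses (in MEarth) compared with a level of 1.0 MEarth
toy_masses = [0.5, 1.2, 1.5, 0.9, 2.0]
rate = detection_rate(toy_masses, 1.0)  # three of the five values exceed the level
```

The same comparison is applied to the peak powers, using the level in power instead of the level in mass.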

Figure 5 shows the resulting detection rates obtained with the frequential analysis depending on Nobs. For each spectral type, the curves indicate the number of points Nobs necessary to reach a 50% detection rate (solid lines) or a 95% detection rate (dashed lines). Low curves mean that it is very easy to detect planets (small values of Nobs are sufficient), while curves at the top correspond to configurations for which a detection is difficult to obtain (high values of Nobs). Higher values of Nobs are necessary for longer orbital periods, as expected (since the amplitude of the planetary signal decreases with increasing period). For 2 MEarth, the detection rates are very good for granulation and low supergranulation levels (or ALLGRAhigh,SGlow), as excellent rates can be reached with a low number of observations. Very good detection rates require a very large number of observations (a few hundreds to a few thousands depending on spectral type) when considering SGmed. Adding granulation to SGmed does not change the performance much. For 1 MEarth, the performance is not as good, and higher numbers of points are required to get good detection rates. The low level of supergranulation leads to good detection rates, but only with a high number of points, except for F stars for which even our maximum Nobs of 3650 nights does not allow us to reach detection rates of 95%. The situation is significantly worse for the median level of supergranulation, with conclusions similar to what was found in Paper I for G2 stars. We also observe a bump for K1 stars and PHZmed: this is due to the fact that for this particular configuration, the orbital period is equal to 366 days³, and given the gap introduced every year in the sampling, planets at such periods are naturally more difficult to detect. As expected, the frequential analysis is therefore quite sensitive to the temporal window.
We conclude that the performance is good for a 2 MEarth planet, while for a 1 MEarth planet good results can be achieved only with a very large number of observations, mostly due to supergranulation.

Figure 6 shows similar curves for the temporal analysis, that is, with the fitted mass used as a criterion for estimating the detection rates. The global trends are similar to the frequential analysis, with two main differences. First, all curves correspond to higher numbers of points, that is, more observations are required to obtain the same detection rate. This is due to the difference in false positive levels already noted in Sect. 2.3: the frequential analysis criterion allows us to get better detection rates. Second, there is no longer a bump for K1 stars and PHZmed with this approach, as the temporal analysis is less sensitive to the temporal window than the frequential analysis.

Fig. 4

Example of distributions of power (upper panel) and mass (lower panel) in the presence of 1 MEarth planet (dashed line) and with no planet (false positive values, solid line). The distributions are for a G2 star, 1266 points and PHZmed and GRAhigh. The vertical line indicates the position of the 1% false positive level deduced from the solid line distribution.

Fig. 5

Detection rates of 50% (solid lines) and 95% (dashed lines) based on frequential analysis vs. spectral type and Nobs, for different OGS configurations (from top to bottom) and for different orbital periods: PHZin (black), PHZmed (red), and PHZout (green).

Fig. 6

Same as Fig. 5 but based on temporal analysis.

Fig. 7

Example of detection rate vs. planet mass, for G2 stars, 1266 points, and GRAhigh, in two cases: based on frequential analysis (black curve) and on temporal analysis (red curve). The vertical solid lines indicate the corresponding 95% level, and the dashed lines the 50% level.


3.2 Detection limits

Detection rates are computed as in the previous section but for a large range of planet masses with a 0.1 MEarth step. This allows us to determine at which mass, for a given spectral type, Nobs, and OGS configuration, the detection rate is equal to 95%, for example (given a false positive level of 1%). Only 100 realisations of the signal OGS + planet are performed because such computations are time consuming. For the same reason, computations are made only for the middle of the habitable zone PHZmed. An example of detection rate versus planet mass is shown in Fig. 7 to illustrate the procedure. As already noted, there is a shift between the frequential analysis and the temporal analysis, of the order of 0.1 MEarth in this example.
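Given detection rates tabulated on a mass grid, the detection limit is the mass at which the rate reaches the chosen level. The sketch below uses linear interpolation between grid points, which is an assumption of this example (the text does not say how the crossing is located); the function name and the toy rates are hypothetical.

```python
def detection_limit(masses, rates, target=0.95):
    """Smallest mass at which the detection rate reaches `target`.

    Linear interpolation between the two bracketing grid points;
    returns None if the target is never reached on the grid.
    """
    for i, rate in enumerate(rates):
        if rate >= target:
            if i == 0:
                return masses[0]
            m0, m1 = masses[i - 1], masses[i]
            r0, r1 = rates[i - 1], rate
            return m0 + (target - r0) * (m1 - m0) / (r1 - r0)
    return None

# Toy grid with a 0.1 MEarth step and made-up detection rates
toy_masses = [0.5, 0.6, 0.7, 0.8]
toy_rates = [0.1, 0.5, 0.9, 1.0]
limit_95 = detection_limit(toy_masses, toy_rates, target=0.95)
limit_50 = detection_limit(toy_masses, toy_rates, target=0.50)
```

With only 100 realisations per mass, the rates are themselves noisy at the level of a few percent, so the interpolated limit should not be read more finely than the grid step.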

Figure 8 shows the detection limits versus spectral type for different OGS contributions and different values of Nobs (between 180 and 3650 nights covering 10 yr). At the 50% level, they are often below 1 MEarth (especially for low mass stars) if Nobs is sufficiently high: this is the case for GRAhigh, SGlow and ALLGRAhigh,SGlow. However, they are mostly above 1 MEarth for SGmed and ALLGRAhigh,SGmed, with values up to 2.5 MEarth for F6 stars. At the 95% level, only the highest values of Nobs allow us to reach 1 MEarth, and this is true for K4 stars only when considering the median level of supergranulation.

We conclude that in most configurations, the detection limits are higher than 1 MEarth. This is the case especially for the most massive stars and when a limited number of nights is available (typically a few hundreds for granulation, but a few thousands for supergranulation).

4 Observational approach

The results presented in the previous section are based on a perfect knowledge of the OGS signal. This allowed us to compute true false positive levels and to deduce detection rates corresponding to a given level (1%) of false positives: given the true false positive levels, this approach provided the best detection rates possible, with a controlled false positive level. We now consider the point of view of an observer, who is interested in a time series which may contain other contributions and for which only one realisation is available: different tools must then be used, and actual detection rates may be lower, or the resulting detection rates may correspond to a higher false positive level. We first compare the false alarm probability (FAP) obtained using a bootstrap analysis with the true false positive level. Then we compute the detection limits using the LPA method (Meunier et al. 2012) and determine which true detection rates and exclusion rates these detection limits correspond to. Finally, we characterise the mass uncertainty in transit follow-ups and we implement several blind tests to estimate the detection rates and false positives obtained when a usual FAP analysis of the data is performed.

Fig. 8

Detection limits vs. spectral type for 50% detection rate (left panels) and 95% detection rate (right panels) for frequential analysis, PHZmed and for different values of Nobs (from low Nobs to high Nobs, see Sect. 2.1): yellow (180), orange (542), red (904), brown (1266), green (1628), blue (1990), pink (2352), purple (3650). The horizontal dotted line corresponds to a 1 MEarth planet.


4.1 Classical bootstrap false alarm probability

In this section, we focus on the comparison between the FAP level and the true false positive level, fpP, with no injected planet, both at the 1% level. The effect on detection rates will be studied in the blind tests in Sect. 4.4. Only GRAhigh is used in this section. For each time series (with no planet), we compute the 1% FAP level using a bootstrap analysis. Because it is time consuming, only ten realisations of the OGS signal are considered for each spectral type and value of Nobs. The maximum of the periodogram to compute the FAP is computed over the whole periodogram, that is, between 2 and 2000 days. For each configuration (spectral type, Nobs) and a given orbital period (one of the three PHZ values), we compute the following values: the percentage of simulations with a FAP higher than the true false positive level fpP at 1% obtained in the previous sections (this is necessarily noisy since there are only ten realisations); the ratio of the FAP and FP, namely, fap/fpP (averaged over the ten realisations); the number of peaks above the FAP (averaged over the ten realisations).
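
As an illustration of the bootstrap step, the sketch below estimates a 1% FAP level by shuffling the RV values over the fixed time stamps and recording the highest periodogram peak of each shuffled series. It is a minimal example under stated assumptions, not the code used for this work: the function names, the classical (Schuster) periodogram used as a stand-in, and the shuffling variant of the bootstrap are all choices made here for illustration.

```python
import math
import random


def periodogram_max(times, rvs, periods):
    """Highest power of a classical (Schuster) periodogram over the given periods."""
    mean = sum(rvs) / len(rvs)
    resid = [v - mean for v in rvs]
    best = 0.0
    for period in periods:
        omega = 2.0 * math.pi / period
        c = sum(r * math.cos(omega * t) for t, r in zip(times, resid))
        s = sum(r * math.sin(omega * t) for t, r in zip(times, resid))
        best = max(best, (c * c + s * s) / len(times))
    return best


def bootstrap_fap_level(times, rvs, periods, n_boot=200, level=0.01, seed=0):
    """Power threshold exceeded by only a fraction `level` of shuffled series."""
    rng = random.Random(seed)
    maxima = []
    for _ in range(n_boot):
        shuffled = list(rvs)
        rng.shuffle(shuffled)  # keeps the RV distribution, destroys time coherence
        maxima.append(periodogram_max(times, shuffled, periods))
    maxima.sort()
    index = min(n_boot - 1, int(math.ceil((1.0 - level) * n_boot)) - 1)
    return maxima[index]
```

Because the shuffling removes any temporal correlation, the threshold obtained this way reflects a white noise with the same distribution of RV values, which is the assumption discussed in this section.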

The results are summarised in Fig. 9. The fap/fpP ratio and the percentage of simulations with FAP larger than the FP are strongly correlated; therefore, only the ratio is shown. Although the results show some dispersion because of the low number of realisations (a larger number of realisations performed on a few typical configurations gives similar results, however), some trends can be observed. The ratio covers a wide range, with values between 0.6 and 3 (after averaging over the ten realisations). For GRAhigh, the percentage is always 100%, and it is almost always the case for ALLGRAhigh,SGlow: the FAP is then always overestimating the false positive level, on average by a factor of two (corresponding to a factor of four on the mass). In the other configurations, there is a high proportion of simulations where the FAP is larger than the true false positive level when Nobs is small, and it tends to be the opposite for a large number of points, with a transition for Nobs in the 1000–2000 range. The limit between the two regimes occurs at higher Nobs for longer orbital periods (alternatively, for a given Nobs, the ratio is larger at longer periods). Finally, the average number of peaks over all configurations is low (0.24) but there are several peaks above the FAP in some configurations, mostly for supergranulation alone and ALLGRAhigh,SGmed, especially when Nobs is large, in agreement with the ratio.

The true false positive level corresponds to the true frequency behaviour of the OGS signal, while the FAP assumes a white noise with a similar rms RV and a similar distribution of RV values. The shape of the power spectrum of the OGS signal is such that the usual FAP computation is not always adapted: it appears to overestimate the false positive level when the number of points is low (or for GRAhigh and ALLGRAhigh,SGlow in all configurations), which should lead to an underestimated detection rate, corresponding to a conservative approach to the detection. When the number of points is high, however, for SGlow, SGmed, and ALLGRAhigh,SGmed, the FAP underestimates the false positive level, which should lead to potentially good detection rates but corresponding to much higher false positive levels in reality. These results are compatible with those of Sulis et al. (2017a), who proposed a new method (periodogram standardisation) to be able to use standard tools such as the FAP.

Finally, we note that the FAP is computed over the whole range over which we compute the periodogram (2–2000 days), while in the previous section, we consider the FP dependent on the period (see Sect. 2.3 and Fig. 3): given its shape, the FP level we are interested in when searching for planets in the habitable zone is lower than at short periods. We expect the FAP to agree better with the FP at low periods.

Fig. 9

Average ratio fap/fpP for PHZmed vs. Nobs. The average is computed over all realisations and spectral types. The colour code represents the period: inner side (black), middle (red), and outer side (green) of the habitable zone.


4.2 LPA detection limits: exclusion rates and detection rates

The LPA method proposed in Meunier et al. (2012) computes detection limits as a function of orbital period from a given RV time series, taking the power around the considered orbital period due to stellar contribution into account since stellar activity produces signal at some specific periods. This fast computing method has been used in several works (for example Lagrange et al. 2013, 2018; Borgniet et al. 2017, 2019; Lannier et al. 2017; Grandjean et al. 2020). Here, we recall the method in brief, which is also illustrated in the upper panel of Fig. 10. For a given orbital period Porb, we compute the maximum power Pmax in the periodogram in a window around Porb. The detection limit is defined as the mass which would give a peak amplitude equal to 1.3 × Pmax (Lannier et al. 2017): we exclude the presence of planets with masses above the LPA detection limit because otherwise they would have produced a larger amplitude than observed (around that period), meaning that it is an exclusion limit. There is, however, a simplification in this computation. This is because when the planetary signal is superposed on a stellar signal, depending on its phase, the amplitude of the resulting peak can vary a great deal, as shown, for example, in Paper I. This effect is not taken into account in the LPA computation, although the 1.3 factor gives a good margin. It is useful to estimate, for different OGS configurations (only GRAhigh is used here, i.e. five configurations of OGS), which exclusion rates such a definition corresponds to: the objective is that this rate is close to 100% for good exclusion performance derived from this limit and as robust as possible for all configurations.
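
The LPA limit described above can be sketched as follows. This is only an illustration under stated assumptions: the window half-width `half_width` and the calibration input `unit_mass_power` (the peak power produced by a 1 MEarth planet alone with the same sampling) are hypothetical parameters introduced here, and the conversion assumes that the peak power scales as the square of the planet mass.

```python
import math


def lpa_mass_limit(periods, powers, p_orb, half_width, unit_mass_power, factor=1.3):
    """LPA-style mass limit (in Earth masses) at orbital period p_orb.

    periods, powers : periodogram of the observed RV time series
    half_width      : half-width of the window around p_orb (hypothetical choice)
    unit_mass_power : peak power of a 1 MEarth planet alone at p_orb, same sampling
                      (hypothetical calibration input)
    """
    in_window = [pw for per, pw in zip(periods, powers)
                 if abs(per - p_orb) <= half_width]
    if not in_window:
        raise ValueError("no periodogram sample falls inside the window")
    p_max = max(in_window)
    # Assumption: peak power scales as the square of the RV amplitude,
    # and therefore as the square of the planet mass.
    return math.sqrt(factor * p_max / unit_mass_power)
```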

For that purpose, we implement the following procedure, illustrated in the lower panel of Fig. 10. For each spectral type and each Nobs value chosen among a subsample (180, 1266, 2352 points), we consider 100 (N1) realisations of the OGS signal and sampling. One of these realisations is shown in Fig. 10. For each of these N1 realisations, the LPA detection limit Mlpa is computed for orbital periods equal to PHZmed and we perform 100 (N2) realisations of the planetary signal of this mass Mlpa and period (i.e. N2 random phases), which is added to the corresponding OGS signal. The maximum peak in the periodogram, P′, computed in the same window as above, is compared to Pmax (maximum power in the periodogram around the considered period) for each of these N2 realisations (the N2 values of P′ are shown in the lower panel of Fig. 10): the percentage of realisations (out of the N2 values) where this maximum is higher than Pmax (i.e. unobserved) is the exclusion rate. In addition, the maximum peak in the periodogram can also be compared to the true false positive fpP (from Sect. 2.3), leading to a detection rate computed from the N2 realisations. For each configuration, we therefore derive 100 exclusion rates and 100 detection rates.
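
For one OGS realisation, the two rates defined in this procedure reduce to simple counts over the N2 values of P′. The sketch below shows that counting step only; the function and argument names are hypothetical, and the P′ values, Pmax, and the power corresponding to fpP are assumed to have been computed beforehand.

```python
def lpa_rates(p_prime_values, p_max, fp_power):
    """Return (exclusion_rate, detection_rate) in percent for one OGS realisation.

    p_prime_values : maximum power in the window for each of the N2 planet phases
    p_max          : maximum power in the same window for the OGS signal alone
    fp_power       : power corresponding to the true false positive level
    """
    n = len(p_prime_values)
    excluded = sum(1 for p in p_prime_values if p > p_max)
    detected = sum(1 for p in p_prime_values if p > fp_power)
    return 100.0 * excluded / n, 100.0 * detected / n
```

Repeating this for each of the N1 realisations gives the 100 exclusion rates and 100 detection rates per configuration.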

We find that the exclusion rate is quite constant for most spectral types (and slightly lower for F6 stars), with a median of 87%. When computing the LPA limits with the above threshold, there would therefore still be a 13% chance to miss a planet at the detection limit. The detection rates, on the other hand, are rather low, typically in the 20–40% range. The average detection limits are below 1 MEarth, except when the median supergranulation level is considered, in which case it is above 1 MEarth for the most massive stars. The LPA detection limit naturally depends on the spectral type, but also depends strongly on the number of points Nobs. As a summary, Fig. 11 shows the distribution of the different rates for all realisations. The exclusion rate shows a high peak at 100% (about a quarter of all simulations) and all are above 50%. The detection rates are much lower, with a high peak at 0.

Finally, for G2 stars and 1266 points, we investigated the effect of the chosen factor (1.3) to compute the LPA limit on the exclusion rates. The results are shown in Table 1. As expected, the exclusion rates are improved by a larger factor. A median exclusion rate of 99% is reached for a factor of 1.9, for which half of the cases correspond to a 100% exclusion rate: this would correspond to an LPA mass that is higher by 21% (compared to the mass obtained with the 1.3 factor). We note, however, that the minimum exclusion rate increases very slowly.
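
The 21% increase quoted above is consistent with a simple scaling argument. This assumes, for illustration only, that the peak power of the planetary signal grows as the square of its RV amplitude and that the amplitude is proportional to the planet mass at fixed period and stellar mass; here f denotes the amplitude factor applied to Pmax:

```latex
M_{\rm lpa}(f) \propto \sqrt{f\,P_{\max}}
\quad\Rightarrow\quad
\frac{M_{\rm lpa}(1.9)}{M_{\rm lpa}(1.3)} = \sqrt{\frac{1.9}{1.3}} \approx 1.21
```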

We conclude that the LPA corresponds to a good exclusion rate, although it is not 100%. The LPA masses are also lower than the detection limits computed in the previous section.

Fig. 10

Example of the periodogram for OGS alone (upper panel, GRAhigh alone) and with a planet at LPA mass (0.53 MEarth, lower panel) to illustrate the LPA computations. The red and green horizontal lines correspond to the maximum of the OGS periodogram in the window delimited by the dotted lines and multiplied by 1.3 respectively. The green solid line periodogram in the upper panel is for the planet alone. The example of periodogram with OGS + planet is for an arbitrary phase of the planet, and the horizontal orange line corresponds to the maximum power. Orange stars are for 100 realisations of the planet phase.

Fig. 11

Distribution of exclusion rates (solid line) and detection rates (dashed line) for all LPA tests, i.e. covering all spectral types, three values of Nobs, different OGS configurations, and PHZmed.


4.3 Mass characterisation for Earth-mass planets in the habitable zone

Before considering the detectability issue from the point of view of an observer, we consider the performance in terms of mass characterisation during a transit follow-up in RV. The transit provides an excellent estimate of the orbital period and of the phase of the planetary signal (the length of the transit is extremely small compared to the orbital periods considered here). The mass of the injected planet is extremely close to the true mass (orbit seen edge-on). We consider 1000 realisations of the OGS signal (eight configurations) as defined in Sect. 2.1 and samplings for each spectral type, values of Nobs and PHZ, and add a 1 MEarth or a 2 MEarth planet with an arbitrary phase to each of them. Results for additional masses are shown in Appendix B. The planetary signal is then fitted (amplitude only as the period and phase are known) and from this we deduce the planet’s mass. For each configuration, the 1000 values of the mass can then be compared to the input value.
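
When the period and phase are known from the transit, the fit has a single free parameter and the least-squares amplitude has a closed form. The sketch below illustrates this; it is not the fitting code used here, and `k_per_earth_mass` (the RV semi-amplitude produced by a 1 MEarth planet at this period around this star) is a hypothetical input that must be supplied by the user.

```python
import math


def fit_mass(times, rvs, period, phase, k_per_earth_mass):
    """Mass (in Earth masses) from an amplitude-only fit of a circular orbit.

    The model is K * sin(2*pi*t/period + phase) with period and phase fixed,
    so the least-squares amplitude is K = sum(model * rv) / sum(model**2).
    """
    model = [math.sin(2.0 * math.pi * t / period + phase) for t in times]
    amplitude = sum(m * v for m, v in zip(model, rvs)) / sum(m * m for m in model)
    return amplitude / k_per_earth_mass
```

Applying such a fit to each realisation of the OGS signal plus planet gives the distribution of fitted masses that is compared to the input value.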

For K4 stars, the mass distributions are quite narrow and are well separated between the two input masses we consider. The distributions are very good for GRAhigh and GRAlow, but when added to supergranulation (in particular, SGmed) the distributions are dominated by supergranulation. Distributions are close to Gaussian. For G5 stars, the distributions widen and for the input of 1 MEarth and SGmed (or ALLGRAhigh,SGmed), the distributions are wide enough to include no planet; hence, there are large uncertainties on the mass. Finally, for F6 stars, the distributions are much wider and the median level of supergranulation leads to a very large dispersion (much larger than the mass). Thus, they correspond to very poor mass characterisations.

The average fitted mass is always in excellent agreement with the input mass, with no significant bias. The dispersion decreases with increasing Nobs and decreasing stellar mass. For example, for G2 stars and ALLGRAhigh,SGlow, at the 3σ level, masses are between 0 and 3 MEarth (for an input of 2 MEarth) and between 0 and 3 MEarth (for an input of 1 MEarth) for 180 points. The ranges are reduced to 1.1–3 and 0.1–1.7 MEarth, respectively, for 1266 points, and to 1.2–2.6 and 0.2–1.6 MEarth for 2352 points. For K4 stars in the same conditions, the 3σ uncertainties are already very good for 180 points (0.2–1.8 and 1.2–2.8 MEarth) and fall to 0.7–1.2 and 1.8–2.2 MEarth for the higher number of points.

The uncertainties on the mass are summarised in Fig. 12. For 1 MEarth and GRAhigh, the uncertainties at the 1σ level are below 35% and, except for the most massive stars, they are around 20% or below, which are good mass estimates. With SGlow, the uncertainties are larger, but remain below a few tens of percent (40% for F6 stars with a very good sampling). They are, however, significantly higher when considering SGmed (alone or added to granulation and oscillations): they are always above 20% and can be as high as 100% for F6 stars. The low level of granulation alone provides very good uncertainties: for F6 they are below 20% for Nobs above 1266 for 1 MEarth, and for K4 they are much below 20% even for a small Nobs. Performance is still good when the low level of supergranulation is added, provided that Nobs is large (except for stars with spectral types earlier than G2, even for very high Nobs), but uncertainties are again mostly above 20% when the median level of supergranulation is added and can reach values up to 50% for F6 stars. In absolute values, the uncertainties are not very different between 1 MEarth and 2 MEarth, so that the relative uncertainty for 2 MEarth is about half that for 1 MEarth. Overall, there is a significant gain in performance between 180 (very poor in general) and 1266 points, but no significant further improvement between 1266 and 2352 points. The dependence on Nobs is discussed in detail in Sect. 5.1. For practical purposes, a representation of the values of Nobs to reach a precision of 20% on the mass is shown in Fig. 13. Values are lower or upper limits in a few cases: upper limits mean that even with 180 points, uncertainties are below 20%, so that a lower number of points is sufficient. Lower limits shown by the diamond symbols mean that even with 3650 points over 10 yr it is impossible to reach a 20% uncertainty. Apart from K4, the only OGS contributions that allow us to reach 20% with Nobs within the range that we considered are granulation alone (high or low), SGlow, and the combination of both.

The uncertainty on the mass estimation is strongly correlated with the true false positive level (in mass) computed in Sect. 2.3, as illustrated in Fig. 14. When considering all spectral types, Nobs values, orbital periods, and different OGS configurations together, the correlation between the two variables is 0.96. The correlation slightly depends on the OGS configurations, with values between 0.93 and 0.99, but remains very high. There is a tendency for high values of Nobs to lead to higher uncertainties at a given false positive level (however, they naturally correspond on average to lower false positive levels). For example, the false positive level at 2 MEarth corresponds to a 1σ uncertainty between 40% and 60%. For 1 MEarth, it is between 20% and 35%. To guarantee uncertainties below 20%, the theoretical false positive level should be below ~0.5 MEarth.

Table 1

Effect of the amplitude factor on LPA exclusion rates.

Fig. 12

Uncertainty on mass in transit follow-up vs. spectral type for two planet masses (1 MEarth on left hand-side and 2 MEarth on right hand-side), for PHZmed, and for different values of Nobs: 180 points (yellow), 1266 points (brown), and 2352 points (pink). The 1σ levels are shown as solid lines, the 2σ levels as dashed lines, and the 3σ levels as dotted lines. The black horizontal line shows the 20% level for reference. The 2-σ and 3-σ uncertainties are in some cases out of scale for clarity.

Fig. 13

Number of points necessary for 20% uncertainties on mass characterisation, for 1 MEarth (upper panel) and 2 MEarth (lower panel) and different OGS contributions: GRAhigh (orange), SGmed (red), SGlow (brown), ALLGRAhigh,SGmed (green), and ALLGRAhigh,SGlow (blue). The dashed lines correspond to the configurations including GRAlow (same colour code as in Fig. 1). Stars indicate that even with our largest number of points the uncertainties are in fact higher than 20% (lower limit for Nobs). Diamonds indicate that even with 180 points the uncertainties are in fact lower than 20% (upper limit for Nobs).

Fig. 14

Uncertainty on mass characterisation at 1σ level vs. false positive level in mass fpM for all spectral types, Nobs and different OGS configurations. The colour code corresponds to the different Nobs values: 180 (yellow), 542 (orange), 904 (red), 1266 (brown), 1628 (green), 1990 (blue), 2352 (pink), and 3650 (purple).


4.4 Blind tests

In this last section, we implement blind tests to estimate the level of false positives and the detection rates when applying the FAP criterion to the OGS time series in two cases: when a planet is injected or when there is no planet. We describe the principle of the blind tests, how the data sets are built and analysed, and, finally, our results.

4.4.1 Principle

For each OGS signal and spectral type, statistically half of the realisations of the time series remain unchanged while a planet is added to the other half. The analysis of each time series allows us to determine whether a planet is detected or not. In a second independent step, we determine the level of false positives and the detection rate for each set of simulations, by comparing the outputs with what was actually injected or not. We focus our analysis on one of the Nobs values (1266 points), which corresponds to good conditions, but still with a reasonable rate of observations for future dense monitoring.

The fitting challenge implemented in Dumusque et al. (2017), which focusses on stellar magnetic activity, defined several detectability criteria. We use similar criteria and terminology with a few modifications: (1) we decide whether there is a detection or not using a binary choice: since there is no further comparison with activity indicators, for example, there is no intermediate case; (2) false positives are counted separately for realisations with an injected planet and with no planet; (3) the identification of the planet in Dumusque et al. (2017) was based on whether the amplitude K and the period P corresponded to the injected planet, while here we use only the period as a criterion, because given the dispersion in mass a criterion on the amplitude would be quite subjective; it can be used in a second step if necessary. The different categories are summarised in Table 2. False positives and missed or wrong planets can bias statistics on exoplanets: their effect is also indicated in Table 2. We also note that the detection criterion in Dumusque et al. (2017) was not the same in all methods as it depended on the fitting method, and may be different from ours.

Table 2

Blind tests: configuration and results.

4.4.2 Building of the data sets

The first step of the procedure consists of building the data sets. For each configuration (one spectral type, and one of the eight OGS configurations), we consider 200 realisations of the OGS signal and sampling. Computations are made for 1266 points and 1 MEarth unless otherwise noted. Based on a random variable, on average, half of the realisations remain unchanged, while a planet is added to the other half. The planet has the following properties: the orbital period Porb is chosen randomly in the PHZin-PHZout range (i.e. we consider the habitable zone globally, using a uniform distribution), and the phase is chosen randomly in the [0, 2π] interval. The mass is equal to 1 MEarth (projected mass, see discussion in Sect. 2.2.1) unless noted otherwise: these blind tests serve as our reference. Figures corresponding to other values of Nobs and other masses are shown in Appendix C, along with blind tests that include a distribution of inclinations.
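
The construction of one blind-test data set can be sketched as below. This is a simplified illustration under stated assumptions: all realisations share the same time stamps (whereas the sampling is also redrawn in the simulations), and `semi_amplitude` is a hypothetical user-supplied function returning the RV semi-amplitude of the injected planet at a given period.

```python
import math
import random


def build_blind_test(ogs_series, times, phz_in, phz_out, semi_amplitude, seed=0):
    """Inject a circular-orbit planet in about half of the OGS realisations.

    Returns a list of (rv_series, truth) pairs, where truth records whether
    a planet was injected and, if so, its period and phase.
    """
    rng = random.Random(seed)
    data_sets = []
    for rvs in ogs_series:
        if rng.random() < 0.5:
            period = rng.uniform(phz_in, phz_out)    # uniform in the habitable zone
            phase = rng.uniform(0.0, 2.0 * math.pi)  # random orbital phase
            k = semi_amplitude(period)
            new_rvs = [v + k * math.sin(2.0 * math.pi * t / period + phase)
                       for t, v in zip(times, rvs)]
            truth = {"injected": True, "period": period, "phase": phase}
        else:
            new_rvs = list(rvs)
            truth = {"injected": False, "period": None, "phase": None}
        data_sets.append((new_rvs, truth))
    return data_sets
```

Keeping the truth record separate from the time series makes it possible to analyse the series blindly and to compare with the input only in a second step.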

4.4.3 Analysis of the time series and detectability criteria

For each configuration (one spectral type, and one of the eight OGS configurations), each of the 200 realisations of the time series is analysed as follows. The FAP at the 1% level is computed (using 200 realisations of the bootstrap, which we checked does not give very different results from a larger number of realisations). The periodogram of the time series is computed and the highest peak is identified (in the range 2–2000 days). If the amplitude of the peak is lower than the FAP, then we establish that there is no detection, whereas if the amplitude of the peak is higher, we consider this to be a detection. In this latter case, a sinusoid (we recall that we consider only circular orbits here) is fitted, with the period fixed to the peak period, to obtain the mass.

We note that the conditions are different from the theoretical results presented in Sect. 3. In Sect. 3, each computation was focusing on the behaviour at a given period (for example the middle of the habitable zone) and, therefore, on the power at this particular period or the mass corresponding to a fit at this period. Here, we address a different question, since we do not focus on a given period: we place ourselves at the point of view of an observer and we do not know where the planet is injected, that is, we consider the whole 2–2000-day range and not only the habitable zone. The analysis can even lead to a (wrong) detection outside of the period range where the planet is injected. This can then induce a higher rate of false positives (unless the criterion to make the detection is much higher than the true false positive level).

In a second independent step, we compare the results with the input parameters: this allows us to associate one of the categories of Table 2 to each realisation. The decision algorithm is shown in Fig. 15. To define whether a peak is attributable to the true planet or not, we use the criterion |Ppeak − Ptrue| < 0.1 Ptrue to determine if the planet is the correct one (see next section).
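
The decision step can be sketched as a small classification function. This is an illustration, not the implementation used here: the category labels are paraphrased from the text and from the caption of Fig. 17 and may differ from the wording of Table 2, and the function name and arguments are hypothetical.

```python
def classify(peak_power, peak_period, fap_level, true_period=None, tol=0.1):
    """Assign one blind-test realisation to a category.

    true_period is None when no planet was injected. A peak counts as a
    detection when its power exceeds the FAP level, and as the true planet
    when |Ppeak - Ptrue| < tol * Ptrue.
    """
    detected = peak_power > fap_level
    if true_period is None:
        return "false positive (no planet)" if detected else "no detection (correct)"
    right_period = abs(peak_period - true_period) < tol * true_period
    if detected:
        return "planet recovered" if right_period else "wrong planet"
    return "true planet rejected" if right_period else "missed planet"
```

Counting the realisations in each category then gives the percentages of good and bad recoveries.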

Fig. 15

Decision algorithm to attribute a category (according to Table 2) to each realisation of the blind test. ΔP is equal to |Ppeak − Ptrue|.

Table 3

Examples of detections and false positive rates from blind tests.

4.4.4 Results of the blind tests: planet properties and detection rates

The outputs of each blind test are mainly the properties of the fitted planet parameters when detected and the percentages corresponding to the different categories defined in Table 2. We first focus on the period since a criterion on the period obtained during the analysis must be defined to assign each realisation to one of the categories. Figure 16 shows the distribution of the difference between the periods provided by the analysis and the true periods over all realisations (i.e. all OGS configurations and all realisations with an injected planet), independently of the significance of the peak. The realisations outside the [−50 d, 50 d] range shown in the middle panel correspond mostly to peaks found at low periods, with a maximum of the distribution in the 20–30 days range, as shown in the lower panel of Fig. 16: 95% of those peaks are at periods below the true orbital period and many of them are, in effect, below the FAP. In practice, the width of the peak varies with the period, and a threshold of 10% of the period allows us to separate the peak from outliers.

Table 3 shows a few examples of percentages, for G2 and K4 stars and a subset of OGS (GRA, ALLGRAhigh,SGmed, ALLGRAhigh,SGlow). Ideally, we would like to obtain 100% on the first two lines and 0% on the other lines. The categories correspond to Table 2, some of them being regrouped. For example, the percentage of bad planet detections (i.e. the global false positive rate) corresponds to planets detected when none was injected, added to the planets detected with a wrong period. For G2 stars and granulation, the recovery rate is very good when no planet is injected but lower when there is an injected planet: most of the lost planets correspond to peaks below the FAP. The performance is much poorer for ALLGRAhigh,SGmed, with very low detection rates when a planet is injected and a high false positive level. Even for ALLGRAhigh,SGlow, the recovery rate when a planet is injected is only 35%. For K4 stars, performance is perfect for granulation and very good for ALLGRAhigh,SGlow. For ALLGRAhigh,SGmed, the detection rate is only 67%, however.

Figure 17 summarises the percentages for all configurations (1266 points, 1 MEarth). The good recovery rates are shown on the left-hand side panels. When no planet is injected (black curves), they are very good for GRAhigh and GRAlow, and above 80–90% when added to SGlow. They are strongly degraded in other configurations, for all spectral types (and more so for high mass stars). The detection rates when a planet is injected (green curves) are good for all stars for GRAlow only, and for K stars and sometimes G stars for GRAhigh, SGlow, ALLGRAhigh,SGlow, and ALLGRAlow,SGlow (the threshold depends on the configuration) but strongly degraded for all other cases. It could seem surprising that the performance is better when considering ALLGRAhigh,SGlow compared to SGlow alone (no injected planet): this is likely due to the fact that when adding the GRAhigh signal, even though the rms is increased, the power spectrum is then more similar to the GRAhigh power spectrum corresponding to good performance in the habitable zone.

The green dotted lines correspond to the detection rate obtained for the theoretical false positive level of 1% (Sect. 3), which is to be compared to the green solid line observed in the blind test. The two estimations are sometimes similar, corresponding to an FP that is very close to the FAP (Sect. 3), while in other cases, the blind test detection rates are lower than expected from the theoretical false positive level due to the difference between the FAP and the true FP. There is, therefore, a complex relationship between the theoretical results and the detection rates derived from the blind tests. We conclude that the FAP provides a detection rate which corresponds to a different false positive level from the one expected (i.e. in our case, different from 1%).

Fig. 16

Distribution of the difference between the highest peak period and true period for all OGS configurations and all realisations with injected planet in the blind test (1266 points, 1 MEarth), corresponding to a total of 5546 realisations. The middle panel is a zoom in the [−50d,50d] range. The lower panel shows the distribution of the periods found outside the [−50d,50d] range when a planet is injected (solid line), and when no planet is injected (dashed line).

Fig. 17

Good recovery percentages (left-hand side panels) and bad recovery percentages (right-hand side panels) vs. spectral type in the main blind test (1266 points, 1 MEarth). Good recoveries include no detection when no injected planet (black) and good planet recovered when injected (green). The green dotted line corresponds to the detection rate obtained in Sect. 3 with the theoretical false positive levels for the middle of the habitable zone for comparison. Bad recoveries include the false positive rate when no planet is injected (red), wrong planet detected (brown), rejection of true planet (orange), and missed planet (blue). The dashed black line is the sum of all bad recovery rates when a planet is injected (brown+orange+blue).


4.4.5 Results of the blind tests: false positives

The right-hand panels in Fig. 17 show the bad recovery rates. When a planet is injected, the bad recoveries (dashed black line) are naturally the complement of the green curve from the left panels. They represent a wide variety of situations: they are sometimes dominated by missed planets (bad period and below the FAP, in blue) or by the rejection of the true planet (in orange), that is, globally, because peaks are below the FAP, and sometimes because the highest peak is above the FAP but does not correspond to the planet (in brown, i.e. a false positive). We note that the false positive rate when a planet is injected (brown) is different from the false positive rate when no planet is injected (in red, complementary to the black curves on the left-hand side panels) for supergranulation (especially SGlow) alone, but it is similar when granulation and supergranulation are superposed.

For GRA, ALLGRAhigh,SGmed, and ALLGRAhigh,SGlow, the red and brown curves are similar, that is, the percentage of false positives is the same whether a planet (of 1 MEarth) is injected or not. However, the situation is different for SGmed and SGlow because when no planet is injected, the percentage of false positives is the same, even though they have very different rms RV. This is due to the fact that here the comparison of the power is made with the FAP. Because the shape of the power spectrum is the same between SGmed and SGlow, and because the FAP values are scaled with the rms, both power and FAP increase from SGlow to SGmed in a similar manner and the percentage of false positives is similar. In the case of ALLGRAhigh,SGlow, the signal is dominated by GRAhigh, hence a level that is similar between GRAhigh and ALLGRAhigh,SGlow, while the situation is intermediate for ALLGRAhigh,SGmed. For GRAlow, the rate of false positives is very small in all cases. However, when added to supergranulation (either SGmed or SGlow), the latter dominates, and rates are very similar to those obtained when combining with GRAhigh, only slightly lower.

The level of false positives here may be large because our analysis is too simplistic. When a peak is detected above the FAP, we should test the robustness of the detection to determine whether the peak is stable or not, for example. More sophisticated methods will have to be applied in this area in the future (see Sect. 6).

Another representation of these results is shown in Fig. 18, showing the percentage of false positives (sum of the two contributions described above) versus the detection rate (computed on the cases with an injected planet), which is similar to a ROC curve (but where each point corresponds to a spectral type). Each curve corresponds to one of the OGS configurations. Ideally, we would like points to be in the lower right corner. Points at the top have a high false positive level and points on the left correspond to poor detection rates. If we compare the global level of false positives here and the rms for each type of OGS configuration, we see that there is not a direct correspondence, because a granulation-like signal provides better performance due to its more suitable power spectrum (for a given rms). High-mass stars are to the left of each curve and lead to high rates of false positives and low detection rates except for granulation alone (for GRAlow all points are in the lower right corner), and to a lesser extent ALLGRAhigh,SGlow. We also note that the highest level of false positives is obtained for SG alone. However, when granulation is added to supergranulation, the rms increases, but the level of false positives decreases because the shape of the power spectrum is closer to the granulation shape, leading to better performance: this explains why the level of false positives is higher when SGmed and GRAlow are superimposed (dashed green curve) compared to SGmed and GRAhigh (solid green curve), that is, closer to the SG behaviour (large false positive rates) even though the rms is lower.

Fig. 18

False positive rate vs. detection rate for each OGS configuration (GRAhigh in orange, SGmed in red, SGlow in brown, ALLGRAhigh,SGmed in green, and ALLGRAhigh,SGlow in blue) in main blind tests. The dashed lines correspond to configurations including GRAlow. The orange dashed line is not visible (all points in the lower right corner).

Open with DEXTER

4.4.6 Additional configurations

Additional configurations are tested in Appendices C.1 (180 points only) and C.2 (2 MEarth). The performance for 180 points is very poor. The level of false positives is quite low, which can be explained by the results shown in Sect. 4.1: here, the FAP overestimates the true false positive level and, therefore, there are few peaks above the FAP. The detection rates are very low, however. On the other hand, the performance is much better for a 2 MEarth planet than for a 1 MEarth planet, although it is not perfect in all cases: for F and early G stars, the detection rates fall below 50% when supergranulation is high.

We also implemented a similar blind test in which 1 MEarth or 2 MEarth is the true planet mass. We assume that the orbital plane is similar to the equatorial plane and take the distribution of stellar inclination into account. We expect slightly lower detection rates than before (for cases with injected planets), which is indeed observed, as shown in Appendices C.3 and C.4. Figure 19 shows the average of the rates over all spectral types for each OGS configuration, both without taking inclination into account (previous results) and taking it into account. The detection rates are slightly lower when considering inclination (i.e. the true mass), typically by about 12–13 percentage points. The difference is mostly due to the larger number of missed planets when the mass is the projected mass only.
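The relation between true and projected mass can be illustrated with a short sketch. It assumes isotropically oriented orbits (cos i uniform between 0 and 1), which is an assumption made here for illustration; the inclination distribution actually used in the blind tests is not specified in this section, and the function names are hypothetical.

```python
import math
import random


def sample_sin_i(n, seed=0):
    """Draw sin(i) for isotropic orientations: cos(i) is uniform in [0, 1]."""
    rng = random.Random(seed)
    return [math.sqrt(1.0 - rng.random() ** 2) for _ in range(n)]


def projected_masses(true_mass, n, seed=0):
    """Projected mass m*sin(i) measured in RV for a fixed true mass."""
    return [true_mass * s for s in sample_sin_i(n, seed)]


# For a 1 MEarth true mass, the RV signal corresponds to a lower projected mass.
masses = projected_masses(1.0, 100000)
mean_projected = sum(masses) / len(masses)  # analytic expectation: pi/4 of the true mass
```

Under this assumption the projected mass is on average about 79% of the true mass, so a planet with a true mass of 1 MEarth produces a smaller RV amplitude than a planet whose projected mass is 1 MEarth, in line with the lower detection rates reported above.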

Fig. 19

Comparison of average rates for 1 MEarth (black) and 2 MEarth (green), without taking projection into account (solid lines, the mass is the apparent mass) and taking inclination into account (dashed lines, the mass is the true mass). The number associated with each OGS configuration corresponds to the order of the plots in Fig. 17 (from top to bottom, i.e. GRAhigh is number 1, SGmed is number 2, and so on). The detection rate plot corresponds to the green curves in the left panels in Fig. 17, the wrong planet rate plot to the brown curves, the rejected planet rate plot to the orange curves, and the missed planet rate plot to the blue curves in the right panels in Fig. 17.

4.4.7 Corresponding LPA limits

Finally, we compute the LPA detection limits (see Sect. 4.2 for the definition): with an injected planet with a mass of 1 MEarth, we want the LPA detection limit (Mlpa) to be higher than 1 MEarth. We compute ten values of Mlpa over the habitable zone, which are then averaged together for each spectral type. We consider the average Mlpa and the percentage of realisations where Mlpa is higher than 1 MEarth. In all cases, the average Mlpa is indeed above 1 MEarth and the percentage is above 70%, which is in agreement with expectations. When no planet is injected, on the other hand, we want Mlpa to be as low as possible. For SGmed and ALLGRAhigh,SGmed, the limits are above 1 MEarth for F6-G8 stars, so that in those cases it is not possible to exclude the presence of low-mass planets (below 1 MEarth). This is strongly related to the performance in terms of detection rates described above. For all other configurations (OGS, spectral types), the limits are always below 1 MEarth. We conclude that the LPA provides results that are consistent with the presence of the injected planet.

4.4.8 Comparison of the detection rates with the K/N criterion

In this section, we compute the K/N criterion proposed in Dumusque et al. (2017) and defined as $K/N = K_{\rm pl}\sqrt{N_{\rm obs}}/{\rm RV}_{\rm rms}$, where Kpl is the amplitude of a planetary signal in RV (for a given mass, period, and host star), Nobs is the number of observations, and RVrms is the RV jitter (see footnote 4). K/N is used by Dumusque et al. (2017) as a criterion to estimate the quality of recovery rates. Therefore, we compute this practical criterion for a 1 MEarth planet and compare it to the detection rates obtained previously for the same planet mass (cases with injected planet). The results are shown in Fig. 20. We find a very clear relationship between the two: all OGS configurations and spectral types lie along the same curve with very little dispersion, so the criterion is adequate to describe the detection rate in these conditions. Detection rates better than 50% correspond to K/N above ~7, and K/N must be above ~9 to reach detection rates better than 95%. This is very similar to the rough threshold between bad recoveries and good recoveries of ~7.5 in Dumusque et al. (2017), who focused on magnetic activity. On the other hand, there is not a one-to-one relationship between this criterion and the false positives, as the different OGS configurations correspond to different levels, as shown in the lower panel of Fig. 20.

Although the curve is well-defined for a given number of points, Nobs, and a given mass, it is in fact very dependent on the configuration. For example, for a lower number of points (see Fig. C.3 for 180 points) the curve is also well-defined, but for a similar K/N the detection rates are lower than for 1266 points. The same is observed for 2 MEarth, with the 50% level reached at lower K/N values compared to a 1 MEarth planet. Thus, the criterion is not universal. We consider the performance as a function of the number of points in Sect. 5.
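As an illustration of how such a criterion can be evaluated, the sketch below computes the RV semi-amplitude of a planet on a circular orbit and the corresponding K/N value, taking K/N as the planetary amplitude times the square root of the number of observations, divided by the RV jitter. The physical constants are standard values; the function names and the 0.4 m/s jitter are hypothetical and are not taken from this work.

```python
import math

G = 6.674e-11       # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.989e30    # solar mass, kg
M_EARTH = 5.972e24  # Earth mass, kg
DAY = 86400.0       # seconds per day


def rv_semi_amplitude(planet_mass_earth, period_days, star_mass_sun, sin_i=1.0):
    """RV semi-amplitude (m/s) induced by a planet on a circular orbit."""
    m_p = planet_mass_earth * M_EARTH
    m_s = star_mass_sun * M_SUN
    period = period_days * DAY
    factor = (2.0 * math.pi * G / period) ** (1.0 / 3.0)
    return factor * m_p * sin_i / (m_s + m_p) ** (2.0 / 3.0)


def k_over_n(k_pl, n_obs, rv_rms):
    """K/N criterion: planetary amplitude times sqrt(Nobs), divided by the RV rms."""
    return k_pl * math.sqrt(n_obs) / rv_rms


# Earth-mass planet on a 1-yr orbit around a solar-mass star (about 0.09 m/s).
k_earth = rv_semi_amplitude(1.0, 365.25, 1.0)

# Hypothetical jitter of 0.4 m/s with 1266 observations (illustrative value only).
criterion = k_over_n(k_earth, 1266, 0.4)
```

Since the criterion combines amplitude, sampling, and jitter into a single number, two configurations with the same K/N but different Nobs are not distinguished by it, which is the limitation discussed above.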

5 Effect of the sampling

In this last section, we focus on the effects of the sampling. We first summarise the dependence on Nobs of the performance obtained in Sects. 3 and 4. Then we test the effect of the sampling in a limited number of cases: regular sampling instead of random, a duration limited to 3 yr instead of 10 yr, and data binning.

5.1 Summary of the effect of the number of points

Figure 21 summarises the performance obtained in the previous sections for G2 stars, PHZmed, and ALLGRAhigh,SGlow versus the number of points. Below 500 points, curves obtained with the theoretical false positive threshold in mass, detection limits, and mass characterisation are not very different from a $1/\sqrt{N_{\rm obs}}$ dependence. However, above 500 points (and for all values for the fap/fp ratio), they decrease more slowly than the $1/\sqrt{N_{\rm obs}}$ law. It is therefore important to optimise the observing time. The uncertainty on the mass appears, for example, to be saturating at high Nobs. On the other hand, detection limits (upper right panel) vary strongly with Nobs and do not follow such a law. The same is true for the detection rates in the blind tests. However, increasing the number of points may also increase the level of false positives (when no planet is injected).
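A reference curve of this kind can be built by scaling a value measured at a reference number of points. The sketch below assumes an inverse square-root dependence on Nobs, as expected when averaging uncorrelated noise; the reference value is in arbitrary units and the function names are hypothetical.

```python
import math


def inverse_sqrt_scaling(value_ref, n_ref, n_obs):
    """Value expected at n_obs if it scales as 1/sqrt(Nobs) from the reference point."""
    return value_ref * math.sqrt(n_ref / n_obs)


def sqrt_scaling(value_ref, n_ref, n_obs):
    """Value expected at n_obs if it scales as sqrt(Nobs) from the reference point."""
    return value_ref * math.sqrt(n_obs / n_ref)


# Reference curve scaled to an arbitrary value at 180 points.
n_values = [180, 542, 904, 1266]
reference_uncertainty = 1.0  # arbitrary units, illustrative only
expected = [inverse_sqrt_scaling(reference_uncertainty, 180, n) for n in n_values]
```

A measured quantity that lies above such a curve at large Nobs decreases more slowly than white-noise averaging would predict, which is the behaviour described above for more than 500 points.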

Fig. 20

Detection rate (planet injected, upper panel) and false positive rate (lower panel) from blind test vs. K/N criterion for each OGS configuration (GRAhigh and GRAlow in orange, SGmed in red, SGlow in brown, ALLGRAhigh,SGmed and ALLGRAlow,SGmed in green, and ALLGRAhigh,SGlow and ALLGRAlow,SGlow in blue), for 1266 points, 1 MEarth.

5.2 Regular vs. random sampling

In previous sections, we considered a random sampling during the period of observations. We now consider the effect of this choice by testing the performance of a regular sampling in a few cases (G2 and K4 stars) for the blind test and over all spectral types for the mass characterisation. This test is done as in Sect. 4.4, that is, with 1266 points over 10 yr, and GRAhigh. We find that the mass uncertainties are extremely similar to what is obtained with the random sampling. The blind tests show that the detection rates when a planet is injected are also very similar, the random sampling providing slightly better detection rates. However, when no planet is injected, the regular sampling provides better false positive rates for certain OGS signals (SG alone and ALLGRAhigh,SGmed), while they are very similar for GRAhigh alone and ALLGRAhigh,SGlow. We conclude that in the future, depending on the observational constraints and type of signals, the two types of sampling must be tested to decide which one provides the best performance.
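The two sampling strategies can be sketched as follows. The sketch assumes one possible observation per night, a 365-day year, and a yearly gap of 120 nights standing in for the 4-month gap; these choices and the function names are illustrative assumptions rather than the exact procedure used in this work.

```python
import random


def observable_nights(n_years, gap_days, year_days=365):
    """Night indices that fall outside the yearly observing gap."""
    return [d for d in range(n_years * year_days)
            if d % year_days < year_days - gap_days]


def random_sampling(n_obs, nights, seed=0):
    """Pick n_obs distinct nights at random among the observable ones."""
    return sorted(random.Random(seed).sample(nights, n_obs))


def regular_sampling(n_obs, nights):
    """Pick n_obs nights as evenly spaced as possible among the observable ones."""
    step = len(nights) / n_obs
    return [nights[int((k + 0.5) * step)] for k in range(n_obs)]


# 1266 points over 10 yr with an assumed 120-night gap every year.
nights = observable_nights(10, 120)
regular = regular_sampling(1266, nights)
random_nights = random_sampling(1266, nights)
```

Setting `gap_days=0` removes the yearly gap, which is convenient for testing a configuration in which every night of the covered period is observed.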

Fig. 21

Effect of Nobs on performance studied in Sects. 2, 3, and 4 for G2 stars, PHZmed and ALLGRAhigh,SGlow. The different panels represent: fpM from Sect. 2.3; detection rates using the true false positive level in power (black line) and mass (red line) from Sect. 3.1; true detection limits in power (black line for 50% detection rate, red line for 95% detection rate) and in mass (green line for 50% detection rate, blue line for 95% detection rate) from Sect. 3.2; fap/fpP from Sect. 4.1; average LPA detection limit from Sect. 4.2; 1σ uncertainty on the mass characterisation from Sect. 4.3 (black for 1 MEarth and red for 2 MEarth); detection rate from the blind test in Sect. 4.4 with planet injected (green) and good recovery when no planet is injected (black); false positives when a planet is injected (dashed black line) and no planet is injected (red) from the same blind tests. The dotted lines correspond to what would be obtained if the variability was following a $1/\sqrt{N_{\rm obs}}$ law ($\sqrt{N_{\rm obs}}$ in the case of the detection rate), scaled to the values at 180 days.

Fig. 22

Uncertainty on mass for GRAhigh (left-hand side panels) and ALLGRAhigh,SGmed (right-hand side panels) vs. spectral type comparing 10-yr coverage (solid lines) and 3-yr coverage (dashed line) for different values of Nobs (from top to bottom), and for PHZmed.

5.3 Temporal coverage

In this work, we observed that high values of Nobs were necessary to obtain good performance, and we tested only a long duration (10 yr). In this section, we estimate the performance in a few cases if only 3 yr of data are available, for both the blind tests (detectability) and the mass characterisation. We keep the 4-month gap every year (except for the highest value of Nobs, 1095) and consider the following numbers of points with this gap: 180 (to be compared with the same number of points spread over 10 yr), 284, 384 (to be compared with a Nobs of 1266 in the previous simulations because it corresponds to the same density of points), 486, 588, and 690. We consider all spectral types. The figures are shown in Appendices B.1 and C.5. Figure 22 shows a comparison in mass uncertainty between a few 10-yr and 3-yr coverage configurations for GRAhigh and ALLGRAhigh,SGlow. For 180 points in both coverages, the performance is similar for GRAhigh, but when supergranulation is added it is worse for the 3-yr coverage than for the 10-yr coverage. When Nobs increases, the differences remain when supergranulation is added. The same behaviour is observed for 542 points over 10 yr and 588 over 3 yr (i.e. a similar number of points). It is, for example, more efficient to obtain 904 observations over 10 yr than 1095 over 3 yr in this case. We conclude that for granulation alone, the temporal coverage is not a critical choice, but longer time series provide better performance when considering supergranulation. Figure B.2 also shows the number of points necessary to reach a 20% uncertainty on the mass: in most cases, when supergranulation is included, 1095 is a lower limit, that is, it is not possible to reach such a level for a 1 MEarth planet; saturation is present only for SGmed for 2 MEarth. With granulation alone, it is possible to reach 20% for a 1 MEarth planet in most cases. The blind tests were carried out for 384 observations over the 3 yr. Compared to the 1266 points over 10 yr, the detection rates are significantly lower, although the false positive rates are not much affected. The relationship between K/N and the detection rate is also shifted compared to Sect. 4.4.

5.4 Temporal binning

We compare the performance after binning the time series using 30-day bins with the preceding results. The objective is mostly to test whether binning the signal over several days to average out supergranulation is efficient. Since we are interested in long orbital periods, such a binning should not a priori affect the planetary signal very much. The protocol is otherwise similar to the one described in Sect. 4.3 for the mass characterisation in transit follow-up and in Sect. 4.4 for the blind tests (1 MEarth, 1266 points). The figures are shown in Appendices B.2 and C.6. The mass characterisation is not improved by the binning: depending on the configuration, it is similar to the no-binning results or worse. The number of observations necessary to reach a precision of 20% on the mass is higher than without binning. The blind test shows that when no planet is injected, performance in terms of good recovery is slightly better than with no binning. However, when a planet is injected, the performance is worse. The level of false positives is very low. We conclude that such a binning does not significantly help to improve the detectability performance.
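The binning step itself can be sketched as follows. This is a minimal illustration that averages both the times and the RV values inside consecutive 30-day bins counted from the first observation; the bin origin and the function name are assumptions, not a description of the exact procedure used in this work.

```python
def bin_time_series(times, values, bin_days=30.0):
    """Average times and values inside consecutive bins of width bin_days."""
    bins = {}
    start = min(times)
    for t, v in zip(times, values):
        index = int((t - start) // bin_days)
        bins.setdefault(index, []).append((t, v))
    binned_times, binned_values = [], []
    for index in sorted(bins):  # empty bins (e.g. yearly gaps) are simply skipped
        points = bins[index]
        binned_times.append(sum(p[0] for p in points) / len(points))
        binned_values.append(sum(p[1] for p in points) / len(points))
    return binned_times, binned_values


# Toy series (hypothetical): one point per day for 90 days, constant within each bin.
times = [float(d) for d in range(90)]
values = [1.0] * 30 + [2.0] * 30 + [3.0] * 30
binned_times, binned_values = bin_time_series(times, values)
```

Binning reduces the number of points entering the periodogram or the fit by a large factor, which is one possible reason why the gain from averaging does not translate into better performance.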

6 Conclusion

In this paper, we study in detail the effect of granulation and supergranulation on Earth-mass planet mass characterisation and detectability for stars between F6 and K4, for different numbers of points. The two strong advantages of our approach are the application of a large set of time series due to these flows and a systematic analysis of their impact and performance in terms of false positives, detection rates, detection limits, and mass characterisation. This work is based on several assumptions, which we recall here: (1) the shape of the power spectrum is similar to what we found in Meunier et al. (2015), although we test different granulation and supergranulation levels (the power at long orbital period depending on the rms of the signal and the timescale, which is fixed here), and the supergranulation amplitude versus spectral type follows the granulation dependence on spectral type; (2) we do not add any other signal (magnetic activity, instrumental, photon noise, etc.) except for planets; (3) we focus on long orbital periods in the habitable zone around these stars; (4) no correction technique is applied except for the 1-h binning and the test involving a 30-day binning.

Our main conclusions, noted here and detailed below, are: (1) both granulation and supergranulation affect the detection rates and the false positive levels, but supergranulation plays the main role; (2) different tools give different results because they are based on different assumptions (mainly on the false positive definition) and should be used with caution (e.g. FAP computed from a bootstrap analysis).

Our results can be summarised as follows. The presence of granulation and supergranulation affects mass characterisation in RV when performing a follow-up of a transit detection. The uncertainties on these masses are sometimes below 20% for a 1 MEarth planet (mostly for granulation alone or for low-mass stars), but they are much larger in certain configurations (supergranulation, high-mass stars). This contribution is, therefore, important to consider when performing mass characterisations.

We estimated detection rates and detection limits corresponding to a good detection rate using theoretical levels of false positives (i.e. assuming a perfect knowledge of the signal). Except when the temporal window is not very good (for example, for periods close to 1 yr), the frequency analysis (periodogram analysis) leads to better detection rates than the temporal analysis (fit of the planetary signal). The performance is poor for a large fraction of our configurations and always requires a large number of points. Granulation alone or added to low levels of supergranulation leads to good detection rates (although a very high number of points is required for F stars), but the performance is very poor for the median level of supergranulation.

When adopting the point of view of an observer (i.e. without knowing whether any other contribution than the stellar signal is present), we found that the FAP (obtained with a standard bootstrap analysis of the observed time series) does not provide the true false positive level: apart from GRA and SGlow (for which it always overestimates the true level), it overestimates the true level for a low number of points (meaning a conservative detection) and underestimates it when the number of points is large (with the risk of false positives). Current surveys are in the regime of a low number of points (the FAP estimate is, therefore, conservative), but future observations using a large Nobs to improve the detection rates are likely to be more sensitive to an underestimation of the FAP. We also characterised the exclusion rates associated with the LPA detection limits (Meunier et al. 2012) when applied to this type of signal, showing that the threshold used in previous works corresponds to a median exclusion rate of 83% (masses should be increased by about 20% to correspond to 99%). This should be kept in mind when using them to compute occurrence rates.

Finally, we performed several blind tests corresponding to different conditions in terms of planet mass, number of points, and different sampling issues (binning, duration, etc.). As for the theoretical approach, the performance in terms of both detection rates and false positives is poor for F and G stars, whereas it is good for K stars. These rates also strongly depend on the number of points, and we find that the detection rate as a function of the K/N criterion (Dumusque et al. 2017) follows a single curve for all OGS configurations for a given number of points, but not when considering different numbers of points: the performance fortunately increases faster than $\sqrt{N_{\rm obs}}$.

An important result from the blind tests comes from the comparison between the detection rates and false positives in our various configurations: we find that for most stars, the detection rates are well below 100% and always associated with a high level of false positives. The blind tests we implemented used a simple analysis method, that is, based only on the FAP, given that we lack “activity” indicators for this type of signal, in contrast to the case of magnetic activity (see below). As a consequence, to improve this performance, future works will need to concentrate on both aspects. The present paper focusses on estimating the performance across a wide variety of configurations but without using mitigating techniques, which have yet to be developed.

Some approaches in the literature may help to decrease the number of false positives. Periodogram standardisation may help to better define the false positive level, as discussed, for example, by Sulis et al. (2016, 2017a). Stacked periodograms, as proposed by Mortier & Collier Cameron (2017), may also help for this purpose. However, it remains to be seen whether these methods allow us to increase the detection rate, that is, to recover missed planets (although the latter may help to a certain extent with regard to planet peaks that are not too far below the FAP). Improving the detection rates will, however, require the development of new methods. Gaussian processes, which may be suitable to describe this type of signal due to their flexibility, may also absorb planets at long orbital periods: this will have to be checked with similar simulations. One difficulty arises from the fact that usual activity indicators cannot be used (e.g. the chromospheric emission index). We do not expect a correlation with photometry (which is not often simultaneous with the RV, anyway) from the simulations of Meunier et al. (2015) due to the high stochasticity of the granulation signal, and it is not present for supergranulation (Meunier et al. 2007). There may be a small correlation with the bisector shape variation (Cegla et al. 2019) for granulation, but its use may be limited when it is superposed on the bisector variations due to other processes. We do not expect any such correlation for supergranulation, because it involves relatively large-scale flows (little dependence on line depth expected) that are relatively symmetric across the disk (no strong effect, as there would be, e.g., for a spot crossing the disk). However, this aspect has not yet been measured or simulated, so it remains to be checked in future studies.

Acknowledgements

We thank L. Bigot and S. Sulis for useful discussions. This work has been funded by the ANR GIPSE ANR-14-CE33-0018. This work was supported by the “Programme National de Physique Stellaire” (PNPS) of CNRS/INSU co-funded by CEA and CNES. This work was supported by the Programme National de Planétologie (PNP) of CNRS/INSU, co-funded by CNES.

Appendix A: Typical power and examples of time series

Figure A.1 shows the theoretical power functions used to produce the time series analysed in this paper for GRAhigh and SGmed. The other power functions (GRAlow and/or SGlow) only differ in amplitudes. Figures A.3 and A.4 show examples of subsets of time series (covering 8 h, 5 days, and 50 days) for the different contributions and for G2 and K4 stars, respectively.

Fig. A.1

Example of power function used in Sect. 2.1, for F6, G2, and K4 stars (from top to bottom), for SGmed (red), for GRAhigh (orange), and oscillations (yellow). The dashed black line represents the sum of these three curves.

Fig. A.2

Extract from time series vs. time for F6 stars, different OGS configurations (from top to bottom) and for three temporal coverages: 8 h (left-hand side panels), 5 days (middle panels) and 50 days (right-hand side panels), smoothed over 1 h. For the 8-h coverage, ten examples corresponding to adjacent nights are superposed (each section of the RV time series is represented by a different color). For the two other coverage sets, the black line represents the full resolution time series (smoothed over 1 h) and the red circles are the selected points used in the analysis (one point per day).

Fig. A.3

Same as Fig. A.2 for G2 stars.

Fig. A.4

Same as Fig. A.2 for K4 stars.

Appendix B: Additional mass characterisation results

Additional mass characterisation results are presented here for the purposes of comparing different conditions. We explore the effect of the sampling.

B.1 Case PHZmed, 1 and 2 MEarth planets, 3-yr coverage

Figure B.1 shows the uncertainty on the mass for a 3-yr coverage (see Sect. 5.3), and Fig. B.2 shows the number of observations necessary to reach an uncertainty on the mass of 20%. The results are discussed in Sect. 5.3.

Fig. B.1

Same as Fig. 12 but for a 3-yr coverage, and Nobs of 180, 384, and 690.

Fig. B.2

Same as Fig. 13 but for a 3-yr coverage.

B.2 Case PHZmed, 30-day binning, 1 and 2 MEarth planets

Figure B.3 shows the uncertainty on the mass for a binning over 30 days (see Sect. 5.4) and Fig. B.4 shows the number of observations necessary to reach an uncertainty on the mass of 20%. The results are discussed in Sect. 5.4.

Fig. B.3

Same as Fig. 12 but for a 30-day binning.

Fig. B.4

Same as Fig. 13 but for a 30-day binning.

Appendix C: Additional blind test results

Additional blind tests have been performed to compare different conditions. We explore the sampling, the planet mass and effects of inclination.

C.1 Case 1 MEarth, 180 points

Figures C.1, C.2, and C.3 show the results for a blind test performed with 180 points only (1 MEarth as in Sect. 4.4). They are discussed in Sects. 4.4.6 and 4.4.8.

Fig. C.1

Same as Fig. 17 but for 180 points.

Fig. C.2

Same as Fig. 18 but for 180 points.

Fig. C.3

Same as Fig. 20 but for 180 points.

C.2 Case 2 MEarth, 1266 points

Figures C.4, C.5, and C.6 show the results for a blind test performed for 2 MEarth (1266 points as in Sect. 4.4). They are discussed in Sects. 4.4.6 and 4.4.8.

Fig. C.4

Same as Fig. 17 but for a 2 MEarth planet.

Fig. C.5

Same as Fig. 18 but for a 2 MEarth planet.

Fig. C.6

Same as Fig. 20 but for a 2 MEarth planet.

C.3 Case 1 MEarth with inclination distribution, 1266 points

Figures C.7 and C.8 show the results for a blind test performed with a realistic distribution of the inclination angle. They are discussed in Sect. 4.4.6.

Fig. C.7

Same as Fig. 17 but with inclination distribution.

Fig. C.8

Same as Fig. 18 but with inclination distribution.

C.4 Case 2 MEarth with inclination distribution, 1266 points

Figures C.9 and C.10 show the results for a blind test performed with a distribution of the inclination angle and a 2 MEarth planet. They are discussed in Sect. 4.4.6.

Fig. C.9

Same as Fig. 17 but with inclination distribution and 2 MEarth planet.

Fig. C.10

Same as Fig. 18 but with inclination distribution and 2 MEarth planet.

C.5 Case 1 MEarth, 3-yr coverage

Figures C.11, C.12, and C.13 show the results for a blind test performed with 384 points, over 3 yr instead of 10 yr (1 MEarth as in Sect. 4.4). They are discussed in Sect. 5.3.

Fig. C.11

Same as Fig. 17 but for 3-yr coverage, 384 points.

Fig. C.12

Same as Fig. 18 but for 3-yr coverage, 384 points.

Fig. C.13

Same as Fig. 20 but for 3-yr coverage, 384 points.

C.6 Case 1 MEarth, 1266 points, 30-day binning

Figures C.14, C.15, and C.16 show the results for a blind test performed with a 30-day binning, for 1 MEarth (1266 points as in Sect. 4.4). They are discussed in Sect. 5.4.

Fig. C.14

Same as Fig. 16 but for a 30-day binning.

Fig. C.15

Same as Fig. 18 but for a 30-day binning.

Fig. C.16

Same as Fig. 20 but for a 30-day binning.

References

  1. Asplund, M., Nordlund, Å., Trampedach, R., Allende Prieto, C., & Stein, R. F. 2000, A&A, 359, 729
  2. Bastien, F. A., Stassun, K. G., Pepper, J., et al. 2014, AJ, 147, 29
  3. Bedding, T. R., & Kjeldsen, H. 2003, PASA, 20, 203
  4. Beeck, B., Cameron, R. H., Reiners, A., & Schüssler, M. 2013, A&A, 558, A48
  5. Belkacem, K., Samadi, R., Mosser, B., Goupil, M.-J., & Ludwig, H.-G. 2013, ASP Conf. Ser., 479, 61
  6. Boisse, I., Bonfils, X., & Santos, N. C. 2012, A&A, 545, A109
  7. Borgniet, S., Meunier, N., & Lagrange, A.-M. 2015, A&A, 581, A133
  8. Borgniet, S., Lagrange, A. M., Meunier, N., & Galland, F. 2017, A&A, 599, A57
  9. Borgniet, S., Lagrange, A. M., Meunier, N., et al. 2019, A&A, 621, A87
  10. Burt, J., Holden, B., Wolfgang, A., & Bouma, L. G. 2018, AJ, 156, 255
  11. Cegla, H. 2019, Geosciences, 9, 114
  12. Cegla, H. M., Shelyag, S., Watson, C. A., & Mathioudakis, M. 2013, ApJ, 763, 95
  13. Cegla, H. M., Watson, C. A., Shelyag, S., & Mathioudakis, M. 2015, in Cambridge Workshop on Cool Stars, Stellar Systems and the Sun, Vol. 18, eds. G. T. van Belle, & H. C. Harris, 567
  14. Cegla, H. M., Watson, C. A., Shelyag, S., et al. 2018, ApJ, 866, 55
  15. Cegla, H. M., Watson, C. A., Shelyag, S., Mathioudakis, M., & Moutari, S. 2019, ApJ, 879, 55
  16. Chaplin, W. J., Cegla, H. M., Watson, C. A., Davies, G. R., & Ball, W. H. 2019, AJ, 157, 163
  17. Collier Cameron, A., Mortier, A., Phillips, D., et al. 2019, MNRAS, 487, 1082
  18. Davies, G. R., Chaplin, W. J., Elsworth, Y., & Hale, S. J. 2014, MNRAS, 441, 3009
  19. Desort, M., Lagrange, A.-M., Galland, F., Udry, S., & Mayor, M. 2007, A&A, 473, 983
  20. Dumusque, X. 2016, A&A, 593, A5
  21. Dumusque, X., Udry, S., Lovis, C., Santos, N. C., & Monteiro, M. J. P. F. G. 2011, A&A, 525, A140
  22. Dumusque, X., Pepe, F., Lovis, C., et al. 2012, Nature, 491, 207
  23. Dumusque, X., Borsa, F., Damasso, M., et al. 2017, A&A, 598, A133
  24. Elsworth, Y., Howe, R., Isaak, G. R., et al. 1994, MNRAS, 269, 529
  25. Grandjean, A., Lagrange, A.-M., Keppler, M., et al. 2020, A&A, 633, A44
  26. Harvey, J. W. 1984, in Probing the Depths of a Star: The Study of Solar Oscillation from Space, eds. R. W. Noyes, & E. J. Rhodes Jr., JPL, 400, 327
  27. Herrero, E., Ribas, I., Jordi, C., et al. 2016, A&A, 586, A131
  28. Jones, B. W., Sleep, P. N., & Underwood, D. R. 2006, ApJ, 649, 1010
  29. Kallinger, T., De Ridder, J., Hekker, S., et al. 2014, A&A, 570, A41
  30. Kasting, J. F., Whitmire, D. P., & Reynolds, R. T. 1993, Icarus, 101, 108
  31. Kippenhahn, R., & Weigert, A. 1990, Stellar Struct. Evol., 192
  32. Kjeldsen, H., & Bedding, T. R. 1995, A&A, 293, 87
  33. Lagrange, A.-M., Desort, M., & Meunier, N. 2010, A&A, 512, A38
  34. Lagrange, A.-M., Meunier, N., Chauvin, G., et al. 2013, A&A, 559, A83
  35. Lannier, J., Lagrange, A. M., Bonavita, M., et al. 2017, A&A, 603, A54
  36. Lagrange, A. M., Keppler, M., Meunier, N., et al. 2018, A&A, 612, A108
  37. Makarov, V. V., Parker, D., & Ulrich, R. K. 2010, ApJ, 717, 1202
  38. Meunier, N., & Lagrange, A. M. 2019a, A&A, 628, A125
  39. Meunier, N., & Lagrange, A. M. 2019b, A&A, 625, L6
  40. Meunier, N., & Lagrange, A.-M. 2020, A&A, 638, A54
  41. Meunier, N., Tkaczuk, R., & Roudier, T. 2007, A&A, 463, 745
  42. Meunier, N., Desort, M., & Lagrange, A.-M. 2010a, A&A, 512, A39
  43. Meunier, N., Lagrange, A.-M., & Desort, M. 2010b, A&A, 519, A66
  44. Meunier, N., Lagrange, A.-M., & De Bondt, K. 2012, A&A, 545, A87
  45. Meunier, N., Lagrange, A.-M., Borgniet, S., & Rieutord, M. 2015, A&A, 583, A118
  46. Meunier, N., Lagrange, A. M., Boulet, T., & Borgniet, S. 2019a, A&A, 627, A56
  47. Meunier, N., Lagrange, A.-M., & Cuzacq, S. 2019b, A&A, 632, A81
  48. Mortier, A., & Collier Cameron, A. 2017, A&A, 601, A110
  49. Pallé, P. L., Roca Cortés, T., Jiménez, A., GOLF Team, & Virgo Team 1999, ASP Conf. Ser., 173, 297
  50. Rieutord, M., Roudier, T., Malherbe, J. M., & Rincon, F. 2000, A&A, 357, 1063
  51. Rieutord, M., Ludwig, H.-G., Roudier, T., Nordlund, A., & Stein, R. 2002, Nuovo Cimento C Geophys. Space Phys. C, 25, 523
  52. Roudier, T., Malherbe, J. M., Rieutord, M., & Frank, Z. 2016, A&A, 590, A121
  53. Saar, S. H., & Donahue, R. A. 1997, ApJ, 485, 319
  54. Samadi, R., Georgobiani, D., Trampedach, R., et al. 2007, A&A, 463, 297
  55. Santos, A. R. G., Cunha, M. S., Avelino, P. P., & Campante, T. L. 2015, A&A, 580, A62
  56. Sulis, S., Mary, D., & Bigot, L. 2016, in 2016 IEEE International Conference on Acoustics, Speech Signal Process. (ICASSP), 4428
  57. Sulis, S., Mary, D., & Bigot, L. 2017a, IEEE Trans. Signal Process., 65, 2136
  58. Sulis, S., Mary, D., & Bigot, L. 2017b, in Proceedings of 25th European Signal Processing Conference, 1095
  59. Sulis, S., Mary, D., & Bigot, L. 2020, A&A, 635, A146
  60. Yu, J., Huber, D., Bedding, T. R., & Stello, D. 2018, MNRAS, 480, L48
  61. Zaninetti, L. 2008, Serb. Astron. J., 177, 73
  61. Zaninetti, L. 2008, Serb. Astron. J., 177, 73 [NASA ADS] [CrossRef] [Google Scholar]

1

There are other scaling laws, for example by Yu et al. (2018), but they are not very different. Because the oscillations are strongly averaged in this work, this choice is not critical.

2

We use the Lomb-Scargle periodogram with no normalisation so that powers can be compared between different types of contributions. The periodograms are computed for periods between 2 and 2000 days.

3

This period of 366 days corresponds to the middle of the habitable zone for a K1 star. The Earth has a 1-yr period but lies closer to the inner side of the habitable zone of a G2 star.

4

In the original definition in Dumusque et al. (2017), the RV jitter is computed after correcting for a linear correlation with the chromospheric emission index and for a trend in time. This correction is not applied here because it is irrelevant to the type of signal we consider.
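
As an illustration of footnote 2, the following is a minimal sketch of a classical Lomb-Scargle periodogram that is left unnormalised (the power is not divided by the variance of the series), evaluated for periods between 2 and 2000 days. It is not the code used in this work. The function name, the 3-day sampling, the noiseless 0.1 m/s sinusoid, and the logarithmic grid of 200 periods are assumptions chosen only for this example.

```python
import math


def lomb_scargle_power(times, values, periods):
    """Return the classical Lomb-Scargle power (not divided by the variance)
    at each trial period, for evenly or unevenly sampled data."""
    mean = sum(values) / len(values)
    resid = [v - mean for v in values]
    powers = []
    for period in periods:
        omega = 2.0 * math.pi / period
        # The offset tau makes the sine and cosine terms orthogonal on the sampling.
        s2 = sum(math.sin(2.0 * omega * t) for t in times)
        c2 = sum(math.cos(2.0 * omega * t) for t in times)
        tau = math.atan2(s2, c2) / (2.0 * omega)
        yc = ys = cc = ss = 0.0
        for t, r in zip(times, resid):
            c = math.cos(omega * (t - tau))
            s = math.sin(omega * (t - tau))
            yc += r * c
            ys += r * s
            cc += c * c
            ss += s * s
        power = 0.0
        # Skip a term whose denominator vanishes (e.g. when every sample
        # falls on a zero of the sine for this trial period).
        if cc > 1e-9:
            power += 0.5 * yc * yc / cc
        if ss > 1e-9:
            power += 0.5 * ys * ys / ss
        powers.append(power)
    return powers


# Illustrative inputs (assumptions): one point every 3 days over about 10 yr,
# and a noiseless 0.1 m/s sinusoid with a 366-day period.
times = [3.0 * i for i in range(1217)]
values = [0.1 * math.sin(2.0 * math.pi * t / 366.0) for t in times]

# 200 trial periods, logarithmically spaced between 2 and 2000 days.
periods = [2.0 * 1000.0 ** (k / 199.0) for k in range(200)]
powers = lomb_scargle_power(times, values, periods)
best_period = periods[powers.index(max(powers))]
print(f"highest peak at {best_period:.1f} days")
```

Because the power is not rescaled by the variance of each series, the values obtained for different contributions (for example granulation alone and supergranulation alone) remain on a common scale and can be compared directly.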

All Tables

Table 1

Effect of the amplitude factor on LPA exclusion rates.

Table 2

Blind tests: configuration and results.

Table 3

Examples of detections and false positive rates from blind tests.

All Figures

Fig. 1

Rms RV vs. spectral type for GRAhigh (orange), SGmed (red), SGlow (brown), ALLGRAhigh,SGmed (green), and ALLGRAhigh,SGlow (blue), for the best sampling (3650 points, no gaps). The dashed lines correspond to the configurations including GRAlow (same colour code). Individual values are shown as stars.

Fig. 2

False positive level in mass fpM vs. spectral type for different numbers of points Nobs: 180 (yellow), 542 (orange), 904 (red), 1266 (brown), 1628 (green), 1990 (blue), 2352 (pink), and 3650 (purple), and for different OGS configurations (from top to bottom). Values of fpM correspond to 1% of false positives and PHZmed. The two horizontal lines indicate the 1 MEarth (solid line) and 2 MEarth (dashed line) levels for comparison.

Fig. 3

False positive level in power fpP vs. period for the highest number of points, G2 stars, and five OGS configurations (GRAhigh in orange, SGmed in red, SGlow in brown, ALLGRAhigh,SGmed in green, and ALLGRAhigh,SGlow in blue). The solid lines represent fpP computed in 100-day ranges, while the dashed horizontal lines correspond to the single value of fpP computed over 100–1000 days.

Fig. 4

Example of distributions of power (upper panel) and mass (lower panel) in the presence of 1 MEarth planet (dashed line) and with no planet (false positive values, solid line). The distributions are for a G2 star, 1266 points and PHZmed and GRAhigh. The vertical line indicates the position of the 1% false positive level deduced from the solid line distribution.

Fig. 5

Detection rates of 50% (solid lines) and 95% (dashed lines) based on frequential analysis vs. spectral type and Nobs, for different OGS configurations (from top to bottom) and for different orbital periods: PHZin (black), PHZmed (red), and PHZout (green).

Fig. 6

Same as Fig. 5 but based on temporal analysis.

Fig. 7

Example of detection rate vs. planet mass, for G2 stars, 1266 points, and GRAhigh, in two cases: based on frequential analysis (black curve) and on temporal analysis (red curve). The vertical solid lines indicate the corresponding 95% level, and the dashed lines the 50% level.

Fig. 8

Detection limits vs. spectral type for 50% detection rate (left panels) and 95% detection rate (right panels) for frequential analysis, PHZmed and for different values of Nobs (from low Nobs to high Nobs, see Sect. 2.1): yellow (180), orange (542), red (904), brown (1266), green (1628), blue (1990), pink (2352), purple (3650). The horizontal dotted line corresponds to a 1 MEarth planet.

Fig. 9

Average ratio fap/fpP for PHZmed vs. Nobs. The average is computed over all realisations and spectral types. The colour code represents the period: inner side (black), middle (red), and outer side (green) of the habitable zone.

Fig. 10

Example of the periodogram for OGS alone (upper panel, GRAhigh alone) and with a planet at LPA mass (0.53 MEarth, lower panel) to illustrate the LPA computations. The red horizontal line corresponds to the maximum of the OGS periodogram in the window delimited by the dotted lines, and the green horizontal line to this maximum multiplied by 1.3. The green solid line periodogram in the upper panel is for the planet alone. The example of periodogram with OGS + planet is for an arbitrary phase of the planet, and the horizontal orange line corresponds to the maximum power. Orange stars are for 100 realisations of the planet phase.

Fig. 11

Distribution of exclusion rates (solid line) and detection rates (dashed line) for all LPA tests, i.e. covering all spectral types, three values of Nobs, different OGS configurations, and PHZmed.

Fig. 12

Uncertainty on mass in transit follow-up vs. spectral type for two planet masses (1 MEarth on the left-hand side and 2 MEarth on the right-hand side), for PHZmed, and for different values of Nobs: 180 points (yellow), 1266 points (brown), and 2352 points (pink). The 1σ levels are shown as solid lines, the 2σ levels as dashed lines, and the 3σ levels as dotted lines. The black horizontal line shows the 20% level for reference. In some cases the 2σ and 3σ uncertainties fall outside the plotted range, which is restricted for clarity.

Fig. 13

Number of points necessary for 20% uncertainties on mass characterisation, for 1 MEarth (upper panel) and 2 MEarth (lower panel) and different OGS contributions: GRAhigh (orange), SGmed (red), SGlow (brown), ALLGRAhigh,SGmed (green), and ALLGRAhigh,SGlow (blue). The dashed lines correspond to the configurations including GRAlow (same colour code as in Fig. 1). Stars indicate that even with our largest number of points the uncertainties are in fact higher than 20% (lower limit for Nobs). Diamonds indicate that even with 180 points the uncertainties are in fact lower than 20% (upper limit for Nobs).

Fig. 14

Uncertainty on mass characterisation at 1σ level vs. false positive level in mass fpM for all spectral types, Nobs and different OGS configurations. The colour code corresponds to the different Nobs values: 180 (yellow), 542 (orange), 904 (red), 1266 (brown), 1628 (green), 1990 (blue), 2352 (pink), and 3650 (purple).

Fig. 15

Decision algorithm to attribute a category (according to Table 2) to each realisation of the blind test. ΔP is equal to |Ppeak − Ptrue|.

Fig. 16

Distribution of the difference between the highest peak period and true period for all OGS configurations and all realisations with injected planet in the blind test (1266 points, 1 MEarth), corresponding to a total of 5546 realisations. The middle panel is a zoom in the [−50d,50d] range. The lower panel shows the distribution of the periods found outside the [−50d,50d] range when a planet is injected (solid line), and when no planet is injected (dashed line).

Fig. 17

Good recovery percentages (left-hand side panels) and bad recovery percentages (right-hand side panels) vs. spectral type in the main blind test (1266 points, 1 MEarth). Good recoveries include no detection when no injected planet (black) and good planet recovered when injected (green). The green dotted line corresponds to the detection rate obtained in Sect. 3 with the theoretical false positive levels for the middle of the habitable zone for comparison. Bad recoveries include the false positive rate when no planet is injected (red), wrong planet detected (brown), rejection of true planet (orange), and missed planet (blue). The dashed black line is the sum of all bad recovery rates when a planet is injected (brown+orange+blue).

Fig. 18

False positive rate vs. detection rate for each OGS configuration (GRAhigh in orange, SGmed in red, SGlow in brown, ALLGRAhigh,SGmed in green, and ALLGRAhigh,SGlow in blue) in the main blind tests. The dashed lines correspond to configurations including GRAlow. The orange dashed line is not visible (all points in the lower right corner).

Fig. 19

Comparison of average rates for 1 MEarth (black) and 2 MEarth (green), both without taking projection into account (solid lines, the mass is the apparent mass) and taking inclination into account (dashed lines, the mass is the true mass). The number associated with each OGS configuration corresponds to the order of the plots in Fig. 17 (from top to bottom, i.e. GRAhigh is number 1, SGmed is number 2 and so on). The detection rate plot corresponds to the green curves in the left panels in Fig. 17, the wrong planet rate plot to the brown curves, the rejected planet rate plot to the orange curves, and the missed planet rate plot to the blue curves in the right panels in Fig. 17.

Fig. 20

Detection rate (planet injected, upper panel) and false positive rate (lower panel) from blind test vs. K/N criterion for each OGS configuration (GRAhigh and GRAlow in orange, SGmed in red, SGlow in brown, ALLGRAhigh,SGmed and ALLGRAlow,SGmed in green, and ALLGRAhigh,SGlow and ALLGRAlow,SGlow in blue), for 1266 points, 1 MEarth.

Fig. 21

Effect of Nobs on performance studied in Sects. 2, 3, and 4 for G2 stars, PHZmed and ALLGRAhigh,SGlow. The different panels represent: fpM from Sect. 2.3; detection rates using the true false positive level in power (black line) and mass (red line) from Sect. 3.1; true detection limits in power (black line for 50% detection rate, red line for 95% detection rate) and in mass (green line for 50% detection rate, blue line for 95% detection rate) from Sect. 3.2; fap/fpP from Sect. 4.1; average LPA detection limit from Sect. 4.2; 1σ uncertainty on the mass characterisation from Sect. 4.3 (black for 1 MEarth and red for 2 MEarth); detection rate from the blind test in Sect. 4.4 with planet injected (green) and good recovery when no planet is injected (black); false positives when a planet is injected (dashed black line) and no planet is injected (red) from the same blind tests. The dotted lines correspond to what would be obtained if the variability was following a law ( in the case of the detection rate), scaled to the values at 180 days.

Fig. 22

Uncertainty on mass for GRAhigh (left-hand side panels) and ALLGRAhigh,SGmed (right-hand side panels) vs. spectral type comparing 10-yr coverage (solid lines) and 3-yr coverage (dashed line) for different values of Nobs (from top to bottom), and for PHZmed.

Fig. A.1

Example of power function used in Sect. 2.1, for F6, G2, and K4 stars (from top to bottom), for SGmed (red), for GRAhigh (orange), and oscillations (yellow). The dashed black line represents the sum of these three curves.

Fig. A.2

Extract from time series vs. time for F6 stars, different OGS configurations (from top to bottom) and for three temporal coverages: 8 h (left-hand side panels), 5 days (middle panels) and 50 days (right-hand side panels), smoothed over 1 h. For the 8-h coverage, ten examples corresponding to adjacent nights are superposed (each section of the RV time series is represented by a different colour). For the two other coverage sets, the black line represents the full resolution time series (smoothed over 1 h) and the red circles are the selected points used in the analysis (one point per day).

Fig. B.1

Same as Fig. 12 but for a 3-yr coverage, and Nobs of 180, 384, and 690.

Fig. B.2

Same as Fig. 13 but for a 3-yr coverage.

Fig. B.3

Same as Fig. 12 but for a 30-day binning.

Fig. B.4

Same as Fig. 13 but for a 30-day binning.

Fig. C.1

Same as Fig. 17 but for 180 points.

Fig. C.2

Same as Fig. 18 but for 180 points.

Fig. C.3

Same as Fig. 20 but for 180 points.

Fig. C.4

Same as Fig. 17 but for a 2 MEarth planet.

Fig. C.5

Same as Fig. 18 but for a 2 MEarth planet.

Fig. C.6

Same as Fig. 20 but for a 2 MEarth planet.

Fig. C.7

Same as Fig. 17 but with inclination distribution.

Fig. C.8

Same as Fig. 18 but with inclination distribution.

Fig. C.9

Same as Fig. 17 but with inclination distribution and 2 MEarth planet.

Fig. C.10

Same as Fig. 18 but with inclination distribution and 2 MEarth planet.

Fig. C.11

Same as Fig. 17 but for 3-yr coverage, 384 points.

Fig. C.12

Same as Fig. 18 but for 3-yr coverage, 384 points.

Fig. C.13

Same as Fig. 20 but for 3-yr coverage, 384 points.

Fig. C.14

Same as Fig. 16 but for a 30-day binning.

Fig. C.15

Same as Fig. 18 but for a 30-day binning.

Fig. C.16

Same as Fig. 20 but for a 30-day binning.
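
The caption of Fig. 4 describes a 1% false positive level deduced from the distribution obtained with no planet, against which the distribution obtained with an injected planet is compared. A minimal sketch of that step is given below, assuming that one peak value (power or mass) is available per realisation. The function names and the synthetic Gaussian values are hypothetical stand-ins introduced for illustration; they are not results or code from this work.

```python
import random


def false_positive_level(no_planet_peaks, fp_rate=0.01):
    """Return a threshold that roughly a fraction fp_rate of the
    no-planet values exceed."""
    ordered = sorted(no_planet_peaks)
    index = min(len(ordered) - 1, int((1.0 - fp_rate) * len(ordered)))
    return ordered[index]


def detection_rate(peaks, threshold):
    """Fraction of realisations whose peak value exceeds the threshold."""
    return sum(1 for p in peaks if p > threshold) / len(peaks)


# Synthetic stand-ins for the peak values (assumptions, not results from the paper).
rng = random.Random(0)
no_planet_peaks = [abs(rng.gauss(1.0, 0.2)) for _ in range(1000)]
planet_peaks = [abs(rng.gauss(1.8, 0.3)) for _ in range(1000)]

threshold = false_positive_level(no_planet_peaks)
fp_fraction = detection_rate(no_planet_peaks, threshold)
rate = detection_rate(planet_peaks, threshold)
print(f"threshold = {threshold:.2f}, "
      f"false positives = {fp_fraction:.3f}, detections = {rate:.3f}")
```

Applying the same threshold to the planet-injected realisations gives a detection rate of the kind plotted in Figs. 5 and 6; repeating the calculation for a grid of planet masses would yield a curve analogous to the one in Fig. 7.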
