EDP Sciences
Issue
A&A
Volume 623, March 2019
Article Number A39
Number of page(s) 13
Section Numerical methods and codes
DOI https://doi.org/10.1051/0004-6361/201834672
Published online 28 February 2019

© ESO 2019

1. Introduction

Since the first discovery of an extrasolar planetary transit across the disk of a distant star (Charbonneau et al. 2000), exoplanet surveys have expanded greatly in numbers and volume. Ground-based transit searches such as HATNet (Bakos et al. 2004), WASP (Pollacco et al. 2006), KELT (Pepper et al. 2007), and CHESPA (Zhang et al. 2019), and space-based search campaigns like CoRoT (Auvergne et al. 2009), Kepler (Borucki et al. 2010), K2 (Howell et al. 2014), and TESS (Ricker et al. 2014) produced vast data sets. These encompass hundreds of thousands of stars, cadences of seconds or minutes, and data sets that span several years. The PLATO space mission, with an expected launch in 2026 and a nominal six year duty cycle, will shadow these surveys by observing up to a million relatively bright stars with cadences of 25 s or 10 min (Rauer et al. 2014). These modern exoplanet transit searches require fast, sensitive, and reliable algorithms to detect the expected but unknown transit signals.

The box least squares (BLS) algorithm (Kovács et al. 2002, 2016) has become the standard tool for exoplanet transit searches in large data sets. It approximates the transit light curve as a (negative) boxcar function with a normalized average out-of-transit flux of zero and a fixed depth during the transit. This approach is key to its computational speed and allows for reliable detections of high to medium signal-to-noise ratio (S/N) signals, such as large (Jupiter-sized) and moderately large (Neptune-sized) planets around sun-like stars in most surveys.

The BLS detection efficiency for low-S/N signals from Earth-sized planets around sun-like stars, however, is significantly smaller because the transit depths are comparable to the level of instrumental and stellar noise. Moreover, BLS introduces a systematic noise component that comes from the mathematical concept of the boxcar function. This binary model of a fixed out-of-transit and a fixed in-transit flux is equivalent to the neglect of the stellar limb darkening and of the planetary ingress and egress in the light curve. This box-shape approximation introduces an extra noise component in the test statistic that dilutes low-S/N signals. Here we present an improved transit search algorithm that attempts to minimize this systematic noise component in the search statistic.

The BLS algorithm has been analyzed and optimized in depth, for example in terms of the optimal frequency sampling, optimal phase sampling, and various other parameters (Ofir 2014). BLS has been extended to variable intervals between successive transits (Carter & Agol 2013), to work with non-Gaussian errors (Boufleur et al. 2014), to improve speed at the cost of sensitivity (Renner et al. 2008), and to refine the detected transit parameters (Collier Cameron et al. 2006; Hartman & Bakos 2016). Further adaptions were made for the application to circumbinary planets (Ofir 2008).

Alternatively, the “matched filter” algorithm is similar to BLS in modeling the transit as a boxcar, but it uses a different test statistic (Jenkins et al. 1996; Bordé et al. 2007). Phase dispersion minimization (Stellingwerf 1978) has been shown to be inferior to BLS for transit detection (Kovács et al. 2002). Analysis of variance (AoV, Schwarzenberg-Czerny & Beaulieu 2006) also uses a box-shaped transit model (which the authors refer to as top-hat), and has been demonstrated to have a lower detection efficiency than BLS in WASP data (Enoch et al. 2012). Bayesian algorithms can search for any signal form (Doyle et al. 2000; Defaÿ et al. 2001), but are not widely used. For example, the Gregory-Loredo method for Bayesian periodic signal detection uses step-functions (boxes with multiple steps) (Aigrain & Favata 2002; Aigrain & Irwin 2004). Wavelet-based algorithms (Régulo et al. 2007) are of similar detection efficiency and have been widely used for automated analyses of CoRoT (Régulo et al. 2009), Kepler (Jenkins et al. 2010), and TESS (Jenkins et al. 2016) data. Polynomials have also been suggested to approximate transit shapes more adequately than boxes (Cabrera et al. 2012) and this idea has been used (Johnson et al. 2016; Livingston et al. 2018), although without a comparison to BLS in terms of detection efficiency and computational effort.

Comparisons of different algorithms showed that BLS is the best of all known algorithms for weak signals (Tingley 2003a,b), but “no detector is clearly superior for all transit signal energies”, which has been verified by empirical tests of the methods (Moutou et al. 2005).

New techniques have now arrived with the advent of artificial intelligence. Deep learning algorithms are usually trained with a series of transit shapes (Pearson et al. 2018; Zucker & Giryes 2018; Armstrong et al. 2018). Random-Forest methods detect 7.5% more planets than classical BLS for low S/N transits (Mislis et al. 2016) because (many different) real transit shapes are used instead of a box. Disadvantages of these methods include substantial computational requirements, high implementation complexity, and a difficulty in understanding the origin of the results due to the many abstraction layers.

Here we present the transit least squares (TLS) algorithm, a new transit search algorithm that is easy to use, publicly available1, and has a detection statistic that is generally more sensitive than that of BLS. Most importantly, it is optimized to find small planets in large data sets. The algorithm assumes a realistic transit shape with ingress and egress and stellar limb darkening (as per Mandel & Agol 2002) using a predefined parameterization that we optimized based on all previous exoplanet transit detections. The resulting increase in the detection significance of the algorithm by 5–10% comes at the cost of larger computational demands. Given the tremendous growth of available CPU power in the past 60 years (Moore 1965) and in particular since the publication of the BLS algorithm in 2002, however, we argue that CPU margins are not as crucial to the detection of small planets as is the significance of the test statistic. That said, we have nevertheless optimized the algorithm for computational speed as far as possible.

Our algorithm is particularly suited for the detection of Earth-sized planets with Kepler/K2, TESS, or with the future big data sets from the PLATO mission. In fact, the improvements of our new algorithm are most substantial for small planets with few transits, which is a common characteristic of Earth-sized planets in the habitable zones around sun-like stars.

2. Methods

We illustrate the methodology of TLS using the K2 light curve of the metal-poor K3 dwarf star K2-110, which hosts a transiting massive mini-Neptune (K2-110 b, EPIC 212521166b) in a 13.86 d orbit (Osborn et al. 2017). Huber et al. (2016) estimate a stellar effective temperature of Teff = 4628 K, a surface gravity of log (g) = 4.6, a stellar mass of Ms = 0.752 M⊙, and a stellar radius of Rs = 0.7 R⊙.

In the top panel of Fig. 1 we show the K2 light curve after correction for instrumental effects with EVEREST (Luger et al. 2016) (black line) together with the running median of 51 exposures (red line). The lower panel of Fig. 1 shows the light curve after division by the running median. This is the data used throughout this section and we note that this pre-processing or detrending of the light curve is not part of TLS.
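The detrending step described above can be sketched in a few lines. This is a minimal illustration, not part of TLS: the function names `running_median` and `detrend` are hypothetical, the reflection padding at the edges is an assumption, and the toy light curve contains only slow sinusoidal variability.

```python
import numpy as np

def running_median(flux, window):
    """Running median over an odd number of points; edges are padded by reflection."""
    if window % 2 == 0:
        raise ValueError("window must be odd")
    half = window // 2
    padded = np.pad(flux, half, mode="reflect")
    # One row per data point, each row holding the `window` neighbouring fluxes.
    windows = np.lib.stride_tricks.sliding_window_view(padded, window)
    return np.median(windows, axis=1)

def detrend(flux, window=51):
    """Divide the light curve by its running median."""
    return flux / running_median(flux, window)

# Toy light curve: slow sinusoidal variability, no noise and no transits.
t = np.arange(1000)
flux = 1.0 + 0.01 * np.sin(2.0 * np.pi * t / 500.0)
flat = detrend(flux, window=51)
```

The window of 51 points matches the value quoted for K2-110. Other stars may need a different window length.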

Fig. 1.

Top panel: K2 long cadence light curve of the star K2-110, which exhibits transits of a mini-Neptune-sized planet, K2-110 b. The black line shows the light curve that has been corrected for instrumental effects with EVEREST and the red line shows our running median of 51 data points. Bottom panel: EVEREST light curve divided by the running median.


Readers interested in detrending techniques are referred to the Savitzky & Golay (1964) filter (used by Gilliland et al. 2011), the median filter (Tal-Or et al. 2013), polynomial filters (Gautier et al. 2012; Rodenbeck et al. 2018), the Cosine Filtering with Autocorrelation Minimization (CoFiAM, Kipping et al. 2013), and Gaussian processes (Aigrain et al. 2015).

2.1. Transit model

The key idea of TLS is to search for transits using a transit-like search function rather than a box. Our first task then is to identify the transit light curve that is most representative of the known exoplanet transit light curves, assuming that future exoplanet detections will be done most efficiently with this particular function. We refer to this function as the default template for TLS and describe its construction in the following. Although this decision of using a transit curve template to match previous detections might suggest that TLS will inherit a detection bias and search mostly for planets like the ones we already know, we have verified that TLS is better than BLS in finding any kind of planet, in particular grazing transiters and very small planets.

In fact, we decide to optimize the TLS template for the detection of small planets. We have verified that large planets, which produce deep transits, can also be found with this TLS template with a higher signal detection efficiency than with a box. In turn, if we had chosen to optimize the template to find large planets, then small planets would be more likely to be missed. We impose an arbitrary limit of Rp/Rs < 0.05 ∼ 5.4 R⊕/R⊙ on the planet-to-star radius ratio and retrieve the orbital inclination (i), semimajor axis in units of stellar radii (a/Rs), Rp/Rs, and the orbital period (P) for all transiting Kepler planets from the Exoplanet Orbit Database2 (Wright et al. 2011). We set the orbital eccentricity of each planet to zero and obtain the predicted limb darkening coefficients c1 and c2 of a quadratic limb darkening law for each host star, using the stellar effective temperature (Teff) and surface gravity (log (g)), from the stellar model atmospheres of Claret et al. (2012, 2013) in the Kepler bandpass.

In the left panel of Fig. 2 we plot the 2346 resulting model transit light curves, normalized to the transit depth and transit duration, using the batman implementation (Kreidberg 2015a,b) of the Mandel & Agol (2002) analytic transit model with quadratic limb darkening (black lines, one for each planet). We then construct the TLS default template transit curve from the median values of the above-mentioned input parameters to the analytic transit model for quadratic limb darkening (red dashed line). We emphasize that the TLS template is not a fit to the observed normalized transit light curves in Fig. 2 but a model light curve based on the median input parameters of all known transiting exoplanets. In fact, the user of TLS is free to choose their own parameterization of a template transit light curve for their search. We also show our template for a grazing transiting planet (black dotted line) and the best-fitting box.

Fig. 2.

Left panel: transit light curves for all 2346 transiting Kepler planets from the Exoplanet Orbit Database (as of 1 November 2018) with Rp/Rs < 0.05 ∼ 5.4 R⊕/R⊙ (black lines, one per planet). The default template for TLS is shown with a red dashed line and the optional grazing planet template is shown with a black dotted line. The best-fitting box is shown as a light blue solid line for comparison. Right panel, gray histogram: reduced χ2 residuals between the TLS default (median) transit template and the real transit light curves in the left panel. Open histogram: reduced χ2 residuals between the box and the real transit light curves. The separation between the two histograms confirms that the TLS default transit template is a substantially better match to the observations than a box in ≳99% of the cases.


In the right panel of Fig. 2 we show the reduced χ2 residuals between the TLS template and the real transit light curves (gray histogram) and the reduced χ2 residuals between the BLS box and the real transit light curves (open histogram). The two histograms show a substantial offset, with the TLS template resulting in much smaller χ2 residuals than the box used for the BLS algorithm. We note that the single outlier of the TLS distribution belongs to KOI-7880, which is a grazing transiting planet.

2.2. Transit search statistic

The TLS algorithm searches for periodic transit-shaped signals in time series of flux measurements. The algorithm operates by phase-folding the data over a range of trial periods (P), transit epochs (t0), and transit durations (d). It then calculates the χ2 statistic of the phase-folded light curve between the N data points of the respective transit model ($y_i^m$) and the observed values ($y_i^o$) as per

$$\chi^2(P, t_0, d) = \sum_{i=1}^{N} \frac{\left(y_i^m(P, t_0, d) - y_i^o\right)^2}{\sigma_i^2} \tag{1}$$

where $\sigma_i$ are the standard deviations in the light curve. In Fig. 3a, we show the spectrum of minimum χ2 as a function of P. In other words, for each trial period TLS searches the minimum χ2 by testing all combinations of (t0, d), that is, the 2D subspace of the 3D parameter grid. TLS uses the global χ2 minimum

$$\chi^2_{\rm min,glob} = \min_{P,\,t_0,\,d} \chi^2(P, t_0, d) \tag{2}$$

at the location (P′,t0′,d′) of our 3D parameter space for the normalization of the test statistic below. In Fig. 3a, we locate $\chi^2_{\rm min,glob}$ at about 13.87 d, corresponding to the published value of K2-110 b by Osborn et al. (2017).
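The search over trial periods and epochs can be illustrated with a toy example. The sketch below is not the TLS implementation: `toy_template` is a hypothetical stand-in for the limb-darkened template, the duration and depth are held fixed at their injected values for brevity, and the light curve, noise level, and grids are invented for illustration.

```python
import numpy as np

def toy_template(x):
    """Hypothetical stand-in for a limb-darkened shape: 1 at mid-transit, 0 at the edges."""
    return 1.0 - x ** 4

def transit_model(time, period, t0, duration, depth):
    """Periodic model flux: 1 out of transit, 1 - depth * toy_template inside."""
    phase = ((time - t0) / period + 0.5) % 1.0 - 0.5   # orbital phase in [-0.5, 0.5)
    x = phase * period / (0.5 * duration)               # runs from -1 to 1 across the transit
    model = np.ones_like(time)
    inside = np.abs(x) < 1.0
    model[inside] -= depth * toy_template(x[inside])
    return model

def chi2(time, flux, sigma, period, t0, duration, depth):
    """Sum of squared, error-weighted residuals between model and data."""
    model = transit_model(time, period, t0, duration, depth)
    return np.sum((model - flux) ** 2 / sigma ** 2)

# Synthetic 80 d light curve at 30 min cadence with an injected signal.
rng = np.random.default_rng(1)
time = np.arange(0.0, 80.0, 0.5 / 24.0)
sigma = np.full_like(time, 1e-4)
flux = transit_model(time, 13.86, 3.0, 0.125, 1e-3) + rng.normal(0.0, 1e-4, time.size)

# Minimum chi^2 per trial period, minimized over the epoch t0 only.
trial_periods = 13.76 + 0.01 * np.arange(21)
chi2_min = np.array([
    min(chi2(time, flux, sigma, p, t0, 0.125, 1e-3) for t0 in np.arange(0.0, p, 0.02))
    for p in trial_periods
])
best_period = trial_periods[np.argmin(chi2_min)]
```

A full search would also loop over trial durations and rescale the depth for every trial, so that the minimum is taken over the complete (t0, d) plane.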

Fig. 3.

Panel a: distribution of χ2 (minimized over t0, d) obtained by phase-folding the light curve of K2-110 b over different trial periods. Panel b: signal residue for the best-fitting periods throughout the parameter space. Panel c: raw signal detection efficiency (black line) and walking median (red line). Panel d: signal detection efficiency used by TLS. This is the result of the division of the raw SDE by its walking median in panel c.


TLS uses a modified version of the test statistic originally implemented in the BLS algorithm3, the signal detection efficiency (SDE; Alcock et al. 2000). The SDE has been widely demonstrated to yield useful results and it has become a standard metric in the exoplanet hunting community. Our implementation of TLS, however, does not apply a binning of the phase-folded light curve to compute the signal residue (SR) between the model and the data. Our approach is computationally more expensive (quadratic in the number of data points, see Sect. 3.4) but key to making TLS more sensitive to the signals of small transiting planets.

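TLS calculates the SR from the distribution of minimum χ2 as a function of P, $\chi^2_{\rm min}(P)$, normalized by the global minimum $\chi^2_{\rm min,glob}$,

$$\mathrm{SR}(P) = \frac{\chi^2_{\rm min,glob}}{\chi^2_{\rm min}(P)} \tag{3}$$

which necessarily results in the SR(P) distribution ranging between 0 and 1 (see Fig. 3b).

The SDE(P) distribution is then obtained as per Kovács et al. (2002) using the arithmetic mean ⟨SR(P)⟩, the standard deviation σ(SR(P)), and the peak value SRpeak of SR(P) via

$$\mathrm{SDE}(P) = \frac{\mathrm{SR}_{\rm peak} - \langle \mathrm{SR}(P) \rangle}{\sigma(\mathrm{SR}(P))} \tag{4}$$

With SRpeak = 1 by definition in Eq. (3), we have

$$\mathrm{SDE}(P) = \frac{1 - \langle \mathrm{SR}(P) \rangle}{\sigma(\mathrm{SR}(P))} \tag{5}$$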

An SDE value of x for any given P means that the statistical significance of this period is xσ compared to the mean significance of all other periods. We refer to the resulting SDE(P) distribution in Fig. 3c as the raw SDE. The final step in our construction of the transit search statistic is the removal of the systematic noise component that is inherent to the SDE distribution as explained by Ofir (2014). We follow this author in removing this trend with a walking median filter through the SDE(P) periodogram, the result of which is shown in Fig. 3d for K2-110. The transit signal of K2-110 b can be found at 13.8662 d with an SDE value of 66.7. We determine a conservative error estimate via the half width at half maximum of the SDE peak, which is 0.0122 d. Hence, TLS determines the orbital period of K2-110 b as P = 13.8662 (±0.0122) d.
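The chain from the χ2 spectrum to the raw SDE can be sketched as follows. This is an illustration under assumptions, not the TLS code: the function names are hypothetical, the χ2 spectrum is made up, the per-period SDE is written as (SR − ⟨SR⟩)/σ(SR) so that its value at the SR peak equals the expression with SRpeak = 1, and the walking median step is left out.

```python
import numpy as np

def signal_residue(chi2_min):
    """Global chi^2 minimum divided by the minimum chi^2 of each trial period."""
    return np.min(chi2_min) / chi2_min

def raw_sde(chi2_min):
    """Per-period SDE; at the SR peak (SR = 1) this is (1 - mean(SR)) / std(SR)."""
    sr = signal_residue(chi2_min)
    return (sr - np.mean(sr)) / np.std(sr)

# Made-up chi^2 spectrum: flat noise floor with one deep minimum at index 120.
rng = np.random.default_rng(0)
chi2_min = 4000.0 + rng.normal(0.0, 20.0, 200)
chi2_min[120] = 3000.0

sr = signal_residue(chi2_min)
sde = raw_sde(chi2_min)
peak = int(np.argmax(sde))
```

Because SR is a monotonic function of the minimum χ2, the trial period with the lowest χ2 is always the one with the highest SDE.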

Empirical thresholds for a transit detection have been proposed to range from SDE >  6 (Dressing & Charbonneau 2015), SDE >  6.5 (Livingston et al. 2018), SDE >  7 (Siverd et al. 2012), 6–8 as a function of period (Pope et al. 2016; Aigrain et al. 2016), and up to 10 (Wells et al. 2018). Lower SDE thresholds mean better completeness but also higher false alarm rates.

2.3. The TLS parameter grid

2.3.1. Period sampling

In the search for sine-like signals, for example using Fourier transforms, a uniform sampling of the trial frequencies is usually quite efficient. A uniform sampling of the orbital frequencies has also been suggested for BLS (Kovács et al. 2002). As shown by Ofir (2014), however, this sampling of the orbital frequency tends to be insensitive to either short- or long-period planets, but it is always computationally inefficient. Ofir (2014) derived the optimal number of test frequencies as

$$N_{\rm freq,opt} = \left(f_{\rm max}^{1/3} - f_{\rm min}^{1/3} + \frac{A}{3}\right) \frac{3}{A} \tag{6}$$

with

$$A = \frac{(2\pi)^{2/3}}{\pi} \frac{R_{\rm s}}{(G M_{\rm s})^{1/3}} \frac{1}{S \times OS} \tag{7}$$

where G is the gravitational constant, Ms and Rs are the stellar mass and radius, S is the time span of the data set, and OS is the oversampling parameter to be chosen between 2 and 5 to ensure that the SDE peak is not missed between trial frequencies (or periods). The minimum and maximum trial orbital frequencies can be found at fmin = 2/S (or fmin = 3/S if three transits are required) and at the most short-period (high-frequency) circumstellar orbit, the Roche limit, $f_{\rm max} = \frac{1}{2\pi}\sqrt{G M_{\rm s}/(3 R_{\rm s})^3}$. Strictly speaking, the Roche limit depends on the density (ρp) of the planet, and the term 3 Rs for our expression of fmax assumes the most pessimistic case of an extremely low-density fluid-like planet with ρp = 1 g cm−3, which can be compared to Jupiter’s mean density of 1.33 g cm−3. Our TLS implementation generates an array of evenly spaced orbital frequencies with Nfreq, opt constant steps between fmin and fmax and then computes the (non-uniform) trial orbital periods as the inverse of this frequency grid.

Since computational speed is a key concern for us, we illustrate the resulting number of trial periods in Fig. 4, using three different time spans S of a hypothetical light curve for different stellar masses (and radii, assuming a main-sequence mass-radius relation). We find that an extension of the light curve by a certain factor (here ten between the three example curves) increases the number of trial periods by roughly the same factor. This plot warns us of the large number of trial periods that need to be examined for planets around low-mass stars, with Nfreq, opt reaching values of up to almost one million for a light curve with 1000 d of continuous observations of a very-low-mass star. This feature is inherent to both TLS and BLS.
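The period sampling can be tried out with the short sketch below. It is an illustration, not the TLS source: it assumes the optimal number of test frequencies of Ofir (2014) in the form N = (fmax^(1/3) − fmin^(1/3) + A/3) · 3/A with A = (2π)^(2/3)/π · Rs/(G Ms)^(1/3) / (S · OS), places fmax at an orbit of 3 Rs, and builds the evenly spaced frequency grid described in the text. The function names and rounded physical constants are choices made here.

```python
import numpy as np

G = 6.674e-11       # gravitational constant [m^3 kg^-1 s^-2]
M_SUN = 1.989e30    # solar mass [kg]
R_SUN = 6.957e8     # solar radius [m]
DAY = 86400.0       # seconds per day

def frequency_limits(m_star, r_star, span_days, n_transits=2):
    """f_min from the required number of transits, f_max from an orbit at 3 stellar radii [Hz]."""
    f_min = n_transits / (span_days * DAY)
    f_max = np.sqrt(G * m_star * M_SUN / (3.0 * r_star * R_SUN) ** 3) / (2.0 * np.pi)
    return f_min, f_max

def n_freq_opt(m_star, r_star, span_days, oversampling=3, n_transits=2):
    """Optimal number of trial frequencies for a star in solar units and a span in days."""
    f_min, f_max = frequency_limits(m_star, r_star, span_days, n_transits)
    a = ((2.0 * np.pi) ** (2.0 / 3.0) / np.pi
         * r_star * R_SUN / (G * m_star * M_SUN) ** (1.0 / 3.0)
         / (span_days * DAY * oversampling))
    return int(np.ceil((f_max ** (1.0 / 3.0) - f_min ** (1.0 / 3.0) + a / 3.0) * 3.0 / a))

def period_grid(m_star, r_star, span_days, oversampling=3):
    """Evenly spaced trial frequencies, returned as trial periods in days (ascending)."""
    f_min, f_max = frequency_limits(m_star, r_star, span_days)
    n = n_freq_opt(m_star, r_star, span_days, oversampling)
    freqs = np.linspace(f_min, f_max, n)
    return np.sort(1.0 / (freqs * DAY))

n_80 = n_freq_opt(1.0, 1.0, 80.0)        # sun-like star, K2-like time span
n_100 = n_freq_opt(1.0, 1.0, 100.0)
n_1000 = n_freq_opt(1.0, 1.0, 1000.0)
periods = period_grid(1.0, 1.0, 80.0)
```

For a sun-like star and an 80 d light curve this yields several thousand trial periods between roughly 0.6 d and 40 d, and a tenfold longer time span increases the count by about an order of magnitude.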

Fig. 4.

Optimal number of trial periods (Nfreq,  opt) as a function of stellar mass for three different time spans of a hypothetical stellar light curve.


2.3.2. Transit depth

TLS measures the mean flux of the in-transit data points under consideration. It then calculates the corresponding maximum transit depth δ and the resulting planet-to-star radius ratio under the assumption of zero transit impact parameter using the analytic solutions for common stellar limb darkening laws found by Heller (2019). With this calculation, the signal shape is scaled and compared to the data points.

TLS can also be used to search for user-defined signal shapes (for example, flares), either with positive or negative flux. If an analytical scaling option is not available, TLS can perform a numerical iterative fit using an initial guess based on the mean of the in-signal data. The δ range to be tested with TLS is bracketed between a lower and an upper limit. TLS uses an iterative ternary algorithm (Knuth 1998) to tighten the interval in steps of 1/3 until the upper and lower limits differ by <X% in signal depth (or height), where X is a user-defined threshold.
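A ternary search of this kind can be sketched as follows. The code is a generic illustration, not the TLS routine: the function name, the 1 − x⁴ signal shape, the noise level, and in particular the bracket of zero to three times the initial guess are assumptions made for this example.

```python
import numpy as np

def ternary_search(cost, low, high, tol_percent=1.0, max_iter=200):
    """Minimize a unimodal cost on [low, high], discarding one third of the interval per step."""
    for _ in range(max_iter):
        # Stop once the limits differ by less than tol_percent of their mean.
        if high - low < 0.01 * tol_percent * 0.5 * (low + high):
            break
        m1 = low + (high - low) / 3.0
        m2 = high - (high - low) / 3.0
        if cost(m1) < cost(m2):
            high = m2   # the minimum cannot lie in (m2, high]
        else:
            low = m1    # the minimum cannot lie in [low, m1)
    return 0.5 * (low + high)

# Toy in-signal data: hypothetical shape 1 - x^4 scaled to a depth of 1e-3.
rng = np.random.default_rng(2)
x = np.linspace(-1.0, 1.0, 50)
shape = 1.0 - x ** 4
flux = 1.0 - 1e-3 * shape + rng.normal(0.0, 1e-5, x.size)

def cost(depth):
    """Squared residuals between the scaled shape and the data."""
    return np.sum((1.0 - depth * shape - flux) ** 2)

guess = np.mean(1.0 - flux)                        # mean in-signal depth as initial guess
depth = ternary_search(cost, 0.0, 3.0 * guess)     # bracket chosen for illustration only
```

The method relies on the residuals having a single minimum within the bracket, which holds here because the cost is quadratic in the depth.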

2.3.3. Transit duration

Transit searches using the BLS algorithm usually operate with trial transit durations4 T14 that span 0.00125–0.07 (Petigura et al. 2013), 0.01–0.1 (Giacobbe et al. 2012), or 0.001–0.2 (Sanchis-Ojeda et al. 2014; Aigrain et al. 2016) times the orbital period. More than half of the corresponding T14(P) diagram, however, is not populated with exoplanet discoveries in these regions of the parameter space (see Fig. 5). We explain this absence of transiting planets using geometrical constraints and Kepler’s third law.

Fig. 5.

Transiting planets from the Exoplanet Orbit Database in the T14/P-P diagram. BLS implementations typically search a linearly spaced uniform grid, or roughly the entire diagram. However, more than half of this search space is not populated with planets. The default parameterization of TLS only searches inside the area embraced by the solid lines, which are defined in Eq. (10). TLS users can nevertheless redefine their own cuts of the T14/P-P diagram and search for planets with hitherto unknown properties in this diagram.


For example, there are no planets known with P = 10 d and T14/P <  5 × 10−3. From an astrophysical perspective, only extremely eccentric planets could have such a short transit duration – apparently a very rare, or even non-existent kind of exoplanet. It thus appears reasonable to us to restrict the computational effort to the physically plausible regions of the T14/P diagram. We also conclude from Fig. 5 that the transit duration search grid shall be linear in log-space.

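For circular orbits, the maximum transit duration is T14, max = 2(Rs + Rp)/vp, where vp = 2πa/P is the planet’s average orbital velocity during the transit and a is the orbital semimajor axis. Shorter transit durations are possible if the planetary transit path is not across the stellar diameter. We then have

$$T_{14,{\rm max}} = \frac{2 (R_{\rm s} + R_{\rm p})\, P}{2 \pi a} = \frac{(R_{\rm s} + R_{\rm p})\, P}{\pi a} \tag{8}$$

In the limit of the star being much more massive than the planet, Kepler’s third law becomes

$$a = \left(\frac{G M_{\rm s} P^2}{4 \pi^2}\right)^{1/3} \tag{9}$$

We insert Eq. (9) into Eq. (8) and obtain

$$T_{14,{\rm max}} = (R_{\rm s} + R_{\rm p}) \left(\frac{4 P}{\pi G M_{\rm s}}\right)^{1/3} \tag{10}$$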

In Fig. 5 we plot Eq. (10) for Rp = 2 RJ (RJ being Jupiter’s radius) orbiting an A5 star (Ms = 2.1 M⊙, Rs = 1.7 R⊙) to maximize the effect of a very large planet on the transit duration. To embrace the physically plausible search space, we also show a main-sequence M8 red dwarf star (Ms = 0.1 M⊙, Rs = 0.13 R⊙) and a sun-like star, both with a small planet (Rp = R⊕). We see that a significant number of planets is actually located above the uppermost of these lines, which can be attributed to bloated super-Jovian planets and/or to planets transiting slightly evolved stars, for example. The absence of planets in the lower left part of the diagram could be partly astrophysical in nature and interpreted as a trace of planet formation and evolution (for example, the absence of ultra-short period planets around M dwarfs). That said, planets can naturally have transit durations that are arbitrarily shorter than T14, max, for example, on eccentric or inclined orbits. The empty space can nevertheless be explained with a detection bias against planets with transit durations of just a few minutes, for example, ∼15 min for P = 1 d and T14/P = 10−2.

In order to compensate for planets transiting evolved stars as well as for planets on eccentric orbits5 and other astrophysical effects that are potentially hard to predict, we use Fig. 5 to derive empirical estimates for the maximum and minimum values of d to be searched. We parameterize the upper limit via Eq. (10) using Ms = 1 M⊙ and Rs = 3.5 R⊙, and we parameterize the lower limit of T14 using Ms = 1 M⊙ and Rs = 0.184 R⊙ in Eq. (10). We note that these two parameterizations do not correspond to any particular or even physically plausible main-sequence star. The motivation behind this parameterization of Eq. (10) is entirely empirical with the aim of embracing all known transiting exoplanets. Searches for planets around more exotic stars, such as white dwarfs, require other limits. Using our TLS implementation, the user can conveniently set arbitrary limits of their choice.

Our default empirical limits for the transit durations to be searched with TLS are shown with inclined solid lines in Fig. 5 and their parameterization was intentionally chosen to encompass all known transiting exoplanets. The horizontal cutoff at T14/P = 1.12 × 10−1 is a global threshold. This is the default region in the T14(P) diagram to be tested with our implementation of the TLS algorithm. That said, the user can define their own range of transit durations to be tested.
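The duration limits can be evaluated numerically with the sketch below. It assumes the maximum transit duration for a circular orbit in the form T14,max = (Rs + Rp) (4P / (π G Ms))^(1/3), which follows from T14,max = 2(Rs + Rp)/vp and Kepler's third law. The function names are hypothetical, and setting Rp = 0 in the empirical limits is an assumption made here, since the text does not state the planet radius used for them.

```python
import numpy as np

G = 6.674e-11       # gravitational constant [m^3 kg^-1 s^-2]
M_SUN = 1.989e30    # solar mass [kg]
R_SUN = 6.957e8     # solar radius [m]
DAY = 86400.0       # seconds per day
R_EARTH = 0.009158  # Earth radius in units of the solar radius

def t14_max(period_days, m_star, r_star, r_planet=0.0):
    """Maximum transit duration in days; mass in solar masses, radii in solar radii."""
    period = period_days * DAY
    radius_sum = (r_star + r_planet) * R_SUN
    return radius_sum * (4.0 * period / (np.pi * G * m_star * M_SUN)) ** (1.0 / 3.0) / DAY

def duration_fraction_limits(period_days, max_fraction=0.112):
    """Lower and upper T14/P from the empirical parameterization, capped at max_fraction."""
    lower = t14_max(period_days, 1.0, 0.184) / period_days   # Rp = 0 assumed
    upper = t14_max(period_days, 1.0, 3.5) / period_days     # Rp = 0 assumed
    return min(lower, max_fraction), min(upper, max_fraction)

earth_hours = 24.0 * t14_max(365.25, 1.0, 1.0, R_EARTH)   # Earth-Sun analog
low_10, up_10 = duration_fraction_limits(10.0)
low_05, up_05 = duration_fraction_limits(0.5)
```

For an Earth-Sun analog the function returns a central transit of about 13 h. For very short periods the upper limit is set by the horizontal cutoff rather than by the inclined line.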

3. Results

3.1. TLS and BLS signal detection efficiency for white noise

As a first test of the performance of TLS in comparison to BLS we generated synthetic light curves with white noise only. These light curves have a time span of S = 3 yr and a cadence of 30 min with 110 ppm noise (standard deviation) per cadence. This noise level is adapted to a best-case scenario for Kepler data, where the total noise over 6.5 h was found to be 30 ppm for a KP = 12 star (Gilliland et al. 2015). We note that the noise in the Kepler light curves does not have Gaussian properties due to instrumental and stellar trends (for example, from stellar rotation), all of which complicates transit detections in practice.
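The two noise figures are consistent if white noise averages down with the square root of the number of cadences. The short check below makes that arithmetic explicit (variable names are ours).

```python
import math

cadence_min = 30.0
noise_per_cadence_ppm = 110.0

# 6.5 h of 30 min cadences, averaged under the white noise assumption.
n_cadences = 6.5 * 60.0 / cadence_min
noise_6p5h_ppm = noise_per_cadence_ppm / math.sqrt(n_cadences)
```

With 13 cadences in 6.5 h, the per-cadence noise of 110 ppm corresponds to about 30.5 ppm over that interval.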

We injected three transits of an Earth-sized planet around a sun-like star with transit impact parameters randomly chosen in b = [0, 1] and with solar quadratic limb darkening as seen in the Kepler bandpass into a set of 10 000 pre-computed synthetic light curves with different white noise realizations. Then we conducted both a TLS search and a standard BLS search, in both of which we used our optimized period grid (Sect. 2.3.1) for a fair comparison. A detection was counted as “positive” if the highest peak in the power spectrum was within 1% of the injected transit period.

We determine the SDE threshold for a false positive rate of 1% to be SDE = 7 for both BLS and TLS. Given this threshold and the (forced) false positive rate of 1%, TLS recovers 93.1% of the injected signals (the true positives) compared to 75.7% for BLS. While the SDE distribution of the white noise-only light curves is virtually identical for both BLS and TLS, the SDE distribution of the light curves with signal is shifted to higher SDE values for TLS, with a mean of ⟨SDE⟩TLS, p = 9.9 compared to a mean of ⟨SDE⟩BLS, p = 8.2 for BLS. In Fig. 6, we illustrate the results of our experiment.
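The bookkeeping behind such an injection-retrieval experiment can be sketched as follows. The SDE samples below are placeholder Gaussian draws and do not reproduce the distributions or numbers of the experiment. The function names are hypothetical.

```python
import numpy as np

def sde_threshold(sde_noise, false_positive_rate=0.01):
    """SDE value exceeded by the given fraction of noise-only light curves."""
    return np.quantile(sde_noise, 1.0 - false_positive_rate)

def true_positive_rate(sde_injected, threshold):
    """Fraction of injected signals whose SDE near the injected period exceeds the threshold."""
    return np.mean(sde_injected > threshold)

# Placeholder SDE samples, NOT the distributions of the experiment described in the text.
rng = np.random.default_rng(3)
sde_noise = rng.normal(5.0, 0.8, 10_000)
sde_injected = rng.normal(9.0, 1.5, 10_000)

threshold = sde_threshold(sde_noise)
fpr = np.mean(sde_noise > threshold)
tpr = true_positive_rate(sde_injected, threshold)
```

In a real experiment, `sde_noise` would hold the maximum SDE of each noise-only search and `sde_injected` the highest SDE within 1% of the injected period.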

Fig. 6.

Statistics of the signal detection efficiency for a transit injection-retrieval experiment of simulated light curves with white noise only. Left panel: box least squares algorithm. Right panel: transit least squares algorithm. Both panels show the results of 10 000 realizations of a 3 yr light curve with white noise only (open histograms) and of the same number of light curves with white noise and an Earth-like planetary transit around a G2V star (gray histograms). Outlined histograms relate to the SDE maximum value in a noise-only search. Gray histograms refer to the highest SDE value within 1% of the period of the injected transit. The SDE threshold at which the false positive rate is 1% is found to be SDE = 7. At this SDE threshold, the recovery rate of the injected signals (the true positive rate) is 75.7% for BLS and 93.1% for TLS, while the fraction of missed signals (the false negative rate) is 24.3% for BLS and 6.9% for TLS.


3.2. Comparison of TLS and BLS for K2-110 b

We now compare the performance of TLS and BLS using a known planet with a high signal-to-noise ratio, K2-110 b (EPIC 212521166), a massive mini-Neptune orbiting an old, metal-poor K3 dwarf star with a 13.9 d period (Osborn et al. 2017). We retrieve estimates for the stellar mass (Ms = 0.752 M⊙), radius (Rs = 0.7 R⊙), effective temperature (Teff = 4628 K), and surface gravity (log (g) = 4.6) from the Kepler K2 EPIC catalog (Huber et al. 2016) using the automated catalog_info function of TLS. catalog_info also retrieves the quadratic limb-darkening coefficients for the Kepler bandpass (a = 0.7010, b = 0.0462) via a cross-match of the Claret et al. (2012) tables based on Teff and log (g). With these priors, TLS creates an optimal period grid with an oversampling factor of five, which results in 21 500 trial periods between Pmin = 0.4 d and Pmax = S/2 = 40 d.

We run BLS and TLS searches with the same period grid, oversampling, and duration constraints (Fig. 7). TLS delivers a much higher SDE (66.7) compared to BLS (16.9, or 24.5 when median-smoothed). This is despite the fact that our priors are slightly different to the improved posteriors from Osborn et al. (2017), which suggest a hotter star (Teff = 5050 ± 50 K) with the same surface gravity, resulting in different quadratic limb darkening parameters (a = 0.5322, b = 0.1787).

Fig. 7.

Phase-folded transits and periodograms of K2-110 b with TLS (top) and BLS (middle, bottom) fitting for the same trial periods and durations. We note the boost in signal detection efficiency from 16.9 with the original BLS, or 24.5 with the median-smoothed BLS to 64.2 with TLS. The vertical dashed blue lines denote the aliases of the period detected at the highest SDE value, respectively.


3.3. Recovery of the TRAPPIST-1 planets

Moving on to real light curves of Earth-sized planets with time-correlated (red) noise components, we chose the K2 light curve of TRAPPIST-1 as a stress-test for TLS to ensure its robustness. The system exhibits noise from instrumental effects, stellar rotation, flares, and other sources, which can only imperfectly be removed using detrending. Signals of multiple planets occur with overlapping transits. Each planet produces its own set of harmonics and subharmonics in the power spectrum. To be considered robust, a detection algorithm must be able to handle these difficulties, and our attempt to recover this previously reported series of transit signals in retrospect offers an exquisite case to simulate the TLS search for Earth-sized planets.

The well-studied TRAPPIST-1 system exhibits transits of seven terrestrial planets (Gillon et al. 2016, 2017) in a resonant chain, where the orbital periods are near-ratios of small integers (Luger et al. 2017). An automatic recovery of all planets is certainly difficult because of the low S/N of the individual transits resulting from the dim host star, the very small (Earth-sized) planets, transit timing variations, stellar flares, systematic trends from the stellar rotation of 3.3 days, and overlapping multi-planet transits.

As before, we use the K2 EVEREST data spanning 79 days in campaign 12. We divide the data by a running median of 13 data points.

On the one hand, the length of the running median filter window must be larger than the transit duration to prevent the transit signal from being distorted by the median filter. On the other hand, the window length must be sufficiently short to remove stellar variability. For planets around TRAPPIST-1, the longest plausible transit duration at a period of 40 days is 1.6 h (3 cadences). Our choice of 13 cadences is ∼4 times longer than the critical value. Longer median filter windows increase the residuals of the stellar noise significantly due to the strong variability of TRAPPIST-1. For comparison, in the case of K2-110b (Sect. 3.2), the amplitude of the stellar variability is much smaller and it occurs on timescales that are much longer than the transit duration. Therefore, we set the window length to 51 points (∼2 d), but a window length of 1 d yields virtually the same result.

We then remove data points that deviate positively from the mean flux by more than +3σ to eliminate bright flares. A detailed analysis by Ducrot et al. (2018) carefully identifies transits affected by flares, incomplete transits, and multi-planet transits, which increases the quality of the in-transit data. Such a fine-tuned processing is beyond the scope of our analysis.
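A one-sided clipping of this kind can be written in a few lines. The sketch below uses a single pass with the mean and standard deviation of the whole light curve, which is an assumption about the details. The function name and toy data are invented for illustration.

```python
import numpy as np

def clip_positive_outliers(time, flux, n_sigma=3.0):
    """Drop points more than n_sigma standard deviations above the mean; dips are kept."""
    keep = flux <= np.mean(flux) + n_sigma * np.std(flux)
    return time[keep], flux[keep]

# Toy detrended light curve with one flare and one transit-like dip.
rng = np.random.default_rng(4)
time = np.arange(1000) * 0.5 / 24.0
flux = 1.0 + rng.normal(0.0, 1e-3, time.size)
flux[100:103] += 0.05    # flare-like brightening
flux[500:505] -= 0.02    # transit-like dip
time_c, flux_c = clip_positive_outliers(time, flux)
```

Clipping only on the bright side removes flares while leaving negative excursions, and therefore transits, untouched.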

For our TLS search, we use the default template and priors on the stellar mass (0.089 ± 0.006 M⊙) and radius (0.121 ± 0.003 R⊙) with limb darkening for the Kepler bandpass of an M8-star (Van Grootel et al. 2018). The first run results in an SDE ∼ 45 detection of a signal with a period of P = 2.4218 ± 0.0013 d, which we identify as planet “c” (Fig. 8). Then we mask the in-transit data points of this signal using the TLS convenience function transit_mask and re-run TLS iteratively. Each successive run results in the detection (and masking) of planets c–b–g–e–f–d. The order of detection is based on the signal-to-noise ratio of the stacked transits. While planet “b” nominally has the highest S/N, its transit shape differs significantly from the TLS default template, making planet “c” have the highest SDE in the first TLS run. The seventh and outermost planet “h” is not automatically detected by TLS. We attribute this to the low number of transits (four) and to a flare that happened during the fourth transit, resulting in several in-transit data points showing unusually high flux. We show the highest SDE peak (caused by noise) of this last search in the bottom row of Fig. 8 together with the best-fit transit shape, which is very noisy. This false positive signal illustrates the limits of automatic planet recovery, which apply to both TLS and BLS. We also verified that BLS is not able to detect planet “h” using the same data processing.
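The masking step of such an iterative search can be illustrated with a NumPy-only stand-in. The helper `in_transit_mask` below is hypothetical and is not the `transit_mask` function shipped with TLS. The epoch and duration are illustrative values, and only the period is taken from the text.

```python
import numpy as np

def in_transit_mask(time, period, duration, t0):
    """Boolean mask of points within half a duration of any predicted mid-transit time."""
    phase = ((time - t0) / period + 0.5) % 1.0 - 0.5   # orbital phase in [-0.5, 0.5)
    return np.abs(phase) * period < 0.5 * duration

# 79 d of 30 min cadences; mask the first detected signal before the next search.
time = np.arange(0.0, 79.0, 0.5 / 24.0)
mask = in_transit_mask(time, period=2.4218, duration=0.04, t0=1.0)
time_remaining = time[~mask]
```

Each subsequent search would then run on the remaining data only, so that an already detected signal does not dominate the next periodogram.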

Fig. 8.

Demonstration of the TLS performance on the TRAPPIST-1 system. Left panels: phase-folded transit light curve for the respective period and epoch at SDE maximum (black dots). The best-fit transit model (fitted for transit duration and depth) with quadratic stellar limb darkening is shown with a red solid line. Planet names are indicated in the lower right corner of each panel. Planets are sorted from top to bottom in the order of detection from an iterative TLS search of the K2 light curve. Planet “h” (bottom panels) is a false positive and not related to the actual detection of TRAPPIST-1 h (see Sect. 3.3). Middle panels: entire K2 light curve of TRAPPIST-1 with the detected in-transit data points highlighted in red. Transits detected in previous iterations were masked. Right panels: SDE(P) diagram for the light curve shown in the center.

3.4. Computational costs

TLS aims at maximizing sensitivity, while our implementation of TLS also aims at maximizing computational speed. Since computing power has been continuously increasing for more than half a century now (Moore 1965) and since the whole point of TLS is to offer unprecedented sensitivity, we prioritized sensitivity over computational efficiency whenever necessary (for example: no phase binning).

The computational effort per light curve is a complicated function of the stellar mass and stellar radius (see Fig. 4) and it depends linearly on the time span of the light curve (S). Another important factor is the number of trial epochs (or trial phases), which increases linearly with S for a constant cadence. Combining both effects, we find that the computational load increases quadratically with S, which itself is proportional to the number of data points for a fixed cadence.

The default TLS configuration has a typical run time per Kepler K2 long cadence (K2 LC) light curve (∼4000 data points, S = 80 d) of 10 s, virtually identical to BLS on the same Intel Core i7-7700K (Fig. 9). We used the reference implementation of BLS provided by Astropy 3.1 (Astropy Collaboration 2013, 2018) in the C programming language, parallelized with the OpenMP interface6. To compare K2 run times, we used the same number of trial transit durations (66) in both algorithms, and the same optimal grid of ∼10 000 periods as determined by TLS. We note that the optimal grid is not available by default in Astropy and most other BLS implementations, and only used by part of the community, resulting in three to five times longer run times for BLS at the same level of sensitivity. TLS run times are strongly dependent on the stellar density prior and the shallowest transit depth considered for fitting. A range of plausible values for optimistic and pessimistic cases is shown in Fig. 9 and explained in more detail in Appendix A.

Fig. 9.

Algorithm run times for different missions, durations, and cadences. Kepler LC (30 min) and SC (1 min) are shown for 4.25 yr worth of data, K2 assumes 80 d of LC data. TESS is represented with one (1S) and twelve (12S) seasons at 2 min cadence, respectively. PLATO light curves are considered over 2 yr in both 25 s short cadence (P SC) and 10 min long cadence (P LC). The red area shows the full range of TLS run times. The upper end assumes no priors on stellar density and fitting signals down to 10 ppm. The lower end assumes typical priors from catalog data and a 100 ppm threshold (or a 1% threshold on phase sampling). In the latter case, run times are shorter than BLS for all but the largest data sets. We note the slope of roughly two orders of magnitude of the run time per order of magnitude of cadences.

Our measurements are in agreement with “one minute run time [for BLS] per processor core per K2 campaign star” (Vanderburg et al. 2016). Some years ago, BLS performance was noted as “10 min run time on a desktop workstation” for a MEarth star with 1000 data points (Berta et al. 2012). BLS run times of the PyKE kepbls routine have been reported as “26 min (…) using a 3 GHz Intel Core 2 Duo” (Kinemuchi et al. 2012)7. BLS speedup factors of 25× for K2 sized data are projected using optimal period sampling and optimal phase sampling (Ofir 2014) for nominal BLS sensitivity.

Longer data sets such as the long cadence (LC) light curves from the Kepler primary mission (K1) with ∼60 000 data points and S = 4.25 yr are more demanding, as are K1 short cadence (K1 SC) data. PLATO long cadence (P LC) light curves will have a 10 min sampling over 2 yr (per field of view) and about 10⁵ cadences. PLATO short cadence (P SC) will be 25 s and deliver 2.5 × 10⁶ cadences over 2 yr per light curve.
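The cadence counts and the quadratic scaling with S can be checked with a short calculation. This is a sketch that assumes gap-free sampling and a year of 365.25 d, so the numbers are upper limits; real light curves contain gaps and fewer points. The function names are illustrative.

```python
DAYS_PER_YEAR = 365.25


def n_cadences(span_days, cadence_seconds):
    """Number of cadences for gap-free sampling over a given time span."""
    return round(span_days * 86400.0 / cadence_seconds)


def relative_load(span_days, reference_span_days=80.0):
    """Computational load relative to a reference span, assuming load ∝ S^2."""
    return (span_days / reference_span_days) ** 2


counts = {
    "K2 LC": n_cadences(80.0, 1800.0),
    "P LC": n_cadences(2 * DAYS_PER_YEAR, 600.0),
    "P SC": n_cadences(2 * DAYS_PER_YEAR, 25.0),
}
```

Under these assumptions a tenfold longer time span at fixed cadence costs a hundred times more computation, in line with the slope noted for Fig. 9.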

All quoted TLS run times include the calculation of initial star-specific templates, which requires ∼10 ms for the quadratic limb-darkening law. We explain these technicalities in more detail in Appendix A.

3.5. Comparison to other transit detection algorithms

Mislis et al. (2016) claim that their machine-learning method is 1000 times faster than BLS, but do not include (or state) the run time required for the training part. The method is described to detect 8% more planets compared to BLS. However, no false/true positive/negative rates are given (as in our Fig. 6), preventing a comparison to TLS.

Training times for algorithms based on deep-learning to detect transits are of order several thousand CPU hours (Shallue & Vanderburg 2018). It remains unclear whether the training could be re-used in later searches. What is more, the authors describe a drop in model performance toward lower S/N transits, because few (real world) training candidates with low S/N were available.

Other studies find that random forest classifiers and convolutional neural networks produce a significant fraction of false-positives (Schanche et al. 2019). Depending on the threshold of a detection, it may also result in the outcome that BLS has the smallest fraction of false negatives (missed detections), 5% versus 11–14% for various machine classifiers (Pearson et al. 2018).

To fairly assess machine learning algorithms for transit detection, we recommend performing an independent benchmark that includes BLS and TLS. This is beyond the scope of this paper but would be a natural follow-up to it.

4. Discussion

4.1. Arbitrary signal shapes

TLS can be used with arbitrary search functions to detect other kinds of periodic events in stellar (or other) light curves. We plan to implement a user-friendly interface for such functions in the next release. As an example, although stellar flares are not known to be strictly periodic, flares from TRAPPIST-1 appear to be semi-periodic (Morris et al. 2018). An analytic description of a stellar flare (Davenport et al. 2014) could be used to search for periodically flaring stars. What is more, many phenomena related to exoplanets were not expected or known before they were found.

It would also be possible to feed TLS with an analytic description of exocometary transits (Rappaport et al. 2018; Kennedy et al. 2019) or disintegrating planets with comet-like tails (Bochinski et al. 2015; Garai 2018), atmospheric refraction (Dalba 2017, 2018), exoplanetary rings (Barnes & Fortney 2004; Ohta et al. 2009; Tusnski & Valio 2011; Aizawa et al. 2017; Hatchett et al. 2018) or artificial shapes such as rectangles (Arnold 2005) as well as starshades at the Lagrange points (Gaidos 2017; Moores & Welch 2018).

4.2. Issues with uneven sampling and data gaps

TLS moves the model transit curve over the data points in phase space, very much akin to a moving window. This procedure assumes constant cadences or steps in phase space. Constant steps in phase space are only present on average, however. Transit timing variations can sometimes induce a stroboscopic effect: Cadences that are constant in time may not be constant in phase due to resonances with the observational cadence (Szabó et al. 2013). This should only affect a small fraction of all planets.

Variable cadences result in morphologically distorted transit shapes and incorrect transit duration estimates, and usually reduce the SDE. Small variations of the cadence are negligible, for example from barycentering the timestamps of the Kepler satellite, which accounts for a variation of ≲8 min and is therefore substantially smaller than even the shortest known transit duration of ∼40 min (Rappaport et al. 2013). Partially observed transits, which can occur near data gaps or near the beginning and end of observations, may cause similar issues. When the number of observed transits is large, for example more than a dozen, the effect of partial transits is small and this is (and will be) valid for the vast majority of planets detected by Kepler (and TESS and PLATO) due to the missions’ long duty cycles. Even in the case of contamination with partial transits, the phase-folded transit light curve is usually better approximated by our TLS default transit template than by a box.

4.3. Observational biases due to the transit shape template

Observational biases for transiting planets (Kipping & Sandford 2016) are partly due to the box shaped transit fit (when using BLS). Even when fitting a better transit shape like the TLS template, similar biases can be expected since our template curve cannot be a perfect fit for all transits. For example, an eccentric or V-form grazing transit shape is substantially different from a box. This causes increased noise, resulting in lower detection efficiency. A few real, but rare, transit shapes might be closer to a box than to the reference transit template, resulting in a different set of observational biases. Characterizing these can be a natural follow-up work.

4.4. Correlated noise

Our definition of the SDE, extending from Eq. (1) to Eq. (5), assumes that the noise in the light curve is uncorrelated in time. For cases of correlated noise, which is often caused by stellar activity and instrumental systematics, SDE is an overestimate. The transit evaluation metrics provided by TLS (Appendix C) include snr_pink and snr_pink_per_transit, which can be used as an indication of correlated noise when compared to snr and snr_per_transit.

4.5. Wavelength dependence of transit shape

The advantage of TLS over BLS is largest at short wavelengths, that is in the blue-optical regime of the electromagnetic spectrum, where stellar limb darkening is most pronounced. At longer wavelengths, transits become increasingly box-shaped and the advantage of TLS over BLS vanishes. As can be seen in Fig. 10, a central Earth/sun transit observed at λ = 2 μm is well approximated by a box. While TLS can be parameterized with appropriate limb-darkening for such a bandpass and thus transit shape, its advantage over BLS is reduced to the trapezoidal ingress and egress shape (and the lack of binning). The detection efficiencies of BLS and TLS essentially match at long (∼2 μm) wavelengths.

Fig. 10.

Transit light curves for different wavelengths from the UV (300 nm) to the NIR (2 μm) are increasingly box-shaped. Simulation for a central Earth/sun transit using measurements from Hestroffer & Magnan (1998).

4.6. Eccentricity and transit timing/duration variations

Throughout this paper, we have assumed planets in circular orbits with linear ephemerides. While this is a valid approximation for many real planets, a certain fraction of planets exhibits deviations in the form of eccentricity and/or transit timing/duration variations. All of these effects cause the signal to deviate from the TLS default template and reduce the SDE. This is also true for BLS, as these effects broaden the signal in phase space. For the vast majority of these cases, the transit shapes are still closer to the TLS default template than to a box, so that the advantage of TLS over BLS holds. Usually, TTVs are very small, less than a few percent of the transit duration. Studying other cases, where a different signal template is worthwhile, could be a natural follow-up work. A search for less common transit shapes can readily be made through TLS’ interface to batman (Kreidberg 2015a,b), using the underlying Mandel & Agol (2002) analytic transit model.

4.7. Data binning

While the time required for phase-folding and sorting is identical for TLS and BLS, the latter is faster by an order of magnitude due to the binning of the phase-folded light curve. A box-shaped function allows for binning with minimal loss in quality and a fixed factor speed gain (Ofir 2014). For typical long cadence data (for example 30 min), however, binning cannot be recommended for TLS because it would smear the ingress and egress shapes of most transits and therefore reduce the sensitivity – if the ingress and egress duration is short. For high impact parameter transits, this duration can be long, and binning may be acceptable. In general, binning is adequate if there are many data points between two phase grid points at the critical phase sampling. A detailed analysis of such trade-offs can be a natural part of follow-up work on TLS.

Our current TLS implementation does not compensate for morphological light-curve distortions (temporal smearing effects) due to finite integration time (Kipping 2010). It is computationally prohibitive to re-compute the transit shape template at every test period. Instead, an optimal re-computation grid of periods could be derived following Eq. (40) in Kipping (2010). This is an open feature request for TLS and can be part of natural follow-up work.

5. Conclusion

The default transit search function of TLS is a model transit light curve optimized to find small planets. We have constructed this template based on 2346 small (Rp/Rs <  0.05) planets and planet candidates observed with the Kepler mission. With this template, or a user-specified signal shape, TLS analyses the entire, unbinned data of the phase-folded light curve. Our transit injection-retrieval experiments with white noise light curves of an Earth-sized planet around a sun-like star demonstrate that these improvements yield a 17 percentage points higher true positive rate for TLS (∼93%) compared to BLS (∼76%) if the false alarm rates are chosen to be 1%, respectively. At the same time, the TLS false negative rate (7%) is significantly smaller than that of BLS (24%). In other words, TLS is substantially more efficient and reliable in finding small planets than BLS.

The test statistic of TLS is a modified version of the signal detection efficiency (SDE) used by the standard transit detection algorithm BLS. The SDE for TLS is derived from all the data points in the phase-folded light curves and not from the binned phase-folded light curve, as done by BLS. The TLS approach is computationally more demanding but key to the increased transit detection efficiency of TLS over BLS for small planets. We also filter the SDE periodogram for a systematic noise component by dividing it by a walking median. The resulting SDE distribution for TLS yields significantly more robust detections of transit-like signals compared to BLS.

Finally, as a demonstration of the detection of Earth-sized planets around a low-mass star, we have tested TLS with its default transit template on the K2 data of the TRAPPIST-1 system and retrieved six of seven planets together with their detection statistics and phase-folded light curves. The high detection efficiency of TLS and its optimization for computational speed make it a natural search algorithm for small transiting planets in light curves from Kepler, K2, TESS, and PLATO.


3

In Appendix B we identify a glitch in a patch to the previously known BLS edge effect that slightly affects the test statistic. This has been corrected for in TLS.

4

We follow the common nomenclature to indicate the time interval between the first and fourth contact of the stellar and planetary silhouettes as T14 (Seager & Mallén-Ornelas 2003).

5

In eccentric cases the average orbital velocity in-transit can be smaller or larger and the resulting transit duration can be larger or smaller than in the circular case.

Acknowledgments

The authors thank an anonymous referee for her or his very helpful report. RH receives funding from the German Space Agency (Deutsches Zentrum für Luft- und Raumfahrt) under PLATO Data Center grant 50OO1501. This research has made use of the Exoplanet Orbit Database, of the Exoplanet Data Explorer at http://exoplanets.org, and of NASA’s ADS Bibliographic Services. The authors made use of the following software: Astropy (Astropy Collaboration 2018), NumPy (Van Der Walt et al. 2011), SciPy (Jones et al. 2001), and Matplotlib (Hunter 2007).

References

Appendix A: Optimization for computational speed

A straightforward implementation of the mathematical framework described in Sect. 2 into an algorithm would result in huge computational demands with days of run time per K2 light curve for a reasonably dense grid of P, t0, δ, d trial values. We thus implement a computationally optimized version that produces identical results to a straightforward coding. Most aspects of our optimization are time-memory trade-offs, where repetitive calculations are identified and stored in memory after their first calculation. A memory read is often faster than a repeated calculation. The memory size requirements for the computer’s random access memory (RAM) are of order 50 MB per thread, dominated by the buffered signal shapes.

Most of the data points in a light curve are out of transit. As a consequence, the out-of-transit data account for the majority of computation time. Most important, the out-of-transit data of both the observed and of the modeled light curves are identical for a fixed trial epoch, transit duration, and orbital period. Hence, we design TLS to calculate the squared residual values of the out-of-transit data only once in the (P-t0-d) 3D parameter hyperspace of our 4D search grid. Consequently, only a small amount of in-transit squared residuals need to be calculated for the various trial depths of the transits.
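The reuse of out-of-transit residuals can be sketched as follows. This is a minimal illustration with unweighted residuals and an out-of-transit model flux of one; the function names and the toy numbers are assumptions, not the TLS implementation.

```python
def squared_residuals_flat(flux):
    """Squared residuals against the flat (out-of-transit) model flux of 1."""
    return [(f - 1.0) ** 2 for f in flux]


def chi2_cached(flux, flat_sq, flat_total, template, start):
    """Chi^2 using the cached flat residuals; only the transit window is recomputed."""
    stop = start + len(template)
    in_window_flat = sum(flat_sq[start:stop])
    in_window_model = sum((f - m) ** 2 for f, m in zip(flux[start:stop], template))
    return flat_total - in_window_flat + in_window_model


def chi2_direct(flux, template, start):
    """Reference calculation that evaluates every data point again."""
    model = [1.0] * len(flux)
    model[start:start + len(template)] = template
    return sum((f - m) ** 2 for f, m in zip(flux, model))


flux = [1.0, 1.001, 0.999, 0.9995, 0.999, 0.9996, 1.0, 1.0005]
template = [0.9995, 0.999, 0.9995]  # toy in-transit model, 3 cadences

flat_sq = squared_residuals_flat(flux)  # computed once and then reused
flat_total = sum(flat_sq)
cached = chi2_cached(flux, flat_sq, flat_total, template, start=3)
direct = chi2_direct(flux, template, start=3)
```

Both routes give the same χ2, but the cached version touches only the in-transit points for each new trial depth.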

We also found that instead of calculating the oversampled transit model for each χ2 test in the d space, it is significantly faster to pre-compute all oversampled transit models for all trial durations, but only for a single, arbitrary transit depth for any given d value. Re-scaling the transit in δ only requires one multiplication per data point. Resampling a model in width, however, would be considerably more expensive due to the necessary oversampling. Thus, the transit models are pre-computed and cached for any given transit duration and then they are scaled in depth on-the-fly as TLS searches through the transit duration grid.
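The caching strategy can be illustrated with a toy example. The signal shape below is a simple placeholder and not the TLS default template; all names are hypothetical.

```python
def toy_template(n_points):
    """Placeholder signal shape at unit depth (flux 0 at mid-transit), odd n_points."""
    mid = (n_points - 1) / 2.0
    return [((i - mid) / (mid + 1.0)) ** 2 for i in range(n_points)]


def rescale_depth(unit_template, depth):
    """Scale a unit-depth template to the requested depth: one multiplication per point."""
    return [1.0 - (1.0 - m) * depth for m in unit_template]


# Pre-compute one template per trial duration (in cadences) and cache it.
cache = {n: toy_template(n) for n in (3, 5, 7)}

# Trial depths are then applied on-the-fly without resampling in width.
model = rescale_depth(cache[5], depth=100e-6)
```

Only the cheap depth scaling is repeated inside the search loop, while the expensive resampling in width is done once per trial duration.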

Phase-folding involves sorting the orbital phases ϕi  =  ti/P, where ti are the times of observations. Sorting can be extremely demanding computationally. A typical K2 light curve worth 4000 data points takes 0.5 ms to fold and sort on an Intel Core i7-7700K. The required time for folding and sorting 20 000 trial periods and 4000 trial phases would be ∼11 h. Hence, as in several BLS implementations, TLS only folds and sorts the phases once per trial period, which takes ∼5 s and implies a speed gain of a factor of 4000. We find that the fastest algorithm to sort phase arrays is “Mergesort” (von Neumann 1945, unpublished), which is typically ∼20% faster than the commonly used Quicksort algorithm (Hoare 1962).
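A phase-folding and sorting step, together with the run time estimate quoted above, can be sketched as follows. We assume here that the phase is the fractional part of t/P; the built-in Python sort is used for simplicity and is a merge-sort variant (Timsort), not necessarily the routine used in TLS.

```python
def fold_and_sort(time, flux, period):
    """Phase-fold a light curve and return phases and fluxes sorted by phase."""
    phases = [(t % period) / period for t in time]
    order = sorted(range(len(time)), key=phases.__getitem__)
    return [phases[i] for i in order], [flux[i] for i in order]


time = [0.0, 0.7, 1.1, 1.9, 2.3]
flux = [1.0, 0.99, 1.0, 1.0, 0.99]
phases, folded = fold_and_sort(time, flux, period=1.6)

# Cost estimate with the numbers from the text
fold_time_ms = 0.5
n_periods, n_phases = 20_000, 4_000
hours_fold_every_phase = fold_time_ms * 1e-3 * n_periods * n_phases / 3600.0
speed_gain = n_phases  # folding once per period instead of once per trial phase
```

In the toy example the two low flux values end up adjacent after folding at the correct period, which is what the subsequent template fit relies on.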

The TLS reference implementation is written in pure Python, an interpreted programming language, which comes with a speed loss. We therefore chose to implement many of the time-critical parts of TLS with the specialized numba package (Lam et al. 2015), which translates the Python code into machine code. This procedure is called “just-in-time” compilation and saves us two orders of magnitude in computing time.
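As an illustration of just-in-time compilation, a residual loop can be decorated with numba's njit. This is a sketch only: the function is hypothetical and not taken from TLS, and the fallback to plain Python when numba is not installed is added here so that the example runs in any environment.

```python
try:
    import numpy as np
    from numba import njit

    to_array = np.asarray
except ImportError:  # assumption: run uncompiled if numba is unavailable
    def njit(func):
        return func

    to_array = list


@njit
def sum_squared_residuals(flux, model):
    """Explicit loop; compiled to machine code on the first call if numba is present."""
    total = 0.0
    for i in range(len(flux)):
        diff = flux[i] - model[i]
        total += diff * diff
    return total


flux = to_array([1.0, 0.999, 1.001, 0.998])
model = to_array([1.0, 1.0, 1.0, 0.999])
chi2 = sum_squared_residuals(flux, model)
```

The first call includes the compilation time, so the gain only shows up when the function is called many times, as in a period search.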

TLS can be adjusted for adequate performance on large data sets with minor compromises. When speed is critical, we recommend first performing a search using fast binned BLS, which is adequate to recover all high-S/N planets. Using iterative runs, these significant signals can be found and masked from the data, reducing overall variance and data volume. When no significant BLS signal remains, the search can be switched to TLS.

The computational speed of TLS can then be increased with a sensible threshold for the shallowest transit signal that is fitted. For example, instrumental and stellar noise limit the detectability of signals by Kepler to transit depths of ∼100 ppm for periods of ∼365 d around a G2V-star, roughly the Earth-equivalent that the Kepler mission was originally designed for. Allowing the algorithm to fit signals of down to 10 ppm will then only result in a very high computational load, but almost certainly not in the discovery of a real 10 ppm transit signal. As the S/N of a planet is a complex combination of many factors, we have set the default TLS parametrization to a threshold of 10 ppm. For reference, the shallowest known transit is 11.9 ppm (Kepler-37 b, Barclay et al. 2013), a discovery that was made possible due to the short orbital period (13.4 d) and the stacking of ∼100 transits. For long-period planets, one may choose thresholds of, for example, 50 ppm for Kepler data or 100 ppm for K2 data, which avoids fitting out the complete noise floor. For reference, an Earth-sized planet transiting a dG2 star has a transit depth of ∼100 ppm. Then TLS run times are similar to those of BLS. As a follow-up work, one might develop a heuristic of a shallowest transit depth to be fitted, as a function of noise in the data, period, and other factors.
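The reference depth of an Earth-sized planet follows from the squared radius ratio. The sketch below assumes nominal radii of 6378.1 km for Earth and 695 700 km for the Sun and neglects limb darkening, which deepens a central transit somewhat.

```python
R_EARTH_KM = 6378.1    # assumed nominal equatorial radius of Earth
R_SUN_KM = 695_700.0   # assumed nominal solar radius


def geometric_depth_ppm(planet_radius_km, star_radius_km):
    """Transit depth (Rp/Rs)^2 in parts per million, without limb darkening."""
    return (planet_radius_km / star_radius_km) ** 2 * 1e6


earth_depth_ppm = geometric_depth_ppm(R_EARTH_KM, R_SUN_KM)
```

The geometric value is about 84 ppm, consistent with the order of ∼100 ppm quoted above once limb darkening is taken into account.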

For Kepler K1 LC data (30 min cadence over 4.25 yr), we have tested TLS run times for different combinations of stellar density priors (from stellar radii and masses), shallowest transit fit depths (δcut), and phase space sampling Δt0 (Tables A.1 and A.2). The KIC and EPIC catalogs (Brown et al. 2011; Huber et al. 2016) typically have relative mass and radius uncertainties of 5%, so that a range of ±0.1 ρ (±0.2 ρ) gives a 2σ (4σ) confidence interval. Priors decrease run times by an order of magnitude in case of complete phase space sampling. Then, the influence of δcut is a factor of a few between 100 ppm and 10 ppm for typical K1 data, where the standard deviation per data point (after detrending) is typically in the hundreds of ppm.

Table A.1.

TLS run times (in minutes) for Kepler K1 LC data, R = R⊙, M = M⊙, Δt0 = 0.

Table A.2.

TLS run times (in minutes) as before, but δcut = 10 ppm.

TLS also offers the option of not sampling the phase space at every cadence. For example, K1 LC data (60 000 points) allows for transit signal lengths of up to 7200 points for T14/P = 0.12. Considering noise levels in the hundreds of ppm per point, shifting this long template point by point is pointless. Instead, TLS can shift the data in phase space by a user-defined fraction of the transit duration, the latter of which is measured in cadences. As an example, the default value of 1% shifts a transit signal of 200 points length by 2 points in each trial, saving 50% of the computational time. This procedure results in much faster run times because most of the computational effort is spent on testing very long transit durations. Empirically, we find that setting Δt0 = 0.01 (instead of zero) allows for virtually identical SDE values in Kepler K1 and K2 data, while Δt0 = 0.1 results in a detection efficiency loss of a few percent and should be used only for high S/N transits, for example, using an iterative search.
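The relation between Δt0, the transit duration, and the step size in phase space can be sketched as follows. The rounding rule and the function names are assumptions chosen for illustration.

```python
def phase_stride(duration_in_cadences, dt0_fraction):
    """Step size (in cadences) between trial epochs; at least one cadence."""
    if dt0_fraction <= 0:
        return 1  # sample every cadence
    return max(1, round(duration_in_cadences * dt0_fraction))


def n_trial_positions(n_points, duration_in_cadences, dt0_fraction):
    """Number of trial epochs for a given duration and phase sampling."""
    stride = phase_stride(duration_in_cadences, dt0_fraction)
    return len(range(0, n_points, stride))


stride_200 = phase_stride(200, 0.01)                 # example from the text
stride_long = phase_stride(7200, 0.01)               # longest K1 LC signal
positions = n_trial_positions(60_000, 200, 0.01)
```

With these assumptions a 200 point signal is shifted by 2 points per trial, halving the number of trial epochs, while short signals are still tested at every cadence.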

Our TLS implementation leverages all available CPU cores and shows continuous updates of the estimated remaining time and a progress bar. The user can use the estimate to balance run time and search depth.

Appendix B: Edge effect jitter in BLS which leads to additional noise

The original BLS implementation did not account for transit events that are split between the first and the last bin of the folded light curve. This was noted by Peter R. McCullough in 2002, and an updated version of BLS was made (ee-bls.f) to account for this edge effect. The patch is commonly realized by extending the phase array through appending the first bin once again at the end, so that a split transit is stitched together, and present once in full length. The disadvantage of this approach has apparently been ignored: The test statistic is affected by a small amount of additional noise. Depending on the trial period, a transit signal (if present) is sometimes partly located in the first and the second bin. The lower (in-transit) flux values from the first bin are appended at the end of the data, resulting in a change of the ratio between out-of-transit and in-transit flux. There are phase-folded periods with one, two, or more than two bins which contain the in-transit flux. This causes a variation (over periods) of the summed noise floor, resulting in additional jitter in the test statistic. For typical Kepler light curves, the reduction in detection efficiency is comparable to a reduction in transit depth of ∼0.1−1%. TLS corrects this effect by subtracting the difference of the summed residuals between the patched and the non-patched phased data. A visualization of this effect on the statistic is shown in Fig. B.1, using synthetic data. In real data, the effect is usually overpowered by noise, and was thus ignored, but it is nonetheless present.
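The wrap-around patch itself can be illustrated with a toy folded light curve in which the transit is split across the phase boundary. This sketch shows only the stitching step, not the correction of the summed residuals applied in TLS; all names are hypothetical.

```python
def extend_for_wraparound(folded_flux, width):
    """Append the first `width` points at the end so split transits become contiguous."""
    return folded_flux + folded_flux[:width]


def deepest_window(folded_flux, width):
    """Start index and mean flux of the lowest window, allowing wrap-around."""
    extended = extend_for_wraparound(folded_flux, width)
    starts = range(len(folded_flux))
    best = min(starts, key=lambda s: sum(extended[s:s + width]))
    return best, sum(extended[best:best + width]) / width


# Transit of three points: one at the end and two at the start of the folded array
folded = [0.99, 0.99, 1.0, 1.0, 1.0, 1.0, 1.0, 0.99]
start, mean_flux = deepest_window(folded, width=3)
```

Without the extension, no window of three points would contain the full transit; with it, the window starting at the last point recovers the full depth. The appended points are counted twice, however, which is the origin of the jitter described above.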

Fig. B.1.

Edge effect jitter in BLS (right) and absence of this jitter in TLS (left).

Appendix C: Transit evaluation metrics

In addition to the SR and SDE transit search statistic (Sect. 2.2), our python implementation of TLS outputs the period of the highest SDE value, its corresponding t0, δ, d, and the planet-to-star radius ratio for zero transit impact parameter (Heller 2019), where c1 and c2 are the limb darkening coefficients of a quadratic limb darkening law. TLS also offers a range of automated evaluation parameters of the detected transits such as

  • the ratio of the signal to the white noise of the stacked transits (snr);

  • the ratio of the signal to the pink noise of the stacked transits (snr_pink) (Pont et al. 2006; Hartman & Bakos 2016);

  • the ratio of the signal to the white noise of the individual transits (snr_per_transit);

  • the ratio of the signal to the pink noise of the individual transits (snr_pink_per_transit);

  • the significance (in units of standard deviations) between the depths of the odd and even transits (odd_even_mismatch);

  • the number of transits with in-transit data points (distinct_transit_count);

  • the number of transits with no in-transit data points (empty_transit_count);

  • the total number of transits (transit_count);

  • the number of data points for each transit (per_transit_count).

Our online release of TLS comes with a documentation of these parameters and of additional evaluation metrics.
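The conversion from the maximum transit depth to the radius ratio at zero impact parameter can be sketched as follows. We assume here the relation Rp/Rs = sqrt(δ (1 − c1/3 − c2/6)), which follows from normalizing a quadratic limb darkening law by its disk-averaged intensity; the function name and the example coefficients are illustrative.

```python
from math import sqrt


def radius_ratio_from_depth(depth, c1, c2):
    """Rp/Rs for a central transit of maximum depth `depth` (quadratic law, assumed)."""
    return sqrt(depth * (1.0 - c1 / 3.0 - c2 / 6.0))


# Without limb darkening the familiar (Rp/Rs)^2 = depth is recovered.
ratio_uniform = radius_ratio_from_depth(1e-4, 0.0, 0.0)

# With limb darkening the same depth implies a smaller planet.
ratio_limb_darkened = radius_ratio_from_depth(1e-4, 0.4, 0.26)
```

For a fixed observed depth, stronger limb darkening therefore leads to a smaller inferred radius ratio.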
