Period, epoch, and prediction errors of ephemerides from continuous sets of timing measurements
Instituto de Astrofísica de Canarias, C. Vía Lactea S/N, 38205 La Laguna, Tenerife, Spain
Universidad de La Laguna, Dept. de Astrofísica, 38206 La Laguna, Tenerife, Spain
email: hdeeg@iac.es
Received: 19 November 2014
Accepted: 14 March 2015
Space missions such as Kepler and CoRoT have led to large numbers of eclipse or transit measurements in nearly continuous time series. This paper shows how to obtain the period error in such measurements from a basic linear leastsquares fit, and how to correctly derive the timing error in the prediction of future transit or eclipse events. Assuming strict periodicity, a formula for the period error of these time series is derived, σ_{P} = σ_{T} (12 / (N^{3}−N))^{1 / 2}, where σ_{P} is the period error, σ_{T} the timing error of a single measurement, and N the number of measurements. Compared to the iterative method for period error estimation by Mighell & Plavchan (2013), this much simpler formula leads to smaller period errors, whose correctness has been verified through simulations. For the prediction of times of future periodic events, usual linear ephemeris were epoch errors are quoted for the first time measurement, are prone to an overestimation of the error of that prediction. This may be avoided by a correction for the duration of the time series. An alternative is the derivation of ephemerides whose reference epoch and epoch error are given for the centre of the time series. For long continuous or nearcontinuous time series whose acquisition is completed, such central epochs should be the preferred way for the quotation of linear ephemerides. While this work was motivated from the analysis of eclipse timing measures in spacebased light curves, it should be applicable to any other problem with an uninterrupted sequence of discrete timings for which the determination of a zero point, of a constant period and of the associated errors is needed.
Key words: ephemerides / time / occultations / techniques: photometric / methods: data analysis / binaries: eclipsing
© ESO, 2015
1. Motivation and objectives
The space missions MOST, Kepler, and CoRoT have been dedicated to the acquisition of nearcontinuous photometry over long time scales. From them, timing measurements of eclipse or transit events ave become available of a different nature from those from groundbased campaigns. Their main difference is the completeness of coverage between the first and the last measurement, with duty cycles of about 90% (Michel 2013, for Kepler and CoRoT) over time scales ranging from weeks to years. The derivation of precise ephemerides for timing measurements of periodic events (typically eclipses or transits) in such data may therefore also require revised methods. In this paper, we address two points: the estimation of the period error on the one hand and the error in the prediction of the time of future eclipse events – also called the prediction error − on the other. The derivation of the prediction error is related to the correct usage of the epoch error of an ephemeris. This will also lead to a recommendation for an improved quoting of ephemeris from such longcadence time series. In all the work presented here, an intrinsically constant period of the observed target is assumed. An algorithmic method for deriving period errors from continuous series of timing measurements has been published by Mighell & Plavchan (2013, hereafter M&P). In the first part (Sect. 2) of this communication, we show that a linear fit of the measured individual timings (or a fit through “O−C” residuals, derived against a preliminary ephemeris) is the correct way to determine a period and its error, leading to a very simple equation with which to estimate the period error. The epoch error in a linear ephemeris is elaborated in the second part (Sect. 3). This error, if quoted in the usual way for the first timing measurement in a dataset, is shown to be a nonoptimum description of the zeropoint error in a linear ephemeris, which may lead to overestimated prediction errors. Correct ways to estimate the prediction error of future events are then given.
2. Derivation of the period error
Our objective is the estimation of the error of the period P, given a continuous sequence of N timing measurements T_{E} at integer Epochs E, with E = 0,...,N−1, and assuming that the period to be measured is intrinsically constant (e.g. P does not vary with E). It is also assumed that all timing measurements have an identical time error σ_{T}. A linear ephemeris given by (1)T_{c,0} being the time of zeroepoch, can then be derived from minimizing the residuals T_{E}−T_{c,E}. We note that T_{E}−T_{c,E} corresponds to the commonly used “O−C” or “observed − calculated” residuals.
The chisquare minimization to determine bestfit parameters for P and T_{c,0} is then given by the linear regression: (2)The leastsquares estimate of the slope b of a linear fit y = a + bx to datatuples (x_{i},y_{i}) can be found in many basic works on statistics (e.g. Kenney & Keeping 1962; Press et al. 1992) and is given by (3)where we use summations^{1} over i = 0,...,N−1. Recognizing that the values x_{i} are given by the Epoch number E, and changing to the nomenclature of Eq. (2)(a → T_{c,0},b → P,y_{i} → T_{E}) and for the convenience of writing, replacing E by i, we obtain (4)It is of note that the T_{i} in Eq. (4)may be either the measured times themselves (quoted for example in BJD) or O−C residuals against some other (preliminary) ephemeris. In that case, Eq. (4)delivers the difference to the period of that ephemeris.
Making use of the identities and , after some basic algebra we find (5)The terms act as weighting coefficients for the timings T_{i}, the highest weight being given for the timings at the beginning and end of a dataset, and with little or no weight for those near the centre.
From above equation for P = P(T_{0},...,T_{N−1}) we can then derive the period error σ_{P} using error propagation, e.g. , which leads immediately to (6)Using again the identities for and , we find that and arrive at the final result: (7)
2.1. Comparison with the period errors of Mighell and Plavchan
In the following, period error estimates from Eq. (7)are compared to similar ones given by M&P. They use an iterative algorithm which they denominate “Period Error Calculator (PEC)”, that obtains the period error through multiple combinations of the errors of the manifold 2point measurements that are present within a series of timing measures. For N = 2 to 8 timing measurements, M&P quote explicit values^{2} that correspond to the c_{N} coefficients of Eqs. (5)or (7)given above. A comparison of these values is shown in Table 1. Their results agree with Eq. (7)only for the case of N = 2 or N = 3. Up to N = 8 differences remain small, within ≈30%. Without implementing their PEC algorithm, we can also compare our results with their example of a strictly periodic variable with N = 171 timing measurements, each with an uncertainty of σ_{T} = 0.0104 days, for which they derive a period error of 23 microdays^{3}. However, from our Eq. (7)with c_{171} = 2.40 × 10^{6}, we derive a period error of 16.1 microdays, which implies that their “reduced value” for N = 171 is (23 / 16.1)^{2} = 2.04 times larger than c_{171}.
Comparison between c_{N} values in this work and in Mighell & Plavchan (2013).
2.2. Simulations of the period error and application notes
Given that period error estimates calculated using Eq. (7)deviate significantly from those obtained by M&P’s PEC algorithm, the results of Eq. (7)were verified by a set of simulations as follows. Assuming that an intrinsic (correct) ephemeris is given by T_{c}(E) = T_{c,0} + EP_{c}, a set of N timing measurements for E = 0,...,N−1 is generated, with values T_{E} that are randomly drawn from a normal distribution with a standarddeviation of σ_{T} and centred on the calculated value T_{c}(E). The error of each individual timing measurement is also set to be σ_{T}. An example of such a simulated set of measurements in the form of an O−C diagram against the intrinsic ephemeris is shown in Fig. 1. A linear ephemeris is then fitted by minimizing χ^{2} (e.g. as given by Eqs. (4)or (5)), which gives us the “observed period” P_{fit}, and the deviation against the intrinsic period, ΔP = P_{fit}−P_{c}. This procedure can easily be repeated many times using the same intrinsic ephemeris and a histogram of the deviations ΔP can be produced, as shown in Fig. 2. The simulations show that, for large numbers of repetitions, the mean of ΔP becomes close to zero, and the standard deviation of ΔP very closely approaches the value predicted by Eq. (7). Such simulations with 100 000 repetitions were performed for values of N = 3, 10, 100, 1000, all of them showing the validity of the period error given by Eq. (7).
Fig. 1 O−C diagram of a simulation of N = 10 timing measurements drawn from a normal distribution with a standard deviation of σ_{T} = 1 against the intrinsic ephemeris, with individual errors also of ±1. The time unit is irrelevant in these simulations and any unit can be assumed. The intrinsic (correct) ephemeris would be a horizontal line at O−C = 0 (not shown). The solid line shows the best fit to these measurements. The period error, given by Eq. (7), which is ±0.1101 time units per epoch, is indicated by the slopes of the dashed lines. In this particular simulation, the reduced chisquare against the intrinsic ephemeris was 0.64, whereas the reduced chisquare against the fit was 0.75. 

Open with DEXTER 
Fig. 2 Histogram of differences ΔP between fitted and intrinsic periods generated by repeating 100 000 times a simulation for N = 10, as shown in Fig. 1. The standard deviation of ΔP is 0.1099, very close to the value from Eq. (7)(0.1101 for N = 10). 

Open with DEXTER 
Equation (7)assumes that the measured times are normally distributed around the intrinsic (and unknown) linear ephemeris, and that the errors of the individual timing measurements (which are usually assigned by the observer) are of a size similar to the O−C residuals of the timings against this ephemeris. When this condition is not given, the reduced chisquare (using Eq. (2)and ) against the intrinsic ephemeris deviates significantly from 1. In practice, we know only the fitted ephemeris. The reduced chisquare against it should be calculated as , accounting for the two unknown fit parameters. If deviates substantially from 1, there is in principle no way to know if this is due to unusually large or small random errors in the measurements or if it has other origins. For the case of , we do not know if the measurements have been fortuitously well aligned, or if measurement errors have been overstated. For , the measurement errors might have been understated. However, an intrinsic periodvariation could also be the origin for the poor fit; a revision of the physical condition of the observed system should then be performed to evaluate whether that might be a valid hypothesis. Even if the fitted ephemeris indicates a good fit with , we have to be aware that a large against the unknown true ephemeris may still be present, and that the fitted parameters may have errors that are larger than expected. This can happen when measurements are fortuitously aligned to represent a rather deviant ephemeris. A simulation with such an outcome in shown in Fig. 3.
Fig. 3 Like Fig. 1. In this particular simulation, a fortuitously good fit is obtained from data with a rather large scatter: the reduced chisquare against the intrinsic ephemeris was 1.43, whereas the reduced chisquare against the ephemeris fit (solid line) is 0.97. However, the fit’s period has a rather large error, 2.5 times as large as the perioderror given by Eq. (7)(dashed lines). 

Open with DEXTER 
3. Epoch and prediction errors
In a similar way to that discussed for the period and period error in the previous section, we can derive the intercept of the linear fit, which gives the ephemeris zero epoch T_{c,0} and its error.
3.1. Epoch errors at the beginning of a measurement sequence
First, we continue to use the usual epoch indices ranging from 0 to N−1 and start from a common equation^{1} for the intercept: (8)Changing, as before, to the nomenclature of Eq. (2)and with very similar algebra, we obtain (9)Again, terms act as weighting coefficients for the timings T_{i}, with a weight that goes from about through zero to . For the error of T_{c,0} we obtain, similarly to Eq. (6), (10)Evaluating the sum of the weighting coefficients as , we then obtain (11)
3.2. Ephemeris with zero epoch at the centre of the measurement sequence
In the following, a zero epoch at the centre of the measurement sequence is considered. For simplicity, only the case with an odd number of measurements is elaborated. The indices (epochs) of the measurements are now labelled j and will be j = −k,...,k with . For the period, we start again from the basic Eqs. (3)or rather (4). For the summations, going now from −k to + k, we employ the identities and . The value for the period then turns out as (12)Given that , this is identical to Eq. (5). The equation for the period error is then of course also identical to Eq. (7). For the intercept, or zeroepoch, and starting from Eq. (8), however, we obtain a different and much simpler expression: (13)This “centre” or “middle epoch” has been labelled T_{c,m} in order to distinguish it from the usual zero epoch, T_{c,0}, at the first measurement. The corresponding error is given by (14)The different outcome for the epoch error, depending on its location at the beginning or in the centre of a measurement sequence, can be explained in the following way. From the epoch error at the centre of a measurement sequence σ_{Tc,m} and from the period error σ_{P}, we can calculate the expected timing error at the beginning (or at the end) of the sequence, σ_{Tbegin}, using the squaresum of errors, (15)where is the number of periods between the beginning of the sequence and its centre. Inserting the expressions for σ_{Tc,m} and σ_{P} into Eq. (15), we find that σ_{Tbegin} = σ_{Tc,0}, with σ_{Tc,0} being given by Eq. (11). This means that the error of the zero epoch of an ephemeris quoted in the conventional way, with E = 0 corresponding to the first timed event, is really the errorsum of the “true” epoch error σ_{Tc,m} in the middle of the sequence plus a contribution from the period error!
3.3. Consequences for the prediction of events beyond the measurement sequence
A fundamental function of an ephemeris is the prediction of transit or eclipse events beyond the end of a measurement sequence, giving both the time and the time uncertainty of such future events. The usually quoted σ_{Tc,0} and σ_{P} may, however, easily lead to an overestimation of this timing error if a naive errorsum given by is used, as is illustrated by the dashed slopes in Fig. 4. There are two solutions to circumvent such overestimation of the “prediction error”. The first solution, σ_{Tc,m}, can be retrieved^{4} from the conventionally quoted epoch errors by reversing Eq. (15): (16)The timeuncertainty at any epoch before, during, or after the measurement sequence can then be obtained from the errorsum, (17)where j is the number of epochs relative to the centre of the sequence. A second, simpler, way to obtain a correct prediction error comes from the observation that the epoch error at the beginning or at the end of a measurement sequence should be identical. We can then use the conventional epoch error σ_{Tc,0} for the prediction of future events by counting from the end of the sequence, using (18)where E is the epoch number since the begin of the sequence and σ_{Tc,E} is the time uncertainty at E. This equation correctly takes into account the duration of the measurement sequence (see also Fig. 4).
Fig. 4 Expected timing uncertainties during and after a sequence of timing measurements based on an ephemeris with given errors in epoch and period. The axes are similar to those in Fig. 1. The dashed line gives the development of the 1sigma uncertainty from an error sum using the period error and the epoch error at the beginning of the sequence. The solid lines outline the correct timing uncertainty, with the epoch error being derived for the centre of the sequence. For epochs beyond the end of the sequence, this uncertainty is also identical to the prediction uncertainty given by Eq. (18). 

Open with DEXTER 
4. Conclusions
In the first part of this communication, a simple formula for the derivation of period errors in continuous sequences of timing measurements with identical timing errors has been derived and verified through a set of simulations. For the extraction of a linear ephemeris from a set of timing measurements, which implies that an intrinsic constant period is assumed, there is no apparent reason to use other methods than a linear fit based on an error minimization. There is no reason to use another method for the estimation of the fitparameters errors beyond an error propagation from the equations that determine the fit parameters (in this case, from Eqs. (3)and (8)). In the case of identical measurement errors in a continuous series of data, equations for these errors simplify to those given in Sect. 2 for the period (Eq. (7)) and Sect. 3 for the epoch (Eqs. (11)and (14)). The only point open to variation is the type of error minimization used in the linear fit, where other methods beyond chisquare minimization might be considered, such as minimizing absolute errors and/or robust fits that reject outliers. These may lead to slightly different errorestimates, all of which, however, are based on the residuals against the best linear fit, independently of how that fit was obtained.
It is not the aim of this communication to revise or analyse M&P’s PEC algorithm, which derives the period error from a combination of timing errors between any pairing of two timing measurements within a sequence of timings. Certainly, PEC is a vastly more complicated way to derive the period error. Differences between the error estimates from Eq. (7)of this work and PEC (which increase with the number of timing measurements) are probably caused by an incorrect weighting of the individual 2point timing measurements from which PEC constructs its final result. Verifying this would, however, need a detailed analysis of PEC, which is beyond the scope of this study.
Both this work and the one by M&P assume identical errors for all individual timing measurements. In practice, even in space missions such as CoRoT and Kepler, imperfect duty cycles cause occasional misses or incomplete transits. Furthermore, cosmicray hits may degrade light curves of individual transits, leading to larger timing errors. For practical applications of the equations presented here, occasional measurements with strongly deviating timing errors (or missed measurements) can be ignored as long as a predominant timing error can be identified. If such a predominant error cannot be identified (e.g. owing to a change of integration time in Kepler light curves), or if a significant fraction of timing measurements is missing, the simplified equations presented here will not be reliable. The ephemeris and its errors should then be derived from a numerical leastsquares minimization of Eq. (2), using individual timing errors (e.g. Press et al. 1992, Sect. 14.2).
In the second part of this article, two equations for the epoch error of continuous timing sequences have been derived. In the first of these equations, the error is given for a zero epoch that corresponds to the first timing measurement (Eq. (11)). This is the conventional way in which ephemerides are indicated. In the second equation, a much simpler expression is obtained for the epoch error at the centre of a timing sequence (Eq. (14)). It has been shown that these errors are equivalent, with the epoch error at the beginning (or end) of a timing sequence being the error sum of the central epoch error plus the period error. Ephemerides of long sequences of timing measurements would therefore be more logically expressed with epochs and epoch errors for a timing measurement at or near the center of the sequence. With such a “central ephemeris”, the estimation of timing errors beyond the end of the original measurement sequence
will then be correctly performed by a simple error sum between epoch error and the period error. The central ephemeris serves also to correctly estimate the uncertainties of the true (intrinsic) event times during the measurement sequence. With conventional ephemerides, on the other hand, a correction for the duration of the measurement sequence is needed in order to derive correct timing errors for predictions past the measurement sequence. For the photometric followup of CoRoT planets and planet candidates, Eq. (18)has been implemented for several years in an online calculator^{5}. There, the number N of observed transit events during a CoRoT pointing is estimated from the target period and the pointing duration, the latter one ranging from 28d to 159d. Its predictions of future transit events have been shown to be reliable for CoRoT’s planet candidate verification programme (described initially in Deeg et al. 2009), as well as for an ongoing reobservation of CoRoT planet transits (Klagyivik et al., in prep.). The use of Eq. (18)over naive errorsums of epoch and period errors (counting the epochs since the beginning of the measurements) is still more important when prediction errors for targets from the Kepler mission are calculated, since in that case the difference between “naive” and correct prediction errors is much larger, due to the 3.9 yr duration of its light curves.
A successful reobservation of transits or eclipses of objects discovered by space missions such as CoRoT or Kepler depends critically on the correct prediction of their transit time and timing errors. Such predictions are essential when only a few hours are available to observe a given transit or eclipse. A best possible derivation of the ephemeris errors and the prediction errors is therefore very important for the legacies of the space missions that have brought us these wonderful datasets of long, continuous, and highly precise time series. Precise ephemeris measurements may also be expected to have a similar impact on followup observations of future planet detection missions, namely TESS (Ricker et al. 2014) and PLATO (Rauer et al. 2014).
Usually, summations over indices going from 1 to N are assumed in Eq. (3)and also in Eq. (8). The change to indices going from 0 to N−1 has no consequences as long as the summations go over a total of N terms. Here we prefer indices starting with 0 in order to start with E = 0 in the linear ephemeris, as in Eq. (1).
See the “Reduced” values in Table 1 in M&P, where they are given for M = 1,...,7 period cycles, which correspond to N = 2,...,8 timing measurements.
See M&P’s Fig. 1 and accompanying text, which is for 170 cycles, corresponding to 171 measurements.
As a side note to Eq. (16), the calculation of σ_{Tc,m} for existing ephemerides may also serve as a diagnostics on the correct sizing of these ephemeris errors since σ_{Tc,m} can easily be related to the size of the errors of individual timing measurements through Eq. (14). Of course, needs to come out as a positive number, otherwise σ_{Tc,0} is underestimated and/or σ_{P} is overestimated.
http://www.iac.es/proyecto/corot/followup/ (Access will be provided upon request to the author).
Acknowledgments
The author acknowledges support through grant AYA201239346C0202 of the Spanish Ministerio de Economía y Competividad (MINECO). He also thanks the anonymous referee for comments which led to an improvement in the presentation of this paper.
References
 Deeg, H. J., Gillon, M., Shporer, A., et al. 2009, A&A, 506, 343 [NASA ADS] [CrossRef] [EDP Sciences] (In the text)
 Kenney, J. F., & Keeping, E. S. 1962, in Mathematics of Statistics, Pt. 1, 3rd edn. (Princeton, NJ: Van Nostrand), 252 (In the text)
 Michel, E. 2013, in Stellar Pulsations: Impact of New Instrumentation and New Insights, eds. J. C. Suárez, R. Garrido, L. A. Balona, & J. ChristensenDalsgaard, Adv. Solid State Phys., 31, 145 (In the text)
 Mighell, K. J., & Plavchan, P. 2013, AJ, 145, 148 [NASA ADS] [CrossRef] (In the text)
 Press, W. H., Teukolsky, S. A., Vetterling, W. T., & Flannery, B. P. 1992, Numerical Recipes in C. The Art of Scientific Computing, 2nd edn. (Cambridge: University Press) (In the text)
 Rauer, H., Catala, C., Aerts, C., et al. 2014, Exp. Astron., 38, 249 [NASA ADS] [CrossRef] (In the text)
 Ricker, G. R., Winn, J. N., Vanderspek, R., et al. 2014, in SPIE Conf. Ser., 9143, 20 (In the text)
All Tables
All Figures
Fig. 1 O−C diagram of a simulation of N = 10 timing measurements drawn from a normal distribution with a standard deviation of σ_{T} = 1 against the intrinsic ephemeris, with individual errors also of ±1. The time unit is irrelevant in these simulations and any unit can be assumed. The intrinsic (correct) ephemeris would be a horizontal line at O−C = 0 (not shown). The solid line shows the best fit to these measurements. The period error, given by Eq. (7), which is ±0.1101 time units per epoch, is indicated by the slopes of the dashed lines. In this particular simulation, the reduced chisquare against the intrinsic ephemeris was 0.64, whereas the reduced chisquare against the fit was 0.75. 

Open with DEXTER  
In the text 
Fig. 2 Histogram of differences ΔP between fitted and intrinsic periods generated by repeating 100 000 times a simulation for N = 10, as shown in Fig. 1. The standard deviation of ΔP is 0.1099, very close to the value from Eq. (7)(0.1101 for N = 10). 

Open with DEXTER  
In the text 
Fig. 3 Like Fig. 1. In this particular simulation, a fortuitously good fit is obtained from data with a rather large scatter: the reduced chisquare against the intrinsic ephemeris was 1.43, whereas the reduced chisquare against the ephemeris fit (solid line) is 0.97. However, the fit’s period has a rather large error, 2.5 times as large as the perioderror given by Eq. (7)(dashed lines). 

Open with DEXTER  
In the text 
Fig. 4 Expected timing uncertainties during and after a sequence of timing measurements based on an ephemeris with given errors in epoch and period. The axes are similar to those in Fig. 1. The dashed line gives the development of the 1sigma uncertainty from an error sum using the period error and the epoch error at the beginning of the sequence. The solid lines outline the correct timing uncertainty, with the epoch error being derived for the centre of the sequence. For epochs beyond the end of the sequence, this uncertainty is also identical to the prediction uncertainty given by Eq. (18). 

Open with DEXTER  
In the text 