Frequentist confidence intervals for orbits

L. B. Lucy

doi:10.1051/0004-6361/201423661

Volume 565 (May 2014)

A&A, 565 (2014) A37

Issue		A&A Volume 565, May 2014


Article Number		A37
Number of page(s)		7
Section		Celestial mechanics and astrometry
DOI		https://doi.org/10.1051/0004-6361/201423661
Published online		30 April 2014

A&A 565, A37 (2014)

Frequentist confidence intervals for orbits^⋆

L. B. Lucy

Astrophysics Group, Blackett Laboratory, Imperial College London, Prince Consort Road London SW7 2AZ UK
e-mail: l.lucy@imperial.ac.uk

Received: 17 February 2014
Accepted: 25 March 2014

Abstract

The problem of efficiently computing the orbital elements of a visual binary while still deriving confidence intervals with frequentist properties is treated. When formulated in terms of the Thiele-Innes elements, the known distribution of probability in Thiele-Innes space allows efficient grid-search plus Monte-Carlo-sampling schemes to be constructed for both the minimum-χ² and the Bayesian approaches to parameter estimation. Numerical experiments with 10⁴ independent realizations of an observed orbit confirm that the 1σ and 2σ confidence and credibility intervals have coverage fractions close to their frequentist values.

Key words: binaries: visual / stars: fundamental parameters / methods: statistical

^⋆ Appendix is available in electronic form at http://www.aanda.org

© ESO, 2014

1. Introduction

When error bars or confidence intervals are reported, the reader expects them to have their frequentist meaning. Thus, a 95% confidence interval is interpreted as implying a probability of 0.95 that the true result is enclosed by that interval. Similarly, the interval defined by ± 1σ error bars is expected to include the true answer with probability 0.683. However, this frequentist ideal is often not realized. This may be the result of observers misjudging the precision of their measurements or of large measurement errors occurring more frequently than expected for a normal distribution.

Such practical issues are absent when data analysis techniques are investigated with simulations, since precision can be exactly specified and measurement errors can be assigned with random Gaussian variates, so that one might then expect a rigorous recovery of the frequentist ideal. But approximations can still compromise statistical rigour. For example, if a grid is required, confidence intervals might be affected if the grid is too coarse. In such cases, with increased computational resources, the limit as the grid steps → 0 can be closely approached and accurate results obtained.

Of more concern are approximations that compromise confidence intervals independently of any such limit. Two examples in the recent literature occur in hybrid problems – i.e., non-linear problems with a subset of linear parameters. The first example is the code EXOFAST for analysing transit and radial velocity data for stars with orbiting planets (Eastman et al. 2013). These authors note that the convergence of their Markov chain Monte Carlo (MCMC) parameter search is much faster if the exact solution for the linear parameters is introduced. However, the resulting uncertainties in the linear parameters are as much as 10 times smaller than when fitted non-linearly. Pending further research, these authors sensibly choose the inefficient option of treating all parameters as non-linear.

A similar but less extreme example arises when Bayesian estimation is applied to visual binaries (Lucy 2014; L14). When formulated in terms of the Thiele-Innes elements, the problem becomes linear in four of the seven elements. But when this linearity is exploited, coverage fractions (L14, Sect. 5.5) indicate that the standard errors of the four linear elements are too small by factors of up to 2.1.

These examples pose a statistical challenge in the analysis of orbits: How can we benefit from partial linearity without losing the frequentist properties of confidence intervals? In this paper, this challenge is addressed in its visual binary context and for both frequentist and Bayesian procedures.

2. Synthetic orbits

The paper L14 is followed closely with regard both to notation and in the creation of synthetic data.

2.1. Orbital elements

The orbit of the secondary relative to its primary is conventionally parameterized by its Campbell elements P,e,T,a,i,ω,Ω. Here P is the period, e is the eccentricity, T is a time of periastron passage, a is the semi-major axis, i is the inclination, ω is the longitude of periastron, and Ω is the position angle of the ascending node. However, from the standpoint of computational economy, many investigators – references in L14 Sect. 2.1 – prefer the Thiele-Innes elements. Thus, the Campbell parameter vector θ = (φ,ϑ), where φ = (P,e,τ) and ϑ = (a,i,ω,Ω), is replaced by the Thiele-Innes vector (φ,ψ), where the components of the vector ψ are the Thiele-Innes constants A,B,F,G. (Note that in the φ vector, T has been replaced by τ = T/P, which by definition ∈ (0,1).)
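The mapping from ϑ = (a,i,ω,Ω) to ψ = (A,B,F,G) can be sketched as follows. This is a minimal illustration that assumes the standard textbook definitions of the Thiele-Innes constants; the function name `campbell_to_thiele_innes` is hypothetical, and the sign and angle conventions adopted in L14 may differ.

```python
import math

def campbell_to_thiele_innes(a, i_deg, omega_deg, Omega_deg):
    """Return (A, B, F, G) for semi-major axis a and angles in degrees.

    Assumes the standard textbook definitions; L14 may use other conventions.
    """
    i, w, W = (math.radians(v) for v in (i_deg, omega_deg, Omega_deg))
    cw, sw = math.cos(w), math.sin(w)
    cW, sW = math.cos(W), math.sin(W)
    ci = math.cos(i)
    A = a * (cw * cW - sw * sW * ci)
    B = a * (cw * sW + sw * cW * ci)
    F = a * (-sw * cW - cw * sW * ci)
    G = a * (-sw * sW + cw * cW * ci)
    return A, B, F, G

# Example with a = 1 arcsec, i = 60 deg, omega = 250 deg, Omega = 120 deg.
A, B, F, G = campbell_to_thiele_innes(1.0, 60.0, 250.0, 120.0)
```

Under these definitions the identities A² + B² + F² + G² = a²(1 + cos² i) and AG − BF = a² cos i hold, which gives a quick consistency check on any implementation.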

2.2. Model binary

As in L14, the adopted model binary has the following Campbell elements:

$$P_{*}=100\,\mathrm{yr}, \quad \tau_{*}=0.4, \quad e_{*}=0.5, \quad a_{*}=1'', \quad i_{*}=60^{\circ}, \quad \omega_{*}=250^{\circ}, \quad \Omega_{*}=120^{\circ}. \tag{1}$$

An observing campaign for this binary is simulated by creating measured Cartesian sky coordinates $(\tilde{x}_{n},\tilde{y}_{n})$ with weights w_n at uniformly-spaced times t_n for n = 1,...,N as described in L14 Sect. 3.2. The parameters defining a campaign are f_orb, the fraction of the orbit observed, N, the number of observations, and σ, the standard error for unit weight.

For given orbital elements, the predicted orbit (x_n,y_n) is computed as described in L14 Sect. A.1, and the quality of the fit is determined by

$$\chi^{2} = \frac{1}{\sigma^{2}} \sum_{n} w_{n} (x_{n}-\tilde{x}_{n})^{2} + \frac{1}{\sigma^{2}} \sum_{n} w_{n} (y_{n}-\tilde{y}_{n})^{2}. \tag{2}$$

3. Minimum-χ² estimation

The conventional (frequentist) approach to orbit-fitting is the method of least squares – i.e., finding the elements $\hat{\theta} = (\hat{\phi},\hat{\psi})$ that minimize χ². When the problem is non-linear, the search for the minimum typically involves successive differential corrections obtained from linearized equations, starting with an initial guess. However, in treating incomplete orbits and imprecise data, it is preferable to find $\chi^{2}_{\min} = \chi^{2}(\hat{\theta})$ by means of a grid search (e.g., Hartkopf et al. 1989; Schaefer et al. 2006) and then to derive confidence intervals from constant χ² “surfaces” in parameter space (e.g., Press et al. 1992, Chap. 15.6; James 2006, Chap. 9.1.2).

3.1. Grid search

In a brute force approach to finding $\hat{\theta}$, values of χ² would be computed throughout a 7D grid. Confidence intervals for the elements would then be derived from projections of the 7D volume ${\cal V}$ defined by the inequality

$$\chi^{2}(\theta) < \chi^{2}_{\min} + \Delta_{\nu,\alpha} \tag{3}$$

where the constant Δ_ν,α is determined by ν, the number of degrees of freedom, and α, the desired confidence level. With a typical 100 steps for each dimension, the brute force method requires χ² to be evaluated at ~10¹⁴ grid points. However, if the linearity with respect to the Thiele-Innes elements can be exploited, χ² is only required at ~10⁶ grid points. This potential reduction by a factor of ~10⁸ in the number of computed orbits is a powerful incentive to solve the challenge posed in Sect. 1.

On the assumption that linearity can be exploited, grid searches in this paper are restricted to the φ-elements P,e,τ. The grid is defined by taking constant steps spanning the intervals (log P_L,log P_U),(e_L,e_U),(τ_L,τ_U). The grid cells are labelled (i,j,k) and the φ-elements at the mid-points are log P_i,e_j,τ_k. With these values fixed, $\chi^{2}_{ijk}$ is a function of ψ = (A,B,F,G) and has its minimum value $\hat{\chi}^{2}_{ijk}$ at the point $\hat{\psi}_{ijk} = (\hat{A},\hat{B},\hat{F},\hat{G})$ given by Eqs. (A.7) in L14.

Because $\chi^{2}_{ijk} \ge \hat{\chi}^{2}_{ijk}$, it follows that nowhere in (φ_ijk,ψ)-space is $\chi^{2}_{ijk}(\psi)$ less than the minimum value found in the 3D search. Accordingly, in the limit of vanishingly small grid steps,

$$\chi^{2}_{\min} = \min_{\theta} \left\{\chi^{2}(\theta)\right\} = \min_{ijk} \left\{\hat{\chi}^{2}_{ijk}\right\}. \tag{4}$$

Thus, as has long been understood (e.g., Hartkopf et al. 1989), the minimum-χ² elements $\hat{\theta}$ can be found with a grid search restricted to the non-linear elements.
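The restricted search can be sketched in a few lines. The code below is an illustration only: it assumes the standard linear model x = A X + F Y, y = B X + G Y with X = cos E − e and Y = √(1 − e²) sin E, and it solves the 2×2 weighted normal equations directly rather than reproducing Eqs. (A.7) of L14. All function names are hypothetical.

```python
import math

def kepler_E(M, e, tol=1e-12):
    """Solve Kepler's equation E - e*sin(E) = M by Newton iteration (e < 1)."""
    M = M % (2.0 * math.pi)
    E = math.pi if e > 0.8 else M
    for _ in range(100):
        dE = (E - e * math.sin(E) - M) / (1.0 - e * math.cos(E))
        E -= dE
        if abs(dE) < tol:
            break
    return E

def basis(t, P, e, tau):
    """Return lists X_n, Y_n at times t for phi = (P, e, tau), tau = T/P."""
    X, Y = [], []
    for tn in t:
        E = kepler_E(2.0 * math.pi * (tn / P - tau), e)
        X.append(math.cos(E) - e)
        Y.append(math.sqrt(1.0 - e * e) * math.sin(E))
    return X, Y

def solve_pair(X, Y, z, w):
    """Weighted least squares for z ~ c1*X + c2*Y via 2x2 normal equations."""
    sxx = sum(wi * xi * xi for wi, xi in zip(w, X))
    syy = sum(wi * yi * yi for wi, yi in zip(w, Y))
    sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, X, Y))
    sxz = sum(wi * xi * zi for wi, xi, zi in zip(w, X, z))
    syz = sum(wi * yi * zi for wi, yi, zi in zip(w, Y, z))
    det = sxx * syy - sxy * sxy
    return (sxz * syy - sxy * syz) / det, (sxx * syz - sxy * sxz) / det

def profile_chi2(P, e, tau, t, x_obs, y_obs, w, sigma):
    """Minimum chi^2 over psi at fixed phi, and the minimizing (A, B, F, G)."""
    X, Y = basis(t, P, e, tau)
    A, F = solve_pair(X, Y, x_obs, w)
    B, G = solve_pair(X, Y, y_obs, w)
    chi2 = sum(wi * ((A * Xi + F * Yi - xo) ** 2 + (B * Xi + G * Yi - yo) ** 2)
               for wi, Xi, Yi, xo, yo in zip(w, X, Y, x_obs, y_obs)) / sigma ** 2
    return chi2, (A, B, F, G)

def grid_minimum(logP_grid, e_grid, tau_grid, t, x_obs, y_obs, w, sigma):
    """3D scan over (log P, e, tau); returns (chi2_min, phi, psi_hat)."""
    best = None
    for logP in logP_grid:
        for e in e_grid:
            for tau in tau_grid:
                chi2, psi = profile_chi2(10.0 ** logP, e, tau,
                                         t, x_obs, y_obs, w, sigma)
                if best is None or chi2 < best[0]:
                    best = (chi2, (logP, e, tau), psi)
    return best
```

Each grid point costs one Kepler solve per observation plus two 2×2 linear solves, which is why the scan stays three-dimensional rather than seven-dimensional.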

3.2. Approximate confidence intervals

With $\hat{\theta}$ determined, the calculation of confidence intervals requires projections of ${\cal V}$, the volume in θ-space defined by Eq. (3). In the absence of a 7D grid, a possible approach is to derive approximate confidence intervals from projections of the 3D grid points satisfying Eq. (3) – i.e., from projections of the domain ${\cal D}$ comprising grid points such that

$$\hat{\chi}^{2}_{ijk} < \chi^{2}_{\min} + \Delta_{\nu,\alpha}. \tag{5}$$

This derivation of confidence intervals has exploited linearity since it relies on obtaining $\hat{\psi}_{ijk}$ and therefore also $\hat{\chi}^{2}_{ijk}$ without iteration. However, every point $\in {\cal D}$ also satisfies Eq. (3), so that ${\cal D} \subset {\cal V}$. Accordingly, these approximate intervals will always be enclosed within the true intervals and so may give a misleading impression of the elements’ precision.

3.3. Accurate confidence intervals

An asymptotically rigorous calculation proceeds as follows: first, since $\chi^{2}_{ijk}(\psi) \ge \hat{\chi}^{2}_{ijk}$, the points (φ_ijk,ψ) are all exterior to ${\cal V}$ when $\phi_{ijk} \notin {\cal D}$. Thus grid points $\notin {\cal D}$ are no longer of interest.

Now consider a point $\phi_{ijk} \in {\cal D}$. A point in the ψ-space attached to φ_ijk has a χ² higher than the least squares value at φ_ijk by the amount

$$\delta \chi^{2}_{ijk} = \chi^{2}_{ijk} - \hat{\chi}^{2}_{ijk}. \tag{6}$$

This point is on the 6D surface ${\cal S}$ of the volume ${\cal V}$ if

$$\delta \chi^{2}_{ijk} = \Delta_{\nu,\alpha} - \left(\hat{\chi}^{2}_{ijk} - \chi^{2}_{\min}\right). \tag{7}$$

The contribution to ${\cal S}$ arising from the grid point (i,j,k) can therefore be obtained by randomly sampling the attached ψ-space subject to this constraint on $\delta \chi^{2}_{ijk}$. The superposition of these contributions from all $\phi_{ijk} \in {\cal D}$ then maps out ${\cal S}$, and the projections of ${\cal S}$ give the desired confidence intervals.

If ${\cal N}$ is the number of random points ψ_ℓ on ${\cal S}$ generated at each φ_ijk, the ensemble of generated points (φ_ijk,ψ_ℓ) becomes an exact representation of ${\cal S}$ in the limits ${\cal N} \rightarrow \infty$ and grid steps → 0. In other words, in these limits no finite surface element $\in {\cal S}$ would be missed by the random sampling.

The merits of this procedure are the following: 1) The random sampling on ${\cal S}$ does not require further orbits to be calculated; 2) in contrast to acceptance-rejection methods common in Monte Carlo sampling, all points are accepted; and 3) in contrast to the brute force approach, no points are computed either interior or exterior to ${\cal S}$.
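One way to realize such sampling is sketched below. It relies on the modified Thiele-Innes constants of Appendix A.2, in which the increment is a sum of four squares, so a point with a prescribed δχ² lies on a 3-sphere of radius √δχ². Drawing the direction uniformly on that sphere is an assumption made here for illustration and is not necessarily the scheme of Sect. A.5; the function names are hypothetical.

```python
import math
import random

def sample_on_surface(psi_hat, sig, rho_af, rho_bg, dchi2, rng):
    """Return one psi = (A, B, F, G) whose chi^2 increment equals dchi2.

    psi_hat = least-squares (A, B, F, G); sig = (sigma_a, sigma_b, sigma_f, sigma_g).
    The direction is uniform on the 3-sphere (an illustrative assumption).
    """
    u = [rng.gauss(0.0, 1.0) for _ in range(4)]
    norm = math.sqrt(sum(v * v for v in u))
    r = math.sqrt(dchi2)
    cA, cF, cB, cG = (r * v / norm for v in u)   # cA^2 + cF^2 + cB^2 + cG^2 = dchi2
    sa, sb, sf, sg = sig
    a = sa * cA
    f = sf * (rho_af * cA + math.sqrt(1.0 - rho_af ** 2) * cF)
    b = sb * cB
    g = sg * (rho_bg * cB + math.sqrt(1.0 - rho_bg ** 2) * cG)
    A, B, F, G = psi_hat
    return (A + a, B + b, F + f, G + g)

def pair_increment(u, v, su, sv, rho):
    """Chi-square increment for one displaced pair, as in Eq. (A.6)."""
    p, q = u / su, v / sv
    return (p * p + q * q - 2.0 * rho * p * q) / (1.0 - rho * rho)
```

Every draw lands on the target surface by construction, so no draw is rejected and no orbit is recomputed.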

3.4. An example

To illustrate this calculation of confidence intervals, the model binary with elements given in Eq. (1) is observed in a campaign with parameters f_orb = 0.6, N = 15, σ = 0.05″. The initial 3D search for $\chi^{2}_{\min}$ is on a coarse 100³ grid spanning the intervals (1,4),(0,1),(0,1) in log P,e,τ, respectively. Given the resulting initial estimate of $\chi^{2}_{\min}$, the search boundaries are pruned in such a way that no point with $\chi^{2} < \chi^{2}_{\min} + 25$ is excluded, and then a new 100³ grid is computed. The resulting $\chi^{2}_{\min}$ and its location are then slightly improved by making small random displacements and accepting those that reduce χ².

An investigator selects the confidence intervals of interest by specifying ν and α. Here we take α = 0.683, corresponding to ± 1σ limits, and ν = 1, thus computing confidence intervals for each element individually – i.e., independently of the other elements. With these choices, Δ_ν,α = 1. For ± 2σ limits, α = 0.954 and Δ_ν,α = 4.
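The threshold Δ_ν,α is the α-quantile of the χ² distribution with ν degrees of freedom. A minimal standard-library sketch follows, using the power series for the regularized incomplete gamma function and bisection; the function names are hypothetical, and a statistics library would normally be used instead.

```python
import math

def chi2_cdf(x, nu):
    """CDF of the chi-square distribution with nu degrees of freedom.

    Uses the power series for the regularized lower incomplete gamma function.
    """
    if x <= 0.0:
        return 0.0
    a, z = 0.5 * nu, 0.5 * x
    term = total = 1.0
    n = 0
    while term > 1e-17 * total:
        n += 1
        term *= z / (a + n)
        total += term
    return total * math.exp(a * math.log(z) - z - math.lgamma(a + 1.0))

def delta_chi2(nu, alpha, hi=200.0):
    """Solve chi2_cdf(Delta, nu) = alpha for Delta by bisection on [0, hi]."""
    lo = 0.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if chi2_cdf(mid, nu) < alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

With ν = 1, the calls `delta_chi2(1, 0.683)` and `delta_chi2(1, 0.954)` return values close to 1 and 4, matching the ± 1σ and ± 2σ thresholds quoted above.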

With Δ_ν,α = 1, the refined grid has 506 points satisfying Eq. (5). These define ${\cal D}$ and, as described in Sect. 3.2, approximate confidence intervals are obtained from projections of ${\cal D}$. The details are as follows: At each point $\phi_{ijk} \in {\cal D}$, the least squares values $\hat{A},\hat{B},\hat{F},\hat{G}$ are available from the grid search. From these, the Campbell elements $\hat{a}, \hat{i}, \hat{\omega}, \hat{\Omega}$ are calculated as described in Sect. A.4 of L14. Thus the Campbell elements θ_ijk are known at every point $\in {\cal D}$ and the projections of this ensemble give the ± 1σ intervals. Figure 1 illustrates this procedure. The (log a,i)-projection of the θ_ijk vectors $\in {\cal D}$ is plotted as are the resulting ± 1σ limits.

Fig. 1

Approximate confidence intervals. The ensemble of orbit vectors $\theta_{ijk} = (\phi_{ijk}, \hat{\psi}_{ijk})$ with $\hat{\chi}^{2}_{ijk} < \chi^{2}_{\min} + 1$ is projected on to the (log a,i)-plane. The ± 1σ bounds for each coordinate are indicated. The 506 grid points φ_ijk define the domain ${\cal D}$.

Now, for the same observed orbit $(\tilde{x}_{n}, \tilde{y}_{n})$ and with the same refined grid, accurate ± 1σ intervals are computed according to the procedure of Sect. 3.3. At each of the 506 grid points $\in {\cal D}$, ${\cal N} = 5$ random points on ${\cal S}$ are calculated as described in Sect. A.5. These points ψ_ℓ are such that $\chi^{2}(\psi_{\ell}) = \chi^{2}_{\min} + 1$. For each ψ_ℓ, the corresponding Campbell elements are then derived as described in Sect. A.4 of L14. The final result is 2530 vectors $\theta_{\ell} \in {\cal S}$. The projections of ${\cal S}$ give the desired ± 1σ limits. Figure 2 illustrates this step by again projecting the cloud of points on to the (log a,i)-plane.

Because Figs. 1 and 2 refer to the same orbit and are plotted to the same scale, we see immediately that, as anticipated in Sect. 3.2, the approximate ± 1σ intervals are enclosed by the accurate intervals. The ratios accurate:approximate are 1.4 for log a and 1.9 for i.

A point to note from Figs. 1 and 2 is that with finite samples, the derived confidence limits will always be underestimates. In the above example, increasing ${\cal N}$ from 5 to 500 increases the confidence intervals for log a and i by additional factors of 1.045 and 1.055, respectively. Convergence experiments indicate that sufficient accuracy is achieved with ${\cal N} \gtrsim 200$.

The error bars derived with these procedures for a quantity Q are in general not symmetric about its minimum-χ² value. Accordingly, in testing these procedures, attention is focussed not on error bars but on confidence intervals (Q_L,Q_U).

Fig. 2

Accurate confidence limits. The ensemble of orbit vectors $\theta_{\ell} \in {\cal S}$ – i.e., with $\chi^{2}_{\ell} = \chi^{2}_{\min} + 1$ – is projected on to the (log a,i)-plane. The ± 1σ bounds for each coordinate are indicated. Each of the 506 grid points $\phi_{ijk} \in {\cal D}$ gives rise to ${\cal N} = 5$ points on ${\cal S}$.

3.5. Coverage fractions

Confidence intervals calculated as described in Sect. 3.3 are claimed to be asymptotically rigorous. This implies that, with a fine enough grid and a sufficiently large ${\cal N}$, coverage fractions will be close to their frequentist values. To test this, two experiments similar to those in Sect. 5.5 of L14 are now reported. In each experiment, the example of Sect. 3.4 is repeated 10 000 times with independent realizations $(\tilde{x}_{n}, \tilde{y}_{n})$ of the observed orbit. For each trial, confidence intervals are computed for various quantities Q, and an interval is counted as a success if Q_L<Q_exact<Q_U. The quantities Q are the Campbell elements plus the mass estimator ℳ defined in Eq. (17) of L14.
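The logic of a coverage experiment can be illustrated with a deliberately simple stand-in. In the sketch below the orbit fit is replaced by estimating a Gaussian mean with known σ, which is an assumption made purely to keep the example self-contained; only the counting of successes and the binomial sampling error of the resulting fraction carry over. The function names are hypothetical.

```python
import math
import random

def coverage_fraction(n_trials, n_obs, sigma, rng, n_sigma=1.0, mu_true=0.0):
    """Fraction of trials whose interval contains the exact value.

    Toy stand-in: each trial estimates a Gaussian mean from n_obs data and
    forms the interval xbar +/- n_sigma * sigma / sqrt(n_obs).
    """
    half = n_sigma * sigma / math.sqrt(n_obs)
    hits = 0
    for _ in range(n_trials):
        xbar = sum(rng.gauss(mu_true, sigma) for _ in range(n_obs)) / n_obs
        if xbar - half < mu_true < xbar + half:   # success: Q_L < Q_exact < Q_U
            hits += 1
    return hits / n_trials

def binomial_error(f, n_trials):
    """Standard error of a coverage fraction f estimated from n_trials."""
    return math.sqrt(f * (1.0 - f) / n_trials)

rng = random.Random(1)
f1 = coverage_fraction(10_000, 15, 0.05, rng)
err = binomial_error(f1, 10_000)
```

With 10⁴ trials the sampling error of a ± 1σ coverage fraction is roughly 0.005, which sets the scale on which a measured fraction can be said to agree with 0.683.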

The approximate and accurate intervals of Sect. 3.2 and Sect. 3.3 – now with ${\cal N} = 500$ – are investigated in experiments i and ii, respectively. In each experiment, both ± 1σ and ± 2σ intervals are tested.

The results are reported in Table 1. From experiment i, we see that the approximate confidence intervals are too small by factors of up to 1.9 for the ± 1σ intervals and up to 1.4 for the ± 2σ intervals. In contrast, the results for experiment ii are close to ideal. Specifically, to within errors, the ± 2σ coverage fractions match the frequentist value. On the other hand, the ± 1σ coverage fractions fall short by the inconsequential factor of 1.016, indicating the need for a somewhat larger ${\cal N}$.

These results support the claim that the procedure developed in Sect. 3.3 is asymptotically rigorous. Moreover, the additional computational burden is negligible: experiment ii required a mere 11% more computer time than experiment i.

Table 1

Coverage fractions f for confidence intervals from 10⁴ trials.

4. Bayesian estimation

In this section, the Bayesian treatment of L14 is modified to eliminate its dependence on the profile likelihood – Eq. (3) in L14.

4.1. Posterior density

The posterior probability density function (PDF) at (φ,ψ) given data D is

$$\Lambda(\phi,\psi|D) \propto {\cal L}(\phi,\psi|D) \, \pi(\phi,\psi) \tag{8}$$

where ℒ is the likelihood of (φ,ψ) given D, and π(φ,ψ) is a PDF that quantifies the investigator’s prior beliefs or knowledge about (φ,ψ). As in L14, we assume that π is the product of seven independent priors, one for each element. With the same choices as in L14, π can be omitted from Eq. (8) if the φ elements are now understood to be (log P,e,τ).

Since coefficients independent of (φ,ψ) can be ignored,

$${\cal L}(\phi,\psi|D) \propto \exp\left(-\frac{1}{2} \hat{\chi}^{2}\right) \times \exp\left(-\frac{1}{2} \delta \chi^{2}\right) \tag{9}$$

where $\hat{\chi}^{2}(\phi) = \chi^{2}(\hat{\psi}|\phi)$ is the minimum value of χ² at fixed φ, and δχ²(δψ | φ) is the positive increment in χ² for the displacement $\delta \psi = \psi-\hat{\psi}$. The second factor in Eq. (9) can be eliminated using Eq. (A.9), so that

$$\Lambda \propto \exp\left(-\frac{1}{2} \hat{\chi}^{2}\right) \eta(\phi|D) \, p(\psi|\phi,D). \tag{10}$$

If we now approximate p(ψ | φ,D) by a sum of δ functions as discussed in Sect. A.3, then

$$\Lambda \propto \exp\left(-\frac{1}{2} \hat{\chi}^{2}\right) \frac{\eta}{{\cal N}} \sum_{\ell} \delta(\psi-\psi_{\ell}) \tag{11}$$

where the ψ_ℓ are ${\cal N}$ independent vectors that randomly sample the exact quadrivariate normal PDF.

The PDF Λ is for the Thiele-Innes elements. The corresponding PDF Γ(θ | D) for the Campbell elements θ is

$$\Gamma \propto \exp\left(-\frac{1}{2} \hat{\chi}^{2}\right) \frac{\eta}{{\cal N}} \sum_{\ell} \delta(\vartheta-\vartheta_{\ell}) \tag{12}$$

where ϑ_ℓ = ϑ(ψ_ℓ).

4.2. Credibility intervals

In terms of a 3D scan over φ-space, the PDF Γ(θ | D) is approximated by the ensemble of 7D vectors

$$\theta_{m} = (\phi_{ijk},\vartheta_{\ell}) \tag{13}$$

with weights

$$\mu_{m} = \frac{\eta_{ijk}}{{\cal N}} \exp\left(-\frac{1}{2} \hat{\chi}^{2}_{ijk}\right). \tag{14}$$

Here m is an index that enumerates the random points ϑ_ℓ across all grid points (i,j,k).

If Q(θ) is a quantity for which a credibility interval is required, the data from which this can be computed are the values Q_m = Q(θ_m) with weights μ_m. From these data, an estimate of the PDF of Q is

$$\Theta(Q) = \sum_{m} \mu_{m} \delta(Q-Q_{m}) \Big/ \sum_{m} \mu_{m} \tag{15}$$

with corresponding cumulative distribution function (CDF)

$$F(Q) = \sum_{Q_{m} < Q} \mu_{m} \Big/ \sum_{m} \mu_{m}. \tag{16}$$

The equal tail credibility interval (Q_L,Q_U) corresponding to ± 1σ is then obtained from the equations

$$F(Q_{L}) = 0.1587, \qquad F(Q_{U}) = 0.8413 \tag{17}$$

so that the enclosed probability is 0.6826.
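The weighted CDF and its equal-tail interval can be sketched as below. The function name `credibility_interval` is hypothetical, and taking the first sample at which the cumulative weight reaches each target level (with no interpolation between samples) is a simplifying assumption.

```python
def credibility_interval(Q, mu, lo=0.1587, hi=0.8413):
    """Equal-tail interval (Q_L, Q_U) from samples Q_m with weights mu_m.

    Sorts the samples, accumulates normalized weights, and returns the first
    sample values at which the cumulative weight reaches lo and hi.
    """
    total = sum(mu)
    cum = 0.0
    q_lo = q_hi = None
    for q, m in sorted(zip(Q, mu)):
        cum += m / total
        if q_lo is None and cum >= lo:
            q_lo = q
        if q_hi is None and cum >= hi:
            q_hi = q
            break
    return q_lo, q_hi
```

Because every sample contributes through its weight, points with negligible μ_m shift the interval very little, so no sample needs to be discarded.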

These credibility intervals are asymptotically rigorous – i.e., are exact in the limits ${\cal N} \rightarrow \infty$ and grid steps → 0.

4.3. Calculation procedure

The basic steps required to derive credibility intervals are as follows:

1) At every point (log P_i,e_j,τ_k), the minimum-χ² Thiele-Innes elements $\hat{A},\hat{B},\hat{F},\hat{G}$ are obtained with Eqs. (A.7) of L14, and the corresponding $\hat{\chi}^{2}_{ijk}$ computed.
2) The variances and covariances defining the exact quadrivariate normal PDF p(ψ | φ,D) at φ_ijk are computed with Eqs. (A.9), (A.10) of L14.
3) Random points ψ_ℓ sampling the exact PDF p(ψ | φ_ijk,D) are computed as described in Sect. A.4.
4) The Campbell elements ϑ_ℓ corresponding to ψ_ℓ are computed as described in Sect. A.4 of L14.
5) The vectors θ_m are then (φ_ijk,ϑ_ℓ) with weights μ_m given by Eq. (14).
6) Lastly, credibility intervals are derived from Q_m with the approximate CDF given in Eq. (16).
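Steps 3 and 5 can be sketched as follows. The sampler draws four independent standard normal variates and maps them to displacements (a,b,f,g) with the transformation of Appendix A.2; the function names are hypothetical, and the optional subtraction of a reference χ² in the weight is an assumption added only to avoid underflow, since it rescales all weights by the same factor.

```python
import math
import random

def sample_psi(psi_hat, sig, rho_af, rho_bg, n, rng):
    """Draw n points psi = (A, B, F, G) from the quadrivariate normal PDF.

    psi_hat = least-squares (A, B, F, G); sig = (sigma_a, sigma_b, sigma_f, sigma_g).
    """
    A, B, F, G = psi_hat
    sa, sb, sf, sg = sig
    out = []
    for _ in range(n):
        cA, cB, cF, cG = (rng.gauss(0.0, 1.0) for _ in range(4))
        a = sa * cA
        f = sf * (rho_af * cA + math.sqrt(1.0 - rho_af ** 2) * cF)
        b = sb * cB
        g = sg * (rho_bg * cB + math.sqrt(1.0 - rho_bg ** 2) * cG)
        out.append((A + a, B + b, F + f, G + g))
    return out

def weight(eta, chi2_hat, n, chi2_ref=0.0):
    """Weight of Eq. (14); chi2_ref is a common offset (assumption, see text)."""
    return eta / n * math.exp(-0.5 * (chi2_hat - chi2_ref))
```

All ${\cal N}$ points drawn at one grid point share the same weight, so the weights need to be evaluated only once per grid point.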

4.4. An example

To illustrate this procedure, credibility intervals are computed for the orbit $(\tilde{x}_{n},\tilde{y}_{n})$ discussed in Sect. 3.4. A scatter diagram analogous to Figs. 1 and 2 is not readily constructed because the points θ_m are not of equal weight. Instead, the confidence intervals derived as in Fig. 2 (but now with ${\cal N} = 500$) are compared with the credibility intervals derived from Eqs. (17).

The Δ_ν,α = 1 confidence interval for log a is (− 0.020, 0.006), whereas the equal-tail 68.3% credibility interval is (− 0.019, 0.008). The corresponding intervals for i are (58.1°, 61.1°) and (58.2°, 61.2°), respectively.

In these calculations, ${\cal N} = 50$ for $\chi^{2}_{ijk} < \chi^{2}_{\min} + 21.85$ and ${\cal N} = 1$ otherwise. The domain defined by this inequality corresponds to Δ_ν,α with α = 0.9973 and ν = 7, thus ensuring an accurate treatment of the wings of the posterior PDF’s to beyond ± 2σ. Convergence experiments indicate that sufficient accuracy is achieved with ${\cal N} \gtrsim 20$.

In contrast to confidence intervals derived from scatter plots such as Fig. 2, the credibility intervals calculated from Eqs. (17) are not biased. Convergence to the asymptote is therefore faster and sufficient accuracy is achieved with a smaller ${\cal N}$.

4.5. Coverage fractions

For comparison with Sect. 3.5 above and with Sect. 5.5 of L14, coverage fractions for credibility intervals are computed for 10 000 independent realizations of the observed orbit, and an interval is again counted as a success if Q_L<Q_exact<Q_U. The results of this experiment (iii) are given in Table 2.

Because these are credibility not frequentist intervals, there is no rigorous asymptotic expectation that the frequentist fractions should be recovered. Nevertheless, these ideal fractions are closely matched and so the credibility intervals calculated according to Sect. 4.2 can be described as well-calibrated (Dawid 1982).

When Table 2 is compared to Table 1 in L14, we see that the previous low coverage fractions for the ψ-elements log a,i,ω,Ω and for the derived quantity log ℳ are now replaced by fractions close to their frequentist values. This confirms the conjecture in L14 that the shortfall was due to the profile likelihood.

As in Sect. 3.5, statistical rigour is achieved with only a modest increase in the computational burden. Experiment iii required 22% more computer time than experiment i.

When Bayesian estimates depend on informative priors, the PDF π(θ) may have a significant gradient at θ_∗, the elements of a particular binary. A coverage experiment restricted to θ_∗ will then (correctly) deviate from the frequentist expectation. In coverage tests for such cases, each independent orbit $(\tilde{x}_{n},\tilde{y}_{n})$ should also be for a random θ drawn from π(θ).

Table 2

Coverage fractions for credibility intervals from 10⁴ trials.

5. Comparison of estimates

The relative performance of minimum-χ² and Bayesian estimation in experiments ii and iii is summarized in Table 3. The means ⟨ δQ ⟩ and standard deviations s_δQ of the residuals δQ = Q_est − Q_exact are tabulated, where Q_est is either the minimum-χ² value of Q or its posterior mean, and Q_exact is given in Eq. (1).

Table 3 shows that in this test the two estimation methodologies yield closely similar results. This is to be expected for a non-informative prior π(θ) with negligible gradient at θ_∗.

Table 3

Comparison of residuals δQ.

6. Conclusion

This paper has addressed a technical issue in the statistical analysis of orbits: how to achieve statistical rigour while taking advantage of the linearity of a subset of the orbital elements. The coverage experiments reported in Sects. 3.5 and 4.5 demonstrate that statistical rigour is achieved for both minimum-χ² and Bayesian estimation. Moreover, the reported timings show an inconsequential increase in the computational burden.

The key to this success is that at each grid point the distribution of probability in Thiele-Innes space is known. For minimum-χ² estimation, this allows Monte Carlo sampling to be targeted (Sect. A.5) precisely on the constant χ² surface defining the desired confidence level. For Bayesian estimation, the known PDF allows Monte Carlo sampling in Thiele-Innes space to be concentrated (Sect. A.4) on the high probability domain enclosing the least-squares points $(\hat{A},\hat{B},\hat{F},\hat{G})$. Moreover, in each case, the random sampling does not require additional orbits to be computed.

The approach developed here is not specific to visual binaries or to the Thiele-Innes elements. In principle, an analogous procedure can be constructed for any partially linear estimation problem.

Online material

Appendix A: Statistics in Thiele-Innes space

In Appendix A of L14, formulae are derived for $\hat{\psi} = (\hat{A},\hat{B},\hat{F},\hat{G})$, the least squares Thiele-Innes constants at given φ = (log P,e,τ). The resulting value of χ² then determines the profile likelihood ℒ^† used in the approximation of posterior means – Eq. (7) of L14. Now we wish to sample points displaced from $\hat{\psi}$. Let such a displacement be δψ = (a,b,f,g).

Appendix A.1: Probability density function p(ψ | φ, D)

In Sect. A.4 of L14, the PDF at δψ is shown to be the product of two independent PDF’s, each a bivariate normal distribution, one for (a,f), the other for (b,g). Formulae for the variances $\sigma_a^{2}, \sigma_b^{2}, \sigma_f^{2}, \sigma_g^{2}$ and the covariances cov(a,f),cov(b,g) that define these PDF’s are given in Eqs. (A.9) and (A.10) of L14.

The PDF for (a,f) is

$p(a,f) = \frac{1}{2 \pi \sigma_{a} \sigma_{f} \sqrt{1-\rho_{af}^{2}}} \: \exp\left( -\frac{1}{2} \xi^{2}\right)$ (A.1)

where ρ_af = cov(a,f)/(σ_aσ_f) and

$\xi^{2} = \frac{1}{1-\rho_{af}^{2}} \left[ \left(\frac{a}{\sigma_{a}}\right)^{2} + \left(\frac{f}{\sigma_{f}}\right)^{2} - 2 \rho_{af} \left(\frac{a}{\sigma_{a}}\right)\left(\frac{f}{\sigma_{f}}\right) \right]$ (A.2)

The point (0,0) corresponds to the minimum-$\chi^{2}_{x}$ solution $(\hat{A},\hat{F})$, where $\chi^{2}_{x}$ is the x-coordinate contribution to χ²; see Eq. (2). The displacement (a,f) therefore results in a positive increment $\delta \chi^{2}_{x}$ given by

$\sigma^{2} \delta \chi^{2}_{x} = \sum_{n} w_{n} \left[ (x_{n}-\tilde{x}_{n})^{2} - (\hat{x}_{n}-\tilde{x}_{n})^{2} \right].$ (A.3)

Now, for displacement (a,f), the predicted x-coordinate is

$x_{n} = \hat{x}_{n} + a \, X_{n} + f \, Y_{n}$ (A.4)

where X,Y are given by Eqs. (A.3) in L14. Substitution in Eq. (A.3) then gives

$\sigma^{2} \delta \chi^{2}_{x} = \sum_{n} w_{n} \left( a^{2} X_{n}^{2} + f^{2} Y_{n}^{2} + 2 \, a f \, X_{n} Y_{n} \right)$ (A.5)

where terms linear in a and f vanish because the minimum is at (0,0). The summations in Eq. (A.5) can be eliminated in favour of $\sigma_{a}^{2}, \sigma_{f}^{2}$ and cov(a,f) with the formulae given in Eqs. (A.6), (A.9) and (A.10) of L14. After lengthy algebra, we find that

$\delta \chi^{2}_{x} = \frac{1}{1-\rho_{af}^{2}} \left[ \left(\frac{a}{\sigma_{a}}\right)^{2} + \left(\frac{f}{\sigma_{f}}\right)^{2} - 2 \rho_{af} \left(\frac{a}{\sigma_{a}}\right)\left(\frac{f}{\sigma_{f}}\right) \right]$ (A.6)

and so

$p(a,f) \: \propto \: \exp\left( -\frac{1}{2} \delta \chi^{2}_{x} \right).$ (A.7)

Exactly the same analysis applies to the independent pair (b,g), so that

$p(b,g) \: \propto \: \exp\left( -\frac{1}{2} \delta \chi^{2}_{y} \right).$ (A.8)

Combining these formulae, we find that the PDF at $\psi = \hat{\psi} + \delta \psi$ is

$p(\psi | \phi,D) = \frac{1}{4 \pi^{2} \sigma^{4} \eta} \: \exp\left( -\frac{1}{2} \delta \chi^{2} \right)$ (A.9)

where $\delta \chi^{2} = \delta \chi^{2}_{x} + \delta \chi^{2}_{y}$ and η(φ | D) is given by

$\sigma^{4} \eta = \sigma_{a} \sigma_{f} \sqrt{1-\rho_{af}^{2}} \times \sigma_{b} \sigma_{g} \sqrt{1-\rho_{bg}^{2}}$ (A.10)
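
As a quick numerical check of Eqs. (A.1) and (A.2), the following Python sketch evaluates ξ² both from Eq. (A.2) and as the quadratic form built from the inverse of the 2×2 covariance matrix of (a,f). The function names and the numerical values in the usage note are illustrative assumptions, not quantities taken from the paper.

```python
import math

def xi2(a, f, sigma_a, sigma_f, rho):
    """Quadratic form of Eq. (A.2) for a displacement (a, f)."""
    u, v = a / sigma_a, f / sigma_f
    return (u * u + v * v - 2.0 * rho * u * v) / (1.0 - rho * rho)

def pdf_af(a, f, sigma_a, sigma_f, rho):
    """Bivariate normal density of Eq. (A.1)."""
    norm = 2.0 * math.pi * sigma_a * sigma_f * math.sqrt(1.0 - rho * rho)
    return math.exp(-0.5 * xi2(a, f, sigma_a, sigma_f, rho)) / norm

def xi2_from_covariance(a, f, sigma_a, sigma_f, rho):
    """Same quantity written as v^T C^{-1} v, with C the covariance matrix."""
    caa = sigma_a ** 2
    cff = sigma_f ** 2
    caf = rho * sigma_a * sigma_f          # cov(a, f)
    det = caa * cff - caf * caf
    # The inverse of [[caa, caf], [caf, cff]] is [[cff, -caf], [-caf, caa]] / det.
    return (cff * a * a - 2.0 * caf * a * f + caa * f * f) / det
```

For example, with the assumed values σ_a = 0.5, σ_f = 0.4 and ρ_af = 0.6, `xi2(0.3, -0.2, 0.5, 0.4, 0.6)` and `xi2_from_covariance(0.3, -0.2, 0.5, 0.4, 0.6)` should agree to rounding error.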

Appendix A.2: Modified Thiele-Innes constants

The familiar device of “completing the square” applied to Eq. (A.2) suggests new variables ${\cal A}, {\cal F}$ defined by

$\frac{a}{\sigma_{a}} = {\cal A}, \;\;\; \frac{f}{\sigma_{f}} = \rho_{af} \, {\cal A} + \sqrt{1-\rho_{af}^{2}} \, {\cal F}.$ (A.11)

Substitution in Eq. (A.6) then gives

$\delta \chi^{2}_{x} = {\cal A}^{2} + {\cal F}^{2}.$ (A.12)

The Jacobian of this transformation is

$\frac{\partial (a,f)}{\partial ({\cal A}, {\cal F})} = \sigma_{a} \sigma_{f} \sqrt{1-\rho_{af}^{2}}$ (A.13)

so that, by conservation of probability, the PDF of $({\cal A}, {\cal F})$ is

$\Psi({\cal A},{\cal F}) = \frac{1}{2 \pi} \: \exp\left[ -\frac{1}{2} \left( {\cal A}^{2} + {\cal F}^{2} \right) \right].$ (A.14)

Now, exactly the same analysis applies to the independent pair (b,g). Thus, if we define new variables ${\cal B}, {\cal G}$ by the equations

$\frac{b}{\sigma_{b}} = {\cal B}, \;\;\; \frac{g}{\sigma_{g}} = \rho_{bg} \, {\cal B} + \sqrt{1-\rho_{bg}^{2}} \, {\cal G}$ (A.15)

then the PDF of $({\cal B},{\cal G})$ is $\Psi({\cal B},{\cal G})$, and

$\delta \chi^{2}_{y} = {\cal B}^{2} + {\cal G}^{2}.$ (A.16)

If ζ denotes the vector $({\cal A}, {\cal B}, {\cal F}, {\cal G})$, then Π(ζ), the PDF in ζ-space, is $\Psi({\cal A},{\cal F}) \times \Psi({\cal B},{\cal G})$, i.e.,

$\Pi(\zeta) = \frac{1}{4 \pi^{2}} \: \exp\left[ -\frac{1}{2} \left( {\cal A}^{2} + {\cal B}^{2} + {\cal F}^{2} + {\cal G}^{2} \right) \right].$ (A.17)

Accordingly, the distribution of probability in ζ-space is simply the product of four independent normal distributions, each with zero mean and unit variance.
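
The change of variables in Eq. (A.11) can be checked numerically. The sketch below, with hypothetical function names and test values, maps a displacement (a,f) to the modified constants and back, and confirms that the quadratic form of Eq. (A.6) collapses to the sum of squares in Eq. (A.12).

```python
import math

def to_modified(a, f, sigma_a, sigma_f, rho):
    """Invert Eq. (A.11): displacement (a, f) -> modified constants (calA, calF)."""
    cal_a = a / sigma_a
    cal_f = (f / sigma_f - rho * cal_a) / math.sqrt(1.0 - rho * rho)
    return cal_a, cal_f

def from_modified(cal_a, cal_f, sigma_a, sigma_f, rho):
    """Eq. (A.11): modified constants (calA, calF) -> displacement (a, f)."""
    a = sigma_a * cal_a
    f = sigma_f * (rho * cal_a + math.sqrt(1.0 - rho * rho) * cal_f)
    return a, f

def delta_chi2_x(a, f, sigma_a, sigma_f, rho):
    """Eq. (A.6): increment in the x-coordinate chi-square."""
    u, v = a / sigma_a, f / sigma_f
    return (u * u + v * v - 2.0 * rho * u * v) / (1.0 - rho * rho)

# Assumed illustrative values for the uncertainties and correlation.
sigma_a, sigma_f, rho = 0.5, 0.4, -0.3
cal_a, cal_f = to_modified(0.2, 0.1, sigma_a, sigma_f, rho)
sum_of_squares = cal_a ** 2 + cal_f ** 2            # Eq. (A.12)
quadratic_form = delta_chi2_x(0.2, 0.1, sigma_a, sigma_f, rho)
```

The same two functions apply unchanged to the pair (b,g) when σ_b, σ_g and ρ_bg are supplied instead.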

The increment in χ² is given by Eqs. (A.12) and (A.16) as

$\delta \chi^{2}(\zeta) = {\cal A}^{2} + {\cal B}^{2} + {\cal F}^{2} + {\cal G}^{2}.$ (A.18)

Appendix A.3: Approximate PDFs

A distribution of probability can be represented by a sum of δ functions in such a way that the probability attached to any finite element of space is approximated with arbitrary accuracy. Thus, the PDF giving the distribution of probability in ζ-space is given approximately by

$p(\zeta) = {\cal N}^{-1} \sum_{\ell} \delta(\zeta - \zeta_{\ell})$ (A.19)

where each ζ_ℓ is an independent random vector sampling the PDF Π(ζ) given by Eq. (A.17). The integral of p(ζ) over a finite element in ζ-space converges to the exact value as ${\cal N} \rightarrow \infty$.

Equations (A.11) and (A.15) transform the point ζ_ℓ into the displacement $\delta \psi_{\ell} = \psi_{\ell} - \hat{\psi}$. The corresponding approximate PDF in ψ-space is therefore

$p(\psi | \phi,D) = {\cal N}^{-1} \sum_{\ell} \delta(\psi - \psi_{\ell}).$ (A.20)

Note that the Jacobian of this transformation from ζ- to ψ-space is implicit in the changes in number densities of the delta functions in the respective spaces.

Similarly, the PDF for the Campbell elements ϑ corresponding to the Thiele-Innes elements ψ is

$p(\vartheta | \phi,D) = {\cal N}^{-1} \sum_{\ell} \delta(\vartheta - \vartheta_{\ell})$ (A.21)

where ϑ_ℓ = ϑ(ψ_ℓ) is derived as described in Sect. A.4 of L14.
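
The convergence property of the δ-function representation in Eq. (A.19) can be illustrated with a short Monte Carlo sketch. Here the finite element is taken, as an assumed example, to be the region δχ² ≤ 1 in ζ-space; its exact probability follows from the χ² distribution with four degrees of freedom, whose cumulative distribution is 1 − exp(−x/2)(1 + x/2). The sample size, seed and function names are arbitrary choices made for this illustration.

```python
import math
import random

def sample_zeta(rng):
    """One random vector zeta drawn from Pi(zeta) of Eq. (A.17)."""
    return tuple(rng.gauss(0.0, 1.0) for _ in range(4))

def prob_in_region(region, n_samples, rng):
    """Integral of the delta-function PDF (A.19) over a region:
    simply the fraction of sample points that fall inside it."""
    hits = sum(1 for _ in range(n_samples) if region(sample_zeta(rng)))
    return hits / n_samples

def inside_unit_sphere(zeta):
    """Region with delta chi^2 <= 1, using Eq. (A.18)."""
    return sum(z * z for z in zeta) <= 1.0

rng = random.Random(12345)
estimate = prob_in_region(inside_unit_sphere, 100_000, rng)
# Exact chi-square CDF for 4 degrees of freedom evaluated at x = 1.
exact = 1.0 - 1.5 * math.exp(-0.5)
```

With 10⁵ points the statistical error of `estimate` is of order 10⁻³, so the agreement with `exact` improves only as the inverse square root of the number of samples.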

Appendix A.4: Random sampling in ψ-space

According to Eq. (A.17), a random point in ζ-space is (z₁,z₂,z₃,z₄), where the z_i are independent random Gaussian variates drawn from ${\cal N}(0,1)$. This point corresponds to the displacement (a,b,f,g) given by Eqs. (A.11) and (A.15) and therefore to the point $(\hat{A}+a, \hat{B}+b, \hat{F}+f, \hat{G}+g)$ in ψ-space. Thus, a point randomly selected from the exact PDF p(ψ | φ,D) can be derived from four independent Gaussian variates, and the resulting increment in χ² is given by Eq. (A.18) as

$\delta \chi^{2}(\psi) = z_{1}^{2} + z_{2}^{2} + z_{3}^{2} + z_{4}^{2}$ (A.22)
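
A minimal Python sketch of this sampling step follows. It assumes that the minimum-χ² elements, the standard errors and the two correlation coefficients are already available for the current grid point; the function name, argument layout and any numerical values are hypothetical. The component order of ζ follows the text, (𝒜,ℬ,ℱ,𝒢).

```python
import math
import random

def sample_psi(psi_hat, sigmas, rho_af, rho_bg, rng):
    """Draw one point psi from p(psi | phi, D) and return (psi, delta_chi2).

    psi_hat = (A_hat, B_hat, F_hat, G_hat)
    sigmas  = (sigma_a, sigma_b, sigma_f, sigma_g)
    """
    # zeta = (calA, calB, calF, calG): four independent N(0,1) variates.
    z1, z2, z3, z4 = (rng.gauss(0.0, 1.0) for _ in range(4))
    sigma_a, sigma_b, sigma_f, sigma_g = sigmas
    # Eq. (A.11) for the pair (a, f).
    a = sigma_a * z1
    f = sigma_f * (rho_af * z1 + math.sqrt(1.0 - rho_af ** 2) * z3)
    # Eq. (A.15) for the pair (b, g).
    b = sigma_b * z2
    g = sigma_g * (rho_bg * z2 + math.sqrt(1.0 - rho_bg ** 2) * z4)
    a_hat, b_hat, f_hat, g_hat = psi_hat
    psi = (a_hat + a, b_hat + b, f_hat + f, g_hat + g)
    # Eq. (A.22): the increment in chi^2 needs no orbit computation.
    delta_chi2 = z1 * z1 + z2 * z2 + z3 * z3 + z4 * z4
    return psi, delta_chi2
```

A call such as `sample_psi((1.0, -2.0, 0.5, 3.0), (0.1, 0.2, 0.3, 0.4), 0.3, -0.5, random.Random(1))` returns one sampled ψ together with its δχ².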

Appendix A.5: Random sampling at fixed δχ²

A random point in ψ-space subject to a constraint on δχ² can be found by first selecting a random point on the 4D sphere in ζ-space defined by Eq. (A.18). This is achieved as follows: if z_i again denotes a Gaussian variate from ${\cal N}(0,1)$, then a random point on this hypersphere is (z₁,z₂,z₃,z₄)/Z, where

$Z^{2} = \left( z_{1}^{2} + z_{2}^{2} + z_{3}^{2} + z_{4}^{2} \right) / \delta \chi^{2}$ (A.23)

(Muller 1959). The corresponding point (A,B,F,G) in ψ-space is then derived from Eqs. (A.11) and (A.15).
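
The normalization in Eq. (A.23) can be sketched in a few lines of Python. The function name is hypothetical, and the guard against a non-positive δχ² is an added assumption rather than part of the procedure described above.

```python
import math
import random

def sample_zeta_fixed_dchi2(delta_chi2, rng):
    """Random point on the hypersphere in zeta-space with the given delta chi^2.

    Four N(0,1) variates are rescaled by Z of Eq. (A.23), so that the
    returned components satisfy Eq. (A.18) for the requested delta_chi2.
    """
    if delta_chi2 <= 0.0:
        raise ValueError("delta_chi2 must be positive")
    z = [rng.gauss(0.0, 1.0) for _ in range(4)]
    big_z = math.sqrt(sum(zi * zi for zi in z) / delta_chi2)   # Eq. (A.23)
    return tuple(zi / big_z for zi in z)

rng = random.Random(7)
zeta = sample_zeta_fixed_dchi2(1.0, rng)      # point on the delta chi^2 = 1 surface
radius_squared = sum(zi * zi for zi in zeta)
```

Because the Gaussian density depends only on the length of the vector, the rescaled points are uniformly distributed in direction; the components of `zeta` are then inserted in Eqs. (A.11) and (A.15) to obtain the displacement in ψ-space.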

The random sampling procedures of Sects. A.4 and A.5 predict χ² without the need to compute an orbit. This is achieved by exploiting the linearity of the ψ-elements and is the basis of the computational efficiency of the techniques of Sects. 3.3 and 4.1. However, during code development, this prediction should be tested by actually computing the orbit and independently evaluating χ² from Eq. (2).
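
The kind of consistency test recommended here can be illustrated on a toy problem. Computing a real orbit is beyond a short example, so the sketch below substitutes a model that is linear in (A,F), with made-up basis values X_n, Y_n, unit weights and synthetic data; none of these come from the paper. The standard errors and correlation are taken from the usual weighted least-squares covariance matrix, σ²(MᵀWM)⁻¹, which is assumed here to be equivalent to the formulae of L14. The predicted increment from Eq. (A.6) is then compared with the difference of two directly evaluated values of χ²_x.

```python
import math
import random

def normal_sums(X, Y, w):
    """Weighted sums entering the normal equations and Eq. (A.5)."""
    sxx = sum(wn * xn * xn for wn, xn in zip(w, X))
    syy = sum(wn * yn * yn for wn, yn in zip(w, Y))
    sxy = sum(wn * xn * yn for wn, xn, yn in zip(w, X, Y))
    return sxx, syy, sxy

def fit(X, Y, obs, w):
    """Weighted least-squares estimates (A_hat, F_hat) for x_n = A X_n + F Y_n."""
    sxx, syy, sxy = normal_sums(X, Y, w)
    bx = sum(wn * xn * on for wn, xn, on in zip(w, X, obs))
    by = sum(wn * yn * on for wn, yn, on in zip(w, Y, obs))
    det = sxx * syy - sxy * sxy
    return (syy * bx - sxy * by) / det, (sxx * by - sxy * bx) / det

def chi2_x(A, F, X, Y, obs, w, sigma):
    """Direct evaluation of the x-coordinate chi-square."""
    return sum(wn * (on - A * xn - F * yn) ** 2
               for wn, xn, yn, on in zip(w, X, Y, obs)) / sigma ** 2

def predicted_dchi2(a, f, X, Y, w, sigma):
    """Eq. (A.6), with sigma_a, sigma_f and rho_af from sigma^2 (M^T W M)^-1."""
    sxx, syy, sxy = normal_sums(X, Y, w)
    det = sxx * syy - sxy * sxy
    sigma_a = sigma * math.sqrt(syy / det)
    sigma_f = sigma * math.sqrt(sxx / det)
    rho = -sxy / math.sqrt(sxx * syy)
    u, v = a / sigma_a, f / sigma_f
    return (u * u + v * v - 2.0 * rho * u * v) / (1.0 - rho * rho)

# Hypothetical data: 12 epochs, unit weights, invented basis values.
rng = random.Random(2014)
sigma = 0.05
X = [math.cos(0.4 * n) - 0.3 for n in range(12)]
Y = [math.sin(0.4 * n) for n in range(12)]
w = [1.0] * 12
obs = [1.2 * xn - 0.7 * yn + rng.gauss(0.0, sigma) for xn, yn in zip(X, Y)]

A_hat, F_hat = fit(X, Y, obs, w)
a, f = 0.03, -0.02                      # trial displacement from the minimum
direct = (chi2_x(A_hat + a, F_hat + f, X, Y, obs, w, sigma)
          - chi2_x(A_hat, F_hat, X, Y, obs, w, sigma))
predicted = predicted_dchi2(a, f, X, Y, w, sigma)
```

For a model that is exactly linear in the sampled parameters, `direct` and `predicted` should differ only by rounding error; in an orbit code the same comparison would use the full χ² of Eq. (2).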

Acknowledgments

The issue of error underestimation in hybrid problems was raised by the referee of the previous paper (L14) and was the direct stimulus of this investigation. This same referee provided useful comments on this paper.

References

Dawid, A. P. 1982, J. Am. Stat. Assoc., 77, 605
Eastman, J., Gaudi, B., & Agol, E. 2013, PASP, 125, 83
Hartkopf, W. I., McAlister, H. A., & Franz, O. G. 1989, AJ, 98, 1014
James, F. 2006, Statistical Methods in Experimental Physics (Singapore: World Scientific Publishing Co.)
Lucy, L. B. 2014, A&A, 563, 126 (L14)
Muller, M. E. 1959, Comm. Assoc. Comp. Mach., 2, 19
Press, W. H., Teukolsky, S. A., Vetterling, W. T., & Flannery, B. P. 1992, Numerical Recipes, 2nd edn. (Cambridge: Cambridge Univ. Press)
Schaefer, G. H., Simon, M., Beck, T. L., Nelan, E., & Prato, L. 2006, AJ, 132, 2618

All Tables

Table 1

Coverage fractions f for confidence intervals from 10⁴ trials.


Table 2

Coverage fractions for credibility intervals from 10⁴ trials.


Table 3

Comparison of residuals δQ.


All Figures

Fig. 1

Approximate confidence intervals. The ensemble of orbit vectors $\theta_{ijk} = (\phi_{ijk}, \hat{\psi}_{ijk})$ with $\hat{\chi}^{2}_{ijk} < \chi^{2}_{\min} + 1$ is projected on to the (log a,i)-plane. The ± 1σ bounds for each coordinate are indicated. The 506 grid points φ_ijk define the domain ${\cal D}$.


Fig. 2

Accurate confidence limits. The ensemble of orbit vectors $\theta_{\ell} \in {\cal S}$, i.e., with $\chi^{2}_{\ell} = \chi^{2}_{\min} + 1$, is projected on to the (log a,i)-plane. The ± 1σ bounds for each coordinate are indicated. Each of the 506 grid points $\phi_{ijk} \in {\cal D}$ gives rise to ${\cal N} = 5$ points on ${\cal S}$.



