Mass estimates for visual binaries with incomplete orbits

L. B. Lucy

doi:10.1051/0004-6361/201322649

Home

All issues

Volume 563 (March 2014)

A&A, 563 (2014) A126

Full HTML

Free Access

Issue		A&A Volume 563, March 2014


Article Number		A126
Number of page(s)		10
Section		Celestial mechanics and astrometry
DOI		https://doi.org/10.1051/0004-6361/201322649
Published online		21 March 2014

A&A 563, A126 (2014)

Mass estimates for visual binaries with incomplete orbits

L. B. Lucy

Astrophysics Group, Blackett Laboratory, Imperial College London, Prince Consort Road, London SW7 2AZ, UK
e-mail: l.lucy@imperial.ac.uk

Received: 11 September 2013
Accepted: 12 December 2013

Abstract

The problem of estimating the total mass of a visual binary when its orbit is incomplete is treated with Bayesian methods. The posterior mean of a mass estimator is approximated by a triple integral over orbital period, orbital eccentricity and time of periastron. This reduction to 3D from the 7D space defined by the conventional Campbell parameters is achieved by adopting the Thiele-Innes elements and exploiting the linearity with respect to the four Thiele-Innes constants. The formalism is tested on synthetic observational data covering a variable fraction of a model binary’s orbit. The posterior mean of the mass estimator is numerically found to be unbiased when the data cover ≳40% of the orbit.

Key words: binaries: visual / stars: fundamental parameters / methods: statistical

© ESO, 2014

1. Introduction

Visual binaries are a fundamental source of data on stellar masses. An individual system provides definitive data if it has a precise parallax ϖand a high quality orbit yielding accurate values for the orbital period P and the semi-major axis a. The total mass of the binary in solar units is then given by Kepler’s third law $ℳ_{1} + ℳ_{2} = \frac{1}{ϖ^{3}} \frac{a^{3}}{P^{2}}$ $\begin{equation} {\cal M}_{1} + {\cal M}_{2} = \frac{1}{\varpi^{3}} \: \frac{a^{3}}{P^{2}} \end{equation}$ (1)where P is in years and a and ϖ are in arcseconds. However, for many long-period binaries, the accumulated measurements do not cover even one full orbit. The question therefore arises: can a high quality orbit be derived from such incomplete data?

In his classic monograph, Aitken (1918) gave the following answer: “in general, it is not worthwhile to compute the orbit of a double star until the observed arc not only exceeds 180°, but also defines both ends of the apparent ellipse”. Since the first observation will seldom coincide with either end point, Aitken’s criterion typically requires f_orb ≳ 0.75, where f_orb is the fraction of the orbit covered. Aitken also warns that when f_orb < 0.5 it will generally be possible to draw several very different ellipses each of which will satisfy the data about equally well.

Although definitive orbits are surely a prerequisite for definitive masses, Eggen (1967) noted that a³/P² is often reasonably well determined even when a and P are not, a conclusion he reached by compiling data on systems with multiple computed orbits. He reports (Eggen 1962) that “In some cases there have been drastic revaluations of the period of a given pair but there is then a compensating change in a and the resulting values of a³/P² suffer little alteration”. These same compensating changes in a and P also arise in the Monte Carlo search technique of Schaefer et al. (2006) − see their Fig. 13 for DF Tau.

Eggen’s examination of the historical record provides strong empirical evidence that useful masses can be obtained from incomplete orbits. But restricting the discussion to observed systems does not allow an assessment of the statistical properties of such mass estimates. Accordingly, in this paper, synthetic observations with f_orb < 1 are created for a binary with specified elements in order to test the accuracy of the derived total mass.

The problem of inferring masses from incomplete orbits will be particularly acute for data acquired by the Gaia satellite. Nurmi (2005) and Pourbaix (2011) estimate that ≳10⁷ visual binaries will be discovered. Many of these will have periods longer than the five years planned for the mission, and their orbits could perhaps be analysed by the technique developed in this paper.

2. Bayesian estimation

As noted already by Aitken (1918), when an orbit is incomplete, several theoretical orbits will provide satisfactory fits. Two possible responses to this circumstance are: 1) scan parameter space to find the global minimum (e.g., Hartkopf et al. 1989) and assume that this orbit is the closest achievable approximation to the truth; or 2) scan parameter space to make a complete census of acceptable orbits and then compute an appropriate average of the individual mass determinations (e.g., Schaefer et al. 2006). However, a Bayesian approach is preferred here. Specifically, by scanning parameter space, the posterior means of the orbital elements or any function thereof can be computed without locating minima.

2.1. Orbital elements

The standard orbital elements describing a secondary’s motion relative to its primary are the Campbell elements θ = (P,T,e,a,i,ω,Ω). In addition to P and a previously defined, T is a time of periastron passage, e is the eccentricity, i is the inclination, ω is the longitude of periastron, and Ω is the position angle of the ascending node. However, for scanning parameter space, it is convenient to define an alternative to T. Noting that T can be replaced by T ± nP, where n is any integer, we choose T ∈ (0,P) and then define the alternative element τ = T/P ∈ (0,1).

The above Campbell elements are standard. But an alternative set, the Thiele-Innes elements (Appendix A), simplify Bayesian integrations. In these elements, the vector (a,i,ω,Ω) is replaced by ψ = (A,B,F,G), so that the 7-element vector of elements is now (φ,ψ), where φ = (P,e,τ).

Numerous authors have preferred the Thiele-Innes elements for conventional least-squares determinations of orbital parameters for both visual binaries and exoplanets (e.g., Hartkopf et al. 1989; Pourbaix et al. 2002; Schaefer et al. 2006; Casertano et al. 2008; Wright & Howard 2009). This paper argues that these elements are also advantageous for Bayesian estimation of orbital parameters.

2.2. Posterior mean

Let Q(θ) be a quantity whose value would normally be computed after deriving a definitive orbit. But now in anticipation of there being no single definitive orbit, we instead compute Q’s posterior mean, given by $⟨ Q ⟩ = \int Q ℒ p d φ d ψ / \int ℒ p d φ d ψ .$ $\begin{equation} \langle Q \rangle = \int{Q \: {\cal L} \: p \: {\rm d} \phi {\rm d} \psi} \: / \int{{\cal L} \: p \: {\rm d} \phi {\rm d}\psi.} \end{equation}$ (2)Here ℒ(φ,ψ|D) is the likelihood of the elements (φ,ψ) given the data D, and p(φ,ψ) is a probability density function (pdf) quantifying our prior beliefs or knowledge about the orbital elements. In this problem, D comprises the secondary’s measured relative Cartesian sky coordinates $(\begin{matrix} _{˜} \\ x_{n} \end{matrix}, \begin{matrix} _{˜} \\ y_{n} \end{matrix})$ $\hbox{$(\tilde{x}_{n},\tilde{y}_{n})$}$ at times t_n.

As remarked in Sect. 2.1, the Thiele-Innes elements simplify the integrations in Eq. (2). For fixed φ = (P,e,τ), the family of Keplerian orbits is linear in A,B,F,G. Accordingly, their least-squares values $\hbox{$\hat{A},\hat{B},\hat{F},\hat{G}$}$ can be computed without iteration − see Sect. A.2. (Note that with normally-distributed errors the least-squares solution is also the point of maximum likelihood (ML).)

The second simplification concerns topology. When f_orb ≲ 0.5, several orbits fit the data (Sect. 1). This implies severe topological complexity for ℒ in the 7D (φ,ψ)-space. However, at each point in φ-space, the 4D function $\hbox{${\cal L}(\psi) / {\cal L}(\hat{\psi})$}$ is a quadrivariate normal distribution (Sect. A.3), which therefore has a single peak at $\hbox{$\hat{\psi} = (\hat{A},\hat{B},\hat{F},\hat{G})$}$ , the point of ML.

This topological simplification can be used in estimating the 7D integrals in Eq. (2). To achieve this, we approximate the quadrivariate normal distribution by the delta function $\hbox{$\delta(\psi - \hat{\psi})$}$ , so that $ℒ (φ,ψ | D) = ℒ^{†} δ (ψ - ψ̂)$ $\begin{equation} {\cal L}(\phi, \psi | D) = {\cal L}^{\dag} \: \delta(\psi - \hat{\psi}) \end{equation}$ (3)where $ℒ^{†} = ℒ (φ, ψ̂ | D) .$ $\begin{equation} {\cal L}^{\dag} = {\cal L} (\phi, \hat{\psi} | D) . \end{equation}$ (4)Thus, for each point in φ-space, ℒ^† is the value of ℒ at $\hbox{$\hat{\psi} = \hat{\psi}(\phi)$}$ , the ML point in ψ-space.

If we now replace ℒ in Eq. (2) by the approximation given in Eq. (3) and integrate over ψ-space, we obtain $⟨ Q ⟩ = \int Q^{†} ℒ^{†} p^{†} d φ / \int ℒ^{†} p^{†} d φ$ $\begin{equation} \langle Q \rangle \: = \int Q^{\dag} \: {\cal L}^{\dag} \: p^{\dag} \: {\rm d} \phi \: / \int{\cal L}^{\dag} \: p^{\dag} \: {\rm d} \phi \end{equation}$ (5)where the superscript † indicates evaluation at $\hbox{$(\phi,\hat{\psi})$}$ as in Eq. (4).

On the assumption of normally-distributed measurement errors and with constants of proportionality omitted, $ℒ^{†} = \exp (- \frac{1}{2} χ^{2})$ $\begin{equation} {\cal L}^{\dag} = \exp \: \left( - \frac{1}{2} \chi^{2} \right) \end{equation}$ (6)where $\hbox{$\chi^{2}(\phi,\hat{\psi})$}$ is the conventional goodness-fit criterion − see Eq. (A.5) − at φ = (P,e,τ).

In the statistics literature ℒ^† is the profile likelihood (e.g., Severini 2000). This function is used in high energy physics for evaluating the significance of particle detections (e.g., Ranucci 2012).

2.3. Priors

In order to calculate the approximate posterior mean ⟨Q⟩ from Eq. (5), $\hbox{$p^{\dag} = p(\phi,\hat{\psi})$}$ must be specified. Guided by common practice in Bayesian exoplanet detection (e.g., Ford & Gregory 2007), we assume the elements to have independent priors, so that p^† is the product of seven independent functions. The priors on the ψ elements are taken to be uniform and so cancel between numerator and denominator in Eq. (5).

For the φ elements e and τ, the chosen priors are uniform in (0,1). For P, we adopt a Jeffreys prior − i.e., log P uniform in (log P_L,log P_U).

2.4. Numerical integration

The integrals in Eq. (5) are 3D as against 7D in Eq. (2). This reduction allows ⟨Q⟩ to be evaluated simply by computing the integrands at every point in a 3D grid and then summing. Without this reduction, a Markov chain Monte Carlo (MCMC) calculation would perhaps be required. Direct summation would, however, still be feasible if regions of negligible likelihood were efficiently excluded (Mikkelsen et al. 2012).

The integration domain in (log P,e,τ)-space is taken to be (log P_L,log P_U) for log P, and (0,1) for both e and τ. This domain is partitioned into a 3D grid with constant steps for each variable. The quantity Q is then evaluated at the mid-points of the cells labelled (i,j,k) with i ∈ (1,I), j ∈ (1,J), k ∈ (1,K). The resulting formula for the posterior mean is $⟨ Q ⟩ = Σ_{ijk} Q_{ijk}^{†} ℒ_{ijk}^{†} / Σ_{ijk} ℒ_{ijk}^{†} .$ $\begin{equation} \langle Q \rangle = \Sigma_{ijk} \: Q_{ijk}^{\dag} \: {\cal L}_{ijk}^{\dag} / \: \Sigma_{ijk} \: {\cal L}_{ijk}^{\dag}. \end{equation}$ (7)The priors adopted in Sect. 2.3 are implicitly incorporated since the cells are weighted equally.

3. Calculation of ⟨Q⟩

In this section, the calculation of ⟨Q⟩ is described step-by-step.

3.1. Model binary

The model binary has the following Campbell elements:

$\begin{matrix} P_{*} = 100 y τ_{*} = 0.4 e_{*} = 0.5 a_{*} = 1 ″ \\ i_{*} = 60 ° ω_{*} = 250 ° Ω_{*} = 120 ° . \end{matrix}$ $\begin{eqnarray} && P_{*}=100y \;\;\; \tau_{*}=0.4 \;\;\; e_{*}=0.5 \;\;\; a_{*}=1\arcsec \nonumber \\ && {\rm i}_{*}=60\degr \;\;\; \omega_{*} = 250\degr \;\;\; \Omega_{*} = 120\degr. \end{eqnarray}$ (8)From Eqs. (A.1), the Thiele-Innes elements corresponding to the above values of a_∗,i_∗,ω_∗,Ω_∗ are $\begin{matrix} A_{*} = + 0 .̋ 578 B_{*} = - 0 .̋ 061 \\ F_{*} = - 0 .̋ 322 G_{*} = + 0 .̋ 899. \end{matrix}$ $\begin{eqnarray} && A_{*} = +0\farcs578 \;\;\;\;\; B_{*} = -0\farcs061 \nonumber \\ && F_{*} = -0\farcs322 \;\;\;\;\; G_{*} = +0\farcs899. \end{eqnarray}$ (9)

3.2. Synthetic data

Given these elements, the binary is “observed” at times $t_{n} = f_{orb} P_{*} \times (n - 1) / (N - 1)$ $\begin{equation} t_{n} = f_{\rm orb} P_{*} \times (n-1)/(N-1) \end{equation}$ (10)for n = 1,2,...,N, so that the fraction f_orb of the orbit is uniformly sampled. At each t_n, the secondary’s exact coordinates (x_n,y_n) are computed from Eqs. (A.2)−(A.4). The measured positions are then $\begin{matrix} ˜ \\ x_{n} \end{matrix} = x_{n} + σ z_{G} \begin{matrix} ˜ \\ y_{n} \end{matrix} = y_{n} + σz G$ $\begin{equation} \tilde{x}_{n} = x_{n} + \sigma \: z_{\rm G} \;\;\;\; \tilde{y}_{n} = y_{n} + \sigma \: z_{\rm G} \end{equation}$ (11)where each z_G is a random Gaussian variate drawn from $\hbox{${\cal N}(0,1)$}$ and σ is the standard error of unit weight. The vector (f_orb,N,σ) defines an observing campaign.

Uniform sampling is here assumed for simplicity: it is not required by the Bayesian technique. Random sampling, the obvious alternative, is an inferior model for extant ground-based data on long-period binaries where typically an observation was made every observing season.

3.3. Calculation procedure

Given positions $(\begin{matrix} _{˜} \\ x_{n} \end{matrix}, \begin{matrix} _{˜} \\ y_{n} \end{matrix})$ $\hbox{$(\tilde{x}_{n},\tilde{y}_{n})$}$ , a posterior mean ⟨Q⟩ is computed as follows:

1)
The grid parameters I,J,K and P_L,P_U are specified.
2)
At each grid point, the vector φ_ijk = (P_i,e_j,τ_k) follows from the stepping procedure described in Sect. 2.4.
3)
Given φ_ijk, the least-squares (ML) values of the Thiele-Innes elements $\hbox{$\hat{\psi} = (\hat{A},\hat{B},\hat{F},\hat{G})$}$ are computed from Eq. (A.7).
4)
Given $\hbox{$\hat{\psi}$}$ , the predicted positions (x_n,y_n) are computed from Eqs. (A.2)−(A.4) and the resulting $χ_{ijk}^{2}$ $\hbox{$\chi^{2}_{ijk}$}$ derived from Eq. (A.5).
5)
Given $χ_{ijk}^{2}$ $\hbox{$\chi^{2}_{ijk}$}$ , the profile likelihood $ℒ_{ijk}^{†}$ $\hbox{${\cal L}^{\dag}_{ijk}$}$ is given by Eq. (6).
6)
Given $\hbox{$\hat{\psi}$}$ , the corresponding Campbell elements are computed from Eqs. (A.11)−(A.15) and used to evaluate Q_ijk = Q(θ_ijk).
7)
Given Q_ijk and $ℒ_{ijk}^{†}$ $\hbox{${\cal L}^{\dag}_{ijk}$}$ throughout the grid, ⟨Q⟩ is obtained from Eq. (7).

Note that steps 1)−4) also arise in the parameter search techniques of Hartkopf et al. (1989) and Schaefer et al. (2006).

4. Feasible orbits

In this section, we exploit the scanning of parameter space required by the calculation of ⟨Q⟩ to relate the findings of Aitken and Eggen (Sect. 1) to the behaviour of ℒ^†(φ|D).

4.1. A particular case

With orbital parameters from Eq. (8) and campaign parameters $\hbox{$f_{\rm orb} = 0.4, N = 15, \sigma = 0\farcs05$}$ , synthetic data are calculated from Eq. (11). Then, with grid parameters P_L = 0.7f_orbP_∗, P_U = 300P_∗, I = 800,J = K = 200, the values $χ_{ijk}^{2}$ $\hbox{$\chi^{2}_{ijk}$}$ at all grid points are determined as described in Sect. 3.3.

If at grid point (i,j,k), $P (χ^{2} > {χ_{ijk}^{2}}^{)} > 0.05$ $\begin{equation} P\left(\chi^{2} > \chi^{2}_{ijk}\right) > 0.05 \end{equation}$ (12)the elements θ_ijk are deemed to represent a feasible orbit, and the ensemble of such orbits define the feasible domain(s) $\hbox{${\cal D}$}$ in (log P,e,τ)-space.

In Fig. 1, $\hbox{${\cal D}$}$ is projected onto the (log P,e) plane. Thus a filled circle appears at the point (log P_i,e_j) if Eq. (12) is fulfilled for at least one τ_k. From this plot, we see that for this poorly-observed incomplete orbit, there is an extended domain of feasible orbits with P ranging from 40 to 5000y and e from 0.19 to >0.99.

Because $\hbox{${\cal D}$}$ extends to the grid’s upper boundary at e = 1, a more general treatment would, in this particular case, show that some parabolic and even hyperbolic orbits are feasible − i.e., fit the data. By restricting the analysis to elliptical orbits, unbound orbits are excluded. But this is appropriate for the practical problem of concern: an incomplete observed orbit is highly unlikely to be a close encounter of an unbound pair.

Fig. 1

Feasible domain $\hbox{${\cal D}$}$ in (log P,e)-space for the test binary defined by Eq. (8) when observed in a campaign with parameters $\hbox{$f_{\rm orb} = 0.4, N = 15, \sigma = 0\farcs05$}$ . The true values log P_∗,e_∗ are indicated as are the coordinates of the orbits plotted in Fig. 2.

In Fig. 2, the true orbit for this particular case is plotted together with the measurements $(\begin{matrix} _{˜} \\ x_{n} \end{matrix}, \begin{matrix} _{˜} \\ y_{n} \end{matrix})$ $\hbox{$(\tilde{x}_{n},\tilde{y}_{n})$}$ . Also shown are three feasible orbits indicative of the range seen in Fig. 1. The Campbell elements of these are: $\begin{matrix} P = 47.0 yT = 41.0 ye = 0.79 a = 0 .̋ 87 \\ i = 62 .̊ 3 ω = 273 ° Ω = 157 ° \end{matrix}$ $\begin{eqnarray} P=47.0y \;\;\; T = 41.0y \;\;\; e = 0.79 \;\;\; a = 0\farcs87 \nonumber \\ {\rm i} = 62\fdg3 \;\;\; \omega = 273\degr \;\;\; \Omega = 157\degr \end{eqnarray}$ (13)for orbit I; $\begin{matrix} P = 98.7 yT = 38.3 ye = 0.19 a = 0 .̋ 85 \\ i = 51 .̊ 1 ω = 243 ° Ω = 98 ° \end{matrix}$ $\begin{eqnarray} P=98.7y \;\;\; T = 38.3y \;\;\; e = 0.19 \;\;\; a = 0\farcs85 \nonumber \\ {\rm i} = 51\fdg1 \;\;\; \omega = 243\degr \;\;\; \Omega = 98\degr \end{eqnarray}$ (14)for orbit II; and $\begin{matrix} P = 4423.8 yT = 33.2 ye = 0.95 a = 15 .̋ 2 \\ i = 73 .̊ 9 ω = 207 ° Ω = 110 ° \end{matrix}$ $\begin{eqnarray} P=4423.8y \;\;\; T = 33.2y \;\;\; e = 0.95 \;\;\; a = 15\farcs2 \nonumber \\ {\rm i} = 73\fdg9 \;\;\; \omega = 207\degr \;\;\; \Omega = 110\degr \end{eqnarray}$ (15)for orbit III.

Fig. 2

Feasible orbits. Orbits I, II, III fit the measurements (filled circles) with an acceptable χ². The bold curve is the true orbit with elements given in Eq. (8).

Figure 2 illustrates and confirms Aitken’s warning (Sect. 1) that when f_orb < 0.5 several orbits will fit the data. Similar plots for real binaries are given by Schaefer et al. (2006).

Orbit III merits further comment. This orbit is nearly parabolic and has its periastron during the observing campaign. Thus, the observations were taken in time interval (0,40y) and periastron occurs at T = 33.2y. If this were the true orbit with P = 4423.8y, we would count ourselves fortunate to catch a periastron passage in such a short campaign. This is a generic feature of fitting orbits to short observed arcs: the orbit is not constrained when not observed and so can balloon out to large angular separations. This family of feasible orbits violate the Copernican Principle since the observer is inferred to be at a special epoch. Although this possibly justifies their exclusion, these orbits are retained in the calculation of posterior means.

4.2. Eggen’s effect

To investigate Eggen’s empirical discovery that even for poor orbits the ratio a³/P² is relatively reliable, the ensemble of feasible orbits plotted in Fig. 1 is now projected onto the (log P,log a) plane: Fig. 3. From this plot, we see a high density of points along the line $\log \frac{a}{a_{*}} = \frac{2}{3} \log \frac{P}{P_{*}} \cdot$ $\begin{equation} \log \: \frac{a}{a_{*}} = \frac{2}{3} \: \log \: \frac{P}{P_{*}}\cdot \end{equation}$ (16)Thus, despite ranging in period from 40 to 5000y, the values of a³/P² are relatively concentrated. This effect becomes even more striking with improved campaigns having f_orb = 0.5 and 0.6 (see Fig. 3).

These experiments indicate that with the chosen (N,σ), the feasible orbits are:

1): closely confined to (a_∗,P_∗) when f_orb ≳ 0.6;

2): dispersed widely in P but narrowly in a³/P² when 0.5 ≳ f_orb ≳ 0.4;

3): dispersed widely in P and increasingly so in a³/P² when f_orb ≲ 0.4.

These results provide strong theoretical support for Eggen’s discovery and for his use of the derived masses in studies of the mass-luminosity relation.

This confirmation derives from investigating the domain of high likelihood in (log P,e,τ)-space. Since this is the domain in which high weight is assigned to Q when its posterior mean is calculated, Eggen’s effect is evidently incorporated when this formalism is used to estimate the binary’s mass. Accordingly, Fig. 3 implies that we can anticipate useful results in this standard case provided that f_orb ≳ 0.4.

The explanation of Eggen’s effect is that as soon as secondary’s relative motion reveals a significant departure from rectilinear motion (f_orb ≪ 1) then acceleration has been detected and this is ∝ (ℳ₁ + ℳ₂) ∝ a³/P². As f_orb increases, the uncertainty due to orientation factors decreases and so a³/P² is well-determined before the orbit is complete − see also Heintz (1978) and Schaefer et al. (2006).

Fig. 3

Eggen’s effect. Feasible domains $\hbox{${\cal D}$}$ in the (log P,log a) plane. Results for three campaigns with f_orb = 0.4,0.5,0.6 are shown, each with $\hbox{$N = 15, \sigma = 0\farcs05$}$ . The f_orb = 0.5 and 0.6 domains are displaced downwards by 0.5 and 1.0 dex, respectively. The plotted lines have slope 2/3 and pass through the points (2.0,0.0),(2.0, −0.5),(2.0, −1.0), each corresponding to (log P_∗,log a_∗).

In addition to the linear trends, Fig. 3 shows a dramatic increase in the volume of $\hbox{${\cal D}$}$ as f_orb decreases from 0.6 to 0.4, supporting Aitken’s warning about the onset of multiple acceptable orbits. Note that this increase in volume is a generic effect that is not sensitive to the choice of 0.05 in Eq. (12).

This section has shown that the insights of Aitken and Eggen from 50−100 years ago find their modern explanation in how the topology of the likelihood function ℒ^†(φ|D) responds to changes in the observational data D.

5. Mass estimates

The analysis of Sects. 2 and 3 is now applied to simulated partial orbits for the standard visual binary. Results with variations of e and i are reported in Appendix C.

5.1. Mass estimator

At the grid point (i,j,k), the φ elements are (P_i,e_j,τ_k), and the $\hbox{$\hat{\psi}$}$ elements are the least-squares values $\hbox{$(\hat{A},\hat{B},\hat{F},\hat{G})$}$ . From the latter, the remaining Campbell elements $\hbox{$(\hat{a},\hat{{\rm i}},\hat{\omega},\hat{\Omega})$}$ are computed as described in Sect. A.4, and we denote this semi-major axis as â_ijk.

Now if ℳ_∗ is the true total mass of the model binary with orbit defined by Eq. (8), the inferred mass with the orbit corresponding to the point (i,j,k) is $ℳ_{ijk} = ℳ_{*} {(\frac{â ijk}{a_{*}})}^{3} {(\frac{P_{i}}{P_{*}})}^{-2} \cdot$ $\begin{equation} {\cal M}_{ijk} = {\cal M}_{*} \left (\frac{\hat{a}_{ijk}}{a_{*}} \right)^{3} \left(\frac{P_{i}}{P_{*}} \right)^{-2}\cdot \end{equation}$ (17)The natural choice for the mass variable to be substituted in Eq. (7) is therefore ℳ_ijk. However, numerical experiments demonstrate that ⟨log ℳ⟩ is more accurate than log ⟨ℳ⟩. Accordingly, we take $Q_{ijk}^{†} = \log \frac{ℳ_{ijk}}{ℳ_{*}} = 3 \log \frac{â ijk}{a_{*}} - 2 \log \frac{P_{i}}{P_{*}} \cdot$ $\begin{equation} Q_{ijk}^{\dag} = \log \frac{ {\cal M}_{ijk}}{ {\cal M}_{*}} = 3 \: \log \frac{\hat{a}_{ijk}}{a_{*}} - 2 \: \log \frac{P_{i}}{P_{*}}\cdot \end{equation}$ (18)Exploiting the freedom to set ℳ_∗ = 1, we now substitute Eq. (18) into Eq. (7) to obtain the posterior mean ⟨log ℳ⟩, where ℳ is the ratio of the inferred to the exact mass. Thus, if the exact mass is recovered, then ⟨log ℳ⟩ = 0.

5.2. Numerical experiments

In these experiments, the true orbit is again given by Eq. (8). In the first sequence of campaigns, N = 15 and $\hbox{$\sigma = 0\farcs05$}$ as in Sect. 4, but now f_orb is treated as a continuous variable, thus investigating the deterioration of mass estimates as f_orb → 0. Note that when f_orb changes so does the seed for the random number generator. The pattern of Gaussian variates in Eq. (11) is therefore never repeated.

With steps of 0.01 in f_orb, the values of ⟨log ℳ⟩ are plotted in Fig. 4. We see that for f_orb ≳ 0.40, the mass estimates scatter about the exact value ℳ = 1 with errors <0.1 dex. However, for f_orb ≲ 0.40, the scatter increases sharply, with the error first exceeding 0.1 dex at f_orb = 0.38. These results are consistent with the implications of Fig. 3.

Accuracy should improve with more data or data of higher precision − i.e, with decreasing precision parameter $η = σ / \sqrt{N}$ $\hbox{$\eta = \sigma/\sqrt{N}$}$ . This is investigated by reducing σ from $\hbox{$0\farcs05$}$ to $\hbox{$0\farcs005$}$ while keeping N = 15. This second sequence is plotted in Fig. 4 as filled circles. As expected, the quality of the estimates dramatically inproves: the errors are <0.012 dex for f_orb ≥ 0.36, but increase sharply thereafter. Note that decreasing η by a factor of 10 has only slightly extended the domain of reliable solutions − the error first exceeds 0.1 dex at f_orb = 0.34.

These experiments demonstrate that ⟨log ℳ⟩ provides a seemingly unbiased estimate of a visual binary’s total mass even when standard orbit fitting generates multiple acceptable solutions with a large range of orbital periods.

Fig. 4

Mass estimator. The posterior mean of log ℳ is plotted against f_orb. The campaigns with $\hbox{$N = 15, \sigma = 0\farcs05$}$ are plotted as open circles; those with $\hbox{$N = 15, \sigma = 0\farcs005$}$ as filled circles. The dashed lines indicate errors of ±0.1 dex.

5.3. Posterior pdf of log ℳ

For a real binary with an incomplete orbit, f_orb is unknown. Moreover, perhaps counterintuitively, it cannot even be reliably estimated. Figure 2 and Eqs. (13)−(15) illustrate this: for orbit I, f_orb = 0.85; for orbit II, f_orb = 0.41; and for orbit III, f_orb = 0.009, as against the true value 0.4. It follows that an investigator cannot judge the accuracy of ⟨log ℳ⟩ from an estimate of f_orb. Instead, the posterior pdf of log ℳ should be derived and credible intervals computed.

Inspection of Eq. (7) shows that the implied approximation to the posterior pdf is $Θ (Q | D) = Σ_{ijk} W_{ijk} δ (Q - {Q_{ijk}^{†}}^{)}$ $\begin{equation} \Theta(Q|D) = \Sigma_{ijk} W_{ijk} \delta\left(Q-Q_{ijk}^{\dag}\right) \end{equation}$ (19)where δ is the delta function, and $W_{ijk} = ℒ_{ijk}^{†} / Σ_{ijk} ℒ_{ijk}^{†} .$ $\begin{equation} W_{ijk} = {\cal L}_{ijk}^{\dag} / \: \Sigma_{ijk} \: {\cal L}_{ijk}^{\dag}. \end{equation}$ (20)From these formulae, we immediately find $\int Θd Q = 1 and \int Q Θd Q = ⟨ Q ⟩$ $\begin{equation} \int \Theta \: {\rm d}Q =1 \;\; {\rm and} \;\; \int Q \: \Theta \: {\rm d}Q = \langle Q \rangle \end{equation}$ (21)where ⟨Q⟩ is the approximation given by Eq. (7).

A histogram representation of the pdf Θ(log ℳ|D) is obtained by convolving Eq. (19) with a top-hat function. Examples are plotted in Fig. 5 for f_orb = 0.4 and 0.5. These show the loss of precision with decreasing f_orb that was expected from Fig. 3 and seen in Fig. 4.

The equivalent of 1σ Gaussian error bars can be computed from these pdf’s by finding the equal area tails that together comprise 31.7% of the probability. Thus, we find $⟨ \log ℳ ⟩ = 0.01 2_{-0.019}^{+ 0.019}$ $\hbox{$\langle \log {\cal M} \rangle = 0.012^{+0.019}_{-0.019}$}$ for f_orb = 0.5 and $= 0.06 8_{-0.080}^{+ 0.077}$ $\hbox{$ = 0.068^{+0.077}_{-0.080}$}$ for f_orb = 0.4.

Because the pdf Θ(log ℳ|D) is approximately Gaussian (see Fig. 5) the above error bars can with sufficient accuracy be replaced by the standard deviation of the pdf, given by $s_{Q}^{2} = Σ_{ijk} W_{ijk} {(Q_{ijk}^{†} - ⟨ Q ⟩^{)}}^{2} .$ $\begin{equation} s^{2}_{Q} = \Sigma_{ijk} W_{ijk} \left(Q_{ijk}^{\dag} - \langle Q \rangle \right)^{2}. \end{equation}$ (22)This formula gives s_log ℳ = 0.019 and 0.085 for f_orb = 0.5 and 0.4, respectively. Evidently, the dramatic loss in precision when the orbital coverage is insufficient is apparent from s_log ℳ even without knowing f_orb.

Fig. 5

Posterior pdf Θ(log ℳ|D) for f_orb = 0.4 and 0.5 both with $\hbox{$N = 15,\: \sigma = 0\farcs05$}$ . The vertical dashed lines are the posterior means ⟨log ℳ⟩.

5.4. Bias

For f_orb ≳ 0.4, Fig. 4 shows no evidence that the estimator ⟨log ℳ⟩ is biased. But for f_orb ≲ 0.4, any bias is obscured by the dramatic increase in scatter. Accordingly, since quantifying the onset of bias is of interest in understanding the limits of this Bayesian approach, further calculations are now reported.

In order to beat down the noise, the campaigns with $\hbox{$N=15, \: \sigma = 0\farcs05$}$ are repeated n = 100 times for selected values of f_orb. The average and variance $Q̅ = \frac{1}{n} Σ_{i} ⟨ Q ⟩_{i} s^{2} = \frac{1}{n - 1} Σ_{i} (⟨ Q ⟩_{i} - Q̅) 2$ $\begin{equation} \bar{Q} = \frac{1}{n} \Sigma_{i} \langle Q \rangle _{i} \;\;\; s^{2} =\frac{1}{n-1} \Sigma_{i} \left(\langle Q \rangle _{i} -\bar{Q}\right)^{2} \end{equation}$ (23)of the ⟨Q⟩ _i are computed, so that the resulting estimate is $Q̅ \pm s / \sqrt{n}$ $\hbox{$\bar{Q} \pm s/\sqrt{n}$}$ , and this indicates bias if it differs significantly from the exact value of Q.

Figure 6 presents evidence of bias for the estimator ⟨log (ℳ)⟩. The average values of ⟨log (ℳ)⟩ and their standard deviations from 100 independent realisations of the orbits are plotted for f_orb = 0.30(0.05)0.95. There is no evidence of bias for f_orb ≥ 0.6, but strong evidence for f_orb ≤ 0.4 where the displacements are >3.5s. Similar plots are given in Appendix B for the posterior means of the Campbell elements.

The onset of bias coincides with the onset of scatter (see Fig. 4) at which point accurate mass determinations are no longer possible. Nevertheless, from a technical standpoint, it is of interest to understand the origin of bias. Comparison of Figs. 2, 3, and 6 shows that the onset of bias corresponds to the dramatic increase in the volume of $\hbox{${\cal D}$}$ , and this implies that the choice of priors becomes increasingly important. Accordingly, since the expansion of $\hbox{${\cal D}$}$ brings in orbits of high eccentricity (Fig. 2), we might seek to penalize such orbits via the prior on e. But this would be an ad hoc fix for this particular model binary. A fundamental solution probably invokes the Copernican Principle (Sect. 4.1) to reduce the weight assigned to highly eccentric orbits.

Fig. 6

Bias of the posterior mean ⟨log (ℳ)⟩. The averages and standard errors from Eq. (23) plotted against f_orb with $\hbox{$N=15, \sigma = 0\farcs05$}$ . The dashed line is the exact value.

5.5. Coverage fractions

When the estimate of a parameter derived by least-squares analysis is reported as $\hbox{$\bar{Q} \pm \sigma_{Q}$}$ , we automatically interpret this as an assertion that there is a 68.3% probability that the true answer lies in the interval $\hbox{$(\bar{Q} - \sigma_{Q}, \bar{Q} + \sigma_{Q})$}$ . Implicit in this interpretation, however, are the following assumptions: 1) normally-distributed measurement errors; 2) the validity of the linearization used in deriving the normal equations; and 3) that the number of data points N greatly exceeds M, the number of parameters estimated. If one or more these assumptions does not hold then the above interval’s coverage fraction will differ from 0.683.

In Bayesian statistics, credibility intervals derived from the posterior pdf replace the confidence intervals of frequentist statistics, and the different meanings of these intervals has been topic of much discussion in the statistical literature over many years. One approach to reconciling this aspect of the Bayesian-frequentist debate is that of calibrated Bayes (e.g., Dawid 1982; Little 2006): a credible interval is well-calibrated if the enclosed probability matches the fraction of times that it contains the true value.

In order to test whether credible intervals calculated with this formalism for visual binaries are well-calibrated, 1000 independent realizations of two campaigns with $\hbox{$N=15, \sigma = 0\farcs05$}$ have been analysed, one with f_orb = 0.60, the other with f_orb = 0.95. For each realization, we check if Q_exact is in the intervals (⟨Q⟩ −ms_Q, ⟨Q⟩ + ms_Q) for m = 1,2, where ⟨Q⟩ is given by Eq. (7) and s_Q by Eq. (22).

The results of this exercise are in Table 1. The first row gives the desired fraction ℰ(f)− i.e., the fraction expected for an ideal frequentist analysis. The remaining rows give the results for the posterior means of the Campbell elements and log ℳ.

Inspection of these results shows that the coverage is close to ideal for the φ elements P,e,τ. Accordingly, their credible intervals are well-calibrated. However, for the remaining elements and for log ℳ, the error bars are too small by factors of 1.1−2.1 at 1σ and 1.1−1.6 at 2σ. These error bars could of course be calibrated from such simulations. However, these shortfalls occur for the elements that were replaced by the Thiele-Innes constants and thus had the widths of their posterior pdfs reduced by the delta function approximation in Eq. (3). Further research based on the analytical formulae of Sect. A.3 may alleviate or eliminate this problem.

Table 1

Coverage fractions f from 1000 trials.

6. Conclusion

The aim of this paper has been to apply Bayesian methods to a problem that dates back a century or more, namely how to estimate the total mass of a visual binary that has a measured parallax but an incomplete orbit. This problem has previously been effectively treated by Schaefer et al. (2006) with a heuristic Monte Carlo search technique. In contrast, the Bayesian approach has a firm theoretical foundation and is, moreover, now the preferred methodology in many areas of astronomy, notably in cosmology (e.g., Liddle 2009) and in the discovery and characterization of the orbits of exoplanets (e.g., Ford & Gregory 2007).

A further merit of Bayesian estimation for visual binaries is that only triple integrals need to be evaluated when the quadrivariate normal distribution of the Thiele-Innes constants (Sect. A.3) is approximated by a delta function (Sect. 2.2). The use of this approximation might be expected to introduce systematic errors in the derived elements. However, the failure to detect bias when f_orb ≳ 0.5 (Sect. 5.4, Appendix B) demonstrates that such errors are not of practical concern.

In Sect. 1, the possible application to binaries discovered by Gaia was mentioned. However, the simulations reported in this paper relate to ground-based data where two sky coordinates are measured. In contrast, the scanning mode of Gaia will provide ~70 high precision one dimensional measurements of a secondary’s displacement (e.g., Pourbaix 2002; Lattanzi et al. 2000). Simulations of a model binary observed in this manner are required to demonstrate the merits of this Bayesian approach for Gaia.

Acknowledgments

I thank D.J. Mortlock for a helpful discussion on nuisance parameters and profile likelihoods. I am obliged to the referee for suggestions that improved the paper.

References

Aitken, R. G. 1918, The Binary Stars (New York: D. C. McMurtrie), 92 [Google Scholar]
Casertano, S., Lattanzi, M. G., Sozzetti, A., et al. 2008, A&A, 48, 699 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Dawid, A. P. 1982, Journal of the American Statistical Association, 77, 605 [Google Scholar]
Eggen, O. J. 1962, QJRAS, 3, 259 [NASA ADS] [Google Scholar]
Eggen, O. J. 1967, ARA&A, 5, 105 [NASA ADS] [CrossRef] [Google Scholar]
Ford, E. B., & Gregory, P. C. 2007, ASPC, 371, 189 [Google Scholar]
Hartkopf, W. I., McAlister, H. A., & Franz, O. G. 1989, AJ, 98, 1014 [NASA ADS] [CrossRef] [Google Scholar]
Heintz, W. D. 1978, Double Stars (Reidel, Dordrecht), 63 [Google Scholar]
Lattanzi, M. G., Spagna, A., Sozzetti, A., & Casertano, S. 2000, MNRAS, 317, 211 [NASA ADS] [CrossRef] [Google Scholar]
Liddle, A. R. 2009, Ann. Rev. Nucl. Particle Sci., 59, 95 [NASA ADS] [CrossRef] [Google Scholar]
Little, R. 2006, The American Statistician, 60, 1 [CrossRef] [Google Scholar]
Mikkelsen, K., Nss, S. K., & Eriksen, H. K. 2012, ApJ, submitted [Google Scholar]
Nurmi, P. 2005, ESA SP, 576, 463 [NASA ADS] [Google Scholar]
Pourbaix, D. 2002, A&A, 385, 686 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Pourbaix, D. 2011, AIP Conf. Proc., 1346, 122 [Google Scholar]
Ranucci, G. 2012, Nucl. Instr. Meth. Phys. Res. A, 661, 77 [NASA ADS] [CrossRef] [Google Scholar]
Schaefer, G. H., Simon, M., Beck, T. L., Nelan, E., & Prato, L. 2006, AJ, 132, 2618 [NASA ADS] [CrossRef] [Google Scholar]
Severini, T. A. 2000, Likelihood Methods in Statistics (Oxford: OUP) [Google Scholar]
Wright, J. T., & Howard, A. W. 2009, ApJS, 182, 205 [NASA ADS] [CrossRef] [Google Scholar]

Appendix A: Thiele-Innes elements

In this Appendix, a concise self-contained account of the Thiele-Innes elements is presented − see also Wright & Howard (2009) and Hartkopf et al. (1989). The explicit least-squares formulae for A,B,F,G and for their variances and covariances are new − Eqs. (A.7), (A.9), (A.10).

Appendix A.1: Definitions

The Thiele-Innes constants A,B,F,G are defined in terms of four of the Campbell elements. Specifically,

$\begin{matrix} A = a (+ \cos Ω \cos ω - \sin Ω \sin ω \cos i) \\ B = a (+ \sin Ω \cos ω + \cos Ω \sin ω \cos i) \\ F = a (- \cos Ω \sin ω - \sin Ω \cos ω \cos i) \\ G = a (- \sin Ω \sin ω + \cos Ω \cos ω \cos i) . \end{matrix}$ $\appendix \setcounter{section}{1} \begin{eqnarray} &&A=a(+\cos \: \Omega \: \cos \: \omega - \sin \: \Omega \: \sin \: \omega \: \cos \: i) \nonumber \\ &&B=a(+\sin \: \Omega \: \cos \: \omega + \cos \: \Omega \: \sin \: \omega \: \cos \: i) \nonumber \\ &&F=a(-\cos \: \Omega \: \sin \: \omega - \sin \: \Omega \: \cos \: \omega \: \cos \: i) \nonumber \\ &&G=a(-\sin \: \Omega \: \sin \: \omega + \cos \: \Omega \: \cos \: \omega \: \cos \: i). \end{eqnarray}$ (A.1)

Given these constants, the coordinates (x,y) of the secondary relative to the primary at time t are $\begin{matrix} x (t) = AX (E) + FY (E) \\ y (t) = BX (E) + GY (E) \end{matrix}$ $\appendix \setcounter{section}{1} \begin{eqnarray} x(t)= A \: X(E) + F \: Y(E) \nonumber \\ y(t) = B \: X(E) + G \: Y(E) \end{eqnarray}$ (A.2)where $X = \cos E - e,Y = \sqrt{1 - e^{2}} \sin E$ $\appendix \setcounter{section}{1} \begin{equation} X = \cos \: E - e , \;\; Y = \sqrt{1-e^{2}} \: \sin \: E \end{equation}$ (A.3)and E(t), the eccentric anomaly, is given by Kepler’s equation $μ (t - T) = E - e \sin E with μ = 2 π / P .$ $\appendix \setcounter{section}{1} \begin{equation} \mu(t-T) = E - e \: \sin \: E \;\;\; {\rm with} \;\;\; \mu = 2 \pi/P. \end{equation}$ (A.4)

Appendix A.2: Least-squares estimates

On the assumption that P,e,T are known, the determination of the four Thiele-Innes elements is a straightforward problem of fitting a linear model to the data. If the observed position at t_n is $(\begin{matrix} _{˜} \\ x_{n} \end{matrix}, \begin{matrix} _{˜} \\ y_{n} \end{matrix})$ $\hbox{$(\tilde{x}_{n},\tilde{y}_{n})$}$ , then estimates for A,B,F,G may be obtained by minimizing $χ^{2} = \frac{1}{σ^{2}} Σ_{n} w_{n} (x_{n} - \begin{matrix} ˜ \\ x_{n} \end{matrix})^{2} + \frac{1}{σ^{2}} Σ_{n} w_{n} (y_{n} - \begin{matrix} ˜ \\ y_{n} \end{matrix}) 2$ $\appendix \setcounter{section}{1} \begin{equation} \chi^{2} = \frac{1}{\sigma^{2}} \Sigma_{n} w_{n} (x_{n}-\tilde{x}_{n})^{2} +\frac{1}{\sigma^{2}} \Sigma_{n} w_{n} \left(y_{n}-\tilde{y}_{n}\right)^{2} \end{equation}$ (A.5)where (x_n,y_n) is the theoretical position given by Eq. (A.2). Here w_n is the weight of the nth observation and σ is the standard error of a measurement of unit weight.

This least-squares problem can be solved without iteration since x and y are linear in A,B,F,G. Moreover, the first term in Eq. (A.5) depends only on the pair (A,F) and the second term only on (B,G). Accordingly, the estimates for A,F are obtained by minimizing the first term in Eq. (A.5) and those for B,G by minimizing the second term. Thus, the problem reduces to solving two independent pairs of linear equations, each with two unknowns.

To simplify the resulting formulae, we define the following quantities

$\begin{matrix} a = Σ_{n} w_{n} X_{n}^{2} b = Σ_{n} w_{n} Y_{n}^{2} c = Σ_{n} w_{n} X_{n} Y_{n} \\ r_{11} = Σ_{n} w_{n} \begin{matrix} ˜ \\ x_{n} \end{matrix} X_{n} r_{12} = Σ_{n} w_{n} \begin{matrix} ˜ \\ x_{n} \end{matrix} Y_{n} \\ r_{21} = Σ_{n} w_{n} \begin{matrix} ˜ \\ y_{n} \end{matrix} X_{n} r_{22} = Σ_{n} w_{n} \begin{matrix} ˜ \\ y_{n} \end{matrix} Y_{n} . \end{matrix}$ $\appendix \setcounter{section}{1} \begin{eqnarray} && a = \Sigma_{n} w_{n} X_{n}^{2} \;\;\; b = \Sigma_{n} w_{n} Y_{n}^{2} \;\;\; c = \Sigma_{n} w_{n} X_{n} Y_{n} \nonumber \\ && r_{11} = \Sigma_{n} w_{n} \tilde{x}_{n} X_{n} \;\;\;\;\; r_{12} = \Sigma_{n} w_{n} \tilde{x}_{n} Y_{n} \\ && r_{21} = \Sigma_{n} w_{n} \tilde{y}_{n} X_{n} \;\;\;\;\; r_{22} = \Sigma_{n} w_{n} \tilde{y}_{n} Y_{n}. \nonumber \end{eqnarray}$ (A.6)The least-squares estimates for the Thiele-Innes constants are then $\begin{matrix} Â = (b r_{11} - c r_{12}) / Δ F̂ = (- c r_{11} + a r_{12}) / Δ \\ B̂ = (b r_{21} - c r_{22}) / Δ Ĝ = (- c r_{21} + a r_{22}) / Δ \end{matrix}$ $\appendix \setcounter{section}{1} \begin{eqnarray} \hat{A} = (b r_{11} - c r_{12})/\Delta \;\;\;\;\; \hat{F} = (-c r_{11} + a r_{12})/\Delta \nonumber \\ \hat{B} = (b r_{21} - c r_{22})/\Delta \;\;\;\;\; \hat{G} = (-c r_{21} + a r_{22})/\Delta \end{eqnarray}$ (A.7)where Δ = ab − c².

The above analysis is readily generalized to allow $\begin{matrix} _{˜} \\ x_{n} \end{matrix}, \begin{matrix} _{˜} \\ y_{n} \end{matrix}$ $\hbox{$\tilde{x}_{n},\tilde{y}_{n}$}$ to have different weights $w_{n}^{x}, w_{n}^{y}$ $\hbox{$w_{n}^{x},w_{n}^{y}$}$ . In the formulae for $\hbox{$(\hat{A},\hat{F})$}$ , the quantities a,b,c,r₁₁,r₁₂ are evaluated with $w_{n} = w_{n}^{x}$ $\hbox{$w_{n} = w_{n}^{x}$}$ . Correspondingly, in the formulae for $\hbox{$(\hat{B},\hat{G})$}$ , the quantities a,b,c,r₂₁,r₂₂ are evaluated with $w_{n} = w_{n}^{y}$ $\hbox{$w_{n} = w_{n}^{y}$}$ .

Appendix A.3: Covariance matrix

The two sets of normal equations solved above for $\hbox{$(\hat{A},\hat{F})$}$ and $\hbox{$(\hat{B},\hat{G})$}$ have the same matrix of coefficients and therefore the same inverse. These are $M = (\begin{matrix} a & c \\ c & b \end{matrix}) and M^{-1} = (\begin{matrix} + b / Δ & - c / Δ \\ - c / Δ & + a / Δ \end{matrix}) .$ $\appendix \setcounter{section}{1} \begin{equation} \vec{M} = \left(\begin{array}{clcr} a & c \\ c & b \end{array} \right) \;\;\; {\rm and} \;\;\; \vec{M^{-1}} = \left(\begin{array}{clcr} +b/\Delta & -c/\Delta \\ -c/\Delta & +a/\Delta \end{array} \right). \end{equation}$ (A.8)From the coefficients of the covariance matrix M^-1, we derive $σ_{A,B}^{2} = \frac{b}{Δ} σ^{2} σ_{F,G}^{2} = \frac{a}{Δ} σ 2$ $\appendix \setcounter{section}{1} \begin{equation} \sigma^{2}_{A,B} = \frac{b}{\Delta} \sigma^{2} \;\;\;\; \sigma^{2}_{F,G} = \frac{a}{\Delta} \sigma^{2} \end{equation}$ (A.9)and $cov (A,F) = cov (B,G) = - \frac{c}{Δ} σ^{2} .$ $\appendix \setcounter{section}{1} \begin{equation} {\rm cov}(A,F) = {\rm cov}(B,G) = -\frac{c}{\Delta} \sigma^{2}. \end{equation}$ (A.10)The covariances of other pairings of A,B,F,G are zero. Note that these results apply for fixed P,e,T.

The above analysis shows that (A,F) have a bivariate normal distribution centred on $\hbox{$(\hat{A}, \hat{F})$}$ with variances and covariance from Eqs. (A.9) and (A.10) − and correspondingly for the pair (B,G). Accordingly, the pdf p(A,B,F,G|φ) is a quadrivariate normal distribution centred on $\hbox{$(\hat{A}, \hat{B}, \hat{F}, \hat{G}) = \hat{\psi}(\phi)$}$ . This explicit derivation of p(ψ|φ) could be used to replace the delta function in Eq. (3).

Appendix A.4: Campbell elements

Having derived the least-squares values of A,B,F,G from Eqs. (A.7), we must invert the transformation given by Eq. (A.1). The angle ω + Ω is the solution of the equation $ω + Ω = \arctan (\frac{B - F}{A + G})$ $\appendix \setcounter{section}{1} \begin{equation} \omega+\Omega = \arctan \left(\frac{B-F}{A+G} \right) \end{equation}$ (A.11)such that sin(ω + Ω) has the same sign as B − F. Similarly, ω − Ω is the solution of the equation $ω - Ω = \arctan (\frac{- B - F}{A - G})$ $\appendix \setcounter{section}{1} \begin{equation} \omega-\Omega = \arctan \left(\frac{-B-F}{A-G} \right) \end{equation}$ (A.12)such that sin(ω − Ω) has the same sign as −B − F. This pair of equations can be solved for ω and Ω. But Ω may violate the convention that Ω ∈ (0,π). Accordingly, if Ω < 0, we set Ω = Ω + π and ω = ω + π. But if Ω > π, we set Ω = Ω − π and ω = ω − π.

If we now define $q_{1} = \frac{A + G}{\cos (ω + Ω)} and q_{2} = \frac{A - G}{\cos (ω - Ω)}$ $\appendix \setcounter{section}{1} \begin{equation} q_{1} = \frac{A+G}{\cos(\omega+\Omega)} \;\; {\rm and} \;\; q_{2} = \frac{A-G}{\cos(\omega-\Omega)} \end{equation}$ (A.13)then the inclination is $i = 2 \arctan (\sqrt{q_{2} / q_{1}})$ $\appendix \setcounter{section}{1} \begin{equation} {\rm i} = 2 \: \arctan \left(\sqrt{q_{2}/q_{1}} \:\right) \end{equation}$ (A.14)and the semi-major axis is $a = (q_{1} + q_{2}) / 2$ $\appendix \setcounter{section}{1} \begin{equation} a= (q_{1}+q_{2})/2 \end{equation}$ (A.15)

Appendix B: Bias of posterior means

The numerical search for bias reported in Sect. 5.4 also included the Campbell elements. The results are plotted in Figs. B.1−B.7. The behaviour revealed by these plots is broadly similar to that seen in Fig. 6 except that for some elements bias is evident at somewhat larger f_orb. Thus biases >3s occur at f_orb = 0.5 for ⟨log P⟩ , ⟨log a⟩ , ⟨τ⟩ and ⟨e⟩.

Fig. B.1

Bias of the posterior mean ⟨log (P/P_∗)⟩. The averages and standard errors from Eq. (23) are plotted against f_orb with $\hbox{$N=15, \sigma = 0\farcs05$}$ . The dashed line is the exact value.

Fig. B.2

Bias of the posterior mean ⟨log (a/a_∗)⟩. The averages and standard errors from Eq. (23) are plotted against f_orb with $\hbox{$N=15, \sigma = 0\farcs05$}$ . The dashed line is the exact value.

Fig. B.3

Bias of the posterior mean ⟨τ⟩. The averages and standard errors from Eq. (23) are plotted against f_orb with $\hbox{$N=15, \sigma = 0\farcs05$}$ . The dashed line is the exact value.

Fig. B.4

Bias of the posterior mean ⟨e⟩. The averages and standard errors from Eq. (23) are plotted against f_orb with $\hbox{$N=15, \sigma = 0\farcs05$}$ . The dashed line is the exact value.

Fig. B.5

Bias of the posterior mean ⟨i⟩. The averages and standard errors from Eq. (23) are plotted against f_orb with $\hbox{$N=15, \sigma = 0\farcs05$}$ . The dashed line is the exact value.

Fig. B.6

Bias of the posterior mean ⟨ω⟩. The averages and standard errors from Eq. (23) are plotted against f_orb with $\hbox{$N=15, \sigma = 0\farcs05$}$ . The dashed line is the exact value.

Fig. B.7

Bias of the posterior mean ⟨Ω⟩. The averages and standard errors from Eq. (23) are plotted against f_orb with $\hbox{$N=15, \sigma = 0\farcs05$}$ . The dashed line is the exact value.

Appendix C: Variation of parameters

In this Appendix, posterior means are computed when selected orbital parameters are varied from those of the standard model; see Eq. (8). Specifically, in one sequence e = 0.00(0.19)0.95, and in a second sequence i = 0°(18°)90°, with, in each case, the other elements as in Eq. (8). For both sequences, the campaign parameters are $\hbox{$f_{\rm orb} = 0.7, N = 15, \sigma = 0\farcs05$}$ , and each orbit is computed with a different random number seed. The orbits are poor and incomplete but f_orb is in the domain (f_orb ≳ 0.6) where bias due to priors is negligible; see Sect. 5.4 and Appendix B.

Table C.1

Sequence with varying eccentricity e.

Table C.2

Sequence with varying inclination i.

Results for the e-sequence are in Table C.1. Aspects to note are as follows: The solution for e = 0 has large error bars for ⟨ω⟩ and ⟨τ⟩. These arise because ω and τ = T/P become meaningless and indeterminate as e → 0 - i.e., for a circular orbit. Also since e is non-negative, ⟨e⟩ is positively-biased when the true value is e = 0. Despite these issues, an accurate mass is obtained.

At the other extreme of highly elliptical orbits, Table C.1 reveals large error bars for all elements except ⟨e⟩.

Results for the i-sequence are in Table C.2. Aspects to note are as follows: The solution i = 0°− i.e., a face-on orbit - has large error bars for ω and Ω. From Eqs. (A.1), we see that when i = 0°, A = G = acos(ω + Ω) and B = −F = asin(ω + Ω). Accordingly, for an exactly face-on orbit, the difference ω − Ω is indeterminate, and the error bars on ω and Ω reflect this. Nevertheless, the mass determination is accurate.

As was the case for ⟨e⟩, the quantity ⟨i⟩ is postively-biased since i is non-negative. The expected magnitude of this bias can be computed independently of the Bayesian code. If we assume that P,e,τ have the values given in Eq. (8), then the least-squares fit to a synthetic orbit with i = 0 is derived with Eqs. (A.7) and then the corresponding Campbell elements a,i,ω,Ω obtained with Eqs. (A.11)−(A.15). With the chosen campaign parameters, n = 1000 repetitions of this calculation gives the average value $\hbox{$\bar{{\rm i}} = 13\fdg9$}$ , consistent with $\hbox{$14\fdg4 \pm 4\fdg0$}$ in Table C.2. When σ is reduced to $\hbox{$0\farcs005$}$ as in Sect. 5.2, $\hbox{$\bar{{\rm i}} = 4\fdg4$}$ . Thus, as expected, the bias of i decreases with improved data.

A further point of note in the i-sequence is that the standard error of ⟨Ω⟩ is much smaller than that of ⟨ω⟩ when i = 90°− i.e., an edge-on orbit. Although the error bars of these elements are somewhat underestimated (Sect. 5.5), this is a real effect. From Eqs. (A.1) and (A,2), it follows that the exact edge-on orbit obeys the equation y/x = tanΩ, so that Ω is determined by the slope of the straight line fitted to $\begin{matrix} _{˜} \\ x_{n} \end{matrix}, \begin{matrix} _{˜} \\ y_{n} \end{matrix}$ $\hbox{$\tilde{x}_{n},\tilde{y}_{n}$}$ . The Bayesian code does not formally make such a fit, but this mathematical aspect is implicit.

All Tables

Table 1

Coverage fractions f from 1000 trials.

In the text

Table C.1

Sequence with varying eccentricity e.

In the text

Table C.2

Sequence with varying inclination i.

In the text

All Figures

	Fig. 1 Feasible domain $\hbox{${\cal D}$}$ in (log P,e)-space for the test binary defined by Eq. (8) when observed in a campaign with parameters $\hbox{$f_{\rm orb} = 0.4, N = 15, \sigma = 0\farcs05$}$ . The true values log P_∗,e_∗ are indicated as are the coordinates of the orbits plotted in Fig. 2.
In the text

	Fig. 2 Feasible orbits. Orbits I, II, III fit the measurements (filled circles) with an acceptable χ². The bold curve is the true orbit with elements given in Eq. (8).
In the text

Fig. 3

Eggen’s effect. Feasible domains $\hbox{${\cal D}$}$ in the (log P,log a) plane. Results for three campaigns with f_orb = 0.4,0.5,0.6 are shown, each with $\hbox{$N = 15, \sigma = 0\farcs05$}$ . The f_orb = 0.5 and 0.6 domains are displaced downwards by 0.5 and 1.0 dex, respectively. The plotted lines have slope 2/3 and pass through the points (2.0,0.0),(2.0, −0.5),(2.0, −1.0), each corresponding to (log P_∗,log a_∗).

In the text

	Fig. 4 Mass estimator. The posterior mean of log ℳ is plotted against f_orb. The campaigns with $\hbox{$N = 15, \sigma = 0\farcs05$}$ are plotted as open circles; those with $\hbox{$N = 15, \sigma = 0\farcs005$}$ as filled circles. The dashed lines indicate errors of ±0.1 dex.
In the text

	Fig. 5 Posterior pdf Θ(log ℳ\|D) for f_orb = 0.4 and 0.5 both with $\hbox{$N = 15,\: \sigma = 0\farcs05$}$ . The vertical dashed lines are the posterior means ⟨log ℳ⟩.
In the text

	Fig. 6 Bias of the posterior mean ⟨log (ℳ)⟩. The averages and standard errors from Eq. (23) plotted against f_orb with $\hbox{$N=15, \sigma = 0\farcs05$}$ . The dashed line is the exact value.
In the text

	Fig. B.1 Bias of the posterior mean ⟨log (P/P_∗)⟩. The averages and standard errors from Eq. (23) are plotted against f_orb with $\hbox{$N=15, \sigma = 0\farcs05$}$ . The dashed line is the exact value.
In the text

	Fig. B.2 Bias of the posterior mean ⟨log (a/a_∗)⟩. The averages and standard errors from Eq. (23) are plotted against f_orb with $\hbox{$N=15, \sigma = 0\farcs05$}$ . The dashed line is the exact value.
In the text

	Fig. B.3 Bias of the posterior mean ⟨τ⟩. The averages and standard errors from Eq. (23) are plotted against f_orb with $\hbox{$N=15, \sigma = 0\farcs05$}$ . The dashed line is the exact value.
In the text

	Fig. B.4 Bias of the posterior mean ⟨e⟩. The averages and standard errors from Eq. (23) are plotted against f_orb with $\hbox{$N=15, \sigma = 0\farcs05$}$ . The dashed line is the exact value.
In the text

	Fig. B.5 Bias of the posterior mean ⟨i⟩. The averages and standard errors from Eq. (23) are plotted against f_orb with $\hbox{$N=15, \sigma = 0\farcs05$}$ . The dashed line is the exact value.
In the text

	Fig. B.6 Bias of the posterior mean ⟨ω⟩. The averages and standard errors from Eq. (23) are plotted against f_orb with $\hbox{$N=15, \sigma = 0\farcs05$}$ . The dashed line is the exact value.
In the text

	Fig. B.7 Bias of the posterior mean ⟨Ω⟩. The averages and standard errors from Eq. (23) are plotted against f_orb with $\hbox{$N=15, \sigma = 0\farcs05$}$ . The dashed line is the exact value.
In the text

Current usage metrics show cumulative count of Article Views (full-text article views including HTML views, PDF and ePub downloads, according to the available data) and Abstracts Views on Vision4Press platform.

Data correspond to usage on the plateform after 2015. The current usage metrics is available 48-96 hours after online publication and is updated daily on week days.

Initial download of the metrics may take a while.

[1] Aitken, R. G. 1918, The Binary Stars (New York: D. C. McMurtrie), 92 [Google Scholar]

[2] Casertano, S., Lattanzi, M. G., Sozzetti, A., et al. 2008, A&A, 48, 699 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[3] Dawid, A. P. 1982, Journal of the American Statistical Association, 77, 605 [Google Scholar]

[4] Eggen, O. J. 1962, QJRAS, 3, 259 [NASA ADS] [Google Scholar]

[5] Eggen, O. J. 1967, ARA&A, 5, 105 [NASA ADS] [CrossRef] [Google Scholar]

[6] Ford, E. B., & Gregory, P. C. 2007, ASPC, 371, 189 [Google Scholar]

[7] Hartkopf, W. I., McAlister, H. A., & Franz, O. G. 1989, AJ, 98, 1014 [NASA ADS] [CrossRef] [Google Scholar]

[8] Heintz, W. D. 1978, Double Stars (Reidel, Dordrecht), 63 [Google Scholar]

[9] Lattanzi, M. G., Spagna, A., Sozzetti, A., & Casertano, S. 2000, MNRAS, 317, 211 [NASA ADS] [CrossRef] [Google Scholar]

[10] Liddle, A. R. 2009, Ann. Rev. Nucl. Particle Sci., 59, 95 [NASA ADS] [CrossRef] [Google Scholar]

[11] Little, R. 2006, The American Statistician, 60, 1 [CrossRef] [Google Scholar]

[12] Mikkelsen, K., Nss, S. K., & Eriksen, H. K. 2012, ApJ, submitted [Google Scholar]

[13] Nurmi, P. 2005, ESA SP, 576, 463 [NASA ADS] [Google Scholar]

[14] Pourbaix, D. 2002, A&A, 385, 686 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[15] Pourbaix, D. 2011, AIP Conf. Proc., 1346, 122 [Google Scholar]

[16] Ranucci, G. 2012, Nucl. Instr. Meth. Phys. Res. A, 661, 77 [NASA ADS] [CrossRef] [Google Scholar]

[17] Schaefer, G. H., Simon, M., Beck, T. L., Nelan, E., & Prato, L. 2006, AJ, 132, 2618 [NASA ADS] [CrossRef] [Google Scholar]

[18] Severini, T. A. 2000, Likelihood Methods in Statistics (Oxford: OUP) [Google Scholar]

[19] Wright, J. T., & Howard, A. W. 2009, ApJS, 182, 205 [NASA ADS] [CrossRef] [Google Scholar]