Issue 
A&A
Volume 571, November 2014



Article Number  A86  
Number of page(s)  9  
Section  Celestial mechanics and astrometry  
DOI  https://doi.org/10.1051/00046361/201424405  
Published online  14 November 2014 
Analysing weak orbital signals in Gaia data
Astrophysics Group, Blackett Laboratory, Imperial College London, Prince Consort Road, London SW7 2AZ, UK
email: l.lucy@imperial.ac.uk
Received: 15 June 2014
Accepted: 22 August 2014
Anomalous orbits are found when minimumχ^{2} estimation is applied to synthetic Gaia data for orbits with astrometric signatures comparable to the singlescan measurement error (Pourbaix 2002, A&A, 385, 686). These orbits are nearly parabolic, edgeon, and their major axes align with the lineofsight to the observer. Such orbits violate the Copernican principle (CPr) and as such could be rejected. However, the preferred alternative is to develop a statistical technique that incorporates the CPr as a fundamental postulate. This can be achieved in a Bayesian context by defining a Copernican prior. Pourbaix’s anomalous orbits then no longer arise. Instead, the selected orbits have a somewat higher χ^{2} but do not violate the CPr. The problem of detecting a weak additional orbit in an astrometric binary with a welldetermined orbit is also treated.
Key words: binaries: visual / stars: fundamental parameters / methods: statistical
© ESO, 2014
1. Introduction
With the Gaia observatory in orbit at L2 and with commissioning underway, astronomers can look forward with increasing confidence to the eventual release of an enormous quantity of high precision astrometric data. Initially, this data will be analysed with the alreadyexisting pipeline software created by the various consortia. The resulting pipeline products will no doubt be entirely satisfactory for the vast majority of observed objects. However, a lesson from earlier largescale surveys is that a small number of objects at the limit of a survey’s range often prove to be of exceptional interest. For such objects, standard reduction techniques may give anomalous and misleading results.
This occurred for the Hipparcos mission. As reviewed by Pourbaix (2004) and Perryman (2009, p. 594), orbits fitted to Hipparcos data for stars with known spectroscopic orbits led to “discoveries” that were later refuted. As emphasized by Pourbaix (2004), “fitting the noise with an orbital model can have some awful consequences”.
This earlier episode suggests that the extraction of orbital parameters from weak orbital signals in Gaia data should be investigated. In fact, this is already the subject of an intriguing paper by Pourbaix (2002). He found that minχ^{2} solutions for weak orbits are frequently anomalous – specifically, edgeon and nearly parabolic. In the present paper, the origin of such orbits is explained and a Bayesian technique developed that overcomes this problem.
2. Synthetic data
In this section, synthetic 1D scans of a model astrometric binary are created. In order to focus on orbital parameters, we follow Pourbaix (2002) in assuming that parallactic and proper motions have been subtracted. With regard to notation, previous papers (Lucy 2014a,b; hereafter L14a,b) are followed closely.
2.1. Orbital elements
In contrast to L14a,b, the secondary is here not detected, so the astrometry measures the primary’s reflex motion about the system’s centre of mass. This motion is parameterized with the Campbell elements P,e,T,a,i,ω,Ω. Here P is the period, e is the eccentricity, T is a time of periastron passage, i is the inclination, ω is the longitude of periastron, and Ω is the position angle of the ascending node. However, following many earlier investigators – references in L14a – the ThieleInnes elements are also used, thereby exploiting the resulting linearity in four parameters. Thus, the Campbell vector θ ≡ (φ,ϑ), where φ = (P,e,τ), and where ϑ = (a,i,ω,Ω) is replaced by the vector ψ whose components are the ThieleInnes constants A,B,F,G. (Note that in φ, periastron has been replaced by τ = T/P which by definition ∈ (0,1).)
2.2. Model astrometric binary
The model binary has the following elements: (1)Note that P_{∗} is less than t_{M} = 5y, the duration of the Gaia mission, so that the issue of incomplete orbits (L14a) is not of concern here. Also the semimajor axis a_{∗} is expressed as a dimensionless multiple β of the standard error σ of a singlescan measurement. Thus our ability to detect weak orbits can be investigated by letting β → 0.
The eccentricity e_{∗} = 0.05 is typical for giant planets in the solar system. But the main reason for such a small value is to highlight the anomaly when nearly parabolic orbits fit the data.
2.3. Observing campaign
A Gaialike observing campaign is defined by t_{n}, the N times at which the star is scanned, by α_{n}, the corresponding scanning angles, and by σ. We take t_{n} = t_{M}z_{u} and α_{n} = 2πz_{u}, where the z_{u} here and later denote independent random numbers ∈ (0,1).
Although β = a_{∗}/σ is the important parameter, we take σ = 40 μas, the expected accuracy for a single transit at Gband magnitude ≈ 14 (see Fig. 2 in Sozzetti et al. 2014). Note that, in the comprehensive investigation of planet detection with Gaia by Casertano et al. (2008), σ = 8 μas.
From Fig. 1 in Sozzetti et al. (2014), we take N = 70 as a representive number of scans during the mission.
2.4. Synthetic scans
Given β, Eq. (1) defines the theoretical orbit. The Cartesian sky coordinates at t_{n} can therefore be computed from Eqs. (A.2) of L14a. The corresponding 1D coordinate or abscissa is , where (2)Here α_{n} is the angle between the scanning direction and the xaxis – see Fig. 1 in Pourbaix (2002). A synthetic data set is then (3)where the z_{G} are independent random Gaussian variates sampling . Note that the χ^{2} of the measurement errors is simply (4)The Ndimensional vector with elements is the data vector from which orbital elements are to be estimated. For a given orbit θ ≡ (φ,ψ), the goodnessoffit to is measured by (5)where s_{n} = s(t_{n},α_{n};θ).
Fig. 1 Feasible domain projected on to the (ω,e)plane. The orbital parameters are given by Eq. (1) with β = 2.8. 
3. Feasible orbits
In this section, a procedure from L14a is used to explore the likely degradation of extracted orbits as β → 0.
3.1. Grid scan
A 3D grid in the φ variables is set up as follows: the midpoint of grid cell (i,j,k) is (log P_{i},e_{j},τ_{k}). The grid has 200 constant steps in each of these variables, with ranges (0.0,1.0) for log P, and (0,1) for e and τ.
For specified β, a synthetic scan vector is created as described in Sect. 2.4. Then, at each grid point, the minχ^{2} ThieleInnes vector is computed as described in Appendix A.1. The resulting .
Fig. 2 Feasible domain projected on to the (i,e)plane. The orbital parameters are given by Eq. (1) with β = 2.8. 
3.2. Feasible domain
An orbit is deemed to be feasible if (6)and the ensemble of such orbits define the feasible domain(s) in φspace.
For β ≳ 10, the domain is a small ellipsoidal volume approximately centred on the exact values (log P_{∗},e_{∗},τ_{∗}). But as β decreases, increases and eventually develops extraordinary topology.
From a sequence of grid scans with β → 0, the value β = 2.8 is found to be such that just extends to e = 1. Figure 1 illustrates the resulting distortions of . In this figure, a filled circle is plotted at (ω,e) if Eq. (6) is satisfied, and we see that this projection of the feasible orbits θ_{ijk} extends far beyond the exact values (150°,0.05). The most notable features are the two narrow spikes that emerge at e ~ 0.4 and reach e = 1 at precisely ω = 90° and 270°. Further information about these spikes is provided by other projections of the θ_{ijk}. In Fig. 2, the vectors are projected onto the (i,e)plane, and this shows that along both spikes i → 90° as e → 1. Accordingly, if the orbital signal is weak enough (β< 2.8), an acceptable fit is provided by nearly parabolic, edgeon orbits with ω = π/ 2 or 3π/ 2. But this remarkable finding is not original: Pourbaix (2002), in reporting leastsquares fits to synthetic 1D scans, found an accumulation of nearly parabolic orbits when β = a_{∗}/σ = 1.33 and noted that such orbits lead to reasonable apparent orbits when i and ω are close to π/ 2. This serendipitous numerical discovery posed what he called “the puzzling case of almost parabolic orbits”.
3.3. Violations of the Copernican principle
When the observed star is at periastron (t = T), its Cartesian coordinates are: (7)where the periastron distance a_{p} = a(1 − e). Accordingly, if i = π/ 2 and ω = π/ 2 or 3π/ 2, periastron has coordinates (0,0, ± a_{p}). The major axis is thus aligned with the lineofsight to the observer, who therefore finds himself in a special location. The observer might then object that this orbit violates the Copernican principle (CPr). But this objection could be raised against a slightly noncircular orbit with the same (i,ω), and an edgeon circular orbit is not particularly objectionable from the standpoint of the CPr. To fully appreciate the CPr violation, the extra fact that Pourbaix orbits (hereafter Porbits) are nearly parabolic must therefore be taken into account.
Consider the astrometric signal’s dependence on orbit orientation. Since the maximum elongation of the star from the barycentre is a(1 + e), the extreme range for the abscissae s_{n} is −a(1 + e) to + a(1 + e). However, when i = π/ 2 and ω = π/ 2 or 3π/ 2, this range shrinks to its minimum, namely −b to + b, where is the semiminor axis Thus, for the Pourbaix solutions, the orbit’s inferred orientation and eccentricity are such that the signal is at a deep minimum. For example, the ratio of the maximum to minimum elongations is when e = 0.99. This is a large effect, and so the observer would be correct in concluding that Porbits violate the CPr.
Traditionally, when an analysis leads to a CPr violation, astronomers suspect that some underlying hypothesis H must be wrong. A classic example is Herschel’s model of the Milky Way, which violates the CPr because the sun is close to its centre. In this case, the error is Herschel’s implicit assumption that interstellar space is transparent.
But note a crucial difference. In these simulations – and in those of L14a – CPr violations arise even though H – Keplerian motion – is rigorously correct. This strongly implies that there must exist a dataanalysis technique that includes the CPr ab initio rather than invoking it to pass judgement on a model only after it has been derived.
3.4. Degeneracy
For a single star, the astrometric solution has five parameters: the star’s right ascension (RA) and declination at a reference epoch, two components of proper motion, and its parallax. However, because of errors in the , this solution has residuals, and so it is likely that the addition of orbital motion (seven parameters) will “improve” the fit – i.e., reduce χ^{2}. Given their minimal astrometric signatures (Sect. 3.3), Porbits with b ≲ σ can be added with little effect on the fit. Evidently, a singlestar solution is degenerate under the addition of a Porbit with arbitrarily large semimajor axis a so long as the semiminor axis .
3.5. Imperfect experiments
For reasons beyond the observer’s control, experiments in astronomy are often imperfect, yielding data from which a definitive solution cannot be obtained. In double star astronomy, examples are longperiod binaries that have only completed a fraction of an orbit since discovery. If a solution is nevertheless attempted, orbits with very different parameters may provide acceptable fits – see Fig. 2 in L14a and references therein. Among these acceptable orbits may be orbits that violate the CPr, as is the case for the nearly parabolic orbit in that figure.
Comparison of the simulations here with those for incomplete orbits in L14a is illuminating. Here and in Pourbaix (2002), we find CPr violations even though the orbit is complete (P_{∗}<t_{M}). This shows that a weak orbital signal suffices for the experiment to be imperfect and to thereby permit solutions that violate the CPr.
When the minχ^{2} elements violate the CPr, we might suspect that there exists a better solution that, despite a higher χ^{2}, should be preferred because it is consistent with the CPr.
4. A Bayesian prior derived from the CPr
The CPr is now treated as an integral part of Bayesian estimation and not as an a posteriori arbiter of a solution’s believability. This is achieved by constructing a Copernican prior.
4.1. Conventional priors
If H denotes the hypothesis and D the data and if I is some relevant information, then by Bayes’ theorem (Jaynes 2003, p. 85), the posterior density of H given D and I is (8)Here Pr(D  H,I) ≡ ℒ(H,I  D) is the likelihood, and Pr(H  I) is the prior probability of H given I.
If there is no information I, the prior reduces to Pr(H) and so becomes the subjective choice of the investigator. This aspect of Bayesian estimation is controversial and muchdebated. However, there is little reason to object to current astronomical practice with regard to Pr(H) since the aim is not to quantify prejudice but to admit ignorance. Thus flat priors are typically imposed on the parameters of H. Moreover, the ranges over which these priors are nonzero are chosen to comfortably enclose the intervals within which there is significant likelihood ℒ and therefore significant posterior density Pr(H  D). Such priors are noninformative.
In the Gaia problem, it is tempting to use a variant of this methodology to eliminate CPrviolating orbits. Thus, Fig. 1 suggests a prior on e that is zero for e> 0.6. But this would be an ad hoc fix for this particular data set. A Bayesian prior should not depend on, nor be derived from, D.
4.2. A Copernican prior
For the problem under consideration, the symbols H,D and I are defined as follows:

H:
the components of the theoretical scan vector s are s_{n} = s(t_{n},α_{n};θ), the predicted abscissae at the known times t_{n} and scanning angles α_{n} for the Keplerian orbit θ.

D:
the elements of the data vector are , the measured abscissae at (t_{n},α_{n}).

I:
orbits θ with random orientations and random shifts in epoch are all equally probable a priori.
Comments:

(i)
information I is the means of incorporating the CPr.

(ii)
since orbits are periodic, choosing a random value of τ = T/P ∈ (0,1) is equivalent to a random shift in epoch.

(iii)
imposing I is appropriate for orbits discovered with Gaia but not if a previouslyknown orbit is targeted.
Given that the hypothesis of Keplerian motion enters via the theoretical vector s, the Copernican prior Pr(H  I) becomes π(s  I), the probability density at s when I is taken into account. However, I does not itself suffice to determine this probability density function (PDF). In addition, P,e and a – or their prior distributions – must be specified. We choose the latter option on the grounds of simplicity.
For the bounded quantity e, we assume a uniform prior in (0,1). For the unbounded positive continuous parameters P and a, it is appropriate (Jaynes 2003, p. 395) to assign equal prior probabilities to equal logarthmic intervals – i.e., Jeffreys’ priors.
With these additional assumptions, the Copernican prior is determinate and given by (9)where π_{1}(s  I) is the PDF at s for orbits scaled to a = 1″.
4.3. Accurate treatment
A rigorous calculation of π_{1}(s  I) would proceed as follows: The scale parameter a is set = 1″, and e and log P are randomly chosen in (0,1) and (log P_{L},log P_{U}), respectively. The orbit’s orientation (i,ω,Ω) and epoch τ = T/P are randomly chosen, and the resulting theoretical s computed. These steps are repeated many times, thus generating points that populate the Ndimensional sspace with a probability density determined by I and by the prior distributions of e and P. As the sample size → ∞, the result is the desired PDF π_{1}(s  I). However, with N ~ 70, this bruteforce approach is not feasible. A less rigorous approach must be adopted.
4.4. Approximate treatment
Consider an orbit with a = 1″ and eccentricity e. Sampled with random orientations and epochs, the theoretical abscissae s_{n} will extend over the full permitted range, namely −(1 + e) to + (1 + e). In other words, whatever the values of (t_{n},α_{n}), there will be some combination of orientation (i,ω,Ω) and epoch τ for which these limits are reached. Given that e ∈ (0,1), it follows that the s_{n}values populate the interval (− 2, + 2). Since this applies to every component of s, the distribution of the vectors s in Ndimensional space is approximately isotropic. Accordingly, most of the information relevant to CPr violations is contained in the distribution of the Euclidean “lengths” of the vectors s. We therefore define the statistic ξ given by (10)With this statistic as the sole basis for assessing CPr violations, the approximate Copernican prior is (11)where π_{1}(ξ  I) is the PDF of the lengths ξ for orbits scaled to a = 1″. This prior is used in the subsequent tests.
4.5. Calculation of π_{1}(ξ  I)
With the above assumptions, the problem has been reduced to tabulating the 1D function π_{1}(ξ  I). The steps are as follows:

1)
The campaign (t_{n},α_{n};N) is specified (Sect. 2.3).

2)
We set a = 1″ and choose random values of e ∈ (0,1) and log P ∈ (log P_{L},log P_{U}).

3)
A random orientation is selected by taking ω = 2πz_{u},Ω = πz_{u}, and cosi = 1 − 2z_{u}.

4)
A random epoch is selected by taking τ = T/P = z_{u}.

5)
With the orbit vector θ determined in steps 2)–4), the coordinates (x_{n},y_{n}) at t_{n} are computed.

6)
From these coordinates, the components of s are given by Eq. (2), and the corresponding length ξ by Eq. (10).

7)
Steps 2)–6) are repeated 10^{8} times. The resulting histogram of ξvalues gives π_{1}(ξ  I).
Comments:

(i)
To eliminate parabolic orbits and to avoid convergence failureswhen solving Kepler’s equation, an upper limite = 0.999 is imposed at step 2).

(ii)
Since the exact period is known, we take P_{L} = 1y and P_{U} = 10y.

(iii)
From the 10^{8} values of s_{n} for each n, the maximum and minimum values are derived. These closely approach the expected values ± 2 (Sect. 4.4).

(iv)
If no assumptions are made about the prior distributions of P and e, then π_{1} = π_{1}(ξ;P,e  I), thus requiring an extra two dimensions in its tabulation. This is feasible, but the preference here is to investigate the simplest formulation.
The resulting accurate determination of π_{1}(ξ  I) is plotted in Fig. 3. This shows that the astrometric lengths are typically in the interval (0.4,1.1), and that values ≲ 0.3 are improbable.
Fig. 3 PDF π_{1}(ξ  I) for astrometric “lengths” ξ defined by Eq. (10). The orbits have random orientations (i,ω,Ω), random epochs τ, random eccentricities ∈ (0,1), and random values of log P ∈ (0.0,1.0). The corresponding PDF for Porbits (i = ω = 90°) with e = 0.96 is a near delta function at ξ = 0.14. 
The position of a Porbit in this plot is of interest. The above steps are therefore repeated with the constraints e = 0.96 and i = ω = 90°. For this orbit, the range for the s_{n} is (− b, + b) or (− 0.28, + 0.28) in units of a. Consistent with this, the PDF is a near delta function at ξ = 0.14, and this location is indicated in Fig. 3. The probability of obtaining an even smaller value is (12)which gives Π_{1}(0.14  I) = 1.7 × 10^{4}, showing that Porbits populate an extremely low probability tail of π_{1}(ξ  I).
5. Bayesian estimation subject to the CPr
Formulae are now developed that allow the Copernican prior to be included in the calculation of posterior densities and credibility intervals.
5.1. Posterior densities
For every orbit in θspace, there is a theoretical scan vector s corresponding to the scanning campaign (t_{n},α_{n}). From this s and the orbit’s a, we can compute ξ from Eq. (10). Then, from ξ, we obtain π_{1}(ξ  I) by interpolating in the data file plotted in Fig. 3. This procedure results in an ensemble of orbit vectors weighted according to their Copernican priors a^{1}π_{1}(ξ  I), and so CPr violations are penalized. From this ensemble, the Bayesian machinery then computes posterior densities by further weighting the orbits in accordance with their goodnessoffits to the measured scan vector .
The posterior density at (φ,ψ) is (13)Ignoring coefficients independent of (φ,ψ) and assuming normally distributed measurement errors, we have (14)where is the minimum at , and δχ^{2} is the postive increment due to the displacement at fixed φ.
In the absence of the Copernican prior, Λ ∝ ℒ ∝ Pr(φ) × Pr(ψ  φ). The first of these PDFs is sampled at grid points (i,j,k) giving weight factors . The second PDF is randomly sampled as described in Appendix A.4. If is the number of random points ψ_{ℓ} selected in ψspace at φ_{ijk}, then each has weight .
With the Copernican prior included, the PDF Λ given in Eq. (13) is represented by a cloud of discrete orbit vectors (15)with weights (16)Here m enumerates the random points ψ_{ℓ} across all grid cells (i,j,k).
From these weighted orbits, the posterior mean of a quantity Q(θ) given Dand I is (17)and credibility intervals are derived as described in Sect. 4.2 of L14b.
This discrete representation of Λ and the resulting credibility means and intervals become exact as the grid steps → 0 and the .
6. Numerical experiments
The approximate theory (Sect. 4.4) of the Copernican prior is now applied to the model binary defined in Eq. (1).
6.1. Code verification
In the strongorbit limit, violations of the CPr are not an issue and so, even with the inclusion of the Copernican prior, the posterior means should → the exact elements given in Eq. (1). To test this, the Bayesian code is used to compute the solution when log β = 1.5. The posterior means and equaltail 1σ credibility intervals for the elements are as follows: (18)These results are consistent with expectation: five of the seven credibility intervals include the exact values. Minor deviations occur for τ and ω.
Note that the credibility interval for ω is substantially larger than those for i and Ω. This is a consequence of the small eccentricity, since ω becomes indeterminate as e → 0.
In this strongorbit regime, ℒ is sharply peaked in parameter space; consequently, ξ and therefore π_{1}(ξ  I) vary little within the narrow domain of high likelihood. It follows that posterior densities are then largely determined by ℒ, which overwhelms the prior.
6.2. Varying β
Solutions are computed with log β = −0.6(0.05)1.2, spanning the range from weak to strong orbits. For each β, the elements’ posterior means and 1σ credibility intervals are derived as in Sect. 6.1. In addition, for each data vector , the minχ^{2} solution is computed as in Pourbaix (2002). Note that when β changes, so does the random number seed.
Fig. 4 Sequence of solutions for log a/σ. The points with error bars are the posterior means ⟨log a/σ⟩ plotted with 1σ credibility intervals. The open circles are the minχ^{2} values. The dotted line is the locus of exact values log a_{∗}/σ. 
In Figs. 4−6, the solution sequences are plotted for log a/σ, e, and i. For log β ≳ 0.2, the credibility intervals are consistent with both the minχ^{2} values and with the exact values. However, at log β = −0.05, major disagreements occur. The minχ^{2} value of log a/σ suddenly jumps to 1.14, which is 1.19 dex greater than the exact value. Correspondingly, e jumps to 0.9975 (the highest value allowed by the grid) and i jumps to . Thus, at β = −0.05 dex, the minχ^{2} solution is a Porbit (Sects. 3.2−3.4)
At this same β = −0.05 dex, the 1σ credibility intervals are (− 0.20,0.05) for log a/σ, (0.06,0.44) for e, and (52°,83°) for i. Thus, the Bayesian solution with Copernican prior does not undergo a transition into a Porbit. In fact, the solution remains (marginally) consistent with the exact solution.
For log β ≤ − 0.05, most of the minχ^{2} solutions are Porbits. But the Bayesian solutions with Copernican prior do not exhibit such strikingly anomalies. Nevertheless, they do eventually (log β ≲ − 0.3) become inconsistent with the exact parameters.
Fig. 5 Sequence of solutions for e. The points with error bars are the posterior means ⟨e⟩ plotted with 1σ credibility intervals. The open circles are the minχ^{2} values. The exact value is e_{∗} = 0.05. 
Fig. 6 Sequence of solutions for i. The points with error bars are the posterior means ⟨i⟩ plotted with 1σ credibility intervals. The open circles are the minχ^{2} values. Orbits with i> 90° are retrograde. The exact value is i_{∗} = 40°. 
6.3. Orbits from noise
Figures 4−6 show that even for extremely weak orbits the Copernican prior has eliminated Porbits. However, the plotted credibility intervals reveal that when log β ≲ − 0.3 the posterior PDFs are systematically displaced from the exact values. This could indicate that the orbital signal is then too weak for detection, a conclusion strongly supported by Fig. 6 which shows that the posterior densities of retrograde and prograde orbits are then about equal.
To investigate this issue further, the code is now used to compute solutions when a_{∗} = βσ = 0. The posterior density of a/σ is plotted in Fig. 7 for a particular realization of the noise vector . For comparison, plots with β = 0.5 and 1.0 are also included.
In 20 independent repetitions with β = 0, the range found for ⟨a/σ⟩ is 0.45 to 0.81, with average ⟨⟨a/σ⟩⟩ = 0.57. For ⟨e⟩ the range is 0.21 to 0.53, with ⟨⟨e⟩⟩ = 0.37; and for ⟨i⟩ the range is 67° to 113°, with ⟨⟨i⟩⟩ = 88°. Since these are consistent with Figs. 4−6 when log β ≲ − 0.3, we conclude that the aforementioned systematic displacements simply reflect the code’s reponse to data with negligible orbital signal.
The bias in a/σ for β = 0 evident in Fig. 7 is reminiscent of the bias in the eccentricities of spectroscopic binaries for nearly circular orbits (Lucy & Sweeney 1971). In both cases, bias is the inevitable result of estimating a nonnegative parameter at or near its zero lower bound. From the values of ⟨⟨a/σ⟩⟩ plotted in Fig. 7, the bias of a/σ is typically 0.57, 0.21 and 0.13 at β = 0.0,0.5 and 1.0, respectively.
Fig. 7 Posterior densities of a/σ for β = 0.0,0.5 and 1.0 for particular realizations of the measurement vectors . The vertical dotted lines indicate the average positions of the corresponding posterior means ⟨a/σ⟩. Each of these is obtained from 20 independent simulations. 
Also of interest when β = 0 is the frequency of Porbits for minχ^{2} solutions. From 200 independent simulations, 156 or 78% are Porbits – i.e., have e> 0.95,i ≈ 90° and ω ≈ 90 or 270°. Thus, the PDFs of e,i and ω for minχ^{2} solutions when β = 0 are dominated by near delta functions at the Pourbaix loci.
Figures 8−10 plot the posterior densities of e,cosi and ω for β = 0 when the Copernican prior is included. While these plots are free from peaks at the Pourbaix loci, they do show evidence of imperfections that presumably derive from the approximate treatment of the Copernican prior (Sect. 4.4). Ideally, when analysing pure noise, the inferred values of cosi and ω should be uniformly distributed in (− 1, + 1) and (0,360°), respectively. Figures 9, 10 show departures from this ideal.
As in Fig. 7, Figs. 8−10 also include solutions for β = 1. Figs. 8 and 9 show the emergence of peaks at the exact values of e and cosi, repectively. However, an emerging peak is not evident at ω_{∗} in Fig. 10. This is due to the near indeterminacy of ω when e ≪ 1 – see Sect. 6.1.
Fig. 8 Posterior PDFs for e when β = 0 and 1. The Pourbaix peak at e = 1 and the exact value e_{∗} = 0.05 are indicated. 
Fig. 9 Posterior PDFs for i when β = 0 and 1. The Pourbaix peak at cosi = 0 and the exact value cosi_{∗} are indicated. 
Fig. 10 Posterior PDFs for ω when β = 0 and 1. The Pourbaix peaks at ω = 90,270° and the exact value ω_{∗} = 150° are indicated. 
6.4. ξprobabilities
Given that Porbits arise when applying a conventional data analysis technique to synthetic Gaia data, there is some danger that such orbits will contaminate the huge data bases expected from the Gaia mission. On the assumption that this Bayesian procedure cannot feasibly replace the existing pipeline analyses, a less ambitious approach to elimating Porbits is desirable.
Let θ_{0} be the minχ^{2} elements derived from an observed scan vector and let the corresponding fitted vector be s_{0}. From s_{0}, the astrometric length ξ_{0} of the orbit θ_{0} is then given by Eq. (10). This scalefree length refers to an orbit with physical parameters (P_{0},e_{0}) observed at epoch τ_{0} and orientation (i_{0},ω_{0},Ω_{0}). We now define p_{0} to be the probability that a shorter length ξ would be found with random epochs and orientations but with P and e fixed at their minχ^{2} values. Thus, with steps 3)−6) of Sect. 4.5, we compute (19)In Fig. 11, log p_{0} is plotted against log β for minχ^{2} orbits. For strong orbits, the values scatter about the exact value = − 0.188. But for weak orbits the solutions are the CPrviolating Porbits with log p_{0} ~ − 4. Accordingly, if a Gaia orbit catalogue were contaminated by Porbits a cut excluding orbits with p_{0}< − 3 dex would remove them.
Fig. 11 ξprobabilities p_{0} for a βsequence of minχ^{2} orbits computed for simulations of the orbit defined by Eq. (1). The dotted line is the exact value p_{∗} = 0.649. 
Besides Porbits, other as yet unrecognized anomalies, biases and selection effects may be present in a Gaia catalogue. Accordingly, it is worth noting that there are five quantities which, in a perfect catalogue, are uniformly and independently distributed in (0,1). These quantities are: (1 + cosi)/2,ω/ 2π,Ω /π,τ and p_{0}. This statement yields fifteen statistical tests that should be applied to a catalogue of Gaia orbits.
6.5. Detecting a second companion
A further weakorbit problem for Gaia is that of detecting a second companion (B) when the first (A) is welldetermined. This is a goodnessoffit problem: the presence of B degrades the fit achieved when only A is considered.
To investigate this problem, Gaia data is created (Sects. 2.3, 2.4) for a star with invisible companions A and B. Companion A has the elements given in Eq. (1) with β_{A} = 10, and B is in a coplanar orbit with P = 7.2y,e = 0.2,τ = 0.7, and a reflex orbit with semimajor axis = β_{B}σ.
A 1D sequence of Gaia scans is created for this twocompanion model with log β_{B} = −0.6(0.05)0.6, and each scan is analysed with the Bayesian code under the assumption of only one companion.
For each β_{B}, the code creates (Sect. 5) a cloud of orbits θ_{m} with weights μ_{m}. The χ^{2} of the mth orbit’s fit to the data vector is (20)Now, if the onecompanion solution provides a satisfactory fit, then orbits of high weight should have . On the other hand, if the solution is not satisfactory, then these highweight orbits will have with α< 0.05. These expectations can be reduced to a single measure of goodnessoffit, namely ⟨χ^{2}⟩, the posterior mean of χ^{2}. This is calculated from Eq. (17) with .
The values of ⟨χ^{2}⟩ are plotted against log β_{B} in Fig. 12. As always with statistical tests, the investigator has discretion as to when he deems a model to be successful. In this case, he is likely to suspect an additional orbit when log β_{B} ≥ 0.3. On the other hand, scans with log β_{B} ≤ 0.1 are fitted with , the residuals are therefore consistent with measurement errors and so there is no evidence of an additional orbit.
Fig. 12 Detecting a second companion. The filled circles are the posterior means ⟨χ^{2}⟩ measuring the goodnessoffits of the Bayesian singleorbit solutions to the simulated scan vectors for the twoorbit model. The amplitudes are β_{A} = 10 and log β_{B} = −0.6(0.05)0.6. The open circles are the corresponding values of given by Eq. (4). The dotted lines are the values for α = 0.025,0.975 with N = 70. 
7. Conclusion
The aims of this paper are twofold. First, to provide a weakorbit analysis for Gaia and, in particular, to investigate the occurrence of the spurious solutions found by Pourbaix (2002). Secondly, to use the Gaia problem as a test case for a procedure that incorporates the CPr into the machinery of statistical astronomy.
With regard to spurious solutions, Pourbaix’s (2002) finding is confirmed and the puzzle of nearly parabolic, edgeon orbits explained in terms of the near degeneracy (Sect. 3.4) of scan vectors s under the addition of such orbits. Moreover, these orbits are shown to violate the CPr (Sect. 3.3) and do not arise when the CPr is adopted as a fundamental postulate in Bayesian estimation (Sects. 5 and 6).
More generally, incorporating the CPr in statistical analyses may improve solutions derived for imperfect experiments (Sect. 3.5). In addition to poor precision and limited sampling, weather, the seasons, atmospheric opacity and interstellar extinction are among the numerous factors that result in data sets that are less than ideal.
When an astronomer must perforce analyse an imperfect data set, he needs to be aware that supposedly optimum statistical procedures – e.g., minχ^{2} or Bayesian estimation with noninformative priors – can in extreme cases, as with the Porbits, give anamolous solutions. Moreover, at a more subtle level, even when an anomaly is not immediately evident, the complicated topology (Sect.3.2) of the likelihood function implies that the above “optimum” procedures are unlikely to be so. An investigation of how an estimation procedure can exploit imperfect data should be carried out (Sect. 3.2) and an appropriate prior constructed. Often Copernican considerations with regard to position, epoch or orientation will be crucial.
Acknowledgments
I thank A. H. Jaffe and D. J. Mortlock for helpful discussions on Bayesian methods and the referee for justified criticisms of the original version.
References
 Casertano, S., Lattanzi, M. G., Sozzetti, A., et al. 2008, A&A, 482, 699 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Gentle, J. E. 2009, Computational Statistics (New York : Springer) [Google Scholar]
 James, F. 2006, Statistical Methods in Experimental Physics (Singapore: World Scientific Publishing Co) [Google Scholar]
 Jaynes, E. T. 2003, Probability Theory, The Logic of Science (Cambridge: Cambridge University Press) [Google Scholar]
 Lucy, L. B. 2014a, A&A, 563, A126 (L14a) [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Lucy, L. B. 2014b, A&A, 565, A37 (L14b) [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Lucy, L. B., & Sweeney, M. A. 1971, AJ, 76, 544 [NASA ADS] [CrossRef] [Google Scholar]
 Perryman, M. 2009, Astronomical Applications of Astrometry (Cambridge: Cambridge University Press) [Google Scholar]
 Pourbaix, D. 2002, A&A, 385, 686 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Pourbaix, D. 2004, ASPC Conf. Ser., 318, 132 [NASA ADS] [Google Scholar]
 Press, W. H., Teukolsky, S. A., Vetterling, W. T., & Flannery, B. P. 1992, Numerical Recipes, 2nd edn. (Cambridge: Cambridge Univ. Press) [Google Scholar]
 Sozzetti, A., Giacobbe, P., Lattanzi, et al. 2014, MNRAS, 437, 497 [NASA ADS] [CrossRef] [Google Scholar]
Appendix A: Statistics in ThieleInnes space
In L14a,b, each observation of the model visual binary yielded two measurements , the sky coordinates of the secondary’s displacement from the primary at time t_{n}. In this circumstance, minimizing χ^{2} to obtain the ThieleInnes constants separates into two independent problems, minimizing the xcoordinate contribution to χ^{2} to obtain (A,F) and minimizing the ycoordinate contribution to obtain (B,G). This separation results in the considerable simplifications exploited in L14a,b.
However, these simplifications are lost when observing an astrometric binary with a 1D scanning device. On the assumption that the parallactic and proper motion have been subtracted, the measurement at t_{n} with scanning angle α_{n} is the observed star’s displacement from the binary’s barycentre.
Appendix A.1: Normal equations
For given orbit θ ≡ (φ,ψ), the goodnessoffit criterion χ^{2}(φ,ψ) is given by Eq. (5). At fixed φ, the orbit (x,y) is linear in ψ. Accordingly, since s_{n} = 0 when ψ = 0, the predicted abscissa at t_{n} is (A.1)Substitution of s_{n} into Eq. (5) then allows the minχ^{2} solution for the ThieleInnes vector ψ to be obtained without iteration. The normal equations are (A.2)where, the curvature matrix, (A.3)and (A.4)The partial derivatives in these equations can be expressed in terms of the elliptical rectangular coordinates X(E),Y(E) via Eqs. (2) and (A.2) of L14a.
The solution of Eq. (A.2) is and we write .
Appendix A.2: Increment in χ^{2}
At fixed φ, a displacement δψ from results in a positive increment δχ^{2}. The abscissa corresponding to this displacement is given by Eq. (A.1). Substitution in Eq. (5) then gives . From the quadratic terms in the resulting expression, we obtain (A.5)
Appendix A.3: Probability density function p(ψ  φ,D)
The distribution of probability at fixed φ is a quadrivariate normal distribution centred on . If Σ is the covariance matrix, then (A.6)
(James 2006, p. 67). Since , comparison with Eq. (A.6) gives (A.7)
Appendix A.4: Random sampling in ψspace
A random point δψ_{ℓ} sampling p(ψ  φ,D) is obtained as follows (Gentle 2009, pp. 315−316): The first step is to compute the Cholesky decomposition (Press et al. 1992, pp. 89−91) of Σ. Thus, we find the lower triangular matrix L such that (A.8)Now let z_{G} be a 4D vector whose elements are independent random Gaussian variates sampling . Then (A.9)is a random displacement from satisfying the PDF given by Eq. (A.7)
If we generate independent displacements, then the points give us the approximation (A.10)which is exact in the limit .
All Figures
Fig. 1 Feasible domain projected on to the (ω,e)plane. The orbital parameters are given by Eq. (1) with β = 2.8. 

In the text 
Fig. 2 Feasible domain projected on to the (i,e)plane. The orbital parameters are given by Eq. (1) with β = 2.8. 

In the text 
Fig. 3 PDF π_{1}(ξ  I) for astrometric “lengths” ξ defined by Eq. (10). The orbits have random orientations (i,ω,Ω), random epochs τ, random eccentricities ∈ (0,1), and random values of log P ∈ (0.0,1.0). The corresponding PDF for Porbits (i = ω = 90°) with e = 0.96 is a near delta function at ξ = 0.14. 

In the text 
Fig. 4 Sequence of solutions for log a/σ. The points with error bars are the posterior means ⟨log a/σ⟩ plotted with 1σ credibility intervals. The open circles are the minχ^{2} values. The dotted line is the locus of exact values log a_{∗}/σ. 

In the text 
Fig. 5 Sequence of solutions for e. The points with error bars are the posterior means ⟨e⟩ plotted with 1σ credibility intervals. The open circles are the minχ^{2} values. The exact value is e_{∗} = 0.05. 

In the text 
Fig. 6 Sequence of solutions for i. The points with error bars are the posterior means ⟨i⟩ plotted with 1σ credibility intervals. The open circles are the minχ^{2} values. Orbits with i> 90° are retrograde. The exact value is i_{∗} = 40°. 

In the text 
Fig. 7 Posterior densities of a/σ for β = 0.0,0.5 and 1.0 for particular realizations of the measurement vectors . The vertical dotted lines indicate the average positions of the corresponding posterior means ⟨a/σ⟩. Each of these is obtained from 20 independent simulations. 

In the text 
Fig. 8 Posterior PDFs for e when β = 0 and 1. The Pourbaix peak at e = 1 and the exact value e_{∗} = 0.05 are indicated. 

In the text 
Fig. 9 Posterior PDFs for i when β = 0 and 1. The Pourbaix peak at cosi = 0 and the exact value cosi_{∗} are indicated. 

In the text 
Fig. 10 Posterior PDFs for ω when β = 0 and 1. The Pourbaix peaks at ω = 90,270° and the exact value ω_{∗} = 150° are indicated. 

In the text 
Fig. 11 ξprobabilities p_{0} for a βsequence of minχ^{2} orbits computed for simulations of the orbit defined by Eq. (1). The dotted line is the exact value p_{∗} = 0.649. 

In the text 
Fig. 12 Detecting a second companion. The filled circles are the posterior means ⟨χ^{2}⟩ measuring the goodnessoffits of the Bayesian singleorbit solutions to the simulated scan vectors for the twoorbit model. The amplitudes are β_{A} = 10 and log β_{B} = −0.6(0.05)0.6. The open circles are the corresponding values of given by Eq. (4). The dotted lines are the values for α = 0.025,0.975 with N = 70. 

In the text 
Current usage metrics show cumulative count of Article Views (fulltext article views including HTML views, PDF and ePub downloads, according to the available data) and Abstracts Views on Vision4Press platform.
Data correspond to usage on the plateform after 2015. The current usage metrics is available 4896 hours after online publication and is updated daily on week days.
Initial download of the metrics may take a while.