Analysing weak orbital signals in Gaia data

L. B. Lucy

doi:10.1051/0004-6361/201424405

Home

All issues

Volume 571 (November 2014)

A&A, 571 (2014) A86

Full HTML

Free Access

Issue		A&A Volume 571, November 2014


Article Number		A86
Number of page(s)		9
Section		Celestial mechanics and astrometry
DOI		https://doi.org/10.1051/0004-6361/201424405
Published online		14 November 2014

A&A 571, A86 (2014)

Analysing weak orbital signals in Gaia data

L. B. Lucy

Astrophysics Group, Blackett Laboratory, Imperial College London, Prince Consort Road, London SW7 2AZ, UK
e-mail: l.lucy@imperial.ac.uk

Received: 15 June 2014
Accepted: 22 August 2014

Abstract

Anomalous orbits are found when minimum-χ² estimation is applied to synthetic Gaia data for orbits with astrometric signatures comparable to the single-scan measurement error (Pourbaix 2002, A&A, 385, 686). These orbits are nearly parabolic, edge-on, and their major axes align with the line-of-sight to the observer. Such orbits violate the Copernican principle (CPr) and as such could be rejected. However, the preferred alternative is to develop a statistical technique that incorporates the CPr as a fundamental postulate. This can be achieved in a Bayesian context by defining a Copernican prior. Pourbaix’s anomalous orbits then no longer arise. Instead, the selected orbits have a somewat higher χ² but do not violate the CPr. The problem of detecting a weak additional orbit in an astrometric binary with a well-determined orbit is also treated.

Key words: binaries: visual / stars: fundamental parameters / methods: statistical

© ESO, 2014

1. Introduction

With the Gaia observatory in orbit at L2 and with commissioning underway, astronomers can look forward with increasing confidence to the eventual release of an enormous quantity of high precision astrometric data. Initially, this data will be analysed with the already-existing pipeline software created by the various consortia. The resulting pipeline products will no doubt be entirely satisfactory for the vast majority of observed objects. However, a lesson from earlier large-scale surveys is that a small number of objects at the limit of a survey’s range often prove to be of exceptional interest. For such objects, standard reduction techniques may give anomalous and misleading results.

This occurred for the Hipparcos mission. As reviewed by Pourbaix (2004) and Perryman (2009, p. 594), orbits fitted to Hipparcos data for stars with known spectroscopic orbits led to “discoveries” that were later refuted. As emphasized by Pourbaix (2004), “fitting the noise with an orbital model can have some awful consequences”.

This earlier episode suggests that the extraction of orbital parameters from weak orbital signals in Gaia data should be investigated. In fact, this is already the subject of an intriguing paper by Pourbaix (2002). He found that min-χ² solutions for weak orbits are frequently anomalous – specifically, edge-on and nearly parabolic. In the present paper, the origin of such orbits is explained and a Bayesian technique developed that overcomes this problem.

2. Synthetic data

In this section, synthetic 1D scans of a model astrometric binary are created. In order to focus on orbital parameters, we follow Pourbaix (2002) in assuming that parallactic and proper motions have been subtracted. With regard to notation, previous papers (Lucy 2014a,b; hereafter L14a,b) are followed closely.

2.1. Orbital elements

In contrast to L14a,b, the secondary is here not detected, so the astrometry measures the primary’s reflex motion about the system’s centre of mass. This motion is parameterized with the Campbell elements P,e,T,a,i,ω,Ω. Here P is the period, e is the eccentricity, T is a time of periastron passage, i is the inclination, ω is the longitude of periastron, and Ω is the position angle of the ascending node. However, following many earlier investigators – references in L14a – the Thiele-Innes elements are also used, thereby exploiting the resulting linearity in four parameters. Thus, the Campbell vector θ ≡ (φ,ϑ), where φ = (P,e,τ), and where ϑ = (a,i,ω,Ω) is replaced by the vector ψ whose components are the Thiele-Innes constants A,B,F,G. (Note that in φ, periastron has been replaced by τ = T/P which by definition ∈ (0,1).)

2.2. Model astrometric binary

The model binary has the following elements: $\begin{matrix} P_{*} = 2.9 y & e_{*} = 0.05 & τ_{*} = 0.4 \\ a_{*} = βσ & i_{*} = 40 ° & ω_{*} = 150 ° Ω_{*} = 70 ° . \end{matrix}$ $\begin{eqnarray} P_{*}=2.9y & \;\;\; e_{*}=0.05 & \;\;\; \tau_{*}=0.4 \nonumber \\ \;\;\; a_{*} = \beta \: \sigma & \;\;\; {i}_{*}=40\degr & \;\;\; \omega_{*} = 150\degr \;\;\;\; \Omega_{*} = 70\degr. \end{eqnarray}$ (1)Note that P_∗ is less than t_M = 5y, the duration of the Gaia mission, so that the issue of incomplete orbits (L14a) is not of concern here. Also the semi-major axis a_∗ is expressed as a dimensionless multiple β of the standard error σ of a single-scan measurement. Thus our ability to detect weak orbits can be investigated by letting β → 0.

The eccentricity e_∗ = 0.05 is typical for giant planets in the solar system. But the main reason for such a small value is to highlight the anomaly when nearly parabolic orbits fit the data.

2.3. Observing campaign

A Gaia-like observing campaign is defined by t_n, the N times at which the star is scanned, by α_n, the corresponding scanning angles, and by σ. We take t_n = t_Mz_u and α_n = 2πz_u, where the z_u here and later denote independent random numbers ∈ (0,1).

Although β = a_∗/σ is the important parameter, we take σ = 40 μas, the expected accuracy for a single transit at G-band magnitude ≈ 14 (see Fig. 2 in Sozzetti et al. 2014). Note that, in the comprehensive investigation of planet detection with Gaia by Casertano et al. (2008), σ = 8 μas.

From Fig. 1 in Sozzetti et al. (2014), we take N = 70 as a representive number of scans during the mission.

2.4. Synthetic scans

Given β, Eq. (1) defines the theoretical orbit. The Cartesian sky coordinates $(x_{n}^{*}, y_{n}^{*})$ $\hbox{$(x^{*}_{n},y^{*}_{n})$}$ at t_n can therefore be computed from Eqs. (A.2) of L14a. The corresponding 1D coordinate or abscissa is $s_{n}^{*} = s_{n} (x_{n}^{*}, y_{n}^{*})$ $\hbox{$s^{*}_{n} = s_{n}(x^{*}_{n},y^{*}_{n})$}$ , where $s_{n} = x_{n} \cos α_{n} + y_{n} \sin α n$ $\begin{equation} s_{n} = x_{n} \: \cos \: \alpha_{n} + y_{n} \: \sin \: \alpha_{n} \end{equation}$ (2)Here α_n is the angle between the scanning direction and the x-axis – see Fig. 1 in Pourbaix (2002). A synthetic data set is then $\begin{matrix} ˜ \\ s_{n} \end{matrix} = s_{n}^{*} + σz G$ $\begin{equation} \tilde{s}_{n} = s^{*}_{n} + \sigma \: z_{\rm G} \end{equation}$ (3)where the z_G are independent random Gaussian variates sampling $\hbox{${\cal N}(0,1)$}$ . Note that the χ² of the measurement errors is simply ${χ^{2}}_{˜} = \sum_{n} z_{G}^{2}$ $\begin{equation} \tilde{\chi}^{2} = \sum_{n} \: z_{\rm G} ^{2} \end{equation}$ (4)The N-dimensional vector $\begin{matrix} ˜ \\ s \end{matrix}$ $\hbox{$\tilde{\vec{s}}$}$ with elements $\begin{matrix} ˜ \\ s_{n} \end{matrix}$ $\hbox{$\tilde{s}_{n}$}$ is the data vector from which orbital elements are to be estimated. For a given orbit θ ≡ (φ,ψ), the goodness-of-fit to $\begin{matrix} ˜ \\ s \end{matrix}$ $\hbox{$\tilde{\vec{s}}$}$ is measured by $χ^{2} = \frac{1}{σ^{2}} \sum_{n} (\begin{matrix} ˜ \\ s_{n} \end{matrix} - s_{n}) 2$ $\begin{equation} \chi^{2} = \frac{1}{\sigma^{2}} \sum_{n} (\tilde{s}_{n} - s_{n})^{2} \end{equation}$ (5)where s_n = s(t_n,α_n;θ).

Fig. 1

Feasible domain $\hbox{${\cal D}$}$ projected on to the (ω,e)-plane. The orbital parameters are given by Eq. (1) with β = 2.8.

3. Feasible orbits

In this section, a procedure from L14a is used to explore the likely degradation of extracted orbits as β → 0.

3.1. Grid scan

A 3D grid in the φ variables is set up as follows: the mid-point of grid cell (i,j,k) is (log P_i,e_j,τ_k). The grid has 200 constant steps in each of these variables, with ranges (0.0,1.0) for log P, and (0,1) for e and τ.

For specified β, a synthetic scan vector $\begin{matrix} ˜ \\ s \end{matrix}$ $\hbox{$\tilde{\vec{s}}$}$ is created as described in Sect. 2.4. Then, at each grid point, the min-χ² Thiele-Innes vector $\hat{ψ}$ $\hbox{$\hat{\vec{\psi}}$}$ is computed as described in Appendix A.1. The resulting $χ^{2} = χ̂ \begin{matrix} 2 \\ ijk \end{matrix}$ $\hbox{$\chi^{2} = \hat{\chi}^{2}_{ijk}$}$ .

Fig. 2

Feasible domain $\hbox{${\cal D}$}$ projected on to the (i,e)-plane. The orbital parameters are given by Eq. (1) with β = 2.8.

3.2. Feasible domain $\hbox{${\cal D}$}$

An orbit $θ_{ijk} = (φ_{ijk}, \hat{ψ})$ $\hbox{$\vec{\theta}_{ijk} = (\vec{\phi}_{ijk},\hat{\vec{\psi}})$}$ is deemed to be feasible if $\Pr (χ^{2} > χ̂ {\begin{matrix} 2 \\ ijk \end{matrix}}^{)} > 0.05$ $\begin{equation} {\rm Pr}\left(\chi^{2} > \hat{\chi}^{2}_{ijk}\right) > 0.05 \end{equation}$ (6)and the ensemble of such orbits define the feasible domain(s) $\hbox{${\cal D}$}$ in φ-space.

For β ≳ 10, the domain $\hbox{${\cal D}$}$ is a small ellipsoidal volume approximately centred on the exact values (log P_∗,e_∗,τ_∗). But as β decreases, $\hbox{${\cal D}$}$ increases and eventually develops extraordinary topology.

From a sequence of grid scans with β → 0, the value β = 2.8 is found to be such that $\hbox{${\cal D}$}$ just extends to e = 1. Figure 1 illustrates the resulting distortions of $\hbox{${\cal D}$}$ . In this figure, a filled circle is plotted at (ω,e) if Eq. (6) is satisfied, and we see that this projection of the feasible orbits θ_ijk extends far beyond the exact values (150°,0.05). The most notable features are the two narrow spikes that emerge at e ~ 0.4 and reach e = 1 at precisely ω = 90° and 270°. Further information about these spikes is provided by other projections of the θ_ijk. In Fig. 2, the vectors are projected onto the (i,e)-plane, and this shows that along both spikes i → 90° as e → 1. Accordingly, if the orbital signal is weak enough (β< 2.8), an acceptable fit is provided by nearly parabolic, edge-on orbits with ω = π/ 2 or 3π/ 2. But this remarkable finding is not original: Pourbaix (2002), in reporting least-squares fits to synthetic 1D scans, found an accumulation of nearly parabolic orbits when β = a_∗/σ = 1.33 and noted that such orbits lead to reasonable apparent orbits when i and ω are close to π/ 2. This serendipitous numerical discovery posed what he called “the puzzling case of almost parabolic orbits”.

3.3. Violations of the Copernican principle

When the observed star is at periastron (t = T), its Cartesian coordinates are: $\begin{matrix} x = & a_{p} & (\cos Ω \cos ω - \sin Ω \sin ω \cos i) \\ y = & a_{p} & (\sin Ω \cos ω + \cos Ω \sin ω \cos i) \\ z = & a_{p} & \sin ω \sin i \end{matrix}$ $\begin{eqnarray} x = & a_{\rm p} & \! ( \cos \Omega \: \cos \omega - \sin \Omega \: \sin \omega \: \cos i) \nonumber \\ y = & a_{\rm p} & \! ( \sin \Omega \: \cos \omega +\cos\Omega \: \sin \omega \: \cos i) \nonumber \\ z = & a_{\rm p} & \sin \omega \: \sin i \end{eqnarray}$ (7)where the periastron distance a_p = a(1 − e). Accordingly, if i = π/ 2 and ω = π/ 2 or 3π/ 2, periastron has coordinates (0,0, ± a_p). The major axis is thus aligned with the line-of-sight to the observer, who therefore finds himself in a special location. The observer might then object that this orbit violates the Copernican principle (CPr). But this objection could be raised against a slightly non-circular orbit with the same (i,ω), and an edge-on circular orbit is not particularly objectionable from the standpoint of the CPr. To fully appreciate the CPr violation, the extra fact that Pourbaix orbits (hereafter P-orbits) are nearly parabolic must therefore be taken into account.

Consider the astrometric signal’s dependence on orbit orientation. Since the maximum elongation of the star from the barycentre is a(1 + e), the extreme range for the abscissae s_n is −a(1 + e) to + a(1 + e). However, when i = π/ 2 and ω = π/ 2 or 3π/ 2, this range shrinks to its minimum, namely −b to + b, where $b = a \sqrt{1 - e^{2}}$ $\hbox{$b = a \sqrt{1-e^{2}}$}$ is the semi-minor axis Thus, for the Pourbaix solutions, the orbit’s inferred orientation and eccentricity are such that the signal is at a deep minimum. For example, the ratio of the maximum to minimum elongations is $\sqrt{(1 + e) / (1 - e)} = 14.1$ $\hbox{$\sqrt{(1+e)/(1-e)}= 14.1$}$ when e = 0.99. This is a large effect, and so the observer would be correct in concluding that P-orbits violate the CPr.

Traditionally, when an analysis leads to a CPr violation, astronomers suspect that some underlying hypothesis H must be wrong. A classic example is Herschel’s model of the Milky Way, which violates the CPr because the sun is close to its centre. In this case, the error is Herschel’s implicit assumption that interstellar space is transparent.

But note a crucial difference. In these simulations – and in those of L14a – CPr violations arise even though H – Keplerian motion – is rigorously correct. This strongly implies that there must exist a data-analysis technique that includes the CPr ab initio rather than invoking it to pass judgement on a model only after it has been derived.

3.4. Degeneracy

For a single star, the astrometric solution has five parameters: the star’s right ascension (RA) and declination at a reference epoch, two components of proper motion, and its parallax. However, because of errors in the $\begin{matrix} ˜ \\ s_{n} \end{matrix}$ $\hbox{$\tilde{s}_{n}$}$ , this solution has residuals, and so it is likely that the addition of orbital motion (seven parameters) will “improve” the fit – i.e., reduce χ². Given their minimal astrometric signatures (Sect. 3.3), P-orbits with b ≲ σ can be added with little effect on the fit. Evidently, a single-star solution is degenerate under the addition of a P-orbit with arbitrarily large semi-major axis a so long as the semi-minor axis $b = a \sqrt{1 - e^{2}} ≪ σ$ $\hbox{$b = a \sqrt{1 - e^{2}} \; \ll \sigma$}$ .

3.5. Imperfect experiments

For reasons beyond the observer’s control, experiments in astronomy are often imperfect, yielding data from which a definitive solution cannot be obtained. In double star astronomy, examples are long-period binaries that have only completed a fraction of an orbit since discovery. If a solution is nevertheless attempted, orbits with very different parameters may provide acceptable fits – see Fig. 2 in L14a and references therein. Among these acceptable orbits may be orbits that violate the CPr, as is the case for the nearly parabolic orbit in that figure.

Comparison of the simulations here with those for incomplete orbits in L14a is illuminating. Here and in Pourbaix (2002), we find CPr violations even though the orbit is complete (P_∗<t_M). This shows that a weak orbital signal suffices for the experiment to be imperfect and to thereby permit solutions that violate the CPr.

When the min-χ² elements violate the CPr, we might suspect that there exists a better solution that, despite a higher χ², should be preferred because it is consistent with the CPr.

4. A Bayesian prior derived from the CPr

The CPr is now treated as an integral part of Bayesian estimation and not as an a posteriori arbiter of a solution’s believability. This is achieved by constructing a Copernican prior.

4.1. Conventional priors

If H denotes the hypothesis and D the data and if I is some relevant information, then by Bayes’ theorem (Jaynes 2003, p. 85), the posterior density of H given D and I is $\Pr (H | D,I) \propto \Pr (H | I) \Pr (D | H,I)$ $\begin{equation} {\rm Pr}(H|D,I) \: \propto \: {\rm Pr}(H|I) \: {\rm Pr}(D| H,I) \end{equation}$ (8)Here Pr(D | H,I) ≡ ℒ(H,I | D) is the likelihood, and Pr(H | I) is the prior probability of H given I.

If there is no information I, the prior reduces to Pr(H) and so becomes the subjective choice of the investigator. This aspect of Bayesian estimation is controversial and much-debated. However, there is little reason to object to current astronomical practice with regard to Pr(H) since the aim is not to quantify prejudice but to admit ignorance. Thus flat priors are typically imposed on the parameters of H. Moreover, the ranges over which these priors are non-zero are chosen to comfortably enclose the intervals within which there is significant likelihood ℒ and therefore significant posterior density Pr(H | D). Such priors are non-informative.

In the Gaia problem, it is tempting to use a variant of this methodology to eliminate CPr-violating orbits. Thus, Fig. 1 suggests a prior on e that is zero for e> 0.6. But this would be an ad hoc fix for this particular data set. A Bayesian prior should not depend on, nor be derived from, D.

4.2. A Copernican prior

For the problem under consideration, the symbols H,D and I are defined as follows:

H:
the components of the theoretical scan vector s are s_n = s(t_n,α_n;θ), the predicted abscissae at the known times t_n and scanning angles α_n for the Keplerian orbit θ.
D:
the elements of the data vector $\begin{matrix} ˜ \\ s \end{matrix}$ $\hbox{$\tilde{\vec{s}}$}$ are $\begin{matrix} ˜ \\ s_{n} \end{matrix}$ $\hbox{$\tilde{s}_{n}$}$ , the measured abscissae at (t_n,α_n).
I:
orbits θ with random orientations and random shifts in epoch are all equally probable a priori.

Comments:

(i)
information I is the means of incorporating the CPr.
(ii)
since orbits are periodic, choosing a random value of τ = T/P ∈ (0,1) is equivalent to a random shift in epoch.
(iii)
imposing I is appropriate for orbits discovered with Gaia but not if a previously-known orbit is targeted.

Given that the hypothesis of Keplerian motion enters via the theoretical vector s, the Copernican prior Pr(H | I) becomes π(s | I), the probability density at s when I is taken into account. However, I does not itself suffice to determine this probability density function (PDF). In addition, P,e and a – or their prior distributions – must be specified. We choose the latter option on the grounds of simplicity.

For the bounded quantity e, we assume a uniform prior in (0,1). For the unbounded positive continuous parameters P and a, it is appropriate (Jaynes 2003, p. 395) to assign equal prior probabilities to equal logarthmic intervals – i.e., Jeffreys’ priors.

With these additional assumptions, the Copernican prior is determinate and given by $\begin{matrix} \Pr (H | I) \propto a^{-1} π_{1} (s | I) \end{matrix}$ $\begin{eqnarray} {\rm Pr}(H|I) \propto a^{-1} \pi_{1}(\vec{s}|I) \end{eqnarray}$ (9)where π₁(s | I) is the PDF at s for orbits scaled to a = 1″.

4.3. Accurate treatment

A rigorous calculation of π₁(s | I) would proceed as follows: The scale parameter a is set = 1″, and e and log P are randomly chosen in (0,1) and (log P_L,log P_U), respectively. The orbit’s orientation (i,ω,Ω) and epoch τ = T/P are randomly chosen, and the resulting theoretical s computed. These steps are repeated many times, thus generating points that populate the N-dimensional s-space with a probability density determined by I and by the prior distributions of e and P. As the sample size → ∞, the result is the desired PDF π₁(s | I). However, with N ~ 70, this brute-force approach is not feasible. A less rigorous approach must be adopted.

4.4. Approximate treatment

Consider an orbit with a = 1″ and eccentricity e. Sampled with random orientations and epochs, the theoretical abscissae s_n will extend over the full permitted range, namely −(1 + e) to + (1 + e). In other words, whatever the values of (t_n,α_n), there will be some combination of orientation (i,ω,Ω) and epoch τ for which these limits are reached. Given that e ∈ (0,1), it follows that the s_n-values populate the interval (− 2, + 2). Since this applies to every component of s, the distribution of the vectors s in N-dimensional space is approximately isotropic. Accordingly, most of the information relevant to CPr violations is contained in the distribution of the Euclidean “lengths” of the vectors s. We therefore define the statistic ξ given by $ξ^{2} = N^{-1} \sum_{n} (s_{n} / a) 2$ $\begin{equation} \xi^{2} \: = \: N^{-1} \sum_{n} (s_{n}/a)^{2} \end{equation}$ (10)With this statistic as the sole basis for assessing CPr violations, the approximate Copernican prior is $\begin{matrix} \Pr (H | I) \propto a^{-1} π_{1} (ξ | I) \end{matrix}$ $\begin{eqnarray} {\rm Pr}(\: H| \: I) \: \propto \: a^{-1} \: \pi_{1}(\: \xi|\: I) \end{eqnarray}$ (11)where π₁(ξ | I) is the PDF of the lengths ξ for orbits scaled to a = 1″. This prior is used in the subsequent tests.

4.5. Calculation of π₁(ξ | I)

With the above assumptions, the problem has been reduced to tabulating the 1D function π₁(ξ | I). The steps are as follows:

1)
The campaign (t_n,α_n;N) is specified (Sect. 2.3).
2)
We set a = 1″ and choose random values of e ∈ (0,1) and log P ∈ (log P_L,log P_U).
3)
A random orientation is selected by taking ω = 2πz_u,Ω = πz_u, and cosi = 1 − 2z_u.
4)
A random epoch is selected by taking τ = T/P = z_u.
5)
With the orbit vector θ determined in steps 2)–4), the coordinates (x_n,y_n) at t_n are computed.
6)
From these coordinates, the components of s are given by Eq. (2), and the corresponding length ξ by Eq. (10).
7)
Steps 2)–6) are repeated 10⁸ times. The resulting histogram of ξ-values gives π₁(ξ | I).

Comments:

(i)
To eliminate parabolic orbits and to avoid convergence failureswhen solving Kepler’s equation, an upper limite = 0.999 is imposed at step 2).
(ii)
Since the exact period is known, we take P_L = 1y and P_U = 10y.
(iii)
From the 10⁸ values of s_n for each n, the maximum and minimum values are derived. These closely approach the expected values ± 2 (Sect. 4.4).
(iv)
If no assumptions are made about the prior distributions of P and e, then π₁ = π₁(ξ;P,e | I), thus requiring an extra two dimensions in its tabulation. This is feasible, but the preference here is to investigate the simplest formulation.

The resulting accurate determination of π₁(ξ | I) is plotted in Fig. 3. This shows that the astrometric lengths are typically in the interval (0.4,1.1), and that values ≲ 0.3 are improbable.

Fig. 3

PDF π₁(ξ | I) for astrometric “lengths” ξ defined by Eq. (10). The orbits have random orientations (i,ω,Ω), random epochs τ, random eccentricities ∈ (0,1), and random values of log P ∈ (0.0,1.0). The corresponding PDF for P-orbits (i = ω = 90°) with e = 0.96 is a near delta function at ξ = 0.14.

The position of a P-orbit in this plot is of interest. The above steps are therefore repeated with the constraints e = 0.96 and i = ω = 90°. For this orbit, the range for the s_n is (− b, + b) or (− 0.28, + 0.28) in units of a. Consistent with this, the PDF is a near delta function at ξ = 0.14, and this location is indicated in Fig. 3. The probability of obtaining an even smaller value is $\begin{matrix} Π_{1} (ξ | I) = \int_{0}^{ξ} π_{1} (ξ | I) d ξ \end{matrix}$ $\begin{eqnarray} \Pi_{1} (\xi| I) = \int_0^{\xi} \pi_{1}(\xi|I) \: {\rm d} \xi \end{eqnarray}$ (12)which gives Π₁(0.14 | I) = 1.7 × 10^-4, showing that P-orbits populate an extremely low probability tail of π₁(ξ | I).

5. Bayesian estimation subject to the CPr

Formulae are now developed that allow the Copernican prior to be included in the calculation of posterior densities and credibility intervals.

5.1. Posterior densities

For every orbit in θ-space, there is a theoretical scan vector s corresponding to the scanning campaign (t_n,α_n). From this s and the orbit’s a, we can compute ξ from Eq. (10). Then, from ξ, we obtain π₁(ξ | I) by interpolating in the data file plotted in Fig. 3. This procedure results in an ensemble of orbit vectors weighted according to their Copernican priors a^-1π₁(ξ | I), and so CPr violations are penalized. From this ensemble, the Bayesian machinery then computes posterior densities by further weighting the orbits in accordance with their goodness-of-fits to the measured scan vector $\begin{matrix} ˜ \\ s \end{matrix}$ $\hbox{$\tilde{\vec{s}}$}$ .

The posterior density at (φ,ψ) is $\begin{matrix} Λ (φ, ψ | D,I) \propto a^{-1} π_{1} (ξ | I) ℒ (φ, ψ | D) . \end{matrix}$ $\begin{eqnarray} \Lambda(\vec{\phi}, \vec{\psi} | D,I) \propto a^{-1} \: \pi_{1}(\xi|I) \: {\cal L} \left(\vec{\phi}, \vec{\psi}| D\right) . \end{eqnarray}$ (13)Ignoring coefficients independent of (φ,ψ) and assuming normally distributed measurement errors, we have $\begin{matrix} ℒ \propto \exp (- \frac{1}{2} χ̂ 2) \times \exp (- \frac{1}{2} δ χ^{2}) \end{matrix}$ $\begin{eqnarray} {\cal L} \: \propto \: \exp \left( -\frac{1}{2} \: \hat{\chi}^{2} \right) \times \exp \left( -\frac{1}{2} \: \delta \chi^{2}\right) \end{eqnarray}$ (14)where $\hbox{$\hat{\chi}^{2}(\vec{\phi})$}$ is the minimum at $\hat{ψ}$ $\hbox{$\hat{\vec{\psi}}$}$ , and δχ² is the postive increment due to the displacement $ψ - \hat{ψ}$ $\hbox{$\vec{\psi} - \hat{\vec{\psi}}$}$ at fixed φ.

In the absence of the Copernican prior, Λ ∝ ℒ ∝ Pr(φ) × Pr(ψ | φ). The first of these PDFs is sampled at grid points (i,j,k) giving weight factors $\propto \exp (- χ_{ijk}^{2} / 2)$ $\hbox{${\propto}\exp(-\chi^{2}_{ijk}/2)$}$ . The second PDF is randomly sampled as described in Appendix A.4. If $\hbox{${\cal N}_{ijk}$}$ is the number of random points ψ_ℓ selected in ψ-space at φ_ijk, then each has weight $𝒩_{ijk}^{-1}$ $\hbox{${\cal N}_{ijk}^{-1}$}$ .

With the Copernican prior included, the PDF Λ given in Eq. (13) is represented by a cloud of discrete orbit vectors $\begin{matrix} θ_{m} \equiv (φ_{ijk}, ψ_{ℓ}) \end{matrix}$ $\begin{eqnarray} \vec{\theta}_{m} \equiv (\vec{\phi}_{ijk}, \vec{\psi}_{\ell}) \end{eqnarray}$ (15)with weights $\begin{matrix} μ_{m} = a_{m}^{-1} π_{1} (ξ_{m} | I) \times 𝒩_{ijk}^{-1} \exp (- \frac{1}{2} χ̂ \begin{matrix} 2 \\ ijk \end{matrix}) \end{matrix}$ $\begin{eqnarray} \mu_{m} = a_{m}^{-1} \pi_{1} (\xi_{m}|I) \times \: {\cal N}_{ijk}^{-1} \: \exp \left( -\frac{1}{2} \: \hat{\chi}^{2}_{ijk}\right) \end{eqnarray}$ (16)Here m enumerates the random points ψ_ℓ across all grid cells (i,j,k).

From these weighted orbits, the posterior mean of a quantity Q(θ) given Dand I is $\begin{matrix} ⟨ Q ⟩ = \sum_{m} μ_{m} Q_{m} / \sum_{m} μ_{m} \end{matrix}$ $\begin{eqnarray} \langle Q \rangle \: = \: \sum_{m} \: \mu_{m} Q_{m} \:/\: \sum_{m} \: \mu_{m} \end{eqnarray}$ (17)and credibility intervals are derived as described in Sect. 4.2 of L14b.

This discrete representation of Λ and the resulting credibility means and intervals become exact as the grid steps → 0 and the $\hbox{${\cal N}_{ijk} \rightarrow \infty$}$ .

6. Numerical experiments

The approximate theory (Sect. 4.4) of the Copernican prior is now applied to the model binary defined in Eq. (1).

6.1. Code verification

In the strong-orbit limit, violations of the CPr are not an issue and so, even with the inclusion of the Copernican prior, the posterior means should → the exact elements given in Eq. (1). To test this, the Bayesian code is used to compute the solution when log β = 1.5. The posterior means and equal-tail 1σ credibility intervals for the elements are as follows: $\begin{matrix} \log P (y) = 0.462 5_{-0.0013}^{+ 0.0011} & e = 0.052 9_{-0.0039}^{+ 0.0039} \\ τ = 0.38 4_{-0.012}^{+ 0.012} \\ \log a / σ = 1.499 0_{-0.0038}^{+ 0.0033} & i = 40 .̊ 6_{- 1 .̊ 0}^{+ 1 .̊ 0} \\ ω = 144 .̊ 8_{- 4 .̊ 9}^{+ 4 .̊ 9} & Ω = 69 .̊ 6_{- 1 .̊ 3}^{+ 1 .̊ 3} . \end{matrix}$ $\begin{eqnarray} \log P(y) = 0.4625^{+0.0011}_{-0.0013} & \;\; e = 0.0529^{+0.0039}_{-0.0039} \nonumber \\ &\tau = 0.384^{+0.012}_{-0.012} \nonumber \\ \log a/\sigma = 1.4990^{+0.0033}_{-0.0038} & i = 40\fdg6^{+1\fdg0}_{-1\fdg0} \nonumber \\ \omega = 144\fdg8^{+4\fdg9}_{-4\fdg9} & \Omega = 69\fdg6^{+1\fdg3}_{-1\fdg3}. \end{eqnarray}$ (18)These results are consistent with expectation: five of the seven credibility intervals include the exact values. Minor deviations occur for τ and ω.

Note that the credibility interval for ω is substantially larger than those for i and Ω. This is a consequence of the small eccentricity, since ω becomes indeterminate as e → 0.

In this strong-orbit regime, ℒ is sharply peaked in parameter space; consequently, ξ and therefore π₁(ξ | I) vary little within the narrow domain of high likelihood. It follows that posterior densities are then largely determined by ℒ, which overwhelms the prior.

6.2. Varying β

Solutions are computed with log β = −0.6(0.05)1.2, spanning the range from weak to strong orbits. For each β, the elements’ posterior means and 1σ credibility intervals are derived as in Sect. 6.1. In addition, for each data vector $\begin{matrix} ˜ \\ s \end{matrix}$ $\hbox{$\tilde{\vec{s}}$}$ , the min-χ² solution is computed as in Pourbaix (2002). Note that when β changes, so does the random number seed.

Fig. 4

Sequence of solutions for log a/σ. The points with error bars are the posterior means ⟨log a/σ⟩ plotted with 1σ credibility intervals. The open circles are the min-χ² values. The dotted line is the locus of exact values log a_∗/σ.

In Figs. 4−6, the solution sequences are plotted for log a/σ, e, and i. For log β ≳ 0.2, the credibility intervals are consistent with both the min-χ² values and with the exact values. However, at log β = −0.05, major disagreements occur. The min-χ² value of log a/σ suddenly jumps to 1.14, which is 1.19 dex greater than the exact value. Correspondingly, e jumps to 0.9975 (the highest value allowed by the grid) and i jumps to $\hbox{$89\fdg6$}$ . Thus, at β = −0.05 dex, the min-χ² solution is a P-orbit (Sects. 3.2−3.4)

At this same β = −0.05 dex, the 1σ credibility intervals are (− 0.20,0.05) for log a/σ, (0.06,0.44) for e, and (52°,83°) for i. Thus, the Bayesian solution with Copernican prior does not undergo a transition into a P-orbit. In fact, the solution remains (marginally) consistent with the exact solution.

For log β ≤ − 0.05, most of the min-χ² solutions are P-orbits. But the Bayesian solutions with Copernican prior do not exhibit such strikingly anomalies. Nevertheless, they do eventually (log β ≲ − 0.3) become inconsistent with the exact parameters.

Fig. 5

Sequence of solutions for e. The points with error bars are the posterior means ⟨e⟩ plotted with 1-σ credibility intervals. The open circles are the min-χ² values. The exact value is e_∗ = 0.05.

Fig. 6

Sequence of solutions for i. The points with error bars are the posterior means ⟨i⟩ plotted with 1σ credibility intervals. The open circles are the min-χ² values. Orbits with i> 90° are retrograde. The exact value is i_∗ = 40°.

6.3. Orbits from noise

Figures 4−6 show that even for extremely weak orbits the Copernican prior has eliminated P-orbits. However, the plotted credibility intervals reveal that when log β ≲ − 0.3 the posterior PDFs are systematically displaced from the exact values. This could indicate that the orbital signal is then too weak for detection, a conclusion strongly supported by Fig. 6 which shows that the posterior densities of retrograde and prograde orbits are then about equal.

To investigate this issue further, the code is now used to compute solutions when a_∗ = βσ = 0. The posterior density of a/σ is plotted in Fig. 7 for a particular realization of the noise vector $\begin{matrix} ˜ \\ s \end{matrix}$ $\hbox{$\tilde{\vec{s}}$}$ . For comparison, plots with β = 0.5 and 1.0 are also included.

In 20 independent repetitions with β = 0, the range found for ⟨a/σ⟩ is 0.45 to 0.81, with average ⟨⟨a/σ⟩⟩ = 0.57. For ⟨e⟩ the range is 0.21 to 0.53, with ⟨⟨e⟩⟩ = 0.37; and for ⟨i⟩ the range is 67° to 113°, with ⟨⟨i⟩⟩ = 88°. Since these are consistent with Figs. 4−6 when log β ≲ − 0.3, we conclude that the aforementioned systematic displacements simply reflect the code’s reponse to data with negligible orbital signal.

The bias in a/σ for β = 0 evident in Fig. 7 is reminiscent of the bias in the eccentricities of spectroscopic binaries for nearly circular orbits (Lucy & Sweeney 1971). In both cases, bias is the inevitable result of estimating a non-negative parameter at or near its zero lower bound. From the values of ⟨⟨a/σ⟩⟩ plotted in Fig. 7, the bias of a/σ is typically 0.57, 0.21 and 0.13 at β = 0.0,0.5 and 1.0, respectively.

Fig. 7

Posterior densities of a/σ for β = 0.0,0.5 and 1.0 for particular realizations of the measurement vectors $\begin{matrix} ˜ \\ s \end{matrix}$ $\hbox{$\tilde{\vec{s}}$}$ . The vertical dotted lines indicate the average positions of the corresponding posterior means ⟨a/σ⟩. Each of these is obtained from 20 independent simulations.

Also of interest when β = 0 is the frequency of P-orbits for min-χ² solutions. From 200 independent simulations, 156 or 78% are P-orbits – i.e., have e> 0.95,i ≈ 90° and ω ≈ 90 or 270°. Thus, the PDFs of e,i and ω for min-χ² solutions when β = 0 are dominated by near delta functions at the Pourbaix loci.

Figures 8−10 plot the posterior densities of e,cosi and ω for β = 0 when the Copernican prior is included. While these plots are free from peaks at the Pourbaix loci, they do show evidence of imperfections that presumably derive from the approximate treatment of the Copernican prior (Sect. 4.4). Ideally, when analysing pure noise, the inferred values of cosi and ω should be uniformly distributed in (− 1, + 1) and (0,360°), respectively. Figures 9, 10 show departures from this ideal.

As in Fig. 7, Figs. 8−10 also include solutions for β = 1. Figs. 8 and 9 show the emergence of peaks at the exact values of e and cosi, repectively. However, an emerging peak is not evident at ω_∗ in Fig. 10. This is due to the near indeterminacy of ω when e ≪ 1 – see Sect. 6.1.

Fig. 8

Posterior PDFs for e when β = 0 and 1. The Pourbaix peak at e = 1 and the exact value e_∗ = 0.05 are indicated.

Fig. 9

Posterior PDFs for i when β = 0 and 1. The Pourbaix peak at cosi = 0 and the exact value cosi_∗ are indicated.

Fig. 10

Posterior PDFs for ω when β = 0 and 1. The Pourbaix peaks at ω = 90,270° and the exact value ω_∗ = 150° are indicated.

6.4. ξ-probabilities

Given that P-orbits arise when applying a conventional data analysis technique to synthetic Gaia data, there is some danger that such orbits will contaminate the huge data bases expected from the Gaia mission. On the assumption that this Bayesian procedure cannot feasibly replace the existing pipeline analyses, a less ambitious approach to elimating P-orbits is desirable.

Let θ₀ be the min-χ² elements derived from an observed scan vector $\begin{matrix} ˜ \\ s \end{matrix}$ $\hbox{$\vec{\tilde{s}}$}$ and let the corresponding fitted vector be s₀. From s₀, the astrometric length ξ₀ of the orbit θ₀ is then given by Eq. (10). This scale-free length refers to an orbit with physical parameters (P₀,e₀) observed at epoch τ₀ and orientation (i₀,ω₀,Ω₀). We now define p₀ to be the probability that a shorter length ξ would be found with random epochs and orientations but with P and e fixed at their min-χ² values. Thus, with steps 3)−6) of Sect. 4.5, we compute $\begin{matrix} p_{0} = \Pr (ξ < ξ_{0} | P_{0}, e_{0}) . \end{matrix}$ $\begin{eqnarray} p_{0} = {\rm Pr}(\xi < \xi_{0} | P_{0}, e_{0}). \end{eqnarray}$ (19)In Fig. 11, log p₀ is plotted against log β for min-χ² orbits. For strong orbits, the values scatter about the exact value = − 0.188. But for weak orbits the solutions are the CPr-violating P-orbits with log p₀ ~ − 4. Accordingly, if a Gaia orbit catalogue were contaminated by P-orbits a cut excluding orbits with p₀< − 3 dex would remove them.

Fig. 11

ξ-probabilities p₀ for a β-sequence of min-χ² orbits computed for simulations of the orbit defined by Eq. (1). The dotted line is the exact value p_∗ = 0.649.

Besides P-orbits, other as yet unrecognized anomalies, biases and selection effects may be present in a Gaia catalogue. Accordingly, it is worth noting that there are five quantities which, in a perfect catalogue, are uniformly and independently distributed in (0,1). These quantities are: (1 + cosi)/2,ω/ 2π,Ω /π,τ and p₀. This statement yields fifteen statistical tests that should be applied to a catalogue of Gaia orbits.

6.5. Detecting a second companion

A further weak-orbit problem for Gaia is that of detecting a second companion (B) when the first (A) is well-determined. This is a goodness-of-fit problem: the presence of B degrades the fit achieved when only A is considered.

To investigate this problem, Gaia data is created (Sects. 2.3, 2.4) for a star with invisible companions A and B. Companion A has the elements given in Eq. (1) with β_A = 10, and B is in a coplanar orbit with P = 7.2y,e = 0.2,τ = 0.7, and a reflex orbit with semi-major axis = β_Bσ.

A 1D sequence of Gaia scans is created for this two-companion model with log β_B = −0.6(0.05)0.6, and each scan is analysed with the Bayesian code under the assumption of only one companion.

For each β_B, the code creates (Sect. 5) a cloud of orbits θ_m with weights μ_m. The χ² of the m-th orbit’s fit to the data vector $\begin{matrix} ˜ \\ s \end{matrix}$ $\hbox{$\tilde{\vec{s}}$}$ is $χ_{m}^{2} = χ̂ \begin{matrix} 2 \\ ijk \end{matrix} + δ χ_{ℓ}^{2} .$ $\begin{equation} \chi^{2}_{m} = \hat{\chi}^{2}_{ijk} + \delta \chi^{2}_{\ell}. \end{equation}$ (20)Now, if the one-companion solution provides a satisfactory fit, then orbits of high weight should have $χ_{m}^{2} ≲ N$ $\hbox{$\chi^{2}_{m} \la N$}$ . On the other hand, if the solution is not satisfactory, then these high-weight orbits will have $χ_{m}^{2} > χ_{N,α}^{2}$ $\hbox{$\chi^{2}_{m} > \chi^{2}_{N,\alpha}$}$ with α< 0.05. These expectations can be reduced to a single measure of goodness-of-fit, namely ⟨χ²⟩, the posterior mean of χ². This is calculated from Eq. (17) with $Q_{m} = χ_{m}^{2}$ $\hbox{$Q_{m} = \chi^{2}_{m}$}$ .

The values of ⟨χ²⟩ are plotted against log β_B in Fig. 12. As always with statistical tests, the investigator has discretion as to when he deems a model to be successful. In this case, he is likely to suspect an additional orbit when log β_B ≥ 0.3. On the other hand, scans with log β_B ≤ 0.1 are fitted with $⟨ χ^{2} ⟩ \approx {χ^{2}}_{˜}$ $\hbox{$\langle \chi^{2}\rangle \: \approx \: \tilde{\chi}^{2}$}$ , the residuals are therefore consistent with measurement errors and so there is no evidence of an additional orbit.

Fig. 12

Detecting a second companion. The filled circles are the posterior means ⟨χ²⟩ measuring the goodness-of-fits of the Bayesian single-orbit solutions to the simulated scan vectors $\begin{matrix} ˜ \\ s \end{matrix}$ $\hbox{$\vec{\tilde{s}}$}$ for the two-orbit model. The amplitudes are β_A = 10 and log β_B = −0.6(0.05)0.6. The open circles are the corresponding values of ${χ^{2}}_{˜}$ $\hbox{$\tilde{\chi}^{2}$}$ given by Eq. (4). The dotted lines are the $χ_{N,α}^{2}$ $\hbox{$\chi^{2}_{N,\alpha}$}$ values for α = 0.025,0.975 with N = 70.

7. Conclusion

The aims of this paper are twofold. First, to provide a weak-orbit analysis for Gaia and, in particular, to investigate the occurrence of the spurious solutions found by Pourbaix (2002). Secondly, to use the Gaia problem as a test case for a procedure that incorporates the CPr into the machinery of statistical astronomy.

With regard to spurious solutions, Pourbaix’s (2002) finding is confirmed and the puzzle of nearly parabolic, edge-on orbits explained in terms of the near degeneracy (Sect. 3.4) of scan vectors s under the addition of such orbits. Moreover, these orbits are shown to violate the CPr (Sect. 3.3) and do not arise when the CPr is adopted as a fundamental postulate in Bayesian estimation (Sects. 5 and 6).

More generally, incorporating the CPr in statistical analyses may improve solutions derived for imperfect experiments (Sect. 3.5). In addition to poor precision and limited sampling, weather, the seasons, atmospheric opacity and interstellar extinction are among the numerous factors that result in data sets that are less than ideal.

When an astronomer must perforce analyse an imperfect data set, he needs to be aware that supposedly optimum statistical procedures – e.g., min-χ² or Bayesian estimation with non-informative priors – can in extreme cases, as with the P-orbits, give anamolous solutions. Moreover, at a more subtle level, even when an anomaly is not immediately evident, the complicated topology (Sect.3.2) of the likelihood function implies that the above “optimum” procedures are unlikely to be so. An investigation of how an estimation procedure can exploit imperfect data should be carried out (Sect. 3.2) and an appropriate prior constructed. Often Copernican considerations with regard to position, epoch or orientation will be crucial.

Acknowledgments

I thank A. H. Jaffe and D. J. Mortlock for helpful discussions on Bayesian methods and the referee for justified criticisms of the original version.

References

Casertano, S., Lattanzi, M. G., Sozzetti, A., et al. 2008, A&A, 482, 699 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Gentle, J. E. 2009, Computational Statistics (New York : Springer) [Google Scholar]
James, F. 2006, Statistical Methods in Experimental Physics (Singapore: World Scientific Publishing Co) [Google Scholar]
Jaynes, E. T. 2003, Probability Theory, The Logic of Science (Cambridge: Cambridge University Press) [Google Scholar]
Lucy, L. B. 2014a, A&A, 563, A126 (L14a) [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Lucy, L. B. 2014b, A&A, 565, A37 (L14b) [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Lucy, L. B., & Sweeney, M. A. 1971, AJ, 76, 544 [NASA ADS] [CrossRef] [Google Scholar]
Perryman, M. 2009, Astronomical Applications of Astrometry (Cambridge: Cambridge University Press) [Google Scholar]
Pourbaix, D. 2002, A&A, 385, 686 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Pourbaix, D. 2004, ASPC Conf. Ser., 318, 132 [NASA ADS] [Google Scholar]
Press, W. H., Teukolsky, S. A., Vetterling, W. T., & Flannery, B. P. 1992, Numerical Recipes, 2nd edn. (Cambridge: Cambridge Univ. Press) [Google Scholar]
Sozzetti, A., Giacobbe, P., Lattanzi, et al. 2014, MNRAS, 437, 497 [NASA ADS] [CrossRef] [Google Scholar]

Appendix A: Statistics in Thiele-Innes space

In L14a,b, each observation of the model visual binary yielded two measurements $(\begin{matrix} _{˜} \\ x_{n} \end{matrix}, \begin{matrix} _{˜} \\ y_{n} \end{matrix})$ $\hbox{$(\tilde{x}_{n},\tilde{y}_{n})$}$ , the sky coordinates of the secondary’s displacement from the primary at time t_n. In this circumstance, minimizing χ² to obtain the Thiele-Innes constants $\hbox{$\hat{\psi}_{j}$}$ separates into two independent problems, minimizing the x-coordinate contribution to χ² to obtain (A,F) and minimizing the y-coordinate contribution to obtain (B,G). This separation results in the considerable simplifications exploited in L14a,b.

However, these simplifications are lost when observing an astrometric binary with a 1D scanning device. On the assumption that the parallactic and proper motion have been subtracted, the measurement at t_n with scanning angle α_n is the observed star’s displacement $\begin{matrix} ˜ \\ s_{n} \end{matrix}$ $\hbox{$\tilde{s}_{n}$}$ from the binary’s barycentre.

Appendix A.1: Normal equations

For given orbit θ ≡ (φ,ψ), the goodness-of-fit criterion χ²(φ,ψ) is given by Eq. (5). At fixed φ, the orbit (x,y) is linear in ψ. Accordingly, since s_n = 0 when ψ = 0, the predicted abscissa at t_n is $\begin{matrix} s_{n} = \sum_{j} {(\frac{∂s}{\partial ψ_{j}})}_{n} ψ_{j} . \end{matrix}$ $\appendix \setcounter{section}{1} \begin{eqnarray} s_{n} = \sum_{j} \left( \frac{\partial s}{\partial \psi_{j}} \right)_{n} \: \psi_{j} . \end{eqnarray}$ (A.1)Substitution of s_n into Eq. (5) then allows the min-χ² solution for the Thiele-Innes vector ψ to be obtained without iteration. The normal equations are $\begin{matrix} 𝒜_{ij} ψ_{j} = b_{i} \end{matrix}$ $\appendix \setcounter{section}{1} \begin{eqnarray} {\cal A}_{ij} \psi_j = b_{i} \end{eqnarray}$ (A.2)where, the curvature matrix, $\begin{matrix} 𝒜_{ij} = \frac{1}{σ^{2}} \sum_{n} {(\frac{∂s}{\partial ψ_{i}})}_{n} {(\frac{∂s}{\partial ψ_{j}})}_{n} \end{matrix}$ $\appendix \setcounter{section}{1} \begin{eqnarray} {\cal A}_{ij} = \frac{1}{\sigma^{2}} \sum_{n} \left( \frac{\partial s}{\partial \psi_{i}} \right)_{n} \left( \frac{\partial s}{\partial \psi_{j}} \right)_{n} \end{eqnarray}$ (A.3)and $\begin{matrix} b_{i} = \frac{1}{σ^{2}} \sum_{n} \begin{matrix} ˜ \\ s_{n} \end{matrix} {(\frac{∂s}{\partial ψ_{i}})}_{n} \cdot \end{matrix}$ $\appendix \setcounter{section}{1} \begin{eqnarray} b_{i} = \frac{1}{\sigma^{2}} \sum_{n} \tilde{s}_{n} \left( \frac{\partial s}{\partial \psi_{i}} \right)_{n} \cdot \end{eqnarray}$ (A.4)The partial derivatives in these equations can be expressed in terms of the elliptical rectangular coordinates X(E),Y(E) via Eqs. (2) and (A.2) of L14a.

The solution of Eq. (A.2) is $\hat{ψ} = (Â, B̂, F̂, Ĝ)$ $\hbox{$\hat{\vec{\psi}} = (\hat{A},\hat{B},\hat{F},\hat{G})$}$ and we write $χ̂^{2} (φ) = χ^{2} (φ, \hat{ψ})$ $\hbox{$\hat{\chi}^{2}(\vec{\phi}) = \chi^{2}(\vec{\phi},\hat{\vec{\psi}})$}$ .

Appendix A.2: Increment in χ²

At fixed φ, a displacement δψ from $\hat{ψ}$ $\hbox{$\hat{\vec{\psi}}$}$ results in a positive increment δχ². The abscissa corresponding to this displacement is given by Eq. (A.1). Substitution in Eq. (5) then gives $\hbox{$\chi^{2} = \hat{\chi}^{2} + \delta \chi^{2}$}$ . From the quadratic terms in the resulting expression, we obtain $\begin{matrix} δ χ^{2} = δ ψ^{'} 𝒜 δ ψ . \end{matrix}$ $\appendix \setcounter{section}{1} \begin{eqnarray} \delta \chi^{2} = \delta \vec {\psi}^{'} \mathbf{ {\cal A} } \: \delta \vec {\psi} . \end{eqnarray}$ (A.5)

Appendix A.3: Probability density function p(ψ | φ,D)

The distribution of probability at fixed φ is a quadrivariate normal distribution centred on $\hat{ψ}$ $\hbox{$\hat{\vec{\psi}}$}$ . If Σ is the covariance matrix, then $\begin{matrix} p = \frac{1}{4 π^{2}} \frac{1}{\sqrt{| Σ |}} \exp (- \frac{1}{2} δ ψ^{'} Σ^{-1} δ ψ) \end{matrix}$ $\appendix \setcounter{section}{1} \begin{eqnarray} p = \frac{1}{4 \pi^{2}} \frac{1}{ \sqrt{| \vec{\Sigma} |}} \: \exp\left(- \frac{1}{2} \delta \vec {\psi}^{'} \vec{\Sigma}^{-1} \delta \vec {\psi} \right) \end{eqnarray}$ (A.6)

(James 2006, p. 67). Since $\hbox{$\vec{\Sigma}^{-1} = \vec{ {\cal A}}$}$ , comparison with Eq. (A.6) gives $\begin{matrix} p = \frac{1}{4 π^{2}} \frac{1}{\sqrt{| Σ |}} \exp (- \frac{1}{2} δ χ^{2}) \end{matrix}$ $\appendix \setcounter{section}{1} \begin{eqnarray} p = \frac{1}{4 \pi^{2}} \frac{1}{ \sqrt{| \vec{\Sigma} |}} \: \exp\left(- \frac{1}{2} \delta \chi^{2}\right) \end{eqnarray}$ (A.7)

Appendix A.4: Random sampling in ψ-space

A random point δψ_ℓ sampling p(ψ | φ,D) is obtained as follows (Gentle 2009, pp. 315−316): The first step is to compute the Cholesky decomposition (Press et al. 1992, pp. 89−91) of Σ. Thus, we find the lower triangular matrix L such that $L L^{'} = Σ .$ $\appendix \setcounter{section}{1} \begin{equation} \mathbf{L} \mathbf{L}^{'} = \mathbf{\Sigma}. \end{equation}$ (A.8)Now let z_G be a 4D vector whose elements are independent random Gaussian variates sampling $\hbox{${\cal N}(0,1)$}$ . Then $δ ψ = L z G$ $\appendix \setcounter{section}{1} \begin{equation} \delta \vec{\psi} = \mathbf{L} \vec {z}_{G} \end{equation}$ (A.9)is a random displacement from $\hat{ψ}$ $\hbox{$\hat{\vec{\psi}}$}$ satisfying the PDF given by Eq. (A.7)

If we generate $\hbox{${\cal N}$}$ independent displacements, then the points $ψ_{ℓ} = \hat{ψ} + δ ψ_{ℓ}$ $\hbox{$\vec{\psi}_{\ell} = \hat{\vec{\psi}} + \delta \vec{\psi}_{\ell}$}$ give us the approximation $p (ψ | φ,D) = 𝒩^{-1} \sum_{ℓ} δ (ψ - ψ_{ℓ})$ $\appendix \setcounter{section}{1} \begin{equation} p(\vec{\psi}|\vec{\phi},D) = {\cal N}^{-1} \sum_{\ell} \delta(\vec{\psi} - \vec{\psi}_{\ell}) \end{equation}$ (A.10)which is exact in the limit $\hbox{${\cal N} \rightarrow \infty$}$ .

All Figures

	Fig. 1 Feasible domain $\hbox{${\cal D}$}$ projected on to the (ω,e)-plane. The orbital parameters are given by Eq. (1) with β = 2.8.
In the text

	Fig. 2 Feasible domain $\hbox{${\cal D}$}$ projected on to the (i,e)-plane. The orbital parameters are given by Eq. (1) with β = 2.8.
In the text

	Fig. 3 PDF π₁(ξ \| I) for astrometric “lengths” ξ defined by Eq. (10). The orbits have random orientations (i,ω,Ω), random epochs τ, random eccentricities ∈ (0,1), and random values of log P ∈ (0.0,1.0). The corresponding PDF for P-orbits (i = ω = 90°) with e = 0.96 is a near delta function at ξ = 0.14.
In the text

	Fig. 4 Sequence of solutions for log a/σ. The points with error bars are the posterior means ⟨log a/σ⟩ plotted with 1σ credibility intervals. The open circles are the min-χ² values. The dotted line is the locus of exact values log a_∗/σ.
In the text

	Fig. 5 Sequence of solutions for e. The points with error bars are the posterior means ⟨e⟩ plotted with 1-σ credibility intervals. The open circles are the min-χ² values. The exact value is e_∗ = 0.05.
In the text

	Fig. 6 Sequence of solutions for i. The points with error bars are the posterior means ⟨i⟩ plotted with 1σ credibility intervals. The open circles are the min-χ² values. Orbits with i> 90° are retrograde. The exact value is i_∗ = 40°.
In the text

	Fig. 7 Posterior densities of a/σ for β = 0.0,0.5 and 1.0 for particular realizations of the measurement vectors $\begin{matrix} ˜ \\ s \end{matrix}$ $\hbox{$\tilde{\vec{s}}$}$ . The vertical dotted lines indicate the average positions of the corresponding posterior means ⟨a/σ⟩. Each of these is obtained from 20 independent simulations.
In the text

	Fig. 8 Posterior PDFs for e when β = 0 and 1. The Pourbaix peak at e = 1 and the exact value e_∗ = 0.05 are indicated.
In the text

	Fig. 9 Posterior PDFs for i when β = 0 and 1. The Pourbaix peak at cosi = 0 and the exact value cosi_∗ are indicated.
In the text

	Fig. 10 Posterior PDFs for ω when β = 0 and 1. The Pourbaix peaks at ω = 90,270° and the exact value ω_∗ = 150° are indicated.
In the text

	Fig. 11 ξ-probabilities p₀ for a β-sequence of min-χ² orbits computed for simulations of the orbit defined by Eq. (1). The dotted line is the exact value p_∗ = 0.649.
In the text

Fig. 12

Detecting a second companion. The filled circles are the posterior means ⟨χ²⟩ measuring the goodness-of-fits of the Bayesian single-orbit solutions to the simulated scan vectors $\begin{matrix} ˜ \\ s \end{matrix}$ $\hbox{$\vec{\tilde{s}}$}$ for the two-orbit model. The amplitudes are β_A = 10 and log β_B = −0.6(0.05)0.6. The open circles are the corresponding values of ${χ^{2}}_{˜}$ $\hbox{$\tilde{\chi}^{2}$}$ given by Eq. (4). The dotted lines are the $χ_{N,α}^{2}$ $\hbox{$\chi^{2}_{N,\alpha}$}$ values for α = 0.025,0.975 with N = 70.

In the text

Current usage metrics show cumulative count of Article Views (full-text article views including HTML views, PDF and ePub downloads, according to the available data) and Abstracts Views on Vision4Press platform.

Data correspond to usage on the plateform after 2015. The current usage metrics is available 48-96 hours after online publication and is updated daily on week days.

Initial download of the metrics may take a while.

[1] Casertano, S., Lattanzi, M. G., Sozzetti, A., et al. 2008, A&A, 482, 699 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[2] Gentle, J. E. 2009, Computational Statistics (New York : Springer) [Google Scholar]

[3] James, F. 2006, Statistical Methods in Experimental Physics (Singapore: World Scientific Publishing Co) [Google Scholar]

[4] Jaynes, E. T. 2003, Probability Theory, The Logic of Science (Cambridge: Cambridge University Press) [Google Scholar]

[5] Lucy, L. B. 2014a, A&A, 563, A126 (L14a) [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[6] Lucy, L. B. 2014b, A&A, 565, A37 (L14b) [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[7] Lucy, L. B., & Sweeney, M. A. 1971, AJ, 76, 544 [NASA ADS] [CrossRef] [Google Scholar]

[8] Perryman, M. 2009, Astronomical Applications of Astrometry (Cambridge: Cambridge University Press) [Google Scholar]

[9] Pourbaix, D. 2002, A&A, 385, 686 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[10] Pourbaix, D. 2004, ASPC Conf. Ser., 318, 132 [NASA ADS] [Google Scholar]

[11] Press, W. H., Teukolsky, S. A., Vetterling, W. T., & Flannery, B. P. 1992, Numerical Recipes, 2nd edn. (Cambridge: Cambridge Univ. Press) [Google Scholar]

[12] Sozzetti, A., Giacobbe, P., Lattanzi, et al. 2014, MNRAS, 437, 497 [NASA ADS] [CrossRef] [Google Scholar]

Analysing weak orbital signals in Gaia data

1. Introduction

2. Synthetic data

2.1. Orbital elements

2.2. Model astrometric binary

2.3. Observing campaign

2.4. Synthetic scans

3. Feasible orbits

3.1. Grid scan

3.2. Feasible domain

3.3. Violations of the Copernican principle

3.4. Degeneracy

3.5. Imperfect experiments

4. A Bayesian prior derived from the CPr

4.1. Conventional priors

4.2. A Copernican prior

4.3. Accurate treatment

4.4. Approximate treatment

4.5. Calculation of π1(ξ | I)

5. Bayesian estimation subject to the CPr

5.1. Posterior densities

6. Numerical experiments

6.1. Code verification

6.2. Varying β

6.3. Orbits from noise

6.4. ξ-probabilities

6.5. Detecting a second companion

7. Conclusion

Acknowledgments

References

Appendix A: Statistics in Thiele-Innes space

Appendix A.1: Normal equations

Appendix A.2: Increment in χ2

Appendix A.3: Probability density function p(ψ | φ,D)

Appendix A.4: Random sampling in ψ-space

All Figures

3.2. Feasible domain $\hbox{${\cal D}$}$

4.5. Calculation of π₁(ξ | I)

Appendix A.2: Increment in χ²