Free Access
Volume 618, October 2018
Article Number A100
Number of page(s) 9
Section Numerical methods and codes
Published online 18 October 2018

© ESO 2018

1 Introduction

In fundamental stellar astronomy, all statistical estimation problems involve mathematical models with both linear and non-linear parameters – the so-called hybrid problems. The linear parameters determine scale and location; the non-linear parameters appear as arguments in dimensionless functions of time.

The presence of linearities suggests that more efficient estimation techniques exist than when all parameters are treated as non-linear (Wright & Howard 2009; Catanzarite 2010). However, some attempts to achieve this lead to significantly underestimated error bars (Eastman et al. 2013; Lucy 2014a, hereafter L14a). This poses the challenge of developing a technique that achieves the computational efficiency allowed by linearity while still giving confidence or credibility intervals with correct coverage – so that, for example, 1-σ error bars contain the correct answer with probability 0.683. A solution to this challenge is presented in (Lucy 2014b, hereafter L14b) where a grid search in the space defined by the non-linear parameters is combined with Monte Carlo sampling of the space defined by the linear parameters. In L14b, this technique is applied to simulated observations of a visual binary and coverage experiments confirm that 1- and 2- σ error bars enclose the exact values with close to the frequencies expected for normally distributed errors.

In this paper, a significantly harder problem is posed, that of analysing a binary with both astrometric and spectroscopic data. Such data could be analysed separately, but this is sub-optimal since information concerning several orbital elements is present in both data sets. Accordingly, the aim here is to obtain the posterior distribution over the entire parameter space using both data sets and to test if the derived error bars are trustworthy, an essential requirement for fundamental data on stellar masses and luminosities.

The posed problem aims at demonstrating proof-of-concept in the treatment of hybrid problems and to provide a template for the many such problems in statistical astronomy. To this end, the code developed for this investigation is freely available.

Although here a technical exercise, the simultaneous analysis of astrometric and spectroscopic data is of practical importance in the era of adaptive optics (AO) and speckle interferometry. As emphasized by Mason et al. (1999), the ability to resolve binary stars at or near the diffraction limit results in a powerful synergy between short-period visual and long-period spectroscopic binaries, leading to stellar masses and improved mass-luminosity relations.

2 Synthetic orbits

The physical model comprises an isolated pair of stars undergoing Keplerian motion due to their mutual gravitational attraction. This binary is observed astrometrically and spectroscopically, yielding two independent data sets Da and Ds, respectively. To analyse these data sets, the mathematical models predicting the components’ relative motion on the sky as well as their radial velocity variations are used simultaneously to derive the posterior distribution over parameter space.

2.1 Orbital elements

For the astrometric orbit, L14a is followed closely with regard both to notation and the creation of synthetic data.

The motion on the sky of the secondary relative to the primary is conventionally parameterized by the Campbell elements P, e, T, a, i, ω, Ω. Here P is the period, e is the eccentricity, T is a time of periastron passage, i is the inclination, ω is the longitude of periastron, and Ω is the position angle of the ascending node. However, from the standpoint of computational economy, many investigators – references in L14a, Sect. 2.1 – prefer the Thiele-Innes elements. Thus, the Campbell parameter vector θ = (ϕ, ϑ), where ϕ = (P, e, τ) and ϑ = (a, i, ω, Ω), is replaced by the Thiele-Innesvector (ϕ, ψ), where the components of the vector ψ are the Thiele-Innes constants A, B, F, G (Note that in the ϕ vector, T has been replaced by τ = TP which by definition ∈ (0, 1)).

The spectroscopic orbits of the components introduce three additional parameters, the systemic velocity γ and the semi-amplitudes K1,2. The predicted radial velocities are then (1)

where ν(t) is the true anomaly, ω2 = ω and ω1 = ω + π.

Note that v2v1 = ż, where z, the companion’s displacement perpendicular to the sky, is given by the Thiele-Innes constants C and H. This is a useful check on coding.

With the inclusion of spectroscopic data, the combined data sets allow inferences about the 10-dimensional vector (2)

where λ = (γ, K1, K2).

In a Bayesian analysis, the task is to compute the posterior probability density in Θ -space given Da and Ds.

2.2 Model binary

The adopted model binary has the following Campbell elements: (3)

With this a, the binary would be unresolved in seeing- broadened images but should be resolved in images approaching diffraction limits.

If we now take the parallax ϖ = 0.05′′, the total mass . With mass ratio q = 0.7, the component masses are and . The resulting semi-amplitudes are K1 = 8.64 and K2 = 12.35 km s−1, and without loss of generality we take γ = 0.0 km s−1.

2.3 Observing campaigns

An astrometric observing campaign is simulated by creating measured Cartesian sky coordinates with weights for both coordinates. One observation in observing seasons of length 0.3 yr is created randomly for 10 successive years. We take σa,n = 0.05′′ for all n.

In the above, we assume equal precision for each coordinate and uncorrelated errors. These assumptions are well justified if the AO or speckle image reconstructions give circularly symmetric stellar profiles. If necessary, the technique can be generalized to treat unequal and correlated errors (Appendices A.1, C.1).

A spectroscopic observing campaign is simulated by creating measured radial velocities at random times in 10 successive observing seasons. The observations have weights and . We take σs,n = 0.5 km s−1 for all n.

All 40 simulated observations have errors that are normally-distributed.

3 Conditional probabilities

In order to benefit from the hybrid character of the problems arising in orbit estimation, the chain rule for conditional probabilities is used to factorize multi-dimensional posterior distributions Λ in such a way that linear and non-linear parameters are separated. This facilitates the construction of efficient hybrid numerical schemes that combine grid scanning with Monte Carlo sampling.

3.1 Approach

Consider a problem with two scalar parameters, α and β, and suppose the model is non-linear in α and linear in β. Applying the chain rule, we can write the posterior density as (4)

where (5)

Pr(α) is thus the projection of Λ(α, β) onto the α axis.

The 1D function Pr(α) can be approximated by the discrete values Pr(αi), where the αi are the mid-points of a uniform grid with steps Δα. In contrast, because of linearity, values βiℓ can readily be derived that randomly sample Pr(β|αi). Combining these approaches, we derive the following approximation for the posterior distribution: (6)

With this approximation, all the quantities we wish to infer from the posterior distribution become weighted summations over the points (αi, βiℓ), and these summations converge to exact values as Δα → 0 and . Arbitrary accuracy can therefore be achieved.

3.2 Astrometry only

If we only have astrometric data, Λ is a function of seven parameters. With the Thiele-Innes parameterization, the parameter vector is (ϕ, ψ), and the mathematical model is linear in the four ψ parameters and non-linear in the three ϕ parameters.

Following the 2D example of Sect. 3.1, we apply the chain rule to obtain (7)

where (8)

Here Pr(ϕ|Da) is the projection of the 7D posterior distribution Λ(ϕ, ψ|Da) onto the 3D ϕ-space. The second factor Pr(ψ|ϕ, Da) then specifies how this projected or summed probability is to be distributed in ψ-space.

3.3 Astrometry and spectroscopy

With spectroscopic data included, Λ is now a function of 10 parameters (ϕ, ψ, λ). Again applying the chain rule, we write (9)

where (10)

and (11)

Here Pr(ϕ|Da, Ds) is the projection of the 10D posterior distribution Λ(ϕ, ψ, λ|Da, Ds) onto the 3D ϕ-space. The productPr(ψ|ϕ, Da, Ds) × Pr(λ|ϕ, ψ, Ds) then specifies how this summed probability is to be distributed first into ψ-space and then into λ-space.

The dependence of these probability factors on Da and Ds merits comment. Both data sets contain information on ϕ = (logP, e, τ). Accordingly,Pr(ϕ) depends on both Da and Ds.

The ψ-vector (A, B, F, G) determines the Campbell elements (a, i, ω, Ω) and vice versa. Since ω is a spectroscopic as well as an astrometric element, Pr(ψ|ϕ) must depend on Ds as well as on Da.

If ϕ and ψ are given, then, since ω =ω(ψ), the spectroscopic elements, P, e, τ, ω are known. The data Ds then suffices to determines the remaining spectroscopic elements λ = (γ, K1, K2). Thus Pr(λ|ϕ, ψ) does not depend on Da.

4 Likelihoods

The probability factors defined in Sect. 3 are now evaluated using Bayes’ theorem and the appropriate likelihoods. Throughout this paper, we assume weak, non-informative priors whose impact on posterior distributions can be neglected.

4.1 Astrometry only

In this case, the posterior distribution is (12)

where, ignoring a constant factor, (13)

and (14)

Because of linearity, , the minimum-χ2 Thiele-Innes vector at given ϕ, is obtained without iteration, and we can write (15)

where is the positive increment in due to the displacement to .

Correspondingly, we write (16)

where (17)

The statistics of displacements in ψ-space is treated in Appendix A of L14b. These follow a quadrivariate normal distribution such that (18)

where and Δ is the determinant of the covariance matrix. It follows that (19)

Substituting into Eq. (8) and eliminating with Eq. (19), we obtain (20)

This determines the relative weights of the grid points ϕijk and agrees with Eq. (14) in L14b.

From arandom sampling of the quadrivariate normal distribution Pr(ψ|ϕ, Da) we obtain the approximation (21)

Accordingly, the relative weights from Eq. (20) are distributed equally among the points ψ in ψ-space (note that at each ϕijk an independent sample {ψ} is drawn).

If the errors in and are uncorrelated, the quadrivariate distribution Pr(ψ|ϕ, Da) simplifies to the product of two bivariate normal distributions – see Appendix A in L14b.

4.2 Astrometry and spectroscopy

With the addition of spectroscopic data and again assuming non-informative priors, the posterior density is (22)

where, ignoring a constant factor, (23)

and (24)

Because of linearity in λ = (γ, K1, K2) when ϕ and ψ are fixed, , the minimum-χ2 vector, is obtained without iteration, and we can write (25)

where is the positive increment in due to the displacement to .

Correspondingly, we write (26)

where (27)

The statistics of displacements in λ-space is treated in Appendix A. These follow a trivariate normal distribution Pr(λ|ϕ, ψ, Ds) such that Eq. (A.1) holds. From Eqs. (27) and (A.1), we obtain (28)

The statistics of displacements in ψ-space is modified by the spectroscopic data as noted in Sect. 3.3, and this is treated in Appendix B.

We now calculate Pr(ϕ|Da, Ds). Substituting into Eq. (10), eliminating with Eq. (28), and intergrating over λ, we obtain (29)

We now eliminate using Eq. (19) to obtain (30)

If we now replace Pr(ψ|ϕ, Da) by the approximation given in Eq. (21) and assume that is independent of ϕ, then (31)

Accordingly, the relative weights of points ϕijk in the ϕ-grid are (32)

Here the first factor depends only on Da. The dependence on Ds is introduced by the second factor: If at a given ϕ, all ψ correspond to poor fits to Ds, then the second factor disfavours that ϕ.

5 Numerical results

The technique developed in Sects. 3 and 4 is now applied to synthetic data Da and Ds created as described in Sect. 2.3 for the model binary defined in Sect. 2.2. All calculations use a 1003 grid for ϕ-space, and Monte Carlo sampling with for ψ-space and for λ -space.

5.1 Parameter cloud

Let ϕijk denote cell mid-points of the 3D grid spanning ϕ-space. At each ϕijk, the technique generates points ψ in ψ-space. Then, at each ψ, the technique generates points λm in λ -space. Thus, with this cascade, a cloud of points (ϕijk, ψ, λm) is generated inthe 10D (ϕ, ψ, λ)-space.

Note that this is a cloud of orbit parameters and not a cloud of orbits. Because linearity in ψ and λ is fully exploited, the values of and at cloud points are derived without computing astrometric and spectroscopic orbits, and this is the origin of the technique’s computational efficiency.

The relative weights of cloud points (ϕijk, ψ, λm) are (33)

The first factor comes from Eq. (32), the second from Eq. (B.5), and the third from Eq. (A.4).

Note that the third factor is only relevant if varies with (ϕ, ψ). Note also that if Pr(ψ|ϕ, Da, Ds) given by Eq. (B.2) were randomly sampled, the second factor would be . Instead, the quadrivariate normal distribution Pr(ψ|ϕ, Da) is sampled and then corrected via the coefficients ζ, which are such that ∑ ζ = 1 – see Appendix B.

5.2 Inferences

Suppose Q(Θ) is a quantity of interest. Its posterior distribution derived from the parameter cloud is (34)

where t ≡ (ijk, , m). The corresponding cumulative distribution function is (35)

The equal tail credibility interval (QL, QU) corresponding to ± 1σ is then obtained from the equations (36)

so that the enclosed probability is 0.6826.

These credibility intervals are asymptotically rigorous – i.e., are exact in the limits , , and grid steps → 0.

thumbnail Fig. 1

Posterior densities for and . The long vertical arrows indicate exact values. The short vertical arrows and lines indicate the posterior means and the 1- and 2- σ credibility intervals.

5.3 An example

The fundamental data derivable from the combined astrometric and spectroscopic data are the component masses and the parallax ϖ. None of these quantities can be derived if only one data set is available.

At every cloud point t, Kepler’s law and the two spectroscopic mass functions can be solved for and ϖ, and these values each have relative weight μt. Accordingly, the posterior densities of these quantities can be calculated from Eq. (34) and their credibility intervals from Eq. (35).

For a particular simulation of Da and of Ds, the posterior densities so derived are plotted in Figs. 1 and 2. Also plotted are the posterior means, the 1− and 2− σ credibility intervals, as well as the exact values from Sect. 2.2. In each case, the exact values fall within the 1− σ limits.

In Appendix C, following L16, a Bayesian goodness-of-fit (GOF) statistic, , is defined together with corresponding Bayesian p-value. We now apply this test. For the astrometric data, the posterior mean of is = 22.2, and for the spectroscopic data = 16.9, so that in total = 39.1. Since the total number of parameters is k = 10, Eq. (C.4) gives .

The total number of measurements is n = 40, comprising two astrometric (x, y) and two spectroscopic (v1, v2) measurements in each of ten years. The number of degrees of freedom is therefore ν = nk = 30. Substitution in Eq. (C.5) then gives pB = 0.51, a value consistent with the belief that the data is analysed with a valid model and that Bayesian inferences are not suspect.

thumbnail Fig. 2

Posterior density for logϖ(′′). The long vertical arrow indicates the exact value. The short vertical arrow and lines indicate the posterior mean and the 1- and 2- σ credibility intervals.

Table 1

Coverage fractions from 103 trials.

5.4 Coverage

An accurate gauge of the statistical performance of the technique requires many repetitions of the above calculation with independently drawn samples of Da and Ds.

With different data, the posterior densities and corresponding credibility limits in Figs. 1and 2 change. But the long vertical arrows marking exact values remain fixed. For each independent repetition, we can record whether or not the exact values are enclosed by the 1- and 2- σ credibility intervals. In this way, we carry out a coverage experiment as in L14a,b – see also Martinez et al. 2017.

The results obtained from 1000 repetitions are summarized in Table 1. These show reasonable agreement with ε(f), the expected fractions for errors obeying a normal distribution. Thus, despite the non-linearities, the credibility intervals retain their conventional interpretations.

6 Discussion: Bayesian hypothesis testing

In L16 and references therein, the relative absence in the astronomical literature of statistical testing of Bayesian models is commented upon.

6.1 Some quotes

The following quote from a statistician (Anscombe 1963) indicates that concern on this issue is of long standing:

“To anyone sympathetic with the current neo-Bernoullian neo-Bayesian Ramseyesque Finettist Savageous movement in statistics, the subject of testing goodness-of-fit is something of an embarrassment.”

A very recent comment (Fischer et al. 2016 in Sect. 4.1, authored by E. Ford) referring to exoplanets is:

“Too often people using Bayesian methods ignore model checking, because it doesn’t have a neat and tidy formal expression in the Bayesian approach. But it is no less necessary to do GOF type checks for a Bayesian analysis than it is for a frequentist analysis”

6.2 Additional justifications

When authors ignore model checking, they seldom, if ever, explain why. The quote above suggests that Bayesians are deterred by the absence of a readily-applied test. In contrast, frequentists reporting a minimum-χ2 analysis generally include , the χ2 minimum, and often also the p-value derived from the known distribution of . Thus, this traditional, frequentist approach has a built-in reality check. Moreover, this check is rigorously justified for linear models and normally-distributed measurement errors.

Note that minimum-χ2 codes return estimates and confidence intervals even when corresponds to a vanishingly small p-value. Thus, we may surmise that innumerable spurious inferences from false hypotheses or poor data are absent from the scientific literature precisely because of this built-in check.

Besides the difficulty of Bayesian model-checking, it seems likely that the following additional reasons play a role in checking being ignored: The detection of the expected signal confirms the hypothesis.

This is endemic in studies of orbits, including frequentist analyses going back decades. If a star is investigated for reflex motion due to a companion and a periodic signal is detected, then it is all too easy to take this as confirmation of a companion. A more critical approach recognizes that a harmonic expansion of Keplerian motion provides quantitative tests of the orbit hypothesis. When this approach is applied to a sample of spectroscopic binaries with exquisitely accurate radial velocities, significant departures from exact Keplerian motion are found (Lucy 2005; Hearnshaw et al. 2012).

A notable recent signal detection is that of gravitational waves from coalescing black holes (Abbott et al. 2016). The published parameters for the initial black hole binary derive from a Bayesian analysis. But these authors do not ignore model checking: their Bayesian analysis is preceded by a standard frequentist χ2 test of template fits. The Bayesian model has so many parameters that a poor fit is improbable.

In this case, the acceptance-rejection aspect of the scientific method is replaced by the posterior density favouring or disfavouringregions of parameter space. The expectation is that with enough high quality data, the posterior density will be sharply peaked at the point corresponding to the true hypothesis. But what if the true hypothesis is not part of the adopted multi-parameter model? How does the investigator detect this? A hypothesis should not be rejected if there is no alternative.

On this view, Bayesian model checking, even if readily carried out, should not lead to the rejection of a hypothesis. Rather, one should wait until an alternative hypothesis is proposed and then implement the model selection machinery. This view goes back to (Jeffreys 1939, Sect. 7.2.2) – see also (Sivia & Skilling 2006, p.85).

Jeffreys supports this view by remarking that there was never a time over previous centuries when Newton’s theory of gravity would not have failed ap-test. It is therefore instructive to recall how Adams and Le Verrier reacted to the large residuals in the motion of Uranus – i.e., small p-value. Crucially, their view was that the hypothesis being tested was not Newton’s theory but the then current 7-planet model of the solar system. Because they had a far greater degree of belief in Newtonian gravity than in the 7-planet model, they doubted the latter and went onto successfully predict Neptune’s position. This example illustrates that ambitious and effective scientists take small p-values seriously even in the absence of alternative hypotheses. By doing so, they create alternative hypotheses.

6.3 The statistic

A Bayesian GOF statistic and corresponding p-value is defined in Appendix C.

A crucial requirement of any (GOF) statistic is that it should not often falsely lead one to reject or doubt anhypothesis when that hypothesis is true. In frequentist terms, this is a Type II error. Such an error arises when the statistic gives a small p-value, say p = 0.001, even when the null hypothesis (H0) is true. Of course, such a value can occur by chance, but for an acceptable GOF statistic the frequency of Type II errors should not markedly exceed p.

In the simulation reported in Sects. 5.3 and 5.4, the null hypothesis is correct by construction, since the data is generated from the exact formulae for the astrometric and spectroscopic orbits. Thus, if the mathematical models were completely linear, we would expect to be distributed exactly as with ν = nk degrees of freedom. The p-value defined by Eq. (C.5) would then have an exactly uniform distribution in the interval (0, 1).

The Ntot = 1000 simulations used for the coverage experiment in Sect. 5.4 allow the uniformity of the pB values to be tested. In Fig. 3, the fraction with pB < p is plotted against p. We see that uniformity is obeyed with reasonable precision for p ∈ (0.01, 1.00). In particular, there is no indication of a significant departure from uniformity that could be attributed to the non-linearities. Accordingly, the pB values derived from can be interpreted in the same way and with the same confidence as p-values in minimum-χ2 estimation.

Note that the calculation of is a trivial addition to an existing Bayesian code that very likely already calculates the posterior means of other quantities.

6.4 Posterior predictive p-values

In the contribution authored by E. Ford from which the quote in Sect. 6.1 is taken, readers are referred to Gelman et al. (2013) who recommend posterior predictive p-values for testing Bayesian models.

In the context of the technique developed here, this recommendation proceeds as follows: 1) Randomly select a point in parameter space from the posterior distribution. Thus, if t is an index that gives a 1D enumeration of the parameter cloud, a random point t′ is that which most closely satisfies the equation (37)

where x is a random number ∈ (0, 1). 2) From the 10 parameters at t′, create synthetic data , compute , and then compare to the χ2 at t′ for the original data D = Da + Ds. 3) Repeat steps 1) and 2) times.

A Bayesian p-value is then defined to be (38)

Thus a small value of pB indicates that it is hard to find points t′ giving a worse fit than the original data, indicating that original data gives a poor fit.

To quote E. Ford again (Sect. 6.1), posterior predictive checking is evidently not “a neat and tidy” formalism. Moreover, physical scientists have a strong interest in having reliable p-values when p≲0.001, since such values raise serious doubts about a model’s validity. This then requires repetitions of the above steps, which may be infeasible.

Posterior predictive p-values have been compared to the values given by Eq. (C.5) for a simple 1D toy model. Specifically, a Hubble flow in Euclidean space populated by perfect standard candles. Synthetic data is created and the posterior density for the Hubble constant derived. A poor fit can be engineered by corrupting the data at high redshift and then comparing the resulting two small p-values. They agree closely.

This suggests that the readily calculated pB given by Eq. (C.5) eliminates any need for the cumbersome direct calculation of the posterior predictive p-value given by Eq. (38).

thumbnail Fig. 3

Test of Bayesian p-values. From 1,000 simulations, the fraction with pB > p is plotted against p for p = 0.01(0.01)1.00. The dashed line shows the expected result if the null hypothesis is correct and if the statistic obeys the distribution with ν = nk degrees of freedom.

7 Conclusion

In this paper, a non-trivial example of a wide class of problems in statistical astronomy is addressed. These are the so-called hybrid problems where the mathematical models predicting the observations are partly linear and partly non-linear in the basic parameters. As in the simpler, purely astrometric case considered in L14b, when spectroscopic data is added, a grid search over the non-linear parameter space combined with Monte Carlo sampling in the linear parameter spaces still leads to a computationally efficient scheme and again yields credibility intervals with close to correct coverage (Sect. 5.4), a result of prime importance generally, but especially so when estimating fundamental stellar parameters.

In contrast to L14a, the formulation in Sects. 3 and 4 is mostly quite general and so should be readily adapted to other hybrid problems.

In addition to exhibiting correct coverage, the large number of independent simulations allow the testing (Sect. 6.3) of , a Bayesian GOF criterion (Appendix C) for posterior probability densities. Even though the test problem involves some non-linear parameters, the exact sampling distribution in the linear case is closely followed, thus providing a readily-calculated p-value that quantifies one’s confidence in the inferences drawn from the posterior distribution. Since in problems that are exactly linear, the Bayesian and frequentist p-values are identical, investigators can make the decisions on the basis of the Bayesian pB -value exactly as they would for a frequentist p-value. Moreover, since calculating the statistic involves trivial changes to a Bayesian code, it provides “the neat and tidy formal expression” that is missing in current Bayesian methodology – see quote from E. Ford in Sect. 6.1.


The issue of error underestimation in hybrid problems was raised by the referee of L14a and was the direct stimulus of L14b and of this investigation. A useful correspondence with E. L. N. Jensen is also acknowledged.

Appendix A Statistics in λ-space

Statistics in the 4D Thiele-Innes ψ-space is treated in Appendix A of L14b. Analogous results are briefly stated here for the 3D λ -space.

Given ϕ and ψ, the minimum- vector is obtained without iteration from the normal equations derived from Eqs. (1) and (24).

The displacement gives with positive . On the assumption of normally-distributed errors, the probability density at λ is a trivariate normal distribution such that (A.1)

where and Δ is the determinant of the covariance matrix.

A.1 Random sampling in λ-space

Points λ randomly sampling the trivariate normal distribution Pr(λ|ϕ, ψ, Ds) are derived with a standard procedure (Gentle 2009) for sampling multivariate distributions. The first step is to make a Cholesky decomposition (Press et al. 2007, p. 100) of the covariance matrix C – i.e., to find the lower triangular matrix L such that (A.2)

A random sample from Pr(λ|ϕ, ψ, Ds) is then (A.3)

where the elements of z = (z1, z2, z3) are random Gaussian variates. The resulting approximation to Pr(λ) is (A.4)

Note that can vary with (ϕ, ψ). The increment in due to the displacement from is (A.5)

Accordingly, as in the analogous problem in ψ-space – see Eq. (A.22) in L14b, the increment in χ2 is obtained without computing the spectroscopic orbits at – though this should be checked during code development. This is a consequence of linearity and accounts for the computational efficiency of the technique.

In Appendix A of L14, Cholesky decompostion is not needed because the quadrivariate normal distribution Pr(ψ|ϕ, Da) is the product of bivariate distributions. But this simplification is lost if and have correlated errors (Sect. 2.3). In that circumstance, the above Cholesky approach is the necessary generalization.

Appendix B Modifiedstatistics in ψ- space

The treatment of statistics in ψ-space in Appendix A of L14b does not apply when spectroscopic data is included. As noted in Sect. 3.3 – see Eq. (11), Pr(ψ) depends on both Da and Ds

The required modification is obtained by substituting into Eq. (11), integrating over λ using Eq. (28), and noting that is independent of ψ. This gives (B.1)

We now eliminate using Eq. (19) and noting that is independent of ψ. This gives (B.2)

showing that Pr(ψ) is modified from the pure astrometry case by the factor introduced by spectroscopy. Because of this modification, Pr(ψ|ϕ, Da, Ds) is not a multivariate normal distribution and so not as readily sampled.

The adopted sampling procedure is as follows: from Eq. (A.20) in L14b, we have the approximation (B.3)

where the ψ randomly sample the quadrivariate normal distribtion appropriate in the pure astrometry case (Appendix A, L14b). Substituting into Eq. (B.2), we obtain the corresponding approximation when spectroscopy is included (B.4)

where (B.5)

As might be expected, because of the factor , points ψ in ψ-space are strongly disfavoured if that ψ gives a poor fit to the spectroscopic data.

Appendix C The and statistics

In an earlier paper (Lucy 2016; L16), we define (C.1)

where (C.2)

Thus is the posterior mean of χ2(α) when the posterior density Λ is computed under the assumption of a uniform (u) prior.

If the model is linear in the parameter vector α and if errorsare normally-distributed, then (Appendix A, L16) (C.3)

where is the minimum value of χ2(α) and k is the number of parameters. Moreover, under the stated assumptions, is distributed as , where ν = nk is the number of degrees of freedom and n is the number of measurements.

It follows that if we define the statistic (C.4)

then, for a linear model and normally-distributed errors, is distributed as with ν = nk. Accordingly, a p-value that quantifies the quality of the posterior distribution Λ (α|D) from which all Bayesian inferences are drawn is given by (C.5)

If the model is indeed linear in α, it follows from Eqs. (C.3) and (C.4) that , and so the frequentist and Bayesian p-values agree, a gratifying result.

If the model is non-linear in some parameters, then this GOF test should still be useful if the data is such that the fractional error of the non-linear parameters are small, for then a linearized model could be used.

In most Bayesian analyses in astronomy, the imposed priors are uninformative and so this analysis holds. In the rare cases where an informative prior π is imposed, perhaps from a previous experiment, the discussion in L16, Sect. 4.1 suggests that the criterion with replacing will have closely similar characteristics.

C.1 Generalization

The above analysis assumes uncorrelated measurement errors. The inclusion of correlations is a straightforward application of the statistics of quadratic forms – see, e.g., Hamilton (1964, Chap.4).

When corellations are included, the χ2 summation is replaced by (C.6)

where M is the covariance matrix and v is the vector of residuals. The previous analysis assumes that the off-diagonal elements of M−1 are zero.

With linearity in the parameter vector α and normally- distributed measurement errors , the sampling distribution of is a k-dimensional multivariate normal disribution ∝ exp(−ψ2∕2), where k is the number of parameters. It follows that the likelihood is (C.7)

Exploiting linearity in α and assuming a weak prior, we can write the posterior density as (C.8)

where is the minimum of ψ2 at α0 and δψ2 is the positive increment due to the displacement αα0.

Accordingly, in the case of a uniform prior, the posterior mean of ψ2 is (C.9)

This has the same form as Eq. (A.4) in L16, with Δ ψ2 replacing Δχ2. Therefore, since surfaces of constant Δψ2 are also self-similar k-dimensional ellipsoids, we immediately have (C.10)

Now is distributed as with ν = nk degrees of freedom. Accordingly, the statistic (C.11)

is distributed as with ν = nk degrees of freedom.

This is the required generalization of given by Eq. (C.4).


  1. Abbott, B.P., et al. (LIGO Scientific & Virgo Collaborations) 2016, Phys. Rev. Lett., 116, 061102 [Google Scholar]
  2. Anscombe, F. J. 1963 J. Roy. Stat. Soc. Ser. B, 25, 81 [Google Scholar]
  3. Catanzarite, J. 2010, ArXiv e-prints [arXiv:1008.3416] [Google Scholar]
  4. Eastman, J., Gaudi, B., & Agol, E. 2013, PASP, 125, 83 [NASA ADS] [CrossRef] [Google Scholar]
  5. Fischer, D. A., Anglada-Escude, G., Arriagada, P., et al. 2016, PASP, 128, 6001 [Google Scholar]
  6. Gelman, A., Carlin, J. B., Stern, H. S., et al. 2013, Bayesian Data Analysis (3rd edn.), ed. Chapman and Hall (Boca Raton, FL: CRC Press) [Google Scholar]
  7. Gentle, J. E. 2009, Computational Statistics (New York: Springer), 315 [Google Scholar]
  8. Hamilton, W.C. 1964, Statistics in Physical Science (New York: Roland Press) [Google Scholar]
  9. Hearnshaw, J. B., Komonjinda, S., Skuljan, J. & Kilmartin, P.M. 2012, MNRAS, 427, 298 [NASA ADS] [CrossRef] [Google Scholar]
  10. Jeffreys, H. 1939, Theory of Probability (Oxford: Clarendon Press) [Google Scholar]
  11. Lucy, L. B. 2005, A&A, 439, 663 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  12. Lucy, L. B. 2014a, A&A, 563, A126 (L14a) [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  13. Lucy, L. B. 2014b, A&A, 565, A37 (L14b) [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  14. Lucy, L. B. 2016, A&A, 588, A19 (L16) [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  15. Martinez, G. D., Kosmo, K., Hees, A., Ahn, J., & Ghez, A. 2017, IAU Symp., 322, 239 [NASA ADS] [Google Scholar]
  16. Mason, B. D., Douglass, G.G. & Hartkopf, W.I. 1999, AJ, 117, 1023 [NASA ADS] [CrossRef] [Google Scholar]
  17. Press W. H., Teukolsky S. A., Vetterling W. T., & Flannery B. P. 2007, Numerical Recipes 3rd edn. (Cambridge: Cambridge Univ. Press) [Google Scholar]
  18. Sivia, D. S.,& Skilling, J. 2006, Data Analysis, A Bayesian Tutorial 2nd edn., (Oxford University Press) [Google Scholar]
  19. Wright, J. T., & Howard, A. W. 2009, ApJS, 182, 205 [NASA ADS] [CrossRef] [Google Scholar]

All Tables

Table 1

Coverage fractions from 103 trials.

All Figures

thumbnail Fig. 1

Posterior densities for and . The long vertical arrows indicate exact values. The short vertical arrows and lines indicate the posterior means and the 1- and 2- σ credibility intervals.

In the text
thumbnail Fig. 2

Posterior density for logϖ(′′). The long vertical arrow indicates the exact value. The short vertical arrow and lines indicate the posterior mean and the 1- and 2- σ credibility intervals.

In the text
thumbnail Fig. 3

Test of Bayesian p-values. From 1,000 simulations, the fraction with pB > p is plotted against p for p = 0.01(0.01)1.00. The dashed line shows the expected result if the null hypothesis is correct and if the statistic obeys the distribution with ν = nk degrees of freedom.

In the text

Current usage metrics show cumulative count of Article Views (full-text article views including HTML views, PDF and ePub downloads, according to the available data) and Abstracts Views on Vision4Press platform.

Data correspond to usage on the plateform after 2015. The current usage metrics is available 48-96 hours after online publication and is updated daily on week days.

Initial download of the metrics may take a while.