Issue 
A&A
Volume 638, June 2020



Article Number  A24  
Number of page(s)  21  
Section  Astrophysical processes  
DOI  https://doi.org/10.1051/00046361/202038104  
Published online  10 June 2020 
An improved test of the strong equivalence principle with the pulsar in a triple star system^{⋆}
^{1}
Jodrell Bank Centre for Astrophysics, The University of Manchester, Manchester, UK
email: guillaume.voisin@manchester.ac.uk, astro.guillaume.voisin@gmail.com
^{2}
LUTH, Observatoire de Paris, PSL Research University, Meudon, France
^{3}
Station de Radioastronomie de Nançay, Observatoire de Paris, CNRS/INSU, Université d’Orléans, 18330 Nançay, France
^{4}
Laboratoire de Physique et Chimie de l’Environnement, CNRS, 3A Avenue de la Recherche Scientifique, 45071 Orléans Cedex 2, France
^{5}
MaxPlanckInstitut für Radioastronomie, Auf dem Hügel 69, 53121 Bonn, Germany
^{6}
LESIA, Observatoire de Paris, Université PSL, CNRS, Sorbonne Université, Université de Paris, 5 Place Jules Janssen, 92195 Meudon, France
Received:
6
April
2020
Accepted:
28
April
2020
Context. The gravitational strong equivalence principle (SEP) is a cornerstone of the general theory of relativity (GR). Hence, testing the validity of SEP is of great importance when confronting GR, or its alternatives, with experimental data. Pulsars that are orbited by white dwarf companions provide an excellent laboratory, where the extreme difference in binding energy between neutron stars and white dwarfs allows for precision tests of the SEP via the technique of radio pulsar timing.
Aims. To date, the best limit on the validity of SEP under strongfield conditions was obtained with a unique pulsar in a triple stellar system, PSR J0337+1715. We report here on an improvement of this test using an independent data set acquired over a period of 6 years with the Nançay radio telescope. The improvements arise from a uniformly sampled data set, a theoretical analysis, and a treatment that fixes some shortcomings in the previously published results, leading to better precision and reliability of the test.
Methods. In contrast to the previously published test, we use a different longterm timing data set, developed a new timing model and an independent numerical integration of the motion of the system, and determined the masses and orbital parameters with a different methodology that treats the parameter Δ, describing a possible strongfield SEP violation, identically to all other parameters.
Results. We obtain a violation parameter Δ = ( + 0.5 ± 1.8) × 10^{−6} at 95% confidence level, which is compatible with and improves upon the previous study by 30%. This result is statisticslimited and avoids limitation by systematics as previously encountered. We find evidence for red noise in the pulsar spin frequency, which is responsible for up to 10% of the reported uncertainty. We use the improved limit on SEP violation to place constraints on a class of wellstudied scalartensor theories, in particular we find ω_{BD} > 140 000 for the BransDicke parameter. The conservative limits presented here fully take into account current uncertainties in the equation for state of neutronstar matter.
Key words: gravitation / pulsars: individual: PSR J0337+1715 / stars: neutron / radio continuum: stars
Supplementary figures are available at https://www.aanda.org
© ESO 2020
1. Introduction
Among the fundamental interactions of nature, gravity is unique in attracting all material objects with the same acceleration, at least within current observational precision. This feature of gravity (the universality of free fall, UFF below) was thought by Newton to be a cornerstone of Newtonian mechanics (Newton 1687). Indeed, in the Newtonian theory of gravity, this universal acceleration implies that the inertial mass of a body is always in a fixed proportion to its passive gravitational mass, and is independent of the mass, chemical composition, or the detailed internal structure of the gravitating object. This was presented as an observed physical principle, without a deeper explanation. Newton and many later experimentalists have conducted different experiments to verify UFF, no deviations have been found that are larger than 1.3 × 10^{−14} (Touboul et al. 2019). This equivalence between the inertial and passive gravitational masses for test particles (defined here as objects with negligible gravitational selfenergy) is the socalled weak equivalence principle (WEP).
When thinking about a new theory of gravity that incorporates the laws of special relativity (SR), Einstein had the insight that the gravitational field appears to be absent for a freely falling observer. This was later described by Einstein as the ‘most fortunate thought in my life’ (Renn 2007). This idea, that gravity is equivalent to acceleration, naturally explains the WEP. If the relativity principle applies to this situation, then any observers in a sufficiently small room in a freefalling reference frame are not only unable to determine whether the room is in motion or at rest relative to distant bodies, but they are neither able to determine its rate of acceleration in the gravitational field. This implies that, in the vicinity of the observer the laws of physics are (in very good approximation) given by SR, which means that the Lorentz invariance of SR is obeyed locally (this is the local Lorentz invariance, LLI) and furthermore, that it does not matter where or when an experiment is made (this is known as local position invariance, LPI). The combination of the WEP with LLI and LPI is now known as the Einstein equivalence principle (EEP, Will 2018a). Schiff’s conjecture states that the WEP implies the full EEP for any consistent theory of gravity, for which a strong plausibility argument can be made (see e.g. Will 2018a).
This generalisation of the relativity principle to reference frames in free fall guided Einstein towards general relativity (GR, Einstein 1915). GR and other metric theories of gravity fulfil the EEP in a natural way: in these theories the gravitational attraction is seen as a result of spacetime curvature, which itself originates from the energy, stress and momentum of the masses in a system, determined by the field equations of the theory. This curvature changes the trajectories of test particles moving within the spacetime (their ‘geodesics’) in a unique way that does not depend of the detailed nature of the particles themselves, hence the validity of the WEP. Furthermore, for spatial scales that are small compared to the radius of curvature, the geometry of spacetime necessarily approximates the ‘flat’ Minkowski geometry, hence the LLI and LPI automatically apply to nongravitational experiments. To rephrase, the EEP is a consequence of a universal coupling between matter and gravity (Damour 2012).
The qualification of ‘nongravitational’ is key here. If the EEP can be fully extended to gravitational experiments, like the Cavendish experiment, and to objects with large selfgravitational energy, then we have the strong equivalent principle (SEP). This is a crucially important distinction because, while all metric theories of gravity fulfil the EEP, there are suggestive arguments that GR is the only gravity theory in four spacetime dimensions that fully embodies the SEP (Di Casola et al. 2015; Will 2018a)^{1}.
Therefore, if we are looking for phenomena beyond GR, a promising avenue would be to look for instances of SEP violation. This has an added advantage: if no SEP violation is found, the results of such an experiment can in principle constrain all alternative theories of gravity.
Just as the EEP consists of the WEP, LLI and LPI, the SEP must additionally include gravitational versions of these. Any violations of the LLI and LPI of the gravitational interaction (e.g., the existence of a preferred frame of reference or the location dependence of gravity) have been strongly constrained using pulsar experiments (Shao & Wex 2012, 2013; Shao et al. 2013). In what follows, we focus on the gravitational version of the WEP (GWEP, Will 2018a), which states that the UFF applies not only to test particles, but also to any objects where the gravitational binding energy is important.
For alternative theories of gravity, the gravitational properties of objects generally depend on their amount of selfgravity. This means that at Newtonian level we have a bodydependent effective gravitational constant, G_{ab}, meaning the acceleration of a body a in the gravitational field of a body b is given by
where m_{b} denotes the inertial mass of body b, r_{ab} ≡ x_{a} − x_{b} their (coordinate) separation, and c is the speed of light. G_{ab} depends on the properties of body a and b. In the weakfield limit this can be interpreted, to a good approximation, as a mismatch between the inertial and the gravitational masses of the objects:
where G_{N} is the Newtonian gravitational constant, as measured in a Cavendishtype experiment, and m_{P} and m_{A} denote the passive and active gravitational mass respectively. For semiconservative metric theories of gravity that have a conservation of momentum one has only a single gravitational mass m_{G} ≡ m_{P} = m_{A} (Will 2018a). For the remainder of the paper we assume that momentum is conserved in the gravitational interaction, and therefore G_{ab} = G_{ba}^{2}. More generally, we use the definition G_{ab} = G_{N}(1 + Δ_{ab}) where we denote Δ_{ab} = Δ_{ba} as the relative GWEP parameter between two bodies a and b.
If one observes an isolated twobody system without prior knowledge of the masses, then any violation of the GWEP at Newtonian order would be indistinguishable from a rescaling of the masses due to the symmetry of the equations of motion. This symmetry is broken in presence of a third body. One can then compare the rate at which two selfgravitating objects fall in the field of a third one. This forms the base for a class of GWEP tests that includes Lunar laser ranging (LLR), tests with binary pulsars falling in the gravitational field of our Galaxy, and the test to be discussed in this paper.
In the LLR test, one considers the EarthMoon system falling in the gravitational field of the Sun. If GWEP is violated, then the Earth, which has a larger fractional gravitational binding energy than the Moon, falls in the Sun’s gravitational field with a slightly different acceleration than the Moon. This causes a polarisation of the EarthMoon orbit in the direction of the Sun (Nordtvedt 1968). This so called Nortdvedt effect is the gravitational equivalent of the Stark effect, where a strong electric field polarises neutral atoms. It manifests itself as an added small orbital eccentricity vector that precesses in the sky with a period of 1 year, trailing the Sun. The relative EarthMoon distance can be measured with an accuracy of about 10 cm thanks to the reflectors laid on the Moon by a variety of American and Soviet lunar missions. No Nodtvedt effect has been measured, as predicted by GR, effectively constraining
(Hofmann & Müller 2018), which is only about a factor of 10 weaker than the MICROSCOPE limit for WEP, therefore confirming to a high degree that gravitational binding energy falls the same way in an external gravitational field as any other form of energy.
In this test, all the involved bodies are weakly selfgravitating, however, this is especially true for the two ‘proof masses’, the Earth and the Moon: for the Earth ε_{grav, E} ≡ E_{grav, E}/m_{E}c^{2} = −4.6 × 10^{−10} (here E_{grav, E} is the Newtonian gravitational binding energy of the Earth), for the Moon ε_{grav, M} = −0.2 × 10^{−10}. This means that the LLR experiment only tests the weakfield limit of GWEP. In this limit Eq. (2) implies Δ_{ab} ≃ Δ_{a} + Δ_{b}, where Δ_{a} ≡ (m_{G}/m)_{a} − 1; furthermore, the gravitational binding energy of the bodies relative to their mass is so small that it can only have a very small effect on Δ_{a}. Within the parametrised postNewtonian (PPN) formalism for metric theories of gravity
where η is the so called Nordtvedt parameter, a combination of several PPN parameters (see Will 2018a, for details). The current limit on the Nordtvedt parameter from LLR is (− 0.2 ± 1.1) × 10^{−4}.
A violation of GWEP not only affects the dynamics of the EarthMoon system, but all selfgravitating masses in the Solar System are affected according to Eq. (4). A consequence of this is a shift of the Solar System barycentre (SSB) when modelling planetary ephemerides. Based on data from the MESSENGER mission, Genova et al. (2018) have derived η = (− 6.6 ± 7.2) × 10^{−5}.
Equation (4) applies to the weakfield limit, that is, the Nordtvedt parameter parametrises GWEP violation to leading order in ε_{grav, a} ≪ 1. This first order approach is no longer applicable in the strongfield regime of neutron stars. Thus, in the remainder of this article we consider GWEP violations in terms of limits directly on Δ_{ab} and not on η.
In the DamourSchäfer test (Damour & Schäfer 1991), one verifies whether the two components of a pulsar – white dwarf system (the first with a very high degree of selfgravity, which allows the detection of strongfield SEP violation) fall with the same acceleration in the field of the Galaxy, which acts as the third body. A violation of the UFF would again cause a polarisation of the orbit of the binary pulsar. At the time of that paper (1991), the timing precision and timing baselines of binary pulsars were relatively small, so the authors proposed a statistical approach to search for this polarising effect in the orbital eccentricities of the known pulsar – white dwarf systems. Following that method, several analyses of the orbital eccentricities have constrained Δ for neutron stars^{3}: Stairs et al. (2005) derive Δ < 5.6 × 10^{−3}, and Gonzalez et al. (2011) derive Δ < 4.6 × 10^{−3} (both being 95% confidence limits). However, the latter limit is derived with the inclusion of a binary pulsar, PSR J1711−4322 that does not fulfil all the necessary criteria for the DamourSchäfer test (Wex 2014).
This method has several shortcomings, which are listed and discussed in detail by Freire et al. (2012a); the two most important ones are (a) the fact that it cannot detect GWEP violation, only produce statistical upper limits for it and (b) generally, neutron stars with different masses have different values of Δ, this limits the meaningfulness of a general Δ for neutron stars (cf. footnote 25 in Damour 2009).
Apart from the statistical test based on small eccentricities, Damour & Schäfer (1991) have also proposed a test based on a direct measurement of the variation of the orbital eccentricity vector for individual systems, ė, (no matter whether eccentric or not) that results from the polarisation of the binary orbit by the Nordtvedt effect. As discussed by Freire et al. (2012a), this test not only avoids all the shortcomings of the statistical test, but its precision just keeps improving with the precision of the measurement of ė, which improves with time and with better timing instruments. Indeed, they estimated that this test should, for the best timed binaries, yield slightly better Δ values than the statistical test by the mid 2010’s. More recently Zhu et al. (2019) confirmed this by using the ė constraint for the wide orbit of PSR J1713+0747 to derive Δ < 2 × 10^{−3} (95% C.L.). Without further assumptions, this limit is strictly speaking only for neutron stars around 1.3 M_{⊙}, which is the mass of PSR J1713+0747. Recently, this limit has been used to test the UFF of a neutron star towards dark matter (Shao et al. 2018).
Although this test can detect any hypothetical large strongfield deviations of the gravitational properties of neutron stars, the limits on Δ are not very constraining because of the weak gravitational field of the Galaxy, which has accelerations of the order of 2 × 10^{−10} m s^{−2} in the solar vicinity. In the case of the LLR test, the polarising gravitational field (that of the Sun) is much stronger (6 × 10^{−3} m s^{−2}), however, in that case the Earth and Moon have very small gravitational self energies.
For this reason, Freire et al. (2012a) suggested that the (then) rumoured pulsar in a triple system would combine the best features of both tests. In this experiment, we look for the Nordtvedt effect in an inner binary system consisting of a pulsar and a white dwarf; this system is orbited by a third hierarchical component significantly farther away. As in previous binary pulsar experiments, the pulsar provides the precise tracking and an object with very strong gravitational self energy; the white dwarf provides a test mass with a much smaller gravitational self energy, and finally the third outer component in that system provides a strong (potentially) polarising gravitational field (1.7 × 10^{−3} m s^{−2}), as the Sun does for the LLR experiment. The outer component would ideally be a neutron star, as this would provide a qualitatively different test, however, any type of star would already yield a much stronger polarising force than the Galactic gravitational field and therefore either a detection of GWEP violation, or much improved limits on it.
PSR J0337+1715 was discovered in data from the GBT driftscan survey (Boyles et al. 2013; Lynch et al. 2013). This is a 2.7 ms pulsar in a 1.6 day orbit with a ∼0.2 M_{⊙} Helium white dwarf star, this is what we refer to from now on as the inner binary. The outer ∼0.4 M_{⊙} white dwarf orbits the inner binary in about 327 days in a loweccentricity (e = 0.035) orbit. This is the first, and so far the only pulsar confirmed to be in a triple stellar system (Ransom et al. 2014). The two orbits (inner and outer) are nearly coplanar, this and the small observed eccentricities provide important clues for the evolution of the system, which was described in detail by Tauris & van den Heuvel (2014).
The pulsar has very good rotational stability, as usually millisecond pulsars (MSPs) do, and is relatively bright, which allows a very good measurement of the times of arrival of the pulses. This has allowed precise measurements of the varying orbital parameters, and also extremely precise mass measurements for the pulsar and the two white dwarf stars (Ransom et al. 2014). More importantly, the GWEP test was eventually carried out for PSR J0337+1715 by Archibald et al. (2018), yielding Δ≤2.6 × 10^{−6} (95% C.L.). This represents an improvement of three orders of magnitude over previous pulsar tests and confirmed the power of a MSP in a triple stellar system for testing the GWEP.
This measurement represents a tight constraint on alternative theories of gravity. Archibald et al. (2018) derived constraints on one of the best studied alternatives to GR, the class of monoscalartensor theories described by Damour & EspositoFarèse (1992; 1993, henceforth DEF gravity). The constraints on the weakfield coupling parameter for these theories (α_{0}) derived from PSR J0337+1715 significantly improve upon all previous experiments for most of their β_{0} space.
The UFF experiment with the PSR J0337+1715 triple system and its results are clearly of great importance. It is, at present, the most powerful test of the GWEP, for either the strong or weak field limits. It is also extremely sensitive to strongfield deviations in the gravitational properties of neutron stars.
This test is of special value because, according to a gravitational analogue of Schiff’s conjecture, it is plausible that the validity of GWEP implies the SEP (Will 2018a); this in turn strongly suggests, according to the arguments mentioned above, that GR is the theory of gravity (see also Will 2014a).
For all these reasons, we find it is important to improve both the precision and reliability of the test. These are the primary objectives of this work. We use fully independent observational data, taken with a wholly different observing system (described in detail in Sect. 2), a completely independent numerical integration of the motion of the system and a different implementation of the determination of the masses and orbital parameters (described in Sect. 3) than those used by Archibald et al. (2018). One of the main differences is, however, that the uncertainties we report are purely statistical; we found no need to postulate the existence of additional systematic effects that can bias Δ. Consequently, this parameter can be selfconsistently processed like the others without requiring a special treatment.
The results of our analysis are presented in Sect. 4. Here we discuss not only the parameters we obtain, but also analyse the trends observed in the residuals after subtracting the best model for the system. In Sect. 5, we interpret the Δ constraint, as well as constraints on the postNewtonian strongfield parameters, within the context of a wide framework of alternative theories of gravity, the BergmannWagoner theories of gravity (Will 2018a). We also derive new constraints on a subclass of those theories, the DamourEsposito Farèse (DEF) theories (Damour & EspositoFarèse 1992, 1993), these new limits are derived in a conservative way that accounts for uncertainties in our knowledge of the equation of state (EOS) for neutron star matter. We finally summarise our findings in Sect. 6.
2. Observations and data reduction
The pulsar J0337+1715 has been regularly observed since July 2013 every 2 or 3 days with the Nançay radio telescope using its Lband receiver at a central frequency of 1484 MHz. The Nançay radio telescope is a meridian Kraus design collector equivalent to a 94 m dish able to conduct ∼1 h observations on any given source within its declination range each day. The dual linear polarisation signals are sent to the Nançay Ultimate Pulsar Processing Instrument (NUPPI, Desvignes et al. 2011), an instrument that is able to coherently dedisperse (Hankins & Rickett 1975) a total bandwidth of 512 MHz. It consists of a ROACH1 board (designed by the CASPER group, University of California, Berkeley) providing 128 baseband data streams of 4MHz bandwidth each. The instrument software has many similarities with GUPPI (Green Bank Ultimate Pulsar Processing Instrument, DuPlain et al. 2008) used at the Green Bank Telescope (GBT). A cluster of four nodes hosting eight GTX280/285 Graphics Processing Units (GPUs) is used to coherently dedisperse and fold the data in realtime.
The realtime folding process uses a pulsar period coming from a simple model with two noninteracting Keplerian orbits over short 15 s subintegrations. A single standard timing parameter file in TEMPO format^{4} containing this pulsar timing model is used for all the observations. The full frequency and time resolution daily pulsar profiles are stored in PSRFITS archives (Hotan et al. 2004)^{5}. A ∼3 Hz pulsed noise diode is fired for ∼10 s at the start of each observation to conduct a simple calibration accounting for gain and phase differences between the two polarisations, as implemented in the SINGLEAXIS polarisation calibration of PSRCHIVE.
As the pulsar period model used to fold in real time is not strictly satisfactory, it is necessary to properly phaseshift all the archived individual profiles. A posteriori, the drifts within individual subintegrations were statistically smaller than the mean ToA uncertainty and characterised by an rms of 0.78 μs, thus validating the parameters of our subintegrations. In an iterative way, measured times of arrival are used to derive a pulsar timing model which is used to improve the times of arrival and so on. Practically, the numerically derived pulsar timing model is used to provide ‘theoretical’ barycentric arrival times for each of the 15 s subintegrations. A code transforms those barycentric arrival times into a simple daily TEMPO parameter file with rotation rate described by only a frequency and its first three derivatives around an epoch corresponding to the middle of the observation (F0, F1, F2, F3 and PEPOCH). This polynomial predicts the barycentric rotational phases within 5 ns at maximum. These daily parameter files are then used to realign the pulse profiles within their corresponding archives. A profile ‘template’, built with more than one thousand observations (see Fig. 1, top), is used for determining the topocentric pulse times of arrival (ToAs) in the following way. After integrating profiles over 128 MHz and 20 min, the times of arrival are estimated using the pat function from PSRCHIVE within the Fourier domain with Markov chain Monte Carlo (FDM) method. The total bandwith of 512 MHz was thus split in four subbands in order to be able to fit for variations in the dispersion measure (DM) representing the integrated electron column density along the lineofsight during the numerical fit. The ToA uncertainties as reported by pat are characterised by a mean of 2.15 μs and a median of 1.89 μs. A pulse profile typical of a good observation, characterised by an uncertainty of 1.15 μs, is shown in Fig. 1 (bottom). The goodness of fit as reported by pat can give a sense of the differences between the template profile and the profiles used to derive ToAs. The goodness values are characterised by a mean of 1.05 (with a median of 1.04) and an rms of 0.12 with 99% of the values between ±3σ (0.69 to 1.41). The rather low signal to noise ratio of the PSR J0337+1715 profiles observed at Nançay prevents the detection of subtle effects of incorrect polarisation calibration on the ToA determination. In this work, we use a dataset (see footnote 7) of 9303 ToAs divided in four 128 MHz bands observed between MJD 56492 and MJD 58761 (July 2013 and October 2019).
Fig. 1. Top: template pulse profile of PSR J0337+1715 used for timing. 450 h of observations conducted with the Nançay Radio Telescope were integrated over the frequency range 1230–1742 MHz. Bottom: pulse profile obtained on October 4, 2014. When profiles are ToA uncertainty ranked, this one is at the 10th percentile of the lowest uncertainties, typical of a good observation. 
3. NUmerical TIming MOdel: NUTIMO
For the description of the orbits of binary pulsars, timing programmes such as the aforementioned TEMPO and also TEMPO2 (Edwards et al. 2006; Hobbs et al. 2006) rely on existing analytical models to calculate the times of arrival with nanosecond accuracy (such as, e.g. the DD and DDGR models, Damour & Deruelle 1986). These models are built from precise analytical solutions of the equations of motion (for the examples above these were derived by Damour & Deruelle 1985). However, no such solution is yet available for a triple system where the three masses are of comparable size and experience moderately strong mutual interactions. Therefore, and similarly to Ransom et al. (2014) and Archibald et al. (2018), we perform a high precision numerical integration of the equations of motion, which we subsequently use to calculate the delays.
The equations of motion we use are accurate up to first postNewtonian order (1PN), that is, include the firstorder terms of an expansion of GR in the small parameter where m, v and a are characteristic mass, velocity and length scales of the system and c the speed of light. These corrections are absolutely necessary because they translate into a relative acceleration Δa/a ∼ ϵ which is of similar magnitude as a potential SEP violation (see above). In addition, 1PN corrections are responsible for effects that accumulate over time such as the wellknown relativistic precession of periastron. On the other hand, second order corrections can be safely ignored since the same line of reasoning predicts that even a cumulative effect such as gravitational wave radiation cannot account for more than a nanosecond within the current span of our observations.
We use the 1PN generic strongfield framework of Will (1993) and Damour & Taylor (1992) which parametrises almost the entire class of ‘fully conservative’ Lagrangianbased theories of gravity (without preferred location effects^{6}) based on a modified EinsteinInfeldHoffmann approach (see details in Appendix A). In this framework, in the most generic case, one has three parameters Δ_{ab} at the Newtonian and 12 strongfield parameters at the postNewtonian level. All these parameters depend on the structure of the individual bodies. The 12 postNewtonian parameters generalise the parametrised postNewtonian (PPN) β_{PPN} and γ_{PPN} (Will 1993) to the regime of strongly self gravitating masses.
In the PSR J0337+1715 system we have only one strongly selfgravitating body with ε_{grav} ∼ 0.1, the pulsar, while the two white dwarfs are weakfield objects with ε_{grav} ≲ 10^{−4}. That generally leads to a significant reduction of the number of strongfield parameters relevant for the orbital dynamics of the PSR J0337+1715 system. In fact, on the Newtonian level there is only one Δ_{ab}, which we simply denote by Δ. Among the postNewtonian terms, as we discuss in detail within a theory based framework in Sect. 5, there remain three strong field parameters which are a priori unconstrained by Solar System experiments and limits on Δ already imposed by the ‘Newtonian’ dynamics: , , . Since the limits we find for these parameters in Sect. 4 are many orders of magnitude weaker than limits inferred indirectly from binary pulsar experiments, at least within our theory based framework, we primarily consider a model where the 1PN strongfield parameters are set to their GR values that is, zero. This practically corresponds to using priors arising from a combination of Solar System and binary pulsar limits at the postNewtonian level when estimating Δ.
Our specially developed software, NUTIMO^{7}, solves numerically the 3body equations of motion at 1PN (see Appendix A.1 and particularly Eq. (A.2)) before computing propagation and relativistic delays. All the geometrical delays are taken into account up to first order in L/d where d is the distance to the system and L ≪ d is any other length scale of the problem. In other words, the code computes the socalled Rœmer, Kopeikin and Shklovskii delays (Shklovskii 1970; Kopeikin 1996), and adds an extra second order correction, that is, L^{2}/d^{2}, for the latter (the only second order correction that may become important with time, see e.g. Voisin 2017). We note that Kopeikin and Shklovskii delays were not included in previous works (Archibald et al. 2018; Ransom et al. 2014). The former allows us to measure the longitude of ascending node of the outer orbit and might be important because it accounts for systematic effects at the Earth orbital frequency which is close to the outer orbital frequency. We do not expect the latter to significantly affect the results of the fit but it allows us to derive the intrinsic pulsar spin parameters which would otherwise absorb this effect (see below). We caution that the intrinsic spin parameters we report in Table 1 are still biased by the effect of Galactic acceleration which amounts to approximately 25% of the Shklovskii correction (according to the Galactic model of McMillan 2017). Relativistic delays include time dilation between the frame of the pulsar and the frame of the observer, namely the socalled Einstein delay, as well as the deformation of spacetime by the pulsar companions on the light travel path, the socalled Shapiro delay, and the aberration of the direction of the radio beam. All are calculated at 1PN order.
Mean values of the MCMC fit with their 68% confidence interval.
The delays due to interstellar medium propagation described by the DM as well as the Solar System counterparts of the previously mentioned delays are calculated by the commonly used software TEMPO2 (Edwards et al. 2006; Hobbs et al. 2006) which NUTIMO integrates as an external library. A thorough description of the timing model can be found in Chap. 5 of Voisin (2017).
3.1. Parametrisation of the problem
In total, the model must include at least 27 parameters (respectively 30 in the socalled secondary model when the 3 1PN strongfield parameters are included). One of them is a ToA uncertainty scale factor (called EFAC in the pulsar timing literature) which quantifies our ignorance of unmodelled systematic effects. The other 26 parameters (resp. 29) can be grouped into four categories:

pulsar rotation: pulsar spin frequency and its derivative;

orbital dynamics: six parameters for the innerbinary orbit, six parameters for the outerWD orbit, three masses, one SEP violation parameter (resp. one SEP violation parameter and three 1PN strongfield parameters);

astrometry: three position and three proper motion parameters;

radio propagation: DM and DM derivative.
Each category is essentially uncorrelated with the others (see Fig. 2). The first two are specific to the triplesystem problem and we shall discuss them in some details. On the other hand, the astrometric parameters, position and proper motion, and DM are treated using a standard approach and we refer the interested reader to Edwards et al. (2006), for example.
Fig. 2. Corner plot of the correlations between the fitted parameters of Table 1 and their marginalised distribution (diagonal histograms) sampled by our MCMC. A highresolution version is available as online supplementary material. 
The intrinsic pulsar parameters are its spin frequency f and spinfrequency derivative f′ taken at the reference epoch T_{ref}. These parameters need to be rescaled to avoid nonlinear correlations with astrometric parameters due to the Shklovskii delay (see Chap. 5 of Voisin 2017 for details about the delays). This is a common practice in pulsar timing. In addition, we rescaled the spin frequency to include the linear effect (that is, proportional to time) of the Einstein delay, which is approximated by the second term of Eq. (5) below. In usual pulsartiming models, the term of the Einstein delay responsible for a linear increase of the delay with time can be calculated exactly and removed from the timing model since its effect is strictly impossible to separate from a rescaling of f. However, because here we calculate numerically the delay, we can only estimate the linear drift using the initial parameters. As a result of these rescaling, the effective fit parameters and are connected to the intrinsic parameters f and f′ by
where
where the symbols correspond to those defined in Table 1. The use of the two rescaled parameters above instead of the intrinsic ones has proven to be very effective in speeding up convergence in our MCMC.
The orbital parameters for a triple system are at most 3 × 6 + 3 = 21 that is, three position coordinates and three velocity coordinates per body plus the three masses. However we consider the system in the frame of its centre of mass which results in applying two vector relations to the initial velocities and positions such that the centre of mass is at rest at coordinates (0, 0, 0). These relations suppress six degrees of freedom, and we end up with fifteen independent orbital parameters. Note that, eventually, the six degrees of freedom corresponding to the centre of mass appear as the six astrometric parameters.
The orbital parametrisation uses the fact that the system, being hierarchical, can approximately be described by an inner Keplerian binary containing the pulsar and the inner white dwarf (WD) itself forming an outer binary with the outer WD (see Fig. 3). Thus, the usual Keplerian orbital elements can be used to describe the osculating Keplerian orbits to the actual trajectory of the pulsar and of the inner binary. For eccentricity, we use the LaplaceLagrange parametrisation relevant for small eccentricities (Lange et al. 2001) which replaces e, ω, t_{p}, respectively eccentricity, longitude of periastron and time of periastron passage, by e cos ω, e sin ω, and t_{asc}. It is important to note that we define the transformation t_{asc} = t_{p} − ω/2πP, with P an orbital period.
Fig. 3. Sketch of the orbits of PSR J0337+1715 (not to scale). Note that the orbits are nearly circular and that the ellipsoidal shape of the orbits on the lefthand sketch arises purely from projection. The osculating orbits of the system can be parametrised by the Keplerian orbital elements of the pulsar within the inner binary and of the inner binary within the outer binary (see text). In particular, a_{p} is the semimajor axis of the pulsar orbit within the inner binary, a_{b} is the semimajor axis of the inner binary within the outer binary and e_{O} its eccentricity. Are also shown the longitude of periastron the outer binary as well as the inner and outer inclinations with respect to the plane of the sky, i_{I} and i_{O} respectively. Note that for simplicity the sketch neglects the difference between the directions of the inner and outer lines of ascending nodes, δΩ, as it is in practice very small. 
Similarly, we fitted for a_{b} sin i_{O}, a_{b} cos i_{O} for the outer orbit, where a_{b} is the semimajor axis of the inner binary’s orbit around the centre of mass of the system and i_{O} its inclination relative to the plane of the sky. For the inner binary we find it better to fit for a_{p} sin i_{I} and δi = i_{I} − i_{O} instead of a_{p} sin i_{I} and a_{p} cos i_{I} as this cancels several nonlinear correlations in the fit. This is helped by the fact the two orbits are very nearly coplanar. In the same way, we fitted for the longitude on the plane of the sky of the outer orbit, Ω_{O}, as well as for the offset between inner and outer orbits, δΩ = Ω_{I} − Ω_{O}. We note that only the latter was reported in Archibald et al. (2018) while the former was considered impossible to constrain with the available data. Interestingly, we were able to constrain Ω_{O} in this work, perhaps thanks to our inclusion of Kopeikin’s delays.
The inner binary mass m_{b} = m_{i} + m_{p} as well as the outer WD mass are derived using Kepler’s third law in the inner and outer binary respectively. The pulsar mass and the inner WD mass are derived from m_{b} using the mass ratio m_{i}/m_{p} which is also part of the fit. In addition, we use the postKeplerian orbital elements of Damour & Deruelle (1985) which incorporate the 1PN corrections for relativistic binaries. Using Damour & Deruelle (1985), one maps the orbital elements to the position and velocity of the pulsar relative to the innerbinary centreofmass (r_{p/b}, v_{p/b}) and to the innerbinary centreofmass position and velocity (r_{b}, v_{b}). One can then find the position and velocity of the pulsar relative to the centre of mass of the system, r_{p} = r_{p/b} + r_{b} and v_{i} = v_{i/b} + v_{b}^{8}. The position and velocity of the inner WD, (r_{i}, v_{i}), can then be derived by solving the equations of conservation of momentum and centreofmass position, (A.4) and (A.5) respectively, with P = (ℋ/c^{2})v_{b} and X = r_{b}. Finally, one solves P = 0 and X = 0 for the outer WD position and velocity, (r_{o}, v_{o}).
3.2. Model accuracy
The main signature of a SEP violation in our pulsartiming experiment would be a residual signal at the frequency 2f_{i} − f_{o} (Archibald et al. 2018), where f_{i, o} are the inner and outer orbital frequency respectively (see also Fig. 4). According to linear theory, the effect of a violation of the SEP at Newtonian order primarily results in a sinusoidal variation of the separation of the inner binary with frequency f_{i} − f_{o} (Nordtvedt 1968). However, here we measure the distance projected along the line of sight between the observer and the pulsar and this distance is modulated by the orbit of the inner binary with the outer WD. It follows that the main net effect is a modulation at 2f_{i} − f_{o} as originally pointed out by Archibald et al. (2018). Using the Newtonianorder linear theory of Nordtvedt (1968), one can show that Δ∼10^{−6} translates into a signal amplitude of ∼0.1 s via the Rœmer delay. However, the magnitude that can effectively be detected in the fit residuals appears to be reduced to only ∼50 ns (Archibald et al. 2018) due to the effect of the many strong correlations of the Δ parameter with the other orbital parameters (see Fig. 2). Therefore, we aim in this work to achieve nanosecond accuracy within our model. This level of accuracy is compatible with the level aimed at by TEMPO2 (Edwards et al. 2006; Hobbs et al. 2006).
Fig. 4. LombScargle periodogram of the timing residuals for the solution of Table 1. The periodogram is sampled at f_{m}/5 where f_{m} ≃ 2268 days^{−1} is the inverse of the full time span. From left to right vertical lines show the frequencies of the rednoise component (f_{RN} ≃ 1650 day^{−1}, black), the Earth orbital period (f_{E}, green), the outerorbit orbital period (f_{O}, orange), the innerbinary orbital period (f_{I}, red) and the SEP violation signature (2f_{I} − f_{O}, purple). 
There are essentially three types of inaccuracies that can affect the output of our model:numerical roundoff errors, postNewtonian truncation, interpolation precision. The first one, numerical roundoff errors, is expected to grow with the number of floatingpoint operations performed to obtain the result. The main source of operations is the numerical integration of the equations of motion (A.2). The integration is performed using the BulirschStoer scheme (Stoer & Bulirsch 2011) implemented in the Odeint module (Ahnert et al. 2011) of the Boost library^{9}. In addition our numerical model relies on 80 bit floating point numbers (long double in C) throughout. To assess the effect of numerical roundoff errors, we use the model to generate fake times of arrival from parameters that fit the data from PSR J0337+1715. The fake times of arrival are the theoretical times of arrival that are closest to the actual measurements, and therefore only differ from those by a few microseconds at most. We feed this mock data set back into our model such that the residuals should be exactly zero in absence of numerical roundoff errors. In practice we observe roundoff errors at the level of 10^{−3} ns showing that numerical roundoff errors are not an issue given our objective of a 1 ns accuracy. Note that this procedure does not assess any systematic inaccuracy due to the modelling or the numerical scheme themselves (see below) but it does account consistently for the entire chain of operations, including not only the numerical integration but also the Solarsystem calculations done by TEMPO2 and the pulsar system delays. It is also conservative since the chain of operations is performed twice: once to create the fakes and once to analyse them.
The main source of systematic inaccuracy due to numerical approximations lies in the precision of the interpolations of the timing delays that are calculated in intermediate steps. We use a cubicspline interpolation algorithm (Press 1996) for all our interpolations. There are two parameters than can be tuned: the number of interpolation points and the width of ‘margins’ at the beginning and the end of the interpolated range in order to avoid boundary effects. The latter need only be a few points in principle, however the former has a direct and opposite impact on accuracy and performance and therefore requires a tradeoff. To determine the level of interpolation accuracy required we need to estimate what is the effect of a 1ns signal on the χ^{2} value in order to make sure that this value is computed with the necessary accuracy. Let us assume that the difference between the data and the model is Δt_{i} + δ_{i} where the second term explicitly represent the contribution of a putative ∼1 ns signal, then the χ^{2} can be expanded as follow,
where N is the number of data points. Now if N ≫ 1 and the number of fit parameters is ≪N, then for the best fit parameters Δt_{i} ∼ σ_{i} where {σ_{i}}_{i ∈ [1, N]} are the uncertainties which we take to be approximately equal to σ = 2 μs. In the present case N ∼ 10 000 for only 27 parameters. Thus, the first term in (11) approximates χ^{2} ∼ N. The second term in Eq. (11) can be as large as Nδ/σ assuming that every term contributes positively. However it is also possible that the sum averages to zero if it alternates. The third term is of order ∼Nδ^{2}/σ^{2}. Taking δ_{i} ∼ δ = 1 ns we see that the second and third term of equation (11) are respectively ≲5 × 10^{−4}χ^{2} and ∼2.5 × 10^{−7}χ^{2}. We retain the last estimate as a conservative level of accuracy to achieve. To do so we increased exponentially the number of interpolation points until the relative variation of the χ^{2} between two increments is smaller than 2.5 × 10^{−7}.
The last source of inaccuracy, postNewtonian truncation, is intrinsic to the theoretical framework used. Indeed, although the equations of motion and the various conserved quantities of Appendix A all consistently derive from the same Lagrangian and can therefore be exactly verified in principle, the method of calculus by successive approximation does not in practice achieve that result. Indeed, since the Lagrangian itself is an approximation to order v^{2}/c^{2} ∼ 5 × 10^{−7} of a more general theory, there is no physical justification for conserving in the subsequent derivation any term of higher order. It follows that the equations of motion and the corresponding conserved quantities are only accurate to first order (1PN) and that systematic ‘residuals’ of order v^{4}/c^{4} ∼ 2 × 10^{−13} (2PN) are present in the equations themselves. As a result, we see on Fig. 5 that the energy of the system is conserved up to systematic oscillations at the orbital frequencies accompanied with a linear drift at a level consistent with the neglected 2PN terms, and numerical noise does not play any significant role. More interesting regarding the timing accuracy is to look at the conservation of the centreofmass position. Indeed, a shift in this position immediately transforms into a geometric delay. We find that, due to the fact that the neglected 2PN terms in the equations of motion do not necessarily generate residuals which average to zero, the two successive integrations leading to the centreofmass motion create a quadratic drift that increases over time. Through the time span of our observations this results into a drift of less than 10 m, namely about 30 ns in terms of geometric delay. Such a quadratic drift can undoubtedly be entirely absorbed in the spin frequency and spinfrequency derivative as well as by the astrometric parameters when fitting the data. For example, the linear drift reported on Fig. 6 would bias the spin frequency by ∼10^{−14} Hz, much less than the nevertheless very tight uncertainty on this parameter. Therefore we conclude that the quadratic drift can only result in a negligible bias of a few parameters which is why we subtract this component with a linearleastsquare fit on Fig. 6. The residuals show that the systematic oscillatory 2PN motion of the centre of mass does not exceed ∼0.1 m, or 0.3 ns in terms of geometric delay, which is well within our tolerance.
Fig. 5. Relative error in energy conservation during the numerical integration of the equations of motion over the 2268 day span of our data. The envelope of the signal oscillates at the outer binary frequency while a zoom would show a fast oscillation at the inner binary orbital frequency. 
Fig. 6. Three components of the centreofmass position variation during the integration of the equations of motion over the 2268 day span of our data. A quadratic component has been fitted out (see text and legend) which leaves only the oscillatory components. The main visible oscillation is at the frequency of the outer binary while a zoom would show a fast oscillation at the inner binary orbital frequency. 
In practice, the largest systematic errors may come from unmodelled effects. In particular, gravitational wave damping in the inner binary should account for a few nanoseconds after 5 years. Another effect that might become important for highprecision timing over time is the effect of the gravitational quadrupole moment of the inner white dwarf. Indeed this star should be slightly deformed by the tidal field of the neutron star and by its spin which would lead to a slight correction to the orbital precession rate.
4. Bayesian analysis results
Our main goal in this work has been to get a Bayesian estimate of the uncertainties on each of the parameters of the problem, or in other words to estimate the posterior probability density function (PDF) of the parameters θ belonging to model M given our data D using Bayes’ rule,
The prior function, P(θ, M), was chosen flat except for astrometric parameters that benefited from prior knowledge of position and angular proper motion from the Gaia mission DR2 (Lindegren et al. 2018), of distance from photometric observations of the inner white dwarf (Ransom et al. 2014) and radial velocity from optical spectroscopy of the same star (Kaplan et al. 2014). The Gaia DR2 catalogue does not model orbital motion which may then contaminate both position and proper motion. In the present case, given the distance of the source the magnitude of orbital motion is similar to the uncertainties reported by Gaia DR2. In order to account for potential systematic errors we have multiplied by two these uncertainties before using them as standard deviations of our Gaussian priors (see Table 2). We have also applied a factor of two to the uncertainty on the photometric distance reported in Ransom et al. (2014) in order to account for potential systematic effects that would bias this measurement. For instance an inaccurate spectroscopic estimate of surface gravity (the ‘high log g problem’ in lowmass white dwarfs, see Tremblay et al. 2015 and references therein) would in turn bias the stellar radius estimate and therefore the absolute magnitude of the star. It is worth pointing out that the two commonly used freeelectron density models for the Galaxy, NE2001 (Cordes & Lazio 2002) and YMW16 (Yao et al. 2017), both predict a distance of about 800 pc which is significantly smaller that reported in Tables 1–2. This indicates that the electron density for the given Galactic height (z = −690(40) pc) is overestimated. All priors are summarised in Table 2. Let us note that our fit for the radial proper motion μ_{d} is unconstraining as the uncertainties reported in Table 1 match the radial velocity prior of Table 2. The uncertainties of all the other fitted quantities are improved with respect to their prior.
Gaussian priors adopted in MCMC posterior inference.
The high dimensionality of the PDF together with the necessity to integrate numerically the equations of motion makes the problem computationally challenging. However our C++ code is able to calculate one PDF value in less than 10 s on a lastgeneration laptop, which made it possible to sample the PDF on a mediumsize computer cluster. The sampling was achieved using a homemade implementation of the affineinvariant Markovchain Monte Carlo (MCMC) of Goodman & Weare (2010) parallelised with the scheme of ForemanMackey et al. (2013). The advantage of this algorithm is to be efficient in high dimensionality (Allison & Dunkley 2014) and insensitive to any level of linear correlations between the parameters. This is particularly important as we found Δ to be highly correlated with many orbital parameters (see Fig. 2). However, we also found that nonlinear ‘correlations’ between parameters were preventing convergence within a reasonable time, which was solved by appropriate reparametrisation (see Sect. 3.1). Convergence was evaluated by requiring that fluctuations of the mean and standarddeviation estimators be smaller that 6% of the fullchain standard deviation for each parameter (see e.g. Dunkley et al. 2005, Sect. 4.1 and chain plot in supplementary online material). We noticed that standard deviations sometimes converge later than means, particularly for Δ, confirming the importance of monitoring both indicators to ensure reliable uncertainties.
Due to its very low ecliptic latitude, ∼2 deg, the timing of PSR J0337+1715 is potentially sensitive to a range of effects occurring in the Solarsystem. In particular, we detected in preliminary runs a slight increase in timing residuals of the order of 1 μs when the pulsar was within 3 deg of the Sun. We attributed this increase to the inaccuracy of the Solarwind electron density model used by TEMPO2 to calculate the related DM. We mitigated this effect by removing all the ToAs taken within 5 deg of the Sun. Moreover, our periodogram shows a secondary ∼0.2 μs component close to the Earth orbital frequency, sign of possible extra inaccuracies in the Solarwind model or in the Solarsystem ephemerides. This is likely to affect outerorbit parameters since this period is close to 1 year but such a correlation can only widen posterior uncertainties.
Two models M were tested. Our main model includes only Δ as a free parameter while our secondary model includes the three additional 1PNstrongfield parameters, , , , yielding the following 95% C.L. constraints for them:
In regard to binarypulsar tests (see Sect. 5), the above results on the three parameters are unconstraining. We used this prior knowledge to run our main model with and obtain our primary SEP limit
which translates into Δ < 2.05 × 10^{−6} at 95% C.L. (see Fig. 7). The full result of the main model is reported in Table 1. Note that ∼8% of the reported uncertainties are due to unaccounted systematics absorbed in the EFAC parameter (see also Sect. 4.2). The wider uncertainty obtained in the secondary run is due to large correlations with the three additional parameters.
Fig. 7. Lefthand side: marginalised posterior probability distribution of the SEP violation parameter Δ sampled by MCMC and normal law with the same mean and standard deviation. The upper axis gives the mean value and the boundaries of the 95% confidence region. Righthand side: distribution of distance to GR derived from the lefthandside distribution. 
4.1. MCMC run and convergence
The affineinvariant algorithm of Goodman & Weare (2010) requires to move N walkers together at each iteration. The gist of this algorithm is that the walkers within the set are not independent from each other while the set as a whole constitutes a single effective walker in the Markov process sense, namely that it depends only on its previous state. Individual moves within the set are informed by the positions of other walkers in a way that renders the algorithm rigorously immune to any linear correlation, or any affine parameter transformation. However it might be sensitive to correlation of a higher degree, or to nonconvexity of the posterior isosurfaces. Therefore, with this algorithm one should take care of removing as much as possible any nonlinear correlations by choosing an appropriate parameter set (see Sect. 3.1) but very large linear correlations, as can be seen in Fig. 2, are well resolved by the algorithm.
The authors of Goodman & Weare (2010), ForemanMackey et al. (2013) recommend to choose a number of walkers within the set much larger than the number of parameters. In the present case we chose to use 288 walkers per set. The only other tunnable parameter is the unique parameter of the proposal function, a, which controls the size of the steps that can be attempted. The authors of Goodman & Weare (2010), ForemanMackey et al. (2013) suggest the value a = 2 in order to keep an acceptance fraction ∼0.4. We found that, as the chain was getting closer to convergence the acceptance fraction could drop dramatically, sign of nonlinear correlations or nonconvexity. This drop was largely mitigated by adopting the final parameter set of Sect. 3.1, but we still had to choose a smaller stepsize parameter a = 1.6 to keep the acceptance fraction close to the level mentioned above.
Due to the large parameter space and the computing time needed to compute one χ^{2} (about 10 s) we parallelised the MCMC code using the scheme of ForemanMackey et al. (2013). This allowed us to run the MCMC using 144 cores of the mesoscale MesoPSL cluster (see acknowledgements) each calculating 2 walkers (which is the minimum number of walkers per core given the algorithm used). A few 10 000 steps were typically necessary to reach convergence. However, it is important to quantitatively estimate convergence as one cannot afford to run the MCMC for an unnecessarily large number of iterations. Beyond visual inspection of the parameter chains which allows to discard any obvious burnin phase, we also monitored the autocorrelation time of each chain (Goodman & Weare 2010). However, we found that the most constraining convergence estimator was to monitor the variation of the mean and standard deviation of each parameter chain. Formally, one needs to compute the value of the following estimator on each chain (Dunkley et al. 2005),
where is a standarddeviation estimator, is a statistical estimator which here is either the mean or the standard deviation , and σ is the standard deviation estimated on the entire length of the chain. In practice, we recorded the 288 walkers every 5 iterations and use the last 69 408 recorded elements (241 independent sets of 288 walkers). The standard deviation of was estimated by applying on 20 subsamples of the chain and then estimating the standard deviation of the set of the values obtained. The chain was considered converged if both and estimated values are smaller than 0.06. A situation we observed is when the latter keeps varying significantly while the former is stable and under the cutoff. In other words, the chain widens with constant mean, rendering a meanbased convergence estimator uninformative and possibly leading to an underestimation of the parameter uncertainties.
4.2. Analysis of the residuals
We have assessed the robustness of our fit by evaluating the presence of systematic components in the residuals (Fig. 8). As it appears from Fig. 9, no significant structure is present at either the inner or outer orbital period, nor at the Earth orbital period notwithstanding the sharp cut around the phase of closest approach to the Sun made to prevent any bias induced by unmodelled DM contributions.
Fig. 9. Residuals of the times of arrival corresponding to the parameters of Table 1 plotted versus the innerbinary orbital phase (top), the outerbinary orbital phase (middle) and the Earth orbital phase (bottom). The red vertical bars of the first two plots show the phase where the pulsar (resp. the inner binary) is closer to the Solar System. The red vertical bar on the bottom plot, shows the Earth orbital phase where the line of sight to the pulsar passes closest to the Sun. We have removed every ToA when the pulsar is less than 5 deg from the Sun, which explains the gap around that particular phase. 
In order to estimate more thoroughly the presence of systematic modulations we produced a LombScargle periodogram (Lomb 1976; Scargle 1982) of the same residuals, Fig. 4. It appears that indeed there is no significant power at the frequencies mentioned above, except for a tentative signal ≲0.2 μs at the Earth orbital frequency. Because of the proximity of this frequency with the outerbinary orbital frequency this might lead to correlate Solarsystem and outerbinary effects and therefore enlarge uncertainties related to the parameters involved.
The dominant component is the lowfrequency peak and its subsequent harmonics which we interpret as timecorrelated red noise. A number of causes have been proposed in the literature. The main ones are the intrinsic spin frequency noise (Shannon & Cordes 2010; Melatos & Link 2014) or magnetospheric fluctuations (Lyne et al. 2010). It has also been proposed that asteroid belts could result in apparent timing noise (Shannon et al. 2013). Propagation effects under the form of longterm variations of the dispersion measure along the line of sight due to turbulence in the interstellar medium could be the cause in some cases (Keith et al. 2013). We have tried to split the time span of our observations into several intervals with different DM values, but the fit was consistent with an absence of variation discarding this explanation. Red noise can also have a local cause, such as irregularities of the terrestrial time realisation used to time the pulsar (Hobbs et al. 2020, 2012) or inaccuracies in the planetary masses used to generate the Solar System ephemeris (Champion et al. 2010; Caballero et al. 2018). However, we use the 2016 realisation of the BIPM terrestrial time which Hobbs et al. (2012) has confirmed as suitable for precision pulsar timing. Besides, if the red noise was caused by any inaccuracy in planetary masses, then the signature would be at the orbital frequency of the responsible planet (Champion et al. 2010). The only orbital period in the Solar System that approaches the 1650 days of the rednoise signal is the orbital period of the dwarf planet Ceres. However the uncertainty on its mass in the NASA JPL ephemeris DE430 (Folkner et al. 2014) we use in this work is too small to explain a signal of that magnitude.
Thus, there is no deterministic model that can be fitted to that component, but since its frequency is much lower than any other in the system it is unlikely to bias the parameters. However, it results in increasing the scale of the ToA uncertainties (EFAC parameter in Table 1) in order to accommodate this systematic effect into a reduced χ^{2} equal to unity. Were the PDF perfectly Gaussian, that would result into posterior uncertainties increased in exactly the same proportion, which we can here estimate at ∼8%. Therefore, our posterior uncertainties should be seen as upper limits. Interestingly, because the analysis of Archibald et al. (2018) focuses on a specific frequency signature for the SEP, the result quoted in that work should be seen as a lower limit on the actual uncertainty in the sense of the above discussion.
Finally, the prominent peak at 1 day^{−1} and its harmonics on the periodogram of Fig. 4 result from the convolution of the red noise component with the observing window functions of our data. Indeed, due to its meridian configuration, the Nançay radio telescope can only observe the same object during ∼1 h windows separated by an integer number of sidereal day. Due to the proximity of the inner orbital frequency with the observing frequency of ∼1 day^{−1} one might be concerned with a potential bias. However we have checked that the Fourier transform of a comb of 1 h window functions results in sharp narrow peaks whose width does not exceed a few percents of the daily frequency and therefore cannot significantly bias signals at the orbital frequencies.
5. Theory dependent tests
The parametrised postNewtonian (PPN) formalism (see e.g. Will 2018a), with its ten theoryindependent parameters, has proven to be a powerful tool to quantify and compare tests of GR and its alternatives in the weakfield environment of the Solar System. Unfortunately, there is no such framework that generically extends beyond the weak field approximation of the PPN formalism and therefore is able to cover gravity experiments with strongly selfgravitating bodies, like the one in this paper. One reason is that, unlike in the Solar System, the treatment of the motion of a neutron star in an external gravitational field requires the full complexity of a gravity theory (Will 2018b). To put the UFF test conducted in this paper into context with other experiments, in particular Solar System tests and gravitational wave tests with binary pulsars, it has been suggested to use theorydependent frameworks (see e.g. Damour 2009). Scalartensor theories of gravity have turned out to be particularly useful for this purpose. Apart from being well motivated and well studied alternatives to GR (Fujii & Maeda 2007), they show a rich phenomenology in their deviations from GR, including prominent effects related to the nonlinear strongfield regime of neutron stars (see e.g. Damour & EspositoFarèse 1993, 1996).
In this paper, as a theorydependent framework we use the class of BergmannWagoner theories. BergmannWagoner theories represent the most general scalar–tensor theories with one scalar field that are at most quadratic in the derivatives of the fields (Will 2018a). Quite a number of well known scalartensor theories belong to this class, like JordanFierzBransDicke (JFBD) gravity (Jordan 1955; Fierz 1956; Brans & Dicke 1961), DEF gravity (Damour & EspositoFarèse 1993), MO gravity (Mendes & Ortiz 2016), f(R) gravity (De Felice & Tsujikawa 2010), and massive BransDicke gravity (Alsing et al. 2012). BergmannWagoner theories form a subclass of the class of Horndeski theories (Horndeski 1974), which is the most general class of monoscalartensor theories in four dimensions whose Lagrangian leads to second order field equations.
In the following we interpret our limits of Sect. 4 in two different approaches within the class of BergmannWagoner theories. In the first approach we remain (mostly) generic, in the sense that make as few assumptions as possible concerning the details of the theory. In the second approach, we pick a specific twoparameter scalartensor theory, that is, DEF gravity. For this twoparameter class of theories we can then explicitly calculate the properties of neutron stars for different equations of state (EOS) and derive constraints on the twodimensional theory space from the limits presented here.
5.1. Generic tests within BergmannWagoner scalartensor gravity
In BergmannWagoner theories, the field equations for the (physical) metric g_{μν} and the scalar field ϕ are a result of the following action
where G_{0} is the fundamental (‘bare’) gravitational constant, g the determinant of g_{μν}, R the curvature scalar, ω(ϕ) is the coupling function, and U(ϕ) the scalar potential. The physical (Newtonian) gravitational constant, as measured in a Cavendishtype experiment, is given by
where ϕ_{0} denotes the cosmological background field and ζ ≡ 1/(2ω(ϕ_{0})+4). S_{mat} is the action of the matter fields , which couple universally to the spacetime metric g_{μν}. For our discussion, we assume that U(ϕ) can be neglected on the scale of the triple system, that is, . In terms of a massive scalar field, this means that we assume that the Compton wavelength is much larger than the extension of the system (see Seymour & Yagi 2019, on how J0337+1715 can be used to constrain massive scalar fields).
The effective gravitational constant that enters the Nbody Lagrangian is given by
where the sensitivity
accounts for the dependence of each body on a change in the ambient scalar field, while the number of baryons remains fixed (Will 2018a). For neutron stars, s_{a} depends on the EOS. It is typically of the order of 0.1 but, depending on the details of ω(ϕ), its (absolute) value can be very large, as we discuss further below. For the relative GWEP parameter one finds
Because of the product s_{a}s_{b}, in general it is not possible to interpret the quasiNewtonian equations of motion in terms of inertial and gravitational masses of the individual bodies in an Nbody system (Will 2018a). For weakly selfgravitating bodies the sensitivity s_{a} is simply related to the fractional gravitational binding energy ε_{grav, a} via
where λ ≡ ϕ_{0}ω′(ϕ_{0})ζ^{2}(1 − ζ)^{−1}. The two Eddington parameters of the PPN formalism are given by
and the Nordtvedt parameter of Eq. (4) reads
For the inner and outer white dwarf we have ε_{grav, i} ≃ −1.8 × 10^{−5} and ε_{grav, o} ≃ −4.8 × 10^{−5}, respectively. Consequently G_{io} ≃ G_{N} for the interaction between the two white dwarfs, and G_{pi} ≃ G_{po} ≃ G_{N}(1 − 2ζs_{p}) for the interaction between the pulsar and the white dwarfs. Hence, our result for Δ in Table 1 leads to a direct constraint for
where Δ ≡ Δ_{pb}(s_{b} = 0). The above limit can be considered as generic within the family of BergmannWagoner theories of gravity, in the sense that it does not require a specification of the coupling function ω(ϕ). Later, we use this limit to impose constraints on the parameter space of a specific two parameter family of BergmannWagoner theories. Before that, we need to discuss the strongfield modifications at the postNewtonian level of the 3body dynamics.
5.1.1. First postNewtonian contributions
At the first postNewtonian (1PN) level (order v^{2}/c^{2}), the PPN parameters and need to be replaced by the bodydependent quantities and (see Appendix A for details). These strongfield generalisations of the PPN Eddington parameters depend on the sensitivities and their derivatives of the bodies in a system. For the detailed expressions, we refer the reader to Will (1993) and Damour & EspositoFarèse (1992). The latter uses the socalled Einsteinframe representation and gives these terms for multiscalartensor theories. More generally, in the triple system of PSR J0337+1715, where two of the bodies are weakly self gravitating, one finds for the twelve 1PNstrongfield parameters, to good approximation,
leaving us with six different parameters at the 1PN level of the modified EinsteinInfeldHoffmann equations of motion, instead of the two in the weak field limit. Note the following symmetries: and .
At this stage, we can further reduce the number of 1PN parameters, without making more detailed assumptions about the theory, for instance about ω(ϕ). The tight limits on and from Solar System tests ∼10^{−5} (Will 2018a), directly put tight constraints on two of the six 1PN parameters. Furthermore, only depends on terms proportional to ζ and ζs_{p}, the first being constrained to ∼10^{−5} by Cassini (Bertotti et al. 2003) and the latter to ∼10^{−6} already by the Newtonianlevel dynamics of the triple system (cf. Eq. (27); see also limit (13)). Consequently, without loss of generality, , , and can be ignored in a selfconsistent gravity test with the PSR J0337+1715 system. Besides the Δ at the Newtonian level, we are left with the 1PN strongfield parameters , , and . These three parameters cannot be constrained without further assumptions, as we discuss below. Hence we have implemented a model based on deviations from GR parametrised by , which is called secondary model in Sects. 3 and 4. Our analysis based on this model leads to the generic limits (13)–(16).
In our generic approach, the parameters , can apriori only be constrained if we make further assumptions and apply existing constraints from binarypulsar systems. The reason is as follows. The three parameters have terms, which are proportional to λζ, , λζs_{p}, , and , where
(see e.g. Will 2018b, for details). Solar System constraints on β_{PPN} and η put tight constraints (∼10^{−5}) on λζ, and is constrained to ∼10^{−12} because of Eq. (27). However, the quantities λ, s_{p}, and are unconstrained by above considerations. Consequently, λζs_{p}, , and are apriori unconstrained. Below, in a more theory specific context, we discuss a situation in DEF gravity where can be of order unity, even if ζs_{p} ≪ 1 (spontaneous scalarisation of neutron stars). In such a highly nonlinear strong field regime, s_{p} and can assume very large values. As a consequence, the terms can become much larger than β_{PPN}, even under condition (27). To give an example, in the regime of spontaneous scalarisation in DEF gravity, one finds while remains practically unaffected when ζ → 0 (cf. Damour & EspositoFarèse 1996).
If we restrict λ to values which are not very large, it can be shown that can be considered as small as well, since it only contains λζs_{p} as an a priori unconstrained term, where ζs_{p} has to be small according to Eq. (27).
Further constraints can come from binary pulsar experiments. In particular, dipolarradiation tests in binary pulsars with spectroscopic white dwarfs (see e.g. Lazaridis et al. 2009; Freire et al. 2012b), can in principle provide generic constraints on , although some additional assumptions are needed, for instance for U(ϕ) (cf. Alsing et al. 2012). If U(ϕ) = 0, the change in the orbital period of a pulsarwhite dwarf binary, P_{b}, due to dipolar radiation damping is given by
(see e.g. Eq. (12.32) in Will 2018a), where e denotes the orbital eccentricity, and m_{p} and m_{c} are the masses of pulsar and whitedwarf companion respectively. To apply constraints from other pulsar to the PSR J0337+1715 system it also requires some on the dependence of the sensitivity of the pulsar, s_{p}, on the pulsar mass. In the strongfield regime of neutron stars this dependence can be highly nonlinear (Damour & EspositoFarèse 1996; Shao et al. 2017). PSR J1738+0333 is a pulsar with a mass similar to PSR J0337+1715 (m_{p} ≃ 1.46 M_{⊙}). The dipolar radiation test by Freire et al. (2012b); Zhu et al. (2019) leads to
As a result, can, in general, also be assumed to be small in the PSR J0337+1715 system.
Imposing a generic constraint on , and therefore on , is somewhat less direct. enters, for instance, the precession of periastron, , which is particularly well tested – in combination with other postKeplerian parameters – in eccentric shortorbitalperiod binary pulsars (Wex 2014; Will 2018a). However, only the so called Double Pulsar allows for a generic constraint on deviations from GR in , which is of the order of 10^{−3} (Kramer & Wex 2009). However, the masses of the Double Pulsar are significantly lower than the mass of PSR J0337+1715. Nevertheless, the general agreement of all these systems with GR at least suggests that , and therefore can generally be assumed to be small as well. Moreover, given that the Cassini experiment already imposes ζ^{2} ≲ 10^{−10}, would have to assume quite extreme values to lead to a significant , certainly in view of the (still) quite weak limit (16).
To summarise, under additional assumptions, which we consider as reasonable for most situations, all three strongfield parameters are tightly constrained by a combination of Eq. (27) with constraints from binary pulsars experiments. The limits in Eqs. (14)–(16) are therefore generally not of particular interest, at least for constraining the class of scalartensor theories considered in this section.
5.2. EOSagnostic constraints on Damour–EspositoFarèse (DEF) gravity
In order to explicitly calculate s_{p} and and therefore Δ and the 1PN strongfield parameters that enter our equations of motion, one has to pick a specific theory of gravity, which we do in this subsection. In the quadratic monoscalar tensor theory T_{1}(α_{0}, β_{0}) of Damour & EspositoFarèse (1993), the coupling function in the Einstein frame is quadratic in the scalar field, meaning that the coupling strength between the scalar field and the trace of the stressenergy tensor becomes field dependent in a linear way. In the Jordanframe representation with the physical metric g_{μν}, which we are using here, the coupling function then reads
where ϕ_{0} ≡ 1, without loss of generality (Will 2018a). Furthermore, one finds
The tight constraints on ζ from the Cassini mission imply that . Furthermore, from Eq. (26) one then finds for the Nordtvedt parameter
The sensitivity s_{a} of a weakly selfgravitating body, like the WD companions to J0337+1715, can be calculated according to Eq. (24):
For neutron stars, the absolute value of the sensitivity can become very large if β_{0} ≲ −4.5, a fact first discovered within T_{1}(α_{0}, β_{0}) gravity theories by Damour & EspositoFarèse (1993), and generally referred to as ‘spontaneous scalarisation’. As a result, even for arbitrarily small α_{0}, the quantity α_{0}s_{a} ≃ ζ^{1/2}s_{a} remains at order unity^{10}.
A special case of T_{1}(α_{0}, β_{0}) is JFBD gravity, for which β_{0} = 0. In that case λ = 0, and the coupling function is a constant:
which is called the BransDicke parameter. For ω_{BD} → ∞, that is, α_{0} and ζ → 0, JFBD gravity approaches GR. We obtain the most conservative limits on JFBD when using the stiffest EOS from our set of viable EOSs (see Fig. 10), that is, BSk22. For this EOS, in JFBD gravity, the sensitivity of PSR J0337+1715 has the value s_{p} = 0.149. Most importantly, for ζ ≲ 10^{2}, this value is practically independent of ζ (Shibata et al. 2014; Shao et al. 2017). Hence, Eq. (27) can directly be converted into limits on the coupling parameter:
Fig. 10. Radiusmass diagram for the 12 EOSs used in this paper to accomplish a good coverage of the range from soft to stiff EOSs, while still being in agreement with the tidal deformability test in the GW170817 LIGO/Virgo event (Abbott et al. 2017, 2018) (black curves). The blue dashed line indicates the mass of PSR J0337+1715, and the red dotted line corresponds to the most massive neutron star used in our combined test: PSR J0348+0432 (Antoniadis et al. 2013). The black curves correspond to the following EOSs (from left to right in their intersection with the red dotted line): WFF1, SLy4, WFF2, AP4, BSk20, ENG, SLy9, AP3, BSk25, BSk21, MPA1, BSk22 (see Lattimer & Prakash 2001 and https://compose.obspm.fr). Our stiffest EOS, BSk22, is also in agreement with the (more model dependent) constraint of M_{max} ≲ 2.3 M_{⊙} by Rezzolla et al. (2018), Shibata et al. (2019). We have included EOS H4 (grey curve), which is disfavoured by GW170817, and therefore has not been used in Figs. 11 and 12. All these EOSs also agree with the latest constraints from NICER (Miller et al. 2019). 
Consequently, using Eq. (41), while keeping in mind that according to Eq. (38) ζ ≥ 0, one finds
This limit is more than a factor of three larger, that is, more constraining, than the Cassini limit (Will 2018a). When using EOS H4, which is already disfavoured by the GW170817 LIGO/Virgo event, we find ω_{BD} > 130 000, which is only marginally weaker than the above limit. Just to illustrate the EOS dependence of the limit on JFBD gravity, for the soft EOS WFF1 (outer/left in Fig. 10), the lower limit for ω_{BD} increases to 180 000.
While a stiffer EOS gives a more conservative limit for JFBD gravity, such a general statement is no longer true for the whole α_{0}–β_{0} parameter space of T_{1}(α_{0}, β_{0}). In particular for certain negative values of β_{0}, a softer EOS can be more conservative. For low and medium mass neutron stars, like PSR J0337+1715, the β_{0} range where that is the case is rather small (see Fig. 11). For high mass neutron stars the situation is quite different. A soft EOS that has a maximum mass close to the mass of the neutron star leads to considerably weaker limits for all β_{0} ≲ −2 (Shibata et al. 2014; Shao et al. 2017). This is of particular importance for constraints from pulsars like PSR J0348+0432 (Antoniadis et al. 2013) (see Fig. 10). Hence for our combined constraints on the parameter space of T_{1}(α_{0}, β_{0}) theories we used a set of EOSs that provide a good coverage of the range from soft to stiff. Furthermore, if for a given point (α_{0}, β_{0}), which corresponds to a specific gravity theory, there is a single EOS from our set with which all pulsar constraints are fulfilled then this point in the theory space is not excluded. For our joint analysis we have used the UFF results from this paper in combination with the dipolar radiation tests of PSRs J1012+5307 (Desvignes et al. 2016; Antoniadis et al. 2016), J1141−6545 (Bhat et al. 2008), J1738+0333 (Freire et al. 2012b; Zhu et al. 2019), J1909−3744 (Desvignes et al. 2016; Arzoumanian et al. 2018), and J2222−0137 (Cognard et al. 2017). Our results are shown in Fig. 12.
Fig. 11. PSR J0337+1715 constraints on the α_{0}–β_{0} space of scalartensor theories (Damour & EspositoFarèse 1993, 1996) from Eq. (17): the area under the curve is still allowed by experiments. Two different neutronstar equations of state are used: a soft one, WFF1 (red curve), and a stiff one, BSk22 (blue curve). The two solid lines use the SEP constraint of this paper. The grey curves show the 2σlimits from Solarsystem experiment: Cassini (solid), LLR (dashed), MESSENGER (dotted). JFBD gravity corresponds to β_{0} = 0 (thin vertical line). 
Fig. 12. Combined EOSagnostic pulsar constraints on the α_{0}–β_{0} space of scalartensor theories (Damour & EspositoFarèse 1993, 1996) from Eq. (17): the area under the curve is still allowed by experiments. The blue curve is the result of a combination of dipolar radiation tests from six pulsarWD binary systems (see text for details). The red curve indicates the improvement when the constraints from this paper are added to the dipolar radiation tests. The grey curves show the 2σlimits from Solarsystem experiment: Cassini (solid), LLR (dashed), MESSENGER (dotted). JFBD gravity corresponds to β_{0} = 0 (thin vertical line). 
6. Conclusions
We described in this paper a test of the universality of free fall (UFF) with the pulsar in a triple star system, PSR J0337+1715. The result we obtain for the UFF violation parameter for the MSP is Δ = ( + 0.5 ± 1.8) × 10^{−6} (95% C.L.), which can be stated as a limit, Δ < 2.05 × 10^{−6} (also 95% C.L.). This represents 30% improvement over the previous test using the same pulsar (Archibald et al. 2018). Interestingly, although we obtain a similar value for Δ, the nature of our limit is different: the uncertainty reported in this work is statistical, while the result of Archibald et al. (2018) is largely made of a systematic uncertainty which we did not find necessary in the present analysis. This particular difference makes our two limits difficult to compare in absence of a physically motivated model for the systematic bias, but should also provide an independent verification of the solidity of the result.
Furthermore, in a generic approach we also provide limits for three postNewtonian strongfield parameters of the threebody interaction, and discuss in detail the relevance of these limits. In view of other binary pulsar limits, it seems that these limits might be of interest only in very specific situations.
As for Archibald et al. (2018), our results are fully consistent with the predictions of GR. This limit strongly constrains SEP violation and any alternative theories of gravity that predict a violation of the universality of free fall for selfgravitating masses (GWEP), particularly for neutron stars with masses similar to that of PSR J0337+1715. In this paper we explicitly calculate these constraints for a wide class of gravity theories, and as a part of this derive EOSindependent constraints on the parameter space of quadratic monoscalartensor gravity. Specifically, for the coupling parameter of JordanFierzBransDicke (JFBD) gravity we find ω_{BD} > 140 000, which is the so far the tightest limit for this scalartensor theory. We also present new constraints for Damour–EspositoFarèse (DEF) gravity, a quadratic extension of JFBD gravity. We combine our limit with limits from binary pulsar experiments while accounting for uncertainties in our knowledge of the equation of state (EOS) of neutronstar matter.
In what remains of the paper, we make a detailed comparison of this experiment with the best previous constraints on GWEP/SEP violation. In Sect. 6.1 we make a more detailed comparison with the experiment by Archibald et al. (2018). In Sect. 6.2, we compare our experiment to radiative experiments from binary pulsars, which have also produced strong and complementary constraints on GWEP violation via their strong constraints on the emission of dipolar gravitational waves. Furthermore, we compare the present limit with potential future limits on dipolar radiation from binary neutronstar and neutron starblack hole mergers.
In all of these experiments, no GWEP violation can be detected; gravity behaves, to within observable precision, as described by GR, which is conjectured to be the only viable theory which fully embodies the SEP.
6.1. Comparison with previous work on J0337+1715
This work distinguishes itself from the Archibald et al. (2018) on the following points:

Independent data set;

Independent timing model including additional effects;

Statisticslimited versus systematicslimited accuracy;

Tension in the mass measurements and the first measurement of Ω_{O};

Generic test of those strongfield postNewtonian parameters, which are apriori unconstrained, even within a broad class of scalartensor theories.

EOSagnostic constraints on DEF gravity, while accounting for the latest observational constraints on the range of EOSs.
Point (1) benefits from the well sampled timing data acquired by the Nançay radio telescope alone (Sect. 2). All the observations used here were conducted within the same frequency range (1.2−1.7 GHz) and with the same environmental conditions since Nançay is a meridian Kraus design telescope. The NUPPI instrumentation is also routinely used for long term highprecision timing providing excellent and stable results. The instrumentation did not change since its installation in 2011 and there is no need for any time jump in the whole dataset since 2011.
Point (2) makes use of NUTIMO (Sect. 3) which has the specificity of allowing for a fully selfconsistent treatment of astrometric parameters via the binding with TEMPO2 and the inclusion of Kopeikin and Shklovskii delays in the model. The model in Archibald et al. (2018) did not include these delays and used a local linear approximation for astrometric corrections. The argument in favour of such proxy was that any systematic effect caused by these approximations should not affect the main SEP signature which has a different frequency. However, we observe that the astrometry then found differs significantly from prior knowledge and in particular Gaia observations which led Archibald et al. (2018) to acknowledge that the resulting astrometry should not be used for other applications. In addition, NUTIMO also fits consistently for DM and DM variations and allows to check for local epoch DM variations (DMX parameters in TEMPO2) which revealed no fluctuations over time. Note that Archibald et al. (2018) did fit DM over 1 year time intervals and marginalised over these parameters using the solution of a leastsquare fit. We also included the aberration delay that neither Ransom et al. (2014) nor Archibald et al. (2018) mention. This delay has a very small amplitude (sum of ∼30 ns sinusoid at the outer period and a ∼0.1 μs sinusoid at the inner period) and therefore can easily be absorbed by other parameter in a fit, but still creates a signal of magnitude larger than the expected SEP sensitivity. In NUTIMO, potential systematic effects that may not be accounted for by the model are absorbed in a rescaling of the error bars of the times of arrival via the EFAC parameter which ensures a reduced χ^{2} equal to unity. This in turns conservatively increases uncertainties on the posterior parameters.
Point (3) arises from the fact that, in Archibald et al. (2018), most of the total reported uncertainty of Δ (±0.74 × 10^{−6} at 68% C.L.) is associated with systematic uncertainties while in this work our uncertainty is mostly statistical. We account for unmodelled systematic effects, mostly a rednoise component, via the EFAC parameter which is responsible for a modest and conservative widening of ∼8% of the uncertainties. The statistical uncertainty of Archibald et al. (2018) is estimated using MCMC sampling similarly as we do in this work and results in Δ = (− 1.1 ± 0.2) × 10^{−6} (68% C.L.). Taken alone, this would signify a 5sigma SEP violation. However, the authors argue that most of the uncertainty comes from unaccounted systematic effects which could generate a signal at the signature frequency of an SEP violation. In other words, it is claimed that the accuracy is systematicslimited while in this work we are statisticslimited. The physical mechanism of the systematics being unknown, Archibald et al. (2018) propose to model systematics using an empirical stochastic model where the extraneous signal is a weighted sum of sine and cosine functions at frequencies kf_{i} + lf_{o} (k, l being small integers) whose weights are drawn from a single Gaussian distribution for each particular realisation. In order to sample the distribution of Δ caused by different realisations of the (stochastic) systematics, Archibald et al. (2018) bootstrapped many sets of synthetic data from the model, refitted the orbital model, and thus obtained a value of Δ for each synthetic dataset. This estimate heavily depends on the modelling choices for which no physical justification is currently available. It also seems unlikely that systematics should occur at frequencies kf_{i} + lf_{o} if the physical mechanism is unrelated to orbital motion (as an SEP violation would be), unless the signal at these frequencies is the tail (in Fourier space) of a systematic signal which peaks at a different frequency, but should then be seen in a periodogram such as Fig. 4. In this respect, Fig. 4 does not suggest that we should consider a systematicslimited regime in the present work. Although some systematics are present as red noise (see Sect. 4.2), there is no sign of an additional signal around the signature frequency.
Point (4) is about comparing the parameter set of Table 1 with the results of Archibald et al. (2018). A direct comparison is not straightforward because of (i) slightly different definition of the parameters (see Sect. 3.1) and (ii) the fact that most of the parameters are not constants of motion but are defined either at the reference time T_{ref} or T_{pos}. To minimise the span of numerical computations we do not use the same reference time as Archibald et al. (2018). However, one can compare masses which are constants of motion (and, in the same way, Δ). The values reported in Archibald et al. (2018), m_{p} = 1.4359(3) M_{⊙}, m_{i} = 0.19730(4) M_{⊙}, m_{o} = 0.40962(9) M_{⊙}^{11}, whose statistical 68% confidence intervals are about 5 times better than ours – similar to the ratio between the uncertainties on Δ – are in tension with the values we report in Table 1, with Δm_{p} ≃ 2.9σ, Δm_{i} ≃ 2.6σ, Δm_{o} ≃ 2.4σ where σ is half of the 68% confidence interval reported in this paper, these differences are much more significant than for Δ. Due to the very large correlation between Δ and the orbital parameters on which depend the masses (period and semimajor axis), the systematic uncertainty estimated in Archibald et al. (2018) for Δ should be similar for the masses (but not reported) and partly release the tension.
In addition, we report the first measurement of the outer longitude of ascending node, Ω_{O}, which was deemed unconstrained in Archibald et al. (2018) although the dispersion of the fit residuals in Archibald et al. (2018) is smaller than ours. We speculate that the absence of Kopeikin delay in their analysis prevented that measurement. However, this parameter is uncorrelated with Δ and should therefore not affect the SEP test, but its absence should bias astrometric parameters.
Point (5) is related to our implementation of the first postNewtonian equations of motion, derived from the modified EinsteinInfeldHoffmann Lagrangian for strongly selfgravitating masses (Appendix A). In this we use two different approaches, a (mostly) generic one where three of the 12 1PN strongfield parameters are unconstrained by Solar System experiments, and a second approach where, under additional assumptions, all the strongfield 1PN parameters are tightly constrained by adopting binary pulsar constraints for the neutron star sensitivity and its derivative. A detailed motivation for the two different approaches is given in Sect. 5. In the first approach we find generic limits for the remaining three 1PN strongfield parameters, which however are not very tight, and therefore generally not of particular interest. Archibald et al. (2018) do not provide an equally generic analysis as done in our first approach.
Point (6) refers to the constraints of quadratic monoscalartensor gravity, where Archibald et al. (2018) have used a single (outdated) EOS. In our combined tests we have fully accounted for our imperfect knowledge of the EOS of neutronstar matter, and used a set of modern EOSs that covers the range from soft to stiff EOSs. A reason for that is the fact that the most conservative pulsar limits do not always come from the stiffest EOS. Our set of EOSs is in agreement with the latest constraints from LIGO/Virgo and NICER, and can account for the largest neutron star masses measured to date (Antoniadis et al. 2013; Cromartie et al. 2019). We would like to point out, that the recent limits of Capano et al. (2020) exclude some of the stiffer EOSs used in our analysis, which consequently leads to even more stringent constraints than the ones shown in Fig. 12, in particular for β_{0} ≳ −1. In view of this, our limits can be considered as conservative.
6.2. Comparison with radiative tests
In GR, the lowest source multipole moment that generates gravitational waves is the quadrupole moment (Thorne 1980). In alternatives to GR, however, one finds lower multipoles, where for the dynamics of a binary system, the dipole moment is the most important one. The occurrence of these lower multipoles is closely related to a violation of the SEP (see e.g. Will 2018a, for a discussion). In scalartensor theories, for instance, an asymmetry in the sensitivities s_{a} in a binary system gives rise to scalar dipolar radiation (see Eq. (35)). While a difference in sensitivity is also the reason for a violation of GWEP, where masses with different compactness are falling differently in an external gravitational field (see Eq. (21)). In a sense, the UFF experiment with PSR J0337+1715 and constraints on dipolar radiation damping with binary pulsars are exploring two different sides of the same coin. The limits in Fig. 12 show that currently for a large part of the parameter space, the test with PSR J0337+1715 is more constraining than dipolar radiation tests from binary pulsars. For sufficiently, negative β_{0}, however, gravitational wave tests with binary pulsars become more constraining, in particular for small α_{0}. We have a more detailed discussion on this further below.
Gravitational wave observation of a double neutronstar merger can also be used to constrain the emission of dipolar gravitational waves, as has been done for the first LIGO/Virgo binary neutronstar merger GW170817 (Abbott et al. 2019). Limits on scalartensor theories as discussed here, from LIGO/Virgo observations, however, are not expected to be competitive with Solar System and pulsar experiments for most of the parameter space (see Shao et al. 2017). Future ground based gravitational wave detectors have the potential to improve on limits presented here, in particular in a β_{0} range which is difficult to constrain with pulsar experiments (see Fig. 13). Future gravitational wave observations of mixed (black hole + neutron star) mergers, in particular the combination of multiple events or the combination of ground and space based gravitationalwave observatories promise significant improvements (see e.g. Carson et al. 2020).
Fig. 13. Comparison of the PSR J0337+1715 constraints of this paper with expected constraints from future gravitational wave observatories of a single binary neutronstar merger: Cosmic Explorer (CE; blue dotted curve) and Einstein Telescope (ET; blue dashed curve). CE and ET curves are taken from Fig. 9 in Shao et al. (2017). Like for the CE and ET curves, the PSR J0337+1715 curve is also based on EOS AP4. Solar System constraints (grey) are as in Fig. 11. JFBD gravity corresponds to β_{0} = 0 (thin vertical line). 
As a final comment, there is an important difference between the UFF test conducted with PSR J0337+1715 and dipolar radiation tests. As discussed in the previous subsection, in the regime of spontaneous scalarisation, the neutron star charge can become (almost) independent of the parameter ζ in the sense that remains practically fixed for ζ → 0. In such a situation, the effective gravitational constant in the interaction between a neutron star and a white dwarf becomes indistinguishable from G_{N} and the test with J0337+1715 becomes practically insensitive to such deviations from GR. Dipolar radiation test with pulsarwhite dwarf systems, in contrast, are extremely constraining with respect to such scalarisation phenomena, as can be seen from Eq. (35). More generally, in situations where only the strong field of a neutron star can source additional (longrange) gravitational fields that lead to deviations from GR, the UFF test with PSR J0337+1715 cannot place any constraints, in contrast to radiative tests. Hence, both types of tests are complementary and valuable. Binary pulsar tests have already tightly constrained the occurrence of spontaneous scalarisation in neutron stars. However, depending on the EOS and mass of the neutron star, spontaneous scalarisation is not yet fully ruled out by such experiments (Shao et al. 2017).
Nordström’s conformallyflat scalar theory, which is also a metric theory, also fulfils the SEP, however, this is excluded by Solar System experiments (Deruelle 2011).
Shao (2016) investigates the possibility of constraining a difference in active and passive gravitational mass with the pulsar system under consideration in this paper.
TEMPO is a standard pulsar timing software, this can be found at http://tempo.sourceforge.net
Preferred location effects are already tightly constrained using pulsars by Shao & Wex (2013).
Source, data, and results are available at: https://doi.org/10.5281/zenodo.3778978.
Boost library version 1.55.0 www.boost.org
The effective scalar coupling α_{A} used by Damour & EspositoFarèse (1993) is linked to the sensitivity as defined here via α_{A} = α_{0}(1 − 2s_{a}). Furthermore, α_{0} ≃ ζ^{1/2} for small α_{0}.
Acknowledgments
G. Voisin acknowledges support of the European Research Council, under the European Unions Horizon 2020 research and innovation programme (grant agreement No.715051; Spiders). GV would like to thank F. Mottez and R. P. Breton for their support and helpful discussions during this project. GV also thanks A. Archibald for valuable discussions during the very early stages of this work. This work was granted access to the HPC resources of MesoPSL financed by the Region Ile de France and the project Equip@Meso (reference ANR10EQPX2901) of the programme Investissements d’Avenir supervised by the Agence Nationale pour la Recherche. We acknowledge financial support from the Action Fédératrice PhyFOG funded by Paris Observatory and from the ‘Programme National Gravitation, Références, Astronomie, Métrologie (PNGRAM)’ funded by CNRS/INSU and CNES, France. GD, MK and NW gratefully acknowledge support from European Research Council (ERC) Synergy Grant ‘BlackHoleCam’ Grant Agreement Number 610058. This work made use of the Scipy libraries (www.scipy.org). The authors would like to thank Lijing Shao for his valuable comments that helped to improve the manuscript.
References
 Abbott, B. P., Abbott, R., Abbott, T. D., et al. 2017, Phys. Rev. Lett., 119, 161101 [NASA ADS] [CrossRef] [PubMed] [Google Scholar]
 Abbott, B. P., Abbott, R., Abbott, T. D., et al. 2018, Phys. Rev. Lett., 121, 161101 [NASA ADS] [CrossRef] [PubMed] [Google Scholar]
 Abbott, B., Abbott, R., Abbott, T., et al. 2019, Phys. Rev. Lett., 123, 011102 [CrossRef] [Google Scholar]
 Ahnert, K., Mulansky, M., Simos, T. E., et al. 2011, Numerical Analysis and Applied Mathematics ICNAAM 2011: International on Numerical Analysis and Applied Mathematics, Halkidiki, (Greece), 1586 [Google Scholar]
 Allison, R., & Dunkley, J. 2014, MNRAS, 437, 3918 [NASA ADS] [CrossRef] [Google Scholar]
 Alsing, J., Berti, E., Will, C. M., & Zaglauer, H. 2012, Phys. Rev. D, 85, 064041 [CrossRef] [Google Scholar]
 Antoniadis, J., Freire, P. C. C., Wex, N., et al. 2013, Science, 340, 448 [Google Scholar]
 Antoniadis, J., Tauris, T. M., Ozel, F., et al. 2016, ArXiv eprints [arXiv:1605.01665] [Google Scholar]
 Archibald, A. M., Gusinskaia, N. V., Hessels, J. W. T., et al. 2018, Nature, 559, 73 [NASA ADS] [CrossRef] [Google Scholar]
 Arzoumanian, Z., Brazier, A., BurkeSpolaor, S., et al. 2018, ApJS, 235, 37 [Google Scholar]
 Bertotti, B., Iess, L., & Tortora, P. 2003, Nature, 425, 374 [NASA ADS] [CrossRef] [PubMed] [Google Scholar]
 Bhat, N. D. R., Bailes, M., & Verbiest, J. P. W. 2008, Phys. Rev. D, 77, 124017 [NASA ADS] [CrossRef] [Google Scholar]
 Boyles, J., Lynch, R. S., Ransom, S. M., et al. 2013, ApJ, 763, 80 [NASA ADS] [CrossRef] [Google Scholar]
 Brans, C., & Dicke, R. H. 1961, Phys. Rev., 124, 925 [NASA ADS] [CrossRef] [MathSciNet] [Google Scholar]
 Caballero, R. N., Guo, Y. J., Lee, K. J., et al. 2018, MNRAS, 481, 5501 [CrossRef] [Google Scholar]
 Capano, C. D., Tews, I., Brown, S. M., et al. 2020, Nat. Astron, in press, [arXiv:1908.10352] [Google Scholar]
 Carson, Z., Seymour, B. C., & Yagi, K. 2020, Classical and Quantum Gravity, 37, 065008 [CrossRef] [Google Scholar]
 Champion, D. J., Hobbs, G. B., Manchester, R. N., et al. 2010, ApJ, 720, L201 [CrossRef] [Google Scholar]
 Cognard, I., Freire, P. C. C., Guillemot, L., et al. 2017, ApJ, 844, 128 [NASA ADS] [CrossRef] [Google Scholar]
 Cordes, J. M., & Lazio, T. J. W. 2002, ArXiv eprints [arXiv:astroph/0207156] [Google Scholar]
 Cromartie, H. T., Fonseca, E., Ransom, S. M., et al. 2019, Nat. Astron., 4, 72 [Google Scholar]
 Damour, T. 1987, The Problem of Motion in Newtonian and Einsteinian Gravity (Cambridge: Cambridge University Press), 128 [Google Scholar]
 Damour, T. 2009, in Physics of Relativistic Objects in Compact Binaries: From Birth to Coalescence, eds. M. Colpi, P. Casella, V. Gorini, U. Moschella, & A. Possenti, Astrophys. Space Sci. Lib., 359, 1 [CrossRef] [Google Scholar]
 Damour, T. 2012, Classical Quantum Gravity, 29, 184001 [CrossRef] [Google Scholar]
 Damour, T., & Deruelle, N. 1985, Ann. Inst. Henri Poincaré Phys. Théor., 43, 107 [Google Scholar]
 Damour, T., & Deruelle, N. 1986, Annales de l’institut Henri Poincaré (A) Physique théorique, 44, 263 [Google Scholar]
 Damour, T., & EspositoFarèse, G. 1992, Classical Quantum Gravity, 9, 2093 [NASA ADS] [CrossRef] [MathSciNet] [Google Scholar]
 Damour, T., & EspositoFarèse, G. 1993, Phys. Rev. Lett., 70, 2220 [NASA ADS] [CrossRef] [Google Scholar]
 Damour, T., & EspositoFarèse, G. 1996, Phys. Rev. D, 54, 1474 [NASA ADS] [CrossRef] [Google Scholar]
 Damour, T., & Schäfer, G. 1991, Phys. Rev. Lett., 66, 2549 [CrossRef] [PubMed] [Google Scholar]
 Damour, T., & Taylor, J. H. 1992, Phys. Rev. D, 45, 1840 [NASA ADS] [CrossRef] [Google Scholar]
 De Felice, A., & Tsujikawa, S. 2010, Liv. Rev. Relativ., 13, 3 [NASA ADS] [CrossRef] [Google Scholar]
 Deruelle, N. 2011, Gen. Relativ. Gravit., 43, 3337 [CrossRef] [Google Scholar]
 Desvignes, G., Barott, W. C., Cognard, I., Lespagnol, P., & Theureau, G. 2011, in American Institute of Physics Conference Series, eds. M. Burgay, N. D’Amico, P. Esposito, A. Pellizzoni, & A. Possenti, 1357, 349 [Google Scholar]
 Desvignes, G., Caballero, R. N., Lentati, L., et al. 2016, MNRAS, 458, 3341 [NASA ADS] [CrossRef] [Google Scholar]
 Di Casola, E., Liberati, S., & Sonego, S. 2015, Am. J. Phys., 83, 39 [CrossRef] [Google Scholar]
 Dunkley, J., Bucher, M., Ferreira, P. G., Moodley, K., & Skordis, C. 2005, MNRAS, 356, 925 [NASA ADS] [CrossRef] [Google Scholar]
 DuPlain, R., Ransom, S., Demorest, P., et al. 2008, in Launching GUPPI: The Green Bank Ultimate Pulsar Processing Instrument (SPIE), SPIE Conf. Ser., 7019, 70191D [Google Scholar]
 Edwards, R. T., Hobbs, G. B., & Manchester, R. N. 2006, MNRAS, 372, 1549 [NASA ADS] [CrossRef] [Google Scholar]
 Einstein, A. 1915, Sitzungsberichte der Königlich Preußischen Akademie der Wissenschaften (Berlin), 844 [Google Scholar]
 Fierz, M. 1956, Helv. Phys. Acta, 29, 128 [Google Scholar]
 Folkner, W. M., Williams, J. G., Boggs, D. H., Park, R. S., & Kuchynka, P. 2014, IPN Progress Report, 42 [Google Scholar]
 ForemanMackey, D., Hogg, D. W., Lang, D., & Goodman, J. 2013, PASP, 125, 306 [CrossRef] [Google Scholar]
 Freire, P. C. C., Kramer, M., & Wex, N. 2012a, Classical Quantum Gravity, 29, 184007 [CrossRef] [Google Scholar]
 Freire, P. C. C., Wex, N., EspositoFarèse, G., et al. 2012b, MNRAS, 423, 3328 [NASA ADS] [CrossRef] [Google Scholar]
 Fujii, Y., & Maeda, K. 2007, in The ScalarTensor Theory of Gravitation (Cambridge: Cambridge University Press), Camb. Monogr. Math. Phys. [Google Scholar]
 Genova, A., Mazarico, E., Goossens, S., et al. 2018, Nat. Commun., 9, 289 [CrossRef] [Google Scholar]
 Gonzalez, M. E., Stairs, I. H., Ferdman, R. D., et al. 2011, ApJ, 743, 102 [NASA ADS] [CrossRef] [Google Scholar]
 Goodman, J., & Weare, J. 2010, Commun. Appl. Math. Comput. Sci., 5, 65 [Google Scholar]
 Hankins, T. H., & Rickett, B. J. 1975, Methods in Computational Physics. Volume 14 – Radio Astronomy (New York: Academic Press, Inc.), 14, 55 [Google Scholar]
 Hobbs, G. B., Edwards, R. T., & Manchester, R. N. 2006, MNRAS, 369, 655 [NASA ADS] [CrossRef] [Google Scholar]
 Hobbs, G., Coles, W., Manchester, R. N., et al. 2012, MNRAS, 427, 2780 [NASA ADS] [CrossRef] [Google Scholar]
 Hobbs, G., Guo, L., Caballero, R. N., et al. 2020, MNRAS, 491, 5951 [Google Scholar]
 Hofmann, F., & Müller, J. 2018, Classical Quantum Gravity, 35, 035015 [CrossRef] [Google Scholar]
 Horndeski, G. W. 1974, Int. J. Theor. Phys., 10, 363 [CrossRef] [MathSciNet] [Google Scholar]
 Hotan, A. W., van Straten, W., & Manchester, R. N. 2004, PASA, 21, 302 [NASA ADS] [CrossRef] [Google Scholar]
 Jordan, P. 1955, Schwerkraft und Weltall, Die Wissenschaft (Vieweg) [Google Scholar]
 Kaplan, D. L., van Kerkwijk, M. H., Koester, D., et al. 2014, ApJ, 783, L23 [NASA ADS] [CrossRef] [Google Scholar]
 Keith, M. J., Coles, W., Shannon, R. M., et al. 2013, MNRAS, 429, 2161 [NASA ADS] [CrossRef] [Google Scholar]
 Kopeikin, S. M. 1996, ApJ, 467, L93 [NASA ADS] [CrossRef] [Google Scholar]
 Kramer, M., & Wex, N. 2009, Classical Quantum Gravity, 26, 073001 [NASA ADS] [CrossRef] [Google Scholar]
 Lange, C., Camilo, F., Wex, N., et al. 2001, MNRAS, 326, 274 [NASA ADS] [CrossRef] [Google Scholar]
 Lattimer, J. M., & Prakash, M. 2001, ApJ, 550, 426 [NASA ADS] [CrossRef] [Google Scholar]
 Lazaridis, K., Wex, N., Jessner, A., et al. 2009, MNRAS, 400, 805 [NASA ADS] [CrossRef] [Google Scholar]
 Lindegren, L., Hernández, J., Bombrun, A., et al. 2018, A&A, 616, A2 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Lomb, N. R. 1976, Astrophys. Space Sci., 39, 447 [Google Scholar]
 Lynch, R. S., Boyles, J., Ransom, S. M., et al. 2013, ApJ, 763, 81 [NASA ADS] [CrossRef] [Google Scholar]
 Lyne, A., Hobbs, G., Kramer, M., Stairs, I., & Stappers, B. 2010, Science, 329, 408 [NASA ADS] [CrossRef] [PubMed] [Google Scholar]
 McMillan, P. J. 2017, MNRAS, 465, 76 [Google Scholar]
 Melatos, A., & Link, B. 2014, MNRAS, 437, 21 [NASA ADS] [CrossRef] [Google Scholar]
 Mendes, R. F. P., & Ortiz, N. 2016, Phys. Rev. D, 93, 124035 [CrossRef] [Google Scholar]
 Miller, M. C., Lamb, F. K., Dittmann, A. J., et al. 2019, ApJ, 887, L24 [Google Scholar]
 Newton, I. 1687, Philosophiae Naturalis Principia Mathematica. Auctore Js. Newton (J. Societatis Regiae ac Typis J. Streater) [CrossRef] [Google Scholar]
 Nordtvedt, K. 1968, Phys. Rev., 170, 1186 [CrossRef] [Google Scholar]
 Nordtvedt, K. 1985, ApJ, 297, 390 [CrossRef] [Google Scholar]
 Press, W. H. 1996, Numerical Recipes in FORTRAN 2. 2. (Cambridge: Univ. Press), oCLC: 613812361 [Google Scholar]
 Ransom, S. M., Stairs, I. H., Archibald, A. M., et al. 2014, Nature, 505, 520 [NASA ADS] [CrossRef] [PubMed] [Google Scholar]
 Renn, J. 2007, The Genesis of General Relativity: Sources and Interpretations, Boston Studies in the Philosophy and History of Science (Netherlands: Springer) [Google Scholar]
 Rezzolla, L., Most, E. R., & Weih, L. R. 2018, ApJ, 852, L25 [NASA ADS] [CrossRef] [Google Scholar]
 Scargle, J. D. 1982, ApJ, 263, 835 [NASA ADS] [CrossRef] [Google Scholar]
 Seymour, B. C., & Yagi, K. 2019, ArXiv eprints [arXiv:1908.03353] [Google Scholar]
 Shannon, R. M., & Cordes, J. M. 2010, ApJ, 725, 1607 [NASA ADS] [CrossRef] [Google Scholar]
 Shannon, R. M., Cordes, J. M., Metcalfe, T. S., et al. 2013, ApJ, 766, 5 [NASA ADS] [CrossRef] [Google Scholar]
 Shao, L. 2016, Phys. Rev. D, 93, 084023 [CrossRef] [Google Scholar]
 Shao, L., & Wex, N. 2012, Classical Quantum Gravity, 29, 21.5018 [CrossRef] [Google Scholar]
 Shao, L., & Wex, N. 2013, Classical Quantum Gravity, 30, 165020 [CrossRef] [Google Scholar]
 Shao, L., Caballero, R. N., Kramer, M., et al. 2013, Classical Quantum Gravity, 30, 165019 [CrossRef] [Google Scholar]
 Shao, L., Sennett, N., Buonanno, A., Kramer, M., & Wex, N. 2017, Phys. Rev. X, 7, 041025 [Google Scholar]
 Shao, L., Wex, N., & Kramer, M. 2018, Phys. Rev. Lett., 120, 241104 [CrossRef] [Google Scholar]
 Shibata, M., Taniguchi, K., Okawa, H., & Buonanno, A. 2014, Phys. Rev. D, 89, 084005 [CrossRef] [Google Scholar]
 Shibata, M., Zhou, E., Kiuchi, K., & Fujibayashi, S. 2019, Phys. Rev. D, 100, 023015 [CrossRef] [Google Scholar]
 Shklovskii, I. S. 1970, Sov. Astron., 13, 562 [NASA ADS] [Google Scholar]
 Soffel, M. M. 1989, Relativity in Astrometry, Celestial Mechanics, and Geodesy (Berlin, Germany: Springer) [CrossRef] [Google Scholar]
 Stairs, I. H., Faulkner, A. J., Lyne, A. G., et al. 2005, ApJ, 632, 1060 [NASA ADS] [CrossRef] [Google Scholar]
 Stoer, J., & Bulirsch, R. 2011, Introduction to Numerical Analysis (New York, London: Springer), oCLC: 1063482400 [Google Scholar]
 Tauris, T. M., & van den Heuvel, E. P. J. 2014, ApJ, 781, L13 [NASA ADS] [CrossRef] [Google Scholar]
 Thorne, K. S. 1980, Rev. Mod. Phys., 52, 299 [NASA ADS] [CrossRef] [Google Scholar]
 Touboul, P., Métris, G., Rodrigues, M., et al. 2019, Classical Quantum Gravity, 36, 225006 [NASA ADS] [CrossRef] [Google Scholar]
 Tremblay, P.E., Gianninas, A., Kilic, M., et al. 2015, ApJ, 809, 148 [NASA ADS] [CrossRef] [Google Scholar]
 Voisin, G. 2017, Theses, Université de recherche Paris Sciences et Lettres, https://hal.archivesouvertes.fr/tel01677325 [Google Scholar]
 Wex, N. 2014, ArXiv eprints [arXiv:1402.5594] [Google Scholar]
 Will, C. M. 1993, Theory and Experiment in Gravitational Physics (Cambridge, England: Cambridge University Press) [CrossRef] [Google Scholar]
 Will, C. M. 2014a, Liv. Rev. Relativ., 17, 4 [Google Scholar]
 Will, C. M. 2014b, Phys. Rev. D, 89, 044043 [CrossRef] [Google Scholar]
 Will, C. M. 2018a, Theory and Experiment in Gravitational Physics, Second Edition (Cambridge, England: Cambridge University Press) [CrossRef] [Google Scholar]
 Will, C. M. 2018b, Nature, 559, 40 [CrossRef] [Google Scholar]
 Yao, J. M., Manchester, R. N., & Wang, N. 2017, ApJ, 835, 29 [NASA ADS] [CrossRef] [Google Scholar]
 Zhu, W. W., Desvignes, G., Wex, N., et al. 2019, MNRAS, 482, 3249 [CrossRef] [Google Scholar]
Appendix A: Strongfield equations of motion
In our timing model, the motion of the three bodies follows the equations of motion derived from the postGalieaninvariant Nbody Lagrangian of the modified EinsteinInfeldHoffmann (mEIH) formalism (Will 1993; Damour & Taylor 1992). The mEIH equations of motion describe the first postNewtonian dynamics of a Nbody system which also contains strongly selfgravitating masses, under the assumption that the gravitational interaction is Poincaré invariant. Furthermore, it assumes that there are no ‘asymmetric’ terms in the Lagrangian, which are anyway absent in many well motivated theories of gravity (see the discussion in Nordtvedt 1985; Damour & Taylor 1992). The mEIH formalsim is a generalisation of the parametrised postNewtonian (PPN) equations of motion for fully conservative theories with ξ = 0, in order to include effects related to the strong internal fields of strongly selfgravitating objects, like neutron stars. The mEIH Lagrangian can be written as (cf. Damour & EspositoFarèse 1992)
where m_{a} are the inertial masses with coordinate positions x_{a} and coordinate velocities v_{a}, r_{ab} = ∥x_{a} − x_{b}∥, v_{a} = ∥v_{a}∥, and n_{ab} = (x_{b} − x_{a})/r_{ab}. The quantities , and G_{ab} are the effective strongfield interaction constants. The unbarred quantities are the strongfield generalisation of the PPN parameters γ_{PPN} and β_{PPN} (Eddington parameters). The strongfield parameters satisfy the symmetries G_{ab} = G_{ba} (a ≠ b), (a ≠ b), and (a ≠ b, a ≠ c). The bodydependent effective strongfield interaction constants depend on the details of the underlying gravity theory as well as the structure of the individual bodies. Hence, in the most general case of a threebody system one has three different effective gravitational constants G_{ab}, three different , and nine different . In GR, due to the fulfilment of SEP and the corresponding effacement of the internal structure (see e.g. Damour 1987), one has G_{ab} = G_{N} and .
One can then use the EulerLagrange equations (Will 1993) to derive the equations of motion for each body:
In the weakfield limit one can check that this equation does give the PPN equation of motion (e.g. Soffel 1989; Will 1993).
Conserved quantities are key elements to check the numerical implementation and accuracy of the equations of motion. We have used the Hamiltonian (conservation of energy), and the momentum and position of the centre of mass of the system. The last two are also necessary to derive the initial conditions of the system.
The Hamiltonian corresponding to Eq. (A.1) is derived using the Legendre transform ,
The momentum of the centre of mass is given by the same expression as in GR only with the replacement G → G_{ab}, and
The centreofmass position X satisfies (see e.g. Will 2014b),
Supplementary Figures
A highresolution version of Fig. 2 (Access here)
Additional figure: chain plot (Access here)
All Tables
All Figures
Fig. 1. Top: template pulse profile of PSR J0337+1715 used for timing. 450 h of observations conducted with the Nançay Radio Telescope were integrated over the frequency range 1230–1742 MHz. Bottom: pulse profile obtained on October 4, 2014. When profiles are ToA uncertainty ranked, this one is at the 10th percentile of the lowest uncertainties, typical of a good observation. 

In the text 
Fig. 2. Corner plot of the correlations between the fitted parameters of Table 1 and their marginalised distribution (diagonal histograms) sampled by our MCMC. A highresolution version is available as online supplementary material. 

In the text 
Fig. 3. Sketch of the orbits of PSR J0337+1715 (not to scale). Note that the orbits are nearly circular and that the ellipsoidal shape of the orbits on the lefthand sketch arises purely from projection. The osculating orbits of the system can be parametrised by the Keplerian orbital elements of the pulsar within the inner binary and of the inner binary within the outer binary (see text). In particular, a_{p} is the semimajor axis of the pulsar orbit within the inner binary, a_{b} is the semimajor axis of the inner binary within the outer binary and e_{O} its eccentricity. Are also shown the longitude of periastron the outer binary as well as the inner and outer inclinations with respect to the plane of the sky, i_{I} and i_{O} respectively. Note that for simplicity the sketch neglects the difference between the directions of the inner and outer lines of ascending nodes, δΩ, as it is in practice very small. 

In the text 
Fig. 4. LombScargle periodogram of the timing residuals for the solution of Table 1. The periodogram is sampled at f_{m}/5 where f_{m} ≃ 2268 days^{−1} is the inverse of the full time span. From left to right vertical lines show the frequencies of the rednoise component (f_{RN} ≃ 1650 day^{−1}, black), the Earth orbital period (f_{E}, green), the outerorbit orbital period (f_{O}, orange), the innerbinary orbital period (f_{I}, red) and the SEP violation signature (2f_{I} − f_{O}, purple). 

In the text 
Fig. 5. Relative error in energy conservation during the numerical integration of the equations of motion over the 2268 day span of our data. The envelope of the signal oscillates at the outer binary frequency while a zoom would show a fast oscillation at the inner binary orbital frequency. 

In the text 
Fig. 6. Three components of the centreofmass position variation during the integration of the equations of motion over the 2268 day span of our data. A quadratic component has been fitted out (see text and legend) which leaves only the oscillatory components. The main visible oscillation is at the frequency of the outer binary while a zoom would show a fast oscillation at the inner binary orbital frequency. 

In the text 
Fig. 7. Lefthand side: marginalised posterior probability distribution of the SEP violation parameter Δ sampled by MCMC and normal law with the same mean and standard deviation. The upper axis gives the mean value and the boundaries of the 95% confidence region. Righthand side: distribution of distance to GR derived from the lefthandside distribution. 

In the text 
Fig. 8. Residuals of the times of arrival using the mean parameters reported in Table 1. 

In the text 
Fig. 9. Residuals of the times of arrival corresponding to the parameters of Table 1 plotted versus the innerbinary orbital phase (top), the outerbinary orbital phase (middle) and the Earth orbital phase (bottom). The red vertical bars of the first two plots show the phase where the pulsar (resp. the inner binary) is closer to the Solar System. The red vertical bar on the bottom plot, shows the Earth orbital phase where the line of sight to the pulsar passes closest to the Sun. We have removed every ToA when the pulsar is less than 5 deg from the Sun, which explains the gap around that particular phase. 

In the text 
Fig. 10. Radiusmass diagram for the 12 EOSs used in this paper to accomplish a good coverage of the range from soft to stiff EOSs, while still being in agreement with the tidal deformability test in the GW170817 LIGO/Virgo event (Abbott et al. 2017, 2018) (black curves). The blue dashed line indicates the mass of PSR J0337+1715, and the red dotted line corresponds to the most massive neutron star used in our combined test: PSR J0348+0432 (Antoniadis et al. 2013). The black curves correspond to the following EOSs (from left to right in their intersection with the red dotted line): WFF1, SLy4, WFF2, AP4, BSk20, ENG, SLy9, AP3, BSk25, BSk21, MPA1, BSk22 (see Lattimer & Prakash 2001 and https://compose.obspm.fr). Our stiffest EOS, BSk22, is also in agreement with the (more model dependent) constraint of M_{max} ≲ 2.3 M_{⊙} by Rezzolla et al. (2018), Shibata et al. (2019). We have included EOS H4 (grey curve), which is disfavoured by GW170817, and therefore has not been used in Figs. 11 and 12. All these EOSs also agree with the latest constraints from NICER (Miller et al. 2019). 

In the text 
Fig. 11. PSR J0337+1715 constraints on the α_{0}–β_{0} space of scalartensor theories (Damour & EspositoFarèse 1993, 1996) from Eq. (17): the area under the curve is still allowed by experiments. Two different neutronstar equations of state are used: a soft one, WFF1 (red curve), and a stiff one, BSk22 (blue curve). The two solid lines use the SEP constraint of this paper. The grey curves show the 2σlimits from Solarsystem experiment: Cassini (solid), LLR (dashed), MESSENGER (dotted). JFBD gravity corresponds to β_{0} = 0 (thin vertical line). 

In the text 
Fig. 12. Combined EOSagnostic pulsar constraints on the α_{0}–β_{0} space of scalartensor theories (Damour & EspositoFarèse 1993, 1996) from Eq. (17): the area under the curve is still allowed by experiments. The blue curve is the result of a combination of dipolar radiation tests from six pulsarWD binary systems (see text for details). The red curve indicates the improvement when the constraints from this paper are added to the dipolar radiation tests. The grey curves show the 2σlimits from Solarsystem experiment: Cassini (solid), LLR (dashed), MESSENGER (dotted). JFBD gravity corresponds to β_{0} = 0 (thin vertical line). 

In the text 
Fig. 13. Comparison of the PSR J0337+1715 constraints of this paper with expected constraints from future gravitational wave observatories of a single binary neutronstar merger: Cosmic Explorer (CE; blue dotted curve) and Einstein Telescope (ET; blue dashed curve). CE and ET curves are taken from Fig. 9 in Shao et al. (2017). Like for the CE and ET curves, the PSR J0337+1715 curve is also based on EOS AP4. Solar System constraints (grey) are as in Fig. 11. JFBD gravity corresponds to β_{0} = 0 (thin vertical line). 

In the text 
Current usage metrics show cumulative count of Article Views (fulltext article views including HTML views, PDF and ePub downloads, according to the available data) and Abstracts Views on Vision4Press platform.
Data correspond to usage on the plateform after 2015. The current usage metrics is available 4896 hours after online publication and is updated daily on week days.
Initial download of the metrics may take a while.