Press Release
Open Access
Issue
A&A
Volume 678, October 2023
Article Number A49
Number of page(s) 20
Section Numerical methods and codes
DOI https://doi.org/10.1051/0004-6361/202346842
Published online 03 October 2023

© The Authors 2023

Licence Creative CommonsOpen Access article, published by EDP Sciences, under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

This article is published in open access under the Subscribe to Open model. Subscribe to A&A to support open access publication.

1 Introduction

Pulsar timing allows astronomers to track every single rotation of a pulsar. It involves comparing the measured pulse times of arrival (TOAs) with a timing model that contains our current best knowledge of the pulsar's timing properties. Millisecond pulsars (MSPs), which are rapidly rotating neutron stars distributed throughout the Galaxy, pulse with sufficient regularity to function as excellent clocks. The timing residuals, which are differences between the predicted and estimated arrival times, contain imprints of various astrophysical processes, including the subtle signature induced by the stochastic gravitational wave background (GWB).

Unlike transient gravitational waves, a GWB will be detected from all directions in the sky. However, a single pulsar is not sufficient to disentangle the contribution of the GWB, as it is affected by many sources of noise with spectral properties similar to those of the GWB. To overcome this, cross-correlations between multiple pulsars are used, as the GWB signal is correlated across multiple sources while the noise (typically intrinsic to the pulsar) is not. Thus, pulsar timing arrays (PTAs; Foster & Backer 1990) extend the concept of pulsar timing to an ensemble of MSPs. However, a GWB is not the only correlated signal amongst an array of pulsars. Imperfections in a common reference clock can cause a monopolar signal (Hobbs et al. 2020), while unmodelled errors related to Solar System ephemerides can induce a dipolar signal (Caballero et al. 2018). The GWB is expected to induce a quadrupolar signal, as described by the Hellings–Downs (H–D) curve (Hellings & Downs 1983). The latter is considered to be the definitive evidence for the detection of a GWB. It is also important to note that measurable deviations from the H–D curve provide critical insight into the origins of gravitational waves (Taylor et al. 2015).

Given the characteristics of the GWB, PTAs need to accumulate decades of data to become sensitive to the signal. Three PTA collaborations were formed in the early 2000s, the European Pulsar Timing Array (EPTA; Desvignes et al. 2016), the North American Nanohertz Observatory for Gravitational Waves (NANOGrav; Demorest et al. 2013), and the Parkes Pulsar Timing Array (PPTA; Manchester et al. 2013). Since the sensitivity of the GWB relies, among other things, on the number of pulsars and the length of the dataset, the three PTA collaborations came together to form the International Pulsar Timing Array (IPTA). The Indian Pulsar Timing Array (InPTA; Tarafdar et al. 2022) has recently also joined the IPTA. The three PTAs, along with the IPTA, have recently detected a common red signal (CRS) that shares many characteristics with the expected GWB (Arzoumanian et al. 2020; Goncharov et al. 2021; Chen et al. 2021; Antoniadis et al. 2022). Although these measurements could be the first sign of the GWB emerging from noise, there was no definitive detection of the H–D correlation. The second data release (DR2) of the EPTA with longer timing baselines and more pulsars was created to better constrain the CRS and potentially detect the H–D curve.

Using ~24 yr of data and six of the best-timed MSPs, EPTA's detection of the CRS had an amplitude of 2.950.72+0.89×1015$2.95_{ - 0.72}^{ + 0.89} \times {10^{ - 15}}$ for a fixed spectral index (γ) of 13/3 (cf. Eq. (2)) as expected for a GWB originating from a population of supermassive black hole binaries (Chen et al. 2021). Further investigations of the spectral properties of individual pulsars by Chalumeau et al. (2022) showed a clear need for pulsar-specific noise models and frequency binning. Furthermore, Goncharov et al. (2021, 2022) stated that it is likely that at least part of the CRS can arise from a superposition of independent RN processes in individual MSPs. With simulated datasets, Zic et al. (2022) demonstrated how spurious evidence for CRS arises based on the choice of priors. Thus, robust modelling of the noise within each MSP dataset is critical given the possibility that the CRS may be a subset of the GWB signal.

In addition to radio PTAs, an independent limit on the GWB was placed using gamma ray observations of MSPs (Fermi-LAT Collaboration 2022). This not only provides an independent handle on the properties of the GWB but also, since gamma rays are immune to the effects of the ionised interstellar medium (IISM), it offers the possibility of testing the efficacy of current chromatic models through joint radio and gamma-ray analysis. Therefore, it is natural for such analyses to be integrated into future IPTA data releases.

In this paper, we focus on the long-term timing noise budget of pulsars in the second data release of the EPTA (hereafter, DR2full). This contains ~25yr of data for 25 of the most precisely timed MSPs in the EPTA DR1. This new dataset will add significant sensitivity at the lowest Fourier frequencies of the GWB, owing to its long time span, and increased sensitivity at the higher Fourier frequencies due to improved observing cadence and an effective doubling of the observing bandwidth. We also combine low radio frequency pulsar timing data from the InPTA for 10 MSPs that are common to the EPTA DR2 (hereafter, DR2full+), to investigate the effects of the IISM and their impact on the total noise budget. For details on the EPTA and InPTA datasets, observing systems, and pulsar timing parameters, refer to EPTA Collaboration (2023).

The paper is structured as follows. In Sect. 2, we provide an overview of the noise budget, the descriptions of techniques used to estimate optimal Fourier frequency bins, and the selection of favoured noise models. In Sect. 3, we present the estimated noise budget for the EPTA and InPTA datasets and highlight interesting cases. In Sect. 4, we interpret our results, compare them with previous analyses, and discuss implications for GWB searches. In Sect. 5, we discuss the efficacy of our noise models through additional tests, simulations, and cross-validation with independent software packages. In Sect. 6, we present conclusions and provide potential future trajectories to explore, especially by focussing on IPTA datasets. The customised noise models reported in this paper are applied directly to the EPTA's search for spatially correlated GWB signals (EPTA Collaboration & InPTA Collaboration 2023).

2 Noise modelling techniques and noise budget

The assumption that TOAs are only composed of a deterministic timing model plus time-uncorrelated instrumental noise is generally untrue in pulsar timing datasets. Stochastic processes, ranging from intrinsic spin noise in the neutron star to chromatic variations resulting from the IISM, contribute additional time-correlated noise in the observed TOAs (Cordes & Shannon 2010). Unless this noise is accounted for, it can bias timing model parameter estimation and reduce the sensitivity to the GWB. The typical method to account for this noise is to solve for both the pulsar timing model and a stochastic Gaussian-process noise model simultaneously (Lentati et al. 2013; van Haasteren & Vallisneri 2014). This is achieved by the use of Bayesian parameter estimation routines to find the posterior probability distribution of the noise model hyperparameters and the pulsar timing model parameters. In practice, significant computation time can be saved by analytically marginalising over most or all of the pulsar timing model parameters. This is achieved by reusing the linearised approximation to the timing model adopted in traditional pulsar timing software such as tempo2, which significantly reduces the parameter space to be sampled in the Bayesian analysis. The preparation of the parameters of the pulsar timing model, including the choice of parameters included in the fit, is discussed in EPTA Collaboration (2023).

2.1 Description of the noise models

For the work discussed in this paper, we focus primarily on the standard noise models used widely within the IPTA (Lentati et al. 2013, 2014 and references therein). In brief, we model the excess noise in the pulsar as a combination of Gaussian processes with two main forms, either correlated in time (i.e., 'red' noise) or uncorrelated in time (that is, 'white' noise). Red noise usually dominates on timescales of years to decades and incorporates signals with power spectral densities similar to the GWB. These processes include both chromatic noise, which depends on observing frequency, and achromatic noise, which is independent of observing frequency. We generally consider one achromatic red timing noise and two chromatic processes, variation in the dispersion measure (DM) to the pulsar, and variation in the scattering timescale for the pulsar. White noise reflects un-modelled instrumental errors and intrinsic pulse jitter (Liu et al. 2011, 2012; Parthasarathy et al. 2021) in the arrival times.

2.1.1 Time-correlated noise

Pulsar timing noise represents stochastic irregularities in pulsar rotation. Persistent temporally correlated noise that manifests equally across the radio frequency band of the instrument is referred to as spin noise, an achromatic noise process. It is typically modelled as a wide-sense stationary stochastic signal, that is, a process with zero mean and a covariance function that depends only on the absolute time lag between two points. Timing noise is present across the pulsar population and its amplitude seems to vary as a function of the pulsar spin-down rate (Shannon & Cordes 2010). Although the origins of spin noise are not unanimously agreed upon, it is typically well-modelled with a power-law spectrum. Power-law behaviour is expected if spin noise originates due to interactions between the neutron star crust and its superfluid core (Jones 1990), although observations of several canonical pulsars (that is, non-MSPs) show a quasi-periodic behaviour in timing noise properties (Lyne et al. 2010), or spectral turnovers (Parthasarathy et al. 2019) that may warrant additional terms beyond a single power law. However, the relatively small amplitudes of spin noise seen in MSPs is found to be relatively well-modelled by a power-law process (Goncharov et al. 2020).

The IISM introduces a wide range of chromatic noise processes into the TOAs that are dependent on the observing frequency. These include dispersive delays, scintillation, and pulse profile broadening due to multipath propagation (Cordes et al. 2016; Shannon & Cordes 2017; Donner et al. 2020). Dispersive delays can become measurable on timescales of days to weeks and scale with the observing radio frequency as v−2. DM variations are one of the biggest sources of unmodelled error, and although the wider bandwidths and higher cadence observations of recent PTA datasets significantly improves the ability to model the DM variations and separate them from achromatic signals, the inhomogeneity of the datasets means that it can be difficult to separate chromatic and achromatic noise on the longest timescales. The best approach to account for DM variability depends on the underlying dataset. For EPTA, we currently favour using a stochastic power-law model for the DM variations, which assumes that there is a stationary smooth process that determines the DM. This allows for observations separated in both time and observing frequency to be used for constraining the DM. The theoretical expectation is that DM variations are caused by Kolmogorov turbulence in the IISM, and hence will have a power law index of γDM = 8/3 (Foster & Cordes 1990). Alternative models (such as the use of DMX parameters; Alam et al. 2021), which typically make independent measurements of DM over discrete time intervals, avoid the assumption of smooth variation and stationary process, but require near-simultaneous observations at a wide range of frequencies to be effective. Additionally, there is a variable contribution to the DM from the solar wind. A study of the time-variable solar wind is part of an ongoing EPTA project (Niţu et al., in prep.). For this work, we use a fixed value of 7.9 cm−3 (Madison et al. 2019) for all pulsars, except for PSRJ1022+1001 where we fit a constant solar wind amplitude1 (see also EPTA Collaboration 2023).

After DM variations, the second most prominent effect of the interstellar medium at typical observing frequencies is the variation of the pulsar's scattering timescale. This scales as v−4 and therefore can be separated from DM given enough coverage in observing frequency. Variation in scattering timescale is modelled as a power law in the few cases where it is important for the pulsars in our current dataset.

For each of our time-correlated noise processes, we model the noise of a process with chromatic index a for a TOA at time t and observing frequency v (vref is the reference frequency set at 1400 MHz) with a Fourier basis of Ncoef coefficients as,

y(t)=j=1NcoefYj(ajcos(jωt)+bjsin(jωt))(vvref)α,$y\left( t \right) = \sum\limits_{j = 1}^{{N_{{\rm{coef}}}}} {{Y_j}\left( {{a_j}\cos \left( {j\omega t} \right) + {b_j}\sin \left( {j\omega t} \right)} \right){{\left( {{v \over {{v_{{\rm{ref}}}}}}} \right)}^{ - \alpha }}} ,$(1)

where ω = 2π/Tspan for Tspan typically chosen to be the total observing time span, aj and bj are fit parameters with a Gaussian prior 𝒩(0, 1) and Yj determined by the noise model hyperparameters A and γ. Specifically, we define Yj as

Yj=A2Kscalesyr3Tspan(fjfyr)γ,${Y_j} = \sqrt {{{{A^2}} \over {{K_{{\rm{scale}}}}}}{{{{\rm{s}}_{{\rm{y}}{{\rm{r}}^3}}}} \over {{T_{{\rm{span}}}}}}{{\left( {{{{f_j}} \over {{f_{{\rm{yr}}}}}}} \right)}^{ - \gamma }}} ,$(2)

where fyr = 1 yr−1, syr = 31557600 s yr−1 converts years to seconds with Tspan in seconds, and Kscale is a scale that can adjust the units of A appropriately for the given noise process. In practise, the parameters aj and bj are analytically marginalised following the method described in Lentati et al. (2014) and implemented in temponest and enterprise.

The choice of constants for each process, red noise, DM noise and scattering variations, and the hyperparameter priors are given in Table 1. Achromatic red noise is scaled so that Ared is the equivalent GWB amplitude in yr3/2, the DM variations are given in temponest units of cm−3pc yr3/2s−1, and the scattering variations are given as the equivalent amplitude of the red noise (in yr3/2) at 1400 MHz. These units are chosen to match previous publications and can be used directly in tempo2.

Table 1

Noise model constants and hyperparameter priors.

2.1.2 White noise

Temporally uncorrelated white noise in pulsar timing residuals needs to be modelled to effectively estimate the precision of pulsar timing parameters. For this work, we include the widely adopted parameters2 EFAC and EQUAD where the diagonals of the noise covariance matrix are scaled by

σscaled2=EFAC2×σoriginal2+EQUAD2,$\sigma _{{\rm{scaled}}}^2 = {\rm{EFA}}{{\rm{C}}^2} \times \sigma _{{\rm{original}}}^2 + {\rm{EQUA}}{{\rm{D}}^2},$(3)

where σoriginal is the measurement uncertainty of a TOA due to template-fitting errors. These parameters are applied for every observing backend and are tagged with the –group flag in the EPTA dataset. We use uniform priors on EFAC (0.1 to 5) and log10(EQUAD) (−9 to −5).

Table 2

Parameter priors for the two exponential dips included in the noise models for PSR J1713+0747.

2.1.3 Exponential dips

The exponential dips are signals manifesting as a sudden radio frequency-dependent advance of pulse arrival times and are estimated to impact the measurements of time-correlated signals (Hazboun et al. 2020). Such events have been reported three times for PSR J1713+0747 at MJDs ~54757(2008) (Coles et al. 2015; Zhu et al. 2015; Desvignes et al. 2016), ~57510(2016) (Lam et al. 2018; Goncharov et al. 2021; Chalumeau et al. 2022) and ~59320(2021) (Xu et al. 2021; CHIME/Pulsar Collaboration 2021; Singha et al. 2021). Only the first two events are part of DR2full. This effect can be explained by a sudden drop in the density of the IISM electron column along the line of sight to the pulsar and would induce an effect on the timing residuals with a chromatic index of 2. It could also be caused by a magnetospheric process inducing a profile shape change with a different chromatic index than for a DM event (Shannon et al. 2016; Goncharov et al. 2021).

The model commonly used in enterprise describes the exponential dips as a deterministic signal with a waveform dexp expressed for any observing time ti > t0 and radio frequency vk

dexp(ti,vk,θexp)=A(vkvref)αexp(tit0τ),${{\bf{d}}_{{\bf{exp}}}}\left( {{t_i},{v_k},{{\bf{\theta }}_{\exp }}} \right) = A{\left( {{{{v_k}} \over {{v_{{\rm{ref}}}}}}} \right)^{ - \alpha }}\exp \left( { - {{{t_i} - {t_0}} \over \tau }} \right),$(4)

with the reference frequency υref set at 1.4 GHz and the parameters θexp being the amplitude (A; in residual units), the relaxation time (τ; in days), the initial epoch (t0; in MJD) and the chromatic index (α). The prior probability distributions are described in Table 2. For the analyses in this work and similarly to Chalumeau et al. (2022), we fix the exponential dips α of the first and second events, respectively, at 4 and 1, chosen from their marginalised posteriors evaluated a priori.

2.2 Bayesian inference for pulsar timing noise analysis

We use a Bayesian approach (e.g. Sivia & Skilling 2006) for the parameter estimation and model selection. Given the measured timing residuals δt, the parameters θi of a chosen model i are considered random variables with a probability density function (parameter posteriors) described with the Bayes theorem as follows.

p(θi|δt,i)=p(δt|θi,i)p(θi|i)p(δt|i),$p\left( {{{\bf{\theta }}_i}\left| {{\bf{\delta t}},{{\cal M}_i}} \right.} \right) = {{p\left( {{\bf{\delta t}}\left| {{{\bf{\theta }}_i},{{\cal M}_i}} \right.} \right)p\left( {{{\bf{\theta }}_i}\left| {{{\cal M}_i}} \right.} \right)} \over {p\left( {{\bf{\delta t}}\left| {{{\cal M}_i}} \right.} \right)}},$(5)

where p(θi | i), p(δt | θi, i) and p(δt | i) are the parameter prior, likelihood and marginal likelihood (or evidence), respectively. The prior distributions for the noise parameters are described in Sect. 2.1 and those for the timing model parameters are shown in EPTA Collaboration (2023).

We use a Gaussian likelihood (van Haasteren & Vallisneri 2015; Chalumeau et al. 2022), where the deterministic signal waveforms are directly reduced to the timing residuals and the stochastic time-correlated and uncorrelated components are included in the time-domain covariance matrix. The evidence corresponds to the marginalised likelihood, that is, the integral of the likelihood over the whole parameter space. This term is particularly useful for model selection, as described below.

Let us now rewrite the Bayes theorem to express the probability density function of a model i given the set of timing residuals δt

p(i|δt)=p(δt|i)p(i)p(δt),$p\left( {{{\cal M}_i}\left| {{\bf{\delta t}}} \right.} \right) = {{p\left( {{\bf{\delta t}}\left| {{{\cal M}_i}} \right.} \right)p\left( {{{\cal M}_i}} \right)} \over {p\left( {{\bf{\delta t}}} \right)}},$(6)

where p(i) is the prior of model i, p(δt | i) is the evidence that also appears in Eq. (5) and p(δt) is the probability of observing the timing residuals marginalised over all models. The odds ratio, that is, the probability ratio between two models 1 and 2 given the same data is defined as

p(2|δt)p(1|δt)=p(δt|2)p(δt|1)p(2)p(1).${{p\left( {{{\cal M}_2}\left| {{\bf{\delta t}}} \right.} \right)} \over {p\left( {{{\cal M}_1}\left| {{\bf{\delta t}}} \right.} \right)}} = {{p\left( {{\bf{\delta t}}\left| {{{\cal M}_2}} \right.} \right)} \over {p\left( {{\bf{\delta t}}\left| {{{\cal M}_1}} \right.} \right)}}{{p\left( {{{\cal M}_2}} \right)} \over {p\left( {{{\cal M}_1}} \right)}}.$(7)

Considering equal prior probability between all models (as in this work), Eq. (7) is reduced to the ratio of evidences, also termed the Bayes factor,

12=p(δt|2)p(δt|1).${\cal B}_{{{\cal M}_1}}^{{{\cal M}_2}} = {{p\left( {{\bf{\delta t}}\left| {{{\cal M}_2}} \right.} \right)} \over {p\left( {{\bf{\delta t}}\left| {{{\cal M}_1}} \right.} \right)}}.$(8)

If 12>1${\cal B}_{{{\cal M}_1}}^{{{\cal M}_2}} > 1$, the model 2 is preferred over 1 given the measured timing residuals δt considered as the data. We use the scale proposed in Kass & Raftery (1995) to interpret and make decisions about the selection of the model and include an additional noise component in a simpler model only if the related Bayes factor is equal to or greater than 150. In this work, most of the Bayes factors are evaluated following a product-space sampling approach (Carlin & Chib 1995; Hee et al. 2015; Taylor et al. 2020), also referred to as the HyperModel method, where an additional hyperparameter is added to switch between two or more models. The Bayes factor between the two models becomes equivalent to the ratio of the number of samples between the models.

2.3 Noise modelling packages and samplers

For the single-pulsar noise analysis results reported in this paper, we use two Bayesian pulsar timing analysis packages, temponest (Lentati et al. 2014) and enterprise3 (Ellis et al. 2019). Both implement the noise models as described in Sect. 2.1. Although both codes have been tested with a range of samplers, for this work, the sampling for temponest is performed using Multinest (Feroz & Hobson 2008) and for enterprise is performed using PTMCMCSampler (Ellis & van Haasteren 2017). We have made several optimisations to temponest4, including adding a scattering delay variation model.

2.4 Choosing the optimal number of Fourier coefficients

We model the red noise, the DM variations and the scattering variations as stationary Gaussian processes, following the Fourier-sum approach described in van Haasteren & Vallisneri (2014), with a discrete and finite set of sine/cosine basis functions and a power-law power spectral density (PSD). The set of frequencies for each Gaussian process is chosen to be linearly distributed as 1/Tspan, 2/Tspan, …, Ncoef/Tspan Most PTA analyses in the past used the same number of frequency bins Ncoef for a given signal for all pulsars. However, different types of co-variance expansions have been discussed in van Haasteren & Vallisneri (2015), and methods to select a customised number of frequency bins have been proposed, either from the transitional frequency of a "red noise – white noise" broken power-law PSD (Chen et al. 2021), or from selecting the preferred value from a model selection among chosen sets of frequency bins (Chalumeau et al. 2022). In this work, we propose a novel method to select the number of frequency components for Gaussian processes by setting Ncoef as a free parameter in the noise model, and thus evaluating a marginalised posterior distribution p(Ncoef | δt). This approach extends the method performed in Chalumeau et al. (2022) by enabling tests for any possible value in a selected range of frequencies without having to evaluate a Bayes factor between all possible models. As the prior is the same between two Gaussian processes with a different number of frequency bins, this method is equivalent to a Bayes factor evaluation from a product-space approach.

We decide to set a non-informative prior for the discrete parameter Ncoef. The method implemented consists of including a real hyperparameter N˜coef${{\tilde N}_{{\rm{coef}}}}$ with a uniform prior 𝒰([ N˜min,N˜max+1 ])${\cal U}\,\left( {\left[ {{{\tilde N}_{\min }},{{\tilde N}_{\max }} + 1} \right]} \right)$, and setting the number of frequency bins Ncoef as the integer floor value [ N˜coef ]$\left[ {{{\tilde N}_{{\rm{coef}}}}} \right]$, thus ensuring an equal probability between all frequencies inside the prior range. The prior limits for the three stochastic signals considered in this work (red noise, DM, and scattering delay variations) are N˜min=10${{\tilde N}_{\min }} = 10$ and N˜max=150${{\tilde N}_{\max }} = 150$. The former is empirically chosen for statistical reasons. The maximum value of 150 allows testing for frequencies up to ~33 days−1 and ~60 days−1 for pulsars with the shortest and longest baselines, respectively, where white noise is expected to dominate over other signals. The value of the upper edge prior for the truncated dataset (DR2new; cf. Sect. 3) is N˜max=100${{\tilde N}_{\max }} = 100$ to cover frequencies up to 37–33 day−1 for all pulsars.

In case of an inconclusive (that is, flat) posterior distribution, we select a minimal value and perform an additional Bayes factor evaluation between the maximum posterior value and the selected minimal value as a performance check. By minimising the number of Fourier modes, we aim at selecting only the low-frequency noise expected to follow a power-law PSD, and reduce the effects of excess noise at high frequencies in the white noise dominated range. Furthermore, this step allows to reduce the computational cost of the likelihood which becomes crucial for a gravitational-wave analysis where the Gaussian process hyperparameters for all the pulsars in the array are simultaneously sampled.

2.5 Model selection and parameter estimation methodology

For each pulsar, we perform a Bayesian model selection to evaluate the most favoured combination among a chosen list of time-correlated noise components (cf. Sect. 2.1). The selection between two models is based on a Bayes factor evaluation performed with enterprise using the HyperModel method. For these analyses, we marginalise across all timing model parameters and fit the remaining parameters. For PSR J1713+0747, we also include the exponential dips described in Sect. 2.1.3. After obtaining the customised noise models, a full Bayesian analysis is performed with temponest that fits all the timing and noise model parameters. The results of these analyses are reported in Sect. 3. The procedure used to obtain the customised noise models is described in the following part of this section.

The six candidate noise models include any combination of achromatic red noise (RN), DM variations (DM), and scattering delay variations (SV). These are NONE (no time-correlated noise), RN, DM, RN+DM, DM+SV and RN+DM+SV (as reported in Tables 3 and 4. In this work, we do not consider models including SV but without DM, as they are physically unlikely at the observed radio frequencies in our dataset. We first evaluate the number of frequency bins that are the most preferred for each candidate noise model following the method described in Sect. 2.4. In the second step, we select the most favoured model among the six candidates with a Bayes factor evaluation. Following the principle of Occam's razor, we select the simplest model (that is, the model with the lowest number of Gaussian process signals) in case of inconclusive results.

After obtaining the final customised models, we perform additional checks to consolidate the robustness of the results described in Sect. 5. This step has been particularly useful in revisiting the noise model for PSR J1022+1011, which exhibited unmodelled achromatic red noise in the low frequencies of the power spectrum. A deeper analysis and a more advanced noise model would likely be required for this pulsar. Therefore, we stick to a standard version of the noise model for this pulsar, including RN and DM with 30 and 100 PSD frequencies, respectively.

3 Estimated noise budget for the EPTA DR2 pulsars

Here, we present the results of the noise model selection and parameter estimation for two EPTA datasets (to assist the reader, Table 5 provides a summary of the various datasets used in this paper),

  • The first is DR2full, containing data from the EPTA DR1 and the new instrumentation in DR2.

  • The second dataset consists only of the EPTA DR2 data from the new instrumentation (starting from ~MJD 55611; hereafter, DR2new).

  • The third is DR2full+ which contains the data described as DR2full along with additional low-frequency data from the InPTA collaboration for 10 common MSPs.

In principle, the combined EPTA+InPTA dataset (DR2full+) is much more sensitive for measuring the pulsar noise parameters as well as the GWB, with a total time span of 15–25 yr for most pulsars. However, around half of these data are from instruments that were designed prior to the start of the PTA experiment, and in some cases lacked coherent dedispersion or were otherwise limited in time resolution on the fastest pulsars. Furthermore, in some cases the early data do not have good coverage in observing frequency, making it difficult to disentangle chromatic and achromatic noise processes. Hence, although the DR2new is only 10 yr long, it may contain fewer systematics and degeneracies in the noise models. Indeed, as shown in EPTA Collaboration & InPTA Collaboration (2023), DR2new appears to be favourable in sensitivity to the GWB, which may suggest that the noise modelling is insufficient for early EPTA data, or that the characteristics of the pulsar noise or GWB signal are varying with time. Further investigation of the time-stationary properties of noise in each pulsar is briefly discussed in Sect. 4.1 and should be a priority for future IPTA data analysis.

In Table 3, we present the noise budget for 25 MSPs in DR2full+, while Table 4 shows the same results when applied to DR2new. For each pulsar, we report the properties of the preferred noise model estimated from the method described in Sect. 2.5 and discuss the interpretations of the results in Sect. 4.1.

For the DR2full+ data, we found that RN, DM and RN+DM are favoured for three, thirteen and eight pulsars respectively, and no significant time-correlated noise was evident in PSR J2322+2057. The data for fourteen pulsars do not support the inclusion of achromatic red noise, and this is further investigated in Sect. 5.1. However, including these pulsars in a GWB search analysis is likely still important for the optimisation of the PTA sensitivity as we include the spatial correlations among pulsars. For these pulsars, estimates of DM variations in a multi-pulsar analysis with spatial correlations are largely consistent with single-pulsar results with a few exceptions.

Of the 21 pulsars with significant DM variations, twelve have constrained power laws with a spectral index consistent with the expected γDM = 8/3 from Kolmogorov turbulence in IISM. Of the remaining nine pulsars, seven of them also include an achromatic red noise component in the favoured model, which could potentially impact the measurement of DM variations.

We report a significant measurement of achromatic red noise for eleven pulsars, with six consistent with the predicted spectral index γGWB = 13/3 from an idealistic GWB produced from GW-driven circular supermassive black hole binaries (SMBHBs). We find peculiar achromatic red noise for PSRs J0900−3144 and J1012+5307, with flat power-laws (resp. γRN=1.060.27+0.28${{\rm{\gamma }}_{{\rm{RN}}}} = 1.06_{ - 0.27}^{ + 0.28}$ and γRN=1.210.17+0.17${{\rm{\gamma }}_{{\rm{RN}}}} = 1.21_{ - 0.17}^{ + 0.17}$), displaying short-term correlated noise, as shown in Fig. 1.

For DR2new, we find significant achromatic noise only for seven pulsars, four of which are consistent with γGWB = 13/3. The lower number of pulsars favouring RN as compared to DR2full+ can either be due to the shorter timing baselines which reduce the sensitivity for long-term red noise signals or due to unmodelled noise present in the first half of the dataset. Of the 22 pulsars with significant DM, sixteen are consistent with γDM = 8/3. Of the remaining six, three pulsars have RN+DM as the favoured model. PSRs J1640+2224 and J1744−1134 display unusually low spectral indices at 0.380.36+0.75$0.38_{ - 0.36}^{ + 0.75}$ and 1.130.36+0.46$1.13_{ - 0.36}^{ + 0.46}$ respectively. Conversely, we report high DM variations spectral index for PSR J1600−3053 at 4.511.5+2.15$4.51_{ - 1.5}^{ + 2.15}$, for which the favoured noise model is DM+SV. More details on the noise properties for this pulsar are described in Sect. 4.2. Figures for posterior distributions, time-domain realisations, and relevant tables generated for each pulsar are available on Zenodo5.

Table 3

Favoured models listed for the 25 pulsars using both EPTA and InPTA data (for 10 common MSPs; DR2full+) along with estimated values for chromatic and achromatic noise models.

Table 4

Same as in Table 3, but for DR2new.

Table 5

Abbreviations for the different datasets used in this paper.

4 Interpretation

4.1 Comparison of EPTA Noise Models

In this section, we compare the noise models between DR2full, DR2new, and the original models reported for the EPTA DR1 dataset from Caballero et al. (2016). We find that the pulsars largely fall into three categories:

  • Category1: consistent noise models in the three datasets: Typically, DR2full improves over DR1 or DR2new, and both DR2full and DR2new do a better job of constraining chromatic noise than DR1.

  • Category2: DR2full and DR2new find chromatic noise: Several pulsars in DR1 found only achromatic noise or were unable to distinguish between achromatic and DM noise, but the DR2full and DR2new datasets prefer models with only DM variations.

  • Category3: complex noise models: In a handful of pulsars, including several of those with the best TOA precision, we find a more complex relationship between the noise models in the three datasets.

In Fig. 2, we present noise models for four pulsars as an example of belonging to either of the above categories. Below, we present a summary of the noise models in the three datasets for each of the 25 pulsars.

PSR J0030+0541 (Category 1). DR2full and DR2new largely agree, both with achromatic noise only. The DR2 posterior is better constrained. DR1 finds the same achromatic noise, but is unable to cleanly distinguish it from the DM noise.

PSR J0613-0200 (Category 3). DR2 and DR1 both see achromatic noise of similar amplitude, but the spectral index is unconstrained for DR1, whereas DR2 finds the achromatic noise to have an index >4. DR2new is unable to detect this noise, probably due to the relatively short data (~10 yr), and only finds DM variations. The posterior for γDM shifts somewhat between the three datasets but are largely consistent and slightly flatter than Kolmogorov.

PSR J0751+1807 (Category 1). All three datasets are largely consistent, finding only DM variations with a spectral slope consistent with Kolmogorov.

PSR J0900-3144 (Category 3). DR2 and DR2new posteriors are consistent, finding achromatic noise with a very flat spectral shape <2, and DM variations with an unconstrained spectral slope. The achromatic noise is much flatter than expected from classical power-law timing noise and may represent another noise process. The chromatic noise in DR1 data is consistent but unable to discern the achromatic noise.

PSR J1012+5307 (Category 3). DR2full has a tightly constrained posterior for DM and achromatic noise. The achromatic index is very flat <2 and appears to be dominated by the high-frequency power above 1/yr. DR2new finds similar achromatic noise, and similar amplitude DM variations, but is barely able to constrain γDM. We speculate that this is because the longer dataset of DR2full can rule out very steep DM variations given the absence of steep achromatic noise. DR1 finds achromatic noise with a marginally steeper spectrum and finds an upper limit on DM variations just below that observed in the DR2full data.

PSR J1022+1001 (Category 2). DR2full and DR2new are consistent with each other and show a very flat spectrum DM variation and an achromatic noise with 4. The flat-spectrum DM variation in this pulsar could likely be due to high-frequency residuals remaining from the solar wind contribution as this pulsar passes close to the ecliptic. A revised interpretation is given in Sect. 5.1, after performing a chromatic index evaluation for each time-correlated component. DR1 showed flat spectrum achromatic noise and little variation in DM. We suspect that this may be a leakage from DM variations due to the limited frequency coverage. We account for the solar wind by fitting for the NE-SW parameter as part of the timing model, which results in an estimated value of 10.9 ± 0.3 cm−3 relative to the constant value of 7.9 cm−3 in the other pulsars.

PSR J1024-0719 (Category 2). DR1 showed a marginal detection of DM variations with a steep spectral index. DR2full and DR2new both find only DM variations consistent with Kolmogorov and incompatible with the DR1 value. We find that the DR2full and DR2new models seem more plausible, but they could indicate the presence of smooth DM structures in the early DR1 data.

PSR J1455-3330 (Category 1). The three datasets are largely consistent. DR1 cannot distinguish between DM and achromatic noise, but the better frequency coverage of DR2full and DR2new finds only achromatic noise. The DR2full posteriors are more tightly constrained, particularly in γred.

PSR J1600-3053 (Category 3). DR1 has fairly flat spectrum DM variations, which are split into flat-spectrum scattering delay variations and steep (than Kolmogorov) DM variations in DR2full and DR2new.

PSR J1640+2224 (Category 2). DR1 is unable to distinguish between DM and achromatic noise. DR2full and DR2new find only DM variations with an extremely flat spectral index, γDM < 1. It is not clear what physical process would give rise to such a flat spectral index for DM variations.

PSR J1713+0747 (Category 3). Two DM events (exponential dips) at MJDs ~54757 and one at MJD ~57510 are observed for this pulsar in DR2full. The methodologies adopted to model these events are different between DR1 and DR2, but we find that this does not affect the red noise properties. However, from Fig. 2, it is clear that the red noise properties change considerably between DR1, DR2new, and DR2full with the red noise spectral index going from steep to shallow, respectively. We think that this could likely point to either our assumptions of stationarity being incorrect or processes not well-modelled by a power law.

PSR J1730-2304 (Category 1). Generally consistent between the three datasets, with DR1 unable to distinguish between DM and achromatic noise. DR2full and DR2new find only DM variations consistent with Kolmogorov.

PSR J1738+0333 (Category 2). DR1 does not find any noise. DR2full and DR2new find a flat-spectrum noise process, but the model selection prefers achromatic noise for DR2full and chromatic noise for DR2new. In both cases, the evidence is marginal for any one particular model.

PSR J1744-1134 (Category 3). The three datasets are superficially similar, but paint a confusing picture when taken together. DR1 finds no variations in DM and an achromatic process with γred ~ 3. DR2full finds DM variations with γDM < 1, amplitude above the upper limit for DM variations from DR1, and achromatic noise of similar amplitude but with a steeper, but largely unconstrained spectral slope, γred > 2.5. Indeed, the DR2full posterior for the achromatic noise is bimodal with a component that is largely consistent with DR1 and a component that is much steeper (γred > 5). DR2new does not find evidence of achromatic noise, but has consistent DM variations with DR2full. The very flat spectrum DM variations, inconsistent with DR1, suggest that the chromatic noise is not a power-law process but instead dominated by high-frequency variations to which DR1 was insensitive because of the limited instantaneous frequency coverage. The inconsistency of the achromatic noise may indicate there is another underlying process at work in this pulsar beyond a single power-law achromatic noise and DM variations. This pulsar is observed by all members of the IPTA, and hence we suspect that the picture will become clearer when studied with a combined IPTA dataset.

PSR J1751-2857 (Category 1). Generally consistent between the three datasets. DR1 only finds upper limits, but DR2full and DR2new find a DM variation consistent with Kolmogorov.

PSR J1801-1417 (Category 1). Generally consistent between the three datasets. DR2full and DR2new better constrain the DM variation, and the spectral shape is consistent with Kolmogorov.

PSR J1804-2717 (Category 1). Generally consistent between the three datasets. DR2full and DR2new much better constrain the DM variations, and the spectral shape is consistent with Kolmogorov, although both prefer a flatter γDM.

PSR J1843-1113 (Category 1). Generally consistent between the three datasets. DR2full and DR2new much better constrain the DM variation, and the spectral shape is consistent with Kolmogorov.

PSR J1857+0943 (Category 2). DR1 finds achromatic noise that DR2full and DR2new attribute to DM variations, without any achromatic noise. There is some overlap in the DM posteriors for DR1 and DR2full, though we suspect that the detection of achromatic noise in DR1 was incorrect.

PSR J1909-3744 (Category 3). DR1 finds achromatic noise that seems to have the same properties as the DM variations in the DR2full/DR2new data. The DM variations in DR2full/DR2new are above the upper limit from DR1. DR1 had minimal frequency coverage for this pulsar, though it is not clear why achromatic noise was preferred if there was a degeneracy. DR2full and DR2new find achromatic noise with a steeper spectral index and at a lower amplitude than the DM variations in the DR1 data. We note that the noise analysis in Liu et al. (2020) which used very similar EPTA data as in this work, reported similar results when the L-band data were divided into sub-bands.

PSR J1910+1256 (Category 1). Generally consistent between the three datasets. DR1 only finds upper limits, while DR2full and DR2new find DM variations consistent with Kolmogorov, but γDM is not well constrained.

PSR J1911+1347 (Category 2). Generally consistent between the three datasets. DR1 is not able to distinguish between DM and achromatic noise, but DR2full and DR2new find a DM variation consistent with Kolmogorov.

PSR J1918-0642 (Category 2). DR1 finds achromatic noise with a preference for large γred, and does not find significant DM variations. DR2full does not find evidence for achromatic noise, but instead finds a DM variation consistent with Kolmogorov, although also consistent with a steeper spectrum. DR2new is consistent with DR2full. This is another example of achromatic noise in DR1 being interpreted as chromatic noise when wide-band receivers are used.

PSR J2124-3358 (Category 2). Generally consistent between the three datasets. DR1 only finds upper limits, but DR2full and DR2Bew find a variation in DM consistent with Kolmogorov, though preferring a flatter γDM.

PSR J2322+2057. Does not show evidence of time-correlated noise in any EPTA dataset.

thumbnail Fig. 1

100 random realisations (blue) and medians for each epoch (black) of the RN time-delay reconstructions for PSRs J0900–3144 (left) and J1012+5307 (right) using the most favoured noise models with the DR2new dataset. The short-timescale stochastic signals seen at a level for these two pulsars still have unknown origins.

thumbnail Fig. 2

Marginalised posterior distributions of noise hyperparameters obtained with the EPTA DR1, DR2full and DR2new. Top left: PSR J1730–2304 falls in Category 1, where the DR2full/New datasets are able to constrain chromatic noise (DMAmp, DMSlope). Top right: For PSR J1857+0943, belonging to Category 2, it is clear that the DR1 data could not disentangle chromatic from achromatic noise (RedAmp, RedSlope), which is resolved better with the DR2 datasets. Bottom left: PSR J1713+0747 in Category 3, shows clear inconsistencies in the achromatic noise estimates. Bottom right: PSR J1909–3744 shows a much more complicated behaviour as described in the text motivating its inclusion in Category 3.

4.2 Changes in noise models after the inclusion of the InPTA data

Here we study the impact of including low radio frequency observations from the InPTA on the estimated noise models. The InPTA dataset complements the EPTA data with simultaneous observations at 300-500 MHz and at 1260-1460 MHz between MJDs 58235 and 59496 observed with the upgraded Giant Metrewave Radio Telescope (uGMRT). The frequency coverage at 300-500 MHz is particularly important since EPTA has a limited number of observations at this frequency band. Therefore, the inclusion of the InPTA data is of particular interest in constraining noise due to the IISM such as DM and scattering variations. To allow a quantitative comparison of the posterior noise models before and after the inclusion of InPTA data, we adapt and employ a tension metric package as detailed in Raveri & Doux (2021)6.

We begin by introducing the probability density for parameter differences between the posteriors of two sets of parameters θ1 and θ2 to be Ƥθ) where Δθθ1θ2. Employing a joint posterior Ƥ(θ1, θ2) ≡ Ƥ(θ1, θ2|d1, d2), we can write Ƥ(θ1, θ) = Ƥ(θ1, θ1 − Δθ), where d1 and d2 refer to the two data sets. Integrating the base parameters, we obtain

𝒫(Δθ)=Vπ𝒫(θ1,θ1Δθ)dθ1,$P\left( {{\rm{\Delta }}{\bf{\theta }}} \right) = \int_{{V_\pi }} {{\cal P}\left( {{{\bf{\theta }}_1},{{\bf{\theta }}_1} - {\rm{\Delta }}{\bf{\theta }}} \right)d{{\bf{\theta }}_1},} $(9)

where Vπ is the volume of the parameter space where the prior is non-vanishing, also known as the support of the prior. If the two datasets, specified by the parameter θ, are conditionally independent, it is possible to sample the posteriors Ƥ(θ1) and Ƥ(θ2) separately (Raveri & Doux 2021). In this scenario, the expression for Ƥθ) becomes

𝒫(Δθ)=Vπ𝒫(θ1)𝒫(θ1Δθ)dθ1.$P\left( {{\rm{\Delta }}{\bf{\theta }}} \right) = \int_{{V_\pi }} {P\left( {{{\bf{\theta }}_1}} \right)P\left( {{{\bf{\theta }}_1} - {\rm{\Delta }}{\bf{\theta }}} \right)d{{\bf{\theta }}_1}.} $(10)

Thereafter, we obtain the mean probability of the presence of a parameter shift using the following equation.

Δ=𝒫(Δθ)𝒫(0)𝒫(Δθ)dΔθ,${\rm{\Delta }} = \int_{P\left( {{\rm{\Delta }}{\bf{\theta }}} \right)P\left( 0 \right)} {P\left( {{\rm{\Delta }}{\bf{\theta }}} \right)d{\rm{\Delta }}{\bf{\theta }},} $(11)

which incorporates the posterior mass above the iso-contour of no shift. We then convert the above Δ into an effective number of er using the standard normal distribution. Detailed comparisons that arise from the posteriors of DR2full and DR2full+ are available online7. In Table 6, we report the estimated tension (in σ) for the red and DM noise models while dealing with DR2full and DR2full+ datasets. It shows that the 2D posterior distributions of the RN and DM parameters are consistent (Δ < 1σ) for all parameters, except for the power law DM variations of the PSRs J0613–0200, J1600–3053, J1744–1134 and J1909–3744. In the following, we discuss possible explanations for these pulsars.

thumbnail Fig. 3

100 random realisations (blue) and medians for each epoch of the time-domain reconstructions of the DM variations delays at 1.4 GHz radio frequency modelled as a Gaussian process for PSRs J0613–0200 (left) and J1909–3744 (right), with the “DR2full+”. The black dashed lines display the last epochs of DR2full (that is, EPTA only). The inclusion of InPTA data allows to measure sharp changes at after MJD 59000. The delays are obtained from marginalising the timing model parameters which comprise the DM constant and the first two time derivatives of the dispersion measure (DM1, DM2).

Table 6

Estimated tension (Z-score in sigma) between the DR2full and DR2full+ datasets for the red and DM noise models.

4.2.1 PSR J0613–0200

For this pulsar, combining InPTA with the EPTA data yields a lower spectral index and higher amplitude at fyr for the chromatic noise. We observe an interesting sharp jump in the last years of the DM time series after including InPTA data (cf. the left panel of Fig. 3), which might increase the power at high PSD frequencies, and therefore yield to a flatter constrained power law.

4.2.2 PSR J1600–3053

The favoured model using DR2full comprises DM and SV, with constraints that are (1) consistent with independent measurements of scattering delays with the Large European Array for Pulsars (LEAP; Main et al. 2023; cf. left panel of Fig. 4), and (2) highly consistent with the expected chromatic indices for both DM and SV as shown in the right panel of Fig. 4. However, the inclusion of InPTA data no longer supports scattering variations, and the favoured model is RN+DM. While the inclusion of SV is still supported after including the L-band data from the InPTA, we find that this discrepancy is happening from including the P-band data (300–500 MHz). Furthermore, the scattering variations in DR2full are unlikely to be related to pulse broadening at low frequencies, as we have no evidence for this in the InPTA P-band data. This discrepancy will be further investigated with data from the IPTA.

4.2.3 PSR J1744–1134

As described in Sect. 4.1, this pulsar exhibits a bimodal posterior for achromatic noise and a very flat spectrum for DM variations with the DR2full dataset. The inclusion of the InPTA data measures DM variations with a much lower spectral index and a lower amplitude. This removes the observed bimodality in the achromatic noise and allows a tighter constraint on a single mode as shown in Fig. 5.

4.2.4 PSR J1909–3744

The behaviour observed in PSR J1909–3744 appears to be very similar to PSR J0613–0200, in which including the InPTA data produces a lower spectral index and a higher amplitude for the chromatic noise. This is also likely caused by the sharp change in DM variations in the last two years of the dataset (cf. Fig. 3). This feature was already reported in Tarafdar et al. (2022) and is also shown in Curylo et al. (2023) that used, respectively, the first data release of InPTA and the ultra-wide-bandwidth low-frequency data from PPTA. Interestingly, we do not observe any associated impact on the RN posterior distributions.

thumbnail Fig. 4

Scattering delay variations measurement for PSR J1600–3053 with DR2new. Left plot: 100 random realisations (blue) of the time-domain reconstructions of the scattering delay variations at 1.4 GHz radio frequency modelled as a Gaussian process with DR2new. Independent scattering delays measured from scintillation analysis with the LEAP data (Main et al. 2023) are shown by the black points. Right plot: chromatic index posterior distributions measured for the DM (orange) and the SV (blue) with DR2new, while fixing the chromatic index of the other component. The favoured model for this dataset includes both DM and SV. The black dotted lines emphasise the expected values for DM and SV, respectively at 2 and 4.

thumbnail Fig. 5

Red and DM noise models for PSR J1744–1134 using DR2full and DR2full+ datasets. The inclusion of InPTA data allows a better constraint on the achromatic noise.

4.3 Implications

As expected the much-improved frequency coverage of DR2 has meant that the chromatic noise is much better constrained with the DR2full and DR2new datasets. Surprisingly, in some cases, signals attributed to achromatic processes in DR1 have been attributed to chromatic noise in DR2.

Of particular interest, however, are pulsars such as J1713+0747 and J1744−1134, in which the choice of dataset seems to affect the noise model hyperparameters in a way that is not easily understood. Although these effects are rather subtle, these are two of the best-timed pulsars in the EPTA, and this inconsistency may be an indication that more complex noise models may be needed for the best-timed pulsars and longest datasets. We assume that the noise process is a stationary Gaussian process with a power-law spectral density, and this may suggest that one or more of these assumptions are incorrect. Indeed, it is quite plausible that the pulsar spin noise is not entirely stationary. The IISM is known to have discrete structures that may change the statistics of the DM and scattering variations. In particular, PSR J1713+0747 is observed to undergo ‘chromatic timing events that are neither a stationary process nor fully modelled by achromatic or ν−2 DM variations (Lam et al. 2018). Pulsar timing noise in the population of canonical pulsars also shows behaviour that is either not strictly modelled by a power-law (e.g. quasi-periodic variability; Lyne et al. 2010) or shows discrete changes in spin-down behaviour on long timescales (e.g. Brook et al. 2014; Shaw et al. 2022).

It is worth reiterating that the noise models presented in this work represent our best estimate of the underlying noise using the understanding that we currently have. Further insights are hampered by the fact that it can be hard to disentangle the effect of improvements in the observing systems, especially sensitivity and observing frequency coverage, from changes in the observable pulsar noise properties, either due to time-variable processes or from processes only detectable on long timescales. Combined IPTA datasets bring additional complications, but the combination of improved instantaneous sensitivity and frequency coverage, particularly in the latest instrumentation, will be our best chance to separate instrumental effects from those intrinsic to the pulsars. Furthermore, the direct combination of data can make use of a dropout analysis to identify any telescope-or backend-speciflc instrumental effects that may mask the real behaviour of the pulsars.

5 Noise model validation

In this section, we perform several tests to compare the performance of the customised noise models with more standard models and assess the robustness of the results.

5.1 Performance of the customised noise models

As shown by Tables 3 and 4, the favoured noise models include only one component for most pulsars, making them simpler than the standard models commonly used in PTA analyses (RN and DM with 30 and 100 frequency bins, respectively, for all pulsars). We first evaluate the improvements enabled by the model selection process by comparing them with standard noise models, except for PSR J1012+1001, since we use the standard model for this pulsar (cf. Sect. 3). The Bayes factors in favour of the customised noise models over the standard models are shown in Col. 3 of Table 7 for each dataset. Following the Kass & Raftery (1995) scale, we find

  • weak support (standcus${\cal B}_{{\rm{stand}}}^{{\rm{cus}}}$ ∈ [1, 3]) for 11 and 13 pulsars,

  • positive support (standcus${\cal B}_{{\rm{stand}}}^{{\rm{cus}}}$ ∈ [3, 20]) for five and seven pulsars,

  • strong support (standcus${\cal B}_{{\rm{stand}}}^{{\rm{cus}}}$ ∈ [20, 100]) for two and zero pulsars,

  • very strong support (standcus${\cal B}_{{\rm{stand}}}^{{\rm{cus}}}$ ≥ 100) for six and four pulsars,

in DR2full+ and DR2new respectively. We notice higher Bayes factor values for DR2full+ compared to DR2new for most pulsars, which can be partially explained by the better performance of the standard noise models to describe higher PSD frequencies for DR2new as its timing baseline is approximately two times shorter than DR2full+. Furthermore, cases with “strong” and “very strong” evidence are obtained for the pulsars that include two time-correlated components (RN+DM, DM+SV), except for PSR J1801–1417 (only DM variations). We also observe a very significant reduction of Bayes factors for PSR J1909–3744 with the DR2new, even if the preferred model for this pulsar includes both RN and DM for the two datasets.

To assess the validity of the customised noise models, we checked the chromatic index a values used for the time-correlated noise components fixed at 0, 2 and 4 for RN, DM and SV respectively. To do so, we used custom noise models for each pulsar and set a for each noise component as a hyperparameter to estimate its posterior distribution, as performed in recent PTA analyses (e.g. Goncharov et al. 2021; Chalumeau et al. 2022). If the pulsar has two time-correlated components, we perform two analyses to evaluate each a independently while fixing the index of the other component to its relevant value (0 if RN and 2 if DM). We use a uniform prior probability as 𝒰$U$(−5, 10). The medians and 95% credible intervals of the posterior distributions for RN and DM chromatic indices (resp. αRN and αDM) are shown in the first two columns for each dataset in Table 7. In the following, we summarise our findings.

  • For RN, we note that the chromatic indices of PSRs J1713+0747 and J1909−3744 are consistent with the expected value of 0 with DR2new, while it is not the case with DR2full. We observe slightly negative values for PSRs J0030+0451, J0900−3144 and J1012+5307 with both datasets.

  • For the DM chromatic indices, 18 pulsars are consistent with the expected value for DM variations with both the DR2full+ and the DR2new. However, the high values for PSRs J1911+1347 and J2124−3358 with the DR2new warrant further investigation.

  • The only pulsar with a preferred SV model is PSR J1600−3053 with DR2new. The posterior distribution of the SV chromatic index for this pulsar is shown in the right panel of Fig. 4, with a median and 95% credible intervals at 4.311.27+1.88$4.31_{ - 1.27}^{ + 1.88}$, nicely consistent with the expected value from scattering variations (cf. Sect. 2).

  • For PSR J1022+1001, it is interesting that the trend in RN and DM indices swaps between the two datasets (that is, RN is shallower in DR2full+ and steeper in DR2new, while the opposite is the case for DM). To account for this, we stick to the standard noise model for this pulsar, as RN might prioritise the low frequencies, while DM also samples the high frequencies. The estimated indices for the DR2new show that the high-frequency signals are achromatic for this pulsar and the low-frequency components are consistent with the DM variations. While the former could correspond to profile instabilities (Liu et al. 2015; Padmanabh et al. 2020), the latter is likely due to the presence of a variable solar wind (Tiburzi et al. 2021).

For the second validity test, we evaluate a Bayes factor after including an achromatic noise component over the favoured models and therefore check if there is evidence for any remaining red noise. We use 30 frequency components for this additional red noise in order to prioritise the low frequencies. As shown in Col. 5 of Table 7, we only find weak evidence (maximum of 2.3 for PSR J1640+2224) for the inclusion of this component. These results reconfirm the robustness of the customised noise models that do not include an achromatic component.

Table 7

Validation and performance statistics for the customised noise models with the “DR2full+” and “DR2new”.

Table 8

Tension metrics for the red and DM noise models estimated using enterprise and temponest.

5.2 Consistency among different softwares

In Table 8, we present the tension in σ (see Sect. 4.2 for a description of the tension metric) for red and DM noise models estimated using two Bayesian timing packages, enterprise and temponest for the 25 EPTA MSPs. We find that the noise models for all the pulsars are very consistent.

5.3 Marginalisation of the timing model

Preliminary analysis of DR2full using temponest showed that the posteriors of the noise model parameters varied significantly depending on whether the timing model was marginalised. This was particularly surprising given that Lentati et al. (2014) demonstrated that neither the analytic marginalisation nor linearisation made a significant difference to any parameters. Further investigation revealed that changes in the implementation of temponest in 20178 introduced a critical flaw when not marginalising over the timing model whilst marginalising over the arbitrary jumps between instruments; the typical method of operation when not marginalising over the timing model. This error disabled fitting for many jumps, leading to incorrect noise model inferences. Once resolved, we no longer find any discrepancy between noise model parameters when marginalising over the timing model, and find that the linearised timing model is sufficient for further analysis once any highly non-linear binary parameters have been solved.

5.4 Simulations

We have implemented a toolkit to generate repeatable simulations of EPTA datasets to validate noise models. Simulations are generated using the toasim framework that is distributed with tempo2. In brief, the method is to generate a set of ‘idealised’ TOAs that produce zero residual with respect to a seed pulsar parameter file and then add perturbations for each of the simulated noise processes. For an idealised TOA ti, the final simulated TOA is given by

tfinal,i=ti+yred(ti)+ydm(ti)+yscat(ti)+ywhite(ti),${t_{{\rm{final}},i}} = {t_i} + {y_{{\rm{red}}}}\left( {{t_i}} \right) + {y_{{\rm{dm}}}}\left( {{t_i}} \right) + {y_{{\rm{scat}}}}\left( {{t_i}} \right) + {y_{{\rm{white}}}}\left( {{t_i}} \right),$(12)

where each of the perturbations y is discussed in the following sections. In principle, this can only be solved iteratively, but in practise we can neglect second-order corrections as long as the perturbations are small. Therefore, we typically subtract a quadratic from each y(t), which significantly reduces the general amplitude of the perturbations, especially for steep red noise processes. Removing a quadratic in this way does not affect the noise models, but it does mean that the mean pulsar spin frequency and frequency derivative are unchanged from realisation to realisation. This is hence equivalent to fitting for F0 and F1 on each dataset, and small variations in these parameters are largely uninteresting, and in many senses arbitrary for a simulation, this does not affect the overall usefulness of the simulations.

Significant improvements have been implemented to the toasim code as part of this work, which were required to ensure that the definitions of the model parameters are consistent between this work and the simulation code. The simulation code has been adapted to be consistent with the tempo2 and temponest definitions of the model parameters, although these are generally trivially related to other definitions.

5.4.1 Achromatic red noise

The toasim framework provides a method for simulating power-law red-noise processes by means of the inverse Fourier transform. This first computes the noise process over a grid of 4096 equally spaced values over Tspan, extended by a factor of n = 100 to avoid periodic boundary effects of the Fourier transform. The spectrum is computed for N = 1024n frequencies with Hermitian symmetry, and Fourier transformed to give a time series:

rk=j=N/2N/2Rj(aj+ibj)exp(2π1jk/N),${r_k} = \sum\limits_{j = {{ - N} \mathord{\left/ {\vphantom {{ - N} 2}} \right. \kern-\nulldelimiterspace} 2}}^{{N \mathord{\left/ {\vphantom {N 2}} \right. \kern-\nulldelimiterspace} 2}} {{R_j}\left( {{a_j} + i{b_j}} \right)\exp \left( {2\pi \sqrt { - 1} {{jk} \mathord{\left/ {\vphantom {{jk} N}} \right. \kern-\nulldelimiterspace} N}} \right)} ,$(13)

with aj and bj being random variables drawn from 𝒩$N$(0,1). Note that the requirement for Hermitian symmetry means that only 512 independent random values are needed for each of aj and bj. This is largely analogous to Eq. (1) for the noise modelling except that the Fourier transform includes both positive and negative frequencies and therefore the amplitudes must be scaled down by a factor of two, relative to the noise models in Eq. (2). Therefore, the mean amplitude at each frequency Rj is given by

Rj=12Ared212π2syr3nTspan(fjfyr)γred,${R_j} = {1 \over 2}\sqrt {{{A_{{\rm{red}}}^2} \over {12{\pi ^2}}}{{{{\rm{s}}_{{\rm{y}}{{\rm{r}}^3}}}} \over {n{T_{{\rm{span}}}}}}{{\left( {{{{f_j}} \over {{f_{{\rm{yr}}}}}}} \right)}^{ - {\gamma _{{\rm{red}}}}}}} ,$(14)

where fyr = 1yr−1, and syr = 31557600 s yr−1 converts years to seconds to give a perturbation in seconds for Ared in yr3/2 and Tspan in s. The factor of 1/2 corrects from the one-sided PSD used by the noise models to the two-sided PSD needed for the Fourier transform. The final perturbation for the achromatic red noise for TOA ti is computed by linear interpolation of rk,

yred(ti)=rk+(titk)rk+1rktk+1tk,${y_{{\rm{red}}}}\left( {{t_i}} \right) = {r_k} + \left( {{t_i} - {t_k}} \right){{{r_{k + 1}} - {r_k}} \over {{t_{k + 1}} - {t_k}}},$(15)

for tk < ti < tk+1. This is implemented by the addRedNoise tempo2 plugin.

5.4.2 DM variations

Perturbations due to DM variations are similarly computed by the inverse Fourier transform using largely the same method as for the achromatic noise, except that we model a DM time series before converting to a TOA perturbation as a last step. The mean amplitude at each frequency, Dj is given by

Dj=12ADM2Syr3nTspan(fjfyr)γDM,${D_j} = {1 \over 2}\sqrt {A_{{\rm{DM}}}^2{{{{\rm{S}}_{{\rm{yr}}}}^3} \over {n{T_{{\rm{span}}}}}}{{\left( {{{{f_j}} \over {{f_{{\rm{yr}}}}}}} \right)}^{ - {\gamma _{{\rm{DM}}}}}}} ,$(16)

with Tspan in seconds, ADM in temponest units of cm−3pc yr3/2s−1.

The final perturbation for a given TOA is then calculated from d(t), a function that linearly interpolates dk in a similar way to Eq. (15), and

yDM(ti)=d(ti)κDMvk2,${y_{{\rm{DM}}}}\left( {{t_i}} \right) = {{d\left( {{t_i}} \right)} \over {{\kappa _{{\rm{DM}}}}v_k^2}},$(17)

where νk is the observing frequency of the TOA and κDM = 2.41 × 10−4cm−3pcMHz2s−1 is the DM constant. This is implemented by the addDmVar plugin in tempo2.

5.4.3 Scattering variations

Scattering variations are implemented very similarly to the DM variations. We create a scattering time series sk using the Fourier transform of n Hermitian complex values given by S j(aj + ibj), where

Sj=12Ascat212π2syr3nTspan(fjfyr)γscat.${S_j} = {1 \over 2}\sqrt {{{A_{{\rm{scat}}}^2} \over {12{\pi ^2}}}{{{{\rm{s}}_{{\rm{yr}}}}^3} \over {n{T_{{\rm{span}}}}}}{{\left( {{{{f_j}} \over {{f_{{\rm{yr}}}}}}} \right)}^{ - {\gamma _{{\rm{scat}}}}}}} .$(18)

Then the perturbation associated with the scattering is computed from a linear interpolation function s(t) to give

yscat(Ti)=s(ti)(vvref)αscat,${y_{{\rm{scat}}}}\left( {{T_i}} \right) = s\left( {{t_i}} \right){\left( {{v \over {{v_{{\rm{ref}}}}}}} \right)^{ - {\alpha _{{\rm{scat}}}}}},$(19)

where we use αscat = 4 for scattering variations, and = 1400 MHz, chosen to be consistent with the implementation in temponest and enterprise. This is implemented by the addChromVar tempo2 plugin.

5.4.4 White noise

White noise is added to the TOA with uncertainty Ct¿ after applying the EFAC and EQUAD parameters (see Eq. (3)) by drawing ywhite(ti) from 𝒩(0,σi2)$N\left( {0,\sigma _i^2} \right)$.

5.4.5 Creating the realisations

The simulation parameters can be given as single values or a uniform prior range from which to draw, so that a different set of parameters is used for each realisation of the simulation. When the parameters are drawn randomly from the prior, we record the sets of parameters used for later comparison with the results.

5.5 P–P plots

One tool to validate Bayesian analysis tools is by studying posterior quantiles from simulated datasets (Cook et al. 2006). Our method largely follows that implemented in bilby (Ashton et al. 2019), where we simulate a large number of datasets, j, with ‘true’ model parameters Θj drawn from a prior distribution p(Θ). We then use our Bayesian software, in this case temponest, to generate samples, θi,j of the posterior distribution of the parameters given the prior distribution (see Table 9), for each of the simulated datasets, following the same approach as for the real data. We can then, for each parameter i, and each simulated dataset j, compute the quantile qi,j within the 1-d posterior that the ‘true’ value of the parameter lies, that is, the weighted fraction of samples for which θi,j < Θi,j. It can be shown that the qi,j should be uniformly distributed between 0 and 1 (Cook et al. 2006). We can visualise this by plotting the cumulative distribution of qi,j, over all j, yielding a so-called ‘P–P plot’. In principle, this should follow a straight line, but even with perfect algorithms, there will be deviations caused by the finite number of simulations, which we can estimate with confidence contours taken from the binomial distribution.

It is important that the prior distribution matches between the simulated data and the Bayesian search, particularly for any parameters that may not be constrained on one or both sides and hence be bounded by the prior. This is quite common in the pulsar datasets, as power-law red-noise processes below the detection threshold will have log amplitude unbounded on the low side and the spectral slope may simply return the prior. If the prior does not match the simulation and posterior generation, this will inevitably lead to a perceived error in the posterior computation for such cases. We choose a prior range such that most simulations have noise processes that are largely representative of what we estimated in the real data, and wide enough that the posterior of the real data would not be heavily constrained by the choice of prior.

Results: For simplicity, we first consider the simulation of only achromatic red noise. The P-P plot, shown in Fig. 6a, indicates generally good consistency in the posterior of γred, but there is an excess at low confidence intervals for log (Ared). The investigation of individual results shows a correlation between the log (Ared) confidence interval and the injected γred, visualised in Fig. 6b. We believe that the observed correlation suggests that the excess results are driven by cases where γred is large, and hence we suspect that they are cases where the red noise has such a steep spectrum that it is no longer able to be expressed in a Fourier basis with lowest frequency at 1/Tspan.

To understand this, we recall that the Fourier-basis Gaussian process models used in this work, and widely used by other similar projects are naturally periodic on Tspan due to the nature of the Fourier basis. This use of a Fourier basis vastly speeds up the computation compared to directly evaluating the covariance matrix for these large datasets, but it requires that the noise process can be well approximated by a periodic function. It is generally assumed that fitting for the spin frequency (F0) and spin frequency derivatives (F1) parameters absorbs the low frequencies sufficiently that the periodic model is acceptable, and previous analyses have suggested that the posteriors remain largely unchanged when trying to absorb noise below 1/Tspan even for pulsars with γred ~ 6 (Lentati et al. 2016). However, the P–P plot is very sensitive to even small biases, and hence quite likely that the difference may not be noticeable (or have strong Bayesian evidence) for a single analysis. Recently, it has also been seen that Fourier-basis Gaussian processes models periodic on Tspan struggle to recover the spin frequency second derivative (F2) when γred is greater than ~4 (Keith & Nifu 2023), and this implies that fitting for F0 and F1 may not be sufficient to allow safe use of the 1/Tspan Fourier basis model when the noise process has a very steep spectral exponent.

In order to test our hypothesis, we repeat the analsis whilst forcing the injected red noise to be periodic on Tspan, with results shown in Fig. 6c. The curve now goes smoothly to zero at 0, and there is no longer any correlation between the log (Ared) confidence interval and the injected γred. However, the results are still not consistent with a one-to-one relationship (with probability <0.5%), in particular, we find that log (Ared) is slightly biased high and γred biased slightly low. We find that a correction of γred by only ~2% is needed to fully correct the bias, a correction much smaller than the typical uncertainty of the measurement. The slope and amplitude are highly correlated, and indeed scaling log (Ared) to a longer time span closer to where the signal can be measured is sufficient to remove the apparent bias here. Our hypothesis is that there is an intrinsic bias for slightly flatter spectral slopes, which then manifests itself as a corresponding bias in the amplitude at 1 yr−1.

We repeat this with simulations that include achromatic noise red noise, excess white noise, DM variations, and scattering variations. The resulting P–P plots are very similar to the simulation with achromatic red noise only. Figure 6d shows the P–P plot when the noise is periodic on 1/Tspan, showing a similarly small but significant bias in the posterior parameters for all three parameters. The similarity between all three components of the noise model to that obtained when only injecting achromatic noise suggests that the bias is due to something fundamental in the Fourier domain Gaussian process modelling of the noise, rather than being driven by the choice of the dataset.

Overall, the P–P plots are encouraging and show that the recovered posteriors are largely correct, but in almost all cases, we find subtle deviations from the correct posterior distribution. In agreement with Sect. 5.2, we find that the analysis does not change significantly if using enterprise to compute the noise models rather than temponest. The P–P plot methodology is sensitive to very subtle errors in the posterior, including biases much smaller than the uncertainty on the results, so although the results are not perfect, we do not feel that it invalidates the results we present. Nevertheless, we feel that it is important that these tests are repeated on a wider scale across the IPTA, to fully understand the characteristics of the posteriors produced by our noise modelling code and increase confidence in the interpretation of the posterior distributions for the pulsar noise models and the GWB detection parameters, particularly for any pulsars with steep spectrum red noise.

Table 9

Uniform prior bounds for the P–P simulations for J1600–3053.

6 Conclusions

The noise models presented here are our current best estimates for stochastic noise in the 25-pulsar EPTA DR2 dataset. With the inclusion of wide-band receivers and additional low-frequency data from the InPTA collaboration, all but four pulsars show evidence for chromatic noise, specifically in the form of DM variations. However, several of these pulsars seem to have power-law noise with a much flatter spectrum than expected from DM variations in the turbulent IISM. This may be attributed to additional high-frequency terms, but it may also suggest other processes leaking into the model for DM variations. Compared to the EPTA DR1 and InPTA datasets, we find that the choice of preferred noise model can change, including a couple of cases where chromatic noise that was ruled out in EPTA DR1 becomes detected in EPTA DR2.

We feel that this is evidence that the assumptions typically made in noise modelling are not strictly true for all pulsars. Either there are additional processes that need to be considered, such as noise that is not purely modelled by a power law, or the spectral properties of the noise vary over time, or there are systematic instrumental effects that affect particular parts of the dataset. In reality, none of these assumptions are likely completely true, and the improvements in data quality and time span of the EPTA DR2 data particularly highlight these subtle effects.

The fact that the best noise models can depend on the frequency and time coverage of the dataset implies that great care is needed when comparing the noise models across different PTAs, and hence we suggest that a thorough investigation into the time and frequency stationary of the pulsar noise models, as well as exploration of additional noise terms, is best undertaken within the IPTA framework, which necessarily has better time and frequency coverage than any individual PTA.

We have also tested the estimation of hyperparameters from the noise model in temponest and enterprise and find that, as might be expected, they are highly consistent in results in real and simulated data. We also demonstrate that previously observed differences in model parameters when marginalising over the timing model parameters in temponest were due to a software bug and that marginalising over the timing model is entirely sufficient when trying to estimate the noise model hyperparameters.

Furthermore, we performed a ‘P–P plot’ test on our noise model hyperparameter estimation using simulated data and found that there may be two subtle biases in the results. For ‘realistic’ simulations where there is power at timescales longer than the observing span, a bias in the estimation of the red noise amplitude is observed when the spectral slope of the noise is greater than ~4. We further saw that even if we simulate noise that has the same periodic nature as the noise model, there is a very small bias (~2%) in the recovered spectral slope. Although these tests require large numbers of trials and, hence, large computing costs, we encourage further testing of this nature within the IPTA framework to attempt to better understand the origin and any possible effect of these biases.

It is important to note that although we find several areas of investigation for a better understanding of the pulsar noise processes, the custom noise models presented here model the noise in the EPTA dataset extremely well, and the majority of the pulsar noise models are consistent between all datasets that we have compared. We also demonstrate that our customised noise models improve our overall sensitivity to GWB signals over the ‘standard’ models for both the complete DR2full+ dataset as well as when focussing only on the new purpose-built EPTA instrumentation in DR2new.

thumbnail Fig. 6

From top left to bottom right: (a) P–P plot for ~2000 simulations with only chromatic noise, injected at much lower than 1/Tspan; (b) the confidence interval of the recovered achromatic amplitude for various levels of injected red noise slope for the simulations is shown in (a); (c) P–P plot for ~1000 simulations with only chromatic noise periodic on 1/Tspan; (d) P–P plot for ~1000 simulations with chromatic, achromatic and scattering variations, periodic on 1/Tspan.

Acknowledgements

The European Pulsar Timing Array (EPTA) is a collaboration between European and partner institutes, namely ASTRON (NL), INAF/Osservatorio di Cagliari (IT), Max-Planck-Institut für Radioastronomie (GER), Nançay/Paris Observatory (FRA), the University of Manchester (UK), the University of Birmingham (UK), the University of East Anglia (UK), the University of Bielefeld (GER), the University of Paris (FRA), the University of Milan-Bicocca (IT), the Foundation for Research and Technology, Hellas (GR), and Peking University (CHN), with the aim to provide high-precision pulsar timing to work towards the direct detection of low-frequency gravitational waves. An Advanced Grant of the European Research Council allowed to implement the Large European Array for Pulsars (LEAP) under Grant Agreement Number 227947 (PI M. Kramer). The EPTA is part of the International Pulsar Timing Array (IPTA); we thank our IPTA colleagues for their support and help with this paper and the external Detection Committee members for their work on the Detection Checklist. Part of this work is based on observations with the 100-m telescope of the Max-Planck-Institut für Radioastronomie (MPIfR) at Effelsberg in Germany. Pulsar research at the Jodrell Bank Centre for Astrophysics and the observations using the Lovell Telescope are supported by a Consolidated Grant (ST/T000414/1) from the UK’s Science and Technology Facilities Council (STFC). ICN is also supported by the STFC doctoral training grant ST/T506291/1. The Nançay radio Observatory is operated by the Paris Observatory, associated with the French Centre National de la Recherche Scientifique (CNRS), and partially supported by the Region Centre in France. We acknowledge financial support from “Programme National de Cosmologie and Galaxies” (PNCG), and “Programme National Hautes Energies” (PNHE) funded by CNRS/INSU-IN2P3-INP, CEA and CNES, France. We acknowledge financial support from Agence Nationale de la Recherche (ANR-18-CE31-0015), France. The Westerbork Synthesis Radio Telescope is operated by the Netherlands Institute for Radio Astronomy (ASTRON) with support from the Netherlands Foundation for Scientific Research (NWO). The Sardinia Radio Telescope (SRT) is funded by the Department of University and Research (MIUR), the Italian Space Agency (ASI), and the Autonomous Region of Sardinia (RAS) and is operated as a National Facility by the National Institute for Astrophysics (INAF). The work is supported by the National SKA programme of China (2020SKA0120100), Max-Planck Partner Group, NSFC 11690024, CAS Cultivation Project for FAST Scientific. This work is also supported as part of the “LEGACY” MPG-CAS collaboration on low-frequency gravitational wave astronomy. JA acknowledges support from the European Commission (Grant Agreement number: 101094354). JA and SCha were partially supported by the Stavros Niarchos Foundation (SNF) and the Hellenic Foundation for Research and Innovation (H.F.R.I.) under the 2nd Call of the “Science and Society – Action Always strive for excellence – Theodoras Papazoglou” (Project Number: 01431). AC acknowledges support from the Paris Île-de-France Region. AC, AF, ASe, ASa, EB, DI, GMS, MBo acknowledge financial support provided under the European Union’s H2020 ERC Consolidator Grant “Binary Massive Black Hole Astrophysics” (B Massive, Grant Agreement: 818691). GD, KLi, RK and MK acknowledge support from European Research Council (ERC) Synergy Grant “BlackHoleCam”, Grant Agreement Number 610058. This work is supported by the ERC Advanced Grant “LEAP”, Grant Agreement Number 227947 (PI M. Kramer). AV and PRB are supported by the UK’s Science and Technology Facilities Council (STFC; grant ST/W000946/1). AV also acknowledges the support of the Royal Society and Wolfson Foundation. JPWV acknowledges support by the Deutsche Forschungsgemeinschaft (DFG) through thew Heisenberg programme (Project No. 433075039) and by the NSF through AccelNet award #2114721. NKP is funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – Projektnummer PO 2758/1–1, through the Walter-Benjamin programme. ASa thanks the Alexander von Humboldt foundation in Germany for a Humboldt fellowship for postdoctoral researchers. APo, DP and MBu acknowledge support from the research grant “iPeska” (P.I. Andrea Possenti) funded under the INAF national call Prin-SKA/CTA approved with the Presidential Decree 70/2016 (Italy). RNC acknowledges financial support from the Special Account for Research Funds of the Hellenic Open University (ELKE-HOU) under the research programme “GRAVPUL” (K.E.-80383/grant agreement 319/10-10-2022: PI N. A. B. Gizani). EvdW, CGB and GHJ acknowledge support from the Dutch National Science Agenda, NWA Startimpuls – 400.17.608. BG is supported by the Italian Ministry of Education, University and Research within the PRIN 2017 Research Program Framework, n. 2017SYRTCN. LS acknowledges the use of the HPC system Cobra at the Max Planck Computing and Data Facility. The Indian Pulsar Timing Array (InPTA) is an Indo-Japanese collaboration that routinely employs TIFR’s upgraded Giant Metrewave Radio Telescope for monitoring a set of IPTA pulsars. BCJ, YG, YM, SD, AG and PR acknowledge the support of the Department of Atomic Energy, Government of India, under Project Identification # RTI 4002. BCJ, YG and YM acknowledge support of the Department of Atomic Energy, Government of India, under project No. 12-R&D-TFR-5.02-0700 while SD, AG and PR acknowledge support of the Department of Atomic Energy, Government of India, under project no. 12-R&D-TFR-5.02-0200. KT is partially supported by JSPS KAKENHI Grant Numbers 20H00180, 21H01130, and 21H04467, Bilateral Joint Research Projects of JSPS, and the ISM Cooperative Research Program (2021-ISMCRP-2017). AS is supported by the NANOGrav NSF Physics Frontiers Center (awards #1430284 and 2020265). AKP is supported by CSIR fellowship Grant number 09/0079(15784)/2022-EMR-I. SH is supported by JSPS KAKENHI Grant Number 20J20509. KN is supported by the Birla Institute of Technology & Science Institute fellowship. AmS is supported by CSIR fellowship Grant number 09/1001(12656)/2021-EMR-I and T-641 (DST-ICPS). TK is partially supported by the JSPS Overseas Challenge Program for Young Researchers. We acknowledge the National Supercomputing Mission (NSM) for providing computing resources of ‘PARAM Ganga’ at the Indian Institute of Technology Roorkee as well as ‘PARAM Seva’ at IIT Hyderabad, which is implemented by C-DAC and supported by the Ministry of Electronics and Information Technology (MeitY) and Department of Science and Technology (DST), Government of India. DD acknowledges the support from the Department of Atomic Energy, Government of India through Apex Project - Advance Research and Education in Mathematical Sciences at IMSc. The work presented here is a culmination of many years of data analysis as well as software and instrument development. In particular, we thank Drs. N. D’Amico, P. C. C. Freire, R. van Haasteren, C. Jordan, K. Lazaridis, P. Lazarus, L. Lentati, O. Löhmer and R. Smits for their past contributions. We also thank Dr. N. Wex for supporting the calculations of the galactic acceleration as well as the related discussions. The EPTA is also grateful to staff at its observatories and telescopes who have made the continued observations possible. We also thank the referee for their insightful and timely comments which has improved the quality of the manuscript. Author contributions: The European Pulsar Timing Array: JA, SB, ASBN, CGB, ABe, MBo, EB, PRB, MBu, RNC, AC, DJC, SCha, SChe, IC, GD, MF, RDF, AF, JRG, BG, EG, JMG, LG, YJG, HH, FI, DIV, JJan, JJaw, GHJ, AJ, RK, EFK, MJK, MK, MAK, KLa, KJL, KLi, YL, AGL, JWM, RAM, MBM, ICN, APa, BBPP, DP, APe, NKP, APo, HQL, ASa, SAS, ASe, GS, LS, RS, BWS, SCS, GT, CT, EvdW, AV, VVK, JPWV, JW, LW and ZW. The EPTA is a multi-decade effort and all authors have contributed through conceptualisation, funding acquisition, data-curation, methodology, software and hardware developments as well as (aspects of) the continued running of the observational campaigns, which includes writing and proofreading observing proposals, evaluating observations and observing systems, mentoring students, developing science cases. All authors also helped in (aspects of) verification of the data, analysis and results as well as in finalising the paper draft. Specific contributions from individual EPTA members are listed in the CRediT (https://credit.niso.org/) format below. The Indian Pulsar Timing Array: PA, SA, MB, ABa, SDa, DD, SDe, NDB, CD, YG, SH, BCJ, FK, DK, TK, NK, MAK, YM, KN, AKP, TP, PR, JS, ASr, MS, ASu, KT and PT. InPTA members contributed in uGMRT observations and data reduction to create the InPTA data set which is employed while assembling the DR2full+ and DR2new+ data sets. InPTA members contributed to the discussions that probed the impact of including InPTA data on single pulsar noise analysis. Furthermore, they provided quantitative comparisons of various noise models, wrote a brief description of the underlying Tensiometer package, and helped with the related interpretations. APa, AC, MJK equally share the correspondence of the paper. CRediT statement: Conceptualisation: AC, APa, APo, AV, BWS, CT, GMS, GT, JPWV, JWM, KJL, KLi, MJK, MK. Methodology: AC, APa, AV, DJC, GMS, IC, JWM, KJL, KLi, LG, MJK, MK, SB, SChe, VVK. Software: AC, AJ, APa, APe, GD, GMS, KJL, KLi, MJK, RK, SChe, VVK. Validation: AC, APa, BG, GMS, IC, JPWV, JWM, KLi, LG, MJK. Formal Analysis: AC, APa, BG, EvdW, GHJ, GMS, JWM, KLi, MJK. Investigation: AC, APa, APo, BWS, CGB, DJC, DP, GMS, IC, JPWV, JWM, KLi, LG, MBM, MBu, MJK, RK, VVK. Resources: AC, APa, APe, APo, BWS, GHJ, GMS, GT, IC, JPWV, JWM, KJL, KLi, LG, MJK, MK, RK. Data Curation: AC, AJ, APa, BWS, CGB, DJC, DP, EvdW, GHJ, GMS, JA, JWM, KLi, MBM, MJK, MK, NKP, RK, SChe. Writing - Original Draft: AC, APa, GMS, MJK. Writing - Review & Editing: AC, AF, APa, APo, BG, EB, EFK, GMS, GT, JA, JPWV, JWM, KLi, MJK, MK, SChe, VVK. Visualisation: AC, APa, GMS, KLi, MJK. Supervision: AC, APo, ASe, AV, BWS, CGB, DJC, EFK, GHJ, GT, JPWV, KJL, LG, MJK, MK, VVK. Project Administration: AC, APo, ASe, AV, BWS, CGB, CT, GHJ, GMS, GT, JPWV, JWM, LG, MJK, MK. Funding Acquisition: APe, APo, ASe, BWS, GHJ, GT, IC, LG, MJK, MK.

References

  1. Alam, M. F., Arzoumanian, Z., Baker, P. T., et al. 2021, ApJS, 252, 4 [Google Scholar]
  2. Antoniadis, J., Arzoumanian, Z., Babak, S., et al. 2022, MNRAS, 510, 4873 [NASA ADS] [CrossRef] [Google Scholar]
  3. Arzoumanian, Z., Baker, P. T., Blumer, H., et al. 2020, ApJ, 905, L34 [NASA ADS] [CrossRef] [Google Scholar]
  4. Ashton, G., Hübner, M., Lasky, P. D., et al. 2019, ApJS, 241, 27 [NASA ADS] [CrossRef] [Google Scholar]
  5. Brook, P. R., Karastergiou, A., Buchner, S., et al. 2014, ApJ, 780, L31 [Google Scholar]
  6. Caballero, R. N., Lee, K. J., Lentati, L., et al. 2016, MNRAS, 457, 4421 [NASA ADS] [CrossRef] [Google Scholar]
  7. Caballero, R. N., Guo, Y. J., Lee, K. J., et al. 2018, MNRAS, 481, 5501 [Google Scholar]
  8. Carlin, B. P., & Chib, S. 1995, J. R. Stat. Soc. Ser. B Methodol., 57, 473 [Google Scholar]
  9. Chalumeau, A., Babak, S., Petiteau, A., et al. 2022, MNRAS, 509, 5538 [Google Scholar]
  10. Chen, S., Caballero, R. N., Guo, Y. J., et al. 2021, MNRAS, 508, 4970 [NASA ADS] [CrossRef] [Google Scholar]
  11. CHIME/Pulsar Collaboration (Amiri, M., et al.) 2021, ApJS, 255, 5 [NASA ADS] [CrossRef] [Google Scholar]
  12. Coles, W. A., Kerr, M., Shannon, R. M., et al. 2015, ApJ, 808, 113 [NASA ADS] [CrossRef] [Google Scholar]
  13. Cook, S. R., Gelman, A., & Rubin, D. B. 2006, J. Comput. Graph. Stat., 15, 675 [CrossRef] [Google Scholar]
  14. Cordes, J. M., & Shannon, R. M. 2010, ArXiv e-prints [arXiv:1010.3785] [Google Scholar]
  15. Cordes, J. M., Shannon, R. M., & Stinebring, D. R. 2016, ApJ, 817, 16 [Google Scholar]
  16. Curylo, M., Pennucci, T. T., Bailes, M., et al. 2023, ApJ, 944, 128 [NASA ADS] [CrossRef] [Google Scholar]
  17. Demorest, P. B., Ferdman, R. D., Gonzalez, M. E., et al. 2013, ApJ, 762, 94 [Google Scholar]
  18. Desvignes, G., Caballero, R. N., Lentati, L., et al. 2016, MNRAS, 458, 3341 [Google Scholar]
  19. Donner, J. Y., Verbiest, J. P. W., Tiburzi, C., et al. 2020, A & A, 644, A153 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  20. Ellis, J., & van Haasteren, R. 2017, https://zenodo.org/record/1037579 [Google Scholar]
  21. Ellis, J. A., Vallisneri, M., Taylor, S. R., & Baker, P. T. 2019, Astrophysics Source Code Library [record ascl:1912.015] [Google Scholar]
  22. EPTA Collaboration (Antoniadis, J., et al.) 2023, A & A, 678, A48 (Paper I) [Google Scholar]
  23. EPTA Collaboration, & InPTA Collaboration (Antoniadis, J., et al.) 2023, A & A, 678, A50 (Paper III) [Google Scholar]
  24. Fermi-LAT Collaboration (Ajello, M., et al.) 2022, Science, 376, 521 [NASA ADS] [CrossRef] [Google Scholar]
  25. Feroz, F., & Hobson, M. P. 2008, MNRAS, 384, 449 [NASA ADS] [CrossRef] [Google Scholar]
  26. Foster, R. S., & Backer, D. C. 1990, ApJ, 361, 300 [NASA ADS] [CrossRef] [Google Scholar]
  27. Foster, R. S., & Cordes, J. M. 1990, ApJ, 364, 123 [NASA ADS] [CrossRef] [Google Scholar]
  28. Goncharov, B., Zhu, X.-J., & Thrane, E. 2020, MNRAS, 497, 3264 [NASA ADS] [CrossRef] [Google Scholar]
  29. Goncharov, B., Shannon, R. M., Reardon, D. J., et al. 2021, ApJ, 917, L19 [NASA ADS] [CrossRef] [Google Scholar]
  30. Goncharov, B., Thrane, E., Shannon, R. M., et al. 2022, ApJ, 932, L22 [NASA ADS] [CrossRef] [Google Scholar]
  31. Hazboun, J. S., Simon, J., Taylor, S. R., et al. 2020, ApJ, 890, 108 [NASA ADS] [CrossRef] [Google Scholar]
  32. Hee, W. S., Khor, W. Y., Lim, H. S., & Jafri, M. Z. M. 2015, AIP Conf. Ser., 1657, 040015 [NASA ADS] [Google Scholar]
  33. Hellings, R. W., & Downs, G. S. 1983, ApJ, 265, L39 [NASA ADS] [CrossRef] [Google Scholar]
  34. Hobbs, G., Guo, L., Caballero, R. N., et al. 2020, MNRAS, 491, 5951 [Google Scholar]
  35. Jones, P. B. 1990, MNRAS, 246, 364 [NASA ADS] [Google Scholar]
  36. Kass, R. E., & Raftery, A. E. 1995, J. Am. Stat. Assoc., 90, 773 [Google Scholar]
  37. Keith, M. J., & Nitu, I. C. 2023, MNRAS, 523, 4603 [NASA ADS] [Google Scholar]
  38. Lam, M. T., Ellis, J. A., Grillo, G., et al. 2018, ApJ, 861, 132 [NASA ADS] [CrossRef] [Google Scholar]
  39. Lentati, L., Alexander, P., Hobson, M. P., et al. 2013, Phys. Rev. D, 87, 104021 [NASA ADS] [CrossRef] [Google Scholar]
  40. Lentati, L., Alexander, P., Hobson, M. P., et al. 2014, MNRAS, 437, 3004 [NASA ADS] [CrossRef] [Google Scholar]
  41. Lentati, L., Shannon, R. M., Coles, W. A., et al. 2016, MNRAS, 458, 2161 [Google Scholar]
  42. Liu, K., Verbiest, J. P. W., Kramer, M., et al. 2011, MNRAS, 417, 2916 [Google Scholar]
  43. Liu, K., Keane, E. F., Lee, K. J., et al. 2012, MNRAS, 420, 361 [NASA ADS] [CrossRef] [Google Scholar]
  44. Liu, K., Karuppusamy, R., Lee, K. J., et al. 2015, MNRAS, 449, 1158 [NASA ADS] [CrossRef] [Google Scholar]
  45. Liu, K., Guillemot, L., Istrate, A. G., et al. 2020, MNRAS, 499, 2276 [NASA ADS] [CrossRef] [Google Scholar]
  46. Lyne, A., Hobbs, G., Kramer, M., Stairs, I., & Stappers, B. 2010, Science, 329, 408 [NASA ADS] [CrossRef] [Google Scholar]
  47. Madison, D. R., Cordes, J. M., Arzoumanian, Z., et al. 2019, ApJ, 872, 150 [Google Scholar]
  48. Main, R. A., Antoniadis, J., Chen, S., et al. 2023, MNRAS, 525, 1079 [NASA ADS] [CrossRef] [Google Scholar]
  49. Manchester, R. N., Hobbs, G., Bailes, M., et al. 2013, PASA, 30, e017 [NASA ADS] [CrossRef] [Google Scholar]
  50. Padmanabh, P. V., Barr, E. D., Champion, D. J., et al. 2020, MNRAS, 500, 1178 [NASA ADS] [CrossRef] [Google Scholar]
  51. Parthasarathy, A., Shannon, R. M., Johnston, S., et al. 2019, MNRAS, 489, 3810 [Google Scholar]
  52. Parthasarathy, A., Bailes, M., Shannon, R. M., et al. 2021, MNRAS, 502, 407 [NASA ADS] [CrossRef] [Google Scholar]
  53. Raveri, M., & Doux, C. 2021, Phys. Rev. D, 104, 043504 [NASA ADS] [CrossRef] [Google Scholar]
  54. Shannon, R. M., & Cordes, J. M. 2010, ApJ, 725, 1607 [NASA ADS] [CrossRef] [Google Scholar]
  55. Shannon, R. M., & Cordes, J. M. 2017, MNRAS, 464, 2075 [NASA ADS] [CrossRef] [Google Scholar]
  56. Shannon, R. M., Lentati, L. T., Kerr, M., et al. 2016, ApJ, 828, L1 [NASA ADS] [CrossRef] [Google Scholar]
  57. Shaw, B., Stappers, B. W., Weltevrede, P., et al. 2022, MNRAS, 513, 5861 [NASA ADS] [CrossRef] [Google Scholar]
  58. Singha, J., Surnis, M. P., Joshi, B. C., et al. 2021, MNRAS, 507, L57 [NASA ADS] [CrossRef] [Google Scholar]
  59. Sivia, D. S., & Skilling, J. 2006, Data Analysis - A Bayesian Tutorial, 2nd edn. (Oxford University Press) [Google Scholar]
  60. Tarafdar, P., Nobleson, K., Rana, P., et al. 2022, PASA, 39, e053 [NASA ADS] [CrossRef] [Google Scholar]
  61. Taylor, S. R., Mingarelli, C. M. F., Gair, J. R., et al. 2015, Phys. Rev. Lett., 115, 041101 [NASA ADS] [CrossRef] [Google Scholar]
  62. Taylor, S. R., van Haasteren, R., & Sesana, A. 2020, Phys. Rev. D, 102, 084039 [NASA ADS] [CrossRef] [Google Scholar]
  63. Tiburzi, C., Shaifullah, G. M., Bassa, C. G., et al. 2021, A & A, 647, A84 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  64. van Haasteren, R., & Vallisneri, M. 2014, Phys. Rev. D, 90, 104012 [NASA ADS] [CrossRef] [Google Scholar]
  65. van Haasteren, R., & Vallisneri, M. 2015, MNRAS, 446, 1170 [NASA ADS] [CrossRef] [Google Scholar]
  66. Xu, H., Huang, Y. X., Burgay, M., et al. 2021, ATel, 14642, 1 [NASA ADS] [Google Scholar]
  67. Zhu, W. W., Stairs, I. H., Demorest, P. B., et al. 2015, ApJ, 809, 41 [NASA ADS] [CrossRef] [Google Scholar]
  68. Zic, A., Hobbs, G., Shannon, R. M., et al. 2022, MNRAS, 516, 410 [NASA ADS] [CrossRef] [Google Scholar]

1

The solar wind amplitude is measured at 1 au.

2

However, some PTA collaborations use different notations.

8

commit hash: e745752.

All Tables

Table 1

Noise model constants and hyperparameter priors.

Table 2

Parameter priors for the two exponential dips included in the noise models for PSR J1713+0747.

Table 3

Favoured models listed for the 25 pulsars using both EPTA and InPTA data (for 10 common MSPs; DR2full+) along with estimated values for chromatic and achromatic noise models.

Table 4

Same as in Table 3, but for DR2new.

Table 5

Abbreviations for the different datasets used in this paper.

Table 6

Estimated tension (Z-score in sigma) between the DR2full and DR2full+ datasets for the red and DM noise models.

Table 7

Validation and performance statistics for the customised noise models with the “DR2full+” and “DR2new”.

Table 8

Tension metrics for the red and DM noise models estimated using enterprise and temponest.

Table 9

Uniform prior bounds for the P–P simulations for J1600–3053.

All Figures

thumbnail Fig. 1

100 random realisations (blue) and medians for each epoch (black) of the RN time-delay reconstructions for PSRs J0900–3144 (left) and J1012+5307 (right) using the most favoured noise models with the DR2new dataset. The short-timescale stochastic signals seen at a level for these two pulsars still have unknown origins.

In the text
thumbnail Fig. 2

Marginalised posterior distributions of noise hyperparameters obtained with the EPTA DR1, DR2full and DR2new. Top left: PSR J1730–2304 falls in Category 1, where the DR2full/New datasets are able to constrain chromatic noise (DMAmp, DMSlope). Top right: For PSR J1857+0943, belonging to Category 2, it is clear that the DR1 data could not disentangle chromatic from achromatic noise (RedAmp, RedSlope), which is resolved better with the DR2 datasets. Bottom left: PSR J1713+0747 in Category 3, shows clear inconsistencies in the achromatic noise estimates. Bottom right: PSR J1909–3744 shows a much more complicated behaviour as described in the text motivating its inclusion in Category 3.

In the text
thumbnail Fig. 3

100 random realisations (blue) and medians for each epoch of the time-domain reconstructions of the DM variations delays at 1.4 GHz radio frequency modelled as a Gaussian process for PSRs J0613–0200 (left) and J1909–3744 (right), with the “DR2full+”. The black dashed lines display the last epochs of DR2full (that is, EPTA only). The inclusion of InPTA data allows to measure sharp changes at after MJD 59000. The delays are obtained from marginalising the timing model parameters which comprise the DM constant and the first two time derivatives of the dispersion measure (DM1, DM2).

In the text
thumbnail Fig. 4

Scattering delay variations measurement for PSR J1600–3053 with DR2new. Left plot: 100 random realisations (blue) of the time-domain reconstructions of the scattering delay variations at 1.4 GHz radio frequency modelled as a Gaussian process with DR2new. Independent scattering delays measured from scintillation analysis with the LEAP data (Main et al. 2023) are shown by the black points. Right plot: chromatic index posterior distributions measured for the DM (orange) and the SV (blue) with DR2new, while fixing the chromatic index of the other component. The favoured model for this dataset includes both DM and SV. The black dotted lines emphasise the expected values for DM and SV, respectively at 2 and 4.

In the text
thumbnail Fig. 5

Red and DM noise models for PSR J1744–1134 using DR2full and DR2full+ datasets. The inclusion of InPTA data allows a better constraint on the achromatic noise.

In the text
thumbnail Fig. 6

From top left to bottom right: (a) P–P plot for ~2000 simulations with only chromatic noise, injected at much lower than 1/Tspan; (b) the confidence interval of the recovered achromatic amplitude for various levels of injected red noise slope for the simulations is shown in (a); (c) P–P plot for ~1000 simulations with only chromatic noise periodic on 1/Tspan; (d) P–P plot for ~1000 simulations with chromatic, achromatic and scattering variations, periodic on 1/Tspan.

In the text

Current usage metrics show cumulative count of Article Views (full-text article views including HTML views, PDF and ePub downloads, according to the available data) and Abstracts Views on Vision4Press platform.

Data correspond to usage on the plateform after 2015. The current usage metrics is available 48-96 hours after online publication and is updated daily on week days.

Initial download of the metrics may take a while.