Data span and frequency coverage requirements for robust detection and inference in pulsar timing arrays

Irene Ferranti; Mikel Falxa; Federico Fantoccoli; Alberto Sesana; Golam Shaifullah

doi:10.1051/0004-6361/202558270

Home

All issues

Volume 709 (May 2026)

A&A, 709 (2026) A256

Full HTML

Open Access

Issue		A&A Volume 709, May 2026


Article Number		A256
Number of page(s)		15
Section		Astrophysical processes
DOI		https://doi.org/10.1051/0004-6361/202558270
Published online		25 May 2026

A&A, 709, A256 (2026)

A case study with EPTA `DR2`

Irene Ferranti¹^,2^★, Mikel Falxa¹, Federico Fantoccoli⁵, Alberto Sesana¹^,2^,3 and Golam Shaifullah¹^,2^,4

¹ Dipartimento di Fisica “G. Occhialini”, Universitá degli Studi di Milano-Bicocca, Piazza della Scienza 3, I-20126 Milano, Italy
² INFN, Sezione di Milano-Bicocca, Piazza della Scienza 3, I-20126 Milano, Italy
³ INAF – Osservatorio Astronomico di Brera, Via Brera 20, I-20121 Milano, Italy
⁴ INAF – Osservatorio Astronomico di Cagliari, Via della Scienza 5, 09047 Selargius (CA), Italy
⁵ Max Planck Institute for Gravitational Physics (Albert Einstein Institute), Am Mühlenberg 1, D-14476 Potsdam, Germany

^★ Corresponding author: This email address is being protected from spambots. You need JavaScript enabled to view it.

Received: 26 November 2025
Accepted: 4 March 2026

Abstract

Pulsar timing arrays (PTAs) are approaching the level of sensitivity required to make a 5-σ detection of the nanohertz stochastic gravitational-wave background (GWB). Thus, it is now crucial to develop a comprehensive understanding of our data and of the outcomes of our analysis pipelines. It is also essential to understand a counterintuitive feature revealed in the recent results from the European Pulsar Timing Array (EPTA) second data release (DR2). When restricting the dataset to its ultimate ∼10.3 years (DR2new), the inferred GWB significance increases from ≲2σ for the full 25-year dataset (DR2full), to ≳3.5σ for DR2new. In this work, we investigate whether this behaviour reflects an anomaly in the data or whether it is a possible outcome of the analysis pipeline. Using realistic DR2-like simulations, we generated multiple realisations with varying observation time spans and analysed their impact on GWB evidence and parameter estimations. We quantified the evidence using the signal-to-noise ratio (S/N) of a common process with Hellings-Downs spatial correlations (HD S/N). We find that the first ∼10 years of DR2 contribute little to the GWB evidence due to the limited bandwidth of the observation frequency, leading to a significant overlap between the DR2full and DR2new HD S/N distributions. As a consequence, random noise fluctuations result in DR2new producing a higher GWB significance than DR2full in ∼15% of the cases and 5% of the realisations are compatible with the HD S/N of the real DR2full and DR2new. This suggests that the picture observed in the data, while unlikely, remains consistent with a ∼2σ outcome due to noise fluctuations. We also find that regardless of the significance, the DR2new simulated data yielded biased GWB parameter estimates, primarily due to spectral leakage effects that are disregarded in the analysis and tend to flatten the inferred power-law spectrum. Including leakage in the model returns unbiased parameter estimates, demonstrating that DR2new is reliable when the signal is appropriately modelled. Furthermore, we show that combining EPTA DR2full data with complementary long-baseline observations from NANOGrav and PPTA and with low-frequency observations from LOFAR and NenuFAR significantly bolsters the GWB evidence, along with improved precision and accuracy in the parameter estimation. This outcome supports the case of integrating DR2full within the IPTA framework. Finally, we explored the impact of the observation time span on parameter estimations in greater detail, focusing on long-baseline (25 yr) and short-baseline (5 yr) datasets. We find that short-baseline datasets tend to introduce significant bias towards high amplitudes in the estimation of GWB parameters, while the short time span makes them very ineffective at constraining the GWB slope.

Key words: gravitational waves / methods: data analysis / methods: statistical

© The Authors 2026

Open Access article, published by EDP Sciences, under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

This article is published in open access under the Subscribe to Open model. This email address is being protected from spambots. You need JavaScript enabled to view it. to support open access publication.

1. Introduction

At nanohertz (nHz) frequencies, a stochastic gravitational-wave background (GWB) is expected to arise from a cosmic population of supermassive black hole binaries (SMBHBs; Rajagopal & Romani 1995; Jaffe & Backer 2003; Wyithe & Loeb 2003; Sesana et al. 2008) as well as from a variety of processes occurring in the early Universe (see Caprini & Figueroa 2018; Afzal et al. 2023; EPTA and InPTA Collaboration 2024, and references therein). By conducting high-precision timing of millisecond pulsars, pulsar timing arrays (PTAs; Foster & Backer 1990) produce datasets sensitive enough to detect tiny perturbations in space-time curvature caused by passing GWs. This, in turn, makes them sensitive to potential nHz GWBs. Recently, the European PTA (EPTA), in partnership with the Indian PTA (InPTA; EPTA and InPTA Collaboration 2023a), along with the North American Nanohertz Observatory for Gravitational Waves (NANOGrav; Agazie et al. 2023), Parkes PTA (PPTA; Reardon et al. 2023), Chinese PTA (CPTA; Xu et al. 2023), and MeerKAT Pulsar Timing Array (MPTA; Miles et al. 2025) reported evidence of a GWB-like signal with a significance of 1–4.6σ (depending on the dataset). This report has effectively opened the first observational window onto the nHz GW sky.

A PTA dataset consists of timing residuals, δt, obtained by subtracting a best-fit timing model (TM) from the observed pulses’ times of arrival (TOAs; e.g. Edwards et al. 2006; EPTA and InPTA Collaboration 2023c). The TM accounts for all deterministic physical effects between pulse emission and reception. Therefore, timing residuals contain any unmodelled signals, including stochastic processes (e.g. noise sources) and potentially a GWB. Consequently, data analysis pipelines are designed to fit for both candidate GW signals and pulsar noise simultaneously (Agazie et al. 2023; EPTA and InPTA Collaboration 2023b). This simultaneous fit is particularly challenging, because pulsar observations are affected by numerous subtle stochastic noise sources, which are usually strongly correlated with the GW signal (e.g. Tiburzi et al. 2016; Verbiest & Shaifullah 2018). The three dominant sources of noise are: (i) white noise (WN); (ii) red noise (RN), which arises from random variations in pulsar spin rate (e.g. Lentati et al. 2016); and (iii) dispersion measure (DM) noise, which is caused by fluctuations in the electron density of the interstellar medium along the line of sight (e.g. Lorimer & Kramer 2004). Overall, RN, DM, and GWB are all time-correlated noise processes with red spectra, which can be distinguished through their distinctive signatures. In particular, the GWB is common to all pulsars, achromatic (radio frequency-independent) and exhibits a quadrupolar angular correlation (Hellings & Downs 1983); RN varies from pulsar to pulsar, lacks spatial correlation and is achromatic; and DM noise is unique to each pulsar and chromatic, scaling in amplitude as ν⁻² with observing frequency (e.g. Lorimer & Kramer 2004). In principle, a PTA with many well-timed pulsars spread across the sky and observed over a wide frequency range can disentangle these contributions. In practice, however, current PTAs fall short of these ideal conditions. For example, the current evidence for GWB is dominated by only a few pulsars (see Speri et al. 2023) and the frequency coverage, particularly in older legacy datasets, remains limited. Indeed, as demonstrated in Ferranti et al. (2025), the scarce frequency coverage in the second data release of the EPTA strongly limits the evidence for a GWB signal. Quite surprisingly, in the real data, this limitation appears to be overcome once the oldest data are removed. Indeed, the latest data release presented the results for two different versions of the EPTA DR2: DR2full, which contains all the backends and has a timespan of ∼25 years, along with DR2new, which is limited to the last 10.3 years of data for all the pulsars. The result is that the GWB significance obtained with DR2full is ≲2σ, while with DR2new it is ≳3.5σ. This cut was made to exclude the mono-frequency measurements present in the first half of the data (the frequency band distribution of DR2full is displayed in the middle panel of Fig. 2).

The main objective of this paper is to determine whether the counterintuitive increase in GWB significance obtained by cutting the data is an anomaly caused by the dataset itself or whether it is indeed a possible outcome of the analysis pipeline. Here, we tackle this main question by simulating and analysing synthetic replicas of EPTA DR2full and DR2new.

The paper is organised as follows. The simulation details and analysis techniques adopted are presented in Sect. 2. The main results of our investigation are presented in Sect. 3. In particular, in Sect. 3.1.1 we present detail results on the significance of GWB recovery in 100 mock realisations of DR2full and DR2new. In Sect. 3.1.2, we also evaluate the performance of the GWB parameter estimation in the two datasets, examining the effect of the spectral leakage resulting from the limited timespan of DR2new. This evaluation is conducted using the technique developed in Crisostomi et al. (2025, Sect. 3.1.3). In Sect. 3.2, we investigate how much the evaluation of the significance and the parameter estimation can be improved by adding to the 25 pulsar of EPTA DR2full an IPTA-like frequency coverage. Finally, in Sect. 3.3, we study the behaviour of very short observation datasets (5 yr of observation time), comparable to those recently presented by CPTA and MeerKAT, in the GWB parameter estimation. We summarise our conclusions in Sect. 4.

2. Simulations and analysis tools

2.1. Reference datasets

The reference datasets are the EPTA DR2full and EPTA DR2new, which are the 24.8 yr and 10.3 yr version of the EPTA DR2. DR2new was obtained from DR2full by removing the so-called legacy data, recorded by the oldest backends. EPTA data are taken from the major European radio telescopes: Effelsberg 100-m radio telescope (EFF) in Germany, 76-m Lovell Telescope, and Mark II Telescope at Jodrell Bank Observatory (JBO) in the United Kingdom, along with the large radio telescope operated by the Nançay Radio Observatory (NRT) in France, 64-m Sardinia Radio Telescope (SRT) operated by the Italian National Institute for Astrophysics (INAF) through the Astronomical Observatory of Cagliari (OAC), and Westerbork Synthesis Radio Telescope (WSRT) operated by ASTRON, the Netherlands Institute for Radio Astronomy. These telescopes also operate together as the Large European Array for Pulsars (LEAP), which gives an equivalent diameter of up to 194 m (Bassa et al. 2016).

2.2. Simulation of a realistic PTA dataset

This section briefly describes the main steps involved in creating simulations that are meant to be as realistic as possible. For a more detailed explanation, we refer to Ferranti et al. (2025).

2.2.1. Simulating realistic times of arrival

The simulation pipeline begins by replicating the key characteristics of each pulsar’s observations, including observation times, number of observations, and cadence. The temporal variations in cadence and the large gaps present in some pulsars’ TOAs are also reproduced. Each simulated epoch is assigned the same radio frequency as in the corresponding real observation, along with the associated observatory, backend, and timing error. The data are then generated using the libstempo package (Vallisneri 2020), adopting the timing model (TM) from EPTA and InPTA Collaboration (2023c). The same package is then used to inject the noise budget, as described below.

2.2.2. Simulating realistic noise budget

We modelled three noise components, white noise, red noise, and dispersion measure¹, all treated as stationary Gaussian processes. As such, these components are fully described by their covariance matrices. WN is uncorrelated in time, resulting in a diagonal covariance matrix with elements given by

$\begin{matrix} C_{ii}^{WN} = σ_{i}^{2} = {EFAC}_{i}^{2} σ_{ToA}^{2} + {EQUAD}_{i}^{2}, \end{matrix}$ $Mathematical equation: $$ \begin{aligned} C_{ii}^\mathrm{WN}=\sigma _i^2 = \mathrm{EFAC}_i^2 \, \sigma _{\mathrm{ToA}}^2 + \mathrm{EQUAD}_i^2, \end{aligned} $$$ (1)

where σ_ToA is the template-fitting timing uncertainty, EFAC is a multiplicative factor applied to the ToA errors, and EQUAD accounts for additional backend-dependent white noise. Additional noise processes acting at the single-pulse level, such as variations in pulse shape or radio-frequency interference (RFI), are not explicitly simulated since our simulations generate ToAs directly. Nevertheless, we expect these effects to contribute to the overall noise budget, since they influence the estimation of EFAC and EQUAD in real data. Therefore, by generating the ToAs using the EFAC and EQUAD values derived from real-data noise dictionaries, we implicitly account for these additional noise contributions in our simulations. RN and DM variations, in contrast, are time-correlated and modelled in Fourier space by means of sine and cosine basis functions with normally distributed coefficients determined by their power spectral density, S(f),

$\begin{matrix} S (f ; A, γ) = \frac{A^{2}}{12 π^{2}} (\frac{f}{{yr}^{- 1}})^{- γ} {yr}^{3} . \end{matrix}$ $Mathematical equation: $$ \begin{aligned} S(f; A, \gamma ) = \frac{A^2}{12\pi ^2}\biggl (\frac{f}{\mathrm{yr}^{-1}}\biggr )^{-\gamma }\mathrm{yr}^3. \end{aligned} $$$ (2)

Noise realisations are produced by randomly drawing Fourier coefficients around this power law S(f), generating a covariance matrix that sums contributions from different frequencies and incorporates a chromatic index i_d, with i_d = 0 for achromatic RN and i_d = 2 for DM noise,

$\begin{matrix} \begin{matrix} C_{ij}^{RN / DM} = \sum_{k} & S (f_{k}) Δ f (\frac{ν_{ref}}{ν_{i}})^{i_{d}} (\frac{ν_{ref}}{ν_{j}})^{i_{d}} \cdot cos (2 π f_{k} (t_{i} - t_{j})), \end{matrix} \end{matrix}$ $Mathematical equation: $$ \begin{aligned} \begin{aligned} C_{ij}^\mathrm{RN/DM} = \sum _k&S(f_k) \Delta f \bigg (\frac{\nu _{\rm ref}}{\nu _i} \bigg )^{i_{d}} \bigg (\frac{\nu _{\rm ref}}{\nu _j} \bigg )^{i_{d}} \cdot \cos (2\pi f_k (t_i - t_j)), \end{aligned} \end{aligned} $$$ (3)

where ν_i is the observed radio frequency at the TOA, t_i. The amplitude and slope used in the S(f) for the injection are the maximum likelihood values obtained from the single pulsar noise analysis (SPNA) on real EPTA DR2 data (EPTA and InPTA Collaboration 2023b). The number of Fourier components f_k = k/T_obs used for the injection is also taken from the SPNA. This approach ensures that the simulated timing data reproduce the statistical characteristics of real pulsar observations.

2.2.3. Simulating a realistic GWB

To simulate a GWB from SMBHBs, the injection pipeline generates a population of circular binaries, calculates the timing variations each induces on pulsar TOAs, and sums these delays. The population is sampled from the distribution, d³N/(dz dM df_r), which is the number of emitting sources per unit redshift, mass and frequency and is based on observation-driven models (Sesana 2013; Rosado et al. 2015). The power spectrum of such a population is

$\begin{matrix} h_{c}^{2} (f_{i}) = \sum_{j \in Δ f_{i}} h_{j}^{2} (f_{r}) \frac{f_{r}}{Δ f_{i}}, \end{matrix}$ $Mathematical equation: $$ \begin{aligned} h_c^2(f_i) = \sum _{j \in \Delta f_i} h_j^2(f_r)\frac{f_r}{\Delta f_i}, \end{aligned} $$$ (4)

where f_i is the bin’s central frequency and h_j(f_r) is the strain of each source, given by

$\begin{matrix} h (f_{r}) = 2 \sqrt{\frac{a {(i)}^{2} + b {(i)}^{2}}{2}} \frac{{(G M)}^{5 / 3} {(π f_{r})}^{2 / 3}}{c^{4} d (z)}, \end{matrix}$ $Mathematical equation: $$ \begin{aligned} h(f_r) = 2 \sqrt{\frac{a(i)^2 + b(i)^2}{2}} \frac{(G\mathcal{M} )^{5/3}(\pi f_r)^{2/3}}{c^4 d(z)}, \end{aligned} $$$ (5)

with a(i) = 1 + cos²i, b(i) = − 2 cos i, and d(z) the comoving distance,

$\begin{matrix} d (z) = \frac{c}{H_{0}} \int_{0}^{z} \frac{1}{\sqrt{Ω_{m} {(1 + z^{'})}^{3} + Ω_{Λ}}} d z^{'} . \end{matrix}$ $Mathematical equation: $$ \begin{aligned} d(z) = \frac{c}{H_0}\int _0^z \frac{1}{\sqrt{\Omega _m(1+z^{\prime })^3 + \Omega _\Lambda }}\,dz^{\prime }. \end{aligned} $$$ (6)

In the time domain, the GW signal emitted by an inspiraling SMBHB is described by eight parameters: chirp mass, ℳ, GW frequency, f_r, redshift, z, inclination, i, polarisation, ψ, phase, Φ₀, and sky position, (θ, ϕ). Here, ℳ, f_r, and z were drawn from d³N/(dz dℳ df_r), while the remaining angular parameters were drawn such that the source sky location (θ, ϕ) is random on the sphere and i is randomly oriented on the sphere, while ψ and Φ₀ are uniform in the ranges (0, π) and (0, 2π), respectively. Once the eight parameters of each binary in the SMBHB population were generated, the corresponding timing variations (accounting for Earth and pulsar terms) were computed following Ellis (2013) and summed in the time domain to yield the total GWB signal.

For this study, a single population of circular SMBHBs was used, generating a power spectrum with A_GWB = 2.4 × 10⁻¹⁵ when α_GWB is fixed at −2/3. These values are consistent with the constraints from EPTA DR2 (EPTA and InPTA Collaboration 2023a). The total spectrum together with the contribution of each binary in the population and the power law approximation is displayed in Fig. 1. A schematic summary of all the simulated datasets used and their description can be found in Table 1.

Fig. 1.

Characteristic strain spectrum of the injected GWB. Purple stars are the contributions to the spectrum of all the SMBHBs in the populations. The black solid line is the total power spectrum in each frequency bin, where the frequency bins start at f = 1/T_obs and have width Δf = 1/T_obs with T_obs = 24.8 yr. The power law approximation of the spectrum is also shown.

Table 1.

Simulated datasets used in this work.

2.3. Data analysis

We used standard PTA analysis techniques, based on the assumption of a Gaussian and stationary noise, and we adopted a Gaussian likelihood. Following van Haasteren & Vallisneri (2014), first-order timing model (TM) parameter uncertainties are analytically marginalised, yielding the TM-marginalised likelihood,

$\begin{matrix} ln L (δ t | θ) \propto - \frac{1}{2} δ t^{⊤} C^{- 1} δ t - \frac{1}{2} ln det C, \end{matrix}$ $Mathematical equation: $$ \begin{aligned} \ln \mathcal{L} (\delta t|\theta ) \propto -\frac{1}{2}\delta t^\top C^{-1} \delta t - \frac{1}{2} \ln \det C, \end{aligned} $$$ (7)

where $δ t = \sum_{a = 1}^{N_{PSR}} δ t_{a}$ $Mathematical equation: $ \delta t = \sum_{a = 1}^{N_{\mathrm{PSR}}} \delta t_a $$ and the covariance matrix is C = C_aδ_ab + C_hΓ_ab. Here, C_h describes an achromatic common red noise (RN) correlated between pulsars via the Hellings–Downs curve (Hellings & Downs 1983):

$\begin{matrix} Γ_{ab} = \frac{1}{3} - \frac{1}{6} \frac{1 - cos γ_{ab}}{2} + \frac{1 - cos γ_{ab}}{2} ln (\frac{1 - cos γ_{ab}}{2}), \end{matrix}$ $Mathematical equation: $$ \begin{aligned} \Gamma _{ab} = \frac{1}{3} - \frac{1}{6} \frac{1-\cos \gamma _{ab}}{2} + \frac{1-\cos \gamma _{ab}}{2} \ln \left( \frac{1-\cos \gamma _{ab}}{2} \right), \end{aligned} $$$ (8)

with γ_ab the angular separation between the pair of pulsars a and b. The power spectral density of this common red noise process is modelled as a power law, which is directly related to the GWB strain via

$\begin{matrix} S (f) = \frac{h_{c}^{2} (f)}{12 π^{2} f^{3}} = \frac{A_{yr}^{2}}{12 π^{2}} {(\frac{f}{1 {yr}^{- 1}})}^{- γ} \cdot {yr}^{3}, \end{matrix}$ $Mathematical equation: $$ \begin{aligned} S(f) = \frac{h_c^2(f)}{12\pi ^2f^3} = \frac{A_{\rm yr}^2}{12\pi ^2}\left( \frac{f}{\mathrm{1\,yr^{-1}}} \right)^{-\gamma }\cdot \mathrm{yr}^{3} ,\end{aligned} $$$ (9)

where γ = 3 − 2α. The number of components used in the common red noise model is set such that the maximum frequency is the highest frequency bin below f = 1 yr⁻¹; this means that for the 25 yr datasets, the number of components used is 24, for the 10yr is 9, and for the 5 yr is 4. The term C_a represents the noise in pulsar, a:

$\begin{matrix} C_{a} (t_{i}, t_{j}) = σ_{i}^{2} δ_{ij} + C^{RN} (t_{i} - t_{j}) + C^{DM} (t_{i} - t_{j}), \end{matrix}$ $Mathematical equation: $$ \begin{aligned} C_a(t_i, t_j) = \sigma _i^2 \delta _{ij} + C^\mathrm{RN}(t_i-t_j) + C^\mathrm{DM}(t_i-t_j), \end{aligned} $$$ (10)

where σ_i is the white noise level at t_i, while C^RN and C^DM are the red noise and dispersion measure covariance matrices given by Eq. (3). Rather than assuming a power-law spectrum, the PSD in each frequency bin k/T_obs of width Δf = 1/T_obs can be treated as a free parameter. This analysis is commonly referred to as free spectrum analysis.

Given the likelihood, the analysis is run in a Bayesian framework, where the posterior probability distribution of the model parameters, θ, is obtained by updating our prior knowledge, π(θ), with the information provided by the data. This is done by multiplying the prior with the likelihood function ℒ(δt|θ), which quantifies the probability of the observed timing data, δt, given θ. The resulting posterior probability is

$\begin{matrix} p (θ | δ t) = \frac{L (δ t | θ) π (θ)}{\int L (δ t | θ) π (θ) d θ}, \end{matrix}$ $Mathematical equation: $$ \begin{aligned} p(\boldsymbol{\theta }|\delta t) = \frac{\mathcal{L} (\delta t|\boldsymbol{\theta })\,\pi (\boldsymbol{\theta })}{\int \mathcal{L} (\delta t|\boldsymbol{\theta })\,\pi (\boldsymbol{\theta })\,d\theta }, \end{aligned} $$$ (11)

which expresses the probability of the parameters θ conditioned on the timing data, δt. The denominator is the normalisation factor known as Bayesian evidence. To estimate the posterior distribution from the simulated data, we use the likelihood model implemented in ENTERPRISE_EXTENSIONS, sampled with a Markov chain Monte Carlo (MCMC) algorithm. To evaluate the significance of the GWB signal, we use the noise-marginalised optimal statistic described in Vigeland et al. (2018), which is a commonly employed hybrid of Bayesian and frequentist techniques.

3. Results

3.1. Understanding DR2full and DR2new results

In this section, we study the behaviour of the DR2full and DR2new simulations and compare them to the real DR2full and DR2new datasets. To this end, we simulate 100 realisations of a 24.8-year dataset that closely resembles the real DR2full one, as described in Sect. 2.2. These 100 realisations are obtained by injecting the same GWB signal (whose spectrum is shown in Fig. 1) each time with a different statistical realisation of the noise. Shorter versions of DR2 are then obtained by selecting the TOAs from a chosen epoch to the most recent TOA, such that the cut at 10yr corresponds to DR2new. Since the impact of the frequency coverage appears crucial for understanding the behaviour of DR2full and DR2new, we also created a version of DR2 with the same observation epochs but homogeneous coverage of observation frequencies throughout the entire observation time, T_obs. This was achieved by assigning a randomly chosen radio frequency to each ToA, from those available for that pulsar. This second version of the dataset is referred to as DR2FC, where FC stands for frequency coverage. The difference in frequency coverage between the two datasets is displayed in Fig. 2. We note that DR2FC contains the same GWB, as well as the same 100 noise realisations that were injected into DR2.

Fig. 2.

Top: S/N distribution computed from the 100 realisations of DR2 (orange) and DR2FC (green). Bottom: difference in frequency coverage between the two kinds of datasets. The TOAs from all the pulsars are plotted together to highlight the drop in the frequency coverage occurring in DR2 at 15 yr from now, which corresponds to the plateau of its S/N.

3.1.1. GWB significance and the impact of the frequency coverage

Figure 2 shows the distribution of the Hellings-Downs signal-to-noise ratio (HD S/N) for each of the 100 realisations of the DR2 and DR2FC datasets as a function of observation time. To create this plot, each DR2 and DR2FC realisation was cut at 5, 10, 15, and 20 years from the last TOA. A Bayesian search for a common uncorrelated red noise signal (CURN) was then performed on each of these realisations, with all noise components left as free parameters. The value of the HD S/N was finally obtained by evaluating the median of the HD S/N distribution obtained using the noise marginalised optimal statistics method. As shown here, the average behaviour of the two datasets is extremely similar up to an observation time of 10 years (DR2new), after which they diverge. For the DR2FC dataset, the S/N continues to rise, although a slowdown in growth is evident after 15 years, related to the decrease in the number of data points and the increase of the TOA errors in the early 5-year data segments. The S/N growth in DR2 slows down significantly after 10 years, compared to DR2FC, and stops for observation times greater than 15 years: the first ten years of data, on average, do not contribute to the HD S/N. This is due to the lack of frequency coverage in the first ten years of DR2: while the number of frequency channels remains significant up to 15 years, beyond this point, individual pulsars data feature a single frequency channel. Then, the legacy data are unable to resolve the degeneracy between the frequency-dependent DM noise model and the achromatic GWB model. Indeed, if only one frequency channel is available, the DM noise model is indistinguishable from an achromatic noise model (see Eq. (3)). The average behaviour shows that the first 10 years of DR2 are not informative enough to contribute to the GWB significance compared to the last 15 years.

The reason why the scarce frequency coverage implies a low GWB significance was investigated in Ferranti et al. (2025) and is related to the degeneracy between the GWB parameters and the individual pulsar noise parameters. This degeneracy is quantified by a degeneracy coefficient, which assigns a significance to the tails of the posterior distributions using the Jensen–Shannon divergence (JSD) with respect to a Gaussian distribution. The JSD is evaluated for all 31 noise processes and then averaged to obtain a single coefficient that quantifies the level of degeneracy for each realisation (the weighted average JSD; see Ferranti et al. 2025, for a detailed description of this stastistic). The relation between the GWB significance and the degeneracy coefficient is illustrated in Figs. 4a and b. Here, we selected the HD S/N and weighted average JSD values for the 25-year (full) and 10-year (new) versions of the datasets. Figure 4a shows that moving from the 10-year to the 25-year DR2FC datasets significantly improves the ability to distinguish between individual noise and GWB components, decreasing the mean JSD. Consequently, the HD S/N increases significantly, rising from a median value of ∼1.2 for the 10-year DR2FC dataset to ∼3.8 for the 25-year DR2FC dataset. In the DR2 dataset, however, the decrease in the JSD is much smaller, as is the increase in the HD S/N, which goes from ∼1.0 (DR2new) to ∼2.0 (DR2full). This causes the HD S/N distributions for DR2new and DR2full to largely overlap. The consequence of this overlap is that, because of random fluctuations in the value of the HD S/N, DR2new can give a higher GWB significance than DR2full with the same noise realisation: this happens in the 15% of the realisations. In this 15%, the average increase in the HD S/N from DR2full to DR2new is of order 50%. On the other side, there is only a 2% probability of having a higher significance from DR2FC 10 yr than from DR2FC 25 yr because of the much smaller overlap between the two distributions. To check whether the EPTA observations were compatible with statistical fluctuations found in our simulations, we selected the realisations based on the following criteria:

DR2new gives a higher S/N than DR2full;
The value of the HD S/N, computed as the median of the noise marginalised distribution of S/N, falls within the 90%C.I. obtained from the data ( $1 . 3_{- 1.2}^{+ 1.3}$ $Mathematical equation: $ 1.3^{+1.3}_{-1.2} $$ for DR2full, $3 . 5_{- 1.7}^{+ 2.4}$ $Mathematical equation: $ 3.5^{+2.4}_{-1.7} $$ for DR2new EPTA and InPTA Collaboration 2023a).

Fig. 3.

2D distributions of the noise marginalised HD S/N and the degeneracy coefficient (weighted average JSD). Panel (a): distribution for the 25 yr and the 10 yr version of DR2FC, which is EPTA DR2 with a homogeneous frequency coverage over the entire observation time. Panel (b): distribution for the 25 yr (DR2full) and the 10 yr (DR2new) version of DR2. As a comparison, the values of JSD and HD S/N with 90%C.I. computed from the real data.

Fig. 4.

Comparison of the precision and accuracy of the GWB parameter estimation performed with DR2full (orange) and DR2new (purple). Panel (a): Distribution of the 90%C.I. width of the γ_GWB posterior distribution as a function of the number of frequency bins that are constrained in the freespectrum search. Panel (b): Distribution of the 90%C.I. width of the log₁₀A_GWB posterior distribution as a function of the number of frequency bins that are constrained in the freespectrum search. Panel (c): P-P plot computed with the marginalised posterior distribution of γ_GWB, taking as the true value the standard 13/3. The green diagonal represents the true shape of the P-P plot for a perfectly unbiased recovery, the shaded areas are the 1, 3, and 5σ C.I. given the number of data, which is 100.

Five realisations were selected using this criterion: the probability of a scenario similar to that observed in EPTA DR2 occurring in our controlled set of simulations was 5%, which is consistent with it being a ≈2σ random fluctuation due to the noise realisation. However, when we repeated the same exercise with the DR2FC dataset, none of the realisations behaved similarly to the real data. This confirms that the sparse frequency coverage in the first half of the data is primarily responsible for this behaviour. In the five selected cases, adding the first 15 years of mainly mono-frequency data to DR2new degrades the GWB significance, instead of improving it. The values of the HD S/N from these cases are displayed in Table 2 and an example of the HD S/N evolution with T_obs is shown in Fig. A.1a. We explore them more closely in the next section.

Table 2.

Main analysis outputs for the five selected realisations and the real data.

Finally, it should be noted that the HD S/N values estimated from the simulations are consistent with those obtained from real data. The EPTA DR2 values are shown in Fig. 3b, together with the corresponding weighted average JSD values. The DR2full simulations agree with the median of the HD S/N computed from the real data within the 68% C.I.: this agreement was also observed with another set of 100 noise realisations in Ferranti et al. (2025). DR2new shows worse agreement, but the overlap between the HD S/N distribution obtained from the simulations and the 90%C.I. estimated from the real data (S/N $_{HD} = 3 . 5_{- 1.7}^{+ 2.4}$ $Mathematical equation: $ _{\mathrm{HD}} = 3.5^{+2.4}_{-1.7} $$ ; EPTA and InPTA Collaboration 2023a) is significant. This shows that the behaviour observed in the real data is unlikely, but cannot be ruled out by simulations.

3.1.2. GWB parameter estimation with DR2full and DR2new

As with the GWB significance, the GWB parameter estimation is sensitive to observation time. In this section, we discuss why the parameter estimation obtained from DR2new is less precise and less accurate than that obtained from DR2full. All the results presented in this section can be compared to the standard analysis performed in the EPTA DR2 release papers (EPTA and InPTA Collaboration 2023a), which is the basis of our investigation. Further effects can contribute to parameter estimation bias, as pointed out in Goncharov et al. (2025). We discuss those in more detail towards the end of this section.

DR2full is expected to give a more precise estimate than DR2new because a longer baseline gives access to more frequency components and a lower minimum frequency. Indeed, the 90%C.I. on γ_GWB and log₁₀A_GWB obtained from the simulations are ( $1 . 2_{- 0.2}^{+ 0.4}, 0 . 6_{- 0.1}^{+ 0.3}$ $Mathematical equation: $ 1.2^{+0.4}_{-0.2}, 0.6^{+0.3}_{-0.1} $$ ) for DR2full and ( $2 . 3_{- 0.7}^{+ 0.7}, 1 . 0_{- 0.4}^{+ 0.4}$ $Mathematical equation: $ 2.3^{+0.7}_{-0.7}, 1.0^{+0.4}_{-0.4} $$ ) for DR2new, so the improvement is approximately 50% for the slope and 40% for the amplitude. This is in good agreement with the 90%C.I. we see in the data: (1.37, 0.69) for DR2full and (1.92, 0.72) for DR2new. In both the data and the simulations, the improvement in precision is larger for γ_GWB than for log₁₀A_GWB. This is expected, because the error on the slope depends more on the observation time than the error on the amplitude, as discussed in Sect. 3.3 and demonstrated in Appendix B.

Furthermore, we can notice from Figs. 4a and b that there is a dependence of the precision on the number of frequency bins the data are able to constrain: the more frequency bins are constrained, the smaller the error in the GWB parameters. Here, a free spectrum analysis is run and whether a frequency bin is constrained or not is decided with the Bayes Factor approximated by the Savage–Dickey ratio (i.e. the ratio of the prior and posterior distributions evaluated at zero or, in this case, at the lower edge of the prior distribution). If the Bayes factor is bigger than 1, the frequency bin is considered as constrained. In the real data, the number of constrained bins is five for DR2full and four for DR2new. The similar number of constrained bins can be the reason why the improvement in the precision is smaller in the data than it is, on average, in the simulations (respectively 28% and 5% against an average of 50% 40% for γ_GWB and log₁₀A_GWB). Still, some of the constrained bins in the real data might come from un-modelled noise sources that are degenerate with the GWB. This should be determined by in-depth noise analysis campaigns.

Turning to the accuracy of the GWB parameter estimation, we built a probability–probability (P–P) plot in order to analyse the performance of DR2full and DR2new in providing an unbiased estimate of the GWB parameters. Specifically, we are interested in whether they can recover a GWB spectrum with γ = 13/3. The P–P plot shows on the y-axis the fraction of times in which the nominal injection value lies within the credible interval indicated on the x-axis. In the case of unbiased and confident inference, the data points follow the diagonal of the plot parameter space. As can be seen from the P–P plot in Fig. 3c, the recovery obtained from DR2new is more biased than that obtained from DR2full, preferring lower values of γ_GWB (and consequently higher values of the amplitude; see Fig. 5). The difference is quite large: the significance of the bias is ∼3σ for DR2full and ≥5σ for DR2new. It must be noted that a perfect agreement with γ = 13/3 is not to be expected even for a flawless PTA, because of the mismatch between the recovery model, a perfectly Gaussian power-law shaped GWB, and the injected spectrum (see Fig. 1), as stated in Valtolina et al. (2024). The bias in both datasets is always towards a flatter and higher GWB spectrum, as can be seen from Fig. 5. Indeed, the median and 90%C.I. of the 100 realisations are respectively ( $4 . 17_{- 0.66}^{+ 0.78}$ $Mathematical equation: $ 4.17^{+0.78}_{-0.66} $$ , $- 14 . 43_{- 0.58}^{+ 0.24}$ $Mathematical equation: $ -14.43^{+0.24}_{-0.58} $$ ) for DR2full and ( $3 . 92_{- 0.99}^{+ 1.47}$ $Mathematical equation: $ 3.92^{+1.47}_{-0.99} $$ , $- 14 . 31_{- 0.66}^{+ 0.43}$ $Mathematical equation: $ -14.31^{+0.43}_{-0.66} $$ ) for DR2new. The same behaviour (i.e. the recovered GWB spectrum being significantly flatter in DR2new than in DR2full) is observed in the real data, where the measured GWB parameters from DR2full are $(4 . 19_{- 0.63}^{+ 0.73}, - 14 . 54_{- 0.41}^{+ 0.28})$ $Mathematical equation: $ (4.19^{+0.73}_{-0.63}, -14.54^{+0.28}_{-0.41}) $$ and $(2 . 71_{- 0.71}^{+ 1.18}, - 13 . 94_{- 0.48}^{+ 0.23})$ $Mathematical equation: $ (2.71^{+1.18}_{-0.71}, -13.94^{+0.23}_{-0.48}) $$ from DR2new. Therefore, on average, the behaviour of real data is qualitatively reproduced, even if the GWB estimates from the DR2new simulations remain systematically lower in amplitude and steeper in the shape than the measured spectrum. This tension is mitigated if we look at the subsample of simulations that were selected in the previous section as the most similar to the data: as can be seen from Fig. 5, in this case the distribution of recovered parameters stays approximately the same for DR2full ( $γ_{GWB} = 3 . 98_{- 0.27}^{+ 0.50}$ $Mathematical equation: $ \gamma_{\mathrm{GWB}} = 3.98^{+0.50}_{-0.27} $$ , ${log}_{10} A_{GWB} = - 14 . 30_{- 0.31}^{+ 0.11}$ $Mathematical equation: $ \log_{10}A_{\mathrm{GWB}}=-14.30^{+0.11}_{-0.31} $$ ), but it moves towards higher and flatter spectra for DR2new ( $γ_{GWB} = 3 . 14_{- 0.43}^{+ 1.07}$ $Mathematical equation: $ \gamma_{\mathrm{GWB}} = 3.14^{+1.07}_{-0.43} $$ , ${log}_{10} A_{GWB} = - 13 . 99_{- 0.39}^{+ 0.15}$ $Mathematical equation: $ \log_{10}A_{\mathrm{GWB}}=-13.99^{+0.15}_{-0.39} $$ ). Indeed, three of the five selected realisations behave very similarly to the real data in terms of parameter estimation (these are realisations 69, 70, and 94; see Table 2): the recovered parameters from DR2new are γ ≲ 3 and log₁₀A ≳ −14 and the agreement with the nominal γ = 13/3 moves from within 1σ C.I. for DR2full to between 2 and 3σ for DR2new. The same occurs in the real data (see Table 2). Obviously, we do not know the true parameters of the signal that is present in the data and γ = 13/3 is only taken as a reference value. In the simulations, this increase in the bias from DR2full to DR2new is related to an increase in the degeneracy between individual noise and the GWB parameters (see again Table 2). The same increase is observed in the posterior distributions of the real data. This degeneracy is expected to introduce a bias into the estimate of the GWB parameters (see Ferranti et al. 2025). As mentioned earlier, other four relevant sources of bias in EPTA DR2new data have been identified in Goncharov et al. (2025): (i) neglecting of epoch-correlated white noise; (ii) neglecting the frequency dependence in the amplitude of the transient noise event in PSR J1713+0747; (iii) noise prior misspecification, which is solved through hierarchical inference (Goncharov & Sardana 2025; van Haasteren 2024); and (iv) the exclusion of certain noise terms based on the results of single-pulsar noise analysis, which is solved by performing model averaging instead of model selection (van Haasteren 2025). Since epoch-correlated noise and transient noise in PSR J1713+0747 were not injected in our simulations and the noise model used for the recovery is the same as the one used in the injection, effects (i), (ii), and (iv) do not affect our results on the parameter estimation. This means that other effects are capable of generating a strong bias in DR2new. On the other hand, noise prior mis-specification is expected to affect our recovery since we inject the noise components using values from the SPNA and then use uniform priors in the recovery. However, as will be shown in Sect. 3.1.3, the main source of bias for the DR2new simulations appears to be the neglect of spectral leakage due to limited observation time in standard GWB analyses. See Quelquejay Leclere et al. (2026) and Crisostomi et al. (2025) for in depth discussions on this effect.

Fig. 5.

Median of the marginalised posterior distributions of γ_GWB and log₁₀A_GWB of the 100 realisations of DR2full and DR2new (shaded regions, 95%C.I.) and of the 5 realisations selected on the HD S/N (solid lines, 68 and 95% C.I.). As a reference, the median values and 2σ covariance principal axis obtained from the real data are displayed.

Among the three selected realisations that resemble the real data also in the parameter estimation, realisation 94 seems to reproduce all the main features of the real data² (Table 2). The HD S/N distribution as a function of T_obs and the posterior distributions of the GWB parameters for realisation 94 are shown in Figs. A.1a and b. We notice that this agreement is found despite the fact that parameter estimation depends heavily on the injected GWB spectrum, and we only tested one SMBHB population realisation.

From these two sections – 3.1.1 and 3.1.2 – we can conclude that the difference in the GWB significance and GWB parameter estimation found in the EPTA second data release when using the long (DR2full) and the short (DR2new) observation baselines is not an anomaly. Evidence of GWB compatible with the measured values is found in 5% of the simulations and we also found one realisation that agrees with the data in both the significance and the parameter estimation of the GW signal. These percentages suggest that the observed scenario is consistent with an ≈2σ random fluctuations around the average behaviour.

3.1.3. Spectral leakage

As mentioned in the previous section, we can identify three sources of the significant bias observed in the GWB parameter estimation from our simulations: the degeneracy between the individual noise and the GWB parameters, the effect of the spectral leakage and the noise prior misspecification. The first two effects are expected to be more prominent in DR2new than in DR2full. In this section we will focus on the spectral leakage, which has been proven to be the main source of bias in our DR2new simulations.

The fact that we observed on a finite time window introduces correlations between the Fourier frequency components that are neglected in the standard analysis, where the frequency-domain covariance matrix is considered diagonal. This approximation is valid for very long baseline observations, but can introduce a substantial bias in the parameter estimation when observation times are short. Since DR2new is only 10.3 yr of observations, we expect the spectral leakage effect to be significant and might possibly comprise the origin of the ≥5σ bias in the GWB parameter estimation. We point out that our DR2new simulations are expected to be affected by the spectral leakage in both the individual noise and in the signal covariance matrices. This is because (i) the noise is injected with the Fourier diagonal model, that is to say, on a harmonic Fourier basis that has a fundamental frequency of 1/T_obs, psr where T_obs, psr ∈ (13.6, 24.5)yr depending on the pulsar, and applying the 10.3 years rectangular window to obtain DR2new truncates the injected Fourier components to non-harmonic frequencies; (ii) the signal is injected as the sum of the exact time delays generated by every binary in the population with frequencies that do not match the 1/T_obs, psr harmonic Fourier basis. To investigate this, we analysed the 100 realisations of DR2new using the FFTInt method, as described in Crisostomi et al. (2025) and implemented in Discovery³. Using the posterior distributions obtained from this analysis, we built the P-P plot with respect to γ = 13/3, shown as the purple dashed line in Fig. 4c. The significance of the bias goes from 5.6σ with the standard analysis, to 1.5σ with the FFTInt method. This means that in our DR2new simulations, among the three possible sources of bias in the GWB parameter estimation, the spectral leakage is the dominant one, while the other two effects only generate a bias that is within the precision of our recovery. Consequently, DR2new can be used to obtain a reliable estimate of the GW signal parameters, but only if the effect of the spectral leakage is taken into account.

It is worth noting that by extending the analysis to frequencies lower than 1/T_obs (in particular, we used a cutoff at 1/3T_obs), the estimate of the GWB parameters slightly changes: the difference in the estimate of γ_GWB is Δγ_GWB ∼ −0.12 ± 0.14 quoting the 68% C.I. This means that extending the frequency range gives, on average, a shallower GWB recovery, with a difference that is compatible with zero.

3.2. IPTA frequency coverage

In Sect. 3.1.1, we report that the limited frequency coverage in the first half of DR2full significantly restricts its performance in the GWB search. However, there are several feasible ways to increase the frequency coverage of our data, and these are currently being implemented. Indeed, the next data release of the International Pulsar Timing Array (IPTA; e.g. Verbiest et al. 2016) will combine data from all the main PTA collaborations: EPTA, InPTA, NANOGrav, PPTA, MPTA and CPTA plus data from the LOw Frequency ARray (LOFAR; Donner et al. 2020) and from the New extension in Nançay upgrading (NenuFAR; Donner et al. 2020). This will enable the construction of an array of more than 100 pulsars, with many observed pulsars covered by multiple PTAs in different frequency bands. In particular, LOFAR and NenuFAR, NANOGrav, and PPTA are very useful with respect to supplementing the first problematic 10 years of the EPTA DR2. The reason is that NANOGrav and PPTA, with respective observation times of ∼16 yr and ∼20 yr, have long baseline measurements at frequencies that are complementary to EPTA. Furthermore, LOFAR and NenuFAR, despite having a shorter baseline (∼11 yr) have the capacity to collect data at extremely low frequencies (i.e. 100–200 MHz for LOFAR and 30-90MHz for NenuFAR) for 12 EPTA pulsars. For more details on the LOFAR and NenuFAR datasets and the advantages of their combination with the EPTA DR2, we refer to Iraci et al. (2025). The aim of this section is to investigate in a simplified, yet representative way how the 25 pulsars of EPTA would behave following the addition of these four complementary datasets. Therefore, we supplemented the DR2full simulations with the following data:

Four high-frequency backends (at 2750, 3000, 3250 and 3500 MHz) starting from EPOCH = 53 000 with TOA error of 2 μs, which resemble PPTA-like backends;
One low-frequency backend (750 MHz) starting from EPOCH = 53 000 with TOA error of 2 μs, which in the real data is given either by NANOGrav or by PPTA;
100 TOAs at 150 MHz, with a TOA error of 7 μs, which are typical properties of LOFAR data, starting from EPOCH = 56 500;
100 TOAs at 60 MHZ starting from EPOCH = 58 300 and with error of 7 μs, which are typical of NenuFAR.

These backends were only added to those pulsar for which they are indeed available in the real data. This method allows us to have a representative simulation of how EPTA pulsars will appear in the next IPTA data release. The MPTA and CPTA backends, despite being the most precise TOAs in the IPTA, are not considered here because they have very short baselines.

An example of the frequency coverage of this simulated dataset is shown in the bottom panel of Fig. 6. In the top panel, the real TOAs from the five PTA involved are shown. This analysis aims to assess whether the additional frequency coverage provided by the IPTA collaboration is sufficient to solve the degeneracy problem affecting DR2full. This version of DR2full will be referred to as DR2full + IPTA.

Fig. 6.

For pulsar PSR J1022+1001. Top: data from the real EPTA DR2full, NANOGrav 16yr, PPTA 20yr, LOFAR and NenuFAR. Bottom: Simplified simulation of the three combined PTAs .

Figure 7a shows the joint distribution of the degeneracy coefficient and the noise-marginalised HD S/N for 100 realisations of DR2full + IPTA, compared with DR2full and DR2FC. Both the parameter degeneracy and the HD S/N improve significantly with IPTA coverage. The HD S/N median almost doubles, rising from $2 . 0_{- 1.0}^{+ 1.3}$ $Mathematical equation: $ 2.0^{+1.3}_{-1.0} $$ in DR2full to $3 . 8_{- 1.0}^{+ 1.2}$ $Mathematical equation: $ 3.8^{+1.2}_{-1.0} $$ in DR2full + IPTA. The performance of this dataset is equal to the performance obtained with the homogeneous frequency coverage of DR2FC. This is true also when looking at the evolution of the HD S/N with the observation time (not shown), which is equivalent to the one of DR2FC displayed in Fig. 2: the HD S/N keeps growing with increasing T_obs and there are no realisations in which the 10yr version of the dataset gives a higher significance than the 25yr one (the distribution of the ratio between the HD S/N obtained with the two versions of the IPTA-like dataset is displayed in Appendix B). In the five ‘pathological’ realisations resembling the data, the HD S/N always exceeds 3 when IPTA backends are added, compared to a maximum of 2.2 in DR2full (Table 2).

Fig. 7.

GWB significance and parameter estimation from the 100 DR2full + IPTA simulations. Panel (a): 2D distributions of the noise marginalised HD S/N and the degeneracy coefficient (weighted average JSD). Panel (b): distribution of the median of log₁₀A_GWB and γ_GWB marginalised posterior distributions.. Panel (c): summary of the noise parameters estimation from the 100 realisations of DR2full (orange) and DR2full + IPTA (red) simulated data. Top: Median of the marginalised posterior distribution of the noise component amplitude, the true values are shown for reference (black crosses). Bottom: Savage-Dickey ratio of the amplitude posterior distributions; BF < 1 means the posterior is unconstrained, BF ≫1 means the posterior is very well constrained.

Parameter estimation also benefits from the improved coverage: both the amplitude and slope of the GWB are accurately recovered (Fig. 7b), with the slope bias reduced to a non-significant 0.8σ (∼3σ from DR2full). Moreover, the overall precision on GWB parameters is thereby improved by ∼25% in amplitude and ∼40% in slope.

Finally, it is interesting to observe how the addition of these complementary backends affects the estimation of the noise parameters. Figure 7c shows the distribution of the posterior distribution median for the amplitude of each noise process involved in the search, compared to the injected value (top panel), and the constraining power on each parameter (bottom panel). The constraining power is quantified by the Bayes factor, which is approximated by the Savage–Dickey ratio: if the ratio is smaller than 1, then the posterior distribution is not constrained. When the ratio is at its maximum value (set to 1000 for representation purposes), the parameter is very precisely constrained. Several observations can be made from this plot; most importantly, all parameters that the model can constrain are centred on the correct injected value, indicating unbiased noise parameter estimation. Now focusing on the impact of IPTA coverage on the estimation of the noise parameters (see Fig. 7c), we can see that only in 15 pulsars out of 21 the DM parameters are constrained in all realisations of DR2full, whereas, when the IPTA backends are added, all the DM parameters are very well constrained in every realisation. This allows us to mitigate degeneracy of the these parameters with the GWB. Conversely, RN parameters are typically unconstrained or poorly constrained in DR2full, except when the amplitude is very high (≳10⁻¹³), as it is in PSR J0900−3144 and PSR J1012+5307. And despite the addition of IPTA frequency coverage, constraints on these parameters remain poor. However, a significant improvement can be seen in the cases of PSR J1022+1001, J1455−3330 and J1713+0747. These are the only three pulsars with an amplitude between 10⁻¹⁴ and 10⁻¹³. This suggests that DR2full + IPTA frequency coverage is sufficient, in many realisations, to constrain red noise processes with an amplitude greater than 10⁻¹⁴. It is therefore reasonable to assume that even better frequency coverage and higher S/N would allow us to constrain all red noise parameters, also solving their degeneracy with the GWB.

We can therefore conclude from this section that adding the longest baseline backends of NANOGrav and PPTA and the low frequency backends of LOFAR and Nenufar to EPTA DR2full is sufficient to significantly improve the evidence in favour of the GW signal, as well as to provide an accurate and precise measurement of the GWB parameters. This conclusion supports the findings of Larsen et al. (2025), which demonstrate that partial IPTA-style data combinations, such as the Lite method, can rapidly improve PTA sensitivity by selecting the most informative pulsar datasets prior to full formal combination. An interesting observation is that the addition of NANOGrav and PPTA boosts the average HD S/N from 2.0 to ∼3.2. The further addition of LOFAR and NenuFAR backends boosts it from ∼3.2 to 3.8, indicating that adding very low frequency backends has an effect that is comparable to adding complementary long baseline observations.

3.3. The 5yr dataset: Impact of the time span on the parameter estimation

In this last section, we investigate the differences between long- and short-time span datasets, highlighting how the estimation of the GWB parameters depends on the observation time. To this end, we analysed the longest and shortest simulated datasets produced for this work: the 25 yr and 5 yr datasets, referred to as DR2full and DR2 5yr.

In Fig. 8, we show the distribution of the parameter estimates obtained from the 100 realisations of both datasets, together with the posterior distribution of a single realisation. The chosen realisation is number 94, selected in Sect. 3.1.2 as the most similar to the real EPTA DR2. The first observation is that the posterior obtained from the 5yr dataset is much broader, as expected due to the shorter time span. However, its principal axis also exhibits a different orientation compared to that of the 25yr dataset. Interestingly, a similar behaviour is observed in real datasets: in the same figure, we display the principal axes of the posterior distributions obtained from EPTA DR2full (24.8 yr) and from the MeerKAT PTA (4.5 yr). These two posteriors show a significant difference in orientation, consistent with the results from our simulations.

Fig. 8.

Median of the marginalised posterior distributions of γ_GWB and log₁₀A_GWB of all the 100 simulations (shaded region), posterior distribution of realisation 94 (solid line) and principal axis evaluated orientation evaluated from the Fisher matrix (dashed line). Panel (a): comparison between DR2full and DR2 5yr. For comparison, the principal axes of the real EPTA DR2full and MeerKAT PTA are also shown. Panel (b): comparison between DR2* 5yr and DR2 5yr. For reference, the injected value is displayed as a black cross.

This dependence of the posterior orientation on the observation timespan arises from the functional form of the likelihood and can be predicted using the Fisher formalism. Following the derivation in Babak et al. (2024), the orientation of the principal axis of the posterior can be computed analytically and is shown as a dashed line in Fig. 8a. The agreement between the Fisher prediction and the simulation results is very good. Indeed, by computing the orientation for all the 100 noise realisations, we found that the prediction from the Fisher matrix agrees with the distribution of the orientation values within 68%C.I. (see Appendix B) for all the observation times investigated. In addition, the values of the orientation computed from the posterior obtained with the real EPTA DR2full, DR2new and MPTA are compatible with the 90%C.I. of the simulations.

The increase in the principal axis angle over time can be explained by the fact that the uncertainty in the GWB slope is more sensitive to the observation time than the uncertainty in the amplitude. Indeed, by denoting the variances of the amplitude and slope as σ_A² and σ_γ² respectively, an analytical calculation yields the following proportionality,

$\begin{matrix} {\begin{matrix} σ_{A}^{2} \propto {(S / N)}^{- 2} \\ σ_{γ}^{2} \propto {(S / N)}^{- 2} {[ln (\frac{yr}{T_{obs}})]}^{- 2}, \end{matrix} \end{matrix}$ $Mathematical equation: $$ \begin{aligned} {\left\{ \begin{array}{ll} \sigma _{A}^2 \propto (\mathrm{S/N})^{-2}\\ \displaystyle \sigma _{\gamma }^2 \propto (\mathrm{S/N})^{-2}\left[\ln \left(\frac{\mathrm{yr}}{T_{\rm obs}}\right)\right]^{-2}, \end{array}\right.} \end{aligned} $$$ (12)

where the additional dependence of σ_γ² on T_obs is explicit. This derivation and a detailed discussion are given in Appendix B.

Equation (12) also shows that the orientation of the posterior distribution does not depend on the overall S/N, since both variances scale equally with it. To test this, we simulated a high-S/N version of the 5yr dataset, that we will refer to as DR2* 5yr. The average S/N increases from ∼0.01 (for DR2 5yr) to ∼1.9 (for DR2* 5yr). The distribution of medians and a representative posterior for this high-S/N dataset are shown in Fig. 8b, where we can see that the posterior orientation remains roughly the same when increasing the S/N, consistent with the Fisher prediction.

An interesting implication of this time span dependence can be seen by comparing two datasets with similar HD S/N but very different observation times: this is the case for our simulated DR2full (25 yr, average HD S/N ≈ 2) and DR2* 5yr (5 yr, average HD S/N ≈ 1.9). Although the two datasets provide comparable precision on the GWB amplitude (68% C.I. of 0.38 and 0.45, respectively), they yield significantly different constraints on the GWB slope, with a 68% C.I. of 0.71 for DR2full and of 1.6 for DR2* 5 yr.

The posterior distributions obtained from the short-time-span dataset are not only less precise than those obtained from the long-time span dataset, but also less accurate, as it can be noticed from Fig. 8b. This is to be expected; we have already seen an analogous situation when comparing the parameter estimation obtained from DR2full and DR2new (see Sect. 3.1.2). However, while the majority of the bias in the case of DR2new was due to neglecting the inter-frequency correlation and was solved by taking this spectral leakage into account in the model (see Sect. 3.1.3), something slightly different is occurring in DR2 5yr. By building the P–P plot, we find that the DR2 5yr dataset has a bias in the parameter recovery with a significance of approximately 6.4σ, which reduces to ∼3.8σ if we account for spectral leakage in the analysis. In contrast, applying the standard diagonal analysis to DR2* 5yr (a version of DR2 5yr with high HD S/N) renders the bias in the recovery insignificant (∼0.95σ). The most straightforward explanation for these results is that most of the bias is due to the degeneracy of the pulsar individual noise with the GWB parameters, since the HD S/N is so low and therefore it is hard to tell individual noise apart from the common process. Indeed, as can be seen in Fig. 9, the high- and low-S/N datasets occupy two different regions in the degeneracy coefficient (i.e. the recovered amplitude space) and there is a strong correlation between the two quantities (r ∼ 0.5, with significance greater than 5σ). Thus, the higher amplitude and lower spectral index in the recovery from DR2 5yr are related to the strong degeneracy of the GWB parameters with the individual noise parameters. For DR2new, the degeneracy was much smaller thanks to the higher HD S/N. On the other hand, the spectral leakage is expected to affect more DR2 5yr than DR2new because of the shorter data span, which is not what we observe. This can be explained by the effect of timing model marginalisation, which acts as a windowing function and mitigates the effects of spectral leakage (see Crisostomi et al. 2025; Quelquejay Leclere et al. 2026). In DR2 5yr, the GWB spectrum is modelled with only 4 components and the fit is likely dominated by the first one or two bins, which are the most affected by the TM marginalisation. Consequently the timing model marginalisation manages to mitigate the effect of the spectral leakage. This mitigation is less effective in DR2new because more frequency components contribute to the GWB parameter estimation in that case.

Fig. 9.

Degeneracy coefficient (weighted average JSD) and median of the marginal posterior distribution of log₁₀A_GWB from the 100 simulations of DR2 5yr and DR2* 5 yr.

4. Conclusions

As PTA experiments enters the detection era, it is crucial that we develop a comprehensive understanding of our data and of the outcomes of our analysis pipelines. This work is a part of this effort, with a particular focus on the last data release of the EPTA collaboration. This dataset was presented in two versions, EPTA DR2full, which has an observation timespan of 24.8 yr, and DR2new, that includes only the last 10.3 yr of DR2full (EPTA and InPTA Collaboration 2023a). The unexpected result obtained by the analysis of these two datasets was that the shorter version gave a higher significance in favour of the presence of a GWB (HD S/N ∼3.5) than the longer version (HD S/N ∼1.3). This counterintuitive result motivated the analysis carried out in this paper. A faithful set of DR2full-like simulations was generated and then cut at different epochs (20, 15, 10, and 5 years before the most recent TOA) to obtain shorter versions of the dataset, including DR2new-equivalent simulations (10yr data span). For each time span, 100 different noise realisations are injected, on top of the same stochastic GW signal. The outcome of the analyses run on all these simulations led us to the following conclusions:

The first 10 years of DR2 are not sufficiently informative to contribute significantly to the GWB detection, mainly due to the limited frequency coverage in this part of the data. As a result, the HD S/N distributions for DR2new and DR2full largely overlap. This implies that due to random fluctuations in the HD S/N, DR2new can occasionally yield a higher GWB significance than DR2full, which occurs in about 15% of the realisations. In fact, the probability of observing a scenario similar to that seen in the EPTA DR2 within our controlled simulations is 5%. In these cases, adding the first 10 or 15 years of data degrades the signal, instead of enforcing it. Moreover, a realisation was found that reproduced the data behaviour exactly, both in terms of GWB significance and parameter estimation. This shows that the behaviour observed in DR2new versus DR2full could well be simply due to an ≈2σ occurrence of noise fluctuations, with no need to invoke other problems or issues with the data or the analysis pipelines.
DR2new-like simulations lead to a significantly biased GWB parameter estimation, with the bias being towards higher values of the amplitude and lower values of the slope. This bias is related to the correlation between different frequency bins, which is neglected in the standard analysis. If spectral leakage (i.e. frequency bin correlation) is accounted for in the model, using the method proposed in Crisostomi et al. (2025), we find that DR2new also provides a reliable estimate of the GWB parameters. Therefore, this analysis emphasises the importance of accounting for spectral leakage in GWB analyses.
Adding long-baseline complementary backends (such as TOAs from NANOGrav and PPTA) and low-frequency measures (e.g. those obtained with LOFAR and NenuFAR) to DR2full is crucial to address its pathological behaviour. Indeed, when these backends are added in a simplified, yet realistic simulation, the average HD S/N increases from 2.0 in DR2full to 3.8 in DR2full + IPTA, while the HD S/N trend with observation time is monotonically increasing. This proves that the first 10 years of data significantly contribute to the HD S/N. Therefore, DR2full has the potential of prividing high S/N and precise and accurate parameter estimation when combined with other datasets within the IPTA framework.
Very short baseline datasets with a low HD S/N ratio (below 2) are greatly impacted by spectral leakage and the degeneracy between individual noises and the GWB parameters. This results in a significant bias in the estimation of GWB parameters towards high amplitudes. Therefore, modelling the noise and accounting for spectral leakage in the analysis is crucial to obtain a reliable result from datasets spanning such short time spans. This could be particularly relevant when analysing the first releases of CPTA and MeerKAT data. Moreover, studying these short-baseline datasets allows us to investigate how the precision of the GWB parameter estimates evolves with the observation time, T_obs. T_obs has a greater impact on the precision of γ than on log₁₀A, which means that long-baseline datasets are important for maximising our ability to constrain the GWB slope, which carries valuable information about the eccentricity and environmental coupling of the underlying SMBHB population.

These results demonstrate that the behaviour observed in EPTA DR2 is consistent with the outcome of current analysis pipelines on controlled simulated data, without necessarily indicating any suspicious anomalies. This study also highlights the importance of accurate GWB modelling, along with broadband frequency coverage and joint PTA analyses for robust GWB detection.

Acknowledgments

IF would like to acknowledge the precious help and useful discussions of Marco Crisostomi, Hippolyte Quelquejay Leclère and Aurélien Chalumeau, which contributed to shaping this paper. AS acknowledges the financial support provided under the European Union’s H2020 ERC Advanced Grant “PINGU” (Grant Agreement: 101142079).

References

Afzal, A., Agazie, G., Anumarlapudi, A., et al. 2023, ApJ, 951, L11 [NASA ADS] [CrossRef] [Google Scholar]
Agazie, G., Anumarlapudi, A., Archibald, A. M., et al. 2023, ApJ, 951, L10 [CrossRef] [Google Scholar]
Agazie, G., Anumarlapudi, A., Archibald, A. M., et al. 2023, ArXiv e-prints [arXiv:2306.16221] [Google Scholar]
Babak, S., Falxa, M., Franciolini, G., & Pieroni, M. 2024, Phys. Rev. D, 110, 063022 [Google Scholar]
Bassa, C. G., Janssen, G. H., Stappers, B. W., et al. 2016, MNRAS, 460, 2207 [NASA ADS] [CrossRef] [Google Scholar]
Caprini, C., & Figueroa, D. G. 2018, Class. Quant. Grav., 35, 163001 [NASA ADS] [CrossRef] [Google Scholar]
Crisostomi, M., van Haasteren, R., Meyers, P. M., & Vallisneri, M. 2025, ArXiv e-prints [arXiv:2506.13866] [Google Scholar]
Donner, J. Y., Verbiest, J. P. W., Tiburzi, C., et al. 2020, A&A, 644, A153 [EDP Sciences] [Google Scholar]
Edwards, R. T., Hobbs, G. B., & Manchester, R. N. 2006, MNRAS, 372, 1549 [Google Scholar]
Ellis, J. A. 2013, Class. Quant. Grav., 30, 224004 [NASA ADS] [CrossRef] [Google Scholar]
EPTA and InPTA Collaboration (Antoniadis, J., et al.) 2024, A&A, 685, A94 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
EPTA and InPTA Collaboration (Antoniadis, J., et al.) 2023a, A&A, 678, A50 [CrossRef] [EDP Sciences] [Google Scholar]
EPTA and InPTA Collaboration (Antoniadis, J., et al.) 2023b, A&A, 678, A49 [Google Scholar]
EPTA and InPTA Collaboration (Antoniadis, J., et al.) 2023c, A&A, 678, A48 [Google Scholar]
Ferranti, I., Falxa, M., Sesana, A., et al. 2025, A&A, 694, A38 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Foster, R. S., & Backer, D. C. 1990, ApJ, 361, 300 [NASA ADS] [CrossRef] [Google Scholar]
Goncharov, B., & Sardana, S. 2025, MNRAS, 537, 3470 [Google Scholar]
Goncharov, B., Sardana, S., Sesana, A., et al. 2025, Reading Signatures of Supermassive Binary Black Holes in Pulsar Timing Array Observations [Google Scholar]
Hellings, R. W., & Downs, G. S. 1983, ApJ, 265, L39 [NASA ADS] [CrossRef] [Google Scholar]
Iraci, F., Chalumeau, A., Tiburzi, C., et al. 2025, ArXiv e-prints [arXiv:2510.04639] [Google Scholar]
Jaffe, A. H., & Backer, D. C. 2003, ApJ, 583, 616 [NASA ADS] [CrossRef] [Google Scholar]
Larsen, B., Mingarelli, C. M. F., Baker, P. T., et al. 2025, MNRAS, 542, 3028 [Google Scholar]
Lentati, L., Shannon, R. M., Coles, W. A., et al. 2016, MNRAS, 458, 2161 [Google Scholar]
Lorimer, D. R., & Kramer, M. 2004, Handbook of Pulsar Astronomy, Vol. 4 [Google Scholar]
Miles, M. T., Shannon, R. M., Reardon, D. J., et al. 2025, MNRAS, 536, 1489 [Google Scholar]
Quelquejay Leclere, H., Li, K., Volonteri, M., et al. 2026, A&A, 705, A56 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Rajagopal, M., & Romani, R. W. 1995, ApJ, 446, 543 [NASA ADS] [CrossRef] [Google Scholar]
Reardon, D. J., Zic, A., Shannon, R. M., et al. 2023, ApJ, 951, L6 [NASA ADS] [CrossRef] [Google Scholar]
Rosado, P. A., Sesana, A., & Gair, J. 2015, MNRAS, 451, 2417 [NASA ADS] [CrossRef] [Google Scholar]
Sesana, A. 2013, Class. Quant. Grav., 30, 244009 [Google Scholar]
Sesana, A., Vecchio, A., & Colacino, C. N. 2008, MNRAS, 390, 192 [NASA ADS] [CrossRef] [Google Scholar]
Speri, L., Porayko, N. K., Falxa, M., et al. 2023, MNRAS, 518, 1802 [Google Scholar]
Susarla, S. C., Chalumeau, A., Tiburzi, C., et al. 2024, A&A, 692, A18 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Tiburzi, C., Hobbs, G., Kerr, M., et al. 2016, MNRAS, 455, 4339 [Google Scholar]
Vallisneri, M. 2020, Astrophysics Source Code Library [record ascl:2002.017] [Google Scholar]
Valtolina, S., Shaifullah, G., Samajdar, A., & Sesana, A. 2024, A&A, 683, A201 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
van Haasteren, R. 2024, ApJS, 273, 23 [Google Scholar]
van Haasteren, R. 2025, MNRAS, 537, L1 [Google Scholar]
van Haasteren, R., & Vallisneri, M. 2014, Phys. Rev. D, 90, 104012 [NASA ADS] [CrossRef] [Google Scholar]
Verbiest, J. P. W., Lentati, L., Hobbs, G., et al. 2016, MNRAS, 458, 1267 [Google Scholar]
Verbiest, J. P. W., & Shaifullah, G. M. 2018, Class. Quant. Grav., 35, 133001 [NASA ADS] [CrossRef] [Google Scholar]
Vigeland, S. J., Islo, K., Taylor, S. R., & Ellis, J. A. 2018, Phys. Rev. D, 98, 044003 [NASA ADS] [CrossRef] [Google Scholar]
Wyithe, J. S. B., & Loeb, A. 2003, ApJ, 590, 691 [NASA ADS] [CrossRef] [Google Scholar]
Xu, H., Chen, S., Guo, Y., et al. 2023, Res. Astron. Astrophys., 23, 075024 [CrossRef] [Google Scholar]

For a more in-depth study, the solar wind noise components should be included, as their addition has been shown to affect the GWB search (Tiburzi et al. 2016; Susarla et al. 2024).

Realisation 69 does not match the parameter estimation from the real data so well, and realisation 70 has the maximum value of the HD S/N at 15 yr, while real data peak at 10 yr.

https://github.com/nanograv/discovery.git

Appendix A: Special realisation

Among the 100 realisations of noise that were generated, 5 were shown to aptly reproduce the behaviour of the real data well in the HD S/N evolution with time. Among these five realisations, one behaves very similar to the data both in the HD S/N and in the parameter estimation. In Fig. A.1a, the evolution of the HD S/N with the observation time is shown for both realisation 94 and the real data. In Fig. A.1b the parameter estimations obtained from the 25yr and the 10yr versions of realisation 94 and the real data are compared. It can be seen that the simulated and the real data are in good agreement.

Fig. A.1.

Panel (a): Median and 68%C.I. of the noise marginalised HD S/N computed with realisation 94 by cutting the dataset at different epochs. The same cuts were applied to the real EPTA DR2 and the violins shown the HD S/N distribution. Panel (b): Marginalised posterior distributions of γ_GWB and log₁₀A_GWB obtained from the DR2full and DR2new version of realisation 94. As a reference, the median values and 2σ covariance principal axes obtained from the real data are displayed.

Appendix B: DR2full + IPTA

The median behaviour of the DR2full + IPTA dataset is that the 25yr version gives an HD S/N which is more than twice (2.4) the one from the 10yr version. In Fig. B.1, the distribution of this ratio obtained from the 100 realisations is displayed.

Fig. B.1.

Distribution of the 100 values of the ratio between the HD S/N from the 25yr version of DR2full + IPTA and from the 10yr version.

Appendix C: Fisher matrix calculation

The expression of the Fisher matrix according to Babak et al. (2024) is

$\begin{matrix} \begin{matrix} F_{α β} = & \sum_{f_{k}} C_{IJ}^{- 1} C_{KL}^{- 1} \frac{\partial (R_{JK} S_{h} (f_{k}))}{\partial θ^{α}} \frac{\partial (R_{IL} S_{h} (f_{k}))}{\partial θ^{β}} = \\ = \sum_{f_{k}} C_{IJ}^{- 1} C_{KL}^{- 1} R_{JK} R_{IL} \frac{\partial S_{h} (f_{k})}{\partial θ^{α}} \frac{\partial S_{h} (f_{k})}{\partial θ^{β}} = \\ = \sum_{f_{k}} \frac{1}{S_{eff}^{2}} \frac{\partial S_{h}}{\partial θ^{α}} \frac{\partial S_{h}}{\partial θ^{β}}, \end{matrix} \end{matrix}$ $Mathematical equation: $$ \begin{aligned} \begin{aligned} F_{\alpha \beta } =&\sum _{f_k}C_{IJ}^{-1}C_{KL}^{-1}\frac{\partial (R_{JK}S_{\rm h}(f_k))}{\partial \theta ^{\alpha }}\frac{\partial (R_{IL}S_{\rm h}(f_k))}{\partial \theta ^{\beta }} =\\&= \sum _{f_k}C_{IJ}^{-1}C_{KL}^{-1}R_{JK}R_{IL}\frac{\partial S_{\rm h}(f_k)}{\partial \theta ^{\alpha }}\frac{\partial S_{\rm h}(f_k)}{\partial \theta ^{\beta }} =\\&= \sum _{f_k} \frac{1}{S_{\rm eff}^2}\frac{\partial S_{\rm h}}{\partial \theta ^{\alpha }}\frac{\partial S_{\rm h}}{\partial \theta ^{\beta }}, \end{aligned} \end{aligned} $$$ (C.1)

where

$\begin{matrix} S_{eff} = {[C_{IJ}^{- 1} C_{KL}^{- 1} R_{JK} R_{IL}]}^{- 1 / 2} \end{matrix}$ $Mathematical equation: $$ \begin{aligned} S_{\rm eff} = \left[C_{IJ}^{-1}C_{KL}^{-1}R_{JK}R_{IL}\right]^{-1/2} \end{aligned} $$$ (C.2)

and

$\begin{matrix} {S / N}^{2} = \sum_{f_{k}} {(\frac{S_{h}}{S_{eff}})}^{2} . \end{matrix}$ $Mathematical equation: $$ \begin{aligned} \mathrm{S/N}^2 = \sum _{f_k}\left(\frac{S_{\rm h}}{S_{\rm eff}}\right)^2 .\end{aligned} $$$ (C.3)

The partial derivatives take the simple form

$\begin{matrix} {\begin{matrix} \frac{\partial S (f)}{\partial logA} = 2 \ln 10 \cdot S (f) \\ \frac{\partial S (f)}{\partial γ} = - \ln (\frac{f}{{yr}^{- 1}}) \cdot S (f) \end{matrix} . \end{matrix}$ $Mathematical equation: $$ \begin{aligned} {\left\{ \begin{array}{ll} \displaystyle \frac{\partial S(f)}{\partial \mathrm{logA}} = 2\mathrm{ln10} \cdot S(f)\\ \displaystyle \frac{\partial S(f)}{\partial \mathrm{\gamma }} = -\mathrm{ln}\left(\frac{f}{\mathrm{yr^{-1}}}\right) \cdot S(f) \end{array}\right.} .\end{aligned} $$$ (C.4)

Thus, the entrances of the Fisher information matrix are

$\begin{matrix} {\begin{matrix} F_{AA} = {(2 \ln 10)}^{2} \cdot \sum_{f_{k}} {(\frac{S_{h}}{S_{eff}})}^{2} = {(2 \ln 10)}^{2} \cdot {S / N}^{2} \\ \begin{matrix} F_{γ γ} & = \sum_{f_{k}} {[- \ln (\frac{f_{k}}{{yr}^{- 1}})]}^{2} {(\frac{S_{h}}{S_{eff}})}^{2} \approx \\ \approx {[\ln (\frac{f_{\min}}{{yr}^{- 1}})]}^{2} \cdot {S / N}^{2} \propto {[\ln (\frac{yr}{T_{obs}})]}^{2} \cdot {S / N}^{2} \end{matrix} \\ \begin{matrix} F_{A γ} & = 2 \ln 10 \cdot \sum_{f_{k}} [- \ln (\frac{f_{k}}{{yr}^{- 1}})] {(\frac{S_{h}}{S_{eff}})}^{2} \approx \\ \approx [- \ln (\frac{f_{\min}}{{yr}^{- 1}})] \cdot {S / N}^{2} \propto - \ln (\frac{yr}{T_{obs}}) \cdot {S / N}^{2} . \end{matrix} \end{matrix} \end{matrix}$ $Mathematical equation: $$ \begin{aligned} {\left\{ \begin{array}{ll} \displaystyle F_{AA} = (2\mathrm{ln10})^2\cdot \sum _{f_k}\left(\frac{S_{\rm h}}{S_{\rm eff}}\right)^2 = (2\mathrm{ln10})^2 \cdot \mathrm{S/N}^2\\ \begin{aligned} \displaystyle F_{\gamma \gamma }&= \sum _{f_k}\left[-\mathrm{ln}\left(\frac{f_k}{\mathrm{yr^{-1}}}\right) \right]^2\left(\frac{S_{\rm h}}{S_{\rm eff}}\right)^2 \approx \\&\approx \left[\mathrm{ln}\left(\frac{f_{\rm min}}{\mathrm{yr^{-1}}}\right)\right]^2\cdot \mathrm{S/N}^2 \propto \left[\mathrm{ln}\left(\frac{\mathrm{yr}}{T_{\rm obs}}\right)\right]^2\cdot \mathrm{S/N}^2 \end{aligned}\\ \begin{aligned} \displaystyle F_{A \gamma }&= 2\mathrm{ln10}\cdot \sum _{f_k}\left[-\mathrm{ln}\left(\frac{f_k}{\mathrm{yr^{-1}}}\right) \right]\left(\frac{S_{\rm h}}{S_{\rm eff}}\right)^2 \approx \\&\approx \left[-\mathrm{ln}\left(\frac{f_{\rm min}}{\mathrm{yr^{-1}}}\right) \right]\cdot \mathrm{S/N}^2 \propto - \mathrm{ln}\left(\frac{\mathrm{yr}}{T_{\rm obs}}\right)\cdot \mathrm{S/N}^2. \end{aligned} \end{array}\right.} \end{aligned} $$$ (C.5)

From the Fisher matrix, we obtain the scalings in Eq. (12). These scalings are compared to the simulations in Fig. C.1. Here, we can see how the simulations are well described by the expected analytical scaling.

Fig. C.1.

Mean and 68%C.I. of the error on γ (left) and on log₁₀A (right) as a function of the mean and 68%C.I. of the HD S/N obtained from the simulations of DR2 5yr, DR2* 5yr and DR2full (25yr). The black solid line is the expected scaling ∝S/N⁻¹, while the dashed line accounts for the term dependent on the observation time that influences the error on γ and not the one on log₁₀A.

The angle between the horizontal axis and the principal axis of the posterior distribution is given by

$\begin{matrix} \begin{matrix} θ & = \frac{1}{2} \arctan (\frac{2 σ_{A γ}}{σ_{A}^{2} - σ_{γ}^{2}}) = \frac{1}{2} \arctan (\frac{2 F_{A γ}^{- 1}}{F_{AA}^{- 1} - F_{γ γ}^{- 1}}) \approx \\ \approx \frac{1}{2} \arctan (\frac{{[- \ln 10 \cdot \ln (\frac{yr}{T_{obs}}) \cdot {S / N}^{2}]}^{- 1}}{{[{(2 \ln 10)}^{2} \cdot {S / N}^{2}]}^{- 1} - {[\ln {(\frac{yr}{T_{obs}})}^{2} \cdot {S / N}^{2}]}^{- 1}}) = \\ = \frac{1}{2} \arctan (\frac{- {(\ln 10)}^{- 1} {[\ln (\frac{yr}{T_{obs}})]}^{- 1}}{{(2 \ln 10)}^{- 2} - {[\ln (\frac{yr}{T_{obs}})]}^{- 2}}) . \end{matrix} \end{matrix}$ $Mathematical equation: $$ \begin{aligned} \begin{aligned} \theta&= \frac{1}{2}\mathrm {arctan} \left(\frac{2\sigma _{A\gamma }}{\sigma _A^2 - \sigma _{\gamma }^2}\right) = \frac{1}{2}\mathrm {arctan} \left(\frac{2F^{-1}_{A\gamma }}{F^{-1}_{AA} - F^{-1}_{\gamma \gamma }}\right) \approx \\&\approx \frac{1}{2}\mathrm {arctan} \left(\frac{\left[- \mathrm{ln10}\cdot \mathrm{ln}\left(\frac{\mathrm{yr}}{T_{\rm obs}}\right)\cdot \mathrm{S/N}^2\right]^{-1}}{\left[(2\mathrm{ln10})^2 \cdot \mathrm{S/N}^2\right]^{-1} - \left[\mathrm{ln}\left(\frac{\mathrm{yr}}{T_{\rm obs}}\right)^2\cdot \mathrm{S/N}^2\right]^{-1}}\right) = \\&= \frac{1}{2}\mathrm {arctan} \left(\frac{-(\mathrm{ln10})^{-1} \left[\mathrm{ln}\left(\frac{\mathrm{yr}}{T_{\rm obs}}\right)\right]^{-1}}{(2\mathrm{ln10})^{-2} - \left[\mathrm{ln}\left(\frac{\mathrm{yr}}{T_{\rm obs}}\right)\right]^{-2}}\right). \end{aligned} \end{aligned} $$$ (C.6)

Since for $F_{A γ}^{- 1}$ $Mathematical equation: $ F^{-1}_{A\gamma} $$ and $F_{γ γ}^{- 1}$ $Mathematical equation: $ F^{-1}_{\gamma\gamma} $$ we do not know the exact value, but only an approximation, we can write the exact expression for θ by adding two unknown variables:

$\begin{matrix} θ = \frac{1}{2} \arctan (\frac{a \cdot {[\ln (\frac{yr}{T_{obs}})]}^{- 1}}{b - {[\ln (\frac{yr}{T_{obs}})]}^{- 2}}) . \end{matrix}$ $Mathematical equation: $$ \begin{aligned} \theta = \frac{1}{2}\mathrm {arctan} \left(\frac{{ a} \cdot \left[\mathrm{ln}\left(\frac{\mathrm{yr}}{T_{\rm obs}}\right)\right]^{-1}}{{ b}- \left[\mathrm{ln}\left(\frac{\mathrm{yr}}{T_{\rm obs}}\right)\right]^{-2}}\right) .\end{aligned} $$$ (C.7)

By fitting this function to the θ measured from the 100 realisation of DR2, we find that the relation is in very good agreement with the behaviour of the data, as can be seen from Fig. C.2.

Fig. C.2.

Orientation of the (γ_GWB, log₁₀A_GWB) joint posterior distribution as a function of the observation time. This angle is computed from the 100 simulations of DR2, from the Fisher matrix formalism and with the approximated formula shown in Eq. (C.7) fitted to the data.

All Tables

Table 1.

Simulated datasets used in this work.

In the text

Table 2.

Main analysis outputs for the five selected realisations and the real data.

In the text

All Figures

	Fig. 1. Characteristic strain spectrum of the injected GWB. Purple stars are the contributions to the spectrum of all the SMBHBs in the populations. The black solid line is the total power spectrum in each frequency bin, where the frequency bins start at f = 1/T_obs and have width Δf = 1/T_obs with T_obs = 24.8 yr. The power law approximation of the spectrum is also shown.
In the text

	Fig. 2. Top: S/N distribution computed from the 100 realisations of `DR2` (orange) and `DR2FC` (green). Bottom: difference in frequency coverage between the two kinds of datasets. The TOAs from all the pulsars are plotted together to highlight the drop in the frequency coverage occurring in `DR2` at 15 yr from now, which corresponds to the plateau of its S/N.
In the text

	Fig. 5. Median of the marginalised posterior distributions of γ_GWB and log₁₀A_GWB of the 100 realisations of `DR2full` and `DR2new` (shaded regions, 95%C.I.) and of the 5 realisations selected on the HD S/N (solid lines, 68 and 95% C.I.). As a reference, the median values and 2σ covariance principal axis obtained from the real data are displayed.
In the text

	Fig. 6. For pulsar PSR J1022+1001. Top: data from the real EPTA `DR2full`, `NANOGrav 16yr`, `PPTA 20yr`, `LOFAR` and `NenuFAR`. Bottom: Simplified simulation of the three combined PTAs .
In the text

	Fig. 9. Degeneracy coefficient (weighted average JSD) and median of the marginal posterior distribution of log₁₀A_GWB from the 100 simulations of `DR2 5yr` and `DR2* 5 yr`.
In the text

Fig. A.1.

In the text

	Fig. B.1. Distribution of the 100 values of the ratio between the HD S/N from the 25yr version of DR2full + IPTA and from the 10yr version.
In the text

Fig. C.1.

In the text

	Fig. C.2. Orientation of the (γ_GWB, log₁₀A_GWB) joint posterior distribution as a function of the observation time. This angle is computed from the 100 simulations of `DR2`, from the Fisher matrix formalism and with the approximated formula shown in Eq. (C.7) fitted to the data.
In the text

Current usage metrics show cumulative count of Article Views (full-text article views including HTML views, PDF and ePub downloads, according to the available data) and Abstracts Views on Vision4Press platform.

Data correspond to usage on the plateform after 2015. The current usage metrics is available 48-96 hours after online publication and is updated daily on week days.

Initial download of the metrics may take a while.

[R1] Afzal, A., Agazie, G., Anumarlapudi, A., et al. 2023, ApJ, 951, L11 [NASA ADS] [CrossRef] [Google Scholar]

[R2] Agazie, G., Anumarlapudi, A., Archibald, A. M., et al. 2023, ApJ, 951, L10 [CrossRef] [Google Scholar]

[R3] Agazie, G., Anumarlapudi, A., Archibald, A. M., et al. 2023, ArXiv e-prints [arXiv:2306.16221] [Google Scholar]

[R4] Babak, S., Falxa, M., Franciolini, G., & Pieroni, M. 2024, Phys. Rev. D, 110, 063022 [Google Scholar]

[R5] Bassa, C. G., Janssen, G. H., Stappers, B. W., et al. 2016, MNRAS, 460, 2207 [NASA ADS] [CrossRef] [Google Scholar]

[R6] Caprini, C., & Figueroa, D. G. 2018, Class. Quant. Grav., 35, 163001 [NASA ADS] [CrossRef] [Google Scholar]

[R7] Crisostomi, M., van Haasteren, R., Meyers, P. M., & Vallisneri, M. 2025, ArXiv e-prints [arXiv:2506.13866] [Google Scholar]

[R8] Donner, J. Y., Verbiest, J. P. W., Tiburzi, C., et al. 2020, A&A, 644, A153 [EDP Sciences] [Google Scholar]

[R9] Edwards, R. T., Hobbs, G. B., & Manchester, R. N. 2006, MNRAS, 372, 1549 [Google Scholar]

[R10] Ellis, J. A. 2013, Class. Quant. Grav., 30, 224004 [NASA ADS] [CrossRef] [Google Scholar]

[R11] EPTA and InPTA Collaboration (Antoniadis, J., et al.) 2024, A&A, 685, A94 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[R12] EPTA and InPTA Collaboration (Antoniadis, J., et al.) 2023a, A&A, 678, A50 [CrossRef] [EDP Sciences] [Google Scholar]

[R13] EPTA and InPTA Collaboration (Antoniadis, J., et al.) 2023b, A&A, 678, A49 [Google Scholar]

[R14] EPTA and InPTA Collaboration (Antoniadis, J., et al.) 2023c, A&A, 678, A48 [Google Scholar]

[R15] Ferranti, I., Falxa, M., Sesana, A., et al. 2025, A&A, 694, A38 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[R16] Foster, R. S., & Backer, D. C. 1990, ApJ, 361, 300 [NASA ADS] [CrossRef] [Google Scholar]

[R17] Goncharov, B., & Sardana, S. 2025, MNRAS, 537, 3470 [Google Scholar]

[R18] Goncharov, B., Sardana, S., Sesana, A., et al. 2025, Reading Signatures of Supermassive Binary Black Holes in Pulsar Timing Array Observations [Google Scholar]

[R19] Hellings, R. W., & Downs, G. S. 1983, ApJ, 265, L39 [NASA ADS] [CrossRef] [Google Scholar]

[R20] Iraci, F., Chalumeau, A., Tiburzi, C., et al. 2025, ArXiv e-prints [arXiv:2510.04639] [Google Scholar]

[R21] Jaffe, A. H., & Backer, D. C. 2003, ApJ, 583, 616 [NASA ADS] [CrossRef] [Google Scholar]

[R22] Larsen, B., Mingarelli, C. M. F., Baker, P. T., et al. 2025, MNRAS, 542, 3028 [Google Scholar]

[R23] Lentati, L., Shannon, R. M., Coles, W. A., et al. 2016, MNRAS, 458, 2161 [Google Scholar]

[R24] Lorimer, D. R., & Kramer, M. 2004, Handbook of Pulsar Astronomy, Vol. 4 [Google Scholar]

[R25] Miles, M. T., Shannon, R. M., Reardon, D. J., et al. 2025, MNRAS, 536, 1489 [Google Scholar]

[R26] Quelquejay Leclere, H., Li, K., Volonteri, M., et al. 2026, A&A, 705, A56 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[R27] Rajagopal, M., & Romani, R. W. 1995, ApJ, 446, 543 [NASA ADS] [CrossRef] [Google Scholar]

[R28] Reardon, D. J., Zic, A., Shannon, R. M., et al. 2023, ApJ, 951, L6 [NASA ADS] [CrossRef] [Google Scholar]

[R29] Rosado, P. A., Sesana, A., & Gair, J. 2015, MNRAS, 451, 2417 [NASA ADS] [CrossRef] [Google Scholar]

[R30] Sesana, A. 2013, Class. Quant. Grav., 30, 244009 [Google Scholar]

[R31] Sesana, A., Vecchio, A., & Colacino, C. N. 2008, MNRAS, 390, 192 [NASA ADS] [CrossRef] [Google Scholar]

[R32] Speri, L., Porayko, N. K., Falxa, M., et al. 2023, MNRAS, 518, 1802 [Google Scholar]

[R33] Susarla, S. C., Chalumeau, A., Tiburzi, C., et al. 2024, A&A, 692, A18 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[R34] Tiburzi, C., Hobbs, G., Kerr, M., et al. 2016, MNRAS, 455, 4339 [Google Scholar]

[R35] Vallisneri, M. 2020, Astrophysics Source Code Library [record ascl:2002.017] [Google Scholar]

[R36] Valtolina, S., Shaifullah, G., Samajdar, A., & Sesana, A. 2024, A&A, 683, A201 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[R37] van Haasteren, R. 2024, ApJS, 273, 23 [Google Scholar]

[R38] van Haasteren, R. 2025, MNRAS, 537, L1 [Google Scholar]

[R39] van Haasteren, R., & Vallisneri, M. 2014, Phys. Rev. D, 90, 104012 [NASA ADS] [CrossRef] [Google Scholar]

[R40] Verbiest, J. P. W., Lentati, L., Hobbs, G., et al. 2016, MNRAS, 458, 1267 [Google Scholar]

[R41] Verbiest, J. P. W., & Shaifullah, G. M. 2018, Class. Quant. Grav., 35, 133001 [NASA ADS] [CrossRef] [Google Scholar]

[R42] Vigeland, S. J., Islo, K., Taylor, S. R., & Ellis, J. A. 2018, Phys. Rev. D, 98, 044003 [NASA ADS] [CrossRef] [Google Scholar]

[R43] Wyithe, J. S. B., & Loeb, A. 2003, ApJ, 590, 691 [NASA ADS] [CrossRef] [Google Scholar]

[R44] Xu, H., Chen, S., Guo, Y., et al. 2023, Res. Astron. Astrophys., 23, 075024 [CrossRef] [Google Scholar]