A roadmap to cosmological parameter analysis with third-order shear statistics

Laila Linke; Sven Heydenreich; Pierre A. Burger; Peter Schneider

doi:10.1051/0004-6361/202245652

Home

All issues

Volume 672 (April 2023)

A&A, 672 (2023) A185

Full HTML

Open Access

Issue		A&A Volume 672, April 2023


Article Number		A185
Number of page(s)		27
Section		Cosmology (including clusters of galaxies)
DOI		https://doi.org/10.1051/0004-6361/202245652
Published online		21 April 2023

A&A 672, A185 (2023)

II. Analytic covariance estimate^⋆

Laila Linke, Sven Heydenreich, Pierre A. Burger and Peter Schneider

Argelander-Institut für Astronomie, Auf dem Hügel 71, 53121 Bonn, Germany
e-mail: llinke@astro.uni-bonn.de

Received: 8 December 2022
Accepted: 14 February 2023

Abstract

Context. Third-order weak lensing statistics are a promising tool for cosmological analyses since they extract cosmological information in the non-Gaussianity of the cosmic large-scale structure. However, such analyses require precise and accurate models for the covariance of the statistics.

Aims. In this second paper of a series on third-order weak lensing statistics, we derive and validate an analytic model for the covariance of the third-order aperture statistics 〈M_ap³〉.

Methods. We derive the covariance model from a real-space estimator for 〈M_ap³〉, including the Gaussian and non-Gaussian parts. We validate the model by comparing it to estimates from simulated Gaussian random fields (GRFs) and two sets of N-body simulations. Finally, we perform mock cosmological analyses with the model covariance and the simulation estimate to compare the resulting parameter constraints.

Results. We find good agreement between the analytic model and the simulations, both for the GRFs and the N-body simulations. The figure of merit in the S₈ − Ω_m plane from our covariance model is within 3% of the one obtained from the simulated covariances. We also show that our model, which is based on an estimator using convergence maps, can be used to obtain upper and lower bounds for the covariance of an estimator based on three-point shear correlation functions. This second estimator is required to measure 〈M_ap³〉 in realistic survey data. In our derivation, we find that the covariance of 〈M_ap³〉 cannot be obtained from the bispectrum covariance and that it includes several ‘finite-field terms’ that do not scale with the inverse survey area.

Conclusions. Our covariance model is sufficiently accurate for analysing stage III surveys. Covariances for statistics in Fourier space cannot always be straightforwardly converted into covariance for real-space statistics.

Key words: gravitational lensing: weak / cosmological parameters / methods: statistical / large-scale structure of Universe / methods: analytical

^⋆

The modelling code is available at https://github.com/sheydenreich/threepoint/releases

© The Authors 2023

Open Access article, published by EDP Sciences, under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

This article is published in open access under the Subscribe to Open model. Subscribe to A&A to support open access publication.

1. Introduction

Using weak gravitational lensing of the cosmic large-scale structure (LSS) has become a precise and accurate method to infer cosmological parameters (Heymans et al. 2021; Hikage et al. 2019; Abbott et al. 2022). Weak lensing analyses have mainly concentrated on second-order statistics, such as shear two-point correlation functions or galaxy-galaxy-lensing. However, since the cosmic LSS is non-Gaussian, second-order statistics do not contain its total information content. To access the additional information, higher-order statistics (HOS) are needed. These HOS depend on cosmological parameters differently than second-order statistics, so they can be used to break degeneracies (Takada & Jain 2004; Kilbinger & Schneider 2005; Kayo et al. 2013) or constrain nuisance parameters such as galaxy bias or intrinsic alignment (Pyne & Joachimi 2021; Troxel & Ishak 2012).

Many HOS have been suggested to complement second-order analyses, for example, peak statistics (Kacprzak et al. 2016; Zürcher et al. 2021; Martinet et al. 2021), density split statistics (Friedrich et al. 2018; Gruen et al. 2018; Burger et al. 2023), or persistent homology (Heydenreich et al. 2021, 2022). Most of these statistics, though, cannot be modelled from analytical theories and instead require time-consuming realistic N-body simulations. Additionally, for many HOS, the measurement requires converting weak lensing shear estimates to mass convergence maps. This conversion becomes complicated in the presence of masks and for finite survey areas, which can lead to biased estimators (Seitz & Schneider 1996).

In this series of papers, we are considering the third-order aperture mass $〈 M_{ap}^{3} 〉$ ${\left\langle{{M^3_{\rm ap}}}\right\rangle}$ , which is not affected by these problems. It is linearly related to the third-order shear correlation functions Γ_i (Jarvis et al. 2004; Schneider et al. 2005), which can be directly measured in shear catalogues and converted into $〈 M_{ap}^{3} 〉$ ${\left\langle{{M^3_{\rm ap}}}\right\rangle}$ . This was demonstrated recently by Secco et al. (2022), who measured $〈 M_{ap}^{3} 〉$ ${\left\langle{{M^3_{\rm ap}}}\right\rangle}$ with high significance from the Dark Energy Survey Year 3 shear catalogues. Moreover, in contrast to many other HOS, $〈 M_{ap}^{3} 〉$ ${\left\langle{{M^3_{\rm ap}}}\right\rangle}$ and Γ_i can be modelled analytically without directly relying on simulations.

There are several advantages to using $〈 M_{ap}^{3} 〉$ ${\left\langle{{M^3_{\rm ap}}}\right\rangle}$ instead of Γ_i, among them that $〈 M_{ap}^{3} 〉$ ${\left\langle{{M^3_{\rm ap}}}\right\rangle}$ compress the information in the Γ_i from a data vector with thousands of entries to one containing only tens of entries in a non-tomographic setup, and that the $〈 M_{ap}^{3} 〉$ ${\left\langle{{M^3_{\rm ap}}}\right\rangle}$ have a clear separation into E- and B-modes, allowing for the detection of untreated systematic effects. Our goal is to lay the groundwork for a cosmological analysis with $〈 M_{ap}^{3} 〉$ ${\left\langle{{M^3_{\rm ap}}}\right\rangle}$ . Such an analysis requires two ingredients: a model of $〈 M_{ap}^{3} 〉$ ${\left\langle{{M^3_{\rm ap}}}\right\rangle}$ and an estimate of its covariance matrix. The first ingredient is the subject of Heydenreich et al. (2023, Paper I hereafter), where we present and test an analytic model of $〈 M_{ap}^{3} 〉$ ${\left\langle{{M^3_{\rm ap}}}\right\rangle}$ based on the BiHalofit bispectrum model (Takahashi et al. 2020). Here, we are concerned with the second ingredient, which is the covariance.

The covariance of the estimator of a statistic can be obtained in three ways. The first possibility is applying the estimator to a large set of (quasi-)independent cosmological N-body simulations and using the sample covariance of the estimates. However, an unbiased and precise estimate requires many more realisations than entries in the data vector. For example, for a tomographic analysis of $〈 M_{ap}^{3} 〉$ ${\left\langle{{M^3_{\rm ap}}}\right\rangle}$ with four redshift bins and four different scale radii, the data vector contains 400 entries, so hundreds of simulations are needed for a simulated covariance estimate. Such an estimate would be very time-consuming, as both the creation of this many independent simulations and the measurement of $〈 M_{ap}^{3} 〉$ ${\left\langle{{M^3_{\rm ap}}}\right\rangle}$ in them is computationally demanding. The second possibility is to estimate the covariance directly from the data using jackknife resampling or bootstrap methods. For this, the survey area is divided into smaller patches, the statistics are estimated separately on each of these patches, and the sample covariance of these individual estimates is taken as the covariance estimate. However, this method has two main disadvantages. First, information on correlations larger than the patch size is lost. Since the number of patches must be at least as large as the length of the data vector, the required number of patches becomes large and their size small. Second, the method implicitly assumes that the individual patches are independent of each other. This assumption is not true, in particular, if small, neighbouring patches are considered. This causes biases in the covariance estimate. Third, triplets covering two or three patches need to be neglected for the estimation of the data vector. Otherwise, the covariance from jackknife or bootstrap resampling would overestimate the true covariance of the data.

Instead, cosmic shear covariances can also be calculated analytically for a given estimator. This was done, for example, for two-point statistics (Joachimi et al. 2008) or the matter bispectrum (Joachimi et al. 2009). While the derivation of an expression for a covariance can be tedious, once it is done, evaluating them for the characteristics of a specific survey is usually accurate and quick, at least compared to running hundreds of cosmological simulations. It is also easier to infer the dependence of the covariance on survey area and geometry from an analytic expression. Consequently, we follow this approach and derive an analytical expression for the covariance of $〈 M_{ap}^{3} 〉$ ${\left\langle{{M^3_{\rm ap}}}\right\rangle}$ . We test our result by comparing it with the covariance measured in simulated mock data, both simple Gaussian shear fields and full cosmological simulations, and investigate the influence of different terms in the covariance by performing mock cosmological analyses.

In our derivation, we make five crucial findings. First, the covariance consists of six terms, two of which belong to the Gaussian part and four to the non-Gaussian part. These terms all depend on the survey area and window function (Sects. 3.1 and 3.2). Second, the individual covariance terms can be estimated from correlation functions of the aperture mass map. These correlation functions need to be known only on scales inside the survey area (Sect. 3.3). Third, under the ‘large-field approximation’, which assumes a broad survey window function, two covariance terms vanish, while the others scale inversely with the survey area. The vanishing terms decrease faster than 1/A with survey area A. We refer to them as finite-field terms. One of these terms is already present for Gaussian fields (Sect. 3.4). Fourth, the covariance of $〈 M_{ap}^{3} 〉$ ${\left\langle{{M^3_{\rm ap}}}\right\rangle}$ cannot be obtained from the bispectrum covariance unless the large-field approximation is valid. Using a linear transformation of the covariance of a Fourier space quantity to obtain the covariance for a real-space observable has been done for second-order statistics (Joachimi et al. 2021; Friedrich et al. 2021). However, for $〈 M_{ap}^{3} 〉$ ${\left\langle{{M^3_{\rm ap}}}\right\rangle}$ this approach already fails for Gaussian fields if the large-field approximation is not used (Sect. 4). Finally, our covariance model, based on an estimator using convergence maps, can be used to obtain upper and lower limits on the covariance for an estimator based on third-order shear correlation functions (Sect. 5.3.2).

This paper is structured as follows. In Sect. 2 we give a short overview of third-order shear statistics, and we introduce our notation, the third-order aperture mass $〈 M_{ap}^{3} 〉$ ${\left\langle{{M^3_{\rm ap}}}\right\rangle}$ , and its estimator. We derive the covariance of this estimator in Sect. 3. In Sect. 4 we show that an alternative derivation of the $〈 M_{ap}^{3} 〉$ ${\left\langle{{M^3_{\rm ap}}}\right\rangle}$ covariance from the bispectrum covariance is not correct. We validate the model by comparing its predictions to simulated data in Sect. 5. Finally, in Sect. 6, we perform mock cosmological analyses to compare the results from the validation data to our model and investigate the impact of the individual covariance terms. We conclude with a summary and discussion in Sect. 7. Throughout this paper, we assume a spatially flat universe. Our covariance modelling code is publically available¹.

2. Theoretical background

In this section, we briefly review some fundamental weak lensing quantities relevant to third-order shear statistics and the covariance calculation in the remainder of this work. We refer to Bartelmann & Schneider (2001), Hoekstra & Jain (2008) or Bartelmann (2010) for in-depth reviews on weak lensing.

One of the fundamental quantities in weak gravitational lensing is the convergence κ(ϑ), which is the normalised surface mass density at angular position ϑ. In a flat universe, it is related to the density contrast δ(χϑ, χ) at angular position ϑ and comoving distance χ via

$\begin{matrix} κ (ϑ) = \frac{3 H_{0}^{2} Ω_{m}}{2 c^{2}} \int_{0}^{\infty} d χ q (χ) \frac{δ (χ ϑ, χ)}{a (χ)}, where q (χ) = \int_{χ}^{\infty} d χ^{'} p (χ^{'}) \frac{χ^{'} - χ}{χ^{'}}, \end{matrix}$ $\begin{aligned} \kappa (\boldsymbol{\vartheta }) = \frac{3H_0^2\Omega _{\rm m}}{2c^2}\int _0^\infty \mathrm{d}{\chi }\; q(\chi )\,\frac{\delta (\chi \boldsymbol{\vartheta }, \chi )}{a(\chi )}\;, \mathrm{where}\ q(\chi ) = \int _\chi ^\infty \mathrm{d}{\chi^\prime }\, p(\chi^\prime )\, \frac{\chi^\prime -\chi }{\chi^\prime }\;, \end{aligned}$ (1)

with the Hubble constant H₀, the matter density parameter Ω_m, the speed-of-light c, the cosmic scale factor a(χ) at χ, normalised to unity today, and the probability distribution p(χ) dχ of source galaxies in comoving distance.

We treat κ(ϑ) as a homogenous and isotropic random field, which is characterised by the full set of its polyspectra 𝒫_n, defined by

$\begin{matrix} {〈 \tilde{κ} (ℓ_{1}) \dots \tilde{κ} (ℓ_{n}) 〉}_{c} = P_{n} (ℓ_{1}, \dots, ℓ_{n}) {(2 π)}^{2} δ_{D} (ℓ_{1} + \dots + ℓ_{n}), \end{matrix}$ $\begin{aligned} \left\langle \tilde{\kappa }(\boldsymbol{\ell }_1)\dots \tilde{\kappa }(\boldsymbol{\ell }_n)\right\rangle _{\rm c} = \mathcal{P} _n(\boldsymbol{\ell }_1,\dots ,\boldsymbol{\ell }_n)\,(2\pi )^2\, \delta _{\rm D}(\boldsymbol{\ell }_1+\dots +\boldsymbol{\ell }_n)\;, \end{aligned}$ (2)

where $\tilde{κ} (ℓ)$ $\tilde{\kappa}({{\boldsymbol{\ell}}})$ is the Fourier transform of κ and the ⟨…⟩_c are connected correlation functions. Interesting for us in this work is the powerspectrum P, bispectrum B, trispectrum T, and the pentaspectrum P₆, defined as

$\begin{matrix} P (ℓ) = P_{2} (ℓ, - ℓ), B (ℓ_{1}, ℓ_{2}) = P_{n} (ℓ_{1}, ℓ_{2}, - ℓ_{1} - ℓ_{2}), T (ℓ_{1}, ℓ_{2}, ℓ_{3}) = P_{n} (ℓ_{1}, ℓ_{2}, ℓ_{3}, - ℓ_{1} - ℓ_{2} - ℓ_{3}), \end{matrix}$ $\begin{aligned} P(\ell ) = P_2(\boldsymbol{\ell }, -\boldsymbol{\ell })\,, \quad B(\boldsymbol{\ell }_1, \boldsymbol{\ell }_2) = \mathcal{P} _n(\boldsymbol{\ell }_1, \boldsymbol{\ell }_2, -\boldsymbol{\ell }_1-\boldsymbol{\ell }_2)\,, \quad T(\boldsymbol{\ell }_1, \boldsymbol{\ell }_2, \boldsymbol{\ell }_3) = \mathcal{P} _n(\boldsymbol{\ell }_1, \boldsymbol{\ell }_2, \boldsymbol{\ell }_3, -\boldsymbol{\ell }_1-\boldsymbol{\ell }_2-\boldsymbol{\ell }_3)\,, \end{aligned}$ (3)

and

$\begin{matrix} P_{6} (ℓ_{1}, ℓ_{2}, ℓ_{3}, ℓ_{4}, ℓ_{5}) = P_{n} (ℓ_{1}, ℓ_{2}, ℓ_{3}, ℓ_{4}, ℓ_{5}, - ℓ_{1} - ℓ_{2} - ℓ_{3} - ℓ_{4} - ℓ_{5}) . \end{matrix}$ $\begin{aligned}&P_6(\boldsymbol{\ell }_1, \boldsymbol{\ell }_2, \boldsymbol{\ell }_3, \boldsymbol{\ell }_4, \boldsymbol{\ell }_5)= \mathcal{P} _n(\boldsymbol{\ell }_1, \boldsymbol{\ell }_2, \boldsymbol{\ell }_3, \boldsymbol{\ell }_4, \boldsymbol{\ell }_5, -\boldsymbol{\ell }_1-\boldsymbol{\ell }_2-\boldsymbol{\ell }_3-\boldsymbol{\ell }_4-\boldsymbol{\ell }_5)\;. \end{aligned}$ (4)

The κ-polyspectra of are related to the polyspectra $P_{n}^{(3d)}$ $\mathcal{P}_n^\mathrm{(3d)}$ of the three-dimensional density contrast δ, using the Limber-approximation (Kaiser & Jaffe 1997; Kayo et al. 2013),

$\begin{matrix} P_{n} (ℓ_{1}, \dots, ℓ_{n}) & = {(\frac{3 H_{0}^{2} Ω_{m}}{2 c^{2}})}^{n} \int_{0}^{\infty} d χ \frac{q^{n} (χ)}{χ^{n - 2} a {(χ)}^{n}} P_{n}^{(3 d)} (ℓ_{1} / χ, \dots, ℓ_{n} / χ, χ) . \end{matrix}$ $\begin{aligned} \mathcal{P} _n(\boldsymbol{\ell }_1,\dots ,\boldsymbol{\ell }_n)&=\left(\frac{3H_0^2\Omega _{\rm m}}{2c^2}\right)^n\,\int _0^\infty \mathrm{d}{\chi }\; \frac{q^n(\chi )}{\chi ^{n-2}\, a(\chi )^n}\, \mathcal{P} _n^\mathrm{(3d)} (\boldsymbol{\ell }_1/\chi ,\dots ,\boldsymbol{\ell }_n/\chi , \chi )\;. \end{aligned}$ (5)

We model the three-dimensional power spectrum with the revised Halofit prescription by Takahashi et al. (2012) and the three-dimensional bispectrum with BiHalofit (Takahashi et al. 2020). For the tri- and pentaspectrum, we use the halo model (Cooray & Sheth 2002) with a Sheth–Tormen halo mass function and halo bias (Sheth & Tormen 1999) and Navarro–Frenk–White-halo profiles. To simplify our calculation, we use only the 1-halo term for the trispectrum, which we expect to be dominant at the considered ℓ-scales. For the pentaspectrum, we use the 1- and part of the 2-halo term (see Sect. 3.4 and Appendix C).

The convergence cannot be directly observed, but it is related to the weak lensing shear γ, which describes the change in the observed ellipticity of source galaxies due to the lensing effect. For third-order shear statistics, we are interested in the three-point correlation function of the shear, whose so-called natural components Γ_i were derived by Schneider & Lombardi (2003). However, these components are challenging to handle in practice since they require a large number of bins (typically of the order of 10³) and are complex to model because they relate to the matter bispectrum via multi-dimensional integrals involving oscillating Bessel functions. As shown in Paper I, more practical quantities with similar information content are the third-order aperture statistics $〈 M_{ap}^{3} 〉$ ${\left\langle{{M^3_{\rm ap}}}\right\rangle}$ .

Aperture statistics are moments of the aperture mass M_ap, a smoothed convergence map, which depends on its position α and a characteristic smoothing scale θ,

$\begin{matrix} M_{ap} (α, θ) = \int d^{2} ϑ U_{θ} (| α - ϑ |) κ (ϑ), with U_{θ} (ϑ) = \frac{1}{θ^{2}} u (\frac{ϑ}{θ}), \end{matrix}$ $\begin{aligned} M_{\rm ap}(\boldsymbol{\alpha }, \theta )=\int \mathrm{d}^2{\vartheta }\; U_\theta (|\boldsymbol{\alpha }- \boldsymbol{\vartheta }|)\, \kappa (\boldsymbol{\vartheta })\;, \mathrm{with}\ U_\theta (\vartheta ) = \frac{1}{\theta ^2}\, u\left(\frac{\vartheta }{\theta }\right), \end{aligned}$ (6)

where U_θ is a filter function with aperture scale radius θ. As long as this is a compensated filter function, meaning ∫dϑϑ U_θ(ϑ) = 0, the aperture mass can be estimated from the tangential shear γ_t instead of the convergence (Schneider 1996),

$\begin{matrix} M_{ap} (α, θ) = \int d^{2} ϑ Q_{θ} (| α - ϑ |) γ_{t} (α ; ϑ), with Q_{θ} (ϑ) = \frac{2}{ϑ^{2}} \int_{0}^{ϑ} d ϑ^{'} ϑ^{'} U_{θ} (ϑ^{'}) - U_{θ} (ϑ) . \end{matrix}$ $\begin{aligned} M_{\rm ap}(\boldsymbol{\alpha }, \theta )=\int \mathrm{d}^2{\vartheta }\; Q_\theta (|\boldsymbol{\alpha }- \boldsymbol{\vartheta }|)\, \gamma _{\rm t}(\boldsymbol{\alpha };\boldsymbol{\vartheta }), \quad \mathrm{with}\quad Q_\theta (\vartheta )=\frac{2}{\vartheta ^2}\, \int _0^\vartheta \mathrm{d}{\vartheta^\prime }\; \vartheta^\prime U_\theta (\vartheta^\prime ) - U_\theta (\vartheta ). \end{aligned}$ (7)

Here, we are interested in the covariance of the third-order aperture statistics $〈 M_{ap}^{3} 〉$ ${\left\langle{{M^3_{\rm ap}}}\right\rangle}$ , given by

$\begin{matrix} 〈 M_{ap}^{3} 〉 (θ_{1}, θ_{2}, θ_{3}) = 〈 M_{ap} (α, θ_{1}) M_{ap} (α, θ_{2}) M_{ap} (α, θ_{3}) 〉, \end{matrix}$ $\begin{aligned} \left\langle M_{\rm ap}^3\right\rangle (\theta _1, \theta _2, \theta _3)= \left\langle M_{\rm ap}(\boldsymbol{\alpha }, \theta _1)\,M_{\rm ap}(\boldsymbol{\alpha }, \theta _2)\,M_{\rm ap}(\boldsymbol{\alpha }, \theta _3)\right\rangle , \end{aligned}$ (8)

which is our main observable for third-order shear statistics. $〈 M_{ap}^{3} 〉$ ${\left\langle{{M^3_{\rm ap}}}\right\rangle}$ is related to the convergence bispectrum by

$\begin{matrix} 〈 M_{ap}^{3} 〉 (θ_{1}, θ_{2}, θ_{3}) & = \int \frac{d^{2} ℓ_{1}}{{(2 π)}^{2}} \int \frac{d^{2} ℓ_{2}}{{(2 π)}^{2}} \tilde{u} (ℓ_{1} θ_{1}) \tilde{u} (ℓ_{2} θ_{2}) \tilde{u} (| ℓ_{1} + ℓ_{2} | θ_{3}) B (ℓ_{1}, ℓ_{2}) . \end{matrix}$ $\begin{aligned} \left\langle M_{\rm ap}^3\right\rangle (\theta _1,\theta _2,\theta _3)&= \int \frac{\mathrm{d}^2{\ell _1}}{(2\pi )^2} \int \frac{\mathrm{d}^2{\ell _2}}{(2\pi )^2}\; \tilde{u}(\ell _1\,\theta _1)\,\tilde{u}(\ell _2\,\theta _2)\,\tilde{u}\left(|\boldsymbol{\ell }_1+\boldsymbol{\ell }_2|\,\theta _3\right)\, B(\boldsymbol{\ell }_1,\boldsymbol{\ell }_2). \end{aligned}$ (9)

For the comparison of the analytical covariance to mock data in Sect. 5 and the estimate of the cosmological parameter analysis in Sect. 6 we choose the exponential filter function from Crittenden et al. (2002),

$\begin{matrix} u (x) = \frac{1}{2 π} (1 - \frac{x^{2}}{2}) exp (- \frac{x^{2}}{2}) . \end{matrix}$ $\begin{aligned} u(x) = \frac{1}{2\pi } \, \left(1-\frac{x^2}{2}\right)\, \exp \left(-\frac{x^2}{2}\right). \end{aligned}$ (10)

The normalisation of the aperture filter functions is chosen such that ∫d²ϑQ(ϑ) = 1.

Based on Eq. (6), M_ap(α, θ) can be estimated from convergence fields using

$\begin{matrix} {\hat{M}}_{ap} (α, θ) = \int_{A^{'}} d^{2} ϑ U_{θ} (| α - ϑ |) κ (ϑ), \end{matrix}$ $\begin{aligned} \hat{M}_{\rm ap}(\boldsymbol{\alpha }, \theta ) = \int _{A^\prime }\mathrm{d}^2{\vartheta } U_\theta (|\boldsymbol{\alpha }-\boldsymbol{\vartheta }|)\, \kappa (\boldsymbol{\vartheta }), \end{aligned}$ (11)

where A′ is the full area of the convergence field κ. This estimator can be visualised as placing an aperture of radius θ centred at position α and averaging the convergence values within the aperture weighted by the filter function U_θ (see Fig. 1). The $〈 M_{ap}^{3} 〉 (θ_{1}, θ_{2}, θ_{3})$ ${\left\langle{{M^3_{\rm ap}}}\right\rangle}(\theta_1, \theta_2, \theta_3)$ can then be estimated by averaging the product of the ${\hat{M}}_{ap} (α, θ)$ ${\hat{M}_{\text{ ap}}}({\boldsymbol{\alpha}}, \theta)$ for the three filter radii over the aperture positions α. However, for α close to the border of the survey area, ${\hat{M}}_{ap} (α, θ)$ ${\hat{M}_{\text{ ap}}}({\boldsymbol{\alpha}}, \theta)$ is biased because the aperture includes regions outside the survey where κ is not known. Consequently, for an unbiased estimate of $〈 M_{ap}^{3} 〉$ ${\left\langle{{M^3_{\rm ap}}}\right\rangle}$ , we only average over an area A smaller than A′, which leads to the estimator

$\begin{matrix} {\hat{M}}_{ap}^{3} (θ_{1}, θ_{2}, θ_{3}) & = \frac{1}{A} \int_{A} d^{2} α \prod_{i = 1}^{3} \int_{A^{'}} d^{2} ϑ_{i} U_{θ_{i}} (| α - ϑ_{i} |) κ (ϑ_{i}) . \end{matrix}$ $\begin{aligned} \hat{M}_{\rm ap}^3(\theta _1, \theta _2, \theta _3)&= \frac{1}{A}\int _A \mathrm{d}^2{\alpha }\, \prod _{i=1}^3 \int _{A^\prime } \mathrm{d}^2{\boldsymbol{\vartheta }_i} U_{\theta _i}(|\boldsymbol{\alpha }-\boldsymbol{\vartheta }_i|)\,\kappa (\boldsymbol{\vartheta }_i). \end{aligned}$ (12)

Fig. 1.

Illustration of aperture mass estimation. The area A′ is the size of the full convergence field, on which we place apertures with scale radius θ, illustrated by the circles, to obtain M_ap(α, θ). Apertures centred on positions α within the smaller area A lie completely within A′, while apertures centred on positions α′ outside of A extend outside of A′, so M_ap(α′,θ) is biased.

We have assumed here that apertures are placed densely so that the separation between aperture centres is much smaller than the aperture radius θ, to use a continuous integral over α instead of a sum. If A′ is large enough that all U_θ centred on points within A vanish outside of A′, we can replace A′ with the full ℝ². We also introduce the survey window function W_A(α), which is unity for α inside A and zero otherwise. With W, we convert integrals over A into integrals over ℝ², which we denote as two-dimensional integrals without integration borders in the following. The estimator for $〈 M_{ap}^{3} 〉$ ${\left\langle{{M^3_{\rm ap}}}\right\rangle}$ becomes

$\begin{matrix} {\hat{M}}_{ap}^{3} (θ_{1}, θ_{2}, θ_{3}) & = \frac{1}{A} \int d^{2} α W_{A} (α) [\prod_{i = 1}^{3} \int d^{2} ϑ_{i} U_{θ_{i}} (| α - ϑ_{i} |) κ (ϑ_{i})] . \end{matrix}$ $\begin{aligned} \hat{M}_{\rm ap}^3(\theta _1, \theta _2, \theta _3)&= \frac{1}{A}\,\int \mathrm{d}^2{\alpha }\,W_A(\boldsymbol{\alpha })\, \left[\prod _{i=1}^3 \int \mathrm{d}^2{\boldsymbol{\vartheta }_i} U_{\theta _i}(|\boldsymbol{\alpha }-\boldsymbol{\vartheta }_i|)\,\kappa (\boldsymbol{\vartheta }_i)\right]. \end{aligned}$ (13)

This estimator is easily applied to simulations, where convergence maps without masks are available (see Sect. 5.2.1). However, this estimator should not be used for survey data that includes masked areas. In that case, one would instead measure the correlation functions Γ_i and convert these into $〈 M_{ap}^{3} 〉$ ${\left\langle{{M^3_{\rm ap}}}\right\rangle}$ (see the discussion in Sect. 5.3 of Heydenreich et al. 2023). In this approach, one does not need to decrease the survey area from A′ to A because no apertures are laid down. Therefore, the covariance for this estimator is smaller than the covariance for the ${\hat{M}}_{ap}^{3}$ ${{\hat{M}_{\text{ ap}}}^3}$ in Eq. (13). However, the unbiased construction of the aperture mass on the whole area A′ requires shear estimates outside of A′ as well. Therefore, even Γ_i measured from shear estimates on all of A′ cannot include the full information that would be contained in an unbiased estimate of ${\hat{M}}_{ap}^{3}$ ${{\hat{M}_{\text{ ap}}}^3}$ on A′. Consequently, in the absence of masked regions, we expect the magnitude of the covariance for the estimator based on the Γ_i to be between the covariance for ${\hat{M}}_{ap}^{3}$ ${{\hat{M}_{\text{ ap}}}^3}$ for a survey area of A and a survey area of A′. We verify this expectation in Sect. 5.

3. Derivation of aperture statistics covariance from the real-space estimator

We now derive the covariance $C_{{\hat{M}}_{ap}^{3}}$ $C_{{{\hat{M}_{\text{ ap}}}^3}}$ of ${\hat{M}}_{ap}^{3}$ ${{\hat{M}_{\text{ ap}}}^3}$ . An overview of our calculation is given in Fig. 2, with references to the main equations in our derivation. In the derivation, we find that the covariance comprises several terms, including various permutations of scale radii. For simplicity, we write here ‘Perm’. to indicate permutations; the complete list of permutations for all terms is given in Appendix A. We are not explicitly addressing the effect of shape noise in this section, but shape noise can be easily included in the derived expressions by replacing the power spectrum P with $P + σ_{ϵ}^{2} / 2 n$ $P+{\sigma_\epsilon^2}/{2n}$ , where n is the galaxy number density and $σ_{ϵ}^{2}$ $\sigma_\epsilon^2$ the two-component ellipticity dispersion. We show the validity of this treatment in Appendix B.

Fig. 2.

Schematic representation of the calculation of the covariance $C_{{\hat{M}}_{ap}^{3}}$ $C_{{{\hat{M}_{\text{ ap}}}^3}}$ . The covariance is given by the difference between $〈 {\hat{M}}_{ap}^{3} {\hat{M}}_{ap}^{3} 〉$ ${\left\langle{{\hat{M}_{\text{ ap}}}^3}\,{{\hat{M}_{\text{ ap}}}^3}\right\rangle}$ and $〈 {\hat{M}}_{ap}^{3} 〉 〈 {\hat{M}}_{ap}^{3} 〉$ ${\left\langle{{\hat{M}_{\text{ ap}}}^3}\right\rangle}\,{\left\langle{{\hat{M}_{\text{ ap}}}^3}\right\rangle}$ , the first of which can be decomposed (indicated by solid arrows) into one Gaussian (denoted by G) and three non-Gaussian parts (denoted by BB, PT, and P₆). We discuss the Gaussian part in Sect. 3.1 and the non-Gaussian parts in Sect. 3.2. These parts can be further decomposed into terms depending on different permutations of the aperture scale radii, called T_PPP, 1 to T_P₆. For large survey areas, the T_i can be approximated, indicated by dashed arrows, as shown in Sect. 3.4. Under this approximation, two terms vanish, which is why we term them ‘finite-field terms’. Also shown are equation numbers for the relevant expressions.

The covariance is

$\begin{matrix} C_{{\hat{M}}_{ap}^{3}} (Θ_{1}, Θ_{2}) & = 〈 {\hat{M}}_{ap}^{3} {\hat{M}}_{ap}^{3} 〉 (Θ_{1}, Θ_{2}) - 〈 {\hat{M}}_{ap}^{3} (Θ_{1}) 〉 〈 {\hat{M}}_{ap}^{3} (Θ_{2}) 〉, \end{matrix}$ $\begin{aligned} C_{\hat{M}_{\rm ap}^3}(\Theta _1, \Theta _2)&= \left\langle \hat{M}_{\rm ap}^3\,\hat{M}_{\rm ap}^3\right\rangle (\Theta _1, \Theta _2)- \left\langle \hat{M}_{\rm ap}^3(\Theta _1)\right\rangle \, \left\langle \hat{M}_{\rm ap}^3(\Theta _2)\right\rangle , \end{aligned}$ (14)

where Θ₁ = (θ₁, θ₂, θ₃) and Θ₂ = (θ₄, θ₅, θ₆). With Eq. (13) and κ_i := κ(ϑ_i),

$\begin{matrix} 〈 {\hat{M}}_{ap}^{3} {\hat{M}}_{ap}^{3} 〉 (Θ_{1}, Θ_{2}) & = \frac{1}{A^{2}} \int d^{2} α_{1} \int d^{2} α_{2} W_{A} (α_{1}) W_{A} (α_{2}) [\prod_{i = 1}^{3} \int d^{2} ϑ_{i} U_{θ_{i}} (| α_{1} - ϑ_{i} |)] [\prod_{j = 4}^{6} \int d^{2} ϑ_{j} U_{θ_{j}} (| α_{2} - ϑ_{j} |)] \\ \times 〈 κ_{1} κ_{2} κ_{3} κ_{4} κ_{5} κ_{6} 〉 . \end{matrix}$ $\begin{aligned} \nonumber \left\langle \hat{M}_{\rm ap}^3\, \hat{M}_{\rm ap}^3\right\rangle (\Theta _1, \Theta _2)&= \frac{1}{A^2}\int \mathrm{d}^2{\alpha _1} \int \mathrm{d}^2{\alpha _2} W_A(\boldsymbol{\alpha }_1)\,W_A(\boldsymbol{\alpha }_2)\,\Bigg [\prod _{i=1}^3 \int \mathrm{d}^2{\vartheta _i} U_{\theta _i}(|\boldsymbol{\alpha }_1-\boldsymbol{\vartheta }_i|)\Bigg ]\,\Bigg [\prod _{j=4}^6 \int \mathrm{d}^2{\vartheta _j} U_{\theta _j}(|\boldsymbol{\alpha }_2-\boldsymbol{\vartheta }_j|)\Bigg ]\\&\quad \times \,\left\langle \kappa _1\,\kappa _2\,\kappa _3\,\kappa _4\,\kappa _5\,\kappa _6\right\rangle . \end{aligned}$ (15)

The six-point correlation can be written in terms of connected correlation functions, denoted by ⟨⟩_c. This, together with ⟨κ(ϑ)⟩ = 0, leads to

$\begin{matrix} 〈 {\hat{M}}_{ap}^{3} {\hat{M}}_{ap}^{3} 〉 (Θ_{1}, Θ_{2}) & = \frac{1}{A^{2}} \int d^{2} α_{1} \int d^{2} α_{2} W_{A} (α_{1}) W_{A} (α_{2}) [\prod_{i = 1}^{3} \int d^{2} ϑ_{i} U_{θ_{i}} (| α_{1} - ϑ_{i} |)] [\prod_{j = 4}^{6} \int d^{2} ϑ_{j} U_{θ_{j}} (| α_{2} - ϑ_{j} |)] \\ \times [({〈 κ_{1} κ_{2} 〉}_{c} {〈 κ_{3} κ_{4} 〉}_{c} {〈 κ_{5} κ_{6} 〉}_{c} + 14 Perm .) + ({〈 κ_{1} κ_{2} κ_{3} 〉}_{c} {〈 κ_{4} κ_{5} κ_{6} 〉}_{c} + 9 Perm .) \\ + ({〈 κ_{1} κ_{2} 〉}_{c} {〈 κ_{3} κ_{4} κ_{5} κ_{6} 〉}_{c} + 14 Perm .) + {〈 κ_{1} κ_{2} κ_{3} κ_{4} κ_{5} κ_{6} 〉}_{c}] \\ = {〈 {\hat{M}}_{ap}^{3} {\hat{M}}_{ap}^{3} 〉}_{G} (Θ_{1}, Θ_{2}) + {〈 {\hat{M}}_{ap}^{3} {\hat{M}}_{ap}^{3} 〉}_{BB} (Θ_{1}, Θ_{2}) + {〈 {\hat{M}}_{ap}^{3} {\hat{M}}_{ap}^{3} 〉}_{PT} (Θ_{1}, Θ_{2}) \\ + {〈 {\hat{M}}_{ap}^{3} {\hat{M}}_{ap}^{3} 〉}_{P_{6}} (Θ_{1}, Θ_{2}) . \end{matrix}$ $\begin{aligned} \nonumber \left\langle \hat{M}_{\rm ap}^3\, \hat{M}_{\rm ap}^3\right\rangle (\Theta _1, \Theta _2)&= \frac{1}{A^2}\int \mathrm{d}^2{\alpha _1} \int \mathrm{d}^2{\alpha _2} W_A(\boldsymbol{\alpha }_1)\, W_A(\boldsymbol{\alpha }_2)\, \Bigg [\prod _{i=1}^3 \int \mathrm{d}^2{\vartheta _i} U_{\theta _i}(|\boldsymbol{\alpha }_1-\boldsymbol{\vartheta }_i|)\Bigg ]\,\Bigg [\prod _{j=4}^6 \int \mathrm{d}^2{\vartheta _j} U_{\theta _j}(|\boldsymbol{\alpha }_2-\boldsymbol{\vartheta }_j|)\Bigg ]\\&\quad \times \Big [ \left(\left\langle \kappa _1\,\kappa _2\right\rangle _{\rm c}\,\left\langle \kappa _3\,\kappa _4\right\rangle _{\rm c}\,\left\langle \kappa _5\,\kappa _6\right\rangle _{\rm c} + 14\,\mathrm{Perm.} \right) + \left( \left\langle \kappa _1\,\kappa _2\,\kappa _3\right\rangle _{\rm c}\,\left\langle \kappa _4\,\kappa _5\,\kappa _6\right\rangle _{\rm c} + 9\,\mathrm{Perm.} \right)\\&\nonumber \qquad + \left( \left\langle \kappa _1\,\kappa _2\right\rangle _{\rm c}\,\left\langle \kappa _3\,\kappa _4\,\kappa _5\,\kappa _6\right\rangle _{\rm c} + 14\,\mathrm{Perm.} \right) + \left\langle \kappa _1\,\kappa _2\,\kappa _3\,\kappa _4\,\kappa _5\,\kappa _6\right\rangle _{\rm c} \Big ]\\&\nonumber = \left\langle \hat{M}_{\rm ap}^3\, \hat{M}_{\rm ap}^3\right\rangle _{\rm G}(\Theta _1, \Theta _2) + \left\langle \hat{M}_{\rm ap}^3\, \hat{M}_{\rm ap}^3\right\rangle _{BB}(\Theta _1, \Theta _2) +\left\langle \hat{M}_{\rm ap}^3\, \hat{M}_{\rm ap}^3\right\rangle _{PT}(\Theta _1, \Theta _2)\\&\nonumber \quad + \left\langle \hat{M}_{\rm ap}^3\, \hat{M}_{\rm ap}^3\right\rangle _{P_6}(\Theta _1, \Theta _2). \end{aligned}$ (16)

The first term ${〈 {\hat{M}}_{ap}^{3} {\hat{M}}_{ap}^{3} 〉}_{G}$ ${\left\langle{{\hat{M}_{\text{ ap}}}^3}\, {{\hat{M}_{\text{ ap}}}^3}\right\rangle}_{\mathrm{G}}$ is the Gaussian part of the covariance – the only part that is non-zero for Gaussian random fields. It depends only on the two-point correlation of κ, or, equivalently, the matter power spectrum P(ℓ). The other terms comprise the non-Gaussian part of the covariance. They depend on higher-order polyspectra, namely the bi-, tri- and the pentaspectrum.

3.1. Gaussian part

We first concentrate on the Gaussian part ${〈 {\hat{M}}_{ap}^{3} {\hat{M}}_{ap}^{3} 〉}_{G}$ ${\left\langle{{\hat{M}_{\text{ ap}}}^3}{{\hat{M}_{\text{ ap}}}^3}\right\rangle}_{\mathrm{G}}$ , given by

$\begin{matrix} {〈 {\hat{M}}_{ap}^{3} {\hat{M}}_{ap}^{3} 〉}_{G} (Θ_{1}, Θ_{2}) & = \frac{1}{A^{2}} \int d^{2} α_{1} \int d^{2} α_{2} W_{A} (α_{1}) W_{A} (α_{2}) [\prod_{i = 1}^{3} \int d^{2} ϑ_{i} U_{θ_{i}} (| α_{1} - ϑ_{i} |)] [\prod_{j = 4}^{6} \int d^{2} ϑ_{j} U_{θ_{j}} (| α_{2} - ϑ_{j} |)] \\ \times [{〈 κ_{1} κ_{2} 〉}_{c} {〈 κ_{3} κ_{4} 〉}_{c} {〈 κ_{5} κ_{6} 〉}_{c} + 14 Perm .] . \end{matrix}$ $\begin{aligned} \nonumber \left\langle \hat{M}_{\rm ap}^3\hat{M}_{\rm ap}^3\right\rangle _{\rm G}(\Theta _1, \Theta _2)&= \frac{1}{A^2}\int \mathrm{d}^2{\alpha _1} \int \mathrm{d}^2{\alpha _2} W_A(\boldsymbol{\alpha }_1)\, W_A(\boldsymbol{\alpha }_2)\,\Bigg [\prod _{i=1}^3 \int \mathrm{d}^2{\vartheta _i} U_{\theta _i}(|\boldsymbol{\alpha }_1-\boldsymbol{\vartheta }_i|)\Bigg ]\,\Bigg [\prod _{j=4}^6 \int \mathrm{d}^2{\vartheta _j} U_{\theta _j}(|\boldsymbol{\alpha }_2-\boldsymbol{\vartheta }_j|)\Bigg ]\\&\quad \times \left[\left\langle \kappa _1\,\kappa _2\right\rangle _{\rm c}\,\left\langle \kappa _3\,\kappa _4\right\rangle _{\rm c}\,\left\langle \kappa _5\,\kappa _6\right\rangle _{\rm c} + 14\,\mathrm{Perm.} \right]. \end{aligned}$ (17)

We split the permutations in Eq. (17) into two groups, as

$\begin{matrix} {〈 {\hat{M}}_{ap}^{3} {\hat{M}}_{ap}^{3} 〉}_{G} (Θ_{1}, Θ_{2}) & = \frac{1}{A^{2}} \int d^{2} α_{1} \int d^{2} α_{2} W_{A} (α_{1}) W_{A} (α_{2}) [\prod_{i = 1}^{3} \int d^{2} ϑ_{i} U_{θ_{i}} (| α_{1} - ϑ_{i} |)] [\prod_{j = 4}^{6} \int d^{2} ϑ_{j} U_{θ_{j}} (| α_{2} - ϑ_{j} |)] \\ \times [{〈 κ_{1} κ_{4} 〉}_{c} {〈 κ_{3} κ_{5} 〉}_{c} {〈 κ_{2} κ_{6} 〉}_{c} + 5 Perm .] \\ + \frac{1}{A^{2}} \int d^{2} α_{1} \int d^{2} α_{2} W_{A} (α_{1}) W_{A} (α_{2}) [\prod_{i = 1}^{3} \int d^{2} ϑ_{i} U_{θ_{i}} (| α_{1} - ϑ_{i} |)] [\prod_{j = 4}^{6} \int d^{2} ϑ_{j} U_{θ_{j}} (| α_{2} - ϑ_{j} |)] \\ \times [{〈 κ_{1} κ_{2} 〉}_{c} {〈 κ_{3} κ_{4} 〉}_{c} {〈 κ_{5} κ_{6} 〉}_{c} + 8 Perm .] \\ = T_{P P P, 1} (Θ_{1}, Θ_{2}) + T_{P P P, 2} (Θ_{1}, Θ_{2}) . \end{matrix}$ $\begin{aligned} \nonumber \left\langle \hat{M}_{\rm ap}^3\hat{M}_{\rm ap}^3\right\rangle _{\rm G}(\Theta _1, \Theta _2)&= \frac{1}{A^2}\int \mathrm{d}^2{\alpha _1} \int \mathrm{d}^2{\alpha _2} W_A(\boldsymbol{\alpha }_1)\, W_A(\boldsymbol{\alpha }_2)\, \Bigg [\prod _{i=1}^3 \int \mathrm{d}^2{\vartheta _i} U_{\theta _i}(|\boldsymbol{\alpha }_1-\boldsymbol{\vartheta }_i|)\Bigg ]\,\Bigg [\prod _{j=4}^6 \int \mathrm{d}^2{\vartheta _j} U_{\theta _j}(|\boldsymbol{\alpha }_2-\boldsymbol{\vartheta }_j|)\Bigg ]\\&\qquad \times \left[\left\langle \kappa _1\,\kappa _4\right\rangle _{\rm c}\,\left\langle \kappa _3\,\kappa _5\right\rangle _{\rm c}\,\left\langle \kappa _2\,\kappa _6\right\rangle _{\rm c} + 5\,\mathrm{Perm.}\right]\\&\nonumber \quad + \frac{1}{A^2}\int \mathrm{d}^2{\alpha _1} \int \mathrm{d}^2{\alpha _2} W_A(\boldsymbol{\alpha }_1)\, W_A(\boldsymbol{\alpha }_2)\, \Bigg [\prod _{i=1}^3 \int \mathrm{d}^2{\vartheta _i} U_{\theta _i}(|\boldsymbol{\alpha }_1-\boldsymbol{\vartheta }_i|)\Bigg ]\,\Bigg [\prod _{j=4}^6 \int \mathrm{d}^2{\vartheta _j} U_{\theta _j}(|\boldsymbol{\alpha }_2-\boldsymbol{\vartheta }_j|)\Bigg ]\\&\nonumber \qquad \times \left[\left\langle \kappa _1\,\kappa _2\right\rangle _{\rm c}\,\left\langle \kappa _3\,\kappa _4\right\rangle _{\rm c}\,\left\langle \kappa _5\,\kappa _6\right\rangle _{\rm c} + 8\,\mathrm{Perm.}\right]\\&\nonumber = T_{PPP, 1}(\Theta _1, \Theta _2) + T_{PPP, 2}(\Theta _1, \Theta _2). \end{aligned}$ (18)

The first group consists of six terms, in which for all three ⟨κ_i κ_j⟩_c, the first index i ∈ {1, 2, 3}, and the second index j ∈ {4, 5, 6}. The second group consists of the nine other permutations. To evaluate T_PPP, 1 and T_PPP, 2, we use Eq. (2) for n = 2 and

$\begin{matrix} \int d^{2} ϑ U_{θ} (| α - ϑ |) e^{- i ℓ \cdot ϑ} = \tilde{u} (ℓ θ) e^{- i ℓ \cdot α}, \end{matrix}$ $\begin{aligned} \int \mathrm{d}^2{\vartheta }\, U_\theta (|\boldsymbol{\alpha }-\boldsymbol{\vartheta }|) \, \mathrm{e} ^{-\mathrm{i} \boldsymbol{\ell }\cdot \boldsymbol{\vartheta }} = \tilde{u}(\ell \,\theta )\, \mathrm{e} ^{-\mathrm{i} \boldsymbol{\ell }\cdot \boldsymbol{\alpha }}, \end{aligned}$ (19)

and find

$\begin{matrix} T_{P P P, 1} (Θ_{1}, Θ_{2}) & = \frac{1}{A^{2}} [\prod_{i = 1}^{3} \int \frac{d^{2} ℓ_{i}}{{(2 π)}^{2}} P (ℓ_{i})] [\tilde{u} (ℓ_{1} θ_{1}) \tilde{u} (ℓ_{2} θ_{2}) \tilde{u} (ℓ_{3} θ_{3}) \tilde{u} (ℓ_{1} θ_{4}) \tilde{u} (ℓ_{2} θ_{5}) \tilde{u} (ℓ_{3} θ_{6}) + 5 Perm .] \\ \times \int d^{2} α_{1} \int d^{2} α_{2} W_{A} (α_{1}) W_{A} (α_{2}) e^{- i (α_{1} - α_{2}) (ℓ_{1} + ℓ_{2} + ℓ_{3})} \\ = [\prod_{i = 1}^{3} \int \frac{d^{2} ℓ_{i}}{{(2 π)}^{2}} P (ℓ_{i})] G_{A} (ℓ_{1} + ℓ_{2} + ℓ_{3}) [\tilde{u} (ℓ_{1} θ_{1}) \tilde{u} (ℓ_{2} θ_{2}) \tilde{u} (ℓ_{3} θ_{3}) \tilde{u} (ℓ_{1} θ_{4}) \tilde{u} (ℓ_{2} θ_{5}) \tilde{u} (ℓ_{3} θ_{6}) + 5 Perm .], \end{matrix}$ $\begin{aligned} T_{PPP, 1}(\Theta _1, \Theta _2) \nonumber&= \frac{1}{A^2} \left[\prod _{i=1}^3 \int \frac{\mathrm{d}^2{\ell _i}}{(2\pi )^2} P(\ell _i) \right] \, \left[\tilde{u}(\ell _1\,\theta _1)\,\tilde{u}(\ell _2\,\theta _2)\,\tilde{u}(\ell _3\,\theta _3)\,\tilde{u}(\ell _1\,\theta _4)\,\tilde{u}(\ell _2\,\theta _5)\,\tilde{u}(\ell _3\,\theta _6) + \mathrm{5\,Perm.}\right]\\&\quad \times \int \mathrm{d}^2{\alpha _1} \int \mathrm{d}^2{\alpha _2} W_A(\boldsymbol{\alpha }_1)\, W_A(\boldsymbol{\alpha }_2)\, \mathrm{e} ^{-\mathrm{i} (\boldsymbol{\alpha }_1-\boldsymbol{\alpha }_2)\,(\boldsymbol{\ell }_1+\boldsymbol{\ell }_2+\boldsymbol{\ell }_3)} \\&\nonumber = \left[\prod _{i=1}^3 \int \frac{\mathrm{d}^2{\ell _i}}{(2\pi )^2} P(\ell _i) \right] \, G_A(\boldsymbol{\ell }_1+\boldsymbol{\ell }_2+\boldsymbol{\ell }_3)\,\left[ \tilde{u}(\ell _1\,\theta _1)\,\tilde{u}(\ell _2\,\theta _2)\,\tilde{u}(\ell _3\,\theta _3)\,\tilde{u}(\ell _1\,\theta _4)\,\tilde{u}(\ell _2\,\theta _5)\,\tilde{u}(\ell _3\,\theta _6)\, + \mathrm{5\,Perm.}\right], \end{aligned}$ (20)

and

$\begin{matrix} T_{P P P, 2} (Θ_{1}, Θ_{2}) & = \frac{1}{A^{2}} [\prod_{i = 1}^{3} \int \frac{d^{2} ℓ_{i}}{{(2 π)}^{2}} P (ℓ_{i})] [\tilde{u} (ℓ_{1} θ_{1}) \tilde{u} (ℓ_{1} θ_{2}) \tilde{u} (ℓ_{2} θ_{3}) \tilde{u} (ℓ_{2} θ_{4}) \tilde{u} (ℓ_{3} θ_{5}) \tilde{u} (ℓ_{3} θ_{6}) + 8 Perm .] \\ \times \int d^{2} α_{1} \int d^{2} α_{2} W_{A} (α_{1}) W_{A} (α_{2}) e^{- i (α_{1} - α_{2}) \cdot ℓ_{2}} \\ = [\prod_{i = 1}^{3} \int \frac{d^{2} ℓ_{i}}{{(2 π)}^{2}} P (ℓ_{i})] G_{A} (ℓ_{2}) [\tilde{u} (ℓ_{1} θ_{1}) \tilde{u} (ℓ_{1} θ_{2}) \tilde{u} (ℓ_{2} θ_{3}) \tilde{u} (ℓ_{2} θ_{4}) \tilde{u} (ℓ_{3} θ_{5}) \tilde{u} (ℓ_{3} θ_{6}) + 8 Perm .], \end{matrix}$ $\begin{aligned} T_{PPP, 2}(\Theta _1, \Theta _2)&\nonumber = \frac{1}{A^2} \left[\prod _{i=1}^3 \int \frac{\mathrm{d}^2{\ell _i}}{(2\pi )^2} P(\ell _i) \right] \,\left[\tilde{u}(\ell _1\,\theta _1)\,\tilde{u}(\ell _1\,\theta _2)\,\tilde{u}(\ell _2\,\theta _3)\,\tilde{u}(\ell _2\,\theta _4)\,\tilde{u}(\ell _3\,\theta _5)\,\tilde{u}(\ell _3\,\theta _6)+\mathrm{8\,Perm.}\right]\\&\quad \times \int \mathrm{d}^2{\alpha _1} \int \mathrm{d}^2{\alpha _2} W_A(\boldsymbol{\alpha }_1)\, W_A(\boldsymbol{\alpha }_2)\, \mathrm{e} ^{-\mathrm{i} (\boldsymbol{\alpha }_1-\boldsymbol{\alpha }_2)\cdot \boldsymbol{\ell }_2} \\&\nonumber = \left[\prod _{i=1}^3 \int \frac{\mathrm{d}^2{\ell _i}}{(2\pi )^2} P(\ell _i) \right] \, G_A(\boldsymbol{\ell }_2)\,\left[\tilde{u}(\ell _1\,\theta _1)\,\tilde{u}(\ell _1\,\theta _2)\,\tilde{u}(\ell _2\,\theta _3)\,\tilde{u}(\ell _2\,\theta _4)\,\tilde{u}(\ell _3\,\theta _5)\,\tilde{u}(\ell _3\,\theta _6) + \mathrm{8\,Perm.}\right], \end{aligned}$ (21)

where the geometry factor G_A(ℓ), defined by

$\begin{matrix} G_{A} (ℓ) = \frac{1}{A^{2}} \int d^{2} α_{1} \int d^{2} α_{2} W_{A} (α_{1}) W_{A} (α_{2}) e^{- i (α_{1} - α_{2}) \cdot ℓ}, \end{matrix}$ $\begin{aligned} G_A(\boldsymbol{\ell }) = \frac{1}{A^2}\int \mathrm{d}^2{\alpha _1} \int \mathrm{d}^2{\alpha _2}\;W_A(\boldsymbol{\alpha }_1)\, W_A(\boldsymbol{\alpha }_2)\,\mathrm{e} ^{-\mathrm{i} (\boldsymbol{\alpha }_1-\boldsymbol{\alpha }_2)\cdot \boldsymbol{\ell }}, \end{aligned}$ (22)

contains the full dependence of the covariance on the survey area and geometry. For a square survey with side length $ϑ_{max} = \sqrt{A}$ ${\vartheta_{\text{ max}}}=\sqrt{A}$ , and ℓ = (ℓ_x, ℓ_y) the factor is

$\begin{matrix} G_{A, square} (ℓ) = \frac{4 {sin}^{2} (ℓ_{x} ϑ_{\max} / 2)}{ℓ_{x}^{2} ϑ_{\max}^{2}} \frac{4 {sin}^{2} (ℓ_{y} ϑ_{\max} / 2)}{ℓ_{y}^{2} ϑ_{\max}^{2}} . \end{matrix}$ $\begin{aligned} G_{A, \mathrm{square} }(\boldsymbol{\ell }) = \frac{4\sin ^2(\ell _{x}\, \vartheta _{\rm max}/2)}{\ell _{x}^2\, \vartheta _{\rm max}^2}\, \frac{4 \sin ^2(\ell _{y}\, \vartheta _{\rm max}/2)}{\ell _{y}^2\, \vartheta _{\rm max}^2}. \end{aligned}$ (23)

The geometry factor G_A is related to a function E_A(η), which, for a point α within A, gives the probability for a second point at α + η to lie within A as well. It is given by

$\begin{matrix} E_{A} (η) = \frac{1}{A} \int d^{2} α W_{A} (α) W_{A} (α + η), so that G_{A} (ℓ) = \frac{1}{A} \int_{A} d^{2} η E_{A} (η) e^{- i η \cdot ℓ} . \end{matrix}$ $\begin{aligned} E_A(\boldsymbol{\eta }) = \frac{1}{A}\int \mathrm{d}^2{\alpha }\, W_A(\boldsymbol{\alpha }) \, W_A(\boldsymbol{\alpha }+\boldsymbol{\eta }), \quad \mathrm{so\ that} \quad G_A(\boldsymbol{\ell })=\frac{1}{A}\int _A \mathrm{d}^2{\eta }\, E_A(\boldsymbol{\eta })\, \mathrm{e} ^{-\mathrm{i} \boldsymbol{\eta }\cdot \boldsymbol{\ell }}. \end{aligned}$ (24)

For a square area, E_A(η) was determined by Heydenreich et al. (2020). It is unity for vanishing separation |η| = 0 and smoothly declines to zero for η outside the square.

3.2. Non-Gaussian part

We now derive the non-Gaussian part of the covariance. First, we calculate the covariance part ${〈 {\hat{M}}_{ap}^{3} {\hat{M}}_{ap}^{3} 〉}_{BB}$ ${\left\langle{{\hat{M}_{\text{ ap}}}^3}{{\hat{M}_{\text{ ap}}}^3}\right\rangle}_{BB}$ , which depends on third-order correlations of κ. We divide this term into two parts, as

$\begin{matrix} {〈 {\hat{M}}_{ap}^{3} {\hat{M}}_{ap}^{3} 〉}_{BB} (Θ_{1}, Θ_{2}) & = \frac{1}{A^{2}} \int d^{2} α_{1} \int d^{2} α_{2} [\prod_{i = 1}^{3} \int d^{2} ϑ_{i} U_{θ_{i}} (| α_{1} - ϑ_{i} |)] [\prod_{j = 4}^{6} \int d^{2} ϑ_{j} U_{θ_{j}} (| α_{2} - ϑ_{j} |)] \\ \times W_{A} (α_{1}) W_{A} (α_{2}) {{〈 κ_{1} κ_{2} κ_{3} 〉}_{c} {〈 κ_{4} κ_{5} κ_{6} 〉}_{c} + [{〈 κ_{4} κ_{2} κ_{3} 〉}_{c} {〈 κ_{1} κ_{5} κ_{6} 〉}_{c} + 8 Perm .]} \\ = 〈 {\hat{M}}_{ap}^{3} (θ_{1}, θ_{2}, θ_{3}) 〉 〈 {\hat{M}}_{ap}^{3} (θ_{4}, θ_{5}, θ_{6}) 〉 + T_{BB} (Θ_{1}, Θ_{2}) . \end{matrix}$ $\begin{aligned} \left\langle \hat{M}_{\rm ap}^3\hat{M}_{\rm ap}^3\right\rangle _{BB}(\Theta _1, \Theta _2)&\nonumber = \frac{1}{A^2}\int \mathrm{d}^2{\alpha _1} \int \mathrm{d}^2{\alpha _2} \Bigg [\prod _{i=1}^3 \int \mathrm{d}^2{\vartheta _i} U_{\theta _i}(|\boldsymbol{\alpha }_1-\boldsymbol{\vartheta }_i|)\Bigg ]\,\left[\prod _{j=4}^6 \int \mathrm{d}^2{\vartheta _j} U_{\theta _j}(|\boldsymbol{\alpha }_2-\boldsymbol{\vartheta }_j|)\right]\\&\nonumber \quad \times W_A(\boldsymbol{\alpha }_1)\, W_A(\boldsymbol{\alpha }_2)\, \left\{ \left\langle \kappa _1\,\kappa _2\,\kappa _3\right\rangle _{\rm c}\,\left\langle \kappa _4\,\kappa _5\,\kappa _6\right\rangle _{\rm c} + \left[\left\langle \kappa _4\,\kappa _2\,\kappa _3\right\rangle _{\rm c}\,\left\langle \kappa _1\,\kappa _5\,\kappa _6\right\rangle _{\rm c} + 8\,\mathrm{Perm. }\right]\right\} \\&= \left\langle \hat{M}_{\rm ap}^3(\theta _1, \theta _2, \theta _3)\right\rangle \left\langle \hat{M}_{\rm ap}^3(\theta _4, \theta _5, \theta _6)\right\rangle + T_{BB}(\Theta _1, \Theta _2). \end{aligned}$ (25)

The first term cancels with the second term in the definition of the covariance of the estimator in Eq. (14). The term T_BB is

$\begin{matrix} T_{BB} (Θ_{1}, Θ_{2}) & = \frac{1}{A^{2}} \int d^{2} α_{1} \int d^{2} α_{2} [\prod_{i = 1}^{3} \int d^{2} ϑ_{i} U_{θ_{i}} (| α_{1} - ϑ_{i} |)] [\prod_{j = 4}^{6} \int d^{2} ϑ_{j} U_{θ_{j}} (| α_{2} - ϑ_{j} |)] W_{A} (α_{1}) W_{A} (α_{2}) \\ \times [{〈 κ_{4} κ_{2} κ_{3} 〉}_{c} {〈 κ_{1} κ_{5} κ_{6} 〉}_{c} + 8 Perm .] . \end{matrix}$ $\begin{aligned} T_{BB}(\Theta _1, \Theta _2)&=\frac{1}{A^2}\int \mathrm{d}^2{\alpha _1} \int \mathrm{d}^2{\alpha _2} \Bigg [\prod _{i=1}^3 \int \mathrm{d}^2{\vartheta _i} U_{\theta _i}(|\boldsymbol{\alpha }_1-\boldsymbol{\vartheta }_i|)\Bigg ]\,\left[\prod _{j=4}^6 \int \mathrm{d}^2{\vartheta _j} U_{\theta _j}(|\boldsymbol{\alpha }_2-\boldsymbol{\vartheta }_j|)\right]\, W_A(\boldsymbol{\alpha }_1)\, W_A(\boldsymbol{\alpha }_2)\,\\&\nonumber \quad \times \left[\left\langle \kappa _4\,\kappa _2\,\kappa _3\right\rangle _{\rm c}\,\left\langle \kappa _1\,\kappa _5\,\kappa _6\right\rangle _{\rm c} + \mathrm{8\,Perm.}\right]. \end{aligned}$ (26)

Using Eq. (2) for n = 3 to introduce the bispectrum and the definition of G_A in Eq. (22) leads to

$\begin{matrix} T_{BB} (Θ_{1}, Θ_{2}) & = \int \frac{d^{2} ℓ_{1}}{{(2 π)}^{2}} \int \frac{d^{2} ℓ_{2}}{{(2 π)}^{2}} \int \frac{d^{2} ℓ_{4}}{{(2 π)}^{2}} \int \frac{d^{2} ℓ_{5}}{{(2 π)}^{2}} B (ℓ_{1}, ℓ_{2}) B (ℓ_{4}, ℓ_{5}) G_{A} (ℓ_{1} - ℓ_{4}) \\ \times [\tilde{u} (ℓ_{4} θ_{1}) \tilde{u} (ℓ_{2} θ_{2}) \tilde{u} (| ℓ_{1} + ℓ_{2} | θ_{3}) \tilde{u} (ℓ_{1} θ_{4}) \tilde{u} (ℓ_{5} θ_{5}) \tilde{u} (| ℓ_{4} + ℓ_{5} | θ_{6}) + 8 Perm .] . \end{matrix}$ $\begin{aligned} T_{BB}(\Theta _1, \Theta _2)&= \int \frac{\mathrm{d}^2{\ell _1}}{(2\pi )^2} \int \frac{\mathrm{d}^2{\ell _2}}{(2\pi )^2} \int \frac{\mathrm{d}^2{\ell _4}}{(2\pi )^2} \int \frac{\mathrm{d}^2{\ell _5}}{(2\pi )^2} \; B(\boldsymbol{\ell }_1, \boldsymbol{\ell }_2)\, B(\boldsymbol{\ell }_4, \boldsymbol{\ell }_5)\, G_A(\boldsymbol{\ell }_1-\boldsymbol{\ell }_4) \\&\nonumber \quad \times \left[\tilde{u}(\ell _4\,\theta _1)\,\tilde{u}(\ell _2\,\theta _2)\,\tilde{u}(|\boldsymbol{\ell }_1+\boldsymbol{\ell }_2|\,\theta _3)\,\tilde{u}(\ell _1\,\theta _4)\,\tilde{u}(\ell _5\,\theta _5)\,\tilde{u}(|\boldsymbol{\ell }_4+\boldsymbol{\ell }_5|\,\theta _6)+\mathrm{8\,Perm.}\right]. \end{aligned}$ (27)

Second, we calculate the covariance part ${〈 {\hat{M}}_{ap}^{3} {\hat{M}}_{ap}^{3} 〉}_{PT}$ ${\left\langle{{\hat{M}_{\text{ ap}}}^3}{{\hat{M}_{\text{ ap}}}^3}\right\rangle}_{PT}$ . We divide this term into two parts as

$\begin{matrix} {〈 {\hat{M}}_{ap}^{3} {\hat{M}}_{ap}^{3} 〉}_{PT} (Θ_{1}, Θ_{2}) & = \frac{1}{A^{2}} \int d^{2} α_{1} \int d^{2} α_{2} [\prod_{i = 1}^{3} \int d^{2} ϑ_{i} U_{θ_{i}} (| α_{1} - ϑ_{i} |)] [\prod_{j = 4}^{6} \int d^{2} ϑ_{j} U_{θ_{j}} (| α_{2} - ϑ_{j} |)] \\ \times W_{A} (α_{1}) W_{A} (α_{2}) [{〈 κ_{1} κ_{4} 〉}_{c} {〈 κ_{2} κ_{3} κ_{5} κ_{6} 〉}_{c} + 8 Perm . + {〈 κ_{1} κ_{2} 〉}_{c} {〈 κ_{3} κ_{4} κ_{5} κ_{6} 〉}_{c} + 5 Perm .] \\ = T_{P T, 1} (Θ_{1}, Θ_{2}) + T_{P T, 2} (Θ_{1}, Θ_{2}) . \end{matrix}$ $\begin{aligned} \left\langle \hat{M}_{\rm ap}^3\hat{M}_{\rm ap}^3\right\rangle _{PT}(\Theta _1, \Theta _2)&\nonumber = \frac{1}{A^2}\int \mathrm{d}^2{\alpha _1} \int \mathrm{d}^2{\alpha _2} \Bigg [\prod _{i=1}^3 \int \mathrm{d}^2{\vartheta _i} U_{\theta _i}(|\boldsymbol{\alpha }_1-\boldsymbol{\vartheta }_i|)\Bigg ]\,\left[\prod _{j=4}^6 \int \mathrm{d}^2{\vartheta _j} U_{\theta _j}(|\boldsymbol{\alpha }_2-\boldsymbol{\vartheta }_j|)\right]\\&\nonumber \quad \times W_A(\boldsymbol{\alpha }_1)\, W_A(\boldsymbol{\alpha }_2)\, \Big [ \left\langle \kappa _1\,\kappa _4\right\rangle _{\rm c}\,\left\langle \kappa _2\,\kappa _3\,\kappa _5\,\kappa _6\right\rangle _{\rm c} + 8\,\mathrm{Perm. } + \left\langle \kappa _1\,\kappa _2\right\rangle _{\rm c}\,\left\langle \kappa _3\,\kappa _4\,\kappa _5\,\kappa _6\right\rangle _{\rm c} + 5\,\mathrm{Perm. }\Big ]\\&= T_{PT, 1}(\Theta _1, \Theta _2) + T_{PT, 2}(\Theta _1, \Theta _2). \end{aligned}$ (28)

Here, the first permutations contain all terms where the two-point correlation ⟨κ_i κ_j⟩ has i ∈ {1, 2, 3} and j ∈ {4, 5, 6}. The second permutations contain all other terms. We introduce the power- and trispectrum with Eq. (2), and find

$\begin{matrix} T_{P T, 1} (Θ_{1}, Θ_{2}) & = \int \frac{d^{2} ℓ_{1}}{{(2 π)}^{2}} \int \frac{d^{2} ℓ_{2}}{{(2 π)}^{2}} \int \frac{d^{2} ℓ_{3}}{{(2 π)}^{2}} \int \frac{d^{2} ℓ_{4}}{{(2 π)}^{2}} P (ℓ_{1}) T (ℓ_{2}, ℓ_{3}, ℓ_{4}) G_{A} (ℓ_{1} + ℓ_{2} + ℓ_{3}) \\ \times [\tilde{u} (ℓ_{1} θ_{1}) \tilde{u} (ℓ_{2} θ_{2}) \tilde{u} (ℓ_{3} θ_{3}) \tilde{u} (ℓ_{1} θ_{4}) \tilde{u} (ℓ_{4} θ_{5}) \tilde{u} (| ℓ_{2} + ℓ_{3} + ℓ_{4} | θ_{6}) + 8 Perm .], \end{matrix}$ $\begin{aligned} T_{PT, 1}(\Theta _1, \Theta _2)&=\int \frac{\mathrm{d}^2{\ell _1}}{(2\pi )^2}\int \frac{\mathrm{d}^2{\ell _2}}{(2\pi )^2}\int \frac{\mathrm{d}^2{\ell _3}}{(2\pi )^2}\int \frac{\mathrm{d}^2{\ell _4}}{(2\pi )^2}\; P(\ell _1)\, T(\boldsymbol{\ell }_2, \boldsymbol{\ell }_3, \boldsymbol{\ell }_4)\, G_A(\boldsymbol{\ell }_1+\boldsymbol{\ell }_2+\boldsymbol{\ell }_3)\\&\nonumber \quad \times \left[ \tilde{u}(\ell _1\,\theta _1)\,\tilde{u}(\ell _2\,\theta _2)\,\tilde{u}(\ell _3\,\theta _3)\,\tilde{u}(\ell _1\,\theta _4)\,\tilde{u}(\ell _4\,\theta _5)\,\tilde{u}(|\boldsymbol{\ell }_2+\boldsymbol{\ell }_3+\boldsymbol{\ell }_4|\,\theta _6) + \mathrm{8\,Perm.}\right], \end{aligned}$ (29)

and

$\begin{matrix} T_{P T, 2} (Θ_{1}, Θ_{2}) & = \int \frac{d^{2} ℓ_{1}}{{(2 π)}^{2}} \int \frac{d^{2} ℓ_{2}}{{(2 π)}^{2}} \int \frac{d^{2} ℓ_{3}}{{(2 π)}^{2}} \int \frac{d^{2} ℓ_{4}}{{(2 π)}^{2}} P (ℓ_{1}) T (ℓ_{2}, ℓ_{3}, ℓ_{4}) G_{A} (ℓ_{2}) \\ \times [\tilde{u} (ℓ_{1} θ_{1}) \tilde{u} (ℓ_{1} θ_{2}) \tilde{u} (ℓ_{2} θ_{3}) \tilde{u} (ℓ_{3} θ_{4}) \tilde{u} (ℓ_{4} θ_{5}) \tilde{u} (| ℓ_{2} + ℓ_{3} + ℓ_{4} | θ_{6}) + 5 Perm .] . \end{matrix}$ $\begin{aligned} T_{PT, 2}(\Theta _1, \Theta _2)&=\int \frac{\mathrm{d}^2{\ell _1}}{(2\pi )^2}\int \frac{\mathrm{d}^2{\ell _2}}{(2\pi )^2}\int \frac{\mathrm{d}^2{\ell _3}}{(2\pi )^2}\int \frac{\mathrm{d}^2{\ell _4}}{(2\pi )^2}\; P(\ell _1)\, T(\boldsymbol{\ell }_2, \boldsymbol{\ell }_3, \boldsymbol{\ell }_4)\, G_A(\boldsymbol{\ell }_2)\\&\nonumber \quad \times \left[ \tilde{u}(\ell _1\,\theta _1)\,\tilde{u}(\ell _1\,\theta _2)\,\tilde{u}(\ell _2\,\theta _3)\,\tilde{u}(\ell _3\,\theta _4)\,\tilde{u}(\ell _4\,\theta _5)\,\tilde{u}(|\boldsymbol{\ell }_2+\boldsymbol{\ell }_3+\boldsymbol{\ell }_4|\,\theta _6) + \mathrm{5\,Perm.}\right]. \end{aligned}$ (30)

Finally, we consider ${〈 {\hat{M}}_{ap}^{3} {\hat{M}}_{ap}^{3} 〉}_{P_{6}}$ ${\left\langle{{\hat{M}_{\text{ ap}}}^3}{{\hat{M}_{\text{ ap}}}^3}\right\rangle}_{P_6}$ , which is

$\begin{matrix} {〈 {\hat{M}}_{ap}^{3} {\hat{M}}_{ap}^{3} 〉}_{P_{6}} (Θ_{1}, Θ_{2}) & = \frac{1}{A^{2}} \int d^{2} α_{1} \int d^{2} α_{2} W_{A} (α_{1}) W_{A} (α_{2}) \\ \times [\prod_{i = 1}^{3} \int d^{2} ϑ_{i} U_{θ_{i}} (| α_{1} - ϑ_{i} |)] [\prod_{j = 4}^{6} \int d^{2} ϑ_{j} U_{θ_{j}} (| α_{2} - ϑ_{j} |)] {〈 κ_{1} κ_{2} κ_{3} κ_{4} κ_{5} κ_{6} 〉}_{c} \\ = T_{P_{6}} (Θ_{1}, Θ_{2}) . \end{matrix}$ $\begin{aligned} \left\langle \hat{M}_{\rm ap}^3\hat{M}_{\rm ap}^3\right\rangle _{P_6}(\Theta _1, \Theta _2)&\nonumber = \frac{1}{A^2}\int \mathrm{d}^2{\alpha _1} \int \mathrm{d}^2{\alpha _2}\, W_A(\boldsymbol{\alpha }_1)\, W_A(\boldsymbol{\alpha }_2)\\&\nonumber \quad \times \Bigg [\prod _{i=1}^3 \int \mathrm{d}^2{\vartheta _i} U_{\theta _i}(|\boldsymbol{\alpha }_1-\boldsymbol{\vartheta }_i|)\Bigg ]\,\Bigg [\prod _{j=4}^6 \int \mathrm{d}^2{\vartheta _j} U_{\theta _j}(|\boldsymbol{\alpha }_2-\boldsymbol{\vartheta }_j|)\Bigg ]\,\left\langle \kappa _1\,\kappa _2\,\kappa _3\,\kappa _4\,\kappa _5\,\kappa _6\right\rangle _{\rm c} \\&= T_{P_6}(\Theta _1, \Theta _2). \end{aligned}$ (31)

The term T_P₆ is, with Eq. (2) for n = 6,

$\begin{matrix} T_{P_{6}} (Θ_{1}, Θ_{2}) & = \int \frac{d^{2} ℓ_{1}}{{(2 π)}^{2}} \int \frac{d^{2} ℓ_{2}}{{(2 π)}^{2}} \int \frac{d^{2} ℓ_{3}}{{(2 π)}^{2}} \int \frac{d^{2} ℓ_{4}}{{(2 π)}^{2}} \int \frac{d^{2} ℓ_{5}}{{(2 π)}^{2}} P_{6} (ℓ_{1}, ℓ_{2}, ℓ_{3}, ℓ_{4}, ℓ_{5}) G_{A} (ℓ_{1} + ℓ_{2} + ℓ_{3}) \\ \times \tilde{u} (ℓ_{1} θ_{1}) \tilde{u} (ℓ_{2} θ_{2}) \tilde{u} (ℓ_{3} θ_{3}) \tilde{u} (ℓ_{4} θ_{4}) \tilde{u} (ℓ_{5} θ_{5}) \tilde{u} (| ℓ_{1} + ℓ_{2} + ℓ_{3} + ℓ_{4} + ℓ_{5} | θ_{6}) . \end{matrix}$ $\begin{aligned} T_{P_6}(\Theta _1, \Theta _2)&= \int \frac{\mathrm{d}^2{\ell _1}}{(2\pi )^2}\, \int \frac{\mathrm{d}^2{\ell _2}}{(2\pi )^2} \int \frac{\mathrm{d}^2{\ell _3}}{(2\pi )^2} \int \frac{\mathrm{d}^2{\ell _4}}{(2\pi )^2} \int \frac{\mathrm{d}^2{\ell _5}}{(2\pi )^2}\;P_6(\boldsymbol{\ell }_1, \boldsymbol{\ell }_2, \boldsymbol{\ell }_3, \boldsymbol{\ell }_4, \boldsymbol{\ell }_5)\, G_A(\boldsymbol{\ell }_1+\boldsymbol{\ell }_2+\boldsymbol{\ell }_3)\\&\nonumber \quad \times \tilde{u}(\ell _1\,\theta _1)\,\tilde{u}(\ell _2\,\theta _2)\,\tilde{u}(\ell _3\,\theta _3)\,\tilde{u}(\ell _4\,\theta _4)\,\tilde{u}(\ell _5\,\theta _5)\,\tilde{u}(|\boldsymbol{\ell }_1+\boldsymbol{\ell }_2+\boldsymbol{\ell }_3+\boldsymbol{\ell }_4+\boldsymbol{\ell }_5|\,\theta _6). \end{aligned}$ (32)

3.3. Connection of covariance terms to aperture mass correlation functions

We now show that the individual covariance terms are related to correlation functions of the aperture mass. While this might appear like a meaningless algebraic exercise, it has an important corollary: Since the correlation functions can be measured, we can estimate all individual terms of the covariance in the validation data. Thereby, the analytic expressions can be validated.

We define the nth order aperture mass correlation functions $M_{ap}^{n, m}$ ${M^{n,m}_{\rm ap}}$ , as

$\begin{matrix} M_{ap}^{n, m} (θ_{1}, \dots, θ_{m} ; θ_{m + 1}, \dots, θ_{n} ; η) = 〈 M_{ap} (ϑ, θ_{1}) \dots M_{ap} (ϑ, θ_{m}) M_{ap} (ϑ + η, θ_{m + 1}) M_{ap} (ϑ + η, θ_{n}) 〉 . \end{matrix}$ $\begin{aligned} M_{\rm ap}^{n,m}(\theta _1,\dots ,\theta _m;\theta _{m+1},\dots ,\theta _n; \boldsymbol{\eta }) = \left\langle M_{\rm ap}(\boldsymbol{\vartheta }, \theta _1)\,\dots \,M_{\rm ap}(\boldsymbol{\vartheta }, \theta _m)\, M_{\rm ap}(\boldsymbol{\vartheta }+\boldsymbol{\eta }, \theta _{m+1}) M_{\rm ap}(\boldsymbol{\vartheta }+\boldsymbol{\eta }, \theta _n)\right\rangle . \end{aligned}$ (33)

This function is symmetric in its first m aperture radii and in its second n − m aperture radii. The correlation functions can be expressed in terms of the convergence as

$\begin{matrix} M_{ap}^{n, m} (θ_{1}, \dots, θ_{m} ; θ_{m + 1}, \dots, θ_{n} ; η) = [\prod_{i = 1}^{m} \int d^{2} ϑ_{i} U_{θ_{i}} (ϑ_{i})] [\prod_{j = m + 1}^{n} \int d^{2} ϑ_{j} U_{θ_{j}} (| ϑ_{j} + η |)] 〈 κ_{1} \dots κ_{n} 〉, \end{matrix}$ $\begin{aligned} M_{\rm ap}^{n,m}(\theta _1,\dots ,\theta _m;\theta _{m+1},\dots ,\theta _n; \boldsymbol{\eta }) = \Bigg [\prod _{i=1}^m\int \mathrm{d}^2{\vartheta _i}\, U_{\theta _i}(\vartheta _i)\Bigg ]\, \Bigg [\prod _{j=m+1}^n\int \mathrm{d}^2{\vartheta _j}\, U_{\theta _j}(|\boldsymbol{\vartheta }_j+\boldsymbol{\eta }|)\Bigg ] \left\langle \kappa _1\dots \kappa _n\right\rangle , \end{aligned}$ (34)

where κ_i = κ(ϑ_i) and the average over the convergence contains both connected and unconnected terms. Using n = 2, Eq. (18) and the definition of E(η) from Eq. (24), the first Gaussian covariance term can be written as

$\begin{matrix} T_{P P P, 1} (Θ_{1}, Θ_{2}) & = \frac{1}{A^{2}} \int d^{2} α_{1} \int d^{2} α_{2} W_{A} (α_{1}) W_{A} (α_{2}) M_{ap}^{2, 1} (θ_{1} ; θ_{4} ; | α_{1} - α_{2} |) M_{ap}^{2, 1} (θ_{2} ; θ_{5} ; | α_{1} - α_{2} |) \\ \times M_{ap}^{2, 1} (θ_{3} ; θ_{6} ; | α_{1} - α_{2} |) + 5 Perm . \\ = \frac{1}{A} \int_{A} d^{2} η E_{A} (η) M_{ap}^{2, 1} (θ_{1} ; θ_{4} ; η) M_{ap}^{2, 1} (θ_{2} ; θ_{5} ; η) M_{ap}^{2, 1} (θ_{3} ; θ_{6} ; η) + 5 Perm . \end{matrix}$ $\begin{aligned} T_{PPP, 1}(\Theta _1, \Theta _2)&=\frac{1}{A^2}\int \mathrm{d}^2{\alpha _1}\int \mathrm{d}^2{\alpha _2}\; W_A(\boldsymbol{\alpha }_1)\,W_A(\boldsymbol{\alpha }_2)\, M_{\rm ap}^{2,1}(\theta _1; \theta _4;|\boldsymbol{\alpha }_1-\boldsymbol{\alpha }_2|)\,M_{\rm ap}^{2,1}(\theta _2; \theta _5;|\boldsymbol{\alpha }_1-\boldsymbol{\alpha }_2|)\\&\nonumber \quad \times M_{\rm ap}^{2,1}(\theta _3; \theta _6;|\boldsymbol{\alpha }_1-\boldsymbol{\alpha }_2|) \mathrm{+ 5\,Perm.}\\&=\nonumber \frac{1}{A}\int _A \mathrm{d}^2{\eta }\; E_A(\boldsymbol{\eta })\, M_{\rm ap}^{2,1}(\theta _1; \theta _4;\boldsymbol{\eta })\,M_{\rm ap}^{2,1}(\theta _2; \theta _5;\boldsymbol{\eta })\, M_{\rm ap}^{2,1}(\theta _3; \theta _6;\boldsymbol{\eta }) \mathrm{+ 5\,Perm.} \end{aligned}$ (35)

Similarly, T_PPP, 2 can be written as

$\begin{matrix} T_{P P P, 2} (Θ_{1}, Θ_{2}) = M_{ap}^{2} (θ_{1}, θ_{2}) M_{ap}^{2} (θ_{5}, θ_{6}) \frac{1}{A} \int_{A} d^{2} η E_{A} (η) M_{ap}^{2, 1} (θ_{3} ; θ_{4} ; η) + 8 Perm ., \end{matrix}$ $\begin{aligned} T_{PPP, 2}(\Theta _1, \Theta _2)=M_{\rm ap}^2(\theta _1, \theta _2)\, M_{\rm ap}^2(\theta _5, \theta _6)\, \frac{1}{A}\int _A \mathrm{d}^2{\eta }\, E_A(\boldsymbol{\eta })\, M_{\rm ap}^{2,1}(\theta _3; \theta _4;\boldsymbol{\eta }) +\mathrm{8\,Perm.}, \end{aligned}$ (36)

where $M_{ap}^{2} (θ_{1}, θ_{2}) = M_{ap}^{2, 0} (θ_{1}, θ_{2}, η = 0)$ ${M^2_{\rm ap}}(\theta_1, \theta_2) = {M^{2,0}_{\rm ap}}(\theta_1, \theta_2, \eta=0)$ is the second-order aperture statistic.

The term T_BB becomes

$\begin{matrix} T_{BB} (Θ_{1}, Θ_{2}) = \frac{1}{A} \int_{A} d^{2} η E_{A} (η) M_{ap}^{3, 2} (θ_{1}, θ_{2} ; θ_{4} ; η) M_{ap}^{3, 2} (θ_{5}, θ_{6} ; θ_{3} ; η) + 8 Perm . \end{matrix}$ $\begin{aligned} T_{BB}(\Theta _1, \Theta _2)=\frac{1}{A}\int _A \mathrm{d}^2{\eta }\; E_A(\boldsymbol{\eta })\, M_{\rm ap}^{3,2}(\theta _1, \theta _2; \theta _4;\boldsymbol{\eta })\,M_{\rm ap}^{3,2}(\theta _5, \theta _6; \theta _3;\boldsymbol{\eta })+\mathrm{8\,Perm.} \end{aligned}$ (37)

We relate T_PT, 1 and T_PT, 2 to the correlation functions $M_{ap}^{4, 3}$ ${M^{4,3}_{\rm ap}}$ and $M_{ap}^{4, 2}$ ${M^{4,2}_{\rm ap}}$ . We stress that these correlation functions depend on both connected and unconnected correlations of the convergence field. Using Eq. (28), we find

$\begin{matrix} T_{P T, 1} (Θ_{1}, Θ_{2}) = \frac{1}{A} \int_{A} d^{2} η E_{A} (η) M_{ap}^{2, 1} (θ_{1} ; θ_{4} ; η) M_{ap}^{4, 2} (θ_{2}, θ_{3} ; θ_{5}, θ_{6} ; η) + 8 Perm . - [3 T_{P P P, 1} (Θ_{1}, Θ_{2}) + T_{P P P, 2} (Θ_{1}, Θ_{2})], \end{matrix}$ $\begin{aligned} T_{PT, 1}(\Theta _1, \Theta _2)=\frac{1}{A}\int _A \mathrm{d}^2{\eta }\; E_A(\boldsymbol{\eta })\, M_{\rm ap}^{2,1}(\theta _1; \theta _4;\boldsymbol{\eta })\,M_{\rm ap}^{4,2}(\theta _2, \theta _3; \theta _5, \theta _6;\boldsymbol{\eta }) +\mathrm{8\,Perm.} - \big [3\,T_{PPP, 1}(\Theta _1,\Theta _2)+T_{PPP, 2}(\Theta _1,\Theta _2)\big ], \end{aligned}$ (38)

and

$\begin{matrix} T_{P T, 2} (Θ_{1}, Θ_{2}) = M_{ap}^{2} (θ_{1}, θ_{2}) \frac{1}{A} \int_{A} d^{2} η E_{A} (η) M_{ap}^{4, 3} (θ_{4}, θ_{5}, θ_{6} ; θ_{3} ; η) + 5 Perm . - [2 T_{P P P, 2} (Θ_{1}, Θ_{2})], \end{matrix}$ $\begin{aligned} T_{PT, 2}(\Theta _1, \Theta _2)= M_{\rm ap}^2(\theta _1, \theta _2)\,\frac{1}{A}\int _A \mathrm{d}^2{\eta }\; E_A(\boldsymbol{\eta })\,M_{\rm ap}^{4,3}(\theta _4, \theta _5, \theta _6; \theta _3;\boldsymbol{\eta }) + \mathrm{5\,Perm.} - \big [2\,T_{PPP, 2}(\Theta _1,\Theta _2)\big ], \end{aligned}$ (39)

where the square brackets denote the terms caused by unconnected terms in $M_{ap}^{4, 3}$ ${M^{4,3}_{\rm ap}}$ and $M_{ap}^{4, 2}$ ${M^{4,2}_{\rm ap}}$ . The term T_P₆ cannot be obtained from the aperture mass correlation functions independently of the other covariance terms. However, using Eq. (15), the full covariance $C_{{\hat{M}}_{ap}^{3}}$ $C_{{{\hat{M}_{\text{ ap}}}^3}}$ is

$\begin{matrix} C_{{\hat{M}}_{ap}^{3}} (Θ_{1}, Θ_{2}) = \frac{1}{A} \int_{A} d^{2} η E_{A} (η) M_{ap}^{6, 3} (θ_{1}, θ_{2}, θ_{3} ; θ_{4}, θ_{5}, θ_{6} ; η) - {\hat{M}}_{ap}^{3} (Θ_{1}) {\hat{M}}_{ap}^{3} (Θ_{2}), \end{matrix}$ $\begin{aligned} C_{\hat{M}_{\rm ap}^3}(\Theta _1, \Theta _2)=\frac{1}{A}\int _A \mathrm{d}^2{\eta }\; E_A(\boldsymbol{\eta })\, M_{\rm ap}^{6,3}(\theta _1, \theta _2, \theta _3;\theta _4,\theta _5, \theta _6; \boldsymbol{\eta }) -\hat{M}_{\rm ap}^3(\Theta _1)\hat{M}_{\rm ap}^3(\Theta _2), \end{aligned}$ (40)

so T_P₆ can be estimated with

$\begin{matrix} T_{P_{6}} (Θ_{1}, Θ_{2}) = C_{{\hat{M}}_{ap}^{3}} (Θ_{1}, Θ_{2}) - T_{P P P, 1} (Θ_{1}, Θ_{2}) - T_{P P P, 2} (Θ_{1}, Θ_{2}) - T_{BB} (Θ_{1}, Θ_{2}) - T_{P T, 1} (Θ_{1}, Θ_{2}) - T_{P T, 2} (Θ_{1}, Θ_{2}) . \end{matrix}$ $\begin{aligned} T_{P_6}(\Theta _1, \Theta _2) = C_{\hat{M}_{\rm ap}^3}(\Theta _1, \Theta _2) - T_{PPP, 1}(\Theta _1, \Theta _2) -T_{PPP, 2}(\Theta _1, \Theta _2) - T_{BB}(\Theta _1, \Theta _2) - T_{PT, 1}(\Theta _1, \Theta _2) - T_{PT, 2}(\Theta _1, \Theta _2). \end{aligned}$ (41)

The expressions of $C_{{\hat{M}}_{ap}^{3}} (Θ_{1}, Θ_{2})$ $C_{{{\hat{M}_{\text{ ap}}}^3}}(\Theta_1, \Theta_2)$ in terms of the $M_{ap}^{n, m}$ ${M^{n,m}_{\rm ap}}$ also hold if shape noise is present. Measuring the $M_{ap}^{n, m}$ ${M^{n,m}_{\rm ap}}$ from a convergence field with a noise component N is equivalent to replacing the κ in Eq. (34) by K_N = κ + N. Then, as shown in Appendix B, Eq. (40) leads to

$\begin{matrix} 〈 {\hat{M}}_{ap}^{3} {\hat{M}}_{ap}^{3} 〉 (Θ_{1}, Θ_{2}) = \frac{1}{A} \int_{A} d^{2} η E_{A} (η) [\prod_{i = 1}^{3} \int d^{2} ϑ_{i} U_{θ_{i}} (ϑ_{i})] [\prod_{j = 4}^{6} \int d^{2} ϑ_{j} U_{θ_{j}} (| ϑ_{j} + η |)] 〈 K_{N} (ϑ_{1}) \dots K_{N} (ϑ_{6}) 〉 \end{matrix}$ $\begin{aligned} \left\langle \hat{M}_{\rm ap}^3\, \hat{M}_{\rm ap}^3\right\rangle (\Theta _1, \Theta _2)=\frac{1}{A}\int _A \mathrm{d}^2{\eta }\; E_A(\boldsymbol{\eta })\, \Bigg [\prod _{i=1}^3\int \mathrm{d}^2{\vartheta _i}\, U_{\theta _i}(\vartheta _i)\Bigg ]\, \Bigg [\prod _{j=4}^6\int \mathrm{d}^2{\vartheta _j}\, U_{\theta _j}(|\boldsymbol{\vartheta }_j+\boldsymbol{\eta }|)\Bigg ] \left\langle K_N(\boldsymbol{\vartheta }_1)\dots K_N(\boldsymbol{\vartheta }_6)\right\rangle \end{aligned}$ (42)

which is equivalent to $〈 {\hat{M}}_{ap}^{3} {\hat{M}}_{ap}^{3} 〉$ ${\left\langle{{\hat{M}_{\text{ ap}}}^3}\, {{\hat{M}_{\text{ ap}}}^3}\right\rangle}$ with shapenoise, as given by Eq. (B.4).

3.4. Large-field approximation

In the previous sections, we have derived six components for the covariance, which might appear analogous to each other. However, we show here that the terms T_PPP, 2 and T_PT, 2 show significantly different behaviour with survey size than the other terms. For this, we consider the case of a large survey area, for which the window function W_A can be approximated as one everywhere. In this case, the geometry factor G_A becomes

$\begin{matrix} G_{A} (ℓ) \to \frac{{(2 π)}^{2}}{A} δ_{D} (ℓ) . \end{matrix}$ $\begin{aligned} G_A(\boldsymbol{\ell }) \rightarrow \frac{(2\pi )^2}{A} \delta _{\rm D}(\boldsymbol{\ell }). \end{aligned}$ (43)

In this approximation, the terms T_PPP, 1, T_BB, T_PT, 1 and T_P₆ become

$\begin{matrix} T_{P P P, 1}^{\infty} (Θ_{1}, Θ_{2}) & = \frac{1}{A} \int \frac{d^{2} ℓ_{1}}{{(2 π)}^{2}} \int \frac{d^{2} ℓ_{2}}{{(2 π)}^{2}} P (ℓ_{1}) P (ℓ_{2}) P (| ℓ_{1} + ℓ_{2} |) \\ \times [\tilde{u} (ℓ_{1} θ_{1}) \tilde{u} (ℓ_{2} θ_{2}) \tilde{u} (| ℓ_{1} + ℓ_{2} | θ_{3}) \tilde{u} (ℓ_{1} θ_{4}) \tilde{u} (ℓ_{2} θ_{5}) \tilde{u} (| ℓ_{1} + ℓ_{2} | θ_{6}) + 5 Perm .], \end{matrix}$ $\begin{aligned} T_{PPP, 1}^\infty (\Theta _1, \Theta _2)&= \frac{1}{A}\int \frac{\mathrm{d}^2{\ell _1}}{(2\pi )^2} \int \frac{\mathrm{d}^2{\ell _2}}{(2\pi )^2} \; P(\ell _1)\,P(\ell _2)\,P(|\boldsymbol{\ell }_1+\boldsymbol{\ell }_2|)\,\\&\nonumber \quad \times \left[ \tilde{u}(\ell _1\,\theta _1)\,\tilde{u}(\ell _2\,\theta _2)\,\tilde{u}(|\boldsymbol{\ell }_1+\boldsymbol{\ell }_2|\,\theta _3)\,\tilde{u}(\ell _1\,\theta _4)\,\tilde{u}(\ell _2\,\theta _5)\,\tilde{u}(|\boldsymbol{\ell }_1+\boldsymbol{\ell }_2|\,\theta _6) + \mathrm{5\,Perm.} \right], \end{aligned}$ (44)

$\begin{matrix} T_{BB}^{\infty} (Θ_{1}, Θ_{2}) & = \frac{1}{A} \int \frac{d^{2} ℓ_{1}}{{(2 π)}^{2}} \int \frac{d^{2} ℓ_{2}}{{(2 π)}^{2}} \int \frac{d^{2} ℓ_{3}}{{(2 π)}^{2}} B (ℓ_{1}, ℓ_{2}) B (ℓ_{1}, ℓ_{3}) \\ \times [\tilde{u} (ℓ_{1} θ_{1}) \tilde{u} (ℓ_{2} θ_{2}) \tilde{u} (| ℓ_{1} + ℓ_{2} | θ_{3}) \tilde{u} (ℓ_{1} θ_{4}) \tilde{u} (ℓ_{3} θ_{5}) \tilde{u} (| ℓ_{1} + ℓ_{3} | θ_{6}) + 8 Perm .], \end{matrix}$ $\begin{aligned} T_{BB}^\infty (\Theta _1, \Theta _2)&= \frac{1}{A} \int \frac{\mathrm{d}^2{\ell _1}}{(2\pi )^2} \int \frac{\mathrm{d}^2{\ell _2}}{(2\pi )^2}\int \frac{\mathrm{d}^2{\ell _3}}{(2\pi )^2} \; B(\boldsymbol{\ell }_1, \boldsymbol{\ell }_2)\, B(\boldsymbol{\ell }_1, \boldsymbol{\ell }_3)\\&\nonumber \quad \times \left[\tilde{u}(\ell _1\,\theta _1)\,\tilde{u}(\ell _2\,\theta _2)\,\tilde{u}(|\boldsymbol{\ell }_1+\boldsymbol{\ell }_2|\,\theta _3)\,\tilde{u}(\ell _1\,\theta _4)\,\tilde{u}(\ell _3\,\theta _5)\,\tilde{u}(|\boldsymbol{\ell }_1+\boldsymbol{\ell }_3|\,\theta _6) + \mathrm{8\,Perm.}\right], \end{aligned}$ (45)

$\begin{matrix} T_{P T, 1}^{\infty} (Θ_{1}, Θ_{2}) & = \frac{1}{A} \int \frac{d^{2} ℓ_{1}}{{(2 π)}^{2}} \int \frac{d^{2} ℓ_{2}}{{(2 π)}^{2}} \int \frac{d^{2} ℓ_{3}}{{(2 π)}^{2}} P (ℓ_{1}) T (ℓ_{2}, - ℓ_{1} - ℓ_{2}, ℓ_{3}) \\ \times [\tilde{u} (ℓ_{1} θ_{1}) \tilde{u} (ℓ_{2} θ_{2}) \tilde{u} (| ℓ_{1} + ℓ_{2} | θ_{3}) \tilde{u} (ℓ_{1} θ_{4}) \tilde{u} (ℓ_{3} θ_{5}) \tilde{u} (| ℓ_{3} - ℓ_{1} | θ_{6}) + 8 Perm .], \end{matrix}$ $\begin{aligned} T_{PT, 1}^\infty (\Theta _1, \Theta _2)&=\frac{1}{A}\int \frac{\mathrm{d}^2{\ell _1}}{(2\pi )^2}\int \frac{\mathrm{d}^2{\ell _2}}{(2\pi )^2}\int \frac{\mathrm{d}^2{\ell _3}}{(2\pi )^2}\; P(\ell _1)\, T(\boldsymbol{\ell }_2, -\boldsymbol{\ell }_1-\boldsymbol{\ell }_2, \boldsymbol{\ell }_3)\\&\nonumber \quad \times \left[ \tilde{u}(\ell _1\,\theta _1)\,\tilde{u}(\ell _2\,\theta _2)\,\tilde{u}(|\boldsymbol{\ell }_1+\boldsymbol{\ell }_2|\,\theta _3)\,\tilde{u}(\ell _1\,\theta _4)\,\tilde{u}(\ell _3\,\theta _5)\,\tilde{u}(|\boldsymbol{\ell }_3-\boldsymbol{\ell }_1|\,\theta _6) + \mathrm{8\,Perm.}\right], \end{aligned}$ (46)

and

$\begin{matrix} T_{P_{6}}^{\infty} (Θ_{1}, Θ_{2}) & = \frac{1}{A} \int \frac{d^{2} ℓ_{1}}{{(2 π)}^{2}} \int \frac{d^{2} ℓ_{2}}{{(2 π)}^{2}} \int \frac{d^{2} ℓ_{3}}{{(2 π)}^{2}} \int \frac{d^{2} ℓ_{4}}{{(2 π)}^{2}} P_{6} (ℓ_{1}, ℓ_{2}, - ℓ_{1} - ℓ_{2}, ℓ_{3}, ℓ_{4}) \\ \times \tilde{u} (ℓ_{1} θ_{1}) \tilde{u} (ℓ_{2} θ_{2}) \tilde{u} (| ℓ_{1} + ℓ_{2} | θ_{3}) \tilde{u} (ℓ_{3} θ_{4}) \tilde{u} (ℓ_{4} θ_{5}) \tilde{u} (| ℓ_{3} + ℓ_{4} | θ_{6}) . \end{matrix}$ $\begin{aligned} T_{P_6}^\infty (\Theta _1, \Theta _2)&= \frac{1}{A} \int \frac{\mathrm{d}^2{\ell _1}}{(2\pi )^2}\, \int \frac{\mathrm{d}^2{\ell _2}}{(2\pi )^2} \int \frac{\mathrm{d}^2{\ell _3}}{(2\pi )^2} \int \frac{\mathrm{d}^2{\ell _4}}{(2\pi )^2}\;P_6(\boldsymbol{\ell }_1, \boldsymbol{\ell }_2, -\boldsymbol{\ell }_1-\boldsymbol{\ell }_2, \boldsymbol{\ell }_3, \boldsymbol{\ell }_4)\,\\&\nonumber \quad \times \tilde{u}(\ell _1\,\theta _1)\,\tilde{u}(\ell _2\,\theta _2)\,\tilde{u}(|\boldsymbol{\ell }_1+\boldsymbol{\ell }_2|\,\theta _3)\,\tilde{u}(\ell _3\,\theta _4)\,\tilde{u}(\ell _4\,\theta _5)\,\tilde{u}(|\boldsymbol{\ell }_3+\boldsymbol{\ell }_4|\,\theta _6). \end{aligned}$ (47)

These terms scale with the inverse survey area, as is commonly expected for covariance. The terms T_PPP, 2 and T_PT, 2, though, behave differently. Applying the approximation leads to

$\begin{matrix} T_{P P P, 2}^{\infty} (Θ_{1}, Θ_{2}) & = \frac{1}{A} \int \frac{d^{2} ℓ_{1}}{{(2 π)}^{2}} \int \frac{d^{2} ℓ_{3}}{{(2 π)}^{2}} P (ℓ_{1}) P (0) P (ℓ_{3}) [\tilde{u} (ℓ_{1} θ_{1}) \tilde{u} (ℓ_{1} θ_{2}) \tilde{u} (0) \tilde{u} (0) \tilde{u} (ℓ_{3} θ_{5}) \tilde{u} (ℓ_{3} θ_{6}) + 8 Perm .] = 0, \end{matrix}$ $\begin{aligned} T_{PPP, 2}^\infty (\Theta _1, \Theta _2)&= \frac{1}{A}\int \frac{\mathrm{d}^2{\ell _1}}{(2\pi )^2} \int \frac{\mathrm{d}^2{\ell _3}}{(2\pi )^2}\; P(\ell _1)\,P(0)\,P(\ell _3)\, \left[ \tilde{u}(\ell _1\,\theta _1)\,\tilde{u}(\ell _1\,\theta _2)\,\tilde{u}(0)\,\tilde{u}(0)\,\tilde{u}(\ell _3\,\theta _5)\,\tilde{u}(\ell _3\,\theta _6) + \mathrm{8\,Perm.}\right] = 0, \end{aligned}$ (48)

and

$\begin{matrix} T_{P T, 2}^{\infty} (Θ_{1}, Θ_{2}) & = \frac{1}{A} \int \frac{d^{2} ℓ_{1}}{{(2 π)}^{2}} \int \frac{d^{2} ℓ_{2}}{{(2 π)}^{2}} \int \frac{d^{2} ℓ_{3}}{{(2 π)}^{2}} P (ℓ_{1}) T (0, ℓ_{2}, ℓ_{3}) [\tilde{u} (ℓ_{1} θ_{1}) \tilde{u} (ℓ_{1} θ_{2}) \\ \times \tilde{u} (0) \tilde{u} (ℓ_{2} θ_{4}) \tilde{u} (ℓ_{3} θ_{5}) \tilde{u} (| ℓ_{2} + ℓ_{3} | θ_{6}) + 5 Perm .] = 0 . \end{matrix}$ $\begin{aligned} T_{PT, 2}^\infty (\Theta _1, \Theta _2)&= \frac{1}{A} \int \frac{\mathrm{d}^2{\ell _1}}{(2\pi )^2}\int \frac{\mathrm{d}^2{\ell _2}}{(2\pi )^2}\int \frac{\mathrm{d}^2{\ell _3}}{(2\pi )^2}\; P(\ell _1)\, T(0, \boldsymbol{\ell }_2, \boldsymbol{\ell }_3)\, \big [ \tilde{u}(\ell _1\,\theta _1)\,\tilde{u}(\ell _1\,\theta _2)\,\\&\nonumber \quad \times \tilde{u}(0)\,\tilde{u}(\ell _2\,\theta _4)\,\tilde{u}(\ell _3\,\theta _5)\,\tilde{u}(|\boldsymbol{\ell }_2+\boldsymbol{\ell }_3|\,\theta _6) + \mathrm{5\,Perm.} \big ] = 0. \end{aligned}$ (49)

Both terms vanish because the filter function $\tilde{u} (0) = 0$ $\tilde{u}(0) = 0$ for a compensated filter. The same can be observed when writing T_PPP, 2 in terms of the aperture mass correlation function (Eq. (36)). The large-field approximation implies E(η) = 1, so for a square survey with sidelength ϑ_max

$\begin{matrix} T_{P P P, 2}^{\infty} (Θ_{1}, Θ_{2}) = M_{ap}^{2} (θ_{1}, θ_{2}) M_{ap}^{2} (θ_{5}, θ_{6}) \frac{1}{A} \int_{A} d^{2} η M_{ap}^{2, 1} (θ_{3} ; θ_{4} ; η) + 8 Perm ., \end{matrix}$ $\begin{aligned} T_{PPP, 2}^\infty (\Theta _1, \Theta _2)=M_{\rm ap}^2(\theta _1, \theta _2)\, M_{\rm ap}^2(\theta _5, \theta _6)\, \frac{1}{A}\int _A \mathrm{d}^2{\eta }\, M_{\rm ap}^{2,1}(\theta _3; \theta _4;\boldsymbol{\eta }) +\mathrm{8\,Perm.}, \end{aligned}$ (50)

Since $M_{ap}^{2, 1}$ ${M^{2,1}_{\rm ap}}$ is a correlation function of a quantity with vanishing mean, the integral over η vanishes as A → ℝ² and T_PPP, 2 declines to zero faster than 1/A. The same argument can be made for T_PT, 2, which depends on the integral of the correlation function $M_{ap}^{4, 3}$ ${M^{4,3}_{\rm ap}}$ . Consequently, these terms are directly connected to the finiteness of the survey area, which is why we refer to them as ‘finite-field terms’ of the covariance.

As shown in Eq. (23), the geometry factor G_A is typically a strongly oscillating function. Consequently, integrals over this function are difficult to evaluate numerically. Therefore, in the following, we use the approximations $T_{P P P, 1}^{\infty}$ ${T^\infty_{PPP, 1}}$ , $T_{B B}^{\infty}$ ${T^\infty_{BB}}$ and $T_{P T, 1}^{\infty}$ ${T^\infty_{PT, 1}}$ unless otherwise noted. However, as we will show in Sect. 6 using $T_{P_{6}}^{\infty}$ ${T^\infty_{P_6}}$ instead of T_P₆ neglects a significant part of the covariance. As shown in Appendix C, T_P₆ can be approximated as

$\begin{matrix} T_{P_{6}} (Θ_{1}, Θ_{2}) ≃ T_{P_{6}}^{\infty} (Θ_{1}, Θ_{2}) + T_{P_{6}, 2 h} (Θ_{1}, Θ_{2}), \end{matrix}$ $\begin{aligned} T_{P_6}(\Theta _1, \Theta _2) \simeq T_{P_6}^\infty (\Theta _1, \Theta _2) + T_{P_6,\mathrm{2h} }(\Theta _1, \Theta _2), \end{aligned}$ (51)

with T_P₆, 2h given by Eq. (C.7). The term T_P₆, 2h is similar to the finite-field terms T_PPP, 2 and T_PT, 2 since it does not scale inversely with survey area and vanishes under the large-field approximation.

The approximations $T_{P P P, 1}^{\infty}, T_{B B}^{\infty}$ ${T^\infty_{PPP, 1}},{T^\infty_{BB}}$ , and $T_{P T, 1}^{\infty}$ ${T^\infty_{PT, 1}}$ also neglect part of the covariance. However, since these terms are sub-dominant for the total covariance, as we will show in Sect. 5, we deem the approximation appropriate for these terms.

4. Attempt at alternative derivation of Gaussian ${\hat{M}}_{ap}^{3}$ $\hat{\mathit{M}}_{\mathsf{ap}}^{\mathsf{3}}$ covariance from bispectrum covariance

An alternative approach to finding an analytic covariance estimate of a real-space observable is via the covariance of the corresponding Fourier space quantity. As the $〈 M_{ap}^{3} 〉$ ${\left\langle{{M^3_{\rm ap}}}\right\rangle}$ are related to the bispectrum B via Eq. (9), it might appear natural to derive the covariance of $〈 M_{ap}^{3} 〉$ ${\left\langle{{M^3_{\rm ap}}}\right\rangle}$ from the covariance of B. However, this is impossible for finite survey areas without using the large-field approximation, even for Gaussian fields. To see this, we rewrite Eq. (9) as

$\begin{matrix} 〈 M_{ap}^{3} 〉 (θ_{1}, θ_{2}, θ_{3}) = {(2 π)}^{2} [\prod_{i = 1}^{3} \int \frac{d^{2} ℓ_{i}}{{(2 π)}^{2}} \tilde{u} (ℓ_{i} θ_{i})] P_{3} (ℓ_{1}, ℓ_{2}, ℓ_{3}) δ_{D} (ℓ_{1} + ℓ_{2} + ℓ_{3}), \end{matrix}$ $\begin{aligned} \left\langle M_{\rm ap}^3\right\rangle (\theta _1, \theta _2, \theta _3) = (2\pi )^2 \left[\prod _{i=1}^3 \int \frac{\mathrm{d}^2{\ell _i}}{(2\pi )^2}\, \tilde{u}(\ell _i\, \theta _i) \right]\, \mathcal{P} _3(\boldsymbol{\ell }_1, \boldsymbol{\ell }_2, \boldsymbol{\ell }_3) \, \delta _{\rm D}(\boldsymbol{\ell }_1+\boldsymbol{\ell }_2+\boldsymbol{\ell }_3), \end{aligned}$ (52)

where we remind that 𝒫₃(ℓ₁, ℓ₂, ℓ₃) = B(ℓ₁, ℓ₂). One could then assume that the covariance $C_{〈 M_{ap}^{3} 〉}$ $C_{{\left\langle{{M^3_{\rm ap}}}\right\rangle}}$ were given by

$\begin{matrix} C_{{\hat{M}}_{ap}^{3}} (Θ_{1}, Θ_{2}) = {(2 π)}^{4} [\prod_{i = 1}^{6} \int \frac{d^{2} ℓ_{i}}{{(2 π)}^{2}} \tilde{u} (ℓ_{i} θ_{i})] C_{B} (ℓ_{1}, ℓ_{2}, ℓ_{3}, ℓ_{4}, ℓ_{5}, ℓ_{6}) δ_{D} (ℓ_{1} + ℓ_{2} + ℓ_{3}) δ_{D} (ℓ_{4} + ℓ_{5} + ℓ_{6}), \end{matrix}$ $\begin{aligned} C_{\hat{M}_{\rm ap}^3}(\Theta _1, \Theta _2) = (2\pi )^4 \left[\prod _{i=1}^6 \int \frac{\mathrm{d}^2{\ell _i}}{(2\pi )^2}\, \tilde{u}(\ell _i\, \theta _i) \right]\, C_B(\boldsymbol{\ell }_1, \boldsymbol{\ell }_2, \boldsymbol{\ell }_3, \boldsymbol{\ell }_4, \boldsymbol{\ell }_5, \boldsymbol{\ell }_6) \, \delta _{\rm D}(\boldsymbol{\ell }_1+\boldsymbol{\ell }_2+\boldsymbol{\ell }_3)\, \delta _{\rm D}(\boldsymbol{\ell }_4+\boldsymbol{\ell }_5+\boldsymbol{\ell }_6), \end{aligned}$ (53)

where C_B is the covariance of the bispectrum. Then, it should be possible to infer C_B from $C_{{\hat{M}}_{ap}^{3}}$ $C_{{{\hat{M}_{\text{ ap}}}^3}}$ . However, the Gaussian part of $C_{{\hat{M}}_{ap}^{3}}$ $C_{{{\hat{M}_{\text{ ap}}}^3}}$ is

$\begin{matrix} T_{P P P, 1} (Θ_{1}, Θ_{2}) + T_{P P P, 2} (Θ_{1}, Θ_{2}) & = {(2 π)}^{6} [\prod_{i = 1}^{6} \int \frac{d^{2} ℓ_{i}}{{(2 π)}^{2}} \tilde{u} (ℓ_{i} θ_{i})] P (ℓ_{1}) P (ℓ_{2}) P (ℓ_{3}) \\ \times {[G_{A} (ℓ_{1} + ℓ_{2} + ℓ_{3}) δ_{D} (ℓ_{1} + ℓ_{4}) δ_{D} (ℓ_{2} + ℓ_{5}) δ_{D} (ℓ_{3} + ℓ_{6}) + 5 Perm .] \\ + [G_{A} (ℓ_{2}) δ_{D} (ℓ_{1} + ℓ_{2}) δ_{D} (ℓ_{3} + ℓ_{4}) δ_{D} (ℓ_{5} + ℓ_{6}) + 8 Perm .]} . \end{matrix}$ $\begin{aligned} T_{PPP, 1}(\Theta _1, \Theta _2) + T_{PPP, 2}(\Theta _1, \Theta _2)&= (2\pi )^6 \left[\prod _{i=1}^6\int \frac{\mathrm{d}^2{\ell _i}}{(2\pi )^2}\, \tilde{u}(\ell _i\, \theta _i) \right]\, P(\ell _1)\, P(\ell _2)\, P(\ell _3)\\&\nonumber \quad \times \Big \{[G_A(\boldsymbol{\ell }_1+\boldsymbol{\ell }_2+\boldsymbol{\ell }_3)\,\delta _{\rm D}(\boldsymbol{\ell }_1+\boldsymbol{\ell }_4)\,\delta _{\rm D}(\boldsymbol{\ell }_2+\boldsymbol{\ell }_5)\,\delta _{\rm D}(\boldsymbol{\ell }_3+\boldsymbol{\ell }_6) + \mathrm{5\,Perm.} ]\\&\nonumber \qquad + [G_A(\boldsymbol{\ell }_2)\,\delta _{\rm D}(\boldsymbol{\ell }_1+\boldsymbol{\ell }_2)\,\delta _{\rm D}(\boldsymbol{\ell }_3+\boldsymbol{\ell }_4)\,\delta _{\rm D}(\boldsymbol{\ell }_5+\boldsymbol{\ell }_6) + \mathrm{8\,Perm.}]\Big \}. \end{aligned}$ (54)

For Eq. (53) to hold true, we need the expression in braces to equal f(ℓ₁, ℓ₂, ℓ₃, ℓ₄, ℓ₅, ℓ₆) δ_D(ℓ₁ + ℓ₂ + ℓ₃) δ_D(ℓ₄ + ℓ₅ + ℓ₆) for some function f. However, this is impossible unless G_A is proportional to a Dirac function. Therefore, Eq. (53) cannot hold, and $C_{〈 M_{ap}^{3} 〉}$ $C_{{\left\langle{{M^3_{\rm ap}}}\right\rangle}}$ and C_B are not related by simple integration.

Under the large-field approximation, though, Eq. (53) can be used. We show this in Appendix D, where we start from the Gaussian bispectrum covariance from Joachimi et al. (2009) to recover $T_{P P P, 1}^{\infty}$ ${T^\infty_{PPP, 1}}$ .

5. Validation of model covariance

We test the analytical model for $C_{{\hat{M}}_{ap}^{3}}$ $C_{{{\hat{M}_{\text{ ap}}}^3}}$ by comparing it to numerical estimates from mock data. This section describes the used data, measurement method, and results.

5.1. Validation data

We use two kinds of mock data: Shear maps for Gaussian density distributions and realistic shear and convergence maps from two cosmological N-body simulations, namely the Scinet LIghtcone Simulations (SLICS, Harnois-Déraps & van Waerbeke 2015) and the suite by Takahashi et al. (2017, T17 hereafter). The data from N-body simulations are the same as for Paper I, which we describe here again for convenience.

5.1.1. SLICS

The SLICS contain 1536³ particles inside a 505 h⁻¹ Mpc box. They are gravitationally evolved according to a flat ΛCDM cosmology with normalised Hubble constant h = 0.69, matter clustering parameter σ₈ = 0.83, mater density parameter Ω_m = 0.29, baryon density parameter Ω_b = 0.047, and primordial power spectrum spectral index n_s = 0.969. We use two different sets of data products from these simulations. The first consists of mock shear catalogues from 924 pseudo-independent lines of sight, each with a square area of 100 deg². The source galaxies are distributed with a redshift distribution of

$\begin{matrix} n (z) \propto z^{2} e^{- {(z / z_{0})}^{β}}, \end{matrix}$ $\begin{aligned} n(z) \propto z^2\, \mathrm{e} ^{-\left(z/z_0\right)^\beta }, \end{aligned}$ (55)

with z₀ = 0.637, β = 1.5 and normalisation such that the overall galaxy density is 30 arcmin⁻². This redshift distribution and number density correspond to expectations for stage IV surveys. The shear catalogues are infused with shape noise by adding random ellipticities from a Gaussian. We use a two-component ellipticity dispersion of $σ_{ϵ}^{2}$ $\sigma_\epsilon^2$ = (0.37)².

The second set of data products from the SLICS are shape noise-free convergence maps. These convergence maps have the same area and cosmology as the shear lines of sight, but they were obtained with all source galaxies situated at redshift z_s = 1.

5.1.2. T17 Simulations

The T17 simulations are constructed from a series of boxes of side lengths L, 2L, 3L, with L = 450 h⁻¹ Mpc. The 2048³ particles in the box are evolved using GADGET2 (Springel et al. 2001) and a flat ΛCDM cosmology with parameters h = 0.7, σ₈ = 0.82, Ω_m = 0.279, Ω_b = 0.046, and n_s = 0.97. Using GRayTrix², light rays are traced through 108 realisations of the simulation, creating full-sky convergence shells at 38 redshifts.

We create mock data similar to the KiDS-1000 data release of the Kilo Degree Survey (KiDS, Kuijken et al. 2015) from these convergence shells by cutting out an area of 859.4 deg² from the convergence shells, combining them according to the KiDS-1000 n(z) shown in Fig. 3 (Hildebrandt et al. 2021), and adding a realistic shape noise with a standard deviation

$\begin{matrix} σ = \frac{σ_{ϵ}}{\sqrt{2 n_{gal} A_{pix}}}, \end{matrix}$ $\begin{aligned} \sigma =\frac{\sigma _\epsilon }{\sqrt{2 n_{\rm gal}\, A_{\rm pix}}}, \end{aligned}$ (56)

Fig. 3.

Redshift distribution constructed from the T17 simulation given the fiducial n(z) of the KiDS-1000 data.

with the pixel area A_pix, n_gal = 6.17 arcmin⁻², and σ_ϵ = 0.375 (Giblin et al. 2021).

5.1.3. Gaussian random fields (GRFs)

We also generate shear maps for Gaussian density distributions, characterised solely by their power spectrum. We use a shape-noise-free cosmic shear power spectrum based on the non-linear matter power spectrum from the revised halofit prescription (Takahashi et al. 2012). We use the cosmological parameters of the T17 simulations and the KiDS-1000 n(z).

From the power spectrum, we create square maps of the shear γ using the python library lenstools (Petri 2016) with the assumption of periodic boundary conditions. We create 4096 realisations for each of the sidelengths $ϑ_{max}^{'} \in {10^{°}, 15^{°}, 20^{°}}$ ${\vartheta^\prime_{\rm max}} \in \lbrace 10^\circ, 15^\circ, 20^\circ \rbrace$ . Each map contains N_pix = 4096 × 4096 pixels.

5.2. Covariance estimation from data

We estimate the covariance of ${\hat{M}}_{ap}^{3}$ ${{\hat{M}_{\text{ ap}}}^3}$ in three ways from the validation data. In the first method, we measure the aperture statistics $〈 M_{ap}^{3} 〉$ ${\left\langle{{M^3_{\rm ap}}}\right\rangle}$ on all simulation realisations using FFT and then use the sample covariance as an estimate. In the second method, we measure the third-order shear correlation functions Γ_i and convert them to estimates of $〈 M_{ap}^{3} 〉$ ${\left\langle{{M^3_{\rm ap}}}\right\rangle}$ for all simulation realisations. Finally, in the third method, we measure the aperture mass correlation functions and convert them to the terms T_PPP, 1 to T_P₆ (see Sect. 3.3).

We choose the same aperture filter scales as used in Paper I between 4′and 32′. There, we showed that $〈 M_{ap}^{3} 〉$ ${\left\langle{{M^3_{\rm ap}}}\right\rangle}$ at these scales contains cosmological information that is complementary to the information content of second-order statistics. Our choice of scales also is impacted by two practical concerns: At aperture filters larger than 32′the signal-to-noise for $〈 M_{ap}^{3} 〉$ ${\left\langle{{M^3_{\rm ap}}}\right\rangle}$ becomes small, and at scales smaller than 4′, the $〈 M_{ap}^{3} 〉$ ${\left\langle{{M^3_{\rm ap}}}\right\rangle}$ -model disagrees with simulations (see Fig. 8 in Paper I).

5.2.1. Using FFT

The measurement of $〈 M_{ap}^{3} 〉$ ${\left\langle{{M^3_{\rm ap}}}\right\rangle}$ in the simulations uses the same setup as the method described in Paper I. We will summarise it here shortly. For the flat GRFs and the SLICS, we measure $C_{{\hat{M}}_{ap}^{3}}$ $C_{{{\hat{M}_{\text{ ap}}}^3}}$ in the shear maps in three steps.

At first, we measure M_ap_i(ϑ, θ) for each realisation i, using scale radii θ ∈ {4′,8′,16′} for the SLICS and θ ∈ {4′,8′,16′,32′} for the GRFs. We use Eq. (7), which defines aperture mass maps as convolutions of the filter function Q and the tangential shear. We evaluate these convolutions using the convolution theorem, by first Fourier transforming Q and γ_t with FFT, then multiplying them and inverse Fourier transforming the product. To remove edge effects, we cut off a border of 4 times the largest aperture radius, that is 4 × 32′ = 128′ for the GRFs and 4 × 16′ = 64′ for the SLICS, from each side of the M_ap_i-maps. This large cut-off is needed because our filter function Q_θ has infinite support and is significantly non-zero even for ϑ > θ. The filter contains 99.9% of its power within 4θ, so with the cut-off, boundary effects are negligible. The resulting M_ap_i-maps have side lengths $ϑ_{max} \in} 5 \overset{°}{.} 73, 10 \overset{°}{.} 73, 15 \overset{°}{.} 73 {$ ${\vartheta_{\text{ max}}}\in \} 5{{\overset{\circ}{.}}}73, 10{{\overset{\circ}{.}}}73, 15{{\overset{\circ}{.}}}73 \{$ for the GRFs and $7 \overset{°}{.} 87$ $7{{\overset{\circ}{.}}}87$ for the SLICS.

Then, we estimate ${〈 M_{ap}^{3} 〉}_{i}$ $\langle{{M_{\rm ap}^3}\rangle}_i$ for each realisation i. For this, we multiply M_ap_i(ϑ, θ₁), M_ap_i(ϑ, θ₂), and M_ap_i(ϑ, θ₃) and then average over ϑ. Finally, we estimate the measured covariance $C_{{\hat{M}}_{ap}^{3}}^{meas}$ $C^{\mathrm{meas}}_{{{\hat{M}_{\text{ ap}}}^3}}$ with

$\begin{matrix} C_{{\hat{M}}_{ap}^{3}}^{sim} (Θ_{1}, Θ_{2}) & = \frac{1}{N - 1} \sum_{i = 1}^{N} {〈 M_{ap}^{3} 〉}_{i} (θ_{1}, θ_{2}, θ_{3}) {〈 M_{ap}^{3} 〉}_{i} (θ_{4}, θ_{5}, θ_{6}) \\ - \frac{1}{N (N - 1)} \sum_{i = 1}^{N} {〈 M_{ap}^{3} 〉}_{i} (θ_{1}, θ_{2}, θ_{3}) \sum_{j = 1}^{N} {〈 M_{ap}^{3} 〉}_{j} (θ_{4}, θ_{5}, θ_{6}) . \end{matrix}$ $\begin{aligned} C^\mathrm{sim} _{\hat{M}_{\rm ap}^3}(\Theta _1, \Theta _2)&= \frac{1}{N-1} \sum _{i=1}^N \left\langle M_{\rm ap}^3\right\rangle _i(\theta _1, \theta _2, \theta _3) \left\langle M_{\rm ap}^3\right\rangle _i(\theta _4, \theta _5, \theta _6)\\&\nonumber \quad -\frac{1}{N (N-1)} \sum _{i=1}^N \left\langle M_{\rm ap}^3\right\rangle _i(\theta _1, \theta _2, \theta _3) \sum _{j=1}^N \left\langle M_{\rm ap}^3\right\rangle _j(\theta _4, \theta _5, \theta _6). \end{aligned}$ (57)

For the curved-sky T17 convergence maps, we smooth the maps with the healpy function smoothing with a beam window function determined by the corresponding U_θ filter for the same aperture radii as for the GRFs, which are θ ∈ {4′,8′,16′,32′}. Each filter yields a full-sky aperture mass map M_ap_i, from which we extracted 18 tiles that are not adjacent. This gives in total 1944 almost independent realisations.

We compute uncertainty estimates with bootstrapping for both the SLICS and the T17 covariances from FFT. For this, we generate 10 000 vectors containing N_real randomly drawn integers between 1 and N_real where N_real is the number of realisations (924 for the SLICS and 1944 for the T17). The vectors can explicitly contain the same integer multiple times. For each vector, we take the ${〈 M_{ap}^{3} 〉}_{i}$ $\langle{{M_{\rm ap}^3}\rangle}_i$ for which i is an element in the vector and use it to calculate a sample covariance with Eq. (57). In this way, we obtain 10 000 covariance estimates. We take the standard deviation of these estimates as uncertainty for the $C_{{\hat{M}}_{ap}^{3}}^{sim}$ $C_{{{\hat{M}_{\text{ ap}}}^3}}^{\mathrm{sim}}$ .

5.2.2. Using third-order shear correlation functions

As mentioned in Sect. 2, $〈 M_{ap}^{3} 〉$ ${\left\langle{{M^3_{\rm ap}}}\right\rangle}$ is related to the natural components Γ_i of the third-order shear correlation function (see also Sect. 5.3 in Heydenreich et al. 2023). Therefore, we can estimate the ${〈 M_{ap}^{3} 〉}_{i}$ $\langle{{M_{\rm ap}^3}\rangle}_i$ of each simulation realisation i by first measuring the Γ_i and then converting them to ${〈 M_{ap}^{3} 〉}_{i}$ $\langle{{M_{\rm ap}^3}\rangle}_i$ , following Eqs. (69) and (71) of Schneider et al. (2005). The covariance of ${\hat{M}}_{ap}^{3}$ ${{\hat{M}_{\text{ ap}}}^3}$ can then be estimated as the sample variance of the ${〈 M_{ap}^{3} 〉}_{i}$ $\langle{{M_{\rm ap}^3}\rangle}_i$ , using Eq. (57).

We expect the covariance estimate from this approach to be smaller than the estimate from the FFT-based method described in the previous subsection. This is because, here, we do not cut off the border of the survey area. Consequently, more shear information is used to estimate $〈 M_{ap}^{3} 〉$ ${\left\langle{{M^3_{\rm ap}}}\right\rangle}$ , leading to a smaller covariance.

The measurement of the Γ_i is very time-intensive. Therefore, we restrict this approach to the SLICS data, where we measure the Γ_i using treecorr (Jarvis et al. 2004). We use ten bins each for the triangle parameters u, v, and r with r between $0 \overset{'}{.} 1$ $0{{\overset{\prime}{.}}}1$ and 120′. As shown in Paper I, binning the Γ_i in this way yields accurate $〈 M_{ap}^{3} 〉$ ${\left\langle{{M^3_{\rm ap}}}\right\rangle}$ for scale radii between 4′and 16′.

5.2.3. Using aperture mass correlation functions

As shown in Sect. 3.3, the individual terms of the covariance can be estimated from the aperture mass correlation functions, which we use to validate the analytic expressions.

For this, we measure the aperture mass maps M_ap_i(ϑ, θ) according to Sect. 5.2.1. From these maps, we first calculate the M_ap_i(ϑ, θ₁)M_ap_i(ϑ, θ₂) and M_ap_i(ϑ, θ₁)M_ap_i(ϑ, θ₂)M_ap_i(ϑ, θ₃) by multiplying the aperture mass maps for different filter radii. Then, we estimate the correlation functions $M_{ap}^{2, 1} (θ_{1}; θ_{2}; η)$ ${M^{2,1}_{\rm ap}}(\theta_1;\theta_2; \eta)$ , $M_{ap}^{3, 1} (θ_{1}, θ_{2}; θ_{3}; η)$ ${M^{3,1}_{\rm ap}}(\theta_1,\theta_2; \theta_3; \eta)$ , $M_{ap}^{4, 3} (θ_{1}, θ_{2}, θ_{3}; θ_{4}; η)$ ${M^{4,3}_{\rm ap}}(\theta_1,\theta_2,\theta_3; \theta_4; \eta)$ , and $M_{ap}^{4, 2} (θ_{1}, θ_{2}; θ_{3}, θ_{4}; η)$ ${M^{4,2}_{\rm ap}}(\theta_1,\theta_2; \theta_3, \theta_4; \eta)$ by correlating the relevant aperture mass maps: We first extend the fields M_ap_i(ϑ, θ) using zero-padding, meaning that we add pixel grids containing only zeros to the boundary of the map. Doing this, M_ap_i(ϑ, θ) becomes W_A(ϑ) M_ap_i(ϑ, θ). The correlation between two aperture mass maps then yields

$\begin{matrix} \int_{A} d^{2} ϑ W_{A} (ϑ) {M_{ap}}_{i} (ϑ, θ_{1}) W_{A} (ϑ + η) {M_{ap}}_{i} (ϑ + η, θ_{2}) = E (η) M_{ap}^{2} (θ_{1}, θ_{2} ; η) . \end{matrix}$ $\begin{aligned} \int _A\mathrm{d}^2\vartheta \; W_A(\boldsymbol{\vartheta })\,{M_{\rm ap}}_i(\boldsymbol{\vartheta }, \theta _1)\,W_A(\boldsymbol{\vartheta }+\boldsymbol{\eta })\,{M_{\rm ap}}_i(\boldsymbol{\vartheta }+\boldsymbol{\eta }, \theta _2) = E(\boldsymbol{\eta })\,M_{\rm ap}^{2}(\theta _1,\theta _2;\boldsymbol{\eta }). \end{aligned}$ (58)

We perform this correlation using the correlate2D function from scipy. Finally, we calculate the terms T_PPP, 1 to T_P₆ with the expressions in Sect. 3.3.

Our method to measure the correlation function uses the flat-sky approximation, so we did not apply it to the T17 simulations. Furthermore, the correlation function estimation of the terms T_PPP, 2 and T_PT, 2 is strongly affected by noise, which is why we only use it for the SLICS without shape noise.

5.3. Results

5.3.1. Gaussian random fields

We compare the covariance for $〈 M_{ap}^{3} 〉$ ${\left\langle{{M^3_{\rm ap}}}\right\rangle}$ measured in the simulated GRFs with the analytical expression T_PPP, 1 + T_PPP, 2 and the large-field approximation $T_{P P P, 1}^{\infty}$ ${T^\infty_{PPP, 1}}$ in Fig. 4. For the GRFs, all terms containing the bi-, tri-, or pentaspectrum vanish, so the middle column gives the full analytical prediction for the covariance. We see that, as expected, both simulated and analytical covariance decrease with increasing survey size. The full analytic covariance shows similar values to the simulated covariance for all combinations of aperture radii. The large-field approximation agrees only on and near the diagonal with the simulation and is much smaller for elements far from the diagonal.

Fig. 4.

$C_{{\hat{M}}_{ap}^{3}}$ $C_{{{\hat{M}_{\text{ ap}}}^3}}$ for the GRF, each row is showing a different field size. In the left column are the measured $C_{{\hat{M}}_{ap}^{3}}$ $C_{{{\hat{M}_{\text{ ap}}}^3}}$ from the simulated GRF. The middle column shows the model prediction, including the finite-field term T_PPP, 2. The covariances under the large-field approximation, for which T_PPP, 2 vanishes, are in the right column.

This observation is quantified in Fig. 5, where we show the fractional differences between the simulated and analytically calculated covariances. The full analytical expression shows deviations of less than 30% to the simulation, while the large-field approximation is almost a factor of two too small for the non-diagonal elements. While the deviations decrease with field size, they are still significant for the largest side length of $15 \overset{°}{.} 73$ $15{{\overset{\circ}{.}}}73$ . Consequently, neglecting T_PPP, 2 is inappropriate, even for surveys of a size of ${(15 \overset{°}{.} 73)}^{2} = 247.43 \deg^{2}$ $(15{{\overset{\circ}{.}}}73)^2 = 247.43\, \mathrm{deg}^2$ .

Fig. 5.

Fractional differences of the model covariance to the measured $C_{{\hat{M}}_{ap}^{3}}$ $C_{{{\hat{M}_{\text{ ap}}}^3}}$ for the GRF, each row showing a different field size. In the left column are the fractional differences between the full analytic model and the simulated GRF. The middle column shows the difference between the large-field approximation, neglecting the finite field term, and the simulated covariance. In the right column, the difference of the model using the large-field approximation for T_PPP, 1, but keeping T_PPP, 2 as it is, is shown.

The remaining disagreement between the full analytic covariance model and the simulations can be explained by the discreteness of the shear maps. Due to the finite pixel number, the shear maps contain only modes which are $ℓ = 2 π \sqrt{i^{2} + j^{2}} / ϑ_{\max}^{'}$ $\ell=2\pi\sqrt{i^2+j^2}/\vartheta^{\prime}_{\mathrm{max}}$ , where i and j are integers between 0 and N_pix/2, where N_pix is the pixel number and $ϑ_{max}^{'}$ $\vartheta^\prime_{\rm max}$ is the side length of the shear maps (before boundary cut-off). Therefore, while the model assumes a smooth power spectrum P(ℓ), the actual power spectrum of the GRFs is

$\begin{matrix} P^{GRF} (ℓ) = {(\frac{2 π}{ϑ_{\max}^{'}})}^{2} \sum_{i = 1}^{N_{pix} / 2} \sum_{j = 1}^{N_{pix} / 2} P (\frac{2 π \sqrt{i^{2} + j^{2}}}{ϑ_{\max}^{'}}) δ_{D} (ℓ_{1} - \frac{2 π i}{ϑ_{\max}^{'}}) δ_{D} (ℓ_{2} - \frac{2 π j}{ϑ_{\max}^{'}}), \end{matrix}$ $\begin{aligned} P^\mathrm{GRF} (\boldsymbol{\ell })=\left(\frac{2\pi }{\vartheta^\prime _{\rm max}}\right)^2\,\sum _{i=1}^{N_{\rm pix}/2}\sum _{j=1}^{N_{\rm pix}/2} P\left(\frac{2\pi \sqrt{i^2+j^2}}{\vartheta^\prime _{\rm max}}\right)\, \delta _{\rm D}\left(\ell _1-\frac{2\pi i}{\vartheta^\prime _{\rm max}}\right)\, \delta _{\rm D}\left(\ell _2-\frac{2\pi j}{\vartheta^\prime _{\rm max}}\right), \end{aligned}$ (59)

with ℓ = (ℓ₁, ℓ₂). Due to the large pixel number, it is not feasible to directly calculate T_PPP, 1 and T_PPP, 2 for this power spectrum. However, using P^GRF is similar to assuming that P is zero for ℓ outside of [ℓ_min, ℓ_max], with ℓ_min = 2π/ $ϑ_{max}^{'}$ $\vartheta^\prime_{\rm max}$ , and ℓ_max = π N_pix/ $ϑ_{max}^{'}$ $\vartheta^\prime_{\rm max}$ . This is the same as increasing the lower and decreasing the upper integration boundary for the ℓ-integrals in T_PPP, 1 and T_PPP, 2. We show the impact of these integration borders in Fig. 6. The upper boundary ℓ_max, and consequently the pixel number, has only a small impact on the model covariance. This is because the aperture filters $\hat{u}$ $\hat{u}$ decrease with ℓ and are already small at ℓ ≃ ℓ_max. The lower boundary ℓ_min has a stronger impact. By increasing the lower integration border to ℓ_min, we decrease the model covariance. This decrease significantly improves the agreement between the simulated GRFs and the model.

Fig. 6.

Fractional differences between covariance model and estimate from GRFs with $ϑ_{\max} = 5 \overset{°}{.} 78$ $\vartheta_{\mathrm{max}}= 5{{\overset{\circ}{.}}}78$ for different choices of the boundaries of the ℓ-integrals. Left: Fiducial calculation of model, where P(ℓ) is used for all ℓ. Middle: ℓ-modes larger than ℓ_max are cut-off. This removes modes not present in the simulation due to the finite pixel size. Right: ℓ-modes smaller than ℓ_min are cut-off. This removes modes not present in the simulation due to the finite field size.

As the field size (and $ϑ_{max}^{'}$ $\vartheta^\prime_{\rm max}$ ) increases, ℓ_min decreases so that smaller ℓ-modes are included in the GRFs. Therefore, the difference between the model, evaluated for all ℓ, and the covariance from the GRFs shrinks with increasing field size, as seen in the right column of Fig. 5.

We also show in Fig. 5 the fractional difference between $T_{P P P, 1}^{\infty} + T_{P P P, 2}$ ${T^\infty_{PPP, 1}} + {T_{PPP, 2}}$ and the simulated covariance. This approximated analytic expression shows a similar level of agreement with the simulated covariance as the full expression. The cause of this good agreement is the similarity between T_PPP, 1 and $T_{P P P, 1}^{\infty}$ ${T^\infty_{PPP, 1}}$ , shown in Fig. 7. Even for the smallest field size with side length $ϑ_{\max} = 5 \overset{°}{.} 73$ $\vartheta_{\mathrm{max}}= 5{{\overset{\circ}{.}}}73$ , most elements of T_PPP, 1 and $T_{P P P, 1}^{\infty}$ ${T^\infty_{PPP, 1}}$ differ by less than 5%, with stronger deviations mostly at very large aperture radii (32′) or very small aperture radii (4′). For the largest field, all elements agree at better than 15%.

Fig. 7.

Fractional differences of T_PPP, 1 with and without large-field approximation for different field sizes.

The agreement between T_PPP, 1 and $T_{P P P, 1}^{\infty}$ ${T^\infty_{PPP, 1}}$ encourages a practical simplification, which is substituting T_PPP, 1 by the large-field approximation. This significantly simplifies the calculation: Eq. (44) to obtain $T_{P P P, 1}^{\infty}$ ${T^\infty_{PPP, 1}}$ contains only three integrals, while Eq. (20) for T_PPP, 1 contains six integrals, two of which involve oscillating functions. Consequently, the numerical integration for $T_{P P P, 1}^{\infty}$ ${T^\infty_{PPP, 1}}$ is much simpler and faster.

The finite field effect, which gives rise to T_PPP, 2, can be important for covariance estimates derived from simulations. Such estimates are sometimes performed on simulations with a smaller area than the survey to which the covariance is applied. To account for the different areas, the covariances are rescaled by a factor A_sim/A_survey, where A_sim and A_survey are the simulation and survey areas, respectively. This rescaling assumes that the covariance is inversely proportional to the survey area, which is only true under the large-field approximation. While $T_{P P P, 1}^{\infty}$ ${T^\infty_{PPP, 1}}$ indeed scales with the inverse survey area, T_PPP, 2 decreases faster, as shown in Sect. 3.4. We demonstrate the failure of the rescaling approach in Fig. 8, where we show the fractional difference between the covariance for the simulated GRF with a side length of $15 \overset{°}{.} 73$ $15{{\overset{\circ}{.}}}73$ and the analytical estimate for the side length $5 \overset{°}{.} 73$ $5{{\overset{\circ}{.}}}73$ , rescaled by the factor (5.73/15.73)². The rescaled covariance is too large by a factor of up to 4.7.

Fig. 8.

Fractional difference of model covariance, rescaled from a field size of $5 \overset{°}{.} 73^{2}$ $5{{\overset{\circ}{.}}}73^2$ to $15 \overset{°}{.} 73^{2}$ $15{{\overset{\circ}{.}}}73^2$ to the covariance from a simulated GRF of size $15 \overset{°}{.} 73^{2}$ $15{{\overset{\circ}{.}}}73^2$ . The rescaling is performed under the assumption that the covariance scales inversely proportional to the survey area, which is not true for the finite field term T_PPP, 2.

5.3.2. N-body simulations

We now compare the full analytic $C_{{\hat{M}}_{ap}^{3}}$ $C_{{{\hat{M}_{\text{ ap}}}^3}}$ , including the non-Gaussian terms, to the covariances from the N-body simulations. Figure 9 compares the $C_{{\hat{M}}_{ap}^{3}}$ $C_{{{\hat{M}_{\text{ ap}}}^3}}$ and $C_{{\hat{M}}_{ap}^{3}}^{sim}$ $C_{{{\hat{M}_{\text{ ap}}}^3}}^{\mathrm{sim}}$ for the SLICS and the T17 simulations. For the SLICS, we see deviations of 1 to 2 times the simulation uncertainty, with the model being larger than the simulated covariance at most scales. These deviations could indicate that the power-, tri-, and pentaspectrum of the SLICS do not correspond to the model polyspectra. This could be caused by the finite resolution and smoothing in the simulations, the need for additional halo terms for the tri- and pentaspectrum in the model, or the general inaccuracy of the halo model. However, for the T17, we generally see a better agreement between simulation and model. Most elements agree within the simulation accuracy, with a few elements outside the 2σ range.

Fig. 9.

Comparison of measured and modelled covariance for the SLICS (top) and T17 simulations (bottom). Left are the measured covariance, middle are the model predictions; right are the differences between model and simulation, normalised by the simulation bootstrap error.

We compare the diagonals of the individual terms of the analytic covariance in Fig. 10. We see that for both the SLICS and T17 setup, the non-Gaussian terms are dominant. For the SLICS, T_P₆ dominates on the scales considered here, whereas for the T17, the first Gaussian term T_PPP, 1 is of a similar magnitude as T_P₆. This difference is caused by the different levels of shape noise in the two simulations. For the SLICS, we use a stage IV-like setup, where the galaxy number density is n_g = 30 arcmin⁻², while the T17 use the KiDS-like number density of n_g = 6.17 arcmin⁻². Therefore, the Gaussian covariance term for the T17 has a larger impact than for the SLICS. The relative importance of the Gaussian and non-Gaussian terms is also related to the considered aperture radii. For larger scales, the importance of the Gaussian terms increases, while the contribution of the pentaspectrum-term decreases, as shown in Fig. 11 for the SLICS setup and aperture radii between 16′and 64′. However, we stress that we consider these large scales suboptimal for cosmological analyses with $〈 M_{ap}^{3} 〉$ ${\left\langle{{M^3_{\rm ap}}}\right\rangle}$ since, as mentioned in Sect. 5.2, the matter field is more Gaussian and the signal-to-noise of third-order statistics is small.

Fig. 10.

Diagonal of $C_{{\hat{M}}_{ap}^{3}}$ $C_{{{\hat{M}_{\text{ ap}}}^3}}$ and the individual covariance terms for the SLICS (left) and the T17 (right). The measurement from the simulation is shown as a red dashed line, while the full modelled $C_{{\hat{M}}_{ap}^{3}}$ $C_{{{\hat{M}_{\text{ ap}}}^3}}$ is the bold black line. The other lines show individual terms. Error bars on the simulated covariances originate from bootstrapping.

Fig. 11.

Model predictions for the individual covariance terms for the SLICS-like setup.

As mentioned before, we can also directly measure the individual covariance terms and compare them to the analytic prediction in Fig. 12 using aperture mass correlation functions. Unfortunately, measuring the finite-field terms T_PPP, 2 and T_PT, 2 is challenging: In Eqs. (36) and (39), one can see that, when neglecting the function E_A(η), the integral over the correlation function is, by definition, zero. So the only part that leads to T_PPP, 2 and T_PT, 2 being non-zero is multiplication with the slowly varying function E_A(η). This amplifies the effect of noise in the measured correlation function. Thus, the measurements of T_PPP, 2 and T_PT, 2 are very sensitive to the noise level, and we can only perform them in the SLICS without shape noise. We note that the source galaxies in this data set are all situated at redshift z = 1 and do not follow the same redshift distribution as the data used for Fig. 10. The measured and modelled covariance terms show similar dependencies on the aperture scale radii. The model for the dominating term T_P₆ agrees within a few per cent with the measurement. For T_PPP, 1, T_PPP, 2, and T_PT, 2, though, the model underpredicts the measurements. This difference is likely due to differences between the model polyspectra and the actual polyspectra of the simulation. Since our trispectrum model contains only the 1-halo term, we do not expect a perfect match between the modelled T_PT, 1 and T_PT, 2 and the measurement. Furthermore, the BiHalofit bispectrum model only agrees with the data at the 10% level (compare to Appendix A of Heydenreich et al. 2023), so deviations at the 20% level are expected for T_BB, which depends on the square of the bispectrum.

Fig. 12.

Comparison of individual covariance terms modelled (solid) and measured (dashed) in the SLICS without shape noise and with all sources at redshift 1.

The term T_P₆, 2h gives a significant contribution to T_P₆. We show this in Fig. 13, where the blue line shows the diagonal of the ratio between T_P₆, 2h and $T_{P_{6}}^{\infty}$ ${T^\infty_{P_6}}$ + T_P₆, 2h for the T17 setup. The two-halo term is substantial for small apertures and the combinations of small and large apertures. The figure also shows the same ratio for larger field sizes. The importance of T_P₆, 2h decreases with field size. This is as expected, since $T_{P_{6}}^{\infty}$ ${T^\infty_{P_6}}$ scales with the inverse survey area, while T_P₆, 2h decreases faster.

Fig. 13.

Ratio of two-halo contribution T_P₆, 2h to full T_P₆ term for various field sizes with T17 cosmology. The blue line corresponds to the KiDS1000-like field size.

Finally, we compare the covariance estimated in the SLICS using the shear three-point correlation function Γ_i to the FFT-based estimate in Fig. 14 show the model prediction for two different survey areas: either the entire survey area of 10 ° ×10° or the survey area of $7 \overset{°}{.} 87 \times 7 \overset{°}{.} 87$ $7{{\overset{\circ}{.}}}87\times 7{{\overset{\circ}{.}}}87$ after boundary removal. As expected, the FFT-based covariance estimate is larger than the Γ_i-based estimate. This is a direct consequence of the boundary removal for the FFT - the effective survey area shrinks.

Fig. 14.

Covariance in SLICS, measured with shear correlation functions Γ_i (black) and FFT (blue), as well as the model covariance for the full survey area of 10 ° ×10° (orange, dashed) and for the effective survey area of $7 \overset{°}{.} 87 \times 7 \overset{°}{.} 87$ $7{{\overset{\circ}{.}}}87\times 7{{\overset{\circ}{.}}}87$ .

The Γ_i-based covariance is itself larger than the full-survey model covariance. This is because, as mentioned before, estimates of $〈 M_{ap}^{3} 〉$ ${\left\langle{{M^3_{\rm ap}}}\right\rangle}$ for the whole survey area A′ require shear information outside of A′. Consequently, unbiased estimates of $〈 M_{ap}^{3} 〉$ ${\left\langle{{M^3_{\rm ap}}}\right\rangle}$ on A′ contain more information than estimates of Γ_i, which are restricted to A′. Therefore, the covariance of the Γ_i-based estimator lies in between the model covariance for the total area A′ and cut-off survey area A. Since the Γ_i-based estimator is required to analyse survey data, the covariance model gives an upper and lower bound to the expected survey constraints.

6. Influence of covariance terms on cosmological parameter estimation

We have shown in the previous section that the analytical covariance estimates agree with estimates from the simulations within 1–2 times the simulations’ statistical uncertainty. However, it is unclear whether this level of agreement is sufficient for a cosmological parameter analysis. To test this, in this section, we perform mock MCMC analyses using $C_{{\hat{M}}_{ap}^{3}}$ $C_{{{\hat{M}_{\text{ ap}}}^3}}$ and $C_{{\hat{M}}_{ap}^{3}}^{sim}$ $C_{{{\hat{M}_{\text{ ap}}}^3}}^{\mathrm{sim}}$ , and compare the resulting parameter constraints. We also test the impact of the individual covariance terms on the cosmological analysis.

6.1. Analysis setup

We perform mock cosmological analyses with the same setup as in Paper I. In this setup, we use the neural network emulator CosmoPower (Spurio Mancini et al. 2022) to quickly evaluate the $〈 M_{ap}^{3} 〉$ ${\left\langle{{M^3_{\rm ap}}}\right\rangle}$ model and sample the parameter likelihood. With the trained emulator, we evaluate the parameter likelihoods for different choices for the covariance: the estimate from the simulations, the full analytic expressions, and various combinations of the individual terms of $C_{{\hat{M}}_{ap}^{3}}$ $C_{{{\hat{M}_{\text{ ap}}}^3}}$ . Since the $〈 M_{ap}^{3} 〉$ ${\left\langle{{M^3_{\rm ap}}}\right\rangle}$ are not sensitive to w₀ and h in a non-tomographic setting, we evaluate the likelihood for fixed w₀ = −1 and h = 0.69. For the T17 validation, we evaluated the $〈 M_{ap}^{3} 〉$ ${\left\langle{{M^3_{\rm ap}}}\right\rangle}$ -model at the aperture scale radii 4′,8′,16′, and 32′, whereas for the SLICS setup, we used the aperture scale radii 4′,8′, and 16′. The reference data vector is measured at the corresponding cosmology of the respective simulation.

6.2. Results

In the left panel of Fig. 15, we show the T17 results for the constraints on Ω_m, S₈, and σ₈, while varying only Ω_m and S₈. In Table 1, we report the resulting marginalised parameter constraints together with the figure of merit (FoM) estimated as

$\begin{matrix} FoM = \frac{1}{\sqrt{\det (C)}}, \end{matrix}$ $\begin{aligned} \mathrm{FoM} = \frac{1}{\sqrt{\mathrm{det} (C)}}\, , \end{aligned}$ (60)

Fig. 15.

Parameter constraints, using either the covariance from the simulations (red) or the analytic model (black). Left are the constraints for a KiDS-1000-like survey, and right are the constraints for the SLICS setup, which uses a stage IV-like n(z) and shape noise, but a small survey area of $7 \overset{°}{.} 87 \times 7 \overset{°}{.} 87$ $7{{\overset{\circ}{.}}}87 \times 7{{\overset{\circ}{.}}}87$ .

Table 1.

Overview of the marginalised MAP values and 68% confidence intervals resulting from MCMC chains where Ω_m, S₈ are varied and $σ_{8} = S_{8} \sqrt{0.3 / Ω_{m}}$ $\sigma_8=S_8\sqrt{0.3/\Omega_{\mathrm{m}}}$ .

where C is the parameter covariance matrix resulting from the MCMC process. The T17 covariances are tailored to a KiDS-like survey area and shape noise. The constraints from the simulated and analytic covariance coincide. The FoM of Ω_m–S₈ differs by less than 3%. Consequently, the analytic covariance is ideal for an unbiased cosmological analysis in our setup. We display in the right panel of Fig. 15 the comparison for the SLICS setup. The posteriors for both the simulated and modelled covariance matrix clearly agree. This supports the robustness of the modelled covariance for varying survey areas, shape noise and redshift distribution.

In Fig. 16, we compare the parameter constraints obtained when using the total analytic covariance estimate, only the Gaussian terms T_PPP, 1 and T_PPP, 2, when neglecting the finite-field terms T_PPP, 2 and T_PT, 2, and when neglecting the term T_P₆, 2h. The results demonstrate the importance of the non-Gaussian covariance terms, as the FoM of Ω_m–S₈ approximately triples if only the Gaussian terms are used (see Table 1). Consequently, for the application to a stage III survey, non-Gaussian covariance terms for $〈 M_{ap}^{3} 〉$ ${\left\langle{{M^3_{\rm ap}}}\right\rangle}$ cannot be neglected.

Fig. 16.

Parameter constraints for a KiDS-1000-like survey, using the full model covariance (black), neglecting the finite field terms (blue), neglecting the 2-halo contribution to T_P₆ (green) or using only the Gaussian covariance (orange). The FoMs of Ω_m − S₈ are given in Table 1.

For the KiDS-like survey area used here, the finite-field terms T_PPP, 2 and T_PT, 2 play a secondary role (see Table 1). The FoM of Ω_m-S₈ increases by 12% if these terms are neglected. Consequently, depending on the desired accuracy of the modelled covariance, these terms could be neglected for a real parameter analysis. Since calculating T_PPP, 2 and T_PT, 2 for a real survey requires evaluating Eq. (22) for the true, most likely complicated, survey geometry, bypassing them could significantly decrease the computational complexity of the covariance calculation. Ignoring T_P₆, 2h, the 2-halo contribution to T_P₆, has a stronger effect. Neglecting T_P₆, 2h increases the FoM of Ω_m − S₈ by 27% and has a significant impact on the inferred constraints on the individual parameters.

7. Discussion

In this work, we derived an analytic model for the covariance of the third-order aperture statistics $〈 M_{ap}^{3} 〉$ ${\left\langle{{M^3_{\rm ap}}}\right\rangle}$ . We make our modelling code publically available³.

We performed three tests for this analytic model. First, we compared the Gaussian part of the covariance to simulated GRFs. We found that all elements of the modelled covariance agree with the GRFs within 30%, with most elements agreeing better than 10% with the simulated data. Second, we compared the full model to two independent N-body simulations, the SLICS and T17 simulations. These comparisons showed that the model agrees within 1–2 times the statistical uncertainty on the simulations with the simulated covariance estimates. Third, we compared the expected constraints from a KiDS-1000-like survey for the cosmological parameters Ω_m and S₈ using the covariance from the T17 simulations and the analytic expression. The constraints were remarkably similar, with only a 3% deviation in the FoM on Ω_m − S₈. These three tests confirmed that our covariance model is sufficiently accurate for analysing a stage III survey.

Additionally, we found that all terms of the ${\hat{M}}_{ap}^{3}$ ${{\hat{M}_{\text{ ap}}}^3}$ covariance can be estimated by measuring aperture mass correlation functions. Using this approach, we compared the individual terms of the model to measurements in the SLICS and confirmed that the model agrees with the simulations.

In the derivation of the covariance model, we found terms that decrease faster than the inverse survey area and vanish entirely under the large-field approximation, which assumes an infinitely broad survey window function. These finite-field terms show a complex dependence on the survey geometry, while the other terms mainly scale with the inverse survey area. The finite field term T_PPP, 2 is already present in the Gaussian case. Comparing the model to GRFs showed that it is the dominating contribution for non-diagonal elements, and neglecting it leads to a severe bias. However, the effect of the finite-field term is small for the realistic covariances, including non-Gaussianity, as they are dominated by the pentaspectrum term T_P₆. Ignoring the finite-field terms leads to an increment of only 12% in the FoM on Ω_m − S₈.

We showed that it is incorrect to derive the covariance for the real-space statistic ${\hat{M}}_{ap}^{3}$ ${{\hat{M}_{\text{ ap}}}^3}$ from a bispectrum covariance. Using an integral over a bispectrum covariance to model the covariance of ${\hat{M}}_{ap}^{3}$ ${{\hat{M}_{\text{ ap}}}^3}$ is only possible under the large-field approximation. However, in this case, one automatically neglects the finite-field terms.

We note that our covariance model assumes the flat-sky approximation. Nevertheless, the favourable comparison to the T17 simulation indicates that this assumption is valid at the relatively small scales considered here. This is not a surprising finding since we consider $〈 M_{ap}^{3} 〉$ ${\left\langle{{M^3_{\rm ap}}}\right\rangle}$ at small, sub-degree scales. At these small scales, the flat-sky approximation introduces no significant bias. Our model should therefore be applicable to actual observations.

Our covariance model is based on an estimator for $〈 M_{ap}^{3} 〉$ ${\left\langle{{M^3_{\rm ap}}}\right\rangle}$ that is easily applicable to simulations but should not be applied to realistic survey data, which include masks, because it requires a continuous convergence (or shear) map. Instead, one must use an estimator based on the third-order shear correlation functions. However, we have shown that our model can quantify the covariance of this estimator. The model covariance for the entire survey area gives a lower bound to the expected covariance, while the model covariance for a smaller effective area after boundary removal gives an upper bound. For a square survey with an area of 1000 deg² and a boundary cut-off of 4 × 16′, the upper bound is around 15% above the lower bound, which gives relatively tight constraints on the true covariance. For even larger stage IV surveys, the bound will be tighter, as the boundary is a smaller fraction of the overall area.

In conclusion, we have presented and validated an analytic model for the covariance of third-order aperture statistics. Together with the analytical model for $〈 M_{ap}^{3} 〉$ ${\left\langle{{M^3_{\rm ap}}}\right\rangle}$ presented in Paper I, our covariance model paves the way for the cosmological analysis of third-order shear statistics of stage III and stage IV weak lensing surveys.

We have deliberately avoided the term super-sample covariance (SSC) to describe parts of $C_{{\hat{M}}_{ap}^{3}}$ $C_{{{\hat{M}_{\text{ ap}}}^3}}$ . This term, originally devised for parts of the covariance of the power spectrum (Takada & Hu 2013), is used to describe covariance parts depending on ℓ-modes larger than a given survey area. The term T_P₆, 2h corresponds to what is commonly called SSC for the bispectrum (Chan et al. 2018; Pyne & Joachimi 2021). However, as we will show in a forthcoming paper, the SSC can be defined as the difference between the exact form of a covariance and its large-field approximation. By this definition, the finite-field terms T_PPP, 2 and T_PT, 2 are also part of the SSC for ${\hat{M}}_{ap}^{3}$ ${{\hat{M}_{\text{ ap}}}^3}$ .

¹

https://github.com/sheydenreich/threepoint/releases

²

http://th.nao.ac.jp/MEMBER/hamanatk/GRayTrix/

³

https://github.com/sheydenreich/threepoint/releases

Acknowledgments

Funded by the TRA Matter (University of Bonn) as part of the Excellence Strategy of the federal and state governments. This work has been supported by the Deutsche Forschungsgemeinschaft through the project SCHN 342/15-1 and DFG SCHN 342/13. PAB and SH acknowledge support from the German Academic Scholarship Foundation. We would like to thank Joachim Harnois-Déraps for making public the SLICS mock data, which can be found at http://slics.roe.ac.uk/. We thank Benjamin Joachimi, Susan Pyne, Lucas Porth and Niek Wielders for many helpful comments and discussions.

References

Abbott, T. M. C., Aguena, M., Alarcon, A., et al. 2022, Phys. Rev. D, 105, 023520 [CrossRef] [Google Scholar]
Bartelmann, M. 2010, CQG, 27, 233001 [NASA ADS] [CrossRef] [Google Scholar]
Bartelmann, M., & Schneider, P. 2001, Phys. Rep., 340, 291 [Google Scholar]
Burger, P. A., Friedrich, O., Harnois-Déraps, J., et al. 2023, A&A, 669, A69 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Chan, K. C., Moradinezhad Dizgah, A., & Noreña, J. 2018, Phys. Rev. D, 97, 043532 [NASA ADS] [CrossRef] [Google Scholar]
Cooray, A., & Sheth, R. 2002, Phys. Rep., 372, 1 [Google Scholar]
Crittenden, R. G., Natarajan, P., Pen, U.-L., & Theuns, T. 2002, ApJ, 568, 20 [NASA ADS] [CrossRef] [Google Scholar]
Friedrich, O., Gruen, D., DeRose, J., et al. 2018, Phys. Rev. D, 98, 023508 [Google Scholar]
Friedrich, O., Andrade-Oliveira, F., Camacho, H., et al. 2021, MNRAS, 508, 3125 [NASA ADS] [CrossRef] [Google Scholar]
Giblin, B., Heymans, C., Asgari, M., et al. 2021, A&A, 645, A105 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Gruen, D., Friedrich, O., Krause, E., et al. 2018, Phys. Rev. D, 98, 023507 [Google Scholar]
Harnois-Déraps, J., & van Waerbeke, L. 2015, MNRAS, 450, 2857 [Google Scholar]
Heydenreich, S., Schneider, P., Hildebrandt, H., et al. 2020, A&A, 634, A104 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Heydenreich, S., Brück, B., & Harnois-Déraps, J. 2021, A&A, 648, A74 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Heydenreich, S., Brück, B., Burger, P., et al. 2022, A&A, 667, A125 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Heydenreich, S., Linke, L., Burger, P., & Schneider, P. 2023, A&A, 672, A44 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Heymans, C., Tröster, T., Asgari, M., et al. 2021, A&A, 646, A140 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Hikage, C., Oguri, M., Hamana, T., et al. 2019, PASJ, 71, 43 [Google Scholar]
Hildebrandt, H., van den Busch, J. L., Wright, A. H., et al. 2021, A&A, 647, A124 [EDP Sciences] [Google Scholar]
Hoekstra, H., & Jain, B. 2008, Ann. Rev. Nucl. Part. Sci., 58, 99 [NASA ADS] [CrossRef] [Google Scholar]
Jarvis, M., Bernstein, G., & Jain, B. 2004, MNRAS, 352, 338 [Google Scholar]
Joachimi, B., Schneider, P., & Eifler, T. 2008, A&A, 477, 43 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Joachimi, B., Shi, X., & Schneider, P. 2009, A&A, 508, 1193 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Joachimi, B., Lin, C. A., Asgari, M., et al. 2021, A&A, 646, A129 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Kacprzak, T., Kirk, D., Friedrich, O., et al. 2016, MNRAS, 463, 3653 [Google Scholar]
Kaiser, N., & Jaffe, A. 1997, ApJ, 484, 545 [NASA ADS] [CrossRef] [Google Scholar]
Kayo, I., Takada, M., & Jain, B. 2013, MNRAS, 429, 344 [Google Scholar]
Kilbinger, M., & Schneider, P. 2005, A&A, 442, 69 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Kuijken, K., Heymans, C., Hildebrandt, H., et al. 2015, MNRAS, 454, 3500 [Google Scholar]
Martinet, N., Harnois-Déraps, J., Jullo, E., & Schneider, P. 2021, A&A, 646, A62 [EDP Sciences] [Google Scholar]
Petri, A. 2016, Astron. Comput., 17, 73 [NASA ADS] [CrossRef] [Google Scholar]
Pyne, S., & Joachimi, B. 2021, MNRAS, 503, 2300 [NASA ADS] [CrossRef] [Google Scholar]
Schneider, P. 1996, MNRAS, 283, 837 [Google Scholar]
Schneider, P., & Lombardi, M. 2003, A&A, 397, 809 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Schneider, P., van Waerbeke, L., Jain, B., & Kruse, G. 1998, MNRAS, 296, 873 [NASA ADS] [CrossRef] [Google Scholar]
Schneider, P., Kilbinger, M., & Lombardi, M. 2005, A&A, 431, 9 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Secco, L. F., Jarvis, M., Jain, B., et al. 2022, Phys. Rev. D, 105, 103537 [NASA ADS] [CrossRef] [Google Scholar]
Seitz, S., & Schneider, P. 1996, A&A, 305, 383 [NASA ADS] [Google Scholar]
Sheth, R. K., & Tormen, G. 1999, MNRAS, 308, 119 [Google Scholar]
Springel, V., Yoshida, N., & White, S. D. M. 2001, Nature, 6, 79 [Google Scholar]
Spurio Mancini, A., Piras, D., Alsing, J., Joachimi, B., & Hobson, M. P. 2022, MNRAS, 511, 1771 [NASA ADS] [CrossRef] [Google Scholar]
Takada, M., & Hu, W. 2013, Phys. Rev. D, 87, 123504 [NASA ADS] [CrossRef] [Google Scholar]
Takada, M., & Jain, B. 2004, MNRAS, 348, 897 [Google Scholar]
Takahashi, R., Sato, M., Nishimichi, T., Taruya, A., & Oguri, M. 2012, ApJ, 761, 152 [Google Scholar]
Takahashi, R., Hamana, T., Shirasaki, M., et al. 2017, ApJ, 850, 24 [Google Scholar]
Takahashi, R., Nishimichi, T., Namikawa, T., et al. 2020, ApJ, 895, 113 [NASA ADS] [CrossRef] [Google Scholar]
Troxel, M. A., & Ishak, M. 2012, MNRAS, 419, 1804 [CrossRef] [Google Scholar]
van Waerbeke, L. 2000, MNRAS, 313, 524 [CrossRef] [Google Scholar]
Zürcher, D., Fluri, J., Sgier, R., Kacprzak, T., & Refregier, A. 2021, J. Cosmol. Astropart. Phys., 2021, 028 [CrossRef] [Google Scholar]

Appendix A: List of all permutations for covariance terms

In Table A.1, we list all permutations of aperture scale radii for the terms T_PPP, 1 to T_PT, 2 from Sect. 3. The first permutation corresponds to the expressions in Sect. 3. The vertical dashes separate groups of scale radii, in which the terms are symmetric. These symmetries can be easily seen in the expressions in terms of the aperture mass correlation functions in Sect. 3.3.

Table A.1.

Permutations of aperture scale radii for covariance terms.

Appendix B: Impact of shape noise on aperture mass covariance

In this section, we show that the impact of shape noise on $C_{{\hat{M}}_{ap}^{3}}$ $C_{{{\hat{M}_{\text{ ap}}}^3}}$ can be estimated by replacing the power spectrum P(ℓ) in the expressions derived in Sect. 3 by P(ℓ)+ $σ_{ϵ}^{2}$ $\sigma_\epsilon^2$ /2n, where $σ_{ϵ}^{2}$ $\sigma_\epsilon^2$ is the two-component intrinsic ellipticity dispersion and n is the galaxy number density. For this, we consider the smoothed convergence K, which is

$\begin{matrix} K (ϑ) = \frac{1}{n} \sum_{i = 1}^{N} F_{σ} (| ϑ - ϑ_{i} |) κ (ϑ_{i}), \end{matrix}$ $\begin{aligned} K(\boldsymbol{\vartheta }) = \frac{1}{n} \sum _{i=1}^N F_\sigma (|\boldsymbol{\vartheta }-\boldsymbol{\vartheta }_i|)\, \kappa (\boldsymbol{\vartheta }_i)\;, \end{aligned}$ (B.1)

where F_σ is a smoothing kernel of width σ, the sum runs over the N galaxies and ϑ_i is the position of the ith galaxy. We assume that the galaxies are distributed uniformly within the survey area A′. The kernel F_σ needs to be normalised, such that ∫_A′d²ϑF_σ(ϑ) = 1. As noted in van Waerbeke (2000), the smoothed convergence field K_N including shape noise can be written as

$\begin{matrix} K_{N} (ϑ) = K (ϑ) + N (ϑ) with N (ϑ) = \frac{1}{n} \sum_{i = 1}^{N} \int \frac{d^{2} ℓ}{{(2 π)}^{2}} F_{σ} (ℓ) e^{- i ℓ \cdot (ϑ - ϑ_{i})} [cos 2 ϕ_{ℓ} ϵ_{1}^{s} (ϑ_{i}) + sin 2 ϕ_{ℓ} ϵ_{2}^{s} (ϑ_{i})], \end{matrix}$ $\begin{aligned} K_{\mathrm{N} }(\boldsymbol{\vartheta }) = K(\boldsymbol{\vartheta }) + \mathcal{N} (\boldsymbol{\vartheta })\quad \mathrm{with}\quad \mathcal{N} (\boldsymbol{\vartheta }) = \frac{1}{n} \sum _{i=1}^N\int \frac{\mathrm{d}^2{\ell }}{(2\pi )^2} {F}_\sigma (\boldsymbol{\ell })\, \mathrm{e} ^{-\mathrm{i} \boldsymbol{\ell }\cdot (\boldsymbol{\vartheta }-\boldsymbol{\vartheta }_i)}\left[ \cos {2\phi _\ell }\, \epsilon _1^\mathrm{s} (\boldsymbol{\vartheta }_i)+\sin {2\phi _\ell }\, \epsilon _2^\mathrm{s} (\boldsymbol{\vartheta }_i)\right]\;, \end{aligned}$ (B.2)

where 𝒩 is the noise contribution, ϕ_ℓ is the polar angle of ℓ and $ϵ_{1 / 2}^{s} (ϑ_{i})$ $\epsilon_{1/2}^\mathrm{s}({\bm{\vartheta}}_i)$ are the components of the intrinsic ellipticity ϵ^s of the galaxy at ϑ_i. In general, K_N has an imaginary component. However, aperture statistics, the quantity we are ultimately interested in, do not depend on the imaginary parts, which is why we concentrate on the real part here.

We now introduce a filter function $U_{θ}^{'}$ $U_\theta^\prime$ , defined such that

$\begin{matrix} \int d^{2} ϑ U_{θ}^{'} (| α - ϑ |) K (ϑ) = \int d^{2} ϑ U_{θ} (| α - ϑ |) κ (ϑ) . \end{matrix}$ $\begin{aligned} \int \mathrm{d}^2{\vartheta } U^{\prime }_\theta (|\boldsymbol{\alpha }-\boldsymbol{\vartheta }|)\, K(\boldsymbol{\vartheta }) = \int \mathrm{d}^2{\vartheta } U_\theta (|\boldsymbol{\alpha }-\boldsymbol{\vartheta }|) \, \kappa (\boldsymbol{\vartheta })\;. \end{aligned}$ (B.3)

This indicates that the aperture filter U is a convolution of U′ and the smoothing kernel F_σ. Therefore, we can write Eq. (15) including shape noise with K_N as

$\begin{matrix} 〈 {\hat{M}}_{ap}^{3} {\hat{M}}_{ap}^{3} 〉 (Θ_{1}, Θ_{2}) & = \frac{1}{A^{2}} \int d^{2} α_{1} \int d^{2} α_{2} W_{A} (α_{1}) W_{A} (α_{2}) [\prod_{i = 1}^{3} \int d^{2} ϑ_{i} U_{θ_{i}}^{'} (| α_{1} - ϑ_{i} |)] [\prod_{j = 4}^{6} \int d^{2} ϑ_{j} U_{θ_{j}}^{'} (| α_{2} - ϑ_{j} |)] \\ \times 〈 K_{N} (ϑ_{1}) K_{N} (ϑ_{2}) K_{N} (ϑ_{3}) K_{N} (ϑ_{4}) K_{N} (ϑ_{5}) K_{N} (ϑ_{6}) 〉 \\ = \frac{1}{A^{2}} \int d^{2} α_{1} \int d^{2} α_{2} W_{A} (α_{1}) W_{A} (α_{2}) [\prod_{i = 1}^{3} \int d^{2} ϑ_{i} U_{θ_{i}}^{'} (| α_{1} - ϑ_{i} |)] [\prod_{j = 4}^{6} \int d^{2} ϑ_{j} U_{θ_{j}}^{'} (| α_{2} - ϑ_{j} |)] \\ \times} 〈 K (ϑ_{1}) K (ϑ_{2}) K (ϑ_{3}) K (ϑ_{4}) K (ϑ_{5}) K (ϑ_{6}) 〉 \\ + [〈 N (ϑ_{1}) N (ϑ_{2}) 〉 〈 K (ϑ_{3}) K (ϑ_{4}) K (ϑ_{5}) K (ϑ_{6}) 〉 + 14 perm .] \\ + [〈 N (ϑ_{1}) N (ϑ_{2}) N (ϑ_{3}) N (ϑ_{4}) 〉 〈 K (ϑ_{5}) K (ϑ_{6}) 〉 + 14 perm .] \\ + 〈 N (ϑ_{1}) N (ϑ_{2}) N (ϑ_{3}) N (ϑ_{4}) N (ϑ_{5}) N (ϑ_{6}) 〉 {, \end{matrix}$ $\begin{aligned} \nonumber \left\langle \hat{M}_{\rm ap}^3\, \hat{M}_{\rm ap}^3\right\rangle (\Theta _1, \Theta _2)&= \frac{1}{A^2}\int \mathrm{d}^2{\alpha _1} \int \mathrm{d}^2{\alpha _2} W_A(\boldsymbol{\alpha }_1)\,W_A(\boldsymbol{\alpha }_2)\,\Bigg [\prod _{i=1}^3 \int \mathrm{d}^2{\vartheta _i} U^{\prime }_{\theta _i}(|\boldsymbol{\alpha }_1-\boldsymbol{\vartheta }_i|)\Bigg ]\,\Bigg [\prod _{j=4}^6 \int \mathrm{d}^2{\vartheta _j} U^{\prime }_{\theta _j}(|\boldsymbol{\alpha }_2-\boldsymbol{\vartheta }_j|)\Bigg ]\\&\quad \times \,\left\langle K_{\rm N}(\boldsymbol{\vartheta }_1)\,K_{\rm N}(\boldsymbol{\vartheta }_2)\,K_{\rm N}(\boldsymbol{\vartheta }_3)\,K_{\rm N}(\boldsymbol{\vartheta }_4)\,K_{\rm N}(\boldsymbol{\vartheta }_5)\,K_{\rm N}(\boldsymbol{\vartheta }_6)\right\rangle \\&=\nonumber \frac{1}{A^2}\int \mathrm{d}^2{\alpha _1} \int \mathrm{d}^2{\alpha _2} W_A(\boldsymbol{\alpha }_1)\,W_A(\boldsymbol{\alpha }_2)\,\Bigg [\prod _{i=1}^3 \int \mathrm{d}^2{\vartheta _i} U^{\prime }_{\theta _i}(|\boldsymbol{\alpha }_1-\boldsymbol{\vartheta }_i|)\Bigg ]\,\Bigg [\prod _{j=4}^6 \int \mathrm{d}^2{\vartheta _j} U^{\prime }_{\theta _j}(|\boldsymbol{\alpha }_2-\boldsymbol{\vartheta }_j|)\Bigg ]\\&\nonumber \quad \times \,\Big \} \left\langle K(\boldsymbol{\vartheta }_1)\,K(\boldsymbol{\vartheta }_2)\,K(\boldsymbol{\vartheta }_3)\,K(\boldsymbol{\vartheta }_4)\,K(\boldsymbol{\vartheta }_5)\,K(\boldsymbol{\vartheta }_6)\right\rangle \\&\nonumber \qquad + \left[\left\langle \mathcal{N} (\boldsymbol{\vartheta }_1)\,\mathcal{N} (\boldsymbol{\vartheta }_2)\right\rangle \left\langle K(\boldsymbol{\vartheta }_3)\,K(\boldsymbol{\vartheta }_4)\,K(\boldsymbol{\vartheta }_5)\,K(\boldsymbol{\vartheta }_6)\right\rangle + \mathrm{14\,perm.}\right]\\&\nonumber \qquad + \left[\left\langle \mathcal{N} (\boldsymbol{\vartheta }_1)\,\mathcal{N} (\boldsymbol{\vartheta }_2)\,\mathcal{N} (\boldsymbol{\vartheta }_3)\,\mathcal{N} (\boldsymbol{\vartheta }_4)\right\rangle \,\left\langle K(\boldsymbol{\vartheta }_5)\,K(\boldsymbol{\vartheta }_6)\right\rangle + \mathrm{14\,perm.}\right]\\&\nonumber \qquad + \left\langle \mathcal{N} (\boldsymbol{\vartheta }_1)\,\mathcal{N} (\boldsymbol{\vartheta }_2)\,\mathcal{N} (\boldsymbol{\vartheta }_3)\,\mathcal{N} (\boldsymbol{\vartheta }_4)\,\mathcal{N} (\boldsymbol{\vartheta }_5)\,\mathcal{N} (\boldsymbol{\vartheta }_6)\right\rangle \Big \{ \;, \end{aligned}$ (B.4)

where we used that all odd moments of the noise vanish and that 𝒩 and K are uncorrelated.

The expectation values are taken by averaging over all galaxy positions inside A′ and taking the ensemble average. For the second-order moment of 𝒩, van Waerbeke (2000) showed that this leads to

$\begin{matrix} 〈 N (ϑ) N (ϑ^{'}) 〉 = \frac{σ_{ϵ}^{2}}{2 n} \int \frac{d^{2} ℓ}{{(2 π)}^{2}} e^{i ℓ \cdot (ϑ - ϑ^{'})} {| F_{σ} (ℓ) |}^{2} . \end{matrix}$ $\begin{aligned} \left\langle \mathcal{N} (\boldsymbol{\vartheta })\,\mathcal{N} (\boldsymbol{\vartheta }^{\prime })\right\rangle = \frac{\sigma ^2_\epsilon }{2n}\int \frac{\mathrm{d}^2{\ell }}{(2\pi )^2}\mathrm{e} ^{\mathrm{i} \boldsymbol{\ell }\cdot (\boldsymbol{\vartheta }-\boldsymbol{\vartheta }^{\prime })}\, |{F}_\sigma (\boldsymbol{\ell })|^2\;. \end{aligned}$ (B.5)

For K we find

$\begin{matrix} 〈 K (θ) K (θ^{'}) 〉 & = [\prod_{g = 1}^{N} \frac{1}{A^{'}} \int d^{2} ϑ_{g}] \sum_{i = 0}^{N} \sum_{j = 0}^{N} F_{σ} (| θ - ϑ_{i} |) F_{σ} (| θ^{'} - ϑ_{j} |) 〈 κ_{i} κ_{j} 〉 \\ = \frac{N (N - 1)}{n^{2} A^{' 2}} \int d^{2} ϑ_{1} \int d^{2} ϑ_{2} F_{σ} (| θ - ϑ_{1} |) F_{σ} (| θ^{'} - ϑ_{2} |) 〈 κ_{1} κ_{2} 〉 \\ + \frac{N}{n^{2} A^{'}} \int d^{2} ϑ F_{σ} (| θ - ϑ |) F_{σ} (| θ^{'} - ϑ |) 〈 κ^{2} (ϑ) 〉 \\ ≃ \int d^{2} ϑ_{1} \int d^{2} ϑ_{2} F_{σ} (| θ - ϑ_{1} |) F_{σ} (| θ^{'} - ϑ_{2} |) 〈 κ_{1} κ_{2} 〉 + \frac{〈 κ^{2} 〉}{n} \int d^{2} ϑ F_{σ} (| θ - ϑ |) F_{σ} (| θ^{'} - ϑ |), \end{matrix}$ $\begin{aligned} \left\langle K(\boldsymbol{\theta })\,K(\boldsymbol{\theta }^{\prime })\right\rangle&= \left[\prod _{g=1}^N \frac{1}{A^{\prime }} \int \mathrm{d}^2{\vartheta _g}\right] \sum _{i=0}^N \sum _{j=0}^N F_\sigma (|\boldsymbol{\theta }-\boldsymbol{\vartheta }_i|)\,F_\sigma (|\boldsymbol{\theta }^{\prime }-\boldsymbol{\vartheta }_j|)\, \left\langle \kappa _i\, \kappa _j\right\rangle \\&\nonumber = \frac{N(N-1)}{n^2\,A^{\prime 2}}\int \mathrm{d}^2{\vartheta _1}\, \int \mathrm{d}^2{\vartheta _2} F_\sigma (|\boldsymbol{\theta }-\boldsymbol{\vartheta }_1|)\,F_\sigma (|\boldsymbol{\theta }^{\prime }-\boldsymbol{\vartheta }_2|) \left\langle \kappa _1\, \kappa _2\right\rangle \\&\nonumber \quad + \frac{N}{n^2\,A^{\prime }}\int \mathrm{d}^2{\vartheta }\, F_\sigma (|\boldsymbol{\theta }-\boldsymbol{\vartheta }|)\,F_\sigma (|\boldsymbol{\theta }^{\prime }-\boldsymbol{\vartheta }|) \left\langle \kappa ^2(\boldsymbol{\vartheta })\right\rangle \\&\nonumber \simeq \int \mathrm{d}^2{\vartheta _1}\, \int \mathrm{d}^2{\vartheta _2} F_\sigma (|\boldsymbol{\theta }-\boldsymbol{\vartheta }_1|)\,F_\sigma (|\boldsymbol{\theta }^{\prime }-\boldsymbol{\vartheta }_2|) \left\langle \kappa _1\, \kappa _2\right\rangle + \frac{\left\langle \kappa ^2\right\rangle }{n}\,\int \mathrm{d}^2{\vartheta }\, F_\sigma (|\boldsymbol{\theta }-\boldsymbol{\vartheta }|)\,F_\sigma (|\boldsymbol{\theta }^{\prime }-\boldsymbol{\vartheta }|)\;, \end{aligned}$ (B.6)

where we assumed that N ≫ 1 and used that ⟨κ²(ϑ)⟩ is the κ-correlation function at vanishing separation and thus independent of ϑ. The second summand describes the noise contribution due to the finite number of galaxies (Schneider et al. 1998). However, since ⟨κ²⟩ is much smaller than $σ_{ϵ}^{2}$ $\sigma_\epsilon^2$ for realistic shape noise, this term is small compared to ⟨𝒩(ϑ) 𝒩(ϑ′)⟩ and we neglect it in the following. With the same approximation,

$\begin{matrix} 〈 K (θ_{1}) K (θ_{2}) K (θ_{3}) K (θ_{4}) 〉 & ≃ [\prod_{i = 1}^{4} \int d^{2} ϑ_{i} F_{σ} (| θ_{i} - ϑ_{i} |)] 〈 κ_{1} κ_{2} κ_{3} κ_{4} 〉 \end{matrix}$ $\begin{aligned} \left\langle K(\boldsymbol{\theta }_1)\,K(\boldsymbol{\theta }_2)\, \,K(\boldsymbol{\theta }_3)\, \,K(\boldsymbol{\theta }_4)\right\rangle&\simeq \left[ \prod _{i=1}^4 \int \mathrm{d}^2{\vartheta _i}\, F_\sigma (|\boldsymbol{\theta }_i-\boldsymbol{\vartheta }_i|)\,\right]\left\langle \kappa _1\, \kappa _2\, \kappa _3\, \kappa _4\right\rangle \end{aligned}$ (B.7)

$\begin{matrix} 〈 K (θ_{1}) K (θ_{2}) K (θ_{3}) K (θ_{4}) K (θ_{5}) K (θ_{6}) 〉 & ≃ \int [\prod_{i = 1}^{6} \int d^{2} ϑ_{i} F_{σ} (| θ_{i} - ϑ_{i} |)] 〈 κ_{1} κ_{2} κ_{3} κ_{4} κ_{5} κ_{6} 〉 . \end{matrix}$ $\begin{aligned} \left\langle K(\boldsymbol{\theta }_1)\,K(\boldsymbol{\theta }_2)\, \,K(\boldsymbol{\theta }_3)\, \,K(\boldsymbol{\theta }_4)\,K(\boldsymbol{\theta }_5)\,K(\boldsymbol{\theta }_6)\right\rangle&\simeq \int \left[ \prod _{i=1}^6 \int \mathrm{d}^2{\vartheta _i}\, F_\sigma (|\boldsymbol{\theta }_i-\boldsymbol{\vartheta }_i|)\,\right]\,\left\langle \kappa _1\, \kappa _2\, \kappa _3\, \kappa _4\,\kappa _5\, \kappa _6\right\rangle \;. \end{aligned}$ (B.8)

We now assume that the width σ of the smoothing kernel is much smaller than the aperture radii θ. In that case, we can approximate the F_σ with Dirac functions, so

$\begin{matrix} U_{θ} (ϑ) = \int d^{2} ϑ^{'} U_{θ}^{'} (| ϑ - ϑ^{'} |) F_{σ} (ϑ^{'}) = \int d^{2} ϑ^{'} U_{θ}^{'} (| ϑ - ϑ^{'} |) δ_{D} (ϑ^{'}) = U_{θ}^{'} (ϑ), \end{matrix}$ $\begin{aligned} U_\theta (\vartheta ) = \int \mathrm{d}^2{\vartheta ^{\prime }} U^{\prime }_\theta (|\boldsymbol{\vartheta }-\boldsymbol{\vartheta }^{\prime }|)\, F_\sigma (\boldsymbol{\vartheta }^{\prime }) = \int \mathrm{d}^2{\vartheta ^{\prime }} U^{\prime }_\theta (|\boldsymbol{\vartheta }-\boldsymbol{\vartheta }^{\prime }|)\, \delta _{\rm D}(\boldsymbol{\vartheta }^{\prime }) = U^{\prime }_\theta (\vartheta )\;, \end{aligned}$ (B.9)

and

$\begin{matrix} 〈 {\hat{M}}_{ap}^{3} {\hat{M}}_{ap}^{3} 〉 (Θ_{1}, Θ_{2}) & = \frac{1}{A^{2}} \int d^{2} α_{1} \int d^{2} α_{2} W_{A} (α_{1}) W_{A} (α_{2}) [\prod_{i = 1}^{3} \int d^{2} ϑ_{i} U_{θ_{i}} (| α_{1} - ϑ_{i} |)] [\prod_{j = 4}^{6} \int d^{2} ϑ_{j} U_{θ_{j}} (| α_{2} - ϑ_{j} |)] \\ \times} 〈 κ_{1} κ_{2} κ_{3} κ_{4} κ_{5} κ_{6} 〉 + [\frac{σ_{ϵ}^{2}}{2 n} δ_{D} (ϑ_{1} - ϑ_{2}) 〈 κ_{3} κ_{4} κ_{5} κ_{6} 〉 + 14 perm .] \\ + [{(\frac{σ_{ϵ}^{2}}{2 n})}^{2} δ_{D} (ϑ_{1} - ϑ_{2}) δ_{D} (ϑ_{3} - ϑ_{4}) 〈 κ_{5} κ_{6} 〉 + 44 perm .] \\ + [{(\frac{σ_{ϵ}^{2}}{2 n})}^{3} δ_{D} (ϑ_{1} - ϑ_{2}) δ_{D} (ϑ_{3} - ϑ_{4}) δ_{D} (ϑ_{5} - ϑ_{6}) + 14 perm .] {, \end{matrix}$ $\begin{aligned} \left\langle \hat{M}_{\rm ap}^3\, \hat{M}_{\rm ap}^3\right\rangle (\Theta _1, \Theta _2)&=\nonumber \frac{1}{A^2}\int \mathrm{d}^2{\alpha _1} \int \mathrm{d}^2{\alpha _2} W_A(\boldsymbol{\alpha }_1)\,W_A(\boldsymbol{\alpha }_2)\,\Bigg [\prod _{i=1}^3 \int \mathrm{d}^2{\vartheta _i} U_{\theta _i}(|\boldsymbol{\alpha }_1-\boldsymbol{\vartheta }_i|)\Bigg ]\,\Bigg [\prod _{j=4}^6 \int \mathrm{d}^2{\vartheta _j} U_{\theta _j}(|\boldsymbol{\alpha }_2-\boldsymbol{\vartheta }_j|)\Bigg ]\\&\quad \times \,\Bigg \} \left\langle \kappa _1\,\kappa _2\,\kappa _3\,\kappa _4\,\kappa _5\,\kappa _6\right\rangle + \left[\frac{\sigma ^2_\epsilon }{2n}\, \delta _{\rm D}(\boldsymbol{\vartheta }_1-\boldsymbol{\vartheta }_2)\,\left\langle \kappa _3\, \kappa _4\,\kappa _5\, \kappa _6\right\rangle + \mathrm{14\,perm.}\right]\\&\nonumber \qquad + \left[ \left(\frac{\sigma ^2_\epsilon }{2n}\right)^2\, \delta _{\rm D}(\boldsymbol{\vartheta }_1-\boldsymbol{\vartheta }_2)\,\delta _{\rm D}(\boldsymbol{\vartheta }_3-\boldsymbol{\vartheta }_4)\,\left\langle \kappa _5\,\kappa _6\right\rangle + \mathrm{44\,perm.}\right]\\&\nonumber \qquad + \left[ \left(\frac{\sigma ^2_\epsilon }{2n}\right)^3\, \delta _{\rm D}(\boldsymbol{\vartheta }_1-\boldsymbol{\vartheta }_2)\,\delta _{\rm D}(\boldsymbol{\vartheta }_3-\boldsymbol{\vartheta }_4)\,\delta _{\rm D}(\boldsymbol{\vartheta }_5-\boldsymbol{\vartheta }_6) + \mathrm{14\,perm.}\right]\Bigg \{ \;, \end{aligned}$ (B.10)

By decomposing the six- and four-point function into its connected components and considering all permutations, one can show that this is equal to the right-hand side of Eq. (16) after replacing ⟨κ_i κ_j⟩ by $〈 κ_{i} κ_{j} 〉 + σ_{ϵ}^{2} δ_{D} (ϑ_{i} - ϑ_{j}) / 2 n$ $\langle{\kappa_i\, \kappa_j}\rangle+\sigma_\epsilon^2\, {\delta_{\rm D}}({\bm{\vartheta}}_i-{\bm{\vartheta}}_j)/2n$ . This implies a power spectrum P′ given by

$\begin{matrix} P^{'} (ℓ) {(2 π)}^{2} δ_{D} (ℓ + ℓ^{'}) & = \int d^{2} ϑ_{1} \int d^{2} ϑ_{2} [〈 κ_{1} κ_{2} 〉 + \frac{σ_{ϵ}^{2}}{2 n} δ_{D} (ϑ_{1} - ϑ_{2})] e^{i (ℓ \cdot ϑ_{1} + ℓ^{'} \cdot ϑ_{2})} \\ = [P (ℓ) + \frac{σ_{ϵ}^{2}}{2 n}] {(2 π)}^{2} δ_{D} (ℓ + ℓ^{'}), \end{matrix}$ $\begin{aligned} P^{\prime }(\ell )\, (2\pi )^2\delta _{\rm D}(\boldsymbol{\ell }+\boldsymbol{\ell }^{\prime })&= \int \mathrm{d}^2{\vartheta _1}\, \int \mathrm{d}^2{\vartheta _2} \left[ \left\langle \kappa _1\, \kappa _2\right\rangle + \frac{\sigma _\epsilon ^2}{2n}\,\delta _{\rm D}(\boldsymbol{\vartheta }_1-\boldsymbol{\vartheta }_2) \right]\, \mathrm{e} ^{\mathrm{i} (\boldsymbol{\ell }\cdot \boldsymbol{\vartheta }_1 + \boldsymbol{\ell }^{\prime }\cdot \boldsymbol{\vartheta }_2)}\\&\nonumber = \left[ P(\ell ) + \frac{\sigma _\epsilon ^2}{2n}\right]\, (2\pi )^2\delta _{\rm D}(\boldsymbol{\ell }+\boldsymbol{\ell }^{\prime })\;, \end{aligned}$ (B.11)

where P is the power spectrum without shape noise. Consequently, replacing P(ℓ) by $P (l) + σ_{ϵ}^{2} / 2 n$ $P(\ell)+\sigma_\epsilon^2/2n$ in the covariance expressions from Sect. 3 gives the correct covariance for $〈 M_{ap}^{3} 〉$ ${\left\langle{{M^3_{\rm ap}}}\right\rangle}$ in the presence of shape noise.

Appendix C: Approximation of T_P₆

We here derive an approximation for T_P₆, which is necessary to reduce the computational complexity of the covariance model. For this, we rewrite Eq. (32) as

$\begin{matrix} T_{P_{6}} (Θ_{1}, Θ_{2}) & = \int \frac{d^{2} ℓ_{1}}{{(2 π)}^{2}} \int \frac{d^{2} ℓ_{2}}{{(2 π)}^{2}} \int \frac{d^{2} ℓ_{3}}{{(2 π)}^{2}} \int \frac{d^{2} ℓ_{4}}{{(2 π)}^{2}} \int \frac{d^{2} s}{{(2 π)}^{2}} P_{6} (ℓ_{1}, ℓ_{2}, s - ℓ_{1} - ℓ_{2}, ℓ_{3}, ℓ_{4}) \\ \times \tilde{u} (ℓ_{1} θ_{1}) \tilde{u} (ℓ_{2} θ_{2}) \tilde{u} (| s - ℓ_{1} - ℓ_{2} | θ_{3}) \tilde{u} (ℓ_{3} θ_{4}) \tilde{u} (ℓ_{4} θ_{5}) \tilde{u} (| s + ℓ_{3} + ℓ_{4} | θ_{6}) G_{A} (s) . \end{matrix}$ $\begin{aligned} T_{P_6}(\Theta _1, \Theta _2)&= \int \frac{\mathrm{d}^2{\ell _1}}{(2\pi )^2}\, \int \frac{\mathrm{d}^2{\ell _2}}{(2\pi )^2} \int \frac{\mathrm{d}^2{\ell _3}}{(2\pi )^2} \int \frac{\mathrm{d}^2{\ell _4}}{(2\pi )^2} \int \frac{\mathrm{d}^2{s}}{(2\pi )^2}\;P_6(\boldsymbol{\ell }_1, \boldsymbol{\ell }_2, \boldsymbol{s}-\boldsymbol{\ell }_1-\boldsymbol{\ell }_2, \boldsymbol{\ell }_3, \boldsymbol{\ell }_4)\,\\&\nonumber \quad \times \tilde{u}(\ell _1\,\theta _1)\,\tilde{u}(\ell _2\,\theta _2)\,\tilde{u}(|\boldsymbol{s}-\boldsymbol{\ell }_1-\boldsymbol{\ell }_2|\,\theta _3)\,\tilde{u}(\ell _3\,\theta _4)\,\tilde{u}(\ell _4\,\theta _5)\,\tilde{u}(|\boldsymbol{s}+\boldsymbol{\ell }_3+\boldsymbol{\ell }_4|\,\theta _6)\, G_A(\boldsymbol{s})\;. \end{aligned}$ (C.1)

According to the Limber approximation (Eq. 5),

$\begin{matrix} P_{6} (ℓ_{1}, ℓ_{2}, s - ℓ_{1} - ℓ_{2}, ℓ_{3}, ℓ_{4}) & = {(\frac{3 H_{0}^{2} Ω_{m}}{2 c^{2}})}^{6} \int_{0}^{\infty} d χ \frac{q^{6} (χ)}{χ^{5} a^{6} (χ)} \\ \times P_{6}^{(3 d)} [ℓ_{1} / χ, ℓ_{2} / χ, (s - ℓ_{1} - ℓ_{2}) / χ, ℓ_{3}, ℓ_{4}, (- s - ℓ_{3} - ℓ_{4}) / χ ; χ] . \end{matrix}$ $\begin{aligned} P_6(\boldsymbol{\ell }_1, \boldsymbol{\ell }_2, \boldsymbol{s}-\boldsymbol{\ell }_1-\boldsymbol{\ell }_2, \boldsymbol{\ell }_3, \boldsymbol{\ell }_4)&=\left(\frac{3H_0^2\Omega _{\rm m}}{2c^2}\right)^6\,\int _0^\infty \mathrm{d}{\chi }\; \frac{q^6(\chi )}{\chi ^{5}\, a^6(\chi )}\, \\&\nonumber \quad \times \mathcal{P} _6^\mathrm{(3d)} \left[\boldsymbol{\ell }_1/\chi , \boldsymbol{\ell }_2/\chi , (\boldsymbol{s}-\boldsymbol{\ell }_1-\boldsymbol{\ell }_2)/\chi , \boldsymbol{\ell }_3, \boldsymbol{\ell }_4, (-\boldsymbol{s}-\boldsymbol{\ell }_3-\boldsymbol{\ell }_4)/\chi ; \chi \right]\;. \end{aligned}$ (C.2)

We now approximate $P_{6}^{(3d)}$ $\mathcal{P}_6^\mathrm{(3d)}$ under the assumption that s ≪ ℓ₁, ℓ₂, ℓ₃, ℓ₄. This approximation is valid if G_A varies on larger scales (smaller s) than P₆. Then, as shown by Chan et al. (2018),

$\begin{matrix} P_{6}^{(3 d)} (k_{1}, k_{2}, q - k_{1} - k_{2}, k_{3}, k_{4}, - q - k_{3} - k_{4} ; χ) = P_{6}^{(3 d)} (k_{1}, k_{2}, - k_{1} - k_{2}, k_{3}, k_{4}, - k_{3} - k_{4} ; χ) + additional terms, \end{matrix}$ $\begin{aligned} \mathcal{P} _6^\mathrm{(3d)} (\boldsymbol{k}_1, \boldsymbol{k}_2, \boldsymbol{q}-\boldsymbol{k}_1-\boldsymbol{k}_2, \boldsymbol{k}_3, \boldsymbol{k}_4, -\boldsymbol{q}-\boldsymbol{k}_3-\boldsymbol{k}_4; \chi ) = \mathcal{P} _6^\mathrm{(3d)} (\boldsymbol{k}_1, \boldsymbol{k}_2, -\boldsymbol{k}_1-\boldsymbol{k}_2, \boldsymbol{k}_3, \boldsymbol{k}_4, -\boldsymbol{k}_3-\boldsymbol{k}_4; \chi ) \mathrm{+ additional terms}\;, \end{aligned}$ (C.3)

where the additional terms can be calculated from the halo model. We approximate the first part of Eq. (C.3) with the 1-halo term, which is $I_{6}^{0} (k_{1}, k_{2}, k_{3}, k_{4}, k_{6}, χ)$ $I_6^0({k}_1, {k}_2, {k}_3, {k}_4, k_6, \chi)$ , with

$\begin{matrix} I_{n}^{i} (k_{1}, \dots, k_{n} ; χ) & = \int d m n [m, z (χ)] b^{i} [m, z (χ)] {(\frac{m}{\bar{ρ}})}^{n} {\tilde{u}}_{NFW} (k_{1}, m) \dots {\tilde{u}}_{NFW} (k_{n}, m), \end{matrix}$ $\begin{aligned} I_n^i(k_1, \dots , k_n; \chi )&= \int \mathrm{d}{m} n[m, z(\chi )]\, b^i[m, z(\chi )]\, \left(\frac{m}{\bar{\rho }}\right)^n\, \tilde{u}_{\rm NFW}(\boldsymbol{k}_1, m)\dots \tilde{u}_{\rm NFW}(\boldsymbol{k}_n, m)\;, \end{aligned}$ (C.4)

with comoving mean density $\bar{ρ}$ $\bar{\rho}$ , halo mass function n(m, z) and halo bias b(m, z). The ${\tilde{u}}_{NFW}$ $\tilde{u}_{\mathrm{NFW}}$ is defined as Fourier-transformation of the normalised NFW-halo profile u_NFW, which is

$\begin{matrix} u_{NFW} (r, m) = \frac{1}{m} ρ (r, m), \end{matrix}$ $\begin{aligned} u_{\rm NFW}(r, m)= \frac{1}{m} \rho (r, m)\;, \end{aligned}$ (C.5)

where ρ is the halo density profile.

Pyne & Joachimi (2021) showed that the dominant additional term in Eq. (C.3) for k > 0.3 h Mpc⁻¹ is given by a part of the 2-halo term of the pentaspectrum, so that

$\begin{matrix} P_{6}^{(3 d)} (k_{1}, k_{2}, q - k_{1} - k_{2}, k_{3}, k_{4}, - q - k_{3} - k_{4} ; χ) & ≃ I_{6}^{0} (k_{1}, k_{2}, | q - k_{1} - k_{2} |, k_{3}, k_{4}, | - q - k_{3} - k_{4} | ; χ) \\ + P_{L} (q) I_{3}^{1} (k_{1}, k_{2}, | q - k_{1} - k_{2} | ; χ) I_{3}^{1} (k_{3}, k_{4}, | - q - k_{3} - k_{4} | ; χ), \end{matrix}$ $\begin{aligned} \mathcal{P} _6^\mathrm{(3d)} (\boldsymbol{k}_1, \boldsymbol{k}_2, \boldsymbol{q}-\boldsymbol{k}_1-\boldsymbol{k}_2, \boldsymbol{k}_3, \boldsymbol{k}_4, -\boldsymbol{q}-\boldsymbol{k}_3-\boldsymbol{k}_4;\chi )&\simeq I_6^0({k}_1, {k}_2, |\boldsymbol{q}-\boldsymbol{k}_1-\boldsymbol{k}_2|, {k}_3, {k}_4, |-\boldsymbol{q}-\boldsymbol{k}_3-\boldsymbol{k}_4|;\chi )\\&\nonumber \quad + P_{\rm L}(q)\, I_3^1(k_1, k_2, |\boldsymbol{q}-\boldsymbol{k}_1-\boldsymbol{k}_2|;\chi )\, I_3^1(k_3, k_4, |-\boldsymbol{q}-\boldsymbol{k}_3-\boldsymbol{k}_4|;\chi )\;, \end{aligned}$ (C.6)

where P_L is the linear matter power spectrum. With this pentaspectrum, the last covariance part becomes

$\begin{matrix} T_{P_{6}} (Θ_{1}, Θ_{2}) & = {(\frac{3 H_{0}^{2} Ω_{m}}{2 c^{2}})}^{6} \int_{0}^{\infty} d χ \frac{q^{6} (χ)}{χ^{4} a^{6} (χ)} \int \frac{d^{2} ℓ_{1}}{{(2 π)}^{2}} \int \frac{d^{2} ℓ_{2}}{{(2 π)}^{2}} \int \frac{d^{2} s}{{(2 π)}^{2}} \int \frac{d^{2} ℓ_{3}}{{(2 π)}^{2}} \int \frac{d^{2} ℓ_{4}}{{(2 π)}^{2}} \\ \times \tilde{u} (ℓ_{1} θ_{1}) \tilde{u} (ℓ_{2} θ_{2}) \tilde{u} (| s - ℓ_{1} - ℓ_{2} | θ_{3}) \tilde{u} (ℓ_{3} θ_{4}) \tilde{u} (ℓ_{4} θ_{5}) \tilde{u} (| s + ℓ_{3} + ℓ_{4} | θ_{6}) G_{A} (s) \\ \times [I_{6}^{0} (ℓ_{1} / χ, ℓ_{2} / χ, | s - ℓ_{1} - ℓ_{2} | / χ, ℓ_{3} / χ, ℓ_{4} / χ, | - s - ℓ_{3} - ℓ_{4} | / χ ; χ) \\ + P_{L} (s / χ) I_{3}^{1} (ℓ_{1} / χ, ℓ_{2} / χ, | s - ℓ_{1} - ℓ_{2} | / χ ; χ) I_{3}^{1} (ℓ_{3} / χ, ℓ_{4} / χ, | - s - ℓ_{3} - ℓ_{4} | / χ ; χ)] \\ ≃ {(\frac{3 H_{0}^{2} Ω_{m}}{2 c^{2}})}^{6} \int_{0}^{\infty} d χ \frac{q^{6} (χ)}{χ^{4} a^{6} (χ)} \int \frac{d^{2} ℓ_{1}}{{(2 π)}^{2}} \int \frac{d^{2} ℓ_{2}}{{(2 π)}^{2}} \int \frac{d^{2} s}{{(2 π)}^{2}} \int \frac{d^{2} ℓ_{3}}{{(2 π)}^{2}} \int \frac{d^{2} ℓ_{4}}{{(2 π)}^{2}} \\ \times \tilde{u} (ℓ_{1} θ_{1}) \tilde{u} (ℓ_{2} θ_{2}) \tilde{u} (| ℓ_{1} + ℓ_{2} | θ_{3}) \tilde{u} (ℓ_{3} θ_{4}) \tilde{u} (ℓ_{4} θ_{5}) \tilde{u} (| ℓ_{3} + ℓ_{4} | θ_{6}) G_{A} (s) \\ \times [I_{6}^{0} (ℓ_{1} / χ, ℓ_{2} / χ, | ℓ_{1} + ℓ_{2} | / χ, ℓ_{3} / χ, ℓ_{4} / χ, | ℓ_{3} + ℓ_{4} | / χ ; χ) \\ + P_{L} (s / χ) I_{3}^{1} (ℓ_{1} / χ, ℓ_{2} / χ, | ℓ_{1} + ℓ_{2} | / χ ; χ) I_{3}^{1} (ℓ_{3} / χ, ℓ_{4} / χ, | ℓ_{3} + ℓ_{4} | / χ ; χ)] \\ = T_{P_{6}}^{\infty} (Θ_{1}, Θ_{2}) + {(\frac{3 H_{0}^{2} Ω_{m}}{2 c^{2}})}^{6} \int_{0}^{\infty} d χ \frac{q^{6} (χ)}{χ^{4} a^{6} (χ)} [\int \frac{d^{2} s}{{(2 π)}^{2}} G_{A} (s) P_{L} (s / χ)] \\ \times \int \frac{d^{2} ℓ_{1}}{{(2 π)}^{2}} \int \frac{d^{2} ℓ_{2}}{{(2 π)}^{2}} \int \frac{d^{2} ℓ_{3}}{{(2 π)}^{2}} \int \frac{d^{2} ℓ_{4}}{{(2 π)}^{2}} I_{3}^{1} (ℓ_{1} / χ, ℓ_{2} / χ, | ℓ_{1} + ℓ_{2} | / χ ; χ) I_{3}^{1} (ℓ_{3} / χ, ℓ_{4} / χ, | ℓ_{3} + ℓ_{4} | / χ ; χ) \\ \times \tilde{u} (ℓ_{1} θ_{1}) \tilde{u} (ℓ_{2} θ_{2}) \tilde{u} (| ℓ_{1} + ℓ_{2} | θ_{3}) \tilde{u} (ℓ_{3} θ_{4}) \tilde{u} (ℓ_{4} θ_{5}) \tilde{u} (| ℓ_{3} + ℓ_{4} | θ_{6}), \end{matrix}$ $\begin{aligned} T_{P_6}(\Theta _1, \Theta _2)&\nonumber = \left(\frac{3H_0^2\Omega _{\rm m}}{2c^2}\right)^6\,\int _0^\infty \mathrm{d}{\chi }\; \frac{q^6(\chi )}{\chi ^{4}\, a^6(\chi )} \int \frac{\mathrm{d}^2{\ell _1}}{(2\pi )^2}\, \int \frac{\mathrm{d}^2{\ell _2}}{(2\pi )^2} \int \frac{\mathrm{d}^2{s}}{(2\pi )^2} \int \frac{\mathrm{d}^2{\ell _3}}{(2\pi )^2} \int \frac{\mathrm{d}^2{\ell _4}}{(2\pi )^2} \,\;\,\\&\nonumber \quad \times \tilde{u}(\ell _1\,\theta _1)\,\tilde{u}(\ell _2\,\theta _2)\,\tilde{u}(|\boldsymbol{s}-\boldsymbol{\ell }_1-\boldsymbol{\ell }_2|\,\theta _3)\,\tilde{u}(\ell _3\,\theta _4)\,\tilde{u}(\ell _4\,\theta _5)\,\tilde{u}(|\boldsymbol{s}+\boldsymbol{\ell }_3+\boldsymbol{\ell }_4|\,\theta _6)\, G_A(\boldsymbol{s})\\&\nonumber \quad \times \Big [ I_6^0(\ell _1/\chi , \ell _2/\chi , |\boldsymbol{s}-\boldsymbol{\ell }_1-\boldsymbol{\ell }_2|/\chi , \ell _3/\chi , \ell _4/\chi , |-\boldsymbol{s}-\boldsymbol{\ell }_3-\boldsymbol{\ell }_4|/\chi ; \chi )\\&\nonumber \qquad + P_{\rm L}(s/\chi )\, I_3^1(\ell _1/\chi , \ell _2/\chi , |\boldsymbol{s}-\boldsymbol{\ell }_1-\boldsymbol{\ell }_2|/\chi ; \chi )\, I_3^1(\ell _3/\chi , \ell _4/\chi , |-\boldsymbol{s}-\boldsymbol{\ell }_3-\boldsymbol{\ell }_4|/\chi ; \chi )\Big ]\\&\simeq \left(\frac{3H_0^2\Omega _{\rm m}}{2c^2}\right)^6\,\int _0^\infty \mathrm{d}{\chi }\; \frac{q^6(\chi )}{\chi ^{4}\, a^6(\chi )} \int \frac{\mathrm{d}^2{\ell _1}}{(2\pi )^2}\, \int \frac{\mathrm{d}^2{\ell _2}}{(2\pi )^2} \int \frac{\mathrm{d}^2{s}}{(2\pi )^2} \int \frac{\mathrm{d}^2{\ell _3}}{(2\pi )^2} \int \frac{\mathrm{d}^2{\ell _4}}{(2\pi )^2} \\&\nonumber \quad \times \tilde{u}(\ell _1\,\theta _1)\,\tilde{u}(\ell _2\,\theta _2)\,\tilde{u}(|\boldsymbol{\ell }_1+\boldsymbol{\ell }_2|\,\theta _3)\,\tilde{u}(\ell _3\,\theta _4)\,\tilde{u}(\ell _4\,\theta _5)\,\tilde{u}(|\boldsymbol{\ell }_3+\boldsymbol{\ell }_4|\,\theta _6)\, G_A(\boldsymbol{s})\\&\nonumber \quad \times \Big [ I_6^0(\ell _1/\chi , \ell _2/\chi , |\boldsymbol{\ell }_1+\boldsymbol{\ell }_2|/\chi , \ell _3/\chi , \ell _4/\chi , |\boldsymbol{\ell }_3+\boldsymbol{\ell }_4|/\chi ; \chi )\\&\nonumber \qquad + P_{\rm L}(s/\chi )\, I_3^1(\ell _1/\chi , \ell _2/\chi , |\boldsymbol{\ell }_1+\boldsymbol{\ell }_2|/\chi ; \chi )\, I_3^1(\ell _3/\chi , \ell _4/\chi , |\boldsymbol{\ell }_3+\boldsymbol{\ell }_4|/\chi ; \chi )\Big ]\\&\nonumber = T_{P_6}^\infty (\Theta _1, \Theta _2) + \left(\frac{3H_0^2\Omega _{\rm m}}{2c^2}\right)^6\,\int _0^\infty \mathrm{d}{\chi }\; \frac{q^6(\chi )}{\chi ^{4}\, a^6(\chi )} \left[\int \frac{\mathrm{d}^2{s}}{(2\pi )^2} G_A(\boldsymbol{s})\, P_{\rm L}(s/\chi ) \right]\\&\nonumber \quad \times \int \frac{\mathrm{d}^2{\ell _1}}{(2\pi )^2}\, \int \frac{\mathrm{d}^2{\ell _2}}{(2\pi )^2} \int \frac{\mathrm{d}^2{\ell _3}}{(2\pi )^2} \int \frac{\mathrm{d}^2{\ell _4}}{(2\pi )^2} I_3^1(\ell _1/\chi , \ell _2/\chi , |\boldsymbol{\ell }_1+\boldsymbol{\ell }_2|/\chi ; \chi )\, I_3^1(\ell _3/\chi , \ell _4/\chi , |\boldsymbol{\ell }_3+\boldsymbol{\ell }_4|/\chi ; \chi )\\&\nonumber \quad \times \tilde{u}(\ell _1\,\theta _1)\,\tilde{u}(\ell _2\,\theta _2)\,\tilde{u}(|\boldsymbol{\ell }_1+\boldsymbol{\ell }_2|\,\theta _3)\,\tilde{u}(\ell _3\,\theta _4)\,\tilde{u}(\ell _4\,\theta _5)\,\tilde{u}(|\boldsymbol{\ell }_3+\boldsymbol{\ell }_4|\,\theta _6)\;, \end{aligned}$ (C.7)

where we used s ≪ ℓ₁, ℓ₂, ℓ₃, ℓ₄. We refer to the last summand in this equation as T_P₆, 2h. Similar to T_PPP, 2 and T_PT, 2, T_P₆, 2h vanishes under the large-field-approximation. This can be seen by replacing G_A(s) with 2πδ_D(s)/A. In this approximation, the integral in square brackets becomes zero, since P_L(0) = 0. Consequently, all of T_P₆, 2h vanishes.

Appendix D: Derivation of Gaussian $〈 M_{ap}^{3} 〉$ $\langle{{M_{\rm ap}^3}\rangle}$ covariance from bispectrum covariance

In this appendix, we try to use the Gaussian bispectrum covariance derived by Joachimi et al. (2009) to obtain $C_{〈 M_{ap}^{3} 〉}$ $C_{{\left\langle{{M^3_{\rm ap}}}\right\rangle}}$ . We will show that this approach only recovers the large-field approximation $T_{P P P, 1}^{\infty}$ ${T^\infty_{PPP, 1}}$ . In this appendix (and only here), we parameterise the bispectrum by the lengths of three ℓ, namely using

$\begin{matrix} 〈 \hat{κ} (ℓ_{1}) \hat{κ} (ℓ_{2}) \hat{κ} (ℓ_{3}) 〉 = {(2 π)}^{2} B (ℓ_{1}, ℓ_{2}, ℓ_{3}) δ_{D} (ℓ_{1} + ℓ_{2} + ℓ_{3}) . \end{matrix}$ $\begin{aligned} \left\langle \hat{\kappa }(\boldsymbol{\ell }_1)\,\hat{\kappa }(\boldsymbol{\ell }_2)\,\hat{\kappa }(\boldsymbol{\ell }_3)\right\rangle = (2\pi )^2\, B(\ell _1, \ell _2, \ell _3)\, \delta _{\rm D}(\boldsymbol{\ell }_1+\boldsymbol{\ell }_2+\boldsymbol{\ell }_3)\;. \end{aligned}$ (D.1)

The bispectrum covariance has been derived for Gaussian fields by Joachimi et al. (2009, see also references therein). It is

$\begin{matrix} C_{B} (ℓ_{1}, ℓ_{2}, ℓ_{3} ; ℓ_{4}, ℓ_{5}, ℓ_{6}) & = \frac{{(2 π)}^{3} Λ^{- 1} (ℓ_{1}, ℓ_{2}, ℓ_{3})}{A A_{R} (ℓ_{1}) A_{R} (ℓ_{2}) A_{R} (ℓ_{3})} [δ_{1, 4} δ_{2, 5} δ_{3, 6} + 5 permutations] P (ℓ_{1}) P (ℓ_{2}) P (ℓ_{3}), \end{matrix}$ $\begin{aligned} \mathrm{C} _{B}(\ell _1, \ell _2, \ell _3; \ell _4, \ell _5, \ell _6)&=\frac{(2\pi )^3 \Lambda ^{-1}(\ell _1, \ell _2, \ell _3)}{A\, \,A_R(\ell _1)\, A_R(\ell _2)\, A_R(\ell _3)}\, \left[\delta _{1,4}\,\delta _{2,5}\,\delta _{3,6} + \mathrm{5\,permutations }\right]\, {P}(\ell _1)\, {P}(\ell _2)\, {P}(\ell _3)\;, \end{aligned}$ (D.2)

where A_R(ℓ) is the size of the bin of ℓ, defined as A_R(ℓ) = 2π ℓ Δℓ, the δ_i, j denote Kronecker-deltas and Λ is defined by

$\begin{matrix} [\prod_{i = 1}^{3} \int \frac{d^{2} ℓ_{i}}{{(2 π)}^{2}}] δ_{D} (ℓ_{1} + ℓ_{2} + ℓ_{3}) = \int_{0}^{\infty} \frac{d ℓ_{1}}{{(2 π)}^{2}} \int_{0}^{\infty} \frac{d ℓ_{2}}{{(2 π)}^{2}} \int_{0}^{\infty} \frac{d ℓ_{3}}{{(2 π)}^{2}} ℓ_{1} ℓ_{2} ℓ_{3} 2 π Λ (ℓ_{1}, ℓ_{2}, ℓ_{3}) . \end{matrix}$ $\begin{aligned} \left[\prod _{i=1}^3\,\int \frac{\mathrm{d}^2{\ell _i}}{(2\pi )^2} \right]\, \delta _{\rm D}(\boldsymbol{\ell }_1+\boldsymbol{\ell }_2+\boldsymbol{\ell }_3) = \int _0^\infty \frac{\mathrm{d}{\ell _1}}{(2\pi )^2} \int _0^\infty \frac{\mathrm{d}{\ell _2}}{(2\pi )^2} \int _0^\infty \frac{\mathrm{d}{\ell _3}}{(2\pi )^2} \; \ell _1\, \ell _2\, \ell _3\, 2\pi \, \Lambda (\ell _1, \ell _2, \ell _3)\;. \end{aligned}$ (D.3)

We note that Eq. (D.2) was derived under the assumption that the Fourier transform $\tilde{κ} (ℓ)$ $\tilde{\kappa}({{\boldsymbol{\ell}}})$ of the convergence field is known. However, this assumption cannot be fulfilled if $\tilde{κ}$ $\tilde{\kappa}$ is derived from the κ on only a finite survey area, as the Fourier transform formally requires κ on all of ℝ². Nevertheless, we try to derive the covariance of ${\hat{M}}_{ap}^{3}$ ${{\hat{M}_{\text{ ap}}}^3}$ from C_B. Using Eq. (9) and Eq. (D.3), the aperture statistics are

$\begin{matrix} 〈 M_{ap}^{3} 〉 (θ_{1}, θ_{2}, θ_{3}) = [\prod_{i = 1}^{3} \int \frac{d ℓ_{i}}{2 π} ℓ_{i} \tilde{u} (ℓ_{i} θ_{i})] B (ℓ_{1}, ℓ_{2}, ℓ_{3}) Λ (ℓ_{1}, ℓ_{2}, ℓ_{3}) . \end{matrix}$ $\begin{aligned} \left\langle M_{\rm ap}^3\right\rangle (\theta _1, \theta _2, \theta _3) = \left[\prod _{i=1}^3\,\int {\frac{\mathrm{d}{\ell _i}}{2\pi }}\, \ell _i\, \tilde{u}(\ell _i\, \theta _i) \right]\, B(\ell _1, \ell _2, \ell _3) \, \Lambda (\ell _1,\ell _2,\ell _3)\;. \end{aligned}$ (D.4)

We go from the continuous integration to a discrete sum in the ℓ_i, so

$\begin{matrix} 〈 M_{ap}^{3} 〉 (θ_{1}, θ_{2}, θ_{3}) = \frac{1}{{(2 π)}^{3}} \sum_{ijk} Δ ℓ_{i} Δ ℓ_{j} Δ ℓ_{k} ℓ_{i} ℓ_{j} ℓ_{k} \tilde{u} (ℓ_{i} θ_{1}) \tilde{u} (ℓ_{j} θ_{2}) \tilde{u} (ℓ_{k} θ_{3}) B (ℓ_{i}, ℓ_{j}, ℓ_{k}) {(2 π)}^{3} Λ (ℓ_{i}, ℓ_{j}, ℓ_{k}) . \end{matrix}$ $\begin{aligned} \left\langle M_{\rm ap}^3\right\rangle (\theta _1, \theta _2, \theta _3) = \frac{1}{(2\pi )^3} \sum _{ijk} \Delta \ell _i\, \Delta \ell _j\, \Delta \ell _k\, \ell _i\, \ell _j\, \ell _k\, \tilde{u}(\ell _i\, \theta _1)\, \tilde{u}(\ell _j\, \theta _2)\, \tilde{u}(\ell _k\, \theta _3)\, B(\ell _i, \ell _j, \ell _k) \, (2\pi )^3 \Lambda (\ell _i,\ell _j,\ell _k)\;. \end{aligned}$ (D.5)

Here, the Δℓ_i are the bin sizes along the ℓ_i. Using Eq. (D.5), we derive $C_{{\hat{M}}_{ap}^{3}}$ $C_{{{\hat{M}_{\text{ ap}}}^3}}$ from C_B with

$\begin{matrix} C_{{\hat{M}}_{ap}^{3}} (Θ_{1}, Θ_{2}) & = \frac{1}{{(2 π)}^{6}} \sum_{ijk} \sum_{lmn} Δ ℓ_{i} Δ ℓ_{j} Δ ℓ_{k} Δ ℓ_{l} Δ ℓ_{m} Δ ℓ_{n} ℓ_{i} ℓ_{j} ℓ_{k} ℓ_{l} ℓ_{m} ℓ_{n} \tilde{u} (ℓ_{i} θ_{1}) \tilde{u} (ℓ_{j} θ_{2}) \tilde{u} (ℓ_{k} θ_{3}) \\ \times \tilde{u} (ℓ_{l} θ_{4}) \tilde{u} (ℓ_{m} θ_{5}) \tilde{u} (ℓ_{n} θ_{6}) C_{B} (ℓ_{i}, ℓ_{j}, ℓ_{k} ; ℓ_{l}, ℓ_{m}, ℓ_{n}) Λ (ℓ_{i}, ℓ_{j}, ℓ_{k}) Λ (ℓ_{l}, ℓ_{m}, ℓ_{n}) . \end{matrix}$ $\begin{aligned} C_{\hat{M}_{\rm ap}^3}(\Theta _1, \Theta _2)&= \frac{1}{(2\pi )^{6}}\sum _{ijk} \sum _{lmn} \Delta \ell _i\, \Delta \ell _j\, \Delta \ell _k\, \Delta \ell _l\, \Delta \ell _m\, \Delta \ell _n\, \ell _i\, \ell _j\, \ell _k\, \ell _l\, \ell _m\, \ell _n\, \tilde{u}(\ell _i\, \theta _1)\, \tilde{u}(\ell _j\, \theta _2)\, \tilde{u}(\ell _k\, \theta _3)\\&\nonumber \quad \times \tilde{u}(\ell _l\, \theta _4)\, \tilde{u}(\ell _m\, \theta _5)\, \tilde{u}(\ell _n\, \theta _6)\, C_B(\ell _i, \ell _j, \ell _k; \ell _l, \ell _m, \ell _n) \, \Lambda (\ell _i,\ell _j,\ell _k)\, \Lambda (\ell _l,\ell _m,\ell _n)\;. \end{aligned}$ (D.6)

So the Gaussian covariance of the aperture statistics is

$\begin{matrix} C_{{\hat{M}}_{ap}^{3}} (Θ_{1}, Θ_{2}) & = \frac{1}{{(2 π)}^{6}} \sum_{ijk} \sum_{lmn} Δ ℓ_{i} Δ ℓ_{j} Δ ℓ_{k} Δ ℓ_{l} Δ ℓ_{m} Δ ℓ_{n} ℓ_{i} ℓ_{j} ℓ_{k} ℓ_{l} ℓ_{m} ℓ_{n} \tilde{u} (ℓ_{i} θ_{1}) \tilde{u} (ℓ_{j} θ_{2}) \tilde{u} (ℓ_{k} θ_{3}) \\ \times \tilde{u} (ℓ_{l} θ_{4}) \tilde{u} (ℓ_{m} θ_{5}) \tilde{u} (ℓ_{n} θ_{6}) \frac{{(2 π)}^{3} Λ^{- 1} (ℓ_{i}, ℓ_{j}, ℓ_{k})}{A Δ ℓ_{i} Δ ℓ_{j} Δ ℓ_{k} ℓ_{i} ℓ_{j} ℓ_{k}} [δ_{i, l} δ_{j, m} δ_{k, n} + 5 Perm .] \\ \times P (ℓ_{i}) P (ℓ_{j}) P (ℓ_{k}) Λ (ℓ_{i}, ℓ_{j}, ℓ_{k}) Λ (ℓ_{l}, ℓ_{m}, ℓ_{n}) \\ & = \frac{1}{{(2 π)}^{3} A} \sum_{ijk} \sum_{lmn} Δ ℓ_{l} Δ ℓ_{m} Δ ℓ_{n} ℓ_{l} ℓ_{m} ℓ_{n} \tilde{u} (ℓ_{i} θ_{1}) \tilde{u} (ℓ_{j} θ_{2}) \tilde{u} (ℓ_{k} θ_{3}) \tilde{u} (ℓ_{l} θ_{4}) \tilde{u} (ℓ_{m} θ_{5}) \tilde{u} (ℓ_{n} θ_{6}) \\ \times [δ_{i, l} δ_{j, m} δ_{k, n} + 5 Perm .] P (ℓ_{i}) P (ℓ_{j}) P (ℓ_{k}) Λ (ℓ_{l}, ℓ_{m}, ℓ_{n}) . \end{matrix}$ $\begin{aligned} C_{\hat{M}_{\rm ap}^3}(\Theta _1, \Theta _2)&= \frac{1}{(2\pi )^{6}} \sum _{ijk} \sum _{lmn} \Delta \ell _i\, \Delta \ell _j\, \Delta \ell _k\, \Delta \ell _l\, \Delta \ell _m\, \Delta \ell _n\, \ell _i\, \ell _j\, \ell _k\, \ell _l\, \ell _m\, \ell _n\, \tilde{u}(\ell _i\, \theta _1)\, \tilde{u}(\ell _j\, \theta _2)\, \tilde{u}(\ell _k\, \theta _3)\\&\nonumber \quad \times \tilde{u}(\ell _l\, \theta _4)\, \tilde{u}(\ell _m\, \theta _5)\, \tilde{u}(\ell _n\, \theta _6)\, \frac{(2\pi )^3 \Lambda ^{-1}(\ell _i, \ell _j, \ell _k)}{A\,\,\Delta \ell _i\, \Delta \ell _j\, \Delta \ell _k\, \ell _i\, \ell _j\, \ell _k}\, \left[\delta _{i,l}\,\delta _{j,m}\,\delta _{k,n} + \mathrm{5\,Perm. }\right]\,\\&\nonumber \quad \times P(\ell _i)\, P(\ell _j)\, P(\ell _k) \, \Lambda (\ell _i,\ell _j,\ell _k)\, \Lambda (\ell _l,\ell _m,\ell _n)\\\&\nonumber = \frac{1 }{(2\pi )^3\,A\, }\, \sum _{ijk} \sum _{lmn} \Delta \ell _l\, \Delta \ell _m\, \Delta \ell _n\,\ell _l\, \ell _m\, \ell _n\, \tilde{u}(\ell _i\,\theta _1)\,\tilde{u}(\ell _j\, \theta _2)\, \tilde{u}(\ell _k\, \theta _3)\, \tilde{u}(\ell _l\, \theta _4)\, \tilde{u}(\ell _m\, \theta _5)\,\tilde{u}(\ell _n\,\theta _6)\\&\nonumber \quad \times \left[\delta _{i,l}\,\delta _{j,m}\,\delta _{k,n} + \mathrm{5\,Perm.}\right]\, P(\ell _i)\, P(\ell _j)\, P(\ell _k) \, \Lambda (\ell _l,\ell _m,\ell _n)\;. \end{aligned}$ (D.7)

We evaluate the sums over the l, m, and n using the Kronecker-Deltas, so

$\begin{matrix} C_{{\hat{M}}_{ap}^{3}} (Θ_{1}, Θ_{2}) & = \frac{1}{{(2 π)}^{3} A} \sum_{ijk} Δ ℓ_{i} Δ ℓ_{j} Δ ℓ_{k} ℓ_{i} ℓ_{j} ℓ_{k} \tilde{u} (ℓ_{i} θ_{1}) \tilde{u} (ℓ_{j} θ_{2}) \tilde{u} (ℓ_{k} θ_{3}) \\ \times [\tilde{u} (ℓ_{l} θ_{4}) \tilde{u} (ℓ_{m} θ_{5}) \tilde{u} (ℓ_{n} θ_{6}) + 5 Perm .] P (ℓ_{i}) P (ℓ_{j}) P (ℓ_{k}) Λ (ℓ_{i}, ℓ_{j}, ℓ_{k}) . \end{matrix}$ $\begin{aligned} C_{\hat{M}_{\rm ap}^3}(\Theta _1, \Theta _2)&= \frac{1}{(2\pi )^3\, A\, }\, \sum _{ijk} \Delta \ell _i\, \Delta \ell _j\, \Delta \ell _k\,\ell _i\, \ell _j\, \ell _k\,\tilde{u}(\ell _i\,\theta _1)\, \tilde{u}(\ell _j\, \theta _2)\, \tilde{u}(\ell _k\, \theta _3)\\&\nonumber \quad \times \left[ \tilde{u}(\ell _l\, \theta _4)\, \tilde{u}(\ell _m\, \theta _5)\, \tilde{u}(\ell _n\,\theta _6) + \mathrm{5\,Perm. }\right]\, P(\ell _i)\, P(\ell _j)\, P(\ell _k) \, \Lambda (\ell _i,\ell _j,\ell _k)\;. \end{aligned}$ (D.8)

Finally, we go from the discrete sum back to continuous integrals (assuming Δℓ_i → 0), so

$\begin{matrix} C_{{\hat{M}}_{ap}^{3}} (Θ_{1}, Θ_{2}) & = \frac{1}{A} [\prod_{i = 1}^{3} \int_{0}^{\infty} \frac{d ℓ_{i}}{{(2 π)}^{2}} ℓ_{i} \tilde{u} (ℓ_{i} θ_{i})] P (ℓ_{1}) P (ℓ_{2}) P (ℓ_{3}) {(2 π)}^{3} Λ (ℓ_{1}, ℓ_{2}, ℓ_{3}) [\tilde{u} (ℓ_{1} θ_{4}) \tilde{u} (ℓ_{2} θ_{5}) \tilde{u} (ℓ_{3} θ_{6}) + 5 Perm .] . \end{matrix}$ $\begin{aligned} C_{\hat{M}_{\rm ap}^3}(\Theta _1, \Theta _2)&= \frac{1}{A}\left[\prod _{i=1}^3\,\int _0^\infty \frac{\mathrm{d}{\ell _i}}{(2\pi )^2}\,\ell _i\, \tilde{u}(\ell _i\,\theta _i) \right]\, P(\ell _1)\, P(\ell _2)\, P(\ell _3) \, (2\pi )^3\, \Lambda (\ell _1,\ell _2,\ell _3)\,\left[ \tilde{u}(\ell _1\, \theta _4)\, \tilde{u}(\ell _2\, \theta _5)\, \tilde{u}(\ell _3\, \theta _6) + \mathrm{5\,Perm. }\right]\,\;. \end{aligned}$ (D.9)

This can also be written as

$\begin{matrix} C_{{\hat{M}}_{ap}^{3}} (Θ_{1}, Θ_{2}) & = \frac{1}{A} [\prod_{i = 1}^{3} \int \frac{d^{2} ℓ_{i}}{{(2 π)}^{2}} \tilde{u} (ℓ_{i} θ_{i})] P (ℓ_{1}) P (ℓ_{2}) P (ℓ_{3}) {(2 π)}^{2} δ_{D} (ℓ_{1} + ℓ_{2} + ℓ_{3}) [\tilde{u} (ℓ_{1} θ_{4}) \tilde{u} (ℓ_{2} θ_{5}) \tilde{u} (ℓ_{3} θ_{6}) + 5 Perm .] \\ = \frac{1}{A} \int \frac{d^{2} ℓ_{1}}{{(2 π)}^{2}} \int \frac{d^{2} ℓ_{2}}{{(2 π)}^{2}} \tilde{u} (ℓ_{1} θ_{1}) \tilde{u} (ℓ_{2} θ_{2}) \tilde{u} (| ℓ_{1} + ℓ_{2} | θ_{3}) P (ℓ_{1}) P (ℓ_{2}) P (| ℓ_{1} + ℓ_{2} |) \\ \times [\tilde{u} (ℓ_{1} θ_{4}) \tilde{u} (ℓ_{2} θ_{5}) \tilde{u} (| ℓ_{1} + ℓ_{2} | θ_{6}) + 5 Perm .] . \end{matrix}$ $\begin{aligned} C_{\hat{M}_{\rm ap}^3}(\Theta _1, \Theta _2)&= \frac{1}{A}\left[\prod _{i=1}^3\,\int \frac{\mathrm{d}^2{\ell _i}}{(2\pi )^2}\, \tilde{u}(\ell _i\,\theta _i) \right]\, P(\ell _1)\, P(\ell _2)\, P(\ell _3) \,(2\pi )^2 \delta _{\rm D}(\boldsymbol{\ell }_1+\boldsymbol{\ell }_2+\boldsymbol{\ell }_3)\, \left[ \tilde{u}(\ell _1\, \theta _4)\, \tilde{u}(\ell _2\, \theta _5)\, \tilde{u}(\ell _3\,\theta _6) + \mathrm {5\,Perm.} \right]\,\\&\nonumber = \frac{1 }{A}\, \int \frac{\mathrm{d}^2{\ell _1}}{(2\pi )^2} \int \frac{\mathrm{d}^2{\ell _2}}{(2\pi )^2} \tilde{u}(\ell _1\,\theta _1)\,\tilde{u}(\ell _2\, \theta _2)\, \tilde{u}(|\boldsymbol{\ell }_1+\boldsymbol{\ell }_2|\, \theta _3)\,P(\ell _1)\, P(\ell _2)\, P(|\boldsymbol{\ell }_1+\boldsymbol{\ell }_2|)\\&\nonumber \quad \times \left[ \tilde{u}(\ell _1\, \theta _4)\, \tilde{u}(\ell _2\, \theta _5)\, \tilde{u}(|\boldsymbol{\ell }_1+\boldsymbol{\ell }_2|\, \theta _6) + \mathrm{5\,Perm. }\right]\, \;. \end{aligned}$ (D.10)

By comparing Eq. (D.10) with Eq. (44), we see that this expression corresponds to $T_{P P P, 1}^{\infty}$ ${T^\infty_{PPP, 1}}$ , which is the Gaussian covariance of ${\hat{M}}_{ap}^{3}$ ${{\hat{M}_{\text{ ap}}}^3}$ in the limiting case of large survey areas A. The term T_PPP, 2 is not recovered from the bispectrum covariance. This is a direct consequence of the assumption that $\tilde{κ} (ℓ)$ $\tilde{\kappa}({{\boldsymbol{\ell}}})$ can be reconstructed from the κ on the finite survey window A in the derivation of Eq. (D.2). This assumption is equivalent to assuming a window function W_A, which is one on the whole ℝ², leading to the large-field approximation for G_A in Eq. (43). This approximation directly reduces T_PPP, 1 to $T_{P P P, 1}^{\infty}$ ${T^\infty_{PPP, 1}}$ and T_PPP, 2 to zero.

All Tables

Table 1.

Overview of the marginalised MAP values and 68% confidence intervals resulting from MCMC chains where Ω_m, S₈ are varied and $σ_{8} = S_{8} \sqrt{0.3 / Ω_{m}}$ $\sigma_8=S_8\sqrt{0.3/\Omega_{\mathrm{m}}}$ .

In the text

Table A.1.

Permutations of aperture scale radii for covariance terms.

In the text

All Figures

Fig. 1.

Illustration of aperture mass estimation. The area A′ is the size of the full convergence field, on which we place apertures with scale radius θ, illustrated by the circles, to obtain M_ap(α, θ). Apertures centred on positions α within the smaller area A lie completely within A′, while apertures centred on positions α′ outside of A extend outside of A′, so M_ap(α′,θ) is biased.

In the text

Fig. 2.

Schematic representation of the calculation of the covariance $C_{{\hat{M}}_{ap}^{3}}$ $C_{{{\hat{M}_{\text{ ap}}}^3}}$ . The covariance is given by the difference between $〈 {\hat{M}}_{ap}^{3} {\hat{M}}_{ap}^{3} 〉$ ${\left\langle{{\hat{M}_{\text{ ap}}}^3}\,{{\hat{M}_{\text{ ap}}}^3}\right\rangle}$ and $〈 {\hat{M}}_{ap}^{3} 〉 〈 {\hat{M}}_{ap}^{3} 〉$ ${\left\langle{{\hat{M}_{\text{ ap}}}^3}\right\rangle}\,{\left\langle{{\hat{M}_{\text{ ap}}}^3}\right\rangle}$ , the first of which can be decomposed (indicated by solid arrows) into one Gaussian (denoted by G) and three non-Gaussian parts (denoted by BB, PT, and P₆). We discuss the Gaussian part in Sect. 3.1 and the non-Gaussian parts in Sect. 3.2. These parts can be further decomposed into terms depending on different permutations of the aperture scale radii, called T_PPP, 1 to T_P₆. For large survey areas, the T_i can be approximated, indicated by dashed arrows, as shown in Sect. 3.4. Under this approximation, two terms vanish, which is why we term them ‘finite-field terms’. Also shown are equation numbers for the relevant expressions.

In the text

	Fig. 3. Redshift distribution constructed from the T17 simulation given the fiducial n(z) of the KiDS-1000 data.
In the text

Fig. 4.

$C_{{\hat{M}}_{ap}^{3}}$ $C_{{{\hat{M}_{\text{ ap}}}^3}}$ for the GRF, each row is showing a different field size. In the left column are the measured $C_{{\hat{M}}_{ap}^{3}}$ $C_{{{\hat{M}_{\text{ ap}}}^3}}$ from the simulated GRF. The middle column shows the model prediction, including the finite-field term T_PPP, 2. The covariances under the large-field approximation, for which T_PPP, 2 vanishes, are in the right column.

In the text

Fig. 5.

Fractional differences of the model covariance to the measured $C_{{\hat{M}}_{ap}^{3}}$ $C_{{{\hat{M}_{\text{ ap}}}^3}}$ for the GRF, each row showing a different field size. In the left column are the fractional differences between the full analytic model and the simulated GRF. The middle column shows the difference between the large-field approximation, neglecting the finite field term, and the simulated covariance. In the right column, the difference of the model using the large-field approximation for T_PPP, 1, but keeping T_PPP, 2 as it is, is shown.

In the text

Fig. 6.

Fractional differences between covariance model and estimate from GRFs with $ϑ_{\max} = 5 \overset{°}{.} 78$ $\vartheta_{\mathrm{max}}= 5{{\overset{\circ}{.}}}78$ for different choices of the boundaries of the ℓ-integrals. Left: Fiducial calculation of model, where P(ℓ) is used for all ℓ. Middle: ℓ-modes larger than ℓ_max are cut-off. This removes modes not present in the simulation due to the finite pixel size. Right: ℓ-modes smaller than ℓ_min are cut-off. This removes modes not present in the simulation due to the finite field size.

In the text

	Fig. 7. Fractional differences of T_PPP, 1 with and without large-field approximation for different field sizes.
In the text

Fig. 8.

Fractional difference of model covariance, rescaled from a field size of $5 \overset{°}{.} 73^{2}$ $5{{\overset{\circ}{.}}}73^2$ to $15 \overset{°}{.} 73^{2}$ $15{{\overset{\circ}{.}}}73^2$ to the covariance from a simulated GRF of size $15 \overset{°}{.} 73^{2}$ $15{{\overset{\circ}{.}}}73^2$ . The rescaling is performed under the assumption that the covariance scales inversely proportional to the survey area, which is not true for the finite field term T_PPP, 2.

In the text

	Fig. 9. Comparison of measured and modelled covariance for the SLICS (top) and T17 simulations (bottom). Left are the measured covariance, middle are the model predictions; right are the differences between model and simulation, normalised by the simulation bootstrap error.
In the text

Fig. 10.

Diagonal of $C_{{\hat{M}}_{ap}^{3}}$ $C_{{{\hat{M}_{\text{ ap}}}^3}}$ and the individual covariance terms for the SLICS (left) and the T17 (right). The measurement from the simulation is shown as a red dashed line, while the full modelled $C_{{\hat{M}}_{ap}^{3}}$ $C_{{{\hat{M}_{\text{ ap}}}^3}}$ is the bold black line. The other lines show individual terms. Error bars on the simulated covariances originate from bootstrapping.

In the text

	Fig. 11. Model predictions for the individual covariance terms for the SLICS-like setup.
In the text

	Fig. 12. Comparison of individual covariance terms modelled (solid) and measured (dashed) in the SLICS without shape noise and with all sources at redshift 1.
In the text

	Fig. 13. Ratio of two-halo contribution T_P₆, 2h to full T_P₆ term for various field sizes with T17 cosmology. The blue line corresponds to the KiDS1000-like field size.
In the text

	Fig. 14. Covariance in SLICS, measured with shear correlation functions Γ_i (black) and FFT (blue), as well as the model covariance for the full survey area of 10 ° ×10° (orange, dashed) and for the effective survey area of $7 \overset{°}{.} 87 \times 7 \overset{°}{.} 87$ $7{{\overset{\circ}{.}}}87\times 7{{\overset{\circ}{.}}}87$ .
In the text

Fig. 15.

Parameter constraints, using either the covariance from the simulations (red) or the analytic model (black). Left are the constraints for a KiDS-1000-like survey, and right are the constraints for the SLICS setup, which uses a stage IV-like n(z) and shape noise, but a small survey area of $7 \overset{°}{.} 87 \times 7 \overset{°}{.} 87$ $7{{\overset{\circ}{.}}}87 \times 7{{\overset{\circ}{.}}}87$ .

In the text

	Fig. 16. Parameter constraints for a KiDS-1000-like survey, using the full model covariance (black), neglecting the finite field terms (blue), neglecting the 2-halo contribution to T_P₆ (green) or using only the Gaussian covariance (orange). The FoMs of Ω_m − S₈ are given in Table 1.
In the text

Current usage metrics show cumulative count of Article Views (full-text article views including HTML views, PDF and ePub downloads, according to the available data) and Abstracts Views on Vision4Press platform.

Data correspond to usage on the plateform after 2015. The current usage metrics is available 48-96 hours after online publication and is updated daily on week days.

Initial download of the metrics may take a while.

[1] Abbott, T. M. C., Aguena, M., Alarcon, A., et al. 2022, Phys. Rev. D, 105, 023520 [CrossRef] [Google Scholar]

[2] Bartelmann, M. 2010, CQG, 27, 233001 [NASA ADS] [CrossRef] [Google Scholar]

[3] Bartelmann, M., & Schneider, P. 2001, Phys. Rep., 340, 291 [Google Scholar]

[4] Burger, P. A., Friedrich, O., Harnois-Déraps, J., et al. 2023, A&A, 669, A69 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[5] Chan, K. C., Moradinezhad Dizgah, A., & Noreña, J. 2018, Phys. Rev. D, 97, 043532 [NASA ADS] [CrossRef] [Google Scholar]

[6] Cooray, A., & Sheth, R. 2002, Phys. Rep., 372, 1 [Google Scholar]

[7] Crittenden, R. G., Natarajan, P., Pen, U.-L., & Theuns, T. 2002, ApJ, 568, 20 [NASA ADS] [CrossRef] [Google Scholar]

[8] Friedrich, O., Gruen, D., DeRose, J., et al. 2018, Phys. Rev. D, 98, 023508 [Google Scholar]

[9] Friedrich, O., Andrade-Oliveira, F., Camacho, H., et al. 2021, MNRAS, 508, 3125 [NASA ADS] [CrossRef] [Google Scholar]

[10] Giblin, B., Heymans, C., Asgari, M., et al. 2021, A&A, 645, A105 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[11] Gruen, D., Friedrich, O., Krause, E., et al. 2018, Phys. Rev. D, 98, 023507 [Google Scholar]

[12] Harnois-Déraps, J., & van Waerbeke, L. 2015, MNRAS, 450, 2857 [Google Scholar]

[13] Heydenreich, S., Schneider, P., Hildebrandt, H., et al. 2020, A&A, 634, A104 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[14] Heydenreich, S., Brück, B., & Harnois-Déraps, J. 2021, A&A, 648, A74 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[15] Heydenreich, S., Brück, B., Burger, P., et al. 2022, A&A, 667, A125 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[16] Heydenreich, S., Linke, L., Burger, P., & Schneider, P. 2023, A&A, 672, A44 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[17] Heymans, C., Tröster, T., Asgari, M., et al. 2021, A&A, 646, A140 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[18] Hikage, C., Oguri, M., Hamana, T., et al. 2019, PASJ, 71, 43 [Google Scholar]

[19] Hildebrandt, H., van den Busch, J. L., Wright, A. H., et al. 2021, A&A, 647, A124 [EDP Sciences] [Google Scholar]

[20] Hoekstra, H., & Jain, B. 2008, Ann. Rev. Nucl. Part. Sci., 58, 99 [NASA ADS] [CrossRef] [Google Scholar]

[21] Jarvis, M., Bernstein, G., & Jain, B. 2004, MNRAS, 352, 338 [Google Scholar]

[22] Joachimi, B., Schneider, P., & Eifler, T. 2008, A&A, 477, 43 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[23] Joachimi, B., Shi, X., & Schneider, P. 2009, A&A, 508, 1193 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[24] Joachimi, B., Lin, C. A., Asgari, M., et al. 2021, A&A, 646, A129 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[25] Kacprzak, T., Kirk, D., Friedrich, O., et al. 2016, MNRAS, 463, 3653 [Google Scholar]

[26] Kaiser, N., & Jaffe, A. 1997, ApJ, 484, 545 [NASA ADS] [CrossRef] [Google Scholar]

[27] Kayo, I., Takada, M., & Jain, B. 2013, MNRAS, 429, 344 [Google Scholar]

[28] Kilbinger, M., & Schneider, P. 2005, A&A, 442, 69 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[29] Kuijken, K., Heymans, C., Hildebrandt, H., et al. 2015, MNRAS, 454, 3500 [Google Scholar]

[30] Martinet, N., Harnois-Déraps, J., Jullo, E., & Schneider, P. 2021, A&A, 646, A62 [EDP Sciences] [Google Scholar]

[31] Petri, A. 2016, Astron. Comput., 17, 73 [NASA ADS] [CrossRef] [Google Scholar]

[32] Pyne, S., & Joachimi, B. 2021, MNRAS, 503, 2300 [NASA ADS] [CrossRef] [Google Scholar]

[33] Schneider, P. 1996, MNRAS, 283, 837 [Google Scholar]

[34] Schneider, P., & Lombardi, M. 2003, A&A, 397, 809 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[35] Schneider, P., van Waerbeke, L., Jain, B., & Kruse, G. 1998, MNRAS, 296, 873 [NASA ADS] [CrossRef] [Google Scholar]

[36] Schneider, P., Kilbinger, M., & Lombardi, M. 2005, A&A, 431, 9 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[37] Secco, L. F., Jarvis, M., Jain, B., et al. 2022, Phys. Rev. D, 105, 103537 [NASA ADS] [CrossRef] [Google Scholar]

[38] Seitz, S., & Schneider, P. 1996, A&A, 305, 383 [NASA ADS] [Google Scholar]

[39] Sheth, R. K., & Tormen, G. 1999, MNRAS, 308, 119 [Google Scholar]

[40] Springel, V., Yoshida, N., & White, S. D. M. 2001, Nature, 6, 79 [Google Scholar]

[41] Spurio Mancini, A., Piras, D., Alsing, J., Joachimi, B., & Hobson, M. P. 2022, MNRAS, 511, 1771 [NASA ADS] [CrossRef] [Google Scholar]

[42] Takada, M., & Hu, W. 2013, Phys. Rev. D, 87, 123504 [NASA ADS] [CrossRef] [Google Scholar]

[43] Takada, M., & Jain, B. 2004, MNRAS, 348, 897 [Google Scholar]

[44] Takahashi, R., Sato, M., Nishimichi, T., Taruya, A., & Oguri, M. 2012, ApJ, 761, 152 [Google Scholar]

[45] Takahashi, R., Hamana, T., Shirasaki, M., et al. 2017, ApJ, 850, 24 [Google Scholar]

[46] Takahashi, R., Nishimichi, T., Namikawa, T., et al. 2020, ApJ, 895, 113 [NASA ADS] [CrossRef] [Google Scholar]