Issue 
A&A
Volume 661, May 2022



Article Number  A137  
Number of page(s)  20  
Section  Cosmology (including clusters of galaxies)  
DOI  https://doi.org/10.1051/00046361/202141628  
Published online  19 May 2022 
A revised density split statistic model for general filters
^{1}
ArgelanderInstitut für Astronomie, Auf dem Hügel 71, 53121 Bonn, Germany
email: pburger@astro.unibonn.de
^{2}
Kavli Institute for Cosmology, University of Cambridge, Cambridge CB3 0HA, UK
^{3}
Churchill College, University of Cambridge, Cambridge CB3 0DS, UK
^{4}
School of Mathematics, Statistics and Physics, Newcastle University, Newcastle Upon Tyne NE1 7RU, UK
^{5}
Astrophysics Research Institute, Liverpool John Moores University, 146 Brownlow Hill, Liverpool L3 5RF, UK
Received:
24
June
2021
Accepted:
2
March
2022
Context. Studying the statistical properties of the largescale structure in the Universe with weak gravitational lensing is a prime goal of several current and forthcoming galaxy surveys. The power that weak lensing has to constrain cosmological parameters can be enhanced by considering statistics beyond secondorder shear correlation functions or power spectra. One such higherorder probe that has proven successful in observational data is density split statistics (DSS), in which one analyses the mean shear profiles around points that are classified according to their foreground galaxy density.
Aims. In this paper, we generalise the most accurate DSS model to allow for a broad class of angular filter functions used for the classification of the different local density regions. This approach is motivated by earlier findings showing that an optimised filter can provide tighter constraints on model parameters compared to the standard tophat case.
Methods. As in the previous DSS model we built on large deviation theory approaches and approximations thereof to model the matter density probability distribution function, and on perturbative calculations of higherorder moments of the density field. The novel addition relies on the generalisation of these previously employed calculations to allow for general filter functions and is validated on several sets of numerical simulations.
Results. It is shown that the revised model fits the simulation measurements well for many filter choices, with a residual systematic offset that is small compared to the statistical accuracy of current weak lensing surveys. However, by use of a simple calibration method and a Markov chain Monte Carlo analysis, we studied the expected sensitivity of the DSS to cosmological parameters and find unbiased results and constraints comparable to the commonly used twopoint cosmic shear measures. Hence, our DSS model can be used in competitive analyses of current cosmic shear data, while it may need refinements for forthcoming lensing surveys.
Key words: gravitational lensing: weak / largescale structure of Universe / methods: statistical / galaxies: abundances / surveys
© ESO 2018
1. Introduction
Studying the matter distribution of the present largescale structure reveals a wealth of information about the evolution of the Universe. In particular, its distorting effect on the propagation of light from distant galaxies, known as cosmic shear, can be captured by analysing weak lensing surveys. By comparing the results of cosmological models with the observed signal, one can constrain cosmological parameters (see e.g. Asgari et al. 2021; Abbott et al. 2022; Hamana et al. 2020).
The preferred methods used to infer statistical properties of the matter and galaxy distribution concentrate on secondorder statistics, such as the twopoint correlation functions or their Fourier counterparts, the power spectra. Although these statistics have an impressive accuracy to describe for instance primordial perturbations visible in the cosmic microwave background (CMB; e.g. Planck Collaboration V 2020) they probe only the Gaussian information present in the density fluctuations. However, these initial conditions developed significant nonGaussian features by means of nonlinear gravitational instability, which can only be investigated with higherorder statistics. Although they are typically more time consuming to model and measure, these higherorder statistics scale differently with cosmological parameters, and are not affected in the same way by residual systematics. Hence, by jointly investigating second and higherorder statistics, the constraining power on cosmological parameters increases (see e.g. Bergé et al. 2010; Pyne & Joachimi 2021; Pires et al. 2012; Fu et al. 2014; Kilbinger & Schneider 2005).
A large number of analytical models for the twopoint statistics exists in the literature (Takahashi et al. 2012; Heitmann et al. 2014; Euclid Collaboration 2021; Mead et al. 2020; Nishimichi et al. 2021); however, the analysis of higherorder statistics is usually based on simulations. Analytical models for higherorder lensing statistics are rare, although they are important not only for scientists to understand physical processes, but also to crosscheck simulations, which are usually only tested against Gaussian statistics. For example, Reimberg & Bernardeau (2018) and Barthelemy et al. (2021) used large deviation theory (LDT) to compute the reducedshear correction to the aperture mass probability distribution function (PDF); Munshi et al. (2020) and Halder et al. (2021) analytically modelled the integrated shear threepoint function; the lensing peak count function was modelled in Fan et al. (2010), Lin & Kilbinger (2015) and Shan et al. (2018), while the lensing PDF is modelled in Boyle et al. (2021).
The examples mentioned above all pertain to the analysis of cosmic shear data. However, it has been established in recent analyses that the addition of foreground clustering data, and their crosscorrelation with the background source galaxies, yield significantly better constraints (Abbott et al. 2018; Heymans et al. 2021). While the central analyses focused again on twopoint statistics, Friedrich et al. (2018, hereafter F18) developed a competitive model based on density split statistics (hereafter DSS). The idea is to measure the mean tangential shear around small subareas of the survey, and to stack the signal according to the foreground galaxy density in these subareas. We expect the tangential shear to be larger around points with a high density of foreground galaxies, given they correspond to a large matter overdensity on average. The model derived in F18 is based on nonperturbative calculations of the matter density PDF, which predicts the shear profiles and the probability density of galaxy counts in the subareas, for a given cosmological model, a redshift distribution for the source and lens galaxy populations, and a mean galaxy density. In Gruen et al. (2018, hereafter G18), the F18 model is used to constrain cosmological parameters from DSS measurements from the Dark Energy Survey (DES) First Year and Sloan Digital Sky Survey (SDSS) data, which yields constraints on the matter density that agree and are competitive with the DES analysis of galaxy and shear twopoint functions (see Abbott et al. 2018).
One of the motivations of this work is based on Burger et al. (2020, hereafter B20), who use a suite of numerical simulations to show that using matched filter functions for searching peaks and troughs in the galaxy and matter density contrast has clear advantages compared to the tophat filter used in the F18 model, both in terms of the overall signal S/N and in recovering accurately the galaxy bias term. Another motivation of using compensated filters is that these filters are more confined in Fourier space and are therefore better at smoothing out large ℓmodes where baryonic effects play an important role (Asgari et al. 2020). Therefore, it is of interest to generalise the DSS to a broader set of filter functions. Smoothing cosmic density fields with filters other than tophat ones significantly complicates the LDTlike calculations used by F18 and G18 (cf. Barthelemy et al. 2021) because for tophat filters the Lagrangian to Eulerian mapping inherent in LDT is particularly simple. However, we find here that density split statistics with nontophat filters that are sufficiently concentrated around their centres can still be accurately modelled with computationally feasible extensions of approximations made by F18. This paper describes our modifications to the F18 model that will allow us to optimise filtering strategies when applying density split statistics to Stage III weak lensing surveys such as KiDS. Throughout this paper, if not otherwise stated, we assume a spatially flat universe.
This work is structured as follows. In Sect. 2 we review the basics of the aperture statistics; we then detail our changes to the F18 model in Sect. 3. In Sect. 4 we describe the simulations, and the construction of our mock data used to validate the revised model. In Sect. 5 we compare the model predictions with simulations, and establish the model’s limitations. We summarise our work in Sect. 6.
2. Aperture statistics
The lensing convergence κ and shear γ are related via the lensing potential ψ (Schneider et al. 1992) as
with and θ the angular position on the sky; we employ the flatsky approximation. Given a reference point in a Cartesian coordinate system on the sky and a second point whose separation to the first is oriented at an angle ϕ with respect to that coordinate system, we can express the shear at the second point in terms of the tangential and crossshear with respect to the first point as
where the factor 2 in the exponent is due to the polar nature of the shear. Given a convergence field κ(θ), the aperture mass at position θ is defined as
where U(ϑ) is a compensated axisymmetric filter function, such that ∫ϑ U(ϑ) dϑ = 0. As shown in Schneider (1996), if U is compensated, M_{ap} can also be expressed in terms of the tangential shear γ_{t} and a related filter function Q as
where
which can be inverted, yielding
In analogy to M_{ap}, we define, as done in B20, the aperture number counts (Schneider 1998), or aperture number, as
where U(ϑ) is the same filter function as in Eq. (3) and n(ϑ) is the (foreground) galaxy number density on the sky. This definition of the aperture number is equivalent to the “CountsinCell” (CiC) from Gruen et al. (2016) if a tophat filter of the form
is used, where ℋ is the Heaviside step function and A is the area of the filter. However, B20 demonstrated that tophat filters are not optimal, and that a better performance is achieved by an adapted filter in terms of signaltonoiseratio (S/N) and in recovering accurately the galaxy bias term. In this paper we compute aperture mass statistics with Eq. (4) using simulated weak lensing catalogues of background source galaxies, notably regarding positions and ellipticities, and aperture number statistics with Eq. (7) from the position of simulated foreground lens galaxies (see Sect. 4).
3. Revised model
In this section we describe our modifications of the original F18 model. Although the derivations shown here are selfcontained, we recommend the interested reader to consult the original F18 paper. In particular, it is shown there that the full nonperturbative calculation of the PDF within large deviation theory (LDT) can be well approximated with a lognormal PDF that matches variance and skewness of the LDT result. This allowed F18 and G18 to replace the expensive LDT computation with a faster one, hence making the calculation of full Markov chain Monte Carlo (MCMC) functions feasible. The reason why this approximation works well is that, for tophat filters, the scaling of variance and higherorder cumulants in LDT is similar to that found in lognormal distributions. This cannot be expected a priori for other filter functions. However, through comparison with Nbody simulations we find here (cf. Sect. 5) that either a simple lognormal or a combination of two lognormal distributions still accurately describes the density PDFs required to analyse density split statistics with more general classes of filters. The following section describes these calculations. In order to reduce the mathematical calculations in this section, some derivations are detailed in Appendix A.
We start by defining the lineofsight projection of the 3D matter density contrast δ_{m,3D}, weighted by a foreground (lens) galaxy redshift probability distribution n_{f}(z) as
where χ is the comoving distance and the projection kernel q_{f}(χ) is
This 2D matter density contrast can then be used together with a linear bias term to represent a tracer density contrast (see Sect. 3.3 or Sect. 4). Following F18, the next step consists of smoothing the results with a filter U of size Θ:
This simplifies in the case of a tophat filter of size Θ to
Similar to the 2D density contrast, the convergence, which is needed to describe the DSS signal, is given by
where W_{s}(χ) is the lensing efficiency defined as
with being the lineofsight probability density of the sources, Ω_{m} the matter density parameter, H_{0} the Hubble parameter, and c the speed of light. The mean convergence inside an angular separation ϑ, κ_{< ϑ}, follows then in analogy to Eq. (12) by substituting δ_{m,2D}(θ) with κ(θ).
The aim of our model is to predict the tangential shear profiles γ_{t} given a quantile 𝒬 of the foreground aperture number N_{ap}, ⟨γ_{t}𝒬⟩, where for instance the highest quantile is the set of lines of sight of the sky that have the highest values of N_{ap}. Therefore, to determine ⟨γ_{t}𝒬⟩ the model calculates ⟨γ_{t}N_{ap}⟩ and sums up all that belong to the corresponding quantile 𝒬. The expectation value of ⟨γ_{t}N_{ap}⟩ is computed from the convergence profile as
where κ_{ϑ} is the azimuthally averaged convergence at angular separation ϑ from the centre of the filter, and κ_{< ϑ} is the average convergence inside that radius. The latter quantity, conditioned on a given N_{ap}, can be specified by
where in the second step we assumed that the expected convergence within ϑ only depends on the projected matter density contrast δ_{m, U} and not on the particular realisation of shotnoise in N_{ap} found within that fixed matter density contrast^{1}.
By use of Bayes’ theorem, we can express the conditional PDF as
where p(N_{ap}δ_{m, U}) is the probability of finding N_{ap} given the smoothed density contrast δ_{m, U}. The normalisation in the denominator of Eq. (18) follows by integrating over the numerator,
As seen in the derivation above, we are left with three ingredients in order to calculate the tangential shear profiles given a quantile Q of the aperture number ⟨γ_{t}(ϑ)N_{ap}⟩:

the PDF of the matter density contrast smoothed with the filter function U (used in Eqs. (18), (19))

the expectation value of the convergence inside a radius ϑ given the smoothed density contrast (used in Eq. (17))

the distribution of N_{ap} for the given filter function U given the smoothed density contrast (used in Eqs. (18), (19))
Since all three ingredients are sensitive to the filter U, we need to adjust all of them coherently with respect to the tophat case.
3.1. (I):p(δ_{m},U)
As shown by F18 the full LDT computation of the matter density PDF can accurately approximated by a shifted lognormal distribution with vanishing mean (Hilbert et al. 2011), which is fully characterised by two parameters, σ and δ_{0}, as
The two free parameters can be determined by specifying the variance and skewness of the PDF as (Hilbert et al. 2011)
we derive the expression of and in Appendix A (see Eq. (A.28)).
As we show later, this approximation works well for nonnegative filter functions like tophat or Gaussian filters. However, the lognormal PDF approximation becomes less accurate for compensated filters that include negative weights. In these cases we instead divide U into U_{>} (ϑ) = U(ϑ)ℋ(ϑ_{ts} − ϑ) and U_{<} (ϑ) = − U(ϑ)ℋ(ϑ − ϑ_{ts}), where ϑ_{ts} is the transition scale from positive to negative filter weights. As a consequence, we obtain two correlated lognormal random variables, δ_{m, U>} and δ_{m, U<}, whose joint distribution can be represented by a bivariate lognormal distribution as
where we defined
and similarly for Δ_{<}. The correlation coefficient ρ is determined by
and in order to calculate the difference of two independent random variables δ_{m, U} = δ_{m, U>} − δ_{m, U<} we can use the convolution theorem (Arfken & Weber 2008) to get
3.2. (II):⟨κ_{< ϑ}δ_{m, U}⟩
In order to calculate the expectation value of the mean convergence inside an angular radius ϑ, κ_{< ϑ}, given the matter density contrast δ_{m, U}, we assume that both follow a joint lognormal distribution (see e.g. the discussion in Appendix B of G18). In this case, the expectation value can be written as
where δ_{0} is determined with Eq. (25) and the three variables C, V, and κ_{0} can be calculated from the moments , , and , which follow from the derivation in Appendix A (see Eq. (A.29)):
We note that the assumption that δ_{m, U} is lognormal distributed is not well justified for filters with negative weights as we mentioned in the previous section. A possible improvement could be done for instance by assuming again that δ_{m, U} is made up of two lognormal random variables, and we would need to calculate conditional moments like ⟨κ_{< ϑ}δ_{m, U>}−δ_{m, U<}⟩. This would significantly increase the amount of joint moments needed in our calculation and would render fast modelling unfeasible. However, an improved modelling is also unnecessary at present, given the statistical uncertainties we expect for Stage III weak lensing surveys such as KiDS1000. We demonstrate this empirically in Sect. 5 by comparison to Nbody simulated data, but we also want to give a brief theoretical motivation. The average value of κ_{< ϑ}, given that δ_{m, U} lies within the range [δ_{min}, δ_{max}], is given by
If κ_{< ϑ} and δ_{m, U} were joint Gaussian random variables, then p(δ_{m, U}) would be a Gaussian PDF and we would have . We now argue that the leadingorder correction to this Gaussian approximation consists of replacing p(δ_{m, U}) by our full nonGaussian model, without changing ⟨κ_{< ϑ}δ_{m, U}⟩, since this would be exactly correct in the limit of strong correlation between the two variables. Our lognormal approximation to ⟨κ_{< ϑ}δ_{m, U}⟩ is then already a nexttoleadingorder correction and a bivariate lognormal approximation for ⟨κ_{< ϑ}δ_{m, U}⟩ would be of even higher order. While this reasoning is admittedly only heuristic, it is proven correct by the accuracy of our model predictions for the lensing signals in Sect. 5.
3.3. (III):p(N_{ap}δ_{m, U})
The third basic ingredient is the PDF of N_{ap} given the projected matter density contrast smoothed with the filter U. Assuming a Poisson distribution for N_{ap}, which is the most straightforward ansatz, is unfortunately not possible because negative values are expected with a compensated filter (i.e. in some of the U_{<} contributions). We use instead a completely new approach compared to F18, and derive an expression for p(N_{ap}δ_{m, U}) by use of the characteristic function (Papoulis & Pillai 1991, hereafter CF), which is an alternative representation of a probability distribution, similar to the moment generating functions, but based on the Fourier transform of the PDF. Of interest to us, the n th derivative of the CFs can be used to calculate the n th moment of the PDF. The CF corresponding to p(N_{ap}δ_{m, U}) is defined as
where in our particular case, we derive in Appendix A.4 a closed expression as
with n_{0} being the mean number density of foreground galaxies on the sky. The assumption of linear galaxy bias enters here by the term b ⟨w_{ϑ}δ_{m, U}⟩, with
Hence, n_{0}(1+b ⟨w_{ϑ}δ_{m, U}⟩) is the effective number density at ϑ given δ_{m, U}. The conditional expectation value ⟨w_{ϑ}δ_{m, U}⟩ is given in analogy to Eq. (30), but replacing in Eqs. (31)–(33) for k = 1, 2 and using that
where the joint moments are also derived in Eq. (A.30). Next, we reexpress Eq. (36) as the product of two terms,
where
and R_{max} is the angular radius beyond which U vanishes. We note that G18 and F18 found superPoisson shotnoise in their work. They interpret these deviations from Poisson noise as having a number ≠1 of galaxies per Poisson halo. This would suggest that we could incorporate nonPoissonian behaviour by replacing n_{0} with an effective density of Poisson halos and making this a free parameter of our model. However, more recent investigations (e.g. Friedrich et al., in prep.) cast doubt on the simplified interpretation of F18 and G18. A proper investigation of the problem of nonPoissonian shotnoise is beyond the scope of this work, and we will address it in future investigations.
Finally, the probability density function p(N_{ap}δ_{m, U}) follows from the inverse Fourier transform of the CF
where the second step follows from the fact that the imaginary part cancels out.
In Appendix A.4 we discuss a similar approach, where we assume that p(N_{ap}δ_{m, U}) is lognormal distributed. In that case, to specify the PDF, only the first three moments are needed, which follow from derivatives of the CF. As shown in Appendix A.4 both methods yield almost identical results, and since the lognormal approach is significantly faster, we use it hereafter, unless otherwise stated.
To summarise, the major changes compared to the F18 model are the following:

To determine p(δ_{m, U}) we

To determine p(N_{ap}δ_{m, U}) we

To determine ⟨κ_{< ϑ}δ_{m, U}⟩ we

updated the calculations of and to general filter functions (see Appendix A).

4. Simulation data
Before using our revised model in data analyses, it is mandatory to quantify its precision and range of validity. We use for this validation exercise three simulations suites:

the fullsky gravitational lensing simulations described in Takahashi et al. (2017, hereafter T17), with which we carry out a detailed investigation of the model in a simple survey configuration;

the cosmoSLICS simulations, described in HarnoisDéraps et al. (2019), with which we validate our model on a independent simulation suite;

the SLICS simulations, described in HarnoisDéraps et al. (2018), with which we construct a KiDS1000 like covariance matrix.
4.1. T17 simulations
The T17 simulations are constructed from a series of nested cubic boxes with side lengths of L, 2L, 3L, … placed around a fixed vertex representing the observer’s position, with L = 450 Mpc h^{−1}. Each box is replicated eight times and placed around the observer using periodic boundary conditions. The number of particles per box is fixed to 2048^{3}, which results in higher mass and spatial resolutions at lower redshifts. Within each box three spherical lens shells are constructed, each with a width of 150 Mpc h^{−1}, which are then used by the public code GRAYTRIX^{2} to trace the lightray trajectories from the observer to the last scattering surface^{3}. With the Nbody code GADGET2 (Springel et al. 2001) the gravitational evolution of dark matter particles without baryonic processes are followed from the initial conditions, which in turn are determined by use of secondorder Lagrangian perturbation theory. The initial linear power spectrum followed from the Code for Anisotropies in the Microwave Background (CAMB; Lewis et al. 2000) with Ω_{m} = 1 − Ω_{Λ} = 0.279, Ω_{b} = 0.046, h = 0.7, σ_{8} = 0.82, and n_{s} = 0.97. The matter power spectrum agrees with theoretical predictions of the revised Halofit (Takahashi et al. 2012) within 5%(10%) for k < 5(6) h Mpc^{−1} at z < 1. In order to account for the finite shell thickness and angular resolution, T17 provide correction formulae, which we repeat in Appendix B. Although various resolution options are available, for our purpose the realisations with a resolution of NSIDE = 4096 are sufficient.
We use the publicly available matter density contrast maps to create a realistic lens galaxy catalogue that mimics the second and third redshift bins of the luminous red galaxies sample constructed from the KiDS1000 data (Vakili et al. 2019), as shown by the solid lines in Fig. 1. The reason to mock the LRG sample is that the galaxy bias for this kind of galaxies can be roughly described with a constant linear bias, which is needed for the analytical model. We excluded the lowestredshift lens bin, first because of its low galaxy number density (n_{0} = 0.012 gal arcmin^{−2}) in which the shotnoise level is significant, and second because the density field is more nonlinear, and hence we expect the lognormal approximation to break down. Since there is a significant overlap between the KiDS1000 sources and the lenses in the fourth LRG redshift bin, we reject it as well. To create our lens galaxy samples we first project the T17 3D density maps δ_{m,3D} following the n(z) shown as the step functions in Fig. 1 to get two δ_{m,2D} maps. For both maps we then distribute galaxies following a Poisson distribution with parameter λ = n(1+b δ_{m,2D}), where b is a constant linear galaxy bias and n is chosen such that the galaxy number density is n_{0} = 0.028 gal arcmin^{−2} for the second bin (hereafter the lowredshift bin ) and n_{0} = 0.046 gal arcmin^{−2} for the third lens bin (hereafter the highredshift bin ). Since our method requires a constant linear galaxy bias, we specify a bias of 1.72 for lens bin two and 1.74 for lens bin three, similar to those reported in Vakili et al. (2019). F18 found this linear bias assumption to be accurate enough for year 1 data of the Dark Energy Survey, which is similar in constraining power to our target KiDS data (but we note that an investigation of higherorder biasing is underway in Friedrich et al., in prep.).
Fig. 1. Lens galaxy redshift distribution constructed from the T17 simulation given the true n(z) of the second and third LRG bin (Vakili et al. 2019). The black dashed line shows the redshift of the source galaxies. 
Fig. 2. Redshift distributions of the second and third LRG (lens) bins and the last two KiDS1000 (source) bins of the SLICS simulations. The n(z) are scaled such that a comparison is possible. 
Fig. 3. Different filters U used in this work to verify the new model. For all filters we scaled the first bin value to 1/arcmin^{−2} for comparison. The corresponding Qfilters are shown in Fig. B.1. The wide Mexican filter extends up to 150′. 
In our validation test, we use a shear grid at a single source plane located at z = 0.8664, indicated by the black dashed line in Fig. 1. F18 showed that the model works for realistic redshift distributions, and this choice simplifies the generation of our source catalogues. Furthermore, in order to determine a realistic covariance matrix, we transform the shear field into an observed ellipticity field by adding shape noise to the shear grid as
where ϵ^{obs}, ϵ^{s}, and γ are complex numbers, and the asterisk ( * ) indicates complex conjugation. The source ellipticities ϵ^{s} per pixel are generated by drawing random numbers from a Gaussian distribution with width
where A_{pix} is the pixel area of the shear grid, and the effective number density n_{gal} and σ_{ϵ} are chosen such that they are consistent with the KiDS data. While this transformation is valid in terms of the reduced shear g = γ/(1 − κ), we use throughout this paper the approximation γ ≈ g, as the typical values for the convergence are small, κ≪1. We neglect the intrinsic alignment of galaxies in this work.
4.2. Extracting the model components from the T17 simulations
In order to validate the different components of our model, we need to extract p(δ_{m, U}), p(N_{ap}), and ⟨γ_{t}𝒬⟩ from the simulation. The first two follow directly by smoothing the maps of the projected density contrast and the lens galaxy with the corresponding filters. This smoothing can be performed in two different ways. The first is to use the HEALPY function QUERY_DISC, which finds all pixel centres that are located within a given radius, whereas the second approach uses the HEALPY function SMOOTHING, with a given beam window function created by the function BEAM2BL. The two approaches result in PDFs that differ slightly, since the QUERY_DISC does not reproduce an exact tophat, while the SMOOTHING approach is only over a finite ℓrange. Nevertheless, we found that both approaches are consistent for NSIDE = 4096 well within the uncertainty we estimate from 48 subpatches (see discussion below), and hence we use the second approach which is significantly faster.
The tangential shear information ⟨γ_{t}𝒬⟩ is measured for each quantile Q by the software TREECORR (Jarvis et al. 2004) in 15 logspaced bins with angular separation Θ/20 < ϑ < Θ, where Θ is the size of the filter being used. For the tophat filter we measured the shear profiles from 6′< ϑ < 120′, corresponding to a filter with a size of 120′. We note here that for all measured shear profiles the shear around random points is always subtracted, which ensures that the shear averaged over all quantiles for one realisation vanishes by definition.
In order to have an uncertainty for all three model quantities, we divide the fullsky map into 48 subpatches, such that each patch has a size of approximately 859.4 deg^{2}. For p(δ_{m, U}) and p(N_{ap}) we determined for each subpatch one distribution, such that we were able to calculate a standard deviation from 48 values for each bin in the PDF. For the covariance matrix we use 10 out of the 108 realisations and divide each fullsky map in 48 subpatches, which then results in a covariance matrix measured from 480 fields. Furthermore, both for the covariance and for the error bars in the plotted shear profiles we use Eq. (43) to create noisy shear profiles for each subpatch, which are then rescaled to the effective KiDS1000 area (see Giblin et al. 2021).
4.3. CosmoSLICS
We use the cosmoSLICS simulations described in HarnoisDéraps et al. (2019) to determine the validity regime of our revised model for different cosmologies. These are a suite of weak lensing simulations sampling 26 points (listed in Table B.1) in a broad cold dark matter (CDM) parameter space, distributed in a Latin hypercube to minimise interpolation errors. Specifically, the matter density Ω_{m}, the dimensionless Hubble parameter h, the normalisation of the matter power spectrum σ_{8}, and the timeindependent equationofstate parameter of dark energy w_{0} are varied over a range that is large enough to complement the analysis of current weak lensing data (see e.g. HarnoisDéraps et al. 2021). Each simulation follows 1536^{3} particles inside a cube of comoving side length L_{box} = 505 h^{−1} Mpc and n_{c} = 3072 grid cells on the side, starting with initial conditions produced with the Zel’dovich approximation. Moreover, the cosmoSLICS evolve a pair of simulations at each node, designed to suppress the sampling variance (see HarnoisDéraps et al. 2019, for more details). Each cosmological model is raytraced multiple times to produce 50 pseudoindependent light cones of size 100 deg^{2}.
For each realisation, we create KiDS1000like sources and KiDSLRGlike lens catalogues, following the pipeline described in HarnoisDéraps et al. (2018); notably, we reproduce exactly the source galaxy number density and n(z) that is used in Asgari et al. (2021), who report a total number density n_{gal} = 6.93 arcmin^{−2} and a redshift distribution estimated from selforganising maps (see Wright et al. 2020). These mock galaxies are then placed at random angular coordinates on 100 deg^{2} light cones. In contrast to the T17 mocks, we test our model with two source redshift bins, corresponding to the KiDS1000 fourth and fifth tomographic bins (hereafter and ). The source galaxies are assigned a shear signal γ from a series of lensing maps, following the linear interpolation algorithm described in Sect. 2 in HarnoisDéraps et al. (2018). For our lens sample we opted to include the second and third tomographic bin of the LRG galaxies described in Vakili et al. (2019) ( and ). Compared to the T17 values, the n(z) of the cosmoSLICS LRG mocks have a coarser redshift resolution of the simulations. Moreover, the n(z) vary slightly for different underlying cosmologies, due to variations in the relation between comoving distance and redshift. Following Vakili et al. (2019), we generate our LRG catalogues assuming a constant linear galaxy bias of 1.72 and 1.74, with a galaxy number density of n_{0} = 0.028 gal arcmin^{−2} and n_{0} = 0.046 gal arcmin^{−2}.
4.4. SLICS
In total the SLICS^{4} are a set of over 800 fully independent realisations similar to the fiducial ΛCDM cosmoSLICS model. The underlying cosmological parameters for each run are the same, fixed to Ω_{m} = 0.2905, Ω_{Λ} = 0.7095, Ω_{b} = 0.0473, h = 0.6898, σ_{8} = 0.826 and n_{s} = 0.969 (see Hinshaw et al. 2013). For Fourier modes k < 2.0 h Mpc^{−1}, the SLICS and cosmoSLICS threedimensional dark matter power spectrum P(k) agrees within 2% with the predictions from the Extended Cosmic Emulator (see Heitmann et al. 2014), followed by a progressive deviation for higher kmodes (HarnoisDéraps et al. 2018). We use the SLICS to estimate a reliable covariance matrix, which, combined with the cosmoSLICS, allows us to test our model for a simulation that is independent of T17. Similar to the T17 simulations, the signal of the SLICS is combined with the randomly oriented intrinsic shapes ϵ^{s} to create ellipticities, whereas ϵ^{s} is drawn from a Gaussian distribution with width σ_{ϵ} directly since the shear information are given here per galaxy. We added an additional layer of realism and used a redshiftdependent shape noise that better reproduces the data properties. Specifically, we used σ_{ϵ} = 0.25 and 0.27 for the source bins, as reported in Giblin et al. (2021).
4.5. Extracting the SLICS and cosmoSLICS data vector
The extraction of the data vector for the SLICS and cosmoSLICS analyses is similar to the T17 case, where shape noise was not included for the cosmoSLICS data vector to better capture the cosmological signal. Another slight difference is that the light cones are now square, which accentuates the edge effects when the aperture filter overlaps with the lightcone boundaries. In principle, it is possible to weight the outer rims for each N_{ap} map, so that the whole map can be used. Although this would increase our statistical power, it could also introduce a systematic offset. We opted instead to exclude the outer rim for each realisation resulting in an effective area of (10 − 2Θ)^{2} deg^{2} with Θ the size of the corresponding filter. This procedure also ensures that roughly the same number of background galaxies are used to calculate the shear profile around each pixel.
5. Testing the revised model
We used the simulations described in Sect. 4 to test our revised model and its accuracy in predicting shear profiles. Following the results of F18 we chose a tophat filter of 20′ as our starting point and we considered a number of more general filters with a similar angular extent, shown in Fig. 3. Our motivation for studying these filters is as follows: We use a Gaussian filter to test whether the model performs well for nonconstant but positive filters; the “adapted” filter is the filter that results from B20; the “Mexican” filter removes the local minimum at ϑ ∼ 40′; the “broad Mexican” has a larger width; finally, the “wide Mexican” suppresses the negative tail. In order to lower the amplitude of the negative part while keeping a similar width, we adjusted the upper bound of the wideMexican filter to conserve the compensation to 150′, which makes it better suited to large contiguous survey areas.
Before comparing our model to the simulations, we note that we are using here the revised model even for the tophat filter, for which we could instead use the F18 model directly. Notably, the derivations of and are identical in the revised model, and we show in the following plots for the tophat filters that both models yield almost identical results in predicting the shear profiles with a tophat filter. Therefore, from here on, we only show results from the revised model. In the following three sections, we validate the key model ingredients introduced in Sects. 3.1–3.3.
5.1. Validating p(δ_{m, U})
We show in Fig. 4 the PDF of the smoothed twodimensional density contrast for all six filters, and for the two lens bins. We see by inspecting the different panels that the predictions agree with the simulations for the two lens bins within 1 σ cosmic variance expected for KiDS1000. We note here that this PDF cannot be measured in real data, and that the real test for the accuracy of our model are the shear signals, with larger uncertainties due to shape noise. Nevertheless, for the tophat and the Gaussian we have an agreement between model and simulation well within the 1 σ, which indicates that the lognormal approximation for theses filters is good. The other filters show stronger deviations when using a lognormal approximation, but these are weaker when the negative part of the filter approaches zero (wide Mexican) or when the width of the filter increases (broad Mexican), although the negative part of the broad Mexican is stronger than for the Mexican filter. This indicates that probing on larger scales either with a broader or wider filter the lognormal approximation is more accurate. Furthermore, when using the bivariate lognormal approach discussed in Sect. 3.1, the residuals are even more suppressed, and thus we cannot recognise differences in the match between predicted and measured PDF for all compensated filters. Although the model for the compensated filters is not as good as for the nonnegative filters (tophat and Gaussian), the revised model remains consistent throughout with the T17 simulations.
Fig. 4. PDF of δ_{m, U} smoothed with the filters shown in Fig. 3. The orange shaded region is the standard deviation of 48 subpatches scaled by a 777.4/859.4, where 777.4 deg^{2} is the effective survey area of KiDS1000 (see Giblin et al. 2021) and 859.4 deg^{2} is the area of one subpatch. The red dashed curve corresponds to a lognormal PDF with the measured moments and from the smoothed T17 density maps, and indicates the accuracy using a lognormal PDF. The green and the black dashed lines are both from the model; the green corresponds to the PDF of δ_{m, U} when using lognormal and the black using the bivariate approach described in Eq. (26). Lower panels: residuals Δp(δ_{m, U}) of all lines with respect to the simulations. 
5.2. Validating p(N_{ap})
We show in Fig. 5 how well the model can predict p(N_{ap}) given the galaxy distributions described in Sect. 4. As for p(δ_{m, U}), the best matches are observed for the nonnegative filters, where the simple lognormal PDF is used. For the compensated filters with the bivariate lognormal p(δ_{m, U}) we note a slight deviation in the skewness of p(N_{ap}). These discrepancies are not seen when placing galaxies at random positions regardless of any underlying matter density field as shown in Fig. A.1, which indicates that they must originate either from p(δ_{m, U}) or from the ⟨w_{ϑ}δ_{m, U}⟩ term (we set the latter to 0 for uniform random fields). It might be that the deviations seen in p(N_{ap}) are exclusively caused by the deviations in p(δ_{m, U}), but since they are much smaller, we expect that the assumptions made in computing ⟨w_{ϑ}δ_{m, U}⟩ induce additional inaccuracies. Nevertheless, we show next that these deviations result in shear signals whose residuals are well within the statistical uncertainties of Stage III weak lensing surveys such as KiDS1000. However the accuracy of the ⟨w_{ϑ}δ_{m, U}⟩ term will likely need to be improved for future surveys like Euclid, as discussed in Sect. 3.2.
Fig. 5. PDF of N_{ap} calculated with the filters U in Fig. 3. The orange lines are determined with the simulations and the orange shaded region is the standard deviation from 48 subpatches. The black dashed lines correspond to the results from the new model, and for comparison the red dashed line in the upper left panel is from the old model. Lower panels: residuals Δp(N_{ap}) of all lines with respect to the simulations. 
5.3. Validating ⟨γ_{t}𝒬⟩
Having quantified the accuracy of the basic ingredients of our model, we are now in a position to compare the predicted and measured shear profiles. This is a major result of our paper, which is shown in Fig. 6. Following G18, we used five quantiles and we measured the shear profiles up to 120′ (or 150′ for the wide Mexican case). For the tophat, Gaussian, and wide Mexican filters we see no significant deviations between the model and the simulations. For the adapted and the smaller Mexicans the shear profiles show minor discrepancies in some quantiles and at large angular scales, but are always consistent within the KiDS1000 accuracy. The shapes of the signals are affected by the choice of the filter. We can observe shifts in the peak positions and changes in the slope of the signals especially at small scales. This will allow us in the future to select filters that optimise the signaltonoise ratio of the measurement, while being clean of systematics related to smallscale inaccuracies. Finally, we show in Fig. B.2 that for the compensated filter the difference in using the proposed bivariate lognormal approach is slightly more accurate than using a plain lognormal. Although the difference does not change the final results noticeably, and although it introduces some inconsistency in the sense that we use a bivariate approach for p(δ_{m,U}) but not for ⟨κ_{< θ}δ_{m,U}⟩^{5}, we decided to stay with the proposed ansatz because it is slightly more accurate, and we plan to use p(N_{ap}) in future analysis.
Fig. 6. Predicted shear profiles for the two lens samples (dashed black line) and measured shear profiles (in orange) for the new model with filter U. The orange shaded region is the standard deviation on the mean from 48 subpatches, scaled to the KiDS1000 area. The residuals between model and simulations were tested to determine whether they can be erased when the PDF of the aperture number is fixed to the measured value from T17, but the same discrepancies were present. 
In order to check whether the discrepancies seen for some compensated filters yield biased results, we performed an MCMC analysis. As our data vector we used the T17 shear profiles shown in Fig. 6, where we made a conservative cut and included only scales above 14′, since as shown in F18 the model is not fully accurate for small angular scales. For the comparison we decided to use the adapted filter and the tophat filter to have one analysis with and one without these discrepancies. Furthermore, since the mean aperture mass summed over all quantiles vanishes per definition, one of the five shear signals is fully determined by the others, and so we discarded for all cases the middle quantile with the lowest signal. Thus, we ended up with data and model vectors of size 88. As explained previously, we measured our covariance matrix from ten T17 simulations, each divided into 48 subpatches, for a total of 480 subpatches. We note here that the galaxy number density can slightly deviate between the different realisations due to the Poisson sampling. Given the amplitude of these small fluctuations, these can be safely neglected. Next we debiased the inverse covariance matrix C^{−1} following Hartlap et al. (2007),
where n is the number of simulations (480) and p the size of the data vector (88). Finally, given our data d measured from only one noisefree T17 realisation, and our model vector m, we measured the χ^{2} statistics as
Given this setup we ran an MCMC varying the matter density parameter Ω_{m} and normalisation of the power spectrum σ_{8} for the adapted and the tophat filters, where we marginalised over the biases of the lens samples. As shown in Fig. 7 the analysis with the adapted filter results in a biased inference for the Ω_{m}σ_{8}plane (although still within 1σ); this is not the case for the tophat filter. We note here that this bias is due to the systematic offset in the slope of the highest quantile, which in turn is is highly sensitive to Ω_{m}. Since the amplitude of the shear profiles are correct and these are highly correlated with the parameter, the contours shift to smaller σ_{8} values in order to compensate for the bigger Ω_{m} value^{6}. In the next section we calibrate the model to investigate whether this systematic bias can be corrected.
Fig. 7. MCMC results for the tophat and adapted filter using the model and the T17 simulations as our data vector and a covariance matrix calculated from ten T17 realisations each divided into 48 subpatches. For the adapted filter a systematic bias for σ_{8} and Ω_{m} is found, although it cancels out for the parameter. The contours here are marginalised over the lens galaxy bias parameters. 
5.4. Calibrating the model
In this section we calibrate the remaining small inaccuracies of the analytical model seen in Fig. 6 which result in the systematic bias we had observed in the parameter constraints shown in Fig. 7. For this we decided to divide out for each quantile the residuals between the model, γ_{MT}, at the T17 cosmological parameters, p_{T17}, and the noiseless shear profiles measured from the T17 simulations, γ_{T17}, such that the calibrated model at parameters p is defined as
Since we used the n(z) combinations of the fiducial cosmoSLICS shown in Fig. 2 to validate the calibration, we decided to use the n(z) shown in Fig. 1, and in order to have the source n(z) as close as possible to the one of the cosmoSLICS we averaged several T17 shear grids at different redshifts for the same realisation weighted by the source n(z) shown in Fig. 2. In Fig. B.4 we show the calibration vectors for the highest and lowest quantile for the tophat and adapted filter, where it can be seen that the different lens n(z) is more important than the source n(z).
Next, in order to investigate whether the calibration decreases the systematic biases we performed another MCMC analysis on independent simulations, where our data vector is the fiducial cosmology from the cosmoSLICS shear profiles shown in Fig. 8, with the original model in red and the calibrated one in black. As before we used the adapted filter and the tophat filter. The match between the predicted and measured shear profiles is slightly degraded compared to the T17 simulations, which could be caused by edge effects (contiguous fullsky vs. 100 deg^{2} patches), the smaller statistic (41 253 deg^{2} vs. 5000 deg^{2}), or differences in the underlying matter power spectrum P(k) that is used in the model. We use the Takahashi et al. (2012) HALOFIT function throughout this paper, which is calibrated on the same Nbody code that is used to create the T17 simulations (Springel et al. 2001, GADGET2), and which is known to have an excess power of 5–8% in the mildly nonlinear regime (Heitmann et al. 2014). The cosmoSLICS, in contrast, are produced from cubep^{3}m (HarnoisDéraps et al. 2013), whose P(k) agrees better with the Cosmic Emulator of Heitmann et al. (2014). We tested different choices of power spectra models calculated with the PYCCL package (Chisari et al. 2019), but found differences in the predicted shear profiles that are negligible compared to the expected KiDS1000 uncertainties.
Fig. 8. Shear profiles for the tophat filter (left) and for the adapted filter (right) for the fiducial cosmology of cosmoSLICS. The orange lines are the mean shear profiles and the orange shaded region is the expected KiDS1000 uncertainty. The red dashed line corresponds to the original model and the black to the calibrated model. 
Discarding again the middle quantile with the lowest signal, using four different redshift combinations (Sect. 4.3) and the signal at all scales because the model is calibrated at all scales, we have data and model vectors of size 160. In this scenario we calculated our covariance matrix from 614 SLICS simulations^{7} with shape noise that mimics the KiDS1000 data. After debiasing the inverse covariance matrix C^{−1} with Eq. (45) we calculated the χ^{2} with Eq. (46). Given this setup we ran multiple MCMC, where we used the original model and the calibrated model. As shown in Fig. 9 the calibrated model for the adapted filter results compared to the original model in less a biased inference. Interestingly, the results for the tophat filter seen in Fig. B.3 are slightly more biased than the calibrated model for the adapted filter. Since this offset is still inside 1σ, it is likely to be only a statistical fluke due to the remaining residual between model and cosmoSLICS simulations. The constraining power between tophat and adapted filter are different because the smoothing scales of the two filters were not adjusted as in Burger et al. (2020), and are here sensitive to different physical scales. Nevertheless, we show in Table 1 the resulting constrains for both filters, where it is seen that the calibration moves the results, also for the tophat filter, closer to the truth.
Fig. 9. MCMC results for the adapted filter using the original and calibrated model. The data vector is calculated from the fiducial cosmology of cosmoSLICS and a covariance matrix from 614 SLICS realisations. It is clearly seen that the calibrated model is less biased than the original one. The contours are marginalised over the lens galaxy bias parameters. 
Overview of the maximumposterior cosmologies with the constraining power that we obtain for the original and calibrated model.
In order to compare our results with those of G18, who derived constraints of and with their fiducial analysis, we need to multiply our uncertainty intervals by to account for the smaller area of KiDS1000 (777.4 deg^{2}) compared to the DES Y1 area (1321 deg^{2}). Furthermore, we exclusively used information about the shear profiles, whereas G18 also used the mean aperture number in each quantile. For this work we were a bit sceptical about using the aperture number here for the compensated filters because we have significant residual discrepancies between model and simulation, which would affect our analysis. The match of the shear profiles in turn is very accurate in our simulations, which shows that they are robust against uncertainties in P(N_{ap})^{8}. For instance, if one monotonically transforms the N_{ap} values, the predicted P(N_{ap}) changes, but the segmentation into quantiles is not affected, hence the shear profiles would remain the same. In order to use P(N_{ap}) in future analysis we need to model shot noise in the galaxy distribution and investigate if the residuals between model and simulations result in systematic biases, but we will keep this for future work. Nevertheless, we see that our constraints from using only information about the shear profiles can be similar to the ones in G18. In addition, due to the calibration method used here, smaller smoothing scales are available than those recommended in F18, where even the tophat filter has significant deviations. This could allow us to further improve the significance for future DSS analyses or to investigate effects such as baryonic feedback and intrinsic alignments, which are typically relevant on scales < 10 Mpc h^{−1}.
6. Summary and conclusion
In our previous work (Burger et al. 2020) we showed that using compensated filters in the density split statistic (DSS) to quantify over and underdense regions on the sky have advantages compared to the tophat filter, both in terms of the overall S/N and of recovering accurately the galaxy bias term. Furthermore, we expect that compensated filters are less influenced by baryonic effect, since they are more confined in Fourier space and therefore are better in smoothing out large ℓmodes where baryonic effects play an important role. This will be investigated in more detail in a followup paper, when we start dealing with real data. Gruen et al. (2016) demonstrated that the DSS is a powerful cosmological tool by constraining cosmological parameters with DSS measurements from the Dark Energy Survey (DES) First Year and Sloan Digital Sky Survey (SDSS) data, using the DSS model derived in Friedrich et al. (2018) which uses a tophat filter. They found for the matter density parameter , a constraint that agrees with and is competitive with the DES analysis of galaxy and shear twopoint functions (see Abbott et al. 2018).
Following these works, we modify the model of Friedrich et al. (2018) in such a way that it can predict the shear profiles ⟨γ_{t}𝒬⟩ for a given quantile Q of the aperture number N_{ap} for general filters (Gaussian and also compensated filters). This is achieved by recalculating the three basic ingredients, which are the PDF of the projected matter density contrast smoothed with the filter function, p(δ_{m, U}); the expectation value of the convergence inside a radius ϑ for a fixed smoothed matter density contrast, ⟨κ_{< ϑ}δ_{m, U}⟩; and the distribution of N_{ap} for the given filter function U given the smoothed matter density contrast, p(N_{ap}δ_{m, U}). For ⟨κ_{< ϑ}δ_{m, U}⟩ we modified the calculation of the moments for general filters, while we introduced new approaches to calculate p(N_{ap}δ_{m, U}) and p(δ_{m, U}) for compensated filters. For nonnegative filters, δ_{m, U} is well described by a lognormal PDF, although we found significant deviations for compensated filters. To solve this issue we used a bivariate lognormal ansatz, where we assumed that δ_{m, U} can be divided into two lognormal random variables with each separately following a lognormal distribution. For the calculation of p(N_{ap}δ_{m, U}) we derived an expression for the corresponding characteristic function, which can be used either directly to calculate p(N_{ap}δ_{m, U}) by inverse Fourier transformation or by calculating the first three moments, which then specify a lognormal distribution for p(N_{ap}δ_{m, U}). The differences between these two approaches are considerably smaller than the statistical uncertainty, and so we used the latter approach because of its smaller computational time.
In order to validate the revised model, we compared it to the Takahashi et al. (2017) simulations. For nonnegative filters like a tophat or a Gaussian, no significant difference between the model and simulations for the PDF or the tangential shear profiles were detected. For compensated filters, however, we found some discrepancies in the predicted PDF of N_{ap} and shear signals, which results in a biased inference, although still inside 1σ. To correct this biased result, we calibrated the model to match the noiseless Takahashi et al. (2017) and tested the calibrated model with the independent fiducial cosmology of cosmoSLICS (HarnoisDéraps et al. 2019). With the calibration applied, all systematic biases are removed, so we are confident that we can apply the model to Stage III surveys such as KiDS1000. Although this calibration is less important for the tophat and Gaussian filter, it is still an interesting approach because it allows even smaller scales to be used for both the shear profiles and the filter scales. The use of smaller scales, where the original models fail, makes it possible to increase the constraining power or to study baryonic effects that normally play an important role only at small scales.
After passing all these tests, we are confident that the revised model can be readily applied to Stage III lensing data. We note that a number of systematic effects related to weak lensing analyses will require external simulations, notably regarding the inclusion of secondary signal from the intrinsic alignments of galaxies, or from the impact of baryonic feedback on the matter distribution. However, our model is able to capture the uncertainty on the lens and source redshift distribution, the shape calibration bias, or the galaxy bias at a low computational cost, and is therefore ideally suited to perform competitive weak lensing analyses in the future.
This assumption is not evident per se, since via mode coupling the largescale profile of a given density perturbation may well be correlated to the shotnoise (i.e. smallscale fluctuations) of galaxy formation in the centre of that perturbation. F18 have found the approximation ⟨κ_{< ϑ}δ_{m, U},N_{ap}⟩ ≈ ⟨κ_{< ϑ}δ_{m, U}⟩ to be accurate in the Buzzard Nbody simulations (DeRose et al. 2019), but a more stringent investigation of this assumption is left for future work.
These maps are freely available for download at http://cosmo.phys.hirosakiu.ac.jp/takahasi/allsky_raytracing/
The SLICS are made publicly available on the SLICS portal at https://slics.roe.ac.uk/
Since the impact is already quite small when adjusting p(N_{ap}), we are confident that also using a bivariate approach for ⟨κ_{< θ}δ_{m,U}⟩ would result in even smaller improvements as discussed in greater detail at the end of Sect. 3.2.
Acknowledgments
We thank the anonymous referee for the very constructive and fruitful comments. This paper went through the whole KiDS review process, where we especially want to thank the KiDSinternal referee Benjamin Joachimi for his fruitful comments to improve this work. Further, we would like to thank Mike Jarvis for maintaining TREECORR and Ryuichi Takahashi for making his simulation suite publicly available. PB acknowledges support by the Deutsche Forschungsgemeinschaft, project SCHN34213. OF gratefully acknowledges support by the Kavli Foundation and the International Newton Trust through a NewtonKavliJunior Fellowship and by Churchill College Cambridge through a postdoctoral ByFellowship. JHD is supported by a STFC Ernest Rutherford Fellowship (project reference ST/S004858/1). Author contributions: all authors contributed to the development and writing of this paper.
References
 Abbott, T. M. C., Abdalla, F. B., Alarcon, A., et al. 2018, Phys. Rev. D, 98, 043526 [NASA ADS] [CrossRef] [Google Scholar]
 Abbott, T. M. C., Aguena, M., Alarcon, A., et al. 2022, Phys. Rev. D, 105, 023520 [CrossRef] [Google Scholar]
 Abramowitz, M., & Stegun, I. A. 1972, in Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables, 10th edn. (Washington DC: US Government Printing Office), Applied Mathematics Series, 55 [Google Scholar]
 Arfken, G., & Weber, H. 2008, Mathematical Methods for Physicists, 6th edn. (Amsterdam, Heidelberg: Elsevier Academic Press) [Google Scholar]
 Asgari, M., Tröster, T., Heymans, C., et al. 2020, A&A, 634, A127 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Asgari, M., Lin, C.A., Joachimi, B., et al. 2021, A&A, 645, A104 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Barthelemy, A., Codis, S., & Bernardeau, F. 2021, MNRAS, 503, 5204 [Google Scholar]
 Bergé, J., Amara, A., & Réfrégier, A. 2010, ApJ, 712, 992 [CrossRef] [Google Scholar]
 Bernardeau, F., & Valageas, P. 2000, A&A, 364, 1 [NASA ADS] [Google Scholar]
 Bernardeau, F., Colombi, S., Gaztañaga, E., & Scoccimarro, R. 2002, Phys. Rep., 367, 1 [Google Scholar]
 Boyle, A., Uhlemann, C., Friedrich, O., et al. 2021, MNRAS, 505, 2886 [NASA ADS] [CrossRef] [Google Scholar]
 Burger, P., Schneider, P., Demchenko, V., et al. 2020, A&A, 642, A161 [EDP Sciences] [Google Scholar]
 Chisari, N. E., Alonso, D., Krause, E., et al. 2019, ApJS, 242, 2 [Google Scholar]
 DeRose, J., Wechsler, R. H., Becker, M. R., et al. 2019, ArXiv eprints [arXiv:1901.02401] [Google Scholar]
 Eisenstein, D. J., & Hu, W. 1998, ApJ, 496, 605 [Google Scholar]
 Euclid Collaboration (Knabenhans, M., et al.) 2021, MNRAS, 505, 2840 [NASA ADS] [CrossRef] [Google Scholar]
 Fan, Z., Shan, H., & Liu, J. 2010, ApJ, 719, 1408 [Google Scholar]
 Friedrich, O., Gruen, D., DeRose, J., et al. 2018, Phys. Rev. D, 98, 023508 [Google Scholar]
 Fu, L., Kilbinger, M., Erben, T., et al. 2014, MNRAS, 441, 2725 [Google Scholar]
 Giblin, B., Heymans, C., Asgari, M., et al. 2021, A&A, 645, A105 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Gruen, D., Friedrich, O., Amara, A., et al. 2016, MNRAS, 455, 3367 [Google Scholar]
 Gruen, D., Friedrich, O., Krause, E., et al. 2018, Phys. Rev. D, 98, 023507 [Google Scholar]
 Halder, A., Friedrich, O., Seitz, S., & Varga, T. N. 2021, MNRAS, 506, 2780 [CrossRef] [Google Scholar]
 Hamana, T., Shirasaki, M., Miyazaki, S., et al. 2020, PASJ, 72, 16 [Google Scholar]
 HarnoisDéraps, J., Pen, U.L., Iliev, I. T., et al. 2013, MNRAS, 436, 540 [Google Scholar]
 HarnoisDéraps, J., Amon, A., Choi, A., et al. 2018, MNRAS, 481, 1337 [Google Scholar]
 HarnoisDéraps, J., Giblin, B., & Joachimi, B. 2019, A&A, 631, A160 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 HarnoisDéraps, J., Martinet, N., Castro, T., et al. 2021, MNRAS, 506, 1623 [CrossRef] [Google Scholar]
 Hartlap, J., Simon, P., & Schneider, P. 2007, A&A, 464, 399 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Heitmann, K., Lawrence, E., Kwan, J., Habib, S., & Higdon, D. 2014, ApJ, 780, 111 [Google Scholar]
 Heymans, C., Tröster, T., Asgari, M., et al. 2021, A&A, 646, A140 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Hilbert, S., Hartlap, J., & Schneider, P. 2011, A&A, 536, A85 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Hinshaw, G., Larson, D., Komatsu, E., et al. 2013, ApJS, 208, 19 [Google Scholar]
 Jarvis, M., Bernstein, G., & Jain, B. 2004, MNRAS, 352, 338 [Google Scholar]
 Kilbinger, M., & Schneider, P. 2005, A&A, 442, 69 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Lewis, A., Challinor, A., & Lasenby, A. 2000, ApJ, 538, 473 [Google Scholar]
 Limber, D. N. 1953, ApJ, 117, 134 [NASA ADS] [CrossRef] [Google Scholar]
 Lin, C.A., & Kilbinger, M. 2015, A&A, 576, A24 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Mead, A. J., Tröster, T., Heymans, C., Van Waerbeke, L., & McCarthy, I. G. 2020, A&A, 641, A130 [EDP Sciences] [Google Scholar]
 Munshi, D., McEwen, J. D., Kitching, T., et al. 2020, J. Cosmol. Astropart. Phys., 2020, 043 [CrossRef] [Google Scholar]
 Nishimichi, T., Takada, M., Takahashi, R., et al. 2021, ApJ, 884, 29 [Google Scholar]
 Papoulis, A., & Pillai, S. U. 1991, Probability, Random Variables, and Stochastic Processes, 3rd edn. (Boston: McGrawHill) [Google Scholar]
 Pires, S., Leonard, A., & Starck, J.L. 2012, MNRAS, 423, 983 [NASA ADS] [CrossRef] [Google Scholar]
 Planck Collaboration V. 2020, A&A, 641, A5 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Pyne, S., & Joachimi, B. 2021, MNRAS, 503, 2300 [NASA ADS] [CrossRef] [Google Scholar]
 Reimberg, P., & Bernardeau, F. 2018, Phys. Rev. D, 97, 023524 [NASA ADS] [CrossRef] [Google Scholar]
 Schneider, P. 1996, MNRAS, 283, 837 [Google Scholar]
 Schneider, P. 1998, ApJ, 498, 43 [Google Scholar]
 Schneider, P., Ehlers, J., & Falco, E. E. 1992, Gravitational Lenses, Astronomy and Astrophysics Library (Berlin and Heidelberg: Springer) [Google Scholar]
 Shan, H., Liu, X., Hildebrandt, H., et al. 2018, MNRAS, 474, 1116 [Google Scholar]
 Springel, V., Yoshida, N., & White, S. D. M. 2001, New Astron, 6, 79 [NASA ADS] [CrossRef] [Google Scholar]
 Takahashi, R., Sato, M., Nishimichi, T., Taruya, A., & Oguri, M. 2012, ApJ, 761, 152 [Google Scholar]
 Takahashi, R., Hamana, T., Shirasaki, M., et al. 2017, ApJ, 850, 24 [Google Scholar]
 Vakili, M., Bilicki, M., Hoekstra, H., et al. 2019, MNRAS, 487, 3715 [Google Scholar]
 Wright, A. H., Hildebrandt, H., van den Busch, J. L., et al. 2020, A&A, 640, L14 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Appendix A: Detailed derivations for the new model
In this appendix we show more detailed derivations of the results than in the main text. We start with the calculation of the variances or covariances in the flatsky approximation, continue with calculations of the thirdorder moments and finish with the derivation of the PDF of the aperture number given a smoothed density contrast by use of the characteristic function.
A.1. Variance and skewness for general filters at leading order in perturbation theory
Although analytical possible we decided against using the bispectrum to calculate thirdorder moments like the skewness. Instead we use a formalism where we calculate the second and thirdorder moments of the smoothed density contrasts within cylinder of physical radius R and physical length L using the flatsky approximation shown in Appendix B in F18 for a tophat filter, and apply it to our case with a general filter U. Numerically, this approach is faster since, as we will see below, it is possible to express the thirdorder moments in terms of secondorder moments. Another advantage is that the projection is only along one dimension (radius of the cylinder) compared to the bispectrum, where the projection is at least along a 2D grid. Following F18 we start by considering a cylinder of radius R and length L. In Fourier space the tophat filter for such a cylinder is given by
where J_{1} is the first Bessel function, and k_{} and k_{⊥} are the components of k parallel and orthogonal to the cylinder, respectively. The variance of the matter contrast within such a cylinder is given at leading order by
where the last expression follows from L ≫ R, and since the integration depends from now on only on k_{⊥} we write the orthogonal component as k. The linear matter power spectrum of P_{lin, 0} is calculated using Eisenstein & Hu (1998), and D_{+} is the growth factor which depends on the conformal time. We note that the factor 1/L cancels out when projecting the moments in Eq. (A.28A.30) using the Limber approximation (Limber 1953). According to this derivation for a tophat filter we get for a general filter U that
where U_{χ}(r) = U(r/χ) = U(ϑ)/χ^{2}, with U(ϑ) being a filter measured in angular coordinates (see Fig. 3). Correspondingly, the variance of the matter density contrast for a general filter U in the flatsky approximation is
Following lines similar to those of Appendix B.4 of F18, the leadingorder contribution to the skewness of matter density contrast for the general filter U can be calculated as
where . The function F_{2} in a general ΛCDM universe is given by
where μ results from perturbation theory and is a function of the growth factor D_{+} (see Appendix B.1 in F18 for more details)^{9}, and ϕ is the angle between the vectors with absolute values q_{1} and q_{2}. Given the definition of W_{Uχ} in Eq. (A.3), Φ_{Uχ} can be written as
Next, we use Graf’s addition theorem (see e.g. Abramowitz & Stegun 1972), which states that
such that Φ_{Uχ}(q_{1}, q_{2}) becomes
where we made use of the orthogonality of the trigonometric functions. Plugging Φ_{Uχ}(q_{1}, q_{2}) back into Eq. (A.5) and considering each term separately we get
and by analogy
and finally
The following transformations provide a more compressed expression for , which can then be used to verify our derivation by comparing it with the result from F18 for a tophat filter. For this, we rewrite the expression of Bessel functions in terms of as
and with
we get
Using these relations together with the following notation
we find that
and
Finally, combining A, B, and C, the skewness of δ_{Uχ, L} simplifies to
where it is seen that for a tophat of size ϑ with U_{χ}(r) = ℋ(ϑ − χr) the result in Eq. (B.35) immediately follows.
Although all necessary ingredients for specifying the PDF of δ_{m, U} are derived already, we still need moments like to compute quantities like ⟨κ_{< ϑ}δ_{m, U}⟩ or . With the definitions of two further integrals,
and using the result of F18 that for a tophat filter of size R
the joint filter moments between the matter density contrast smoothed with the general filter and the matter density contrast smoothed with a tophat of size ϑ follows analogously to the skewness, and is given by
A.2. Limber projection
Given the moments of the smoothed density contrasts at comoving distance χ derived in the previous section, the moments in Eqs. (24, 25) and Eqs. (31–33) for k = 1, 2, or 3 follow (see e.g. Bernardeau & Valageas 2000),
where q_{f}(χ) is the projection kernel defined in Eq. (10) and W_{s}(χ) the lensing efficiency defined in Eq. (14). We note that these three equations employ a Limber approximation, which consists of L → ∞ (Limber 1953), and that the physical radius r of filter U scales with χ as described below Eq. (A.3). We also note that these expectation values are independent of L.
A.3. Nonlinear regime
In order to go to the nonlinear regime for secondorder moments, we replace the linear power spectrum in the above calculations with the nonlinear power spectrum, which in turn is determined with the halofit model from Takahashi et al. (2012) using an analytic approximation for the transfer function (Eisenstein & Hu 1998).
For the thirdorder moments we use that for a tophat filter of size R the filter simplifies to , such that
and
Furthermore, the skewness simplifies in the this case to:
This then helps to define
which in the linear and the nonlinear regime is approximately the same (Bernardeau et al. 2002), meaning that in order to get the skewness in the nonlinear regime we approximate
For the general filter we use that the numerical integration of r in results basically in a sum of tophat filters, such that we make use of S_{3} to scale each term individual to the nonlinear regime. For the joint filter moment we use a generalised version of S_{3}, which states that for two different tophat filters of size R_{1} and R_{2}
Using again that the rintegration results in a sum of tophat filters and factoring out the nonderivative terms similar to Eq. (A.34), we scale individually all the nonderivative terms to the nonlinear regime.
A.4. Characteristic function
We consider a large circle of radius R, inside of which there are N = n_{0}πR^{2} galaxies, where n_{0} is the galaxy number density. The probability of finding a galaxy at separation ϑ is
where ⟨w_{ϑ}δ_{m, U}⟩ is the expectation of the mean 2D density contrast on a circle at ϑ (see Eq. (37)) given the smoothed density contrast defined in Eq. (38). The assumption of linear galaxy bias enters here by the term b ⟨w_{ϑ}δ_{m, U}⟩. The normalisation is
which goes to unity for R → ∞. The characteristic function (CF) of the aperture number N_{ap}, given the smoothed 2D density contrast δ_{m, U}, is given by
where we used in the second line that
with n(ϑ) = ∑_{j}δ_{D}(ϑ − ϑ_{j}), and that the galaxy positions ϑ_{i} independently trace the density profile ⟨w_{ϑ}δ_{m, U}⟩. As discussed in Sect. 3.3, the exact approach is to transform the CF to the probability density function p(N_{ap}δ_{m, U}) by use of the inverse Fourier transformation. Alternatively, we assume that the PDF is well approximated by a lognormal distribution as
where the parameters S, M, L are fixed with the first raw moment
and the central moments
The raw moments can be calculated from the derivatives of the CF,
With the definition
it follows that
and so
To find the parameters of the lognormal distribution Eq. (A.39) by use of the raw and central moments in Eqs. (A.45–A.47) we define
where we defined in the last step q = exp(S^{2}). Modifying γ we get
which always has one real solution q_{0}, and so the parameters follow to
To check this derivation and compare it to the direct approach of using the inverse Fourier transform, we created an idealised case of a fullsky uniform random field n_{side} = 4096 with a number density n_{0} ≈ 0.034/arcmin^{2}. Next we calculated by use of the HEALPY internal SMOOTHING function N_{ap} for the tophat filter of size 20′ and the for the adapted filter. In the determination of the predicted PDF we set, ⟨w_{ϑ}δ_{m, U}⟩ = 0 so p(N_{ap}) follows immediately with Eq. (A.39) or Eq. (42). It is clearly seen in Fig. A.1 that the model for both filters has an excellent fit with the measured PDF of the aperture number. Additionally, we show in Fig. A.2 a comparison between the predicted p(N_{ap}) using the full characteristic function Eq. (36) versus the lognormal approach Eq. (A.39) for the lowredshift bin from the Takahashi setup. In the lower panel the residual difference between the two methods is three orders of magnitude smaller than the signal itself, which shows that the two approaches are identical given the uncertainties we expect for Stage III surveys. Since the lognormal approach is faster to compute we can use this approach in future analyses where computational speed is essential.
Fig. A.1. Probability distribution of the aperture number resulting in a uniform random field smoothed with the tophat filter of size 20′ in the upper panel and for the adapted filter U of size 120′ in the lower panel. The orange shaded region is the standard deviation determined from 48 subpatches. 
Fig. A.2. Comparison between the two approaches to calculate p(N_{ap}). It is clearly seen that both methods yield almost the same result. 
Appendix B: Correction formulae for the power spectra of the T17 simulations
Fig. B.1. Different filters Q resulting from the corresponding U filters shown in Fig. 3 used in this work to verify the new model. 
Fig. B.2. Comparison between the uncalibrated shear profiles for the adapted filter with and without using the bivariate lognormal approach discussed in Sect. 3.1 The ratio is calculated between the measured shear profiles from T17 for the lower LRG source bin and for sources where several T17 shear grids were averaged, weighted by the n(z) given in Fig. 2. The bivariate lognormal shear profiles are more consistent with the measured shear profiles and thus (although the shear signals were calibrated) the more accurate model was chosen. Here only the highest and lowest two quantiles are shown because the middle one is to close to zero. 
Fig. B.3. MCMC results for the tophat filter using the original and calibrated model. The data vector is calculated from the fiducial cosmology of cosmoSLICS and a covariance matrix from 614 SLICS realisations. The systematic biases are likely to be statistical flukes due to the noise in the data vector. The contours are marginalised over the lens galaxy bias parameters. 
Fig. B.4. Calibration of the model γ_{MT}(p_{T}) by the T17 simulations γ_{T}(p_{T}) explained in Eq. (47), shown for the highest and lowest quantile for the adapted and tophat filter. The corresponding redshift distributions of the lenses are given in Fig. 1 and for the sources several T17 shear grids are averaged, weighted by the n(z) given in Fig. 2. 
Overview of all the different cosmological parameters for the 26 cosmoSLICS models, which are used in Sect. 5 for the cosmological analysis.
To account for the finite angular resolution T17 suggested a simple damping factor at small scales as
where ℓ_{res} = 1.6 × N_{side}. Additionally, to take the shell thickness into account they conducted a simple fitting formula by which the matter power spectrum should be modified to
where the parameters are simulation specific and are c_{1} = 9.5171 × 10^{−4}, c_{2} = 5.1543 × 10^{−3}, α_{1} = 1.3063, α_{2} = 1.1475, α_{3} = 0.62793, and the wavenumber k is in units of h/Mpc. We note that although we incorporated these corrections in the following, they have very little effect on the scales we are considering.
All Tables
Overview of the maximumposterior cosmologies with the constraining power that we obtain for the original and calibrated model.
Overview of all the different cosmological parameters for the 26 cosmoSLICS models, which are used in Sect. 5 for the cosmological analysis.
All Figures
Fig. 1. Lens galaxy redshift distribution constructed from the T17 simulation given the true n(z) of the second and third LRG bin (Vakili et al. 2019). The black dashed line shows the redshift of the source galaxies. 

In the text 
Fig. 2. Redshift distributions of the second and third LRG (lens) bins and the last two KiDS1000 (source) bins of the SLICS simulations. The n(z) are scaled such that a comparison is possible. 

In the text 
Fig. 3. Different filters U used in this work to verify the new model. For all filters we scaled the first bin value to 1/arcmin^{−2} for comparison. The corresponding Qfilters are shown in Fig. B.1. The wide Mexican filter extends up to 150′. 

In the text 
Fig. 4. PDF of δ_{m, U} smoothed with the filters shown in Fig. 3. The orange shaded region is the standard deviation of 48 subpatches scaled by a 777.4/859.4, where 777.4 deg^{2} is the effective survey area of KiDS1000 (see Giblin et al. 2021) and 859.4 deg^{2} is the area of one subpatch. The red dashed curve corresponds to a lognormal PDF with the measured moments and from the smoothed T17 density maps, and indicates the accuracy using a lognormal PDF. The green and the black dashed lines are both from the model; the green corresponds to the PDF of δ_{m, U} when using lognormal and the black using the bivariate approach described in Eq. (26). Lower panels: residuals Δp(δ_{m, U}) of all lines with respect to the simulations. 

In the text 
Fig. 5. PDF of N_{ap} calculated with the filters U in Fig. 3. The orange lines are determined with the simulations and the orange shaded region is the standard deviation from 48 subpatches. The black dashed lines correspond to the results from the new model, and for comparison the red dashed line in the upper left panel is from the old model. Lower panels: residuals Δp(N_{ap}) of all lines with respect to the simulations. 

In the text 
Fig. 6. Predicted shear profiles for the two lens samples (dashed black line) and measured shear profiles (in orange) for the new model with filter U. The orange shaded region is the standard deviation on the mean from 48 subpatches, scaled to the KiDS1000 area. The residuals between model and simulations were tested to determine whether they can be erased when the PDF of the aperture number is fixed to the measured value from T17, but the same discrepancies were present. 

In the text 
Fig. 7. MCMC results for the tophat and adapted filter using the model and the T17 simulations as our data vector and a covariance matrix calculated from ten T17 realisations each divided into 48 subpatches. For the adapted filter a systematic bias for σ_{8} and Ω_{m} is found, although it cancels out for the parameter. The contours here are marginalised over the lens galaxy bias parameters. 

In the text 
Fig. 8. Shear profiles for the tophat filter (left) and for the adapted filter (right) for the fiducial cosmology of cosmoSLICS. The orange lines are the mean shear profiles and the orange shaded region is the expected KiDS1000 uncertainty. The red dashed line corresponds to the original model and the black to the calibrated model. 

In the text 
Fig. 9. MCMC results for the adapted filter using the original and calibrated model. The data vector is calculated from the fiducial cosmology of cosmoSLICS and a covariance matrix from 614 SLICS realisations. It is clearly seen that the calibrated model is less biased than the original one. The contours are marginalised over the lens galaxy bias parameters. 

In the text 
Fig. A.1. Probability distribution of the aperture number resulting in a uniform random field smoothed with the tophat filter of size 20′ in the upper panel and for the adapted filter U of size 120′ in the lower panel. The orange shaded region is the standard deviation determined from 48 subpatches. 

In the text 
Fig. A.2. Comparison between the two approaches to calculate p(N_{ap}). It is clearly seen that both methods yield almost the same result. 

In the text 
Fig. B.1. Different filters Q resulting from the corresponding U filters shown in Fig. 3 used in this work to verify the new model. 

In the text 
Fig. B.2. Comparison between the uncalibrated shear profiles for the adapted filter with and without using the bivariate lognormal approach discussed in Sect. 3.1 The ratio is calculated between the measured shear profiles from T17 for the lower LRG source bin and for sources where several T17 shear grids were averaged, weighted by the n(z) given in Fig. 2. The bivariate lognormal shear profiles are more consistent with the measured shear profiles and thus (although the shear signals were calibrated) the more accurate model was chosen. Here only the highest and lowest two quantiles are shown because the middle one is to close to zero. 

In the text 
Fig. B.3. MCMC results for the tophat filter using the original and calibrated model. The data vector is calculated from the fiducial cosmology of cosmoSLICS and a covariance matrix from 614 SLICS realisations. The systematic biases are likely to be statistical flukes due to the noise in the data vector. The contours are marginalised over the lens galaxy bias parameters. 

In the text 
Fig. B.4. Calibration of the model γ_{MT}(p_{T}) by the T17 simulations γ_{T}(p_{T}) explained in Eq. (47), shown for the highest and lowest quantile for the adapted and tophat filter. The corresponding redshift distributions of the lenses are given in Fig. 1 and for the sources several T17 shear grids are averaged, weighted by the n(z) given in Fig. 2. 

In the text 
Current usage metrics show cumulative count of Article Views (fulltext article views including HTML views, PDF and ePub downloads, according to the available data) and Abstracts Views on Vision4Press platform.
Data correspond to usage on the plateform after 2015. The current usage metrics is available 4896 hours after online publication and is updated daily on week days.
Initial download of the metrics may take a while.