Issue 
A&A
Volume 619, November 2018



Article Number  A38  
Number of page(s)  20  
Section  Cosmology (including clusters of galaxies)  
DOI  https://doi.org/10.1051/00046361/201833481  
Published online  06 November 2018 
Breaking degeneracies in modified gravity with higher (than 2nd) order weaklensing statistics
^{1}
AIM, CEA, CNRS, Université ParisSaclay, Université Paris Diderot, Sorbonne Paris Cité, 91191 GifsurYvette, France
email: austin.peel@cea.fr
^{2}
Dipartimento di Fisica e Astronomia, Alma Mater Studiorum Università di Bologna, Via Gobetti 93/1, 40129 Bologna, Italy
^{3}
Astrophysics and Space Science Observatory Bologna, Via Gobetti 93/2, 40129 Bologna, Italy
^{4}
INFN – Sezione di Bologna, Viale Berti Pichat 6/2, 40127 Bologna, Italy
Received:
23
May
2018
Accepted:
31
July
2018
General relativity (GR) has been well tested up to solar system scales, but it is much less certain that standard gravity remains an accurate description on the largest, that is cosmological, scales. Many extensions to GR have been studied that are not yet ruled out by the data, including by that of the recent direct gravitational wave detections. Degeneracies among the standard model (ΛCDM) and modified gravity (MG) models, as well as among different MG parameters, must be addressed in order to best exploit information from current and future surveys and to unveil the nature of dark energy. We propose various higherorder statistics in the weaklensing signal as a new set of observables able to break degeneracies between massive neutrinos and MG parameters. We have tested our methodology on socalled f(R) models, which constitute a class of viable models that can explain the accelerated universal expansion by a modification of the fundamental gravitational interaction. We have explored a range of these models that still fit current observations at the background and linear level, and we show using numerical simulations that certain models which include massive neutrinos are able to mimic ΛCDM in terms of the 3D power spectrum of matter density fluctuations. We find that depending on the redshift and angular scale of observation, nonGaussian information accessed by higherorder weaklensing statistics can be used to break the degeneracy between f(R) models and ΛCDM. In particular, peak counts computed in aperture mass maps outperform third and fourthorder moments.
Key words: dark energy / gravitation / gravitational lensing: weak
© ESO 2018
Open Access article, published by EDP Sciences, under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
1. Introduction
Current observations from the cosmic microwave background (CMB) anisotropies, baryon acoustic oscillations, and supernova luminosity distances all support a phase of accelerated cosmic expansion during the present epoch (e.g. Planck Collaboration XIII 2016; Planck Collaboration XIV 2016; Anderson et al. 2014; Betoule et al. 2014). The cause of such evolution is still unknown and should amount to approximately 70% of the total energy density of the Universe. Whether it is simply a fundamental constant of nature, a physical fluid, or instead a modification of general relativity (GR) itself at cosmological scales, the accelerated expansion remains one of the biggest puzzles in modern cosmology.
The simplest cosmological model consistent with the data assumes a constant contribution (Λ) and a cold dark matter component (CDM). The cosmic expansion of a ΛCDM model, however, can be mimicked by vast range of models, which are also in agreement with current observations. These scenarios typically assume either a fluid component, in other words, dark energy (DE), or a modification of general relativity at large scales. Even when considering linear perturbations and the impact on CMB, dark energy and modified gravity (MG) models are still viable (Planck Collaboration XIV 2016) and may be advocated to solve tensions currently present between Planck data and late time probes, such as weak lensing (Hildebrandt et al. 2017; DES Collaboration 2018) and, in smaller measure, redshift space distortions (Planck Collaboration XIV 2016).
The recent direct detection of gravitational waves (GW) with an electromagnetic counterpart has contributed to the exclusion of a range of MG models in the space of general Horndeski scalartensor theories (Lombriser & Taylor 2016; Lombriser & Lima 2017; Creminelli & Vernizzi 2017; Baker et al. 2017; Ezquiaga & Zumalacárregui 2017; Sakstein & Jain 2017). Using measurements in Sakstein (2015), Dima & Vernizzi (2018) put further constraints on models beyond Horndeski, which make use of the Vainshtein mechanism to protect high density regions where GR is well tested. A large range of models is still viable, however, including those discussed in Planck Collaboration XIV (2016) and in particular scalartensor theories with a universal coupling, that is the f(R) models discussed in this paper. We refer to Crisostomi & Koyama (2018; and to references above) for a general formulation of viable models which assume a universal coupling. We note also that theories such as coupled DE (Amendola 2000; Pettorino & Baccigalupi 2008; Pettorino 2013) are still viable after GW constraints. As they do not introduce derivative couplings in the action, they therefore do not modify the speed of GWs in tensor equations; see for example Bettoni et al. (2017), where terms leading to anomalous speeds have been identified.
Distinguishing between ΛCDM and MG scenarios that mimic ΛCDM at the linear level, but which may be different at nonlinear scales, is the next challenge for future galaxy surveys. In this paper, we investigate whether various observables based on statistics of the weaklensing signal can be used to discriminate among different theoretical models that are strongly degenerate in the matter power spectrum on linear and possibly also on nonlinear scales. In particular, we focus on the strong degeneracy between f(R) modified gravity and ΛCDM that occurs when the latter models include the effects of massive neutrinos (see Motohashi et al. 2013; He 2013; Baldi et al. 2014; Wright et al. 2017). Given such a degeneracy, we are motivated to explore nonGaussian weaklensing probes that access higherorder information than twopoint correlation functions or their associated power spectra. Our goal is to determine which statistics are most promising and how the discrimination efficiency between models depends on redshift and angular scale.
Within ΛCDM, there is evidence that weak lensing alone can break the degeneracy between σ_{8} and Ω_{m} by a combination of higherorder (than second) convergence moments and shear tomography (Vicinanza et al. 2018). In terms of MG, Liu et al. (2016) have shown that nonGaussian statistics, in particular weaklensing peak counts, contain significant cosmological information and can constrain f(R) model parameters. In addition, Higuchi & Shirasaki (2016), Shirasaki et al. (2017) have studied the effects of f(R) gravity on statistical properties of the weaklensing field, including the convergence power spectrum and bispectrum, peak counts, and Minkowski functionals. Unlike these analyses, the matter power spectrum degeneracy in our case is facilitated (and the constraining power of WL observation is challenged) by the presence of massive neutrinos, a component known to play an important role during structure formation in the real universe.
The deflection of light coming from distant galaxies represents an important tool to indirectly map the projected matter density distribution along the line of sight. When light bundles travel from distant galaxies to an observer, they are continuously deflected by the inhomogeneities of the density field of the intervening nonlinear structures. As a consequence, images of background galaxies appear slightly stretched and distorted, the phenomenon known as weak gravitational lensing. Considering that by definition the weaklensing effect is small, it is necessary to average over a large number of background sources in order to quantify it. Future widefield experiments like LSST Science Collaboration (2009) and the ESA space mission Euclid^{1} (Laureijs et al. 2011) will both increase the number and redshift range of the background source population used to measure the weaklensing signal.
Given vastly larger datasets and substantially improved image quality, nextgeneration surveys will thus permit the most stringent tests yet on a variety of different theoretical scenarios (Amendola et al. 2018). In these tests, weak lensing will be a primary tool in our effort to understand the dark universe, and in particular the nature of the latetime acceleration phase. In this work, we present the first highorder moment analysis of weaklensing fields of nonstandard MG cosmologies with massive neutrinos. Our aim is to quantify the capability of future galaxy surveys to distinguish nonstandard cosmological features using weak lensing.
The paper is organised as follows. In Sect. 2 we describe the theoretical cosmologies used in this paper as a proof of concept of the methodology we propose. In Sect. 3 we briefly recall the basic weak lensing definitions that will be needed in the analysis. In Sect. 4 we discuss the Nbody simulations used for the analysis described in Sect. 5. Our results are presented and discussed in Sect. 6, and we draw our conclusions in Sect. 7.
2. f(R) cosmology: the HuSawicki model
Among the classes of theories that modify gravitational attraction at large scales in order to have cosmic acceleration, a large class of (still viable) models is given by f(R) theories. In these cosmologies, the dependence of the action on the Ricci scalar R is modified with respect to GR, and depends on a generic function f of R, such that:
where is the matter Lagrangian and . (We use a subscript G to avoid confusion with the weak lensing convergence κ.) It is possible to show that these theories are a subclass of scalartensor theories, meaning they include a universal fifth force mediated by a scalar field, that adds to the metric tensor already present in GR. Since this force is universal, it affects baryons as well as dark matter: as a consequence, f(R) theories typically need some screening mechanism that restores GR in high density regions, such as in the solar system, where observations show no significant deviation from GR. In addition, the choice of f(R) should provide a cosmology that is close enough to ΛCDM in the highredshift regime (or else it would affect the CMB in a way that would have already been observed) and provide cosmic acceleration at late times.
One of the few known functional forms of f(R) able to satisfy solar system constraints is the Hu & Sawicki (2007) model in which
and
where is the average density at present time, and c_{1} and c_{2} are dimensionless parameters. The sign of f(R) is chosen such that its second derivative f_{RR} is positive for R ≫ m^{2}, leading to stable solutions in the high curvature regime (Hu & Sawicki 2007). It has been shown that for
this model mimics the background evolution of a ΛCDM model with a cosmological constant relative density of Ω_{Λ} and a relative matter density of Ω_{m}. In the Hu & Sawicki (2007) model however, there is no cosmological constant and cosmic acceleration is given by the modification of gravity. Furthermore, the parameter c_{2} is usually expressed in terms of f_{R0} ≡ df/dR(z = 0).
In these cosmologies, therefore, two additional parameters are present: n and f_{R0}. This set of functional forms are such that f(R)/m^{2} have a transition from zero (for R → 0) to a constant, for R ≫ m^{2}. The sharpness of this transition increases with n, while f_{R0} determines when the transition occurs. A detailed analysis of the impact of such cosmologies at different scales has been presented already in the proposal paper of Hu & Sawicki (2007) and as well in more recent papers (Koyama 2016; Lombriser 2014). In Hu & Sawicki (2007, see their Fig. 9), for solar system to galaxy scales, assuming a galactic density ρ_{g} = 10^{−24} g cm^{−3}, the two parameters of the theory have to satisfy
Hu & Sawicki (2007) models have also been shown to satisfy background observations (Martinelli et al. 2009) and have been tested against some cosmological constraints (e.g. Lombriser et al. 2012; Boubekeur et al. 2014; Hu et al. 2016; see Koyama 2016 and Lombriser 2014 for an overview and detailed list of constraints). CMB and largescale structure provide relatively weak constraints, giving f_{R0}< 10^{−2}–10^{−4}. Solar system constraints typically require f_{R0}< 10^{−4}–10^{−6}, depending on environmental assumptions of the Milky Way, while galaxy clusters give the slightly stronger bound of f_{R0}< 10^{−5}. Strong lenses (Smith 2009) or dwarf galaxies and Cepheids may reach f_{R0}< 10^{−6}–10^{−7} (Jain et al. 2013; Vikram et al. 2018), with bounds that depend, however, on several assumptions and approximations, for example the shape of density profiles. Hu & Sawicki (2007) cosmologies automatically satisfy constraints recently derived by Sakstein (2015) for any choice of the parameters.
In the following we fix n = 1 and consider different values of f_{R0}, labelled as follows:

f_{4}(R) (or equivalently fR4) labels f(R) with f_{R0} = −10^{−4},

f_{5}(R) (or equivalently fR5) labels f(R) with f_{R0} = −10^{−5},

f_{6}(R) (or equivalently fR6) labels f(R) with f_{R0} = −10^{−6}.
This choice of range allows us to check typical values used in the literature for f(R) models. A model like f_{4}(R) may be at the more extreme limit of solar system observations, possibly requiring further screening at those scales, whereas the other models may instead be more in tension with strong lenses, dwarf galaxies and Cepheids, depending on assumptions in deriving the constraints. Assuming that a screening mechanism is in place, the values chosen here may still be viable at cosmological scales, where much weaker bounds are in place.
The ratios of the power spectra for these models with respect to ΛCDM are shown in Fig. 1. As we see from the figure, the inclusion of neutrinos suppresses the matter power spectrum on scales smaller than their free streaming length. This suppression is degenerate with the value of f_{R0}, which also introduces a transition scale: the higher the value, the more the linear perturbations grow below the Compton wavelength λ_{fR} and the more the power spectrum is enhanced. The connection between λ_{fR} and f_{R0} can be seen directly by defining , where
Fig. 1.
Ratio of the matter power spectra P(k) of different f(R) models, including neutrinos of different mass, with respect to ΛCDM. The different f(R) models, labelled as f_{4}(R), f_{5}(R), and f_{6}(R), are described in the text. f_{6}(R) with any neutrino mass and f_{5}(R) with m_{ν} = 0.15 eV all have a matter power spectrum within 10% of the ΛCDM one. In the following we illustrate how different statistics can be used (or not) to discriminate these models, showing complete results in particular for f_{5}(R) for illustration. 
and again f_{R} and f_{RR} are, respectively, the first and second derivatives of f with respect to R. In this expression, the effective potential V_{eff} is defined in such a way that the equation of motion for f_{R} can be written as ; specifically, this happens for , where ρ is the total energy density. Values of λ_{fR } large enough to be in the nonlinear regime may therefore lead to effects which are observable in structure formation.
Models such as f_{4}(R) without massless neutrinos would lead to matter power spectra that are strongly amplified with respect to ΛCDM (solid red curve in Fig. 1). On the other hand, the inclusion of massive neutrinos as large as 0.3 eV (crossed red line) would compensate such a choice of f(R), bringing it back to within 10% from ΛCDM even down to the smallest scales (see Baldi et al. 2014, for an early investigation of such degeneracy in the nonlinear regime).
The degeneracy between f_{R0} and neutrino masses leads similar choices to have matter power spectra within about 10% from ΛCDM, making it difficult to discriminate between f(R) and a cosmological constant, or between different f(R) models, purely on the basis of observations of second order statistics. In the following, we will investigate whether higherorder weaklensing observables can help to break such degeneracies. Although we have tested the results for different choices of (f_{R0}, m_{ν}), we will show results for the f_{5}(R) use case, which is an intermediate one among the ones shown in the figure and available in simulations.
3. Weak lensing
Weak gravitational lensing describes small distortions in the observed shapes of distant galaxies induced by the gravitational fields of massive structures in the universe along the line of sight. Significant information about the distribution and masses of cosmic structures is imprinted on galaxy images as the statistical alignment of their observed ellipticities. This signal has been used successfully in the data analysis of numerous galaxy surveys to constrain cosmological parameters (e.g. Heymans et al. 2013; Hildebrandt et al. 2017; DES Collaboration 2018) and is potentially a key probe to test gravity and models beyond ΛCDM in future weak lensing surveys such as Euclid (Amendola et al. 2018). For this purpose, the combination of weak lensing with other probes, such as galaxy clustering (Planck Collaboration XIV 2016), helps to break degeneracies in measuring the gravitational potentials. On the other hand, it is worth investigating whether one can break degeneracies in parameter space relying on weak lensing only, independently of other latetime probes. This is an important test that can help maximise the information we extract from the large amount of data that is going to be available for this probe. Furthermore, it can help identify whether tensions present in the data (Planck Collaboration XIV 2016) between weak lensing and the CMB will point to a signature of new physics or rather to systematics.
The basic weak lensing quantities are the shear γ(θ) and convergence κ(θ) fields. Shear is a spin2 field that quantifies the anisotropic distortion of source galaxy images due to the tidal gravitational fields of foreground structures. The shear measured at some position θ on the sky depends on the amplitude of density fluctuations along the line of sight to the source galaxy, as well as on the relative distances between the observer, deflectors, and source. Convergence, a scalar field, is derivable from the shear up to a constant (Kaiser & Squires 1993) and reflects isotropic changes in the observed shapes of galaxies.
The connection with cosmic density fluctuations is straightforward with κ, as it can be interpreted as the projected total matter density along the line of sight:
In this expression, H_{0} is the presentday Hubble parameter, c is the speed of light, a is the universal scale factor, and δ is the 3D density contrast defined as , where is the spatially averaged matter density. The radial coordinate χ is comoving, and the geometrical function f_{K} determines the comoving angular distance, whose form depends on the (constant) curvature of 3D space. We refer to Kilbinger (2015), for example, for a recent review of weak lensing cosmology that includes an introduction to the formalism. We deal directly with κ maps in this work as derived from our simulated cosmologies (cf. Sect. 4.2).
4. Numerical setup
4.1. The DUSTGRAINpathfinder simulations
We make use of the DUSTGRAINpathfinder simulations (see Giocoli et al. 2018a for a detailed description), a suite of cosmological collisionless simulations specifically designed to sample the joint parameter space of f(R) gravity and massive neutrino cosmologies in order to investigate their main observational degeneracies and to devise strategies to break them. These simulations followed the evolution of 768^{3} dark matter particles of mass m_{CDM}^{p} = 8.1 × 10^{10} M_{⊙} h^{−1} (for the case of m_{ν} = 0) and of as many neutrino particles (for the case of m_{ν} > 0) within a periodic cosmological box of 750 Mpc h^{−1} per side, under the effect of an f(R) gravitational interaction defined by Eq. (1).
The simulations were performed with the MGGadget code (Puchwein et al. 2013), which is a modified version of the GADGET code (Springel 2005) that implements the extra force and the chameleon screening mechanism (see Khoury & Weltman 2004) characterising f(R) gravity theories. As was already done in Baldi et al. (2014), such an MG solver was combined with the particlebased implementation of massive neutrinos developed for GADGET by Viel et al. (2010). Therefore, massive neutrinos were included in the simulations as a separate family of particles with its specific transfer function and velocity distribution, so that both CDM and neutrino particles contributed to the density source term that enters the calculation of the f_{R} extra degree of freedom.
Initial conditions were produced by generating two separate but fully correlated random realisations of the linear density power spectrum for CDM and massive neutrino particles as computed by the linear Boltzmann code CAMB (Lewis et al. 2000) at the starting redshift of the simulation z_{i} = 99. Following the approach of, for example, Zennaro et al. (2017) and VillaescusaNavarro et al. (2018), we then computed the scaledependent growth rate D^{+}(z_{i}, k) for the neutrino component in order to correctly account for neutrino gravitational velocities. Apart from these, neutrino particles also received an additional thermal velocity extracted from the neutrino momentum distribution for each value of neutrino mass under consideration.
In the present work, we use a subset of the full DUSTGRAINpathfinder runs consisting of nine simulations whose parameters are summarised in Table 1. All simulations share the same standard cosmological parameters, which are set in accordance with the Planck 2015 constraints (Planck Collaboration XIII 2016), namely Ω_{m} = Ω_{CDM} + Ω_{b} + Ω_{ν} = 0.31345, Ω_{b} = 0.0481, Ω_{Λ} = 0.68655, H_{0} = 67.31 km s^{−1} Mpc^{−1}, , and n_{s} = 0.9658.
Subset of the DUSTGRAINpathfinder simulations considered in this work with their specific parameters.
4.2. Map making
For each cosmological simulation, we constructed different lens planes from the various stored snapshots using the MAPSIM pipeline (Giocoli et al. 2015). The particles were distributed onto different planes according to their comoving distances with respect to the observer. For each simulation, we used the particles stored in 21 different snapshots to construct continuous pastlightcones from z = 0 to z = 4, with a square sky coverage of 25 deg^{2}. From the stored snapshots, we were able to construct 27 lens planes of the projected matter density distribution.
In MAPSIM, the observer was placed at the vertex of a pyramid whose square base was set at the comoving distance of z = 4. We constructed 256 different lightcone realisations within each simulation by randomising the various comoving boxes. This included changing signs, inverting, as well as redefining the centre of the coordinate system. By construction, these procedures preserve the clustering properties of the particle density distribution at a given simulation snapshot (Roncarelli et al. 2007).
On each pixel of the lens plane, with coordinate indexes (i, j), we can define the particle surface mass density
where m_{k} is the mass of the kth particle within the pixel, and A_{l} represents the comoving pixel area of the llens plane. Since gravitational lensing is sensitive to the projected matter density distribution along the lineofsight, we projected onto each lens plane all particles between two defined comoving distances from the observer, and in the simulations with massive neutrinos we consistently also accounted for this component, as well as for the proper Hubble function and the comoving distance calculation.
As was done by Petri et al. (2016, 2017), Giocoli et al. (2017, 2018b), Castro et al. (2018), we constructed the convergence maps by weighting the lens planes by the lensing kernel and assuming the Born approximation (Schäfer et al. 2012; Giocoli et al. 2016; Castro et al. 2018), which represents an excellent estimation for weak cosmic lensing down to very small scales (ℓ ≥ 10^{4}). From Σ_{l} we can write down the convergence map κ at a given source redshift z_{s} as
where l varies over the different lens planes with redshift z_{l} smaller than z_{s}. The critical surface density Σ_{crit, l, s} at the lens plane z_{l} for sources at redshift z_{s} is given by
where c indicates the speed of light, G Newton’s constant, and D_{l}, D_{s}, and D_{ls} the angular diameter distances between observerlens, observersource and sourcelens, respectively. We followed this approach also in constructing the convergence maps in the MG models, since the definition of the lensing potential in f(R) gravity models remains unchanged with respect to the GR case of the standard ΛCDM model. The resulting maps for our analysis each contain 2048^{2} pixels, giving a pixel scale of approximately 8.8 arcsec.
5. Analysis
Statistics of the aperture mass M_{ap}(ϑ) have been used in many weaklensing analyses as a probe of the matter distribution in the Universe and to constrain cosmological parameters (e.g. van Waerbeke et al. 2001, Jarvis et al. 2003, Hamana et al. 2003, Kilbinger & Schneider 2005, Clowe et al. 2006, Hetterscheidt et al. 2007, Schrabback et al. 2010, Kilbinger et al. 2013). In particular, the variance, or second central moment , is commonly used, which at a certain scale ϑ measures the lensing power spectrum within a narrow window in lspace. Furthermore, the nonlinear evolution of density fluctuations in the lowredshift Universe gets imprinted as nonGaussian features in the weaklensing signal, which can be accessed through higherorder moments of the aperture mass. For example, Pires et al. (2012) have shown that the capture of weaklensing nonGaussianities is able to break the degeneracy between σ_{8} and Ω_{m} within ΛCDM.
We explore in this work the ability of various statistics of M_{ap} to break the degeneracy between ΛCDM and f(R) models that include nonvanishing neutrino mass. We use maps free of noise throughout most of our analysis; that is, the lensing field is the true one as derived from the simulations. A discussion of galaxy shape noise and its impact on our results via a simple prescription has been included in the Appendix.
5.1. Aperture mass calculation
Given a convergence map κ(θ) for a particular model realisation, we compute the aperture mass map (Schneider 1996; Schneider et al. 1998) as
where θ is a twodimensional position vector within the map, U_{ϑ}(θ) is a circularly symmetric filter function, and ϑ is the aperture radius. The aperture mass is by design insensitive to the masssheet degeneracy, which describes the fact that a shear signal is unchanged by the presence of a uniform mass sheet along the line of sight between the observer and the source. To achieve this, the function U should be compensated in 2D, or in other words satisfy
where we have written the angular separation as θ = θ′−θ. It is also typically assumed that U_{ϑ} has either finite support or goes to zero smoothly at sufficiently small θ in order that the integral in Eq. (11) be computable on a finite data field.
Various definitions of the filter function U have been adopted in previous studies. For example, a useful parameterised family of polynomial functions was first introduced by Schneider et al. (1998). Nearly concurrently, van Waerbeke (1998) provided an alternative definition based on a Gaussian function and resembling wavelets in order to have better Fourier space properties. This latter U was subsequently used (in slightly modified form) by, for example, Crittenden et al. (2002), Zhang et al. (2003) and Jarvis et al. (2004). A third type of filter was derived in Schirmer et al. (2007) which approximates a NavarroFrenkWhite (NFW) density profile (Navarro et al. 1996). As it mimics the shear profile of a spherically symmetric dark matter halo, this form has been used for the detection of individual mass concentrations in real data.
Equation (11) is equivalently expressible in terms of the tangential component of shear γ_{t} about the position θ, where the convolution kernel is then a filter Q_{ϑ} derivable analytically from U_{ϑ} (Kaiser et al. 1994); Schneider 1996. This form is often used in the literature for convenience (e.g. Dietrich & Hartlap 2010; Kacprzak et al. 2016; Martinet et al. 2018, since the actual weaklensing observable is not the convergence, but instead the (reduced) shear via galaxy ellipticity measurements. Borders and missing data due to observational masks are also treated more accurately in shear space, although at the expense of significantly reduced algorithm speeds. Because the output of raytracing in our Nbody simulations is the convergence field directly, we compute M_{ap} on pixellated κ maps throughout this paper.
For high resolution fields like our 2048 × 2048 maps, the convolution is typically carried out in Fourier space, since directspace convolution can be too timeconsuming. We adopt a slightly different approach here for the practical computation of aperture mass maps than direct application of Eq. (11). We instead compute M_{ap} by means of the starlet transform (Starck et al. 2007), which is a wavelet transform that simultaneously produces a set of maps filtered at apertures of increasing dyadic (powers of two) scales. In other words, with a single wavelet transform of κ(θ) with O(nlogn) time complexity, we obtain {M_{ap}(θ; ϑ_{j})}_{j = 1, 2, …, jmax }, where ϑ_{j} = 2^{j} pixels. With a pixel scale of 8.8 arcsec, this corresponds to filter scales of 0.293, 0.586, 1.17, … arcmin for values j = 1, 2, 3, … in our simulated maps. The maximum j is determined by the map size as log_{2}N for an N × N map.
It was demonstrated in Leonard et al. (2012) that the aperture mass is formally identical to a wavelet transform at a specific scale. Different transforms are associated with different effective filter functions. The particular transform we adopt is the isotropic undecimated wavelet transform, also called the starlet transform, whose wavelet function at a given scale is defined as the difference between B_{3}spline functions at neighboring resolutions. We refer to Starck et al. (2007) for more details of the formalism and to Leonard et al. (2012) for the explicit form of U_{ϑ} corresponding to this transform.
In contrast to the above mentioned filter functions, the starlet transform filter presents several significant simultaneous advantages: (i) it is nonoscillatory both in angular space and in Fourier space, meaning that it can better isolate features of the signal represented in either domain; (ii) it has compact support in direct space, which is useful to control the systematics due to a mask or the image borders; (iii) it is compensated, or has a zero mean, so it is not sensitive to the mass sheet degeneracy problem (cf. Eq. (12)); and (iv) it presents an exact reconstruction, meaning that we can reconstruct the decomposed image from its starlet coefficients. We do not make use of this last property in this paper, but it could be very convenient in the case where we need to fill gaps due to missing data. The aperture mass maps resulting from the starlet transform then represent only information at a particular scale, with little leakage from other frequencies. Finally, we note that although the starlet transform returns maps filtered only at dyadic scales, which we find sufficient for our purposes here, this is not a general restriction of the wavelet formalism. Alternative wavelet transforms are possible that allow filtering at any intermediate scale.
As an illustration of the weak lensing fields in our simulations, as well as of our aperture mass computation, we show typical maps corresponding to three different models in Fig. 2. In the top row are κ(θ) maps extracted from the three f_{5}(R) simulations with, from left to right, neutrino mass sums of 0 eV, 0.1 eV, and 0.15 eV. All simulations share the same initial phases in the random realisation of the matter power spectrum, and these maps were generated by raytracing from the same observer position. The same physical structures can therefore be seen across the three convergence fields. Subtle differences in the peak positions are also apparent by eye, which is due to the different gravitational interaction in MG compared to GR and the presence or lack of neutrinos affecting structure growth.
Fig. 2.
Example convergence fields (top row) extracted from the three f_{5}(R) simulations with neutrino mass sums of 0 eV, 0.1 eV, and 0.15 eV. Each map shows the same line of sight for sources at redshift z_{s} = 2.0. Essentially the same structures are seen across all three maps due to identical initial conditions and the common field of view. Subtle differences, however, due to the different model evolutions can also be detected in the peak positions and their morphologies. Aperture mass residuals (bottom row) with respect to the ΛCDM map (not shown) at filtering scale ϑ_{3} = 1.17′ are shown for each corresponding map above. The compensated starlet filter has the effect of highlighting features with an approximate size of ϑ_{3}. Our aim is to distinguish between these models using differences in the statistics of such aperture mass maps computed at different scales. 
Shown in the lower panels of Fig. 2 are aperture mass residuals corresponding to the maps above and filtered at ϑ_{3} = 1.17′. The ΛCDM κ and aperture mass maps (not shown) are hardly distinguishable from any of the f_{5}(R) maps by eye, but differences in the positions and amplitudes of structures can be clearly seen among the models by their residuals, especially for the most prominent peaks. However, it is not a priori obvious that the statistics of such maps will be significantly different enough to distinguish the models from each other. Whether this is possible, for which statistics and for which filtering scales, are the questions we seek to answer in the following sections.
We compute all aperture mass maps in this paper using the mr_transform binary of the publicly available Interactive Sparse Astronomical Data Analysis Packages (iSAP ^{2}). This is a C++ code with many optional multiresolution wavelet transforms, which, given the large size of our mass maps, affords a significant speed increase over directspace 2D convolution. Various statistics of the maps, which are discussed in the next section, are computed using an independent Python code that we have validated on maps from the CODECS simulations used in Giocoli et al. (2015). In particular, we are able to reproduce Figs. 8 and 9 of that paper using a further independent aperture mass calculation that implements the Schneider et al. (1998) filter.
5.2. Statistics
We consider the following statistics of aperture mass maps as a function of filter scale and source galaxy redshift. To avoid edge effects, we exclude from all calculations pixels whose distance to the map border is smaller than the filter diameter.
5.2.1. Variance
The aperture mass variance (ϑ) quantifies fluctuations in lensing strength along different directions in the sky, meaning that it also measures the amplitude of fluctuations in the matter density contrast. As a secondorder statistic, it can be expressed as an integral over the convergence power spectrum P_{κ}(ℓ)
where the window function W is the Fourier transform of U_{ϑ} (Schneider et al. 1998). Aperture mass variance is therefore mostly sensitive to the Gaussian information in the distribution of matter. This integral form is convenient when comparing measurements with theoretical predictions, since P_{κ} is readily calculated (at least in ΛCDM) for a given set of cosmological parameters. Because we have simulated maps for all the models of interest in this work, we instead compute directly on filtered convergence maps:
where θ _{k} refers to the kth pixel position.
5.2.2. Skewness
As a thirdorder statistic, skewness is complementary to variance in that it probes nonGaussian information contained in the lensing observables. The skewness S is zero if the data are symmetrically distributed around the mean. If a tail extends to the right (resp. left), S is positive (resp. negative). In particular, skewness has been shown to be a sensitive probe of Ω_{m} and can break the degeneracy with σ_{8} if combined with twopoint statistics (Bernardeau et al. 1997; Jain & Seljak 1997). Analogously to the variance, skewness can be written as a bandpass filter over the convergence bispectrum (e.g. Kilbinger & Schneider 2005), the Fourier transform of the threepoint correlation function. Following Giocoli et al. (2015), we compute skewness as the (nonstandardised) thirdorder moment of our filtered maps:
5.2.3. Kurtosis
The kurtosis of a distribution is a measure of its symmetric broadening or narrowing relative to a Gaussian. Positive kurtosis implies a higher peak and larger wings than the Gaussian distribution with the same mean and variance. Negative kurtosis means a wider peak and shorter wings. Again following Giocoli et al. (2015), we compute kurtosis as the (nonstandardised) fourthorder moment of the aperture mass:
5.2.4. Peak counts
The number count distribution of lensing peaks provides another probe of nonGaussianity. Peaks represent local regions of high convergence, and their abundance for a given cosmological model carries significant information about its matter content and clustering amplitude. The number of cosmological studies based on peak statistics in convergence and aperture mass maps, both from simulated and real lensing data, has surged in the past decade (e.g. Kruse & Schneider 1999, 2000, Dietrich & Hartlap 2010, Kratochvil et al. 2010, Fan et al. 2010; Yang et al. 2011, Maturi et al. 2011, Marian et al. 2012, Hamana et al. 2012, Shan et al. 2014, 2018, Lin & Kilbinger 2015, Martinet et al. 2015, 2018, Liu et al. 2016, Kacprzak et al. 2016, Peel et al. 2017, Fluri et al. 2018). The highest signaltonoise (S/N) peaks are mostly generated by single massive structures along the line of sight. On the other hand, intermediate and low S/N peaks can arise from projection effects due to many smaller structures or filaments along the line of sight, as well as simply to noise when dealing with real data (Yang et al. 2011; Liu & Haiman 2016).
In this work, we explore the possibility of peak counts to distinguish between GR and MG models (with and without neutrinos), which are degenerate at the level of secondorder statistics. We compute peaks as a function of filter scale ϑ in maps of M_{ap}(ϑ)/σ(ϑ), where σ(ϑ) is the rms of that scale. A peak is defined as a pixel with larger amplitude than its eight neighbors and exceeding a given kσ (k = 1, 2, …) threshold. Throughout the rest of the paper, we display results for thresholds of 3σ and 5σ.
5.3. Model discrimination efficiency
For each cosmological model we consider, we produce 256 independent convergence maps of 25 deg^{2} for sources at redshifts z_{s} = (0.5, 1.0, 1.5, 2.0) by the procedure outlined in Sect. 4. We have sufficient area therefore to meaningfully study the statistical distributions of observables as a function of scale and redshift. In particular, we test the ability of various aperture mass statistics to distinguish between the models by employing the discrimination efficiency technique introduced and used in Pires et al. (2009, 2012).
The concept in our context can be understood as follows. Given the distributions of any two observables, the amount of overlap between the distributions is an indicator of how likely one is to mistake one model for the other, based on measurements of that observable. Quantification of the overlap is facilitated by the notion of a false discovery rate (FDR; Benjamini & Hochberg 1995). The FDR framework allows one to set a threshold value for a distribution such that new test samples (or observations) can be classified as either belonging to the distribution or not, depending on which side of the threshold they lie. The threshold is computed according to a desired false discovery rate α. The formalism then ensures that the fraction of false positives, or observations which indeed belong to the distribution but are incorrectly classified as detections, is less than or equal to α. As is common, we set α = 0.05 throughout this work to ensure that the number of false detections on average does not exceed 5%.
As an illustration, consider the toy example in Fig. 3. The probability distributions of some observable, all taken to be Gaussian for simplicity, are shown for four fictitious models. Let model 1 serve as the “hypothesis” against which the other models are tested. To determine the discrimination efficiency of, for example, model 2 from model 1, we first compute the right tail threshold t_{R} (dashed line) using FDR based on the samples of model 1. This is done by first rankordering the pvalues {p_{i}}, i = 1, …, n, of the n model 1 samples, computed with respect to a right tail event. Next one finds the maximum p_{m} (1 ≤ m ≤ n) for which p_{m} < m ⋅ α/n, where we have assumed statistical independence of the pvalues. The threshold t_{R} is the observation value corresponding to p_{m}, and the discrimination efficiency is the fraction of model 2 samples greater than t_{R}. Analogous reasoning holds for comparing model 3 against model 1 by computation of threshold t_{L}.
Fig. 3.
Toy example of discrimination efficiencies computed using the FDR formalism. Each (Gaussian) distribution represents the histogram measured for some observable given the model. The amount of overlapping area between any two distributions indicates how distinguishable the two models are according to this observable: the more the curves overlap, the more difficult it is to distinguish the corresponding models, and the higher the discrimination efficiency parameter is. In the example shown, considering model 1 as the hypothesis against which the others are tested, models 2, 3, and 4 have computed discrimination efficiencies of 77%, 37%, and 100%. 
In general, then, suppose we have observations of two models, M_{1} = {x_{i}}_{i = 1, …, n1 } and M_{2} = {x_{j}}_{j = 1, …, n2 }. If the centre of M_{2} is greater than that of M_{1}, the discrimination efficiency of M_{2} from M_{1} is
where counts the number of elements satisfying the condition C, and the threshold t_{R} is computed from the M_{1} samples. If the centre of M_{2} lies to the left of that of M_{1}, the condition C becomes x_{j} < t_{L}(M_{1}). For a more detailed description of this procedure, we refer to, for example, Appendix A of Pires et al. (2009).
For the example in Fig. 3, the discrimination efficiencies for models 2, 3, and 4 from model 1 are, respectively, ℰ_{2, 1} = 77%, ℰ_{3, 1} = 37%, and ℰ_{4, 1} = 100%. This corresponds with the intuition that models with fully overlapping distributions should not be considered distinguishable at all, whereas completely disjoint distributions should be 100% distinguishable.
There is an inherent asymmetry in discrimination efficiency between any two models that do not share the same width, or in general are not perfectly symmetric functional forms with merely different means. Considering model 2 as the hypothesis, for example, the model 1 discrimination efficiency is close to that of the reverse case at 75%, which follows from their near symmetry. With respect to model 3, however, the model 1 efficiency drops to 9%. This is consistent with the fact that the support of the model 1 distribution is a subset of that of model 3, and not the other way around. To take this into account in our analysis, we quote the mean of the two possible values for a given pair of models in all following results, notably in Figs. 10 and 11 and Table 2. The asymmetry is largest for the third and fourthorder moments, where the distributions deviate more from a Gaussian than do the variance and peak counts.
Maximum discrimination efficiency ℰ_{max} with respect to ΛCDM attained for each MG model according to M_{ap} statistic, filtering scale ϑ, and source redshift z_{s}.
We have depicted continuous distributions in the example of Fig. 3. In practice, however, with our simulations, we have 256 samples of each distribution for a given cosmological model, statistic, filter scale, and source redshift. Given the likelihood of undersampling, especially at the tails of the distributions that are important for setting the FDR threshold, we apply a Gaussian smoothing in each case using kernel density estimation (KDE) prior to calculating discrimination efficiencies. We find that KDE smoothing improves results in terms of the agreement between the observed overlap of the distributions and the computed value.
6. Results
6.1. Secondorder M_{ap} statistics
The lensing power spectrum, or equivalently the angular twopoint correlation function, accesses the Gaussian information in the projected matter density. As seen by their 3D matter power spectra (Fig. 1), there exist f(R) models, both with and without neutrinos, that can mimic GR to better than 10% at second order. Our goal is to look beyond secondorder statistics to see if it is possible to distinguish the MG models from ΛCDM. As a check of consistency, we first compare the lensing power spectra to verify agreement between the two secondorder probes.
In Fig. 4 we show convergence power spectra P_{κ}(ℓ) computed for various f(R) models along with ΛCDM for z_{s} = 2.0. The left plot shows the three f(R) models we consider defined by their different f_{R0} values (−10^{−4}, −10^{−5}, and −10^{−6}), all with a neutrino mass sum m_{ν} equal to zero. As theory predicts, and as we have seen by their matter power spectra, f(R) models more closely mimic ΛCDM the smaller the value of f_{R0}. In particular, f_{5}(R) agrees with ΛCDM within 17% and f_{6}(R) at better than 3% over three decades in scale.
Fig. 4.
Convergence power spectra for ΛCDM and various MG models. Left panel: f(R) models with f_{R0} = ( − 10^{−4}, −10^{−5}, −10^{−6}) and without neutrinos. Compared to ΛCDM, each MG model exhibits more lensing power, where the difference increases with the magnitude of f_{R0}. The discrepancy is most pronounced on smaller scales for f_{5}(R) and f_{6}(R) but shifts towards larger scales for f_{4}(R). Right panel: f_{5}(R) models with varying neutrino mass sums. Neutrinos have the effect of suppressing structure growth across all scales, which is seen in the damping of the f_{5}(R) curve with increasing m_{ν}. In particular, the combination of parameters (f_{R0}, m_{ν}) = (10^{−5}, 0.15 eV) reproduces ΛCDM in terms of secondorder statistics to within 8% over three decades in scale. The convergent behaviour of the curves near ℓ = 10^{5} reflects the resolution limit of our maps and particle noise that affects all of the models. 
It is interesting to note the different scales at which the deviation from ΛCDM is maximal. For both f_{5}(R) and f_{6}(R), the maximum occurs around multipole ℓ ≈ 10^{4}, whereas for f_{4}(R) it is around 10^{3}. Power thus moves from small scales to large scales with increasing f_{R0}. The effect of the modified gravitational interaction is nontrivial and not simply a uniform scaling of the power spectrum across different modes. We note that all curves have been computed for sources at z_{s} = 2.0, but very similar results occur for the other redshift planes, the main difference being an overall shift in amplitude.
In the right panel we focus on variations among f_{5}(R) models with different m_{ν}. Looking at the relative differences with ΛCDM, it is clear that the addition of neutrinos has the effect of damping power at all scales. This is consistent with the expectation that massive neutrinos should suppress structure growth. With m_{ν} = 0.1 eV (0.15 eV), the deviation does not exceed 11% (9%) within the ℓ range considered. We see that in terms of lensing, as well as directly through the matter distribution, f_{5}(R) models with neutrinos can produce a signal that is significantly degenerate with ΛCDM.
Figure 5 shows the mean aperture mass variance computed as a function of aperture scale for the same models as in Fig. 4. (ϑ) is plotted at filtering scales (ϑ_{1}, …, ϑ_{7}) = (0.293′, 0.586′, 1.17′, 2.34′, 4.69′, 9.34′, 18.8′) corresponding to a wavelet transform with j_{max} = 7. The shaded bands spanning the ΛCDM curves designate the statistical uncertainty computed as ±1σ, where σ is the standard deviation measured from the 256 ΛCDM maps at each aperture scale. We include this region in all plots of Figs. 5–7 to illustrate which models are likely to be distinguishable from ΛCDM according to the statistical scatter of the observable at a given scale.
Fig. 5.
Aperture mass variance as a function of filtering scale for ΛCDM and MG models. Data points represent the mean over 256 realisations of each simulated cosmology and are plotted at scales (ϑ_{1}, …, ϑ_{7}) = (0.293′, 0.586′, 1.17′, 2.34′, 4.69′, 9.34′, 18.8′). Shaded areas indicate the one standard deviation statistical uncertainty around ΛCDM. Left panel: three f(R) models without neutrinos. Differences in MG variance relative to ΛCDM match those seen for the power spectra in Fig. 4. In particular, the f_{4}(R) variance at large scales exceeds that of ΛCDM by around 30%, while the maximum deviations of f_{5}(R) and f_{6}(R) occur at small scales at a level of about 16% and 2%, respectively. Right panel: three f_{5}(R) models with different sums of neutrino masses m_{ν}. Trends in the curves again mirror their power spectra, and the effect of neutrinos damping structure growth is seen in the modulation of f_{5}(R) variance with m_{ν} = 0 eV towards the curve for m_{ν} = 0.15 eV. 
Fig. 6.
Higherorder aperture mass statistics for f(R) models with f_{R0} = −10^{−5} and a varying sum of neutrino masses m_{ν}. Data points represent the mean over 256 realisations of each simulated cosmology and are plotted at scales (ϑ_{1}, …, ϑ_{7}) = (0.293′, 0.586′, 1.17′, 2.34′, 4.69′, 9.34′, 18.8′). Left panel: aperture mass skewness computed as the third order moment . Higher neutrino mass leads to a nearly uniform reduction in skewness across all scales relative to ΛCDM. The maximum discrepancy between each model and ΛCDM is also larger than for the variance. Right panel: aperture mass kurtosis computed as the fourthorder moment . Differences from ΛCDM are comparable to the skewness, the main distinction being that the kurtosis of the m_{ν} > 0 models approaches the m_{ν} = 0 value at larger scales. 
Fig. 7.
Aperture mass peak counts for f(R) with f_{R0} = −10^{−5} and a varying sum of neutrino masses m_{ν}. Data points represent the mean over 256 realisations of each simulated cosmology and are plotted at scales (ϑ_{1}, …, ϑ_{7} = 0.293′, 0.586′, 1.17′, 2.34′, 4.69′, 9.34′, 18.8′). The abundance of peaks above a 3σ threshold is shown in the left panel, and above a 5σ threshold in the right panel. An order of magnitude more peaks are detected at each scale with the lower threshold, although deviations from ΛCDM, as well as among the different m_{ν} cases of f_{5}(R), are more pronounced with the higher threshold. This indicates that differences between models at this redshift are mostly contained in the highest amplitude peaks. Compared to the M_{ap} moments, peak counts interestingly do not very well distinguish between models with different neutrino mass, but they do distinguish modified from standard gravity, at least at certain scales. 
As discussed in Sect. 5, aperture mass variance probes the lensing power spectrum within a narrow window around the ℓ mode associated to scale ϑ. In agreement with their power spectra, the MG models with m_{ν} = 0 eV (left), for example, have larger variance at each scale compared to ΛCDM, with the difference increasing according to the magnitude of f_{R0}. Moreover, the variance relative to ΛCDM for f_{4}(R) is largest at intermediate filtering scales, whereas it is largest at small filtering scales for f_{5}(R) and f_{6}(R).
Plotted in the right panel of Fig. 5 is (ϑ) for f_{5}(R) models with varying neutrino mass. The trend in the curves is the same as for the power spectra seen in Fig. 4, namely the presence of neutrinos in the model can bring the aperture mass measurements into agreement with ΛCDM at better than 7% over the full range of scales considered. We confirm therefore that by including neutrinos, MG models with importantly different gravitational interactions from GR can mimic ΛCDM at the level of twopoint statistics in weaklensing observations.
We note that the maximum variance shown has been restricted to less than the value actually attained for ϑ_{1} in both plots of Fig. 5. This was done merely for visualisation purposes to better exhibit the spread of the curves at larger aperture scales. The variances for all models at the smallest scale are nearly identical in any case, as the difference plots of the lower panels show.
6.2. Higherorder M_{ap} statistics
Given that secondorder M_{ap} statistics may be insufficient to distinguish certain MG models from ΛCDM, we present now results from computing higherorder statistics in these models. These include skewness, kurtosis, and peak counts, as discussed in Sect. 5.2.
The aperture mass skewness provides a simple thirdorder statistic of the lensing field that probes the bispectrum (cf. Sect. 5.2). Results are shown in the left plot of Fig. 6 for the same f_{5}(R) models considered in Sect. 6.1. The effect of neutrinos is seen as the nearly uniform reduction in skewness at each aperture scale which varies with m_{ν} in an analogous way to the variance. That is, the curves with m_{ν} > 0 are shifted downward systematically from the curve with m_{ν} = 0. This means that the distribution of κ becomes more Gaussian as m_{ν} increases, which is consistent with a higher neutrino mass more effectively suppressing the generation of the highest matter peaks.
At the smallest scales, the skewness across all three f_{5}(R) models is amplified relative to ΛCDM, and more so than for the variance at the same scale. For example, the m_{ν} = 0 eV model skewness is approximately 25% larger than ΛCDM at ϑ_{2} compared to 12% for the variance. In addition, for the model with m_{ν} = 0.15 eV, the skewness at certain intermediate scales is up to 13% different from ΛCDM, whereas the corresponding variance at the same scale is less than 5%. Each f_{5}(R) model approaches an approximately constant skewness at scales around ϑ ≈ 10′ and larger. Given the larger (average) differences among the models, this suggests that skewness may be a better observable to break degeneracies than the variance.
We show the aperture mass kurtosis as a function of filter scale in the right plot of Fig. 6. As with the variance and skewness, the m_{ν} = 0 curve is highest and decreases at each scale as m_{ν} increases. Differences with respect to ΛCDM are of the same amplitude as for the skewness within each f_{5}(R) model. We see at fourth order, however, that neutrinos cause a more complex behaviour in the full lensing distribution than was apparent at lower orders. For example, in the relative deviation from ΛCDM (lower panel), there exists now a clear local minimum at ϑ ≈ 5′ for the models with m_{ν} > 0, in addition to the maximum at smaller scales.
It is interesting to comment on the overall behaviour of the statistics of M_{ap} moments as one varies neutrino mass. The f_{5}(R) model with m_{ν} = 0 eV overpredicts ΛCDM at each filter scale and for each statistic. Including massive neutrinos in the model, as we have discussed previously, counteracts the tendency of the modified gravitational interaction to enhance structure growth – at least on scales smaller than the neutrino freestreaming length, which in our case, for each m_{ν}, lies above the full range of aperture scales considered. The degeneracy with ΛCDM is generally strengthened by larger m_{ν}, but not without limit, and not uniformly, as the higherorder M_{ap} statistics reveal. We expect that m_{ν} values larger than we have studied would again decrease the degeneracy with ΛCDM, perhaps measurably at the twopoint level. For reference, the most recent constraints from CMB data put an upper bound on the sum of neutrino masses at around 0.17 eV (Couchot et al. 2017), however this assumes a ΛCDM model that does not necessarily apply beyond this framework.
The final nonGaussian statistics we consider is the abundance of peaks counted above a given k σ threshold (k = 1, 2, …). Results for k = 3 and k = 5 are shown in Fig. 7. A higher peak value cutoff means that in general fewer peaks will be detected. This is borne out in the two plots of Fig. 7, where the peak count at a fixed scale is about ten times smaller for k = 5 than for k = 3. In addition, the peak count for the smallest scale (ϑ_{1} = 0.293′) relatively close to that of ΛCDM in both cases but significantly larger than ΛCDM (up to about 15%) by the second scale (ϑ_{2} = 0.568′). This is similar to what we have seen for all of the M_{ap} moments, in particular for f_{5}(R) with m_{ν} = 0 and 0.1 eV.
The impact of neutrino mass on the f_{5}(R) models is less obvious with peak counts compared to the other statistics across most scales. This is because the variation between models is much smaller than the range of peak count values between different aperture scales (up to four orders in magnitude). Peak counts are therefore much more sensitive to the scale of observation than the variance, skewness, and kurtosis. We notice as well that for a given aperture size, the spread among f_{5}(R) models with different neutrino masses is less pronounced than for the M_{ap} moments. This suggests that peak counts could offer a robust test of modified gravity whether or not neutrinos are considered in the analysis and efficiently break the degeneracy that affects all other WL statistics.
We have found identical trends to those observed in Figs. 5–7 for the f_{4}(R) and f_{6}(R) models with m_{ν} > 0. This is also the case for the other source redshift planes across all models. We further explore the evolution with redshift in Sect. 6.4.
In quantifying the level at which one model is in fact distinguishable from another, visual inspection of plots is, of course, not sufficient. We therefore address this question in detail in the following sections by studying the histograms of the different observables and by computing their discrimination efficiencies.
6.3. Distributions of observables
Considering seven filtering scales, four source redshift planes, and five statistical observables in our simulations (variance, skewness, kurtosis, and peak counts above two threshold levels), we have available 7 × 4 × 5 = 140 observables for each simulated cosmology. These observables are not all independent. For example, correlation matrices between the different probes are shown in Fig. 8 for the two cases of ΛCDM and f_{5}(R) without neutrinos; the cases with m_{ν} > 0 (not shown) look very similar. The statistics have been computed for maps at z_{s} = 2.0 and aperture size ϑ_{2} = 0.586′. We see from the figure that aperture mass moments are highly positively correlated with each other at this scale for both models, while peak counts show a slight (moderate) negative correlation for f_{5}(R) (ΛCDM) between the two thresholds. Correlations between moments and peak counts are strongest between kurtosis and the 3σ peak threshold, although this varies with scale and redshift.
Fig. 8.
Correlation matrices between aperture mass moments and peak count probes for two models using maps at z_{s} = 2.0 and filtering scale ϑ_{2} = 0.586′. Aperture mass moments are strongly positively correlated with each other, while peak counts show slight to moderate negative correlation between the two threshold values. The amplitude of cross correlation between moments and peaks depends on redshift and scale. 
We are interested in comparing the histograms of each of these observables between different models in order to determine which, if any, is best at breaking degeneracies. We present here only four representative plots intended to illustrate our approach and results. Shown in Fig. 9 are the areanormalised histograms of M_{ap} variance, skewness, kurtosis, and peaks counts for ΛCDM and MG models. To compare with previous results, we focus on f_{5}(R) models with m_{ν} = (0, 0.1, 0.15) eV and choose a filtering scale of ϑ_{2} = 0.586′. We will further investigate in the rest of the paper how results depend on the filtering scale and redshift, for each statistic. Solid lines indicate smoothing by KDE (cf. Sect. 5.3). Referring to Figs. 5–7, the mean value of each statistic corresponds to the central positions of the histograms. For example, the variance at this scale of the m_{ν} = 0 eV f_{5}(R) model is most distinct from ΛCDM, while the m_{ν} = 0.15 eV is least.
Fig. 9.
Histograms of aperture mass statistics for ΛCDM and f_{5}(R) models with varying neutrino mass m_{ν}. Each histogram, with area normalised to one, comprises 256 samples of the statistic computed at a filtering scale of ϑ = 0.586′ and for sources at redshift z_{s} = 2.0. Solid lines represent the result of smoothing the distribution by KDE (cf. Sect. 5.3). The central positions of the histograms correspond to the mean values of each statistic as seen in Figs. 5–7. Considering the most degenerate case with ΛCDM, f_{5}(R) with m_{ν} = 0.15 eV, second and higherorder moments of M_{ap} do not appear able to distinguish the models. Peak counts, on the other hand, shown here for a 3σ threshold, cleanly separate the two distributions. Moreover, it is interesting that peak counts separate all f_{5}(R) cases from ΛCDM by approximately the same amount, independent of m_{ν}. 
The histograms allow us to see qualitatively how efficient each statistic is at distinguishing between models. Considering f_{5}(R) with m_{ν} = 0.15 eV (green), the model most degenerate with ΛCDM (black) in terms of the matter and convergence power spectra, higher order moments of M_{ap} do not appear able to break the degeneracy. The skewness and kurtosis histograms overlap with ΛCDM more than does the variance. On the other hand, peak counts, shown here for a 3σ threshold, displace the f_{5}(R) distribution from that of ΛCDM so that they are nearly disjoint. This is the case as well for f_{5}(R) with the other neutrino masses, supporting the result seen in Fig. 7.
In this example, we have chosen a combination of source redshift and filter scale that provides relatively good separation between the histograms of these models. Sources have been taken at z_{s} = 2.0 here, compared to z_{s} = 1.0 in Figs. 5–7, as the contrast between MG models and GR tends to increase with redshift for each observable. It should be noted as well that for many of the observables considered, that is for other redshifts and filtering scales, we find that the f(R) model histogram is essentially fully coincident with that of ΛCDM, indicating a negligible discrimination efficiency. Only a relatively small subset of observables appear able to break degeneracies. We therefore further investigate the dependence of discrimination efficiency on redshift and filtering scale in the next subsection.
6.4. Discrimination efficiency: variation with statistic, scale, and redshift
In this section we present the discrimination efficiency with respect to ΛCDM of two f_{5}(R) models, m_{ν} = (0, 0.15) eV, as a function of M_{ap} statistic, filtering scale, and source galaxy redshift. Figure 10 shows results for f_{5}(R) without neutrinos. Each polar plot represents one of the statistics, where for peak counts (lower right) the threshold has been set to 3σ.
Each plot is divided into seven wedges indicated by the bold red lines, where each wedge represents a filtering by the aperture ϑ appearing at the outer edge. Apertures are numbered as before according to wavelet scale and which correspond to angular sizes (ϑ_{1}, …, ϑ_{7}) = (0.293′, 0.586′, 1.17′, 2.34′, 4.69′, 9.34′, 18.8′). Each wedge contains four bars, colour coded according to source redshift, where the height of each bar represents the discrimination efficiency with respect to ΛCDM (as a percent) at that redshift and filtering scale. A bar extending from the centre of the figure and touching the outer edge expresses a 100% discrimination efficiency, and the scaling with radius is linear.
As expected from our previous results, we see that the variance can be a relatively good discriminator between ΛCDM and f_{5}(R) without neutrinos. In particular, the discrimination efficiency is at least 80% at scales ϑ_{2} and ϑ_{3} for sources at z_{s} ≥ 1.5. The skewness performs well only for the single aperture of ϑ_{2} and z_{s} ≥ 1.5, while the kurtosis is a poor discriminator at all filtering scales and redshifts. Considering that the kurtosis of this model can deviate from ΛCDM by up to 30%, or ∼10% more than the variance (cf. Figs. 5 and 6), it is somewhat surprising that this fourthorder statistic does not offer more discrimination power compared to second order.
The final plot in Fig. 10 reveals that peak counts can discriminate at approximately the same level as the variance or better, at least for filtering scales ϑ_{2} and ϑ_{3}. Indeed, peak count discrimination efficiency even exceeds that of the variance for z_{s} = 1.5, for example, at these two scales. We have checked that by raising the peak detection threshold to 5σ, discrimination efficiency rises to 100% at ϑ_{2} for sources at z_{s} ≥ 1.0. A consequence, however, is that the efficiency is reduced at ϑ_{3} for the higher redshift source planes. From the cases we have checked, it appears that the relationship between discrimination efficiency and peak count threshold, filtering scale, and redshift is highly nontrivial.
Fig. 10.
Discrimination efficiency with respect to ΛCDM of the MG model f_{5}(R) with m_{ν} = 0 eV as a function of statistic, aperture scale, and source redshift. In each plot, the wedges marked by red lines indicate M_{ap} filtering by the aperture ϑ appearing at the outer edge. Numbered apertures correspond to angular sizes of (ϑ_{1}, …, ϑ_{7}) = (0.293′, 0.586′, 1.17′, 2.34′, 4.69′, 9.34′, 18.8′). The radial length of a bar, shaded according to source redshift, represents the discrimination efficiency (in percent) of the statistic at the filtering scale associated to its wedge. The variance of this model suffices to distinguish it from ΛCDM at greater than 80% at scales ϑ_{2} and ϑ_{3} for sources at z_{s} ≥ 1.5. Peak counts (above a 3σ detection threshold) achieve similar results, while kurtosis proves to be a poor discriminator, not exceeding 55% at any filter scale or redshift. Skewness shows comparably good discrimination power to the variance and peak counts only at scale ϑ_{2}. 
We recall that our peak statistics do not take into account the full peak distribution but instead constitute a simple survival function, in other words the number of peaks remaining after a certain cut. As a result, the signal for a given threshold is dominated by the peaks with amplitudes near that threshold, of which we have only considered two. The peak information we explore is therefore suboptimal, and results would likely improve by using the full distribution. Nonetheless, it is clear even from our simple approach that peak counts can outperform standard moments in breaking degeneracies between MG and GR using weak lensing.
Figure 11 is analogous to Fig. 10 for the f_{5}(R) model with m_{ν} = 0.15 eV. Comparing to the case without neutrinos, it is apparent that no aperture mass moments up to fourth order are capable of distinguishing this model from ΛCDM. On the contrary, peak counts again show a promising result. For ϑ_{2} = 0.586′ filtering, sources at z_{s} = 2.0 give a 100% discrimination efficiency. For ϑ_{3} = 1.17′ filtering, the efficiency is 93% for z_{s} ≥ 1.5 and 84% for z_{s} = 1.0. We have checked here as well that raising the detection level to 5σ produces the same effect as it did for the previous case without neutrinos. Given that this model was designed to be degenerate with ΛCDM, it is a useful result, and indeed the main result of the paper, that there exist observables capable of discriminating between the models with high efficiency.
Fig. 11.
Discrimination efficiency with respect to ΛCDM of the MG model f_{5}(R) with m_{ν} = 0.15 eV as a function of statistic, aperture scale, and source redshift. Plots are analogous to those in Fig. 10. Neutrinos significantly enhance the degeneracy of this model with ΛCDM at the twopoint level compared to the m_{ν} = 0 eV case. This can be seen in the variance plot (upper left), where the discrimination efficiency is less than 18% at all filter scales and redshifts. The higherorder moments show even less discrimination power. On the contrary, peak counts (lower right), again above a 3σ detection threshold, achieve 100% discrimination efficiency for z_{s} = 2.0 at ϑ_{2}, and above 92% for z_{s} ≥ 1.5 at ϑ_{3}. 
We have carried out the same analysis for the other MG models, namely f_{R0} = −10^{−4} with m_{ν} = (0, 0.3) eV and f_{R0} = −10^{−6} with m_{ν} = (0, 0.06, 0.1) eV. A summary of the results is presented in Table 2, and f_{5}(R) is included for comparison. Shown are the maximum discrimination efficiencies from ΛCDM attained for each model along with the statistic, aperture ϑ, and source redshift z_{s} that produces it. To highlight the difference between second and higherorder statistics, results for the variance and for the best discriminating statistic are shown for each model. For all models except f_{4}(R) [m_{ν} = 0 eV], the unique maximum efficiency is achieved with peaks counted either above a 3σ or 5σ threshold. On the other hand, the f_{4}(R) case with m_{ν} = 0 eV reaches 100% with all of the statistics for multiple combinations of ϑ and z_{s}. This is expected given that this model diverges from ΛCDM most significantly already at the power spectrum level.
7. Conclusion
The cosmological data do not yet point to a unique model within which all observations can be explained, with several cosmologies still fitting the data. The standard ΛCDM model indeed accommodates the broad range of current observational probes that measure structure growth and the universal expansion across cosmic time. The fundamental nature of the latetime acceleration, however, remains unclear – in particular whether it is caused by a fluid (dark energy) component, a cosmological constant Λ, or instead by a modification to general relativity at large scales. Many modified gravity models, such as the f(R) family, are still viable given the data.
We have explored in this paper several particular f(R) models that mimic ΛCDM in terms of their background evolution and their matter power spectra at z = 0. The models can be written in terms of two free parameters, n and f_{R0}, which determine the density and scale at which the modified gravitational interaction takes effect. We fixed n = 1 throughout and considered f_{R0} values within the range 10^{−4}–10^{−6}.
By design, the models we chose are difficult to distinguish from ΛCDM based on observations at linear scales. Furthermore, using Nbody simulations, we showed that the amplitude of the matter power spectrum is larger relative to ΛCDM with increasing f_{R0}, but this can be in turn reduced by the inclusion of massive neutrinos (up to m_{ν} = 0.3 eV) in the model. This is because massive neutrinos suppress the growth of structure on scales smaller than their freestreaming length, thereby compensating the increased clustering due to larger f_{R0}. For certain combinations then of f_{R0} and m_{ν}, the model well reproduces ΛCDM not only on linear scales, but also into the nonlinear regime.
Our primary goal has been to determine whether there are weaklensing observables that are more efficient than standard secondorder statistics in discriminating between GR and MG. We first verified that the convergence power spectrum P_{κ}(ℓ) and aperture mass variance , both twopoint measurements, of our simulated cosmologies behaved as expected and consistently with each other. As with the matter power spectra, deviations from ΛCDM of the MG models decreased as f_{R0} decreased. Then, focusing on f_{5}(R), we showed that the discrepancy diminished further by including m_{ν} up to 0.15 eV over the angular scales considered.
The nonGaussian weaklensing observables we have studied are the aperture mass skewness (third order), kurtosis (fourth order), and peak counts as a function of map filtering scale and source galaxy redshift. We considered redshifts up to z = 2.0 and aperture scales in the range from 0.293′ up to 18.8′. Provided that noise is properly taken into account, a full treatment of which is beyond the scope of this work (though see the Appendix), our results may be potentially interesting for future widefield galaxy surveys like Euclid.
For f_{5}(R) with m_{ν} = 0.15 eV, the f_{5}(R) model most closely mimicking ΛCDM, there exist filtering scales for which the skewness and kurtosis measurements deviate more from ΛCDM compared to the variance, indicating a greater chance of using these observables to discriminate between the models. This was also the case for peak counts above thresholds of 3σ and 5σ. However, a further study of the discrimination efficiency based on the false discovery rate (FDR) formalism showed that only peak counts offered any reliable power to distinguish the MG model from ΛCDM. The discrimination efficiency calculation does this by quantifying the overlap between the actual distributions of observables computed over the 256 model realisations. We conclude therefore that we are less likely to mistake a true MG model for ΛCDM by measuring peak counts compared to second and higherorder moments of the aperture mass.
A particularly interesting feature of peak counts we have found is that they can be less sensitive to differences in neutrino mass than to the model of gravity. The example in Fig. 9 demonstrates this in the significant overlap among the three f_{5}(R) histograms which are each in turn nearly fully disjoint from the ΛCDM histogram. This behaviour was seen as well for the other sets of models with the same value of f_{R0} but differing m_{ν}. In addition to offering more discrimination power compared to aperture mass moments, at least for the models we have considered, peaks may generally be a more robust observable for distinguishing between standard and modified gravity.
To better understand the sensitivity of peak counts to neutrinos, it would be useful to systematically study variations in the halo mass function as a function of m_{ν} and redshift for the same f(R) model. As high S/N weaklensing peaks are thought to trace the most massive DM halos in the universe, peak counts can probe the highmass tail of the mass function. Some results in Hagstotz et al. (2018), who used the same DUSTGRAINpathfinder simulations as we have here, are already instructive on this front (cf. their Figs. 8 and 9). For example, at z = 0, the f_{5}(R) model without neutrinos shows an increase in DM halo abundance of between 10% and 20% relative to GR over the mass range 13.3 ≤ log_{10}(M_{200 m} h / M_{⊙})≤15.0, where M_{200 m} (M_{⊙}) is the halo (solar) mass. The same model with m_{ν} = 0.15 eV shows a maximum increase of only up to about 12% over the same mass range, a reduction we expect from the MG–neutrino degeneracy, but interestingly the halo abundance prediction becomes consistent with GR for the most massive halos around 10^{15} M_{⊙} h^{−1}. This is not the case for f_{4}(R) with m_{ν} = 0.3 eV, where the GR abundance is recovered a low but not high masses, and results for all models evolve with redshift as well. A detailed study of the mass function and its correlations with the peak count signal over the full parameter space we have considered would therefore be enlightening, however it is beyond the scope of this work.
In terms of source redshift dependence, we have found generally that higher z_{s} observations tend to provide greater discrimination efficiencies between models for every statistic, although we did find exceptions for particular filtering scales. In no case, however, did z_{s} = 0.5 outperform z_{s} = 2.0, which we can understand as the high redshift convergence maps carrying more significant cosmological information that is accessible by these statistics. Physically, since light travels through (and is distorted by) more structures along the line of sight, we expect better sensitivity for higher redshift sources. Moreover, as we considered the different redshift planes independently, we have not exploited the evolution of observables with z, which likely constitutes a further signal that could be used to distinguish the models.
We found similar but not precisely equivalent results for the f_{4}(R) and f_{6}(R) models (both with m_{ν} = 0 and m_{ν} > 0) compared to f_{5}(R) (cf. Table 2). They are similar in the sense that peak counts remain the best observable tested for breaking degeneracies with ΛCDM for any model with m_{ν} > 0. On the other hand, no filtering scale nor source galaxy redshift provided a larger discrimination efficiency than 66% when m_{ν} > 0, significantly lower than was seen for f_{5}(R) models. This reflects the strong degeneracy that can persist for suitably chosen combinations of MG parameters and neutrino mass sums. Without neutrinos, the f_{4}(R) model is easily distinguishable from ΛCDM, while f_{6}(R) is not.
As we have not sought in this paper to find an optimal statistic, one may reasonably wonder whether there exist other M_{ap} observables (e.g. derived from different filter functions, filtering scales, statistics, etc.) that would be better at breaking degeneracies not only between MG and GR, but also within MG models themselves. A likely way to improve results would be to use the full distribution of peaks as a function of S/N, rather than the simple threshold cut we have employed. We leave a dedicated study of this question to future work.
Acknowledgments
AP acknowledges support by an Enhanced Eurotalents Fellowship, a Marie SkłodowskaCurie Actions Programme cofunded by the European Commission and Commissariat à l’énergie atomique et aux énergies alternatives (CEA). AP wishes to thank M. Kilbinger, S. Pires, D. Elbaz, and S. Casas for many useful discussions while preparing this paper. VP and MB thank L. Lombriser and K. Koyama for useful discussions on f(R) models. CG and MB acknowledge support from the Italian Ministry for Education, University and Research (MIUR) through the SIR individual grant SIMCODE, project number RBSI14P4IH, from the grant MIUR PRIN 2015 “Cosmology and Fundamental Physics: illuminating the Dark Universe with Euclid” and from the agreement ASI n.I/023/12/0 “Attività relative alla fase B2/C per la missione Euclid”. The authors wish to acknowledge the European Community through the grant DEDALE (contract no. 665044) within the H2020 Framework Programme of the European Commission, the Euclid Collaboration, the European Space Agency and the support of the Centre National d’Etudes Spatiales. The DUSTGRAINpathfinder simulations discussed in this work have been performed and analysed on the Marconi supercomputing machine at Cineca thanks to the PRACE project SIMCODE1 (grant nr. 2016153604) and on the computing facilities of the Computational Center for Particle and Astrophysics (C2PAP) and of the Leibniz Supercomputer Centre (LRZ) under the project ID pr94ji.
References
 Amendola, L. 2000, Phys. Rev. D, 62, 043511 [CrossRef] [Google Scholar]
 Amendola, L., Appleby, S., Avgoustidis, A., et al. 2018, Liv. Rev. Relativ., 21, 2 [Google Scholar]
 Anderson, L., Aubourg, É., Bailey, S., et al. 2014, MNRAS, 441, 24 [NASA ADS] [CrossRef] [Google Scholar]
 Baker, T., Bellini, E., Ferreira, P. G., et al. 2017, Phys. Rev. Lett., 119, 251301 [NASA ADS] [CrossRef] [PubMed] [Google Scholar]
 Baldi, M., VillaescusaNavarro, F., Viel, M., et al. 2014, MNRAS, 440, 75 [NASA ADS] [CrossRef] [Google Scholar]
 Benjamini, Y., & Hochberg, Y. 1995, J. R. Stat. Soc. Ser. B, 57, 289 [Google Scholar]
 Bernardeau, F., van Waerbeke, L., & Mellier, Y. 1997, A&A, 322, 1 [NASA ADS] [Google Scholar]
 Betoule, M., Kessler, R., Guy, J., et al. 2014, A&A, 568, A22 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Bettoni, D., Ezquiaga, J. M., Hinterbichler, K., & Zumalacárregui, M. 2017, Phys. Rev. D, 95, 084029 [NASA ADS] [CrossRef] [Google Scholar]
 Boubekeur, L., Giusarma, E., Mena, O., & Ramírez, H. 2014, Phys. Rev. D, 90, 103512 [NASA ADS] [CrossRef] [Google Scholar]
 Castro, T., Quartin, M., Giocoli, C., Borgani, S., & Dolag, K. 2018, MNRAS, 478, 1305 [NASA ADS] [CrossRef] [Google Scholar]
 Clowe, D., Schneider, P., AragónSalamanca, A., et al. 2006, A&A, 451, 395 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Couchot, F., HenrotVersillé, S., Perdereau, O., et al. 2017, A&A, 606, A104 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Creminelli, P., & Vernizzi, F. 2017, Phys. Rev. Lett., 119, 251302 [NASA ADS] [CrossRef] [PubMed] [Google Scholar]
 Crisostomi, M., & Koyama, K. 2018, Phys. Rev. D, 97, 084004 [NASA ADS] [CrossRef] [Google Scholar]
 Crittenden, R. G., Natarajan, P., Pen, U.L., & Theuns, T. 2002, ApJ, 568, 20 [NASA ADS] [CrossRef] [Google Scholar]
 DES Collaboration (Abbott, T. M. C., et al.) 2018, Phys. Rev. D, 98, 043526 [Google Scholar]
 Dietrich, J. P., & Hartlap, J. 2010, MNRAS, 402, 1049 [NASA ADS] [CrossRef] [Google Scholar]
 Dima, A., & Vernizzi, F. 2018, Phys. Rev. D, 97, 101302 [NASA ADS] [CrossRef] [Google Scholar]
 Ezquiaga, J. M., & Zumalacárregui, M. 2017, Phys. Rev. Lett., 119, 251304 [NASA ADS] [CrossRef] [PubMed] [Google Scholar]
 Fan, Z., Shan, H., & Liu, J. 2010, ApJ, 719, 1408 [NASA ADS] [CrossRef] [Google Scholar]
 Fluri, J., Kacprzak, T., Sgier, R., Réfrégier, A., & Amara, A. 2018, ArXiv eprints [arXiv:1803.08461] [Google Scholar]
 Giocoli, C., Metcalf, R. B., Baldi, M., et al. 2015, MNRAS, 452, 2757 [NASA ADS] [CrossRef] [Google Scholar]
 Giocoli, C., Jullo, E., Metcalf, R. B., et al. 2016, MNRAS, 461, 209 [NASA ADS] [CrossRef] [Google Scholar]
 Giocoli, C., Di Meo, S., Meneghetti, M., et al. 2017, MNRAS, 470, 3574 [NASA ADS] [CrossRef] [Google Scholar]
 Giocoli, C., Baldi, M., & Moscardini, L. 2018a, MNRAS, 481, 2813 [NASA ADS] [CrossRef] [Google Scholar]
 Giocoli, C., Moscardini, L., Baldi, M., Meneghetti, M., & Metcalf, R. B. 2018b, MNRAS, 478, 5436 [NASA ADS] [CrossRef] [Google Scholar]
 Hagstotz, S., Costanzi, M., Baldi, M., & Weller, J. 2018, ArXiv eprints [arXiv:1806.07400] [Google Scholar]
 Hamana, T., Miyazaki, S., Shimasaku, K., et al. 2003, ApJ, 597, 98 [NASA ADS] [CrossRef] [Google Scholar]
 Hamana, T., Oguri, M., Shirasaki, M., & Sato, M. 2012, MNRAS, 425, 2287 [NASA ADS] [CrossRef] [Google Scholar]
 He, J.H. 2013, Phys. Rev. D, 88, 103523 [NASA ADS] [CrossRef] [Google Scholar]
 Hetterscheidt, M., Simon, P., Schirmer, M., et al. 2007, A&A, 468, 859 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Heymans, C., Grocutt, E., Heavens, A., et al. 2013, MNRAS, 432, 2433 [NASA ADS] [CrossRef] [Google Scholar]
 Higuchi, Y., & Shirasaki, M. 2016, MNRAS, 459, 2762 [NASA ADS] [CrossRef] [Google Scholar]
 Hildebrandt, H., Viola, M., Heymans, C., et al. 2017, MNRAS, 465, 1454 [Google Scholar]
 Hu, W., & Sawicki, I. 2007, Phys. Rev. D, 76, 064004 [CrossRef] [Google Scholar]
 Hu, B., Raveri, M., Rizzato, M., & Silvestri, A. 2016, MNRAS, 459, 3880 [NASA ADS] [CrossRef] [Google Scholar]
 Jain, B., & Seljak, U. 1997, ApJ, 484, 560 [NASA ADS] [CrossRef] [Google Scholar]
 Jain, B., Vikram, V., & Sakstein, J. 2013, ApJ, 779, 39 [NASA ADS] [CrossRef] [Google Scholar]
 Jarvis, M., Bernstein, G. M., Fischer, P., et al. 2003, AJ, 125, 1014 [NASA ADS] [CrossRef] [Google Scholar]
 Jarvis, M., Bernstein, G., & Jain, B. 2004, MNRAS, 352, 338 [Google Scholar]
 Kacprzak, T., Kirk, D., Friedrich, O., et al. 2016, MNRAS, 463, 3653 [NASA ADS] [CrossRef] [Google Scholar]
 Kaiser, N., & Squires, G. 1993, ApJ, 404, 441 [NASA ADS] [CrossRef] [Google Scholar]
 Kaiser, N., Squires, G., Fahlman, G., & Woods, D. 1994, in Clusters of Galaxies, Proc. of the XIVth Moriond Astrophysics Meeting, 269 [Google Scholar]
 Khoury, J., & Weltman, A. 2004, Phys. Rev. D, 69, 044026 [CrossRef] [Google Scholar]
 Kilbinger, M. 2015, Rep. Prog. Phys., 78, 086901 [Google Scholar]
 Kilbinger, M., & Schneider, P. 2005, A&A, 442, 69 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Kilbinger, M., Fu, L., Heymans, C., et al. 2013, MNRAS, 430, 2200 [NASA ADS] [CrossRef] [Google Scholar]
 Koyama, K. 2016, Rep. Prog. Phys., 79, 046902 [NASA ADS] [CrossRef] [Google Scholar]
 Kratochvil, J. M., Haiman, Z., & May, M. 2010, Phys. Rev. D, 81, 043519 [NASA ADS] [CrossRef] [Google Scholar]
 Kruse, G., & Schneider, P. 1999, MNRAS, 302, 821 [NASA ADS] [CrossRef] [Google Scholar]
 Kruse, G., & Schneider, P. 2000, MNRAS, 318, 321 [NASA ADS] [CrossRef] [Google Scholar]
 Laureijs, R., Amiaux, J., Arduini, S., et al. 2011, ArXiv eprints [arXiv:1110.3193] [Google Scholar]
 Leonard, A., Pires, S., & Starck, J.L. 2012, MNRAS, 423, 3405 [NASA ADS] [CrossRef] [Google Scholar]
 Lewis, A., Challinor, A., & Lasenby, A. 2000, ApJ, 538, 473 [Google Scholar]
 Lin, C.A., & Kilbinger, M. 2015, A&A, 576, A24 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Liu, J., & Haiman, Z. 2016, Phys. Rev. D, 94, 043533 [NASA ADS] [CrossRef] [Google Scholar]
 Liu, X., Li, B., Zhao, G.B., et al. 2016, Phys. Rev. Lett., 117, 051101 [NASA ADS] [CrossRef] [Google Scholar]
 Lombriser, L. 2014, Annal. Phys., 526, 259 [NASA ADS] [CrossRef] [Google Scholar]
 Lombriser, L., & Lima, N. A. 2017, Phys. Lett. B, 765, 382 [NASA ADS] [CrossRef] [Google Scholar]
 Lombriser, L., & Taylor, A. 2016, J. Cosmol. Astropart. Phys, 3, 031 [Google Scholar]
 Lombriser, L., Schmidt, F., Baldauf, T., et al. 2012, Phys. Rev. D, 85, 102001 [NASA ADS] [CrossRef] [Google Scholar]
 LSST Science Collaboration (Abell, P. A., et al.) 2009, ArXiv eprints [arXiv:0912.0201] [Google Scholar]
 Marian, L., Smith, R. E., Hilbert, S., & Schneider, P. 2012, MNRAS, 423, 1711 [NASA ADS] [CrossRef] [Google Scholar]
 Martinelli, M., Melchiorri, A., & Amendola, L. 2009, Phys. Rev. D, 79, 123516 [CrossRef] [Google Scholar]
 Martinet, N., Bartlett, J. G., Kiessling, A., & Sartoris, B. 2015, A&A, 581, A101 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Martinet, N., Schneider, P., Hildebrandt, H., et al. 2018, MNRAS, 474, 712 [NASA ADS] [CrossRef] [Google Scholar]
 Maturi, M., Fedeli, C., & Moscardini, L. 2011, MNRAS, 416, 2527 [NASA ADS] [CrossRef] [Google Scholar]
 Motohashi, H., Starobinsky, A. A., & Yokoyama, J. 2013, Phys. Rev. Lett., 110, 121302 [NASA ADS] [CrossRef] [Google Scholar]
 Navarro, J. F., Frenk, C. S., & White, S. D. M. 1996, ApJ, 462, 563 [NASA ADS] [CrossRef] [Google Scholar]
 Peel, A., Lin, C.A., Lanusse, F., et al. 2017, A&A, 599, A79 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Petri, A., Haiman, Z., & May, M. 2016, Phys. Rev. D, 93, 063524 [NASA ADS] [CrossRef] [Google Scholar]
 Petri, A., Haiman, Z., & May, M. 2017, Phys. Rev. D, 95, 123503 [NASA ADS] [CrossRef] [Google Scholar]
 Pettorino, V. 2013, Phys. Rev. D, 88, 063519 [NASA ADS] [CrossRef] [Google Scholar]
 Pettorino, V., & Baccigalupi, C. 2008, Phys. Rev. D, 77, 103003 [NASA ADS] [CrossRef] [Google Scholar]
 Pires, S., Starck, J.L., Amara, A., Réfrégier, A., & Teyssier, R. 2009, A&A, 505, 969 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Pires, S., Leonard, A., & Starck, J.L. 2012, MNRAS, 423, 983 [NASA ADS] [CrossRef] [Google Scholar]
 Planck Collaboration XIII. 2016, A&A, 594, A13 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Planck Collaboration XIV. 2016, A&A, 594, A14 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Puchwein, E., Baldi, M., & Springel, V. 2013, MNRAS, 436, 348 [NASA ADS] [CrossRef] [Google Scholar]
 Roncarelli, M., Moscardini, L., Borgani, S., & Dolag, K. 2007, MNRAS, 378, 1259 [NASA ADS] [CrossRef] [Google Scholar]
 Sakstein, J. 2015, Phys. Rev. Lett., 115, 201101 [NASA ADS] [CrossRef] [Google Scholar]
 Sakstein, J., & Jain, B. 2017, Phys. Rev. Lett., 119, 251303 [NASA ADS] [CrossRef] [Google Scholar]
 Schäfer, B. M., Heisenberg, L., Kalovidouris, A. F., & Bacon, D. J. 2012, MNRAS, 420, 455 [NASA ADS] [CrossRef] [Google Scholar]
 Schirmer, M., Erben, T., Hetterscheidt, M., & Schneider, P. 2007, A&A, 462, 875 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Schneider, P. 1996, MNRAS, 283, 837 [NASA ADS] [CrossRef] [Google Scholar]
 Schneider, P., van Waerbeke, L., Jain, B., & Kruse, G. 1998, MNRAS, 296, 873 [NASA ADS] [CrossRef] [Google Scholar]
 Schrabback, T., Hartlap, J., Joachimi, B., et al. 2010, A&A, 516, A63 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Shan, H. Y., Kneib, J.P., Comparat, J., et al. 2014, MNRAS, 442, 2534 [NASA ADS] [CrossRef] [Google Scholar]
 Shan, H., Liu, X., Hildebrandt, H., et al. 2018, MNRAS, 474, 1116 [NASA ADS] [CrossRef] [Google Scholar]
 Shirasaki, M., Nishimichi, T., Li, B., & Higuchi, Y. 2017, MNRAS, 466, 2402 [NASA ADS] [CrossRef] [Google Scholar]
 Smith T. L. 2009, ArXiv eprints [arXiv:0907.4829] [Google Scholar]
 Springel, V. 2005, MNRAS, 364, 1105 [NASA ADS] [CrossRef] [Google Scholar]
 Starck, J.L., Fadili, J., & Murtagh, F. 2007, IEEE Trans. Image Process., 16, 297 [Google Scholar]
 van Waerbeke, L. 1998, A&A, 334, 1 [NASA ADS] [Google Scholar]
 van Waerbeke, L. 2000, MNRAS, 313, 524 [NASA ADS] [CrossRef] [Google Scholar]
 van Waerbeke, L., Mellier, Y., Radovich, M., et al. 2001, A&A, 374, 757 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Vicinanza, M., Cardone, V. F., Maoli, R., Scaramella, R., & Er, X. 2018, Phys. Rev. D, 97, 023519 [NASA ADS] [CrossRef] [Google Scholar]
 Viel, M., Haehnelt, M. G., & Springel, V. 2010, J. Cosmol. Astropart. Phys., 6, 015 [NASA ADS] [CrossRef] [Google Scholar]
 Vikram, V., Sakstein, J., Davis, C., & Neil, A. 2018, Phys. Rev. D, 97, 104055 [NASA ADS] [CrossRef] [Google Scholar]
 VillaescusaNavarro, F., Banerjee, A., Dalal, N., et al. 2018, ApJ, 861, 53 [NASA ADS] [CrossRef] [Google Scholar]
 Wright, B. S., Winther, H. A., & Koyama, K. 2017, J. Cosmol. Astropart. Phys., 10, 054 [NASA ADS] [CrossRef] [Google Scholar]
 Yang, X., Kratochvil, J. M., Wang, S., et al. 2011, Phys. Rev. D, 84, 043529 [NASA ADS] [CrossRef] [Google Scholar]
 Zennaro, M., Bel, J., VillaescusaNavarro, F., et al. 2017, MNRAS, 466, 3244 [NASA ADS] [CrossRef] [Google Scholar]
 Zhang, T.J., Pen, U.L., Zhang, P., & Dubinski, J. 2003, ApJ, 598, 818 [NASA ADS] [CrossRef] [Google Scholar]
Appendix A: Impact of galaxy shape noise
We are not able to measure the true shear field in practice; instead we measure galaxy ellipticities ϵ, which in the absence of systematic errors, represent an unbiased measurement of the reduced shear field g(θ) = γ(θ)/[1 − κ(θ)] when averaged over many galaxies. A primary source of noise in weak lensing analyses is therefore due to the noncircular intrinsic shapes of galaxies. In this Appendix, we study the effect of galaxy shape noise on our results by reproducing Figs. 9 and 10 with such representative noise included.
Given that we work with convergence maps directly from our simulations, we generate noisy versions of these maps by simply adding a noise term: κ_{N}(θ) = κ(θ) + N(θ). The noise N(θ) is modelled as a Gaussian random field with zero mean and variance given by (van Waerbeke 2000)
where σ_{ϵ} ^{2} = σ_{ϵ1 } ^{2} + σ_{ϵ2 } ^{2} is the total galaxy ellipticity variance, n_{gal} is the number density of galaxies, and A_{pix} is the pixel area. Upcoming weak lensing surveys should achieve n_{gal} ≈ 30 arcmin^{−2} and σ_{ϵ} ≈ 0.4, corresponding to a standard deviation of about 0.28 per component of ellipticity.
Our methodology produces lensing maps for sources at fixed redshift, namely z_{s} = 0.5, 1.0, 1.5, and 2.0. In real observations, of course, galaxies are distributed in redshift, and the effective number density of a survey reflects this distribution. We can imagine that our lines of sight come from a survey in which the galaxies have been divided into four bins such that each contains the same number of galaxies and that our source planes lie at the centres. Assuming an overall galaxy number density of 30 arcmin^{−2}, this gives n_{gal} = 7.5 arcmin^{−2} per bin and σ_{pix} = 0.7.
Histograms corresponding to those in Fig. 9 of the noisy observables under the above conditions are shown in Fig. A.1. In terms of the variance (upper left), the distributions have all shifted towards larger mean values compared to their noiseless versions. They also now exhibit a prominent leftward skewness, in contrast to the approximately Gaussian distributions for the noiseless case. The skewness and kurtosis distributions (upper right and lower left, respectively) deviate significantly from Gaussians, the kurtosis especially now exhibiting strong asymmetry. Peak counts above 3σ (lower right) for the four models now have fully overlapping distributions, converging to a central value of around one third of that found in the noiseless case. Overall, the effect of shape noise is to wash out the distinctions between the models seen previously at this scale.
Fig. A.1.
Histograms of aperture mass statistics corresponding to the distributions in Fig. 9 with galaxy shape noise included. The filtering scale is again ϑ = 0.586′, and sources are at redshift z_{s} = 2.0. The profiles of the noisy distributions have all changed compared to their noiseless counterparts. Overall, the f_{5}(R) distributions are much more difficult to distinguish from ΛCDM, as well as from each other, than in the noiseless case at this scale. In particular, this level of noise washes out the discrimination power of peak counts above 3σ. 
Discrimination efficiencies ℰ corresponding to Fig. 10 for the noisy maps of f_{5}(R) are shown in Fig. A.2. As the overlap of the histograms suggest for ϑ_{2} = 0.586′, the skewness and peak counts lose all discrimination power not only at this filtering scale, but at all others as well. The noise dominates the signal for these statistics such that neither is capable of distinguishing f_{5}(R) from ΛCDM any longer. In contrast, the variance and kurtosis retain some discrimination power (although nowhere exceeding 55%), in particular for filter scales ≥ϑ_{3}. The noisy variance plot is qualitatively similar to its counterpart in Fig. 10 except that each bar has been scaled towards the centre. The kurtosis exhibits a somewhat different behaviour in that while ϑ_{1} and ϑ_{2} give lower ℰ values, scales ≥ϑ_{3} actually gain in discrimination power compared to the noiseless case.
Fig. A.2.
Reproduction of Fig. 10 including galaxy shape noise for the f_{5}(R) model without neutrinos. The effect of such noise is to reduce the discrimination efficiency from ΛCDM across all aperture scales, source redshifts, and statistics shown (with the exception of certain kurtosis scales as discussed in the text). In particular, the noise dominates the measured skewness and peak count distributions at each scale so that both statistics essentially lose all previous discrimination power between f_{5}(R) and ΛCDM, although this would likely be mitigated by a denoising procedure. 
What is not directly apparent from these figures is the way that the discrimination power changes as a function of noise level. We have verified that lower noise values, for example σ_{pix} = 0.2 and 0.4, produce discrimination efficiencies at intermediate values between the noiseless case and σ_{pix} = 0.7. The tendency of ℰ to decrease monotonically with increasing noise holds for scales ϑ_{1} and ϑ_{2} across all statistics we have considered. On the other hand, for scales ≥ϑ_{3} and for the nonGaussian statistics, ℰ can achieve a maximum with increasing noise before eventually decreasing again to zero for large enough σ_{pix}. What is seen in Fig. A.2 is that the noise level is not high enough to fully suppress ℰ using kurtosis as it is for skewness and peak counts. However, both these latter statistics have scales at which the maximum discrimination efficiency is largest with some nonzero noise level σ_{pix} < 0.7. To summarise, some amount of Gaussian noise can actually improve the distinction between models on some scales, but it depends sensitively on the initial shapes and separations of the noiseless histograms.
We have considered here one important source of noise present in any weaklensing analysis, but many others exist as well in real data. We leave a more complete and realistic treatment of noise for future work. We also recall that we have not attempted to denoise the maps before computing statistics, a step which would likely compensate to some extent the loss of discrimination power seen in Figs. A.1 and A.2. Finally, we do not show noisy results for f_{5}(R) with m_{ν} = 0.15 eV that would correspond to Fig. 11, since the discrimination efficiency does not exceed 10% for any statistic at any scale or source redshift.
All Tables
Subset of the DUSTGRAINpathfinder simulations considered in this work with their specific parameters.
Maximum discrimination efficiency ℰ_{max} with respect to ΛCDM attained for each MG model according to M_{ap} statistic, filtering scale ϑ, and source redshift z_{s}.
All Figures
Fig. 1.
Ratio of the matter power spectra P(k) of different f(R) models, including neutrinos of different mass, with respect to ΛCDM. The different f(R) models, labelled as f_{4}(R), f_{5}(R), and f_{6}(R), are described in the text. f_{6}(R) with any neutrino mass and f_{5}(R) with m_{ν} = 0.15 eV all have a matter power spectrum within 10% of the ΛCDM one. In the following we illustrate how different statistics can be used (or not) to discriminate these models, showing complete results in particular for f_{5}(R) for illustration. 

In the text 
Fig. 2.
Example convergence fields (top row) extracted from the three f_{5}(R) simulations with neutrino mass sums of 0 eV, 0.1 eV, and 0.15 eV. Each map shows the same line of sight for sources at redshift z_{s} = 2.0. Essentially the same structures are seen across all three maps due to identical initial conditions and the common field of view. Subtle differences, however, due to the different model evolutions can also be detected in the peak positions and their morphologies. Aperture mass residuals (bottom row) with respect to the ΛCDM map (not shown) at filtering scale ϑ_{3} = 1.17′ are shown for each corresponding map above. The compensated starlet filter has the effect of highlighting features with an approximate size of ϑ_{3}. Our aim is to distinguish between these models using differences in the statistics of such aperture mass maps computed at different scales. 

In the text 
Fig. 3.
Toy example of discrimination efficiencies computed using the FDR formalism. Each (Gaussian) distribution represents the histogram measured for some observable given the model. The amount of overlapping area between any two distributions indicates how distinguishable the two models are according to this observable: the more the curves overlap, the more difficult it is to distinguish the corresponding models, and the higher the discrimination efficiency parameter is. In the example shown, considering model 1 as the hypothesis against which the others are tested, models 2, 3, and 4 have computed discrimination efficiencies of 77%, 37%, and 100%. 

In the text 
Fig. 4.
Convergence power spectra for ΛCDM and various MG models. Left panel: f(R) models with f_{R0} = ( − 10^{−4}, −10^{−5}, −10^{−6}) and without neutrinos. Compared to ΛCDM, each MG model exhibits more lensing power, where the difference increases with the magnitude of f_{R0}. The discrepancy is most pronounced on smaller scales for f_{5}(R) and f_{6}(R) but shifts towards larger scales for f_{4}(R). Right panel: f_{5}(R) models with varying neutrino mass sums. Neutrinos have the effect of suppressing structure growth across all scales, which is seen in the damping of the f_{5}(R) curve with increasing m_{ν}. In particular, the combination of parameters (f_{R0}, m_{ν}) = (10^{−5}, 0.15 eV) reproduces ΛCDM in terms of secondorder statistics to within 8% over three decades in scale. The convergent behaviour of the curves near ℓ = 10^{5} reflects the resolution limit of our maps and particle noise that affects all of the models. 

In the text 
Fig. 5.
Aperture mass variance as a function of filtering scale for ΛCDM and MG models. Data points represent the mean over 256 realisations of each simulated cosmology and are plotted at scales (ϑ_{1}, …, ϑ_{7}) = (0.293′, 0.586′, 1.17′, 2.34′, 4.69′, 9.34′, 18.8′). Shaded areas indicate the one standard deviation statistical uncertainty around ΛCDM. Left panel: three f(R) models without neutrinos. Differences in MG variance relative to ΛCDM match those seen for the power spectra in Fig. 4. In particular, the f_{4}(R) variance at large scales exceeds that of ΛCDM by around 30%, while the maximum deviations of f_{5}(R) and f_{6}(R) occur at small scales at a level of about 16% and 2%, respectively. Right panel: three f_{5}(R) models with different sums of neutrino masses m_{ν}. Trends in the curves again mirror their power spectra, and the effect of neutrinos damping structure growth is seen in the modulation of f_{5}(R) variance with m_{ν} = 0 eV towards the curve for m_{ν} = 0.15 eV. 

In the text 
Fig. 6.
Higherorder aperture mass statistics for f(R) models with f_{R0} = −10^{−5} and a varying sum of neutrino masses m_{ν}. Data points represent the mean over 256 realisations of each simulated cosmology and are plotted at scales (ϑ_{1}, …, ϑ_{7}) = (0.293′, 0.586′, 1.17′, 2.34′, 4.69′, 9.34′, 18.8′). Left panel: aperture mass skewness computed as the third order moment . Higher neutrino mass leads to a nearly uniform reduction in skewness across all scales relative to ΛCDM. The maximum discrepancy between each model and ΛCDM is also larger than for the variance. Right panel: aperture mass kurtosis computed as the fourthorder moment . Differences from ΛCDM are comparable to the skewness, the main distinction being that the kurtosis of the m_{ν} > 0 models approaches the m_{ν} = 0 value at larger scales. 

In the text 
Fig. 7.
Aperture mass peak counts for f(R) with f_{R0} = −10^{−5} and a varying sum of neutrino masses m_{ν}. Data points represent the mean over 256 realisations of each simulated cosmology and are plotted at scales (ϑ_{1}, …, ϑ_{7} = 0.293′, 0.586′, 1.17′, 2.34′, 4.69′, 9.34′, 18.8′). The abundance of peaks above a 3σ threshold is shown in the left panel, and above a 5σ threshold in the right panel. An order of magnitude more peaks are detected at each scale with the lower threshold, although deviations from ΛCDM, as well as among the different m_{ν} cases of f_{5}(R), are more pronounced with the higher threshold. This indicates that differences between models at this redshift are mostly contained in the highest amplitude peaks. Compared to the M_{ap} moments, peak counts interestingly do not very well distinguish between models with different neutrino mass, but they do distinguish modified from standard gravity, at least at certain scales. 

In the text 
Fig. 8.
Correlation matrices between aperture mass moments and peak count probes for two models using maps at z_{s} = 2.0 and filtering scale ϑ_{2} = 0.586′. Aperture mass moments are strongly positively correlated with each other, while peak counts show slight to moderate negative correlation between the two threshold values. The amplitude of cross correlation between moments and peaks depends on redshift and scale. 

In the text 
Fig. 9.
Histograms of aperture mass statistics for ΛCDM and f_{5}(R) models with varying neutrino mass m_{ν}. Each histogram, with area normalised to one, comprises 256 samples of the statistic computed at a filtering scale of ϑ = 0.586′ and for sources at redshift z_{s} = 2.0. Solid lines represent the result of smoothing the distribution by KDE (cf. Sect. 5.3). The central positions of the histograms correspond to the mean values of each statistic as seen in Figs. 5–7. Considering the most degenerate case with ΛCDM, f_{5}(R) with m_{ν} = 0.15 eV, second and higherorder moments of M_{ap} do not appear able to distinguish the models. Peak counts, on the other hand, shown here for a 3σ threshold, cleanly separate the two distributions. Moreover, it is interesting that peak counts separate all f_{5}(R) cases from ΛCDM by approximately the same amount, independent of m_{ν}. 

In the text 
Fig. 10.
Discrimination efficiency with respect to ΛCDM of the MG model f_{5}(R) with m_{ν} = 0 eV as a function of statistic, aperture scale, and source redshift. In each plot, the wedges marked by red lines indicate M_{ap} filtering by the aperture ϑ appearing at the outer edge. Numbered apertures correspond to angular sizes of (ϑ_{1}, …, ϑ_{7}) = (0.293′, 0.586′, 1.17′, 2.34′, 4.69′, 9.34′, 18.8′). The radial length of a bar, shaded according to source redshift, represents the discrimination efficiency (in percent) of the statistic at the filtering scale associated to its wedge. The variance of this model suffices to distinguish it from ΛCDM at greater than 80% at scales ϑ_{2} and ϑ_{3} for sources at z_{s} ≥ 1.5. Peak counts (above a 3σ detection threshold) achieve similar results, while kurtosis proves to be a poor discriminator, not exceeding 55% at any filter scale or redshift. Skewness shows comparably good discrimination power to the variance and peak counts only at scale ϑ_{2}. 

In the text 
Fig. 11.
Discrimination efficiency with respect to ΛCDM of the MG model f_{5}(R) with m_{ν} = 0.15 eV as a function of statistic, aperture scale, and source redshift. Plots are analogous to those in Fig. 10. Neutrinos significantly enhance the degeneracy of this model with ΛCDM at the twopoint level compared to the m_{ν} = 0 eV case. This can be seen in the variance plot (upper left), where the discrimination efficiency is less than 18% at all filter scales and redshifts. The higherorder moments show even less discrimination power. On the contrary, peak counts (lower right), again above a 3σ detection threshold, achieve 100% discrimination efficiency for z_{s} = 2.0 at ϑ_{2}, and above 92% for z_{s} ≥ 1.5 at ϑ_{3}. 

In the text 
Fig. A.1.
Histograms of aperture mass statistics corresponding to the distributions in Fig. 9 with galaxy shape noise included. The filtering scale is again ϑ = 0.586′, and sources are at redshift z_{s} = 2.0. The profiles of the noisy distributions have all changed compared to their noiseless counterparts. Overall, the f_{5}(R) distributions are much more difficult to distinguish from ΛCDM, as well as from each other, than in the noiseless case at this scale. In particular, this level of noise washes out the discrimination power of peak counts above 3σ. 

In the text 
Fig. A.2.
Reproduction of Fig. 10 including galaxy shape noise for the f_{5}(R) model without neutrinos. The effect of such noise is to reduce the discrimination efficiency from ΛCDM across all aperture scales, source redshifts, and statistics shown (with the exception of certain kurtosis scales as discussed in the text). In particular, the noise dominates the measured skewness and peak count distributions at each scale so that both statistics essentially lose all previous discrimination power between f_{5}(R) and ΛCDM, although this would likely be mitigated by a denoising procedure. 

In the text 
Current usage metrics show cumulative count of Article Views (fulltext article views including HTML views, PDF and ePub downloads, according to the available data) and Abstracts Views on Vision4Press platform.
Data correspond to usage on the plateform after 2015. The current usage metrics is available 4896 hours after online publication and is updated daily on week days.
Initial download of the metrics may take a while.