Breaking degeneracies in modified gravity with higher (than 2nd) order weak-lensing statistics

General relativity (GR) has been well tested up to solar system scales, but it is much less certain that standard gravity remains an accurate description on the largest, that is, cosmological, scales. Many extensions to GR have been studied that are not yet ruled out by the data, including by that of the recent direct gravitational wave detections. Degeneracies among the standard model ($\Lambda$CDM) and modified gravity (MG) models, as well as among different MG parameters, must be addressed in order to best exploit information from current and future surveys and to unveil the nature of dark energy. We propose various higher-order statistics in the weak-lensing signal as a new set of observables able to break degeneracies between massive neutrinos and MG parameters. We have tested our methodology on so-called $f(R)$ models, which constitute a class of viable models that can explain the accelerated universal expansion by a modification of the fundamental gravitational interaction. We have explored a range of these models that still fit current observations at the background and linear level, and we show using numerical simulations that certain models which include massive neutrinos are able to mimic $\Lambda$CDM in terms of the 3D power spectrum of matter density fluctuations. We find that depending on the redshift and angular scale of observation, non-Gaussian information accessed by higher-order weak-lensing statistics can be used to break the degeneracy between $f(R)$ models and $\Lambda$CDM. In particular, peak counts computed in aperture mass maps outperform third- and fourth-order moments.


Introduction
Current observations of the cosmic microwave background (CMB) anisotropies, baryon acoustic oscillations, and supernova luminosity distances all support a phase of accelerated cosmic expansion during the present epoch (e.g. Planck Collaboration et al. 2016a,b; Anderson et al. 2014; Betoule et al. 2014). The cause of such evolution is still unknown and should amount to approximately 70% of the total energy density of the Universe. Whether it is simply a fundamental constant of nature, a physical fluid, or instead a modification of general relativity (GR) itself at cosmological scales, the accelerated expansion remains one of the biggest puzzles in modern cosmology. The simplest cosmological model consistent with the data assumes a constant contribution (Λ) and a cold dark matter component (CDM). The cosmic expansion of a ΛCDM model, however, can be mimicked by a vast range of models, which are also in agreement with current observations. These scenarios typically assume either a fluid, i.e. a dark energy (DE) component, or a modification of general relativity at large scales. Even when considering linear perturbations and the impact on the CMB, dark energy and modified gravity (MG) models are still viable (Planck Collaboration et al. 2016b) and may be advocated to solve tensions currently present between Planck data and late-time probes, such as weak lensing (DES Collaboration et al. 2017) and, in smaller measure, redshift space distortions (Planck Collaboration et al. 2016b).
The recent direct detection of gravitational waves has contributed to the exclusion of a range of MG models that include higher-order derivatives in the action (Creminelli & Vernizzi 2017; Baker et al. 2017; María Ezquiaga & Zumalacárregui 2017; Sakstein & Jain 2017). Further constraints were also put on models beyond Horndeski, which make use of the Vainshtein mechanism to protect high-density regions in which GR is well tested (Dima & Vernizzi 2017), using measurements in Sakstein (2015). A very large range of models is still viable, however. These include, among many others, the ones discussed in Planck Collaboration et al. (2016b) and in particular scalar-tensor theories with a universal coupling, i.e. the f(R) models discussed in this paper. We refer to Crisostomi & Koyama (2018) (and to references above) for a general formulation of viable models which assume a universal coupling. Note also that non-universal couplings are still viable after GW constraints (Amendola 2000; Pettorino & Baccigalupi 2008; Pettorino 2013).
Distinguishing between ΛCDM and MG scenarios that mimic ΛCDM at the linear level, but which may differ at nonlinear scales, is the next challenge for future galaxy surveys. In this paper, we investigate whether various observables based on statistics of the weak-lensing signal can be used to discriminate among different theoretical models that are strongly degenerate in the matter power spectrum on linear and possibly also on nonlinear scales. In particular, we focus on the strong degeneracy between f(R) modified gravity and ΛCDM that occurs when the latter includes the effects of massive neutrinos (see Motohashi et al. 2013; He 2013; Baldi et al. 2014; Wright et al. 2017). Given such a degeneracy, we are motivated to explore non-Gaussian weak-lensing probes that access higher-order information than 2-point correlation functions or their associated power spectra. Our goal is to determine which statistics are most promising and how the discrimination efficiency between models depends on redshift and angular scale.
Within ΛCDM, there is evidence that weak lensing alone can break the degeneracy between σ_8 and Ω_m by a combination of higher-order (than second) convergence moments and shear tomography (Vicinanza et al. 2018). In terms of MG, it has been shown that non-Gaussian statistics, in particular weak-lensing peak counts, contain significant cosmological information and can constrain f(R) model parameters. In addition, Higuchi & Shirasaki (2016) and Shirasaki et al. (2017) have studied the effects of f(R) gravity on statistical properties of the weak-lensing field, including the convergence power spectrum and bispectrum, peak counts, and Minkowski functionals. Unlike these analyses, the matter power spectrum degeneracy in our case is facilitated (and the constraining power of WL observations is challenged) by the presence of massive neutrinos, a component known to play an important role during structure formation in the real universe.
The deflection of light coming from distant galaxies represents an important tool to indirectly map the projected matter density distribution along the line of sight. When light bundles travel from distant galaxies to an observer, they are continuously deflected by the inhomogeneities of the density field of the intervening non-linear structures. As a consequence, images of background galaxies appear slightly stretched and distorted, a phenomenon known as weak gravitational lensing. Considering that by definition the weak-lensing effect is small, it is necessary to average over a large number of background sources in order to quantify it. Future wide-field experiments like LSST (LSST Science Collaboration et al. 2009) and the ESA space mission Euclid 1 (Laureijs et al. 2011) will increase both the number and the redshift range of the background source population used to measure the weak-lensing signal.
Given vastly larger datasets and substantially improved image quality, next-generation surveys will thus permit the most stringent tests yet on a variety of different theoretical scenarios (Amendola et al. 2016). In these tests, weak lensing will be a primary tool in our effort to understand the dark universe, and in particular the nature of the late-time acceleration phase. In this work, we present the first high-order moment analysis of weak-lensing fields of non-standard MG cosmologies with massive neutrinos. Our aim is to quantify the capability of future galaxy surveys to distinguish non-standard cosmological features using weak lensing.
The paper is organized as follows. In Section 2 we describe the theoretical cosmologies used in this paper as a proof of concept of the methodology we propose. In Section 3 we briefly recall the basic weak-lensing definitions that will be needed in the analysis. In Section 4 we discuss the N-body simulations used for the analysis described in Section 5. Our results are presented and discussed in Section 6, and we draw our conclusions in Section 7.

1 https://www.euclid-ec.org


f(R) cosmology: the Hu-Sawicki model

Among the classes of theories that modify gravitational attraction at large scales in order to produce cosmic acceleration, a large class of (still viable) models is given by f(R) theories. In these cosmologies, the dependence of the action on the Ricci scalar R is modified with respect to GR through a generic function f of R:

S = ∫ d⁴x √(−g) [ (R + f(R)) / (2κ_G²) + L_m ],

where L_m is the matter Lagrangian and κ_G² ≡ 8πG. (We use a subscript G to avoid confusion with the weak-lensing convergence κ used later.) It is possible to show that these theories are a subclass of scalar-tensor theories, i.e. they include a universal fifth force, mediated by a scalar field, that adds to the metric interaction already present in GR. Since this force is universal, it affects baryons as well as dark matter: as a consequence, f(R) theories typically need some screening mechanism that restores GR in high-density regions, such as the solar system, where observations show no significant deviation from GR. In addition, the choice of f(R) should provide a cosmology that is close enough to ΛCDM in the high-redshift regime (or else it would affect the CMB in a way that would have already been observed) and provide cosmic acceleration at late times.
One of the few known functional forms of f(R) able to satisfy solar system constraints is the Hu & Sawicki (2007) model, in which

f(R) = −m² c₁ (R/m²)ⁿ / [c₂ (R/m²)ⁿ + 1],  with  m² ≡ κ_G² ρ̄₀ / 3,

where ρ̄₀ is the average density at present time, and c₁ and c₂ are dimensionless parameters. The sign of f(R) is chosen such that its second derivative f_RR is positive for R ≫ m², leading to stable solutions in the high-curvature regime (Hu & Sawicki 2007). It has been shown that for c₁/c₂ = 6 Ω_Λ/Ω_m this model mimics the background evolution of a ΛCDM model with a cosmological constant relative density Ω_Λ and a relative matter density Ω_m. In the Hu & Sawicki (2007) model, however, there is no cosmological constant, and cosmic acceleration is produced by the modification of gravity. Furthermore, the parameter c₂ is usually expressed in terms of f_R0 ≡ df/dR(z = 0). In these cosmologies, therefore, two additional parameters are present: n and f_R0. This set of functional forms is such that f(R)/m² transitions from zero (for R → 0) to a constant (for R ≫ m²). The sharpness of this transition increases with n, while f_R0 determines when the transition occurs. A detailed analysis of the impact of such cosmologies at different scales has been presented already in the proposal paper of Hu & Sawicki (2007), as well as in more recent papers (Koyama 2016; Lombriser 2014). In Hu & Sawicki (2007) (see their Fig. 9), bounds on the two parameters of the theory were derived from solar system to galaxy scales, assuming a galactic density ρ_g = 10⁻²⁴ g cm⁻³.

Fig. 1. Ratio of the matter power spectra P(k) of different f(R) models, including neutrinos of different mass, with respect to ΛCDM. The different f(R) models, labelled f₄(R), f₅(R), and f₆(R), are described in the text. f₆(R) with any neutrino mass and f₅(R) with m_ν = 0.15 eV all have a matter power spectrum within 10% of the ΛCDM one.
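To make the parameter relations above concrete, the following sketch recovers c₁ and c₂ for n = 1 from a target f_R0, using the standard Hu & Sawicki background relations c₁/c₂ = 6 Ω_Λ/Ω_m and R̄₀/m² = 3(1 + 4 Ω_Λ/Ω_m). This is an illustrative calculation, not code from this paper; the density parameter values are assumptions chosen only for the example.

```python
import numpy as np

# Assumed (illustrative) density parameters, not taken from the paper's tables.
OMEGA_M, OMEGA_L = 0.3134, 0.6866

def hu_sawicki_params(f_R0, n=1):
    """Recover (c1, c2) from the present-day scalaron value f_R0 < 0."""
    x0 = 3.0 * (1.0 + 4.0 * OMEGA_L / OMEGA_M)     # R_0 / m^2 today
    c1_over_c2 = 6.0 * OMEGA_L / OMEGA_M           # matches a LCDM background
    c1_over_c2sq = -f_R0 * x0 ** (n + 1) / n       # high-curvature limit of f_R
    c2 = c1_over_c2 / c1_over_c2sq
    return c1_over_c2 * c2, c2

def f_R(R_over_m2, c1, c2, n=1):
    """Exact derivative df/dR of the Hu-Sawicki functional form."""
    x = R_over_m2
    return -c1 * n * x ** (n - 1) / (c2 * x ** n + 1.0) ** 2

c1, c2 = hu_sawicki_params(-1e-5)                  # the f5(R) model
x0 = 3.0 * (1.0 + 4.0 * OMEGA_L / OMEGA_M)
print(f_R(x0, c1, c2))                             # close to -1e-5 by construction
```

Evaluating the exact f_R at the present background curvature recovers the input f_R0 to better than 0.1%, confirming that the high-curvature approximation used to fix c₁/c₂² is adequate for |f_R0| ≪ 1.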
In the following we illustrate how different statistics can be used (or not) to discriminate these models, showing complete results in particular for f₅(R) for illustration. Hu & Sawicki (2007) models have been shown to satisfy background observations (Martinelli et al. 2009) and have been tested against some cosmological constraints (Lombriser et al. 2012; Hu et al. 2016). (See Koyama (2016) and Lombriser (2014) for an overview and a detailed list of constraints.) CMB and large-scale structure provide relatively weak constraints, giving |f_R0| < 10⁻² to 10⁻⁴. Solar system constraints typically require |f_R0| < 10⁻⁴ to 10⁻⁶, depending on environmental assumptions about the Milky Way, while clusters give the slightly stronger bound of |f_R0| < 10⁻⁵. Strong lenses (Smith 2009) or dwarf galaxies and Cepheids may reach |f_R0| < 10⁻⁶ to 10⁻⁷ (Jain et al. 2013; Vikram et al. 2014), with bounds that depend, however, on several assumptions and approximations, e.g. the shape of density profiles. Hu & Sawicki (2007) cosmologies automatically satisfy constraints recently derived by Sakstein (2015) for any choice of the parameters.
In the following we fix n = 1 and consider different values of f_R0, labelled as follows: f₄(R) for f_R0 = −1 × 10⁻⁴, f₅(R) for f_R0 = −1 × 10⁻⁵, and f₆(R) for f_R0 = −1 × 10⁻⁶.
This choice of range allows us to check typical values used in the literature for f (R) models. A model like f 4 (R) may be at the more extreme limit of solar system observations, possibly requiring further screening at those scales, whereas the other models may instead be more in tension with strong lenses, dwarf galaxies and Cepheids, depending on assumptions in deriving the constraints. Assuming that a screening mechanism is in place, the values chosen here may still be viable at cosmological scales, where much weaker bounds are in place.
The ratios of the power spectra for these models with respect to ΛCDM are shown in Fig. 1. As we see from the figure, the inclusion of neutrinos suppresses the matter power spectrum on scales smaller than their free-streaming length. This suppression is degenerate with the value of f_R0, which also introduces a transition scale: the larger |f_R0|, the more the linear perturbations grow below the Compton wavelength λ_{f_R} and the more the power spectrum is enhanced. The connection between λ_{f_R} and f_R0 can be seen directly by defining

λ_{f_R} = m_{f_R}⁻¹,  with  m²_{f_R} ≡ ∂²V_eff/∂f_R².

In this expression, the effective potential V_eff is defined in such a way that the equation of motion for f_R can be written as

□f_R = ∂V_eff(f_R, ρ)/∂f_R,

where ρ is the total energy density. Values of λ_{f_R} large enough to be in the non-linear regime may therefore lead to effects which are observable in structure formation. Models such as f₄(R) without massive neutrinos would lead to matter power spectra that are strongly amplified with respect to ΛCDM (solid red curve in Fig. 1). On the other hand, the inclusion of massive neutrinos with mass as large as 0.3 eV (crossed red line) would compensate for such a choice of f(R), bringing it back to within 10% of ΛCDM even down to the smallest scales (see Baldi et al. 2014, for an early investigation of this degeneracy in the non-linear regime).
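The size of the neutrino side of this degeneracy can be estimated with the standard linear-theory rule of thumb ΔP/P ≈ −8 Ω_ν/Ω_m for the small-scale suppression, together with Ω_ν h² = Σm_ν / 93.14 eV. The sketch below is a back-of-envelope estimate, not the paper's computation; the values of Ω_m and h are illustrative assumptions.

```python
def neutrino_suppression(m_nu_ev, omega_m=0.3134, h=0.6774):
    """Rule-of-thumb linear suppression of the small-scale matter power
    spectrum from massive neutrinos: Delta P / P ~ -8 * Omega_nu / Omega_m.
    omega_m and h are illustrative assumptions, not the paper's values."""
    omega_nu = m_nu_ev / (93.14 * h ** 2)   # standard Omega_nu h^2 relation
    return -8.0 * omega_nu / omega_m

print(neutrino_suppression(0.15))  # roughly -0.09, i.e. a ~9% suppression
```

A ~9% suppression for m_ν = 0.15 eV is indeed of the same order as the ~10% enhancement produced by f₅(R), which is the qualitative origin of the degeneracy shown in Fig. 1.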
The degeneracy between f_R0 and neutrino masses means that certain combinations of the two yield matter power spectra within about 10% of ΛCDM, making it difficult to discriminate between f(R) and a cosmological constant, or between different f(R) models, purely on the basis of second-order statistics. In the following, we investigate whether higher-order weak-lensing observables can help break such degeneracies. Although we have tested the results for different choices of (f_R0, m_ν), we show results for the f₅(R) case, which is intermediate among the models shown in the figure and available in simulations.

Weak lensing
Weak gravitational lensing describes small distortions in the observed shapes of distant galaxies induced by the gravitational fields of massive structures in the universe along the line of sight. Significant information about the distribution and masses of cosmic structures is imprinted on galaxy images as the statistical alignment of their observed ellipticities. This signal has been used successfully in the data analysis of numerous galaxy surveys to constrain cosmological parameters (DES Collaboration et al. 2017) and is potentially a key probe to test gravity and models beyond ΛCDM in future weak lensing surveys such as Euclid (Amendola et al. 2016). For this purpose, the combination of weak lensing with other probes, such as galaxy clustering (Planck Collaboration et al. 2016b), helps to break degeneracies in measuring the gravitational potentials. On the other hand, it is worth investigating whether one can break degeneracies in parameter space relying on weak lensing only, independently of other late-time probes. This is an important test that can help maximize the information we can extract from the large amount of data that is going to be available for this probe. Furthermore, it can help identify whether tensions present in the data (Planck Collaboration et al. 2016b) between weak lensing and the cosmic microwave background point to a signature of new physics or rather to systematics.
The basic weak lensing quantities are the shear γ(θ) and convergence κ(θ) fields. Shear quantifies the anisotropic distortion of source galaxy images due to the tidal gravitational fields of foreground structures. The shear measured at some position θ on the sky depends on the amplitude of density fluctuations along the line of sight to the source galaxy, as well as on the relative distances between the observer, deflectors, and source. Convergence, which is derivable from the shear up to a constant (Kaiser & Squires 1993), reflects isotropic changes in the observed shapes of galaxies.
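The statement that the convergence is derivable from the shear up to a constant (Kaiser & Squires 1993) can be illustrated with the Fourier-space Kaiser–Squires inversion. The following is a minimal sketch on a periodic grid (not the paper's map-making code): the mapping kernel has unit modulus everywhere except the DC mode, so a zero-mean κ is recovered exactly while the map mean, the mass-sheet term, is lost.

```python
import numpy as np

def ks_kernel(n):
    # Fourier-space Kaiser-Squires kernel D(l) = (l1^2 - l2^2 + 2i l1 l2) / |l|^2
    l1 = np.fft.fftfreq(n)[np.newaxis, :]
    l2 = np.fft.fftfreq(n)[:, np.newaxis]
    l_sq = l1 ** 2 + l2 ** 2
    l_sq[0, 0] = 1.0            # avoid 0/0 at the DC mode (mass-sheet term)
    return (l1 ** 2 - l2 ** 2 + 2j * l1 * l2) / l_sq

def kappa_to_gamma(kappa):
    # forward mapping: complex shear gamma1 + i*gamma2 from the convergence
    return np.fft.ifft2(ks_kernel(kappa.shape[0]) * np.fft.fft2(kappa))

def gamma_to_kappa(gamma):
    # inverse mapping; the map mean (DC mode) is unrecoverable
    return np.fft.ifft2(np.conj(ks_kernel(gamma.shape[0])) * np.fft.fft2(gamma)).real

rng = np.random.default_rng(0)
kappa = rng.normal(size=(64, 64))
kappa -= kappa.mean()            # remove the unrecoverable mean beforehand
kappa_rec = gamma_to_kappa(kappa_to_gamma(kappa))
```

Since |D(ℓ)|² = 1 for all non-zero modes, the round trip κ → γ → κ is exact apart from the mean, which is precisely the mass-sheet degeneracy discussed below for the aperture mass.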
The connection with the cosmic density fluctuations is straightforward with κ, as it is a scalar quantity and can be interpreted as the projected total matter density along the line of sight:

κ(θ) = (3 H₀² Ω_m) / (2c²) ∫₀^{χ_s} dχ [ f_K(χ) f_K(χ_s − χ) / f_K(χ_s) ] δ(f_K(χ)θ, χ) / a(χ).

In this expression, H₀ is the present-day Hubble parameter, c is the speed of light, a is the universal scale factor, and δ is the 3D density contrast defined as δ = (ρ − ρ̄)/ρ̄, where ρ̄ is the spatially averaged matter density. The radial coordinate χ is comoving, and the geometrical function f_K determines the comoving angular distance, whose form depends on the (constant) curvature of 3D space. We refer to e.g. Kilbinger (2014) for a recent review of weak lensing cosmology that includes an introduction to the formalism. We deal directly with κ maps in this work as derived from our simulated cosmologies (cf. Section 4.2).
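The geometric part of this projection kernel is easy to evaluate numerically. The sketch below assumes a flat universe (f_K(χ) = χ) and illustrative cosmological parameters, computes the comoving distance by direct integration, and locates the peak of the purely geometric lensing efficiency χ(χ_s − χ)/χ_s, i.e. omitting the 3H₀²Ω_m/(2c²) prefactor and the 1/a factor; it is not the paper's pipeline.

```python
import numpy as np

C_KM_S = 299792.458  # speed of light [km/s]

def comoving_distance(z, omega_m=0.3134, n=100000):
    # flat LCDM; distances in Mpc/h, i.e. H0 = 100 h km/s/Mpc
    zs = np.linspace(0.0, z, n)
    inv_h = 1.0 / (100.0 * np.sqrt(omega_m * (1.0 + zs) ** 3 + 1.0 - omega_m))
    return C_KM_S * np.sum(0.5 * (inv_h[1:] + inv_h[:-1]) * np.diff(zs))

def lensing_efficiency(chi, chi_s):
    # geometric part of the kernel: f_K(chi) f_K(chi_s - chi) / f_K(chi_s), flat case
    return chi * (chi_s - chi) / chi_s

chi_s = comoving_distance(4.0)            # distance to z = 4 sources [Mpc/h]
chis = np.linspace(0.0, chi_s, 1000)
peak = chis[np.argmax(lensing_efficiency(chis, chi_s))]
```

For these assumed parameters χ(z = 4) is roughly 5000 Mpc/h, and the geometric efficiency peaks halfway to the sources, which is why lenses at intermediate redshift dominate the convergence signal.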

The DUSTGRAIN-pathfinder simulations
We make use of the DUSTGRAIN-pathfinder simulations (see Giocoli & Baldi, in prep., for a detailed description), a suite of cosmological collisionless simulations specifically designed to sample the joint parameter space of f(R) gravity and massive neutrino cosmologies in order to investigate their main observational degeneracies and devise strategies to break them. These simulations follow the evolution of 768³ dark matter particles of mass m_p^CDM = 8.1 × 10¹⁰ M_⊙/h (for the case of m_ν = 0) and of as many neutrino particles (for the case of m_ν > 0) within a periodic cosmological box of 750 Mpc/h per side, under the effect of an f(R) gravitational interaction defined by Eq. 1.
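The quoted particle mass follows directly from the box size, the particle number, and the matter density. The check below assumes Ω ≈ 0.3134 for the total matter density parameter (an assumption for illustration, consistent with the CDM plus neutrino fractions listed later) and the standard critical density ρ_crit = 2.775 × 10¹¹ h² M_⊙/Mpc³.

```python
RHO_CRIT = 2.775e11  # critical density of the universe [h^2 Msun / Mpc^3]

def particle_mass(omega, box_mpc_h=750.0, n_part=768 ** 3):
    # mean mass per simulation particle for a species with density parameter omega
    return omega * RHO_CRIT * box_mpc_h ** 3 / n_part

m_cdm = particle_mass(0.3134)   # ~8.1e10 Msun/h, matching the quoted value
```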
The simulations have been performed with the MG-Gadget code (Puchwein et al. 2013), which is a modified version of the GADGET code (Springel 2005) that implements the extra force and the chameleon screening mechanism (see Khoury & Weltman 2004) characterising f(R) gravity theories. As was already done in Baldi et al. (2014), such an MG solver has been combined with the particle-based implementation of massive neutrinos developed for GADGET by Viel et al. (2010). Therefore, massive neutrinos are included in the simulations as a separate family of particles with its specific transfer function and velocity distribution, so that both CDM and neutrino particles contribute to the density source term that enters the calculation of the f_R extra degree of freedom.
Initial conditions have been produced by generating two separate but fully correlated random realisations of the linear density power spectrum for CDM and massive neutrino particles as computed by the linear Boltzmann code CAMB (Lewis et al. 2000) at the starting redshift of the simulation z_i = 99. Following the approach of e.g. Zennaro et al. (2017) and Villaescusa-Navarro et al., we have then computed the scale-dependent growth rate D₊(z_i, k) for the neutrino component in order to correctly account for neutrino gravitational velocities. In addition, neutrino particles also receive a thermal velocity extracted from the neutrino momentum distribution for each value of neutrino mass under consideration.

Map making
For each cosmological simulation, we construct different lens planes from the various stored snapshots using the MapSim pipeline (Giocoli et al. 2015a). The particles are distributed onto different planes according to their comoving distances with respect to the observer. For each simulation, we use the particles stored in 21 different snapshots to construct continuous past light-cones from z = 0 to z = 4, with a square sky coverage of 5 × 5 deg². From the stored snapshots and for each simulation, we are able to construct 27 lens planes of the projected matter density distribution.
In MapSim, the observer is placed at the vertex of a pyramid whose square base is set at the comoving distance of z = 4. We construct 256 different light-cone realizations within each simulation by randomizing the various comoving boxes. This includes changing the signs of the coordinates, redefining the centre of the coordinate system, and inverting the coordinate axes. By construction, these procedures preserve the clustering properties of the particle density distribution at a given simulation snapshot (Roncarelli et al. 2007).
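The box randomization can be sketched as follows. This is an illustrative reimplementation of the operations just described (periodic translation, axis relabelling, and reflections), not the MapSim pipeline's actual code; each operation is an isometry of the periodic box, which is why the clustering statistics are preserved.

```python
import numpy as np

def randomize_box(pos, box=750.0, rng=None):
    """Apply a random periodic translation, axis permutation, and axis
    reflections to particle positions `pos` of shape (N, 3) in [0, box)."""
    rng = rng or np.random.default_rng()
    pos = (pos + rng.uniform(0.0, box, size=3)) % box   # shift the origin
    pos = pos[:, rng.permutation(3)]                    # relabel the axes
    for ax in range(3):
        if rng.random() < 0.5:                          # mirror this axis
            pos[:, ax] = (box - pos[:, ax]) % box
    return pos
```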
On each pixel of the lens plane, with coordinate indexes (i, j), we can define the particle surface mass density

Σ_l(i, j) = Σ_k m_k / A_l,

where the sum runs over the particles k assigned to the pixel and A_l represents the comoving pixel area of the l-th lens plane.
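A minimal sketch of this projection step, assuming particles have already been sliced into the slab of one lens plane (function name and units are illustrative, not from the MapSim code):

```python
import numpy as np

def surface_density(x, y, mass, extent, npix=2048):
    """Project particle masses onto a pixel grid:
    Sigma(i, j) = (sum of particle masses in pixel) / pixel area.
    `extent` is the transverse size of the plane, in the same units as x, y."""
    hist, _, _ = np.histogram2d(
        x, y, bins=npix, range=[[0.0, extent], [0.0, extent]], weights=mass
    )
    return hist / (extent / npix) ** 2
```

By construction the operation conserves mass: summing Σ over all pixels and multiplying by the pixel area returns the total projected mass in the slab.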
Since gravitational lensing is sensitive to the projected matter density distribution along the line of sight, we project onto each lens plane all particles between two defined comoving distances from the observer, and in the simulations with massive neutrinos we consistently account for this component as well, along with the proper Hubble function and comoving distance calculation. As was done by Petri et al. (2016a,b), Giocoli et al. (2017, 2018), and Castro et al. (2017), we construct the convergence map by weighting the lens planes by the lensing kernel and assuming the Born approximation (Schäfer et al. 2012; Giocoli et al. 2016; Castro et al. 2017), which represents an excellent approximation for weak cosmic lensing down to very small scales (ℓ ≥ 10⁴). From Σ_l we can write down the convergence map κ at a given source redshift z_s as

κ(i, j) = Σ_l Σ_l(i, j) / Σ_crit,l,s,

where l varies over the different lens planes with redshift z_l smaller than z_s. The critical surface density Σ_crit,l,s at the lens plane z_l for sources at redshift z_s is given by

Σ_crit,l,s = (c² / 4πG) · D_s / (D_l D_ls),

where c indicates the speed of light, G Newton's constant, and D_l, D_s, and D_ls the angular diameter distances between observer–lens, observer–source, and source–lens, respectively. The simulations used in this work are summarised below (simulation name, gravity type, f_R0, m_ν [eV], Ω_CDM, Ω_ν, and CDM and neutrino particle masses in M_⊙/h):

…            f(R)   …          …      0.30630   0.00715   7.92 × 10¹⁰   1.85 × 10⁹
fR5-0.15eV   f(R)   −1 × 10⁻⁵  0.15   0.30987   0.00358   8.01 × 10¹⁰   9.25 × 10⁸
fR5-0.1eV    f(R)   −1 × 10⁻⁵  0.1    0.31107   0.00238   8.04 × 10¹⁰   6.16 × 10⁸
fR6-0.1eV    f(R)   −1 × 10⁻⁶  0.1    0.31107   0.00238   8.04 × 10¹⁰   6.16 × 10⁸
fR6-0.06eV   f(R)   −1 × 10⁻⁶  0.06   0.31202   0.00143   8.07 × 10¹⁰   3.7 × 10⁸

We follow this approach also in constructing the convergence maps in the MG models, since the definition of the lensing potential in f(R) gravity models remains unchanged with respect to the GR case of the standard ΛCDM model. The resulting maps each contain 2048² pixels, giving a pixel scale of approximately 8.8 arcsec.
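The plane-stacking step above can be sketched in a few lines. This is an illustration under assumed units (angular diameter distances in Mpc, masses in M_⊙), not the MapSim code; the gravitational-constant value used is the standard G ≈ 4.301 × 10⁻⁹ Mpc M_⊙⁻¹ (km/s)².

```python
import numpy as np

C_KM_S = 299792.458   # speed of light [km/s]
G = 4.301e-9          # Newton's constant [Mpc Msun^-1 (km/s)^2]

def sigma_crit(d_l, d_s, d_ls):
    # critical surface density [Msun / Mpc^2], distances in Mpc
    return C_KM_S ** 2 / (4.0 * np.pi * G) * d_s / (d_l * d_ls)

def stack_convergence(sigma_planes, sigma_crits):
    # kappa(i, j) = sum over lens planes l of Sigma_l / Sigma_crit,l,s
    return sum(s / sc for s, sc in zip(sigma_planes, sigma_crits))
```

For typical distances of order a Gpc, Σ_crit is of order 10¹⁵–10¹⁶ M_⊙/Mpc², so only projected overdensities comparable to that produce κ of order unity; weak lensing operates far below this threshold.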

Analysis
Statistics of the aperture mass M_ap(ϑ) have been used in many weak-lensing analyses as a probe of the matter distribution in the Universe and to constrain cosmological parameters (e.g. Van Waerbeke et al. 2001; Jarvis et al. 2003; Hamana et al. 2003; Kilbinger & Schneider 2005; Clowe et al. 2006; Hetterscheidt et al. 2007; Schrabback et al. 2010; Kilbinger et al. 2013). In particular, the variance, i.e. the second central moment ⟨M²_ap⟩(ϑ), is commonly used, which at a certain scale ϑ measures the lensing power spectrum within a narrow window in ℓ-space. Furthermore, the non-linear evolution of density fluctuations in the low-redshift Universe is imprinted as non-Gaussian features in the weak-lensing signal, which can be accessed through higher-order moments of the aperture mass. For example, Pires et al. (2012) have shown that capturing weak-lensing non-Gaussianities can break the degeneracy between σ_8 and Ω_m within ΛCDM. We explore in this work the ability of various statistics of M_ap to break the degeneracy between ΛCDM and f(R) models that include non-vanishing neutrino mass.

Aperture mass calculation
Given a convergence map κ(θ) for a particular model realisation, we compute the aperture mass map (Schneider 1996; Schneider et al. 1998) as

M_ap(θ; ϑ) = ∫ d²θ′ U_ϑ(|θ − θ′|) κ(θ′),    (11)

where θ is a two-dimensional position vector within the map, U_ϑ(|θ|) is a circularly symmetric filter function, and ϑ is the aperture radius. The aperture mass is by design insensitive to the mass-sheet degeneracy, which describes the fact that a shear signal is unchanged by the presence of a uniform mass sheet along the line of sight between the observer and the source. To achieve this, the function U should be compensated in 2D, i.e. satisfy

∫ dθ′ θ′ U_ϑ(θ′) = 0,    (12)

where we have written the angular separation as θ′ = |θ′ − θ|. It is also typically assumed that U_ϑ has either finite support or goes to zero smoothly at sufficiently large θ in order that the integral in equation (11) be computable on a finite data field. Various definitions of the filter function U have been adopted in previous studies. For example, a useful parameterised family of polynomial functions was first introduced by Schneider et al. (1998). Nearly concurrently, van Waerbeke (1998) provided an alternative definition based on a Gaussian function and resembling wavelets in order to have better Fourier space properties. This latter U was subsequently used (in slightly modified form) by e.g. Crittenden et al. (2002); Zhang et al. (2003); Jarvis et al. (2004). A third type of filter was derived in Schirmer et al. (2007) which approximates a Navarro-Frenk-White (NFW) density profile (Navarro et al. 1996). As it mimics the shear profile of a spherically symmetric dark matter halo, this form has been used for the detection of individual mass concentrations in real data.
Equation (11) is equivalently expressible in terms of the tangential component of the shear γ_t about the position θ:

M_ap(θ; ϑ) = ∫ d²θ′ Q_ϑ(|θ − θ′|) γ_t(θ′; θ),

where the convolution kernel Q_ϑ is derivable analytically from U_ϑ (Kaiser et al. 1994; Schneider 1996). This form is often used in the literature for convenience, since the actual weak-lensing observable is not the convergence, but instead the (reduced) shear via galaxy ellipticity measurements. Because the output of ray-tracing in our N-body simulations is the convergence field directly, we compute M_ap on pixellated κ maps throughout this paper.
For high-resolution fields like our 2048 × 2048 maps, the convolution is typically carried out in Fourier space, since direct-space convolution can be too time-consuming. For the practical computation of aperture mass maps, we adopt a slightly different approach than direct application of equation (11). We instead compute M_ap by means of the starlet transform (Starck et al. 2007), i.e. a wavelet transform, which simultaneously produces a set of maps filtered at apertures of increasing dyadic scales. In other words, with a single wavelet transform of κ(θ) with O(n log n) time complexity, we obtain {M_ap(θ; ϑ_j)}_{j=1,2,...,j_max}, where ϑ_j = 2^j pixels. With a pixel scale of 8.8 arcsec, this corresponds to filter scales of 0.293, 0.586, 1.17, ... arcmin for values j = 1, 2, 3, ... in our simulated maps. The maximum j is determined by the map size as log₂ N for an N × N map.
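The starlet decomposition described here can be sketched with the standard "à trous" B₃-spline algorithm. This is an illustrative reimplementation (with periodic boundaries via np.roll), not the mr_transform code used in the paper: each band-pass map w_j plays the role of M_ap at scale 2^j pixels, and the decomposition reconstructs the input exactly.

```python
import numpy as np

B3 = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0   # B3-spline kernel

def _b3_smooth(img, step):
    # separable "a trous" smoothing: kernel taps spaced by `step` pixels
    out = img
    for axis in (0, 1):
        acc = np.zeros_like(out)
        for offset, w in zip((-2, -1, 0, 1, 2), B3):
            acc = acc + w * np.roll(out, offset * step, axis=axis)
        out = acc
    return out

def starlet(img, n_scales):
    """Isotropic undecimated (starlet) transform: returns the band-pass maps
    w_j (aperture-mass-like maps at dyadic scales) and the coarse residual
    c_J, with img = sum(w_j) + c_J exactly."""
    c = np.asarray(img, dtype=float)
    bands = []
    for j in range(n_scales):
        c_next = _b3_smooth(c, 2 ** j)
        bands.append(c - c_next)   # difference of successive smoothings
        c = c_next
    return bands, c
```

Because each smoothing kernel sums to one, every band-pass map has (numerically) zero mean, which is the compensated-filter property of equation (12) realized in discrete form.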
It was demonstrated in Leonard et al. (2012) that the aperture mass is formally identical to a wavelet transform at a specific scale. Different transforms are associated with different effective filter functions. The particular transform we adopt is the isotropic undecimated wavelet transform, also called the starlet transform, whose wavelet function at a given scale is defined as the difference between B 3 -spline functions at neighboring resolutions. We refer to Starck et al. (2007) for more details of the formalism and to Leonard et al. (2012) for the explicit form of U ϑ corresponding to this transform.
In contrast to the above-mentioned filter functions, the starlet transform filter presents several significant simultaneous advantages: (i) it is non-oscillatory both in angular space and in Fourier space, meaning that it can better isolate features of the signal represented in either domain; (ii) it has compact support in direct space, which is useful to control the systematics due to a mask or the image borders; (iii) it is compensated, i.e. it has a zero mean, so it is not sensitive to the mass-sheet degeneracy problem (cf. equation 12); and (iv) it allows exact reconstruction, meaning that we can recover the decomposed image from its starlet coefficients. We do not make use of this last property in this paper, but it could be very convenient in the case where we need to fill gaps due to missing data. The aperture mass maps resulting from the starlet transform then represent only information at a particular scale, with little leakage from other frequencies.
Finally, we note that although the starlet transform returns maps filtered only at dyadic (powers of two) scales, which we find sufficient for our purposes here, this is not a general restriction of the wavelet formalism. Alternative wavelet transforms are possible that allow filtering at any intermediate scale.
As an illustration of the weak lensing fields in our simulations, as well as of our aperture mass computation, we show typical maps corresponding to three different models in Fig. 2. In the top row, from left to right, are κ(θ) maps extracted from the ΛCDM and the f₅(R) simulations, the latter with m_ν = 0 eV (central panel) and m_ν = 0.15 eV (right panel). All simulations share the same initial phases in the random realisation of the matter power spectrum, and these maps were generated by ray-tracing toward the same observer position. The same physical structures can therefore be seen across the three convergence fields. Subtle differences in the peak positions are also apparent by eye, which are due to the different gravitational interaction in MG compared to GR and to the presence or lack of neutrinos affecting structure growth.
The lower left panel of Fig. 2 shows the aperture mass map M_ap(θ; ϑ₃) corresponding to the above ΛCDM map filtered at ϑ₃ = 1.17 arcmin. To highlight the distinction between the models at this scale, we show the residual M_ap maps with respect to ΛCDM for the two f₅(R) models above, also filtered at ϑ₃ = 1.17 arcmin, in the lower center and right panels. Differences in the positions and amplitudes of structures can be clearly seen, especially for the most prominent convergence peaks. However, it is not obvious a priori that the statistics of such maps will be significantly different enough to distinguish the models from each other. Whether this is possible, and for which statistics, are the questions we seek to answer in the following sections.
We compute all aperture mass maps in this paper using the mr_transform binary of the publicly available Interactive Sparse Astronomical Data Analysis Packages (iSAP). This is a C++ code with many optional multi-resolution wavelet transforms, which, given the large size of our mass maps, affords a significant speed increase over direct-space 2D convolution. Various statistics of the maps, which are discussed in the next section, are computed using an independent Python code that we have validated on maps from the CoDECS simulations used in Giocoli et al. (2015b). In particular, we are able to reproduce Figures 8 and 9 of that paper using a further independent aperture mass calculation that implements the Schneider et al. (1998) filter.

Statistics
We consider the following statistics of aperture mass maps as a function of filter scale and source galaxy redshift. To avoid edge effects, we exclude from all calculations pixels whose distance to the map border is smaller than the filter diameter.
i. variance: the aperture mass variance $\langle M_{\rm ap}^2 \rangle(\vartheta)$ quantifies fluctuations in lensing strength along different directions on the sky, meaning that it also measures the amplitude of fluctuations in the matter density contrast. As a second-order statistic, it can be expressed as an integral over the convergence power spectrum $P_\kappa(\ell)$, where the window function W is the Fourier transform of $U_\vartheta$ (Schneider et al. 1998). The aperture mass variance therefore probes Gaussian information in the distribution of matter. This integral form is convenient when comparing measurements with theoretical predictions, since $P_\kappa$ is readily calculated (at least in ΛCDM) for a given set of cosmological parameters. Because we have simulated maps for all the models of interest in this work, we instead compute $\langle M_{\rm ap}^2 \rangle$ directly on filtered convergence maps like the one shown for ΛCDM in Fig. 2:
$$\left\langle M_{\rm ap}^2 \right\rangle(\vartheta) = \frac{1}{N_{\rm pix}} \sum_{k=1}^{N_{\rm pix}} M_{\rm ap}^2(\theta_k; \vartheta),$$
where $\theta_k$ refers to the kth pixel position and $N_{\rm pix}$ is the number of retained pixels.
ii. skewness: as a third-order statistic, the skewness is complementary to the variance in that it probes non-Gaussian information contained in the lensing observables. The skewness S is zero if the data are symmetrically distributed around the mean; if a tail extends to the right (resp. left), S is positive (resp. negative). In particular, skewness has been shown to be a sensitive probe of $\Omega_m$ and can break the degeneracy with $\sigma_8$ when combined with two-point statistics (Bernardeau et al. 1997; Jain & Seljak 1997). Analogously to the variance, the skewness can be written as a bandpass filter over the convergence bispectrum (e.g. Kilbinger & Schneider 2005), the Fourier transform of the three-point correlation function. Following Giocoli et al. (2015b), we compute the skewness as the (non-standardized) third-order moment of our filtered maps:
$$\left\langle M_{\rm ap}^3 \right\rangle(\vartheta) = \frac{1}{N_{\rm pix}} \sum_{k=1}^{N_{\rm pix}} M_{\rm ap}^3(\theta_k; \vartheta).$$
iii. kurtosis: the kurtosis of a distribution measures its symmetric broadening or narrowing relative to a Gaussian. Positive kurtosis implies a higher peak and larger wings than the Gaussian distribution with the same mean and variance; negative kurtosis means a wider peak and shorter wings. Again following Giocoli et al. (2015b), we compute the kurtosis as the (non-standardized) fourth-order moment of the aperture mass:
$$\left\langle M_{\rm ap}^4 \right\rangle(\vartheta) = \frac{1}{N_{\rm pix}} \sum_{k=1}^{N_{\rm pix}} M_{\rm ap}^4(\theta_k; \vartheta).$$

Fig. 2 (caption excerpt): Maps for the MG model are derived from runs with neutrino masses of 0 eV (center) and 0.15 eV (right). Essentially the same structures are seen across all three maps, owing to the identical initial conditions and the common field of view. Subtle differences due to the different model evolutions can, however, be detected in the peak positions and their morphologies. Bottom row: the left panel shows the aperture mass map at filtering scale ϑ_3 = 1.17′ corresponding to the ΛCDM κ map at the upper left. The compensated filter has the effect of highlighting features with an approximate size of ϑ_3. The center (right) panel shows the difference in aperture mass between the f_5(R) map above and ΛCDM for the same filtering scale. Differences in peak positions and amplitudes with respect to ΛCDM can be seen, and our aim is to distinguish between these models using various statistics of such aperture mass maps.
iv. peak counts: the number count distribution of lensing peaks provides another probe of non-Gaussianity. Peaks represent local regions of high convergence, and their abundance for a given cosmological model carries significant information about its matter content and clustering amplitude. The number of cosmological studies based on peak statistics in convergence maps, both from simulated and real lensing data, has surged in the past decade (e.g. Kratochvil et al. 2010; Fan et al. 2010; Yang et al. 2011; Marian et al. 2012; Shan et al. 2014; Lin & Kilbinger 2015; Peel et al. 2017; Shan et al. 2018; Fluri et al. 2018). The highest signal-to-noise peaks are mostly generated by single massive structures along the line of sight. On the other hand, intermediate and low signal-to-noise peaks can arise from projection effects due to many smaller structures or filaments along the line of sight, as well as simply from noise when dealing with real data (Yang et al. 2011; Liu & Haiman 2016).
In this work, we explore the ability of peak counts to distinguish between GR and MG models (with and without neutrinos) that are degenerate at the level of second-order statistics. We compute peaks as a function of filter scale ϑ in maps of M_ap(ϑ)/σ(ϑ), where σ(ϑ) is the rms of the map at that scale. A peak is defined as a pixel whose amplitude is larger than that of its eight neighbors and exceeds a given kσ threshold (k = 1, 2, ...).
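The moment and peak-count estimators described above can be sketched in a few lines. The following is a minimal NumPy illustration, not the pipeline code used in the paper; the array layout, the border handling in pixels, and the function names are our own assumptions:

```python
import numpy as np

def map_moment(m_ap, order, border):
    """Non-standardized moment <M_ap^n> of an aperture mass map,
    excluding pixels within `border` pixels of the map edge."""
    core = m_ap[border:-border, border:-border]
    return np.mean(core ** order)

def count_peaks(m_ap, k, border):
    """Count pixels exceeding all eight neighbours and a k*sigma
    threshold, where sigma is the rms of the border-clipped map."""
    core = m_ap[border:-border, border:-border]
    sigma = core.std()
    h, w = core.shape
    c = core[1:-1, 1:-1]  # candidate pixels with a full 8-neighbourhood
    neighbours = [core[1 + di:h - 1 + di, 1 + dj:w - 1 + dj]
                  for di in (-1, 0, 1) for dj in (-1, 0, 1)
                  if (di, dj) != (0, 0)]
    is_peak = np.all([c > nb for nb in neighbours], axis=0) & (c > k * sigma)
    return int(is_peak.sum())
```

For a map containing a single bright pixel well above the map rms, `count_peaks` returns 1 at a 3σ threshold, and `map_moment(m, 2, border)` reduces to the mean squared aperture mass over the retained pixels.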

Model discrimination efficiency
For each cosmological model we consider, we produce 256 independent convergence maps of 25 deg² for sources at redshifts z_s = (0.5, 1.0, 1.5, 2.0) by the procedure outlined in Section 4. We therefore have sufficient area to meaningfully study the statistical distributions of observables as a function of scale and redshift. In particular, we test the ability of various aperture mass statistics to distinguish between the models by employing the discrimination efficiency technique introduced and used in Pires et al. (2009, 2012). The concept in our context can be understood as follows. Given the distributions of any two observables, the amount of overlap between the distributions indicates how likely one is to mistake one model for the other based on measurements of that observable. Quantification of the overlap is facilitated by the notion of a false discovery rate (FDR) (Benjamini & Hochberg 1995). The FDR framework allows one to set a threshold value for a distribution such that new test samples (or observations) can be classified as either belonging to the distribution or not, depending on which side of the threshold they lie. The threshold is computed according to a desired false detection rate α. The formalism then ensures that the fraction of false positives, i.e. observations that indeed belong to the distribution but are incorrectly classified as detections, is less than or equal to α. As is common, we set α = 0.05 throughout this work to ensure that the fraction of false detections on average does not exceed 5%.

Fig. 3 (caption): Toy example of discrimination efficiencies computed using the FDR formalism. Each (Gaussian) distribution represents the histogram measured for some observable given the model. The amount of overlapping area between any two distributions indicates how distinguishable the two models are according to this observable: the more the curves overlap, the more difficult it is to distinguish the corresponding models, and the lower the discrimination efficiency is. In the example shown, considering model 1 as the hypothesis against which the others are tested, models 2, 3, and 4 have computed discrimination efficiencies of 76%, 37%, and 100%, respectively.
As an illustration, consider the toy example in Fig. 3. The probability distributions of some observable, all taken to be Gaussian for simplicity, are shown for four fictitious models. Let model 1 serve as the 'hypothesis' against which the other models are tested. To determine the discrimination efficiency of, for example, model 2 from model 1, we first compute the right tail threshold t R (dashed line) using FDR based on the samples of model 1. This is done by first rank-ordering the p-values {p i }, i = 1, ..., n, of the n model 1 samples, computed with respect to a right tail event. Next one finds the maximum p m (1 ≤ m ≤ n) for which p m < m · α/n, where we have assumed statistical independence of the p-values. The threshold t R is the observation value corresponding to p m , and the discrimination efficiency is the fraction of model 2 samples greater than t R . Analogous reasoning holds for comparing model 3 against model 1 by computation of threshold t L .
In general, then, suppose we have observations of two models, M_1 = {x_i}_{i=1,...,n_1} and M_2 = {x_j}_{j=1,...,n_2}. If the center of M_2 is greater than that of M_1, the discrimination efficiency of M_2 from M_1 is
$$E_{2,1} = \frac{N[\,x_j > t_R(M_1)\,]}{n_2},$$
where N[C] counts the number of elements satisfying the condition C, and the threshold t_R is computed from the M_1 samples. If the center of M_2 lies to the left of that of M_1, the condition C becomes x_j < t_L(M_1). For a more detailed description of this procedure, we refer to e.g. Appendix A of Pires et al. (2009). For the example in Fig. 3, the discrimination efficiencies for models 2, 3, and 4 from model 1 are, respectively, E_{2,1} = 76%, E_{3,1} = 37%, and E_{4,1} = 100%. This corresponds with the intuition that models with fully overlapping distributions should not be considered distinguishable at all, whereas completely disjoint distributions should be 100% distinguishable. Note that there is an inherent asymmetry in the discrimination efficiency between any two models that do not share the same width, or more generally whose distributions are not simply shifted copies of one another. Considering model 2 as the hypothesis, for example, the model 1 discrimination efficiency is close to that of the reverse case at 75%, which follows from their near symmetry. With respect to model 3, however, the model 1 efficiency drops to 9%. This is consistent with the fact that the support of the model 1 distribution is a subset of that of model 3, and not the other way around.
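To make the procedure concrete, the following sketch applies the Benjamini-Hochberg selection to the right-tail p-values of candidate-model samples evaluated against the hypothesis model. It is a simplified illustration, not the authors' implementation: we summarize the hypothesis distribution by a Gaussian fit (the paper smooths the empirical distributions with a KDE instead), and the function names are our own.

```python
import numpy as np
from math import erfc, sqrt

def gauss_sf(z):
    """Right-tail probability (survival function) of a standard normal."""
    return 0.5 * erfc(z / sqrt(2.0))

def discrimination_efficiency(m1, m2, alpha=0.05):
    """Fraction of candidate-model (m2) samples classified as detections
    against the hypothesis model (m1), controlling the FDR at alpha."""
    m1 = np.asarray(m1, dtype=float)
    mu, sd = m1.mean(), m1.std()
    # right-tail p-value of each candidate sample under the m1 fit
    p = np.sort([gauss_sf((x - mu) / sd) for x in np.asarray(m2, dtype=float)])
    n = len(p)
    # Benjamini-Hochberg: find the largest m with p_(m) <= m * alpha / n;
    # all samples with p-value at most p_(m) are declared detections
    ok = p <= np.arange(1, n + 1) * alpha / n
    m = int(np.nonzero(ok)[0][-1]) + 1 if ok.any() else 0
    return m / n
```

A well-separated candidate distribution yields an efficiency near 1, while a candidate drawn from the hypothesis distribution itself yields an efficiency near 0, consistent with the toy example of Fig. 3.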
We have depicted continuous distributions in the example of Fig. 3. In practice, however, with our simulations, we have 256 samples of each distribution for a given cosmological model, statistic, filter scale, and source redshift. Given the likelihood of undersampling, especially at the tails of the distributions that are important for setting the FDR threshold, we apply a Gaussian smoothing in each case using kernel density estimation (KDE) prior to calculating discrimination efficiencies. We find that KDE smoothing improves results in terms of the agreement between the observed overlap of the distributions and the computed discrimination efficiency.
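A minimal Gaussian KDE of this kind can be written directly in NumPy. This sketch uses Silverman's rule-of-thumb bandwidth as an assumed default; the bandwidth actually used in the analysis is not specified here:

```python
import numpy as np

def kde_pdf(samples, grid, bandwidth=None):
    """Gaussian kernel density estimate of `samples` evaluated on `grid`."""
    x = np.asarray(samples, dtype=float)
    if bandwidth is None:
        # Silverman's rule of thumb for a Gaussian kernel
        bandwidth = 1.06 * x.std() * len(x) ** (-0.2)
    # one Gaussian kernel per sample, evaluated at every grid point
    z = (np.asarray(grid, dtype=float)[:, None] - x[None, :]) / bandwidth
    kernels = np.exp(-0.5 * z ** 2) / np.sqrt(2.0 * np.pi)
    return kernels.sum(axis=1) / (len(x) * bandwidth)
```

The smoothed curve integrates to unity and fills in the undersampled tails, which is what stabilizes the FDR threshold estimate.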

Second-order M ap statistics
The lensing power spectrum, or equivalently the angular two-point correlation function, accesses the Gaussian information in the projected matter density. As seen from their 3D matter power spectra (Fig. 1), there exist f(R) models, both with and without neutrinos, that can mimic GR to better than 10% at second order. Our goal is to look beyond second-order statistics to see whether it is possible to distinguish the MG models from ΛCDM. As a consistency check, we first compare the lensing power spectra to verify agreement between the two second-order probes.
In Fig. 4 we show convergence power spectra $P_\kappa(\ell)$ computed for the various f(R) models along with ΛCDM. The left plot shows the three f(R) models we consider, defined by their different f_R0 values ($-10^{-4}$, $-10^{-5}$, and $-10^{-6}$), all with a neutrino mass sum m_ν equal to zero. As theory predicts, and as we have seen from their matter power spectra, f(R) models mimic ΛCDM more closely the smaller the value of |f_R0|. In particular, f_5(R) agrees with ΛCDM within 20% and f_6(R) to better than 5% over three decades in scale.
It is interesting to note the different scales at which the deviation from ΛCDM is maximal. For both f_5(R) and f_6(R), the maximum occurs around multipole $\ell \approx 10^4$, whereas for f_4(R) it is around $\ell \approx 10^3$. Power thus moves from small scales to large scales with increasing |f_R0|. The effect of the modified gravitational interaction is non-trivial and not simply a uniform scaling of the power spectrum across different modes. Note that all curves have been computed for sources at z_s = 1.0, but very similar results hold for the other redshift planes.
In the right panel we focus on variations among f_5(R) models with different m_ν. Looking at the relative differences with ΛCDM, it is clear that the addition of neutrinos has the effect of damping power at all scales. This is consistent with the expectation that massive neutrinos should suppress structure growth. With m_ν = 0.1 eV (0.15 eV), the deviation does not exceed 14% (8%) within the range considered. We see that in terms of lensing, as well as directly through the matter distribution, f_5(R) models with neutrinos can produce a signal that is significantly degenerate with ΛCDM.

Figure 5 shows the mean aperture mass variance computed as a function of aperture scale for the same models as in Fig. 4. $\langle M_{\rm ap}^2 \rangle(\vartheta)$ is plotted at the filtering scales (ϑ_1, ..., ϑ_7) = (0.293′, 0.586′, 1.17′, 2.34′, 4.69′, 9.34′, 18.8′) of a wavelet transform with j_max = 7. As discussed in Section 5, aperture mass variance probes the lensing power spectrum within a narrow window around the mode associated to scale ϑ. In agreement with their power spectra, the MG models with m_ν = 0 eV (left), for example, have larger variance at each scale compared to ΛCDM, with the difference increasing according to the value of |f_R0|. Moreover, the variance relative to ΛCDM for f_4(R) is largest at large filtering scales, whereas it is largest at small filtering scales for f_5(R) and f_6(R).

Fig. 5 (caption excerpt): Left: three f(R) models without neutrinos. Differences in MG variance relative to ΛCDM match those seen for the power spectra in Fig. 4. In particular, the f_4(R) variance at large scales exceeds that of ΛCDM by around 30%, while the maximum deviations of f_5(R) and f_6(R) occur at small scales, at a level of about 20% and 5%, respectively. Right: three f_5(R) models with different sums of neutrino masses m_ν. Trends in the curves again mirror their power spectra, and the damping effect of neutrinos on structure growth is seen in the modulation of the f_5(R) variance from the m_ν = 0 eV curve toward the curve for m_ν = 0.15 eV.
Plotted in the right panel of Fig. 5 is $\langle M_{\rm ap}^2 \rangle(\vartheta)$ for f_5(R) models with varying neutrino mass. The trend in the curves is the same as for the power spectra in Fig. 4, namely that the presence of neutrinos in the model can bring the aperture mass measurements into agreement with ΛCDM at better than 6% over the range of scales considered. We confirm therefore that by including neutrinos, MG models with significantly different gravitational interactions from GR can mimic ΛCDM at the level of two-point statistics in weak-lensing observations.

Fig. 6 (caption excerpt): Left: aperture mass skewness computed as the third-order moment $\langle M_{\rm ap}^3 \rangle$. Higher neutrino mass leads to a nearly uniform reduction in skewness across all scales relative to ΛCDM. The maximum discrepancy between each model and ΛCDM is also larger than for the variance. Right: aperture mass kurtosis computed as the fourth-order moment $\langle M_{\rm ap}^4 \rangle$. Differences from ΛCDM are comparable to the skewness, the main distinction being that the kurtosis of the m_ν > 0 models approaches the m_ν = 0 value at larger scales.
Note that the maximum variance shown has been restricted to less than the value actually attained for ϑ 1 in both plots of Fig. 5. This was done merely for visualisation purposes to better exhibit the spread of the curves at larger aperture scales. The variances for all models at the smallest scale are nearly identical in any case, as the difference plots of the lower panels show.

Higher-order M ap statistics
Given that second-order M ap statistics may be insufficient to distinguish certain MG models from ΛCDM, we present now results from computing higher-order statistics in these models. These include skewness, kurtosis, and peak counts, as discussed in Section 5.2.
The aperture mass skewness $\langle M_{\rm ap}^3 \rangle$ provides a simple third-order statistic of the lensing field that probes the bispectrum (cf. Section 5.2). Results are shown in the left plot of Fig. 6 for the same f_5(R) models considered in Section 6.1. The effect of neutrinos is seen as a nearly uniform reduction in skewness at each aperture scale, which varies with m_ν in a way analogous to the variance. That is, the curves with m_ν > 0 are shifted downward systematically from the curve with m_ν = 0. This means that the distribution of κ becomes more Gaussian as m_ν increases, which is consistent with a higher neutrino mass more effectively suppressing the generation of the highest matter peaks.
At the smallest scales, the skewness across all three f_5(R) models is amplified relative to ΛCDM, and more so than the variance at the same scale. For example, the skewness of the m_ν = 0 eV model is approximately 30% larger than that of ΛCDM at ϑ_2, compared to 10% for the variance. In addition, for the model with m_ν = 0.15 eV, the skewness at certain intermediate scales differs from ΛCDM by up to 10%, whereas the corresponding variance differs by less than 3%. Each f_5(R) model approaches an approximately constant skewness at scales around ϑ ≈ 10′ and larger. Given the larger (average) differences among the models, this suggests that skewness may be a better observable than the variance for breaking degeneracies.
We show the aperture mass kurtosis $\langle M_{\rm ap}^4 \rangle$ as a function of filter scale in the right plot of Fig. 6. As with the variance and skewness, the m_ν = 0 curve is highest and decreases at each scale as m_ν increases. Differences with respect to ΛCDM are of the same amplitude as for the skewness within each f_5(R) model. We see at fourth order, however, that neutrinos cause a more complex behaviour in the full lensing distribution than was apparent at lower orders. For example, in the relative deviation from ΛCDM (lower panel), there now exists a clear local minimum at ϑ ≈ 5′ for the models with m_ν > 0, in addition to the maximum at smaller scales.
It is interesting to comment on the overall behaviour of the statistics of M_ap moments as one varies the neutrino mass. The f_5(R) model with m_ν = 0 eV overpredicts ΛCDM at each filter scale and for each statistic. Including massive neutrinos in the model, as we have discussed previously, counteracts the tendency of the modified gravitational interaction to enhance structure growth, at least on scales smaller than the neutrino free-streaming length, which in our case, for each m_ν, lies above the full range of aperture scales considered. The degeneracy with ΛCDM is generally strengthened by larger m_ν, but not without limit, and not uniformly, as the higher-order M_ap statistics reveal. We expect that m_ν values larger than we have studied would again decrease the degeneracy with ΛCDM, perhaps measurably at the two-point level. For reference, the most recent constraints from CMB data put an upper bound on the sum of neutrino masses at around 0.17 eV (Couchot et al. 2017); however, this assumes a ΛCDM model that does not necessarily apply beyond this framework.
The final non-Gaussian statistic we consider is the abundance of peaks counted above a given kσ threshold (k = 1, 2, ...). Results for k = 3 and k = 5 are shown in Fig. 7.

Fig. 7 (caption excerpt): The abundance of peaks above a 3σ threshold is shown in the left panel, and above a 5σ threshold in the right panel. An order of magnitude more peaks are detected at each scale with the lower threshold, although deviations from ΛCDM, as well as among the different m_ν cases of f_5(R), are more pronounced with the higher threshold. This indicates that differences between models at this redshift are mostly contained in the highest amplitude peaks. Compared to the M_ap moments, peak counts interestingly do not distinguish well between models with different neutrino mass, but they do distinguish modified from standard gravity, at least at certain scales.

A higher peak value cutoff means that in general fewer peaks will be detected. This is borne out in the two plots of Fig. 7, where the peak count at a fixed scale is about ten times smaller for k = 5 than for k = 3. In addition, the peak count for the smallest scale (ϑ_1 = 0.293′)
is slightly less than that of ΛCDM in both cases but significantly larger than ΛCDM (by up to 20%) by the third scale (ϑ_3 = 1.17′). This is similar to what we have seen for all of the M_ap moments, in particular for f_5(R) with m_ν = 0 and 0.1 eV.
The impact of neutrino mass on the f_5(R) models is less obvious with peak counts than with the other statistics across most scales. This is because the variation between models is much smaller than the range of peak count values across different aperture scales (up to four orders of magnitude). Peak counts are therefore much more sensitive to the scale of observation than the variance, skewness, and kurtosis. We notice as well that for a given aperture size, the spread among f_5(R) models with different neutrino masses is less pronounced than for the M_ap moments. This suggests that peak counts could offer a robust test of modified gravity whether or not neutrinos are considered in the analysis and could efficiently break the degeneracy that affects all other WL statistics.
We have found identical trends to those observed in Figs. 5-7 for the f 4 (R) and f 6 (R) models with m ν > 0. This is also the case for the other source redshift planes across all models. We further explore the evolution with redshift in Section 6.4.
In quantifying the level at which one model is in fact distinguishable from another, visual inspection of plots is, of course, not sufficient. We therefore address this question in detail in the following sections by studying the histograms of the different observables and by computing their discrimination efficiencies.

Distributions of observables
Considering seven filtering scales, four source redshift planes, and five statistical observables in our simulations (variance, skewness, kurtosis, and peak counts above two threshold levels), we have 7 × 4 × 5 = 140 observables available for each simulated cosmology. We are interested in comparing the histograms of each of these observables between different models in order to determine which, if any, is best at breaking degeneracies. We present here only four example plots intended to illustrate our approach and results. Shown in Fig. 8 are the normalised histograms of the M_ap variance, skewness, kurtosis, and peak counts for the ΛCDM and MG models. To compare with previous results, we focus on f_5(R) models with m_ν = (0, 0.1, 0.15) eV and choose a filtering scale of ϑ_2 = 0.586′. We further investigate in the rest of the paper how results depend on the filtering scale and redshift for each statistic. Solid lines indicate smoothing by KDE (cf. Section 5.3). Referring to Figs. 5-7, the mean value of each statistic corresponds to the central positions of the histograms and their relative separations. For example, the variance of the m_ν = 0 eV f_5(R) model at this scale is the most distinct from ΛCDM, while that of the m_ν = 0.15 eV model is the least.
The histograms allow us to see qualitatively how efficient each statistic is at distinguishing between models. Considering f_5(R) with m_ν = 0.15 eV (green), the model most degenerate with ΛCDM (black) in terms of the matter and convergence power spectra, higher-order moments of M_ap do not appear able to break the degeneracy: the skewness and kurtosis histograms overlap with ΛCDM more than the variance does. On the other hand, peak counts, shown here for a 3σ threshold, displace the f_5(R) distribution from that of ΛCDM so that they are nearly disjoint. This is also the case for f_5(R) with the other neutrino masses, supporting the result seen in Fig. 7.
In this example, we have chosen a combination of source redshift and filter scale that provides relatively good separation between the histograms of these models. Sources have been taken at z_s = 2.0 here, compared to z_s = 1.0 in Figs. 5-7, as the contrast between MG models and GR tends to increase with redshift for each observable. It should be noted as well that for many of the observables considered, i.e. for other redshifts and filtering scales, we find that the f(R) model histogram is essentially fully coincident with that of ΛCDM, indicating a negligible discrimination efficiency. Only a relatively small subset of observables appear able to break degeneracies. We therefore further investigate the dependence of discrimination efficiency on redshift and filtering scale in the next subsection.

Fig. 8 (caption excerpt): Considering the case most degenerate with ΛCDM, f_5(R) with m_ν = 0.15 eV, second- and higher-order moments of M_ap do not appear able to distinguish the models. Peak counts, on the other hand, shown here for a 3σ threshold, cleanly separate the two distributions. Moreover, it is interesting that peak counts separate all f_5(R) cases from ΛCDM by approximately the same amount, independent of m_ν.

Discrimination efficiency: variation with statistic, scale, and redshift
In this section we present the discrimination efficiency with respect to ΛCDM of two f 5 (R) models, m ν = (0, 0.15) eV, as a function of M ap statistic, filtering scale, and source galaxy redshift. Figure 9 shows results for f 5 (R) without neutrinos. Each polar plot represents one of the statistics, where for peak counts (lower right) the threshold has been set to 3σ.
Each plot is divided into seven wedges indicated by the bold red lines, where each wedge represents filtering by the aperture ϑ appearing at the outer edge. Apertures are numbered as before according to wavelet scale and correspond to angular sizes (ϑ_1, ..., ϑ_7) = (0.293′, 0.586′, 1.17′, 2.34′, 4.69′, 9.34′, 18.8′). Each wedge contains four bars, color coded according to source redshift, where the height of each bar represents the discrimination efficiency (as a percent) at that redshift and filtering scale. A bar extending from the center of the figure and touching the outer edge expresses a 100% discrimination efficiency, and the scaling with radius is linear.
As we expect from our previous results, we see that the variance can be a relatively good discriminator between ΛCDM and f 5 (R) without neutrinos. In particular, the discrimination efficiency is at least 80% at scales ϑ 2 and ϑ 3 for sources at z s ≥ 1.5. The skewness performs well only for the single aperture of ϑ 2 and z s ≥ 1.5, while the kurtosis is a poor discriminator at all filtering scales and redshifts. Considering that the kurtosis of this model can deviate from ΛCDM by up to 30%, i.e. ∼10% more than the variance (cf. Figs. 5 and 6), it is somewhat surprising that this fourth-order statistic does not offer more discrimination power compared to second order.
The final plot in Fig. 9 reveals that peak counts can discriminate at approximately the same level as the variance, at least for filtering scales ϑ_2 and ϑ_3, or better. Indeed, the peak count discrimination efficiency even exceeds that of the variance at these two scales for z_s = 1.5, for example. We have checked that by raising the peak detection threshold to 5σ, the discrimination efficiency rises to 100% at ϑ_2 for sources at z_s ≥ 1.0. A consequence, however, is that the efficiency is reduced at ϑ_3 for the higher redshift source planes. From the cases we have checked, it appears that the relationship between discrimination efficiency and peak count threshold, filtering scale, and redshift is highly nontrivial. Nonetheless, it is clear that peak counts can outperform standard second-order statistics.

Table 2 (caption): Maximum discrimination efficiency E attained for each MG model according to M_ap statistic, filtering scale ϑ, and source redshift z_s. Results for the variance are shown first, along with the best available statistic underneath. For all models except f_4(R) [m_ν = 0 eV], the unique maximum discrimination efficiency is achieved with peaks counted (p.c.) above a 5σ threshold. For f_4(R) with m_ν = 0 eV, all of the statistics reach 100% discrimination efficiency for at least one combination of ϑ and z_s.

Figure 10 is analogous to Fig. 9 for the f_5(R) model with m_ν = 0.15 eV. Compared to the case without neutrinos, it is apparent that no aperture mass moments up to fourth order are capable of distinguishing this model from ΛCDM. On the contrary, peak counts again show a promising result. For ϑ_2 ≈ 0.6′ filtering, sources at z_s = 2.0 give a 100% discrimination efficiency. For ϑ_3 ≈ 1.2′ filtering, the efficiency is 93% for z_s ≥ 1.5 and 82% for z_s = 1.0. We have checked here as well that raising the detection level to 5σ produces the same effect as it did for the previous case without neutrinos. Given that this model was designed to be degenerate with ΛCDM, it is a useful result, and indeed the main result of the paper, that there exist observables capable of discriminating between the models with high efficiency.
We have carried out the same analysis for the other MG models, i.e. f_R0 = $-10^{-4}$ with m_ν = (0, 0.3) eV and f_R0 = $-10^{-6}$ with m_ν = (0, 0.06, 0.1) eV. A summary of the results is presented in Table 2, which shows the maximum discrimination efficiencies from ΛCDM attained for each model along with the statistic, aperture ϑ, and source redshift z_s that produce them. To highlight the difference between second- and higher-order statistics, results for the variance and for the best discriminating statistic are shown for each model. For all models except f_4(R) [m_ν = 0 eV], the unique maximum efficiency is achieved with peaks counted above a 5σ threshold. The f_4(R) case with m_ν = 0 eV, on the other hand, reaches 100% with all of the statistics for multiple combinations of ϑ and z_s. This is expected given that this model already diverges most significantly from ΛCDM at the power spectrum level.

Conclusion
The cosmological data do not yet point to a unique model within which all observations can be explained; several cosmologies still fit the data. The standard ΛCDM model indeed accommodates the broad range of current observational probes that measure structure growth and the universal expansion across cosmic time. The fundamental nature of the late-time acceleration, however, remains unclear: in particular, whether it is caused by a fluid (dark energy) component, a cosmological constant Λ, or instead by a modification to general relativity at large scales. Many modified gravity models, such as the f(R) family, are still viable given the data.
We have explored in this paper several particular f(R) models that mimic ΛCDM in terms of their background evolution and their matter power spectra at z = 0. The models can be written in terms of two free parameters, n and f_R0, which determine the density and scale at which the modified gravitational interaction takes effect. We fixed n = 1 throughout and considered |f_R0| values within the range $10^{-6}$ to $10^{-4}$.
By design, the models we chose are difficult to distinguish from ΛCDM based on observations at linear scales. Furthermore, using N-body simulations, we showed that the amplitude of the matter power spectrum is larger relative to ΛCDM with increasing | f R0 |, but this can be in turn reduced by the inclusion of massive neutrinos (up to m ν = 0.3 eV) in the model. This is because massive neutrinos suppress the growth of structure on scales smaller than their free-streaming length, thereby compensating the increased clustering due to larger | f R0 |. For certain combinations then of f R0 and m ν , the model well reproduces ΛCDM not only on linear scales, but also into the non-linear regime.
Our primary goal has been to determine whether there are weak-lensing observables that are more efficient than standard second-order statistics in discriminating between GR and MG. We first verified that the convergence power spectrum P κ ( ) and aperture mass variance M 2 ap , both two-point measurements, of our simulated cosmologies behaved as expected and consistently with each other. As with the matter power spectra, deviations from ΛCDM of the MG models decreased as | f R0 | decreased. Then, focusing on f 5 (R), we showed that the discrepancy diminished further by including m ν up to 0.15 eV over the angular scales considered.
The non-Gaussian weak-lensing observables we have studied are the aperture mass skewness (third order), kurtosis (fourth order), and peak counts as a function of map filtering scale and source galaxy redshift. We considered redshifts up to $z = 2.0$ and aperture scales from 0.293 up to 18.8 arcmin. In terms of a space-based survey like Euclid, which aims to achieve a galaxy number density of $\sim 30$ arcmin$^{-2}$ and a pixel scale of at most 1 arcmin for κ maps, we can therefore consider our results for $\vartheta \geq \vartheta_3 = 1.17$ arcmin applicable to such future weak-lensing analyses.
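As an illustration of how such aperture mass moments can be measured from a convergence map, the following is a minimal Python sketch: it builds a simple zero-mean (compensated) Gaussian-derived filter, convolves the map with it via FFT, and computes the standardized moments. The filter profile and normalization are assumptions chosen for simplicity, not the exact filter used in this analysis.

```python
import numpy as np

def compensated_filter(npix, theta_pix, theta_ap):
    """Simple compensated (zero-mean) aperture filter; illustrative only."""
    y, x = np.indices((npix, npix)) - npix // 2
    r = np.hypot(x, y) * theta_pix / theta_ap   # radius in units of the aperture scale
    u = (1.0 - r**2 / 2.0) * np.exp(-r**2 / 2.0)  # Gaussian-derived compensated profile
    u -= u.mean()                                 # enforce zero mean (compensation)
    return u / np.abs(u).sum()

def aperture_mass_map(kappa, theta_pix, theta_ap):
    """Convolve a convergence map with the compensated filter via FFT."""
    u = compensated_filter(kappa.shape[0], theta_pix, theta_ap)
    return np.real(np.fft.ifft2(np.fft.fft2(kappa) * np.fft.fft2(np.fft.ifftshift(u))))

def standardized_moments(m_ap):
    """Variance, skewness, and excess kurtosis of the aperture mass map."""
    x = m_ap - m_ap.mean()
    var = np.mean(x**2)
    skew = np.mean(x**3) / var**1.5
    kurt = np.mean(x**4) / var**2 - 3.0
    return var, skew, kurt
```

For a purely Gaussian input map, the skewness and excess kurtosis of the filtered map should scatter around zero; it is the departure from zero driven by non-linear clustering that carries the non-Gaussian signal.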
For $f_5(R)$ with $m_\nu = 0.15$ eV, the $f_5(R)$ model most closely mimicking ΛCDM, there exist filtering scales for which the skewness and kurtosis measurements deviate more from ΛCDM than the variance does, indicating a greater chance of using these observables to discriminate between the models. This was also the case for peak counts above thresholds of 3σ and 5σ. However, a further study of the discrimination efficiency based on the false detection rate (FDR) formalism showed that only peak counts offered any reliable power to distinguish the MG model from ΛCDM. The discrimination efficiency quantifies the overlap between the distributions of an observable computed over the 256 realizations of each model. We conclude therefore that we are less likely to mistake a true MG model for ΛCDM by measuring peak counts than by measuring second- and higher-order moments of the aperture mass.
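A simplified, overlap-based stand-in for such a discrimination efficiency, quantifying how disjoint the empirical distributions of a statistic are between two sets of realizations, could look like the sketch below. This is not the FDR-based definition used in the analysis, only an illustration of the idea that 100% corresponds to fully disjoint distributions and 0% to identical ones.

```python
import numpy as np

def discrimination_efficiency(stat_a, stat_b, nbins=20):
    """Illustrative overlap-based discrimination efficiency (in percent)
    between two sets of realizations of the same statistic."""
    lo = min(stat_a.min(), stat_b.min())
    hi = max(stat_a.max(), stat_b.max())
    bins = np.linspace(lo, hi, nbins + 1)
    p, _ = np.histogram(stat_a, bins=bins)
    q, _ = np.histogram(stat_b, bins=bins)
    p = p / p.sum()
    q = q / q.sum()
    overlap = np.minimum(p, q).sum()  # shared probability mass of the two histograms
    return 100.0 * (1.0 - overlap)
```

Applied, for instance, to the peak counts of 256 ΛCDM realizations versus 256 MG realizations, a value near 100% would indicate that the two models are essentially never confused by that statistic.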
Fig. 9. Discrimination efficiency with respect to ΛCDM of the MG model $f_5(R)$ with $m_\nu = 0$ eV as a function of statistic, aperture scale, and source redshift. In each plot, the wedges marked by red lines indicate $M_{\rm ap}$ filtering by the aperture $\vartheta$ appearing at the outer edge. Numbered apertures correspond to angular sizes of $(\vartheta_1, \ldots, \vartheta_7) = (0.293, 0.586, 1.17, 2.34, 4.69, 9.34, 18.8)$ arcmin. The radial length of a bar, shaded according to source redshift, represents the discrimination efficiency (in percent) of the statistic at the filtering scale associated with its wedge. The variance of this model suffices to distinguish it from ΛCDM at greater than 80% at scales $\vartheta_2$ and $\vartheta_3$ for sources at $z_s \geq 1.5$. Peak counts (above a 3σ detection threshold) achieve similar results, while kurtosis proves to be a poor discriminator, not exceeding 38% at any filter scale or redshift. Skewness shows discrimination power comparable to the variance and peak counts only at scale $\vartheta_2$.

A particularly interesting feature of peak counts we have found is that they are less sensitive to differences in neutrino mass than to the model of gravity. The example in Fig. 8 demonstrates this in the significant overlap among the three $f_5(R)$ histograms, which are each in turn nearly fully disjoint from the ΛCDM histogram. This behavior was seen as well for the other sets of models with the same value of $f_{R0}$ but differing $m_\nu$. In addition to offering more discrimination power than aperture mass moments, at least for the models we have considered, peaks may generally be a more robust observable for distinguishing between standard and modified gravity.
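Counting peaks above a significance threshold in an aperture mass map can be sketched as follows. The definition of a peak as a pixel exceeding its eight neighbours, and the use of the map rms to define significance, are illustrative assumptions; the peak finder used in the analysis may differ in detail.

```python
import numpy as np

def count_peaks(m_ap, sigma_threshold=3.0):
    """Count local maxima in an aperture mass map whose signal-to-noise
    (pixel value over map rms) exceeds sigma_threshold. A peak is a pixel
    strictly larger than its 8 neighbours; map edges are excluded."""
    snr = m_ap / m_ap.std()
    c = snr[1:-1, 1:-1]                       # interior pixels
    is_max = np.ones_like(c, dtype=bool)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue
            shifted = snr[1 + dy: snr.shape[0] - 1 + dy,
                          1 + dx: snr.shape[1] - 1 + dx]
            is_max &= c > shifted             # strictly above each neighbour
    return int(np.sum(is_max & (c >= sigma_threshold)))
```

A histogram of such counts over the realizations of each model is then the input to the discrimination analysis described above.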
In terms of source redshift dependence, we have found that higher $z_s$ observations generally provide greater discrimination efficiencies between models for every statistic, although we did find exceptions at particular filtering scales. In no case, however, did $z_s = 0.5$ outperform $z_s = 2.0$, which we can understand as the high-redshift convergence maps carrying more significant cosmological information accessible by these statistics. Moreover, as we considered the different redshift planes independently, we have not exploited the evolution of the observables with $z$, which likely constitutes a further signal that could be used to distinguish the models.
We found similar but not precisely equivalent results for the $f_4(R)$ and $f_6(R)$ models (with various neutrino masses) compared to $f_5(R)$. They are similar in the sense that peak counts remain the best observable tested for breaking degeneracies with ΛCDM for any model with $m_\nu > 0$. On the other hand, no filtering scale or source galaxy redshift provided a discrimination efficiency larger than 63%, significantly lower than was seen for the $f_5(R)$ models. As we have not sought in this paper to find an optimal statistic, one may reasonably wonder whether there exist other $M_{\rm ap}$ observables (e.g. from different filter functions, filtering scales, statistics, etc.) that would be better at breaking degeneracies not only between MG and GR, but also within MG models themselves. We leave a dedicated study of this question to future work.

Fig. 10. Discrimination efficiency with respect to ΛCDM of the MG model $f_5(R)$ with $m_\nu = 0.15$ eV as a function of statistic, aperture scale, and source redshift. Plots are analogous to those in Fig. 9. Neutrinos significantly enhance the degeneracy of this model with ΛCDM at the two-point level compared to the $m_\nu = 0$ eV case. This can be seen in the variance plot (upper left), where the discrimination efficiency is less than 15% at all filter scales and redshifts. The higher-order moments show even less discrimination power. On the contrary, peak counts (lower right), again above a 3σ detection threshold, achieve 100% discrimination efficiency for $z_s = 2.0$ at $\vartheta_2$, and above 90% for $z_s \geq 1.5$ at $\vartheta_3$.