Open Access
Issue
A&A
Volume 621, January 2019
Article Number A2
Number of page(s) 12
Section Cosmology (including clusters of galaxies)
DOI https://doi.org/10.1051/0004-6361/201833740
Published online 19 December 2018

© ESO 2018

Licence Creative Commons
Open Access article, published by EDP Sciences, under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

1. Introduction

Upcoming weak-lensing surveys aim to measure cosmological parameters with unprecedented precision. Their very high statistical power requires systematic errors to be very well understood and calibrated. One of the main sources of systematic error for weak gravitational lensing is the bias in the measurement of the galaxy shear, which carries the cosmological information about the large-scale structure and its evolution. For upcoming experiments such as Euclid (Laureijs et al. 2011), the Large Synoptic Survey Telescope (LSST, LSST Science Collaboration 2009), or the Wide Field Infrared Survey Telescope (WFIRST, Spergel et al. 2013), we need to calibrate shear biases to sub-percent precision. Traditional calibration methods create large suites of galaxy image simulations and estimate the shear bias for a given shape measurement method, point spread function (PSF), galaxy population, noise level, and other factors. To date, the uncertainty of shear bias estimates has been dominated by the intrinsic ellipticity dispersion of the simulated galaxies. Reaching the desired precision requires a simulation volume that exceeds the actual observational data that are to be calibrated, with billions of simulated galaxies. This has a dramatic impact on the computational cost of both generating the simulations and running the shape measurement methods, and therefore limits the complexity, storage, and re-usability of the simulations (e.g. Hoekstra et al. 2017).

Existing methods to reduce the number of simulations use rotated copies of galaxies with the same shear, such that their mean intrinsic ellipticity cancels out. The first proposed method was the so-called ring test (Nakajima & Bernstein 2007), which uses galaxies with constant shear and constant modulus of intrinsic ellipticity, evenly distributed in orientation. Massey et al. (2007) reduced the number of objects to a pair of orthogonally oriented galaxies. Both approaches result in a zero net intrinsic ellipticity. This reduces the error of the estimated shear, but does not entirely cancel out the contribution from the measured shapes. Stochasticity in the measurement, for example due to pixel noise and the PSF, will perturb the exact shape-noise cancellation. In addition, systematic biases break the input ellipticity symmetry. Sources of these systematic effects are ellipticity bias (the response of the measurement to intrinsic ellipticity) when it depends on the orientation of the galaxy with respect to the coordinate system, the PSF, or the shear, and also selection effects, including non-equal galaxy weights. The selection-induced shear bias calibration for the Hyper Suprime-Cam (HSC) survey was performed without shape-noise suppression (Mandelbaum et al. 2018).

In the case of simulating fields with non-constant shear, noise suppression can be achieved by simulating the intrinsic shape-noise distribution of galaxies as a pure B-mode field. Using an estimate of the E-mode power spectrum or real-space E-mode correlation is then insensitive to the intrinsic shape noise (Kitching et al. 2011). However, simulating a realistic intrinsic ellipticity distribution inevitably leads to a power leakage from B to E (Mandelbaum et al. 2014). In addition, measurement stochasticity and biases, causing imperfect noise cancellation as described above, also apply in the case of variable shear.

In this paper we propose a new method to estimate the shear bias from simulations that is insensitive to the shear estimator noise coming from both the intrinsic and the measured ellipticity dispersion. This reduces the number of required simulated galaxies by three orders of magnitude. Even though we estimate a bias for each galaxy individually, for calibration purposes we only use the mean bias averaged over a sample of galaxies. This avoids unstable ratios of two noisy quantities.

Our method is inspired by the metacalibration technique (Sheldon & Huff 2017; Huff & Mandelbaum 2017), where the shear bias is estimated as the shape estimator response to a small shear applied directly to the individual images. This technique is used to calibrate shear bias on real data without the need to create simulations (Zuntz et al. 2018). However, it requires operations on the images such as PSF de- and re-convolution, and subtraction and addition of noise components. Our method is not a calibration technique, but a shear bias estimation. As it is simulation-based, we do not need to de-convolve and de-noise observed images with an estimated PSF, which is notoriously difficult. We can apply any shear to the simulated image before the PSF convolution step. The challenge for shear calibration using our method, as with all simulation-based techniques, is to closely match the properties of the simulations to the observed data in order to minimize the important biases due to selection effects (see e.g. Fenech Conti et al. 2017).

This paper is organized as follows. In Sect. 2 we define the basic required concepts. Section 3 introduces our new shear bias estimation method and contrasts it with existing ones. In Sect. 4 we analytically compute the precision of our method and compare it to existing shear bias estimation techniques. In Sect. 5 we describe the galaxy image simulations, which we use in Sect. 6 to test our analytical descriptions, and to compare our method with existing ones. We discuss potential applications of our method in Sect. 7, and give a summary in Sect. 8.

2. Definitions

2.1. Shear bias

We define multiplicative and additive shear bias for a population of galaxies. Let gα be the shear of a given galaxy and $e^{\rm obs}_\alpha$ its observed ellipticity, where α = 1, 2 stands for the two components of the complex shear and ellipticity. If the mean intrinsic ellipticity of the galaxy sample is zero, we can estimate the mean reduced shear as the average of the observed ellipticities,

$$\langle e^{\rm obs}_\alpha \rangle = c_\alpha + (1 + m_\alpha)\, g_\alpha \tag{1}$$

This estimator is biased in general, and cα, mα are the ensemble-average additive and multiplicative shear biases (Huterer et al. 2006; Heymans et al. 2006). Here and in the following, we ignore non-linear contributions to the bias. We also neglect higher-order terms, for example the term g*eI in the denominator of the shear estimator introduced in Seitz & Schneider (1997). Further we set the convergence κ = 0 such that the (observable) reduced shear g = γ/(1 − κ) equals the shear γ.

Alternatively, we can describe a shear bias for individual galaxies as the response of the observed ellipticity to a small shear distortion (Huff & Mandelbaum 2017; Sheldon & Huff 2017):

$$R_{\alpha\beta} = \frac{\partial e^{\rm obs}_\alpha}{\partial g_\beta} \tag{2}$$

The shear response R is a 2 × 2 matrix whose diagonal (off-diagonal) terms represent the response of the ellipticity measurements to shear changes of the same (opposite) component. An additive shear bias for individual galaxies is defined as

(3)

where $e^{\rm I}_\alpha$ is the intrinsic ellipticity. We note that $e^{\rm I}_\alpha$ can be defined if we use an analytic expression for the galaxy profile, but complex galaxy morphologies do not necessarily have a unique “true” ellipticity, in which case we cannot measure aα using Eq. (3). We can however estimate the additive bias if $\langle e^{\rm I}_\alpha \rangle = 0$ over the population, which can be fulfilled under certain symmetry assumptions without the need to define intrinsic ellipticity, as we describe later in Sect. 3.1.

A perfect shape estimation corresponds to R being the unit matrix and aα = 0. If the shape measurement conserves the spin-2 property of ellipticity and shear, Rαβ needs to be a combination of a scalar and a spin-4 tensor. If we neglect the latter, the response collapses to a single non-zero number R11 = R22, with R12 = R21 = 0.

2.2. Shear calibration

A simulation-based calibration of measured shear estimates typically measures ensemble biases mα, cα from a large number of image simulations with different galaxy properties and shear gα via Eq. (1). A calibrated shear estimate is then obtained by correcting the observed ellipticities by the ensemble biases, $(\langle e^{\rm obs}_\alpha \rangle - c_\alpha)/(1 + m_\alpha)$, provided the simulated galaxy population matches the data in all relevant properties.

When measuring the individual responses using Eqs. (2) and (3), an unbiased shear estimator is given as $\langle R^{-1} e^{\rm obs} - \langle a \rangle \rangle \approx \langle R^{-1} R \gamma \rangle$ (see Sheldon & Huff 2017). Below we present our method to compute R for each simulated galaxy without being sensitive to shape noise.

3. Shear bias measurement methods

3.1. Our method: Shear bias estimation reducing measurement noise

We measure the shear response using Eq. (2) for individual simulated galaxy images as follows. For each simulated galaxy with given properties, intrinsic ellipticity eI and given shear g (which can but need not be zero), we create additional, sheared versions of the same galaxy. The galaxy is analytically sheared before PSF convolution and noise addition, so that the differences between the images only come from the shear. We then approximate the shear response by finite differences, following Huff & Mandelbaum (2017),

$$R_{\alpha\beta} \approx \frac{e^{{\rm obs},+}_\alpha - e^{{\rm obs},-}_\alpha}{2\,\Delta g_\beta} \tag{4}$$

where $e^{{\rm obs},\pm}_\alpha$ is the measured ellipticity of the image with additional small shear ±Δgβ. With three sheared images we can estimate all components of R for each galaxy. To determine the shear response averaged over a sample of galaxies, we only require two appropriately chosen shear values (see Sect. 5 and Appendix A for more details).

To further reduce the stochasticity of our response estimator, we use the same noise realization for all image copies for each galaxy. This guarantees that the contribution from the intrinsic ellipticity exactly cancels out for our bias estimator. Then, the intrinsic ellipticity can be considered as just another property of the galaxy (like the flux or radius) and as such affects the shear bias in a deterministic way, but does not contribute to the statistical uncertainty. Therefore, we can obtain a much more precise bias estimation compared to methods that average over observed galaxy ellipticities. This will be quantified in Sect. 4.

When randomizing the noise for each image, we obtain the same mean but noisier response values. Keeping the same noise realization of our images is not an artificial noise reduction in the bias estimate, it only helps us to obtain a noise-free numerical derivative. The noise properties will be sufficiently well sampled by the different simulated galaxies. The additive shear bias for each galaxy is measured using Eq. (3) on the original, non-sheared image.
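The effect of sharing the noise realization between the sheared copies can be illustrated with a minimal toy sketch. It is not the GALSIM/KSB pipeline used in this paper: the galaxy and PSF are assumed to be Gaussians (so that shear and PSF convolution can be applied analytically to the covariance matrix), the shape estimator is a simple fixed-weight moment ellipticity, and all function names and numerical values (`render`, `measure`, `response_11`, stamp size, variances, noise level) are hypothetical choices made for illustration.

```python
import numpy as np

SIZE = 33  # postage-stamp side length in pixels (toy value)
Y, X = np.mgrid[:SIZE, :SIZE] - SIZE // 2

def render(e_int, g, gal_var=6.0, psf_var=4.0):
    """Noise-free Gaussian galaxy: intrinsic shape, then shear, then PSF."""
    q = gal_var * np.array([[1 + e_int[0], e_int[1]],
                            [e_int[1], 1 - e_int[0]]])
    a = np.array([[1 + g[0], g[1]], [g[1], 1 - g[0]]])  # linear shear map
    cov = a @ q @ a.T + psf_var * np.eye(2)  # Gaussian PSF adds its covariance
    inv = np.linalg.inv(cov)
    r2 = inv[0, 0] * X**2 + 2 * inv[0, 1] * X * Y + inv[1, 1] * Y**2
    return np.exp(-0.5 * r2)

def measure(img, w_var=9.0):
    """Ellipticity from second moments with a fixed circular Gaussian weight."""
    w = np.exp(-0.5 * (X**2 + Y**2) / w_var) * img
    q11, q22, q12 = (w * X * X).sum(), (w * Y * Y).sum(), (w * X * Y).sum()
    return np.array([q11 - q22, 2 * q12]) / (q11 + q22)

def response_11(e_int, noise_plus, noise_minus, dg=0.02):
    """Central finite difference of e1 with respect to g1."""
    e_plus = measure(render(e_int, (dg, 0.0)) + noise_plus)
    e_minus = measure(render(e_int, (-dg, 0.0)) + noise_minus)
    return (e_plus[0] - e_minus[0]) / (2 * dg)

rng = np.random.default_rng(1)
e_int = (0.2, -0.1)
common, independent = [], []
for _ in range(50):
    n1, n2 = rng.normal(0.0, 0.01, (2, SIZE, SIZE))
    common.append(response_11(e_int, n1, n1))       # same noise in both copies
    independent.append(response_11(e_int, n1, n2))  # fresh noise per copy

print("common noise:      mean %.3f, std %.4f" % (np.mean(common), np.std(common)))
print("independent noise: mean %.3f, std %.4f" % (np.mean(independent), np.std(independent)))
```

In this toy setting both variants scatter around a similar mean response, while the estimate with independent noise per copy is considerably noisier, because the noise no longer cancels in the numerator of the finite difference.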

In Fig. 1 we show an example of the estimated component of the response, R11, for one galaxy image. The finite-difference estimate is insensitive to the shear value as long as it is small, |Δgα|≲0.05 for α = 1, 2. More details about the robustness of our new estimator are presented in Appendix A.

Fig. 1. Scheme of shear response estimation for a single galaxy for R11.

From the measurements of individual galaxy shear biases, we estimate the ensemble multiplicative and additive bias of a galaxy population as the average of the individual estimates, respectively ⟨Rαα⟩ and ⟨aα⟩. This can be a weighted average if galaxies have different weights. As mentioned before, we do not need to define $e^{\rm I}_\alpha$ as long as $\langle e^{\rm I}_\alpha \rangle = 0$, since the intrinsic ellipticity then drops out of the population average ⟨aα⟩. This is true for the usual cases of study with randomly oriented galaxies (assuming eI transforms under rotations like a spin-2 quantity) when the shape estimators have no preferred direction (which is expected for most estimators). We ignore the non-diagonal terms of R, as we have found that their contribution averages out to zero if the shear values are symmetrical around zero (see Appendix A). In the following two subsections, we review two commonly used calibration methods to estimate the shear bias.

3.2. Linear fit estimation

The most common method to estimate the shear bias in the literature is to perform a linear fit of Eq. (1) to simulated sheared galaxy images (e.g. Heymans et al. 2006; Miller et al. 2013; Zuntz et al. 2013; Mandelbaum et al. 2015, 2018; Fenech Conti et al. 2017; Huff & Mandelbaum 2017; Hoekstra et al. 2017; Pujol et al. 2017; Zuntz et al. 2018). For each galaxy population (e.g. for each bin of given galaxy properties) we obtain the additive and multiplicative biases cα and mα from a linear fit of the measured ellipticities as a function of simulated input shear, as illustrated in the top panel of Fig. 2. The error of the parameter estimation can then be obtained by jackknife resampling and obtaining the distribution of best-fit parameters for each resample.

Fig. 2. Scheme of the estimation of biases m1 and c1 from the linear fit of the distribution of $e^{\rm obs}_1$ as a function of g1. In the top panel, the white dashed line shows the linear fit of the distribution, represented by the colour map. In the bottom panel, the blue points and error bars show the mean $e^{\rm obs}_1$ of the galaxies with the same shear. The black dashed line shows the weighted linear fit of the blue points.

Alternatively, the straight line can be fitted to the average measured ellipticities for each input shear, $\langle e^{\rm obs}_\alpha \rangle$, as shown in the bottom panel of Fig. 2. Both fitting schemes provide consistent values and error bars for the shear bias parameters.
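As a rough illustration of the linear fit with jackknife errors, the following sketch fits a straight line to a toy catalogue. The toy measurement model (a linear response plus Gaussian shape noise), the input values m = −0.05, c = 0.002 and σe = 0.26, and the helper names `fit_bias` and `jackknife_bias` are assumptions made for this example and are not taken from the analysis in this paper.

```python
import numpy as np

def fit_bias(g, e_obs):
    """Least-squares fit of e_obs = c + (1 + m) g; returns (m, c)."""
    slope, intercept = np.polyfit(g, e_obs, 1)
    return slope - 1.0, intercept

def jackknife_bias(g, e_obs, n_sub=50, seed=0):
    """Delete-one-group jackknife errors on m and c."""
    rng = np.random.default_rng(seed)
    groups = rng.permutation(len(g)) % n_sub  # random groups of near-equal size
    est = np.array([fit_bias(g[groups != k], e_obs[groups != k])
                    for k in range(n_sub)])
    err = np.sqrt((n_sub - 1) / n_sub * ((est - est.mean(axis=0)) ** 2).sum(axis=0))
    return fit_bias(g, e_obs), err

# Toy catalogue (assumed numbers): m = -0.05, c = 0.002, shape noise 0.26
rng = np.random.default_rng(42)
n = 200_000
g = rng.uniform(-0.05, 0.05, n)
e_obs = 0.002 + 0.95 * g + rng.normal(0.0, 0.26, n)

(m, c), (sig_m, sig_c) = jackknife_bias(g, e_obs)
print(f"m = {m:+.4f} +/- {sig_m:.4f},  c = {c:+.5f} +/- {sig_c:.5f}")
```

Even with 200 000 toy galaxies the error on m stays at the percent level, since it is set by the ratio of the shape noise to the spread of the input shear.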

3.3. Linear fit estimation with shape-noise suppression

The precision of the linear fitting technique to measure shear bias is limited by shape noise stemming from the intrinsic ellipticity distribution. Reducing this noise requires the use of a very large number of galaxy images. An alternative method to reduce the shape-noise contribution is to force the mean ellipticity to cancel out, by simulating orthogonal pairs of galaxy images (Massey et al. 2007; Mandelbaum et al. 2014). As described in Massey et al. (2007), the estimated shear of a pair of orthogonal objects is

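$$\hat{g}_\alpha = \frac{1}{2}\left(e^{{\rm obs},A}_\alpha + e^{{\rm obs},B}_\alpha\right) \tag{5}$$

where $e^{{\rm obs},A}_\alpha$ and $e^{{\rm obs},B}_\alpha$ are the observed ellipticities of the two orthogonal galaxies A and B, whose intrinsic ellipticities cancel each other out exactly, for both α = 1, 2.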

The shear bias is then estimated from a linear fit of the pair-averaged ellipticity of Eq. (5) as a function of gα. This estimator is an improvement over the simple linear fit reviewed in the previous section, with reduced contribution from shape noise. However, the observed ellipticities in the absence of shear do not cancel each other out in general, due to various effects. First, the stochasticity of the two (assumed to be independent) ellipticity measurements means that their sum is a random variable with non-zero dispersion. We model this dispersion in Sect. 4.3. Second, ellipticity bias can be different between the two galaxies of an orthogonal pair. Ellipticity bias can be defined from a linear fit between observed and true ellipticities (see Eq. (1) from Pujol et al. 2017) when a true ellipticity can be defined, and it depends on the galaxies’ orientation, either with respect to the pixel coordinate system or to the PSF (Pujol et al. 2017). This can cause the estimated shear of orthogonal pairs to be biased with respect to gα (Kacprzak et al. 2012; Pujol et al. 2017). Third, selection effects can break the symmetry if one of the two galaxies is missed. This selection can occur at the detection level or the shape measurement stage, both of which can fail for one of the two objects. This could be due to a dependence on the relative orientation of the galaxy with respect to the PSF, or to random noise fluctuations, in particular in the low-SNR range. Fourth, when accounting for galaxy weights, the ellipticity cancellation is broken.

A generalization of this method consists in simulating sets of n galaxies on a ring with constant |eI|, rotated uniformly such that their mean intrinsic ellipticity is zero (Nakajima & Bernstein 2007). The case with n = 2 corresponds to the case of orthogonal pairs discussed above. In Sect. 4.3 we show that increasing n beyond n = 2 does not reduce the shape-noise contribution to the shear bias estimator.
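The shape-noise cancellation for orthogonal pairs, and its limitation by measurement stochasticity, can be sketched with a deliberately simple toy model. Here the observed ellipticity is assumed to be the sum of intrinsic ellipticity, shear, and independent Gaussian measurement noise, with a perfect response; the values of the shear, σe = 0.26, and the per-image scatter 0.05 are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
n_pairs = 100_000
g1 = 0.03                               # constant input shear (toy value)
e_int = rng.normal(0.0, 0.26, n_pairs)  # intrinsic e1 of galaxy A
sigma_noise = 0.05                      # measurement scatter per image (assumed)

# Galaxy B is galaxy A rotated by 90 degrees, which flips the sign of e_int.
e_a = +e_int + g1 + rng.normal(0.0, sigma_noise, n_pairs)
e_b = -e_int + g1 + rng.normal(0.0, sigma_noise, n_pairs)
pair_mean = 0.5 * (e_a + e_b)

singles = np.concatenate([e_a, e_b])
print("scatter of single e_obs :", singles.std())
print("scatter of pair average :", pair_mean.std())
print("mean of pair average    :", pair_mean.mean())
```

In this sketch the intrinsic ellipticity drops out of the pair average exactly, and the remaining scatter comes only from the independent noise of the two measurements. An orientation-dependent ellipticity bias or a selection that removes one member of a pair, as discussed above, would reintroduce part of the shape noise.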

4. Error estimation

In this section we study and compare the precision of the different shear bias estimators. Here, a Latin index of shear, ellipticity, bias, and so on indicates a galaxy number from a population. The figures shown in this section are obtained from the simulated images described in Sect. 5, and only serve for a visualization of our method. Their quantitative analysis is left for Sect. 6.

4.1. Our method: Shear bias estimation reducing measurement noise

Each galaxy i with properties Pi has a shear response Ri estimated as described in Sect. 3.1, from different sheared versions of the original simulated galaxy image with the same noise realization. The response Ri depends deterministically on Pi, given by the input parameters of the simulated image and the PSF, and stochastically on the random processes of the image realization. The latter in our case is a simple Gaussian pixel noise realization, but we can include other effects such as Poisson noise and cosmic rays. The effects on R from this stochasticity can be measured by repeatedly estimating Ri for fixed Pi with different noise realizations. This provides us with samples from the probability density function (PDF) of Ri(Pi). This PDF defines the uncertainty σN, α for both components of the estimated shear response due to stochastic effects.

In Fig. 3 we show two examples of this stochasticity coming from noise. We have measured R for 10 000 different noise realizations for each of the two galaxies shown in the figure (see Sect. 5 for details on the simulated images and shape measurement). As before, for each realization we do not change the noise for the original and the four sheared versions of the image. The mean responses ⟨Ri⟩ depend on the galaxy properties Pi. In general, the response is further from 1 for small galaxies (top panel) and closer to 1 for large galaxies (bottom panel), and the two response components can be different, as in the top panel. These results are consistent with the bias results from Pujol et al. (2017).

Fig. 3. Stochasticity of the measurement of R due to noise. The upper and lower panels show the distribution of R11 (blue histogram) and R22 (green histogram) for two different galaxies with different properties, shown as inlaid postage stamps.

The dispersion for each component σN, α of the response depends on the noise level and on the properties P of the object. The dispersion is generally larger for smaller objects. For our shear estimation method, we only measure Rαα once per galaxy, which means that each shear response Rααi(Pi) has a stochasticity of σN, αi.

Quantifying σN, α allows us to estimate the number of galaxies we need to simulate such that the stochasticity is smaller than the uncertainty we want to obtain. To meet an allowed shear bias uncertainty of σreq, α, assuming that all galaxies have the same stochasticity σN, α (alternatively one can use the mean, or a worst-case value), we would need at least $N = (\sigma_{N,\alpha}/\sigma_{{\rm req},\alpha})^2$ image simulations not to be dominated by pixel noise.

In the following, for the calculation of the precision of our estimator, we do not try to disentangle the contributions from noise and galaxy properties. Our bias estimator mα for a sample of N equally weighted galaxies (the application of different weights is discussed in Sect. 7.2) is the average of the individual shear responses,

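$$1 + m_\alpha = \langle R_{\alpha\alpha} \rangle = \frac{1}{N} \sum_{i=1}^{N} R_{\alpha\alpha, i} \tag{6}$$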

The uncertainty of the estimated response is

$$\sigma_{m,\alpha} = \frac{\sigma_{R,\alpha}}{\sqrt{N}} \tag{7}$$

where σR, α is the standard deviation of the distribution of Rαα.

Analogously, the additive bias is estimated as

$$c_\alpha = \langle a_\alpha \rangle = \frac{1}{N} \sum_{i=1}^{N} a_{\alpha, i} \tag{8}$$

with uncertainty

$$\sigma_{c,\alpha} = \frac{\sigma_{a,\alpha}}{\sqrt{N}} \tag{9}$$

where now σa, α corresponds to the dispersion of the additive bias over the galaxy population. Figure 4 shows the distributions of R11 and a1 for our sample of simulated images (see Sect. 5). Only the multiplicative bias is insensitive to the ellipticity distribution or uncertainty. The additive bias estimated using Eq. (8) is still affected by shape noise.
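The ensemble estimates and their uncertainties amount to a mean and a standard error over the per-galaxy values. A minimal sketch follows, assuming the convention that a perfect measurement has ⟨Rαα⟩ = 1 (so that the multiplicative bias is the mean response minus one) and equal weights; the function names and the toy input distributions are hypothetical.

```python
import numpy as np

def ensemble_bias(r_diag, a):
    """Mean biases and standard errors from per-galaxy R_aa and a_a values."""
    n = len(r_diag)
    m = np.mean(r_diag) - 1.0                    # assumes <R> = 1 means no bias
    sig_m = np.std(r_diag, ddof=1) / np.sqrt(n)  # sigma_R / sqrt(N)
    c = np.mean(a)
    sig_c = np.std(a, ddof=1) / np.sqrt(n)       # sigma_a / sqrt(N)
    return m, sig_m, c, sig_c

def n_required(sigma_noise, sigma_req):
    """Smallest N for which sigma_noise / sqrt(N) does not exceed sigma_req."""
    return int(np.ceil((sigma_noise / sigma_req) ** 2))

# Toy per-galaxy values (assumed distributions, for illustration only)
rng = np.random.default_rng(5)
r11 = rng.normal(0.95, 0.20, 10_000)
a1 = rng.normal(0.001, 0.26, 10_000)
m, sig_m, c, sig_c = ensemble_bias(r11, a1)
print(f"m1 = {m:+.4f} +/- {sig_m:.4f},  c1 = {c:+.4f} +/- {sig_c:.4f}")
print("N needed for a toy requirement:", n_required(1.0, 0.5))
```

With these toy numbers the error on c1 is larger than the error on m1 because the additive bias still carries the spread of the ellipticities, in line with the remark above that only the multiplicative bias is insensitive to shape noise.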

Fig. 4. Distribution of R11 (top) and a1 (bottom) for the 2 million simulated galaxies. The second component of the biases shows similar distributions.

4.2. Linear fit estimation

The observed ellipticity of a galaxy i with properties Pi can be defined as

(10)

where gαi is the shear and Sαi is the stochasticity around the linear regression of the measurement for galaxy i, which will be dominated by the intrinsic ellipticity $e^{\rm I}_{\alpha i}$. We write the dependence of observed on intrinsic ellipticity as $e^{\rm obs}_\alpha = f(e^{\rm I}_\alpha)$ with some generic function f. In general, f is not the identity that would represent a perfect measurement. Because ellipticity is typically larger than shear, this relation is likely to be non-linear. When comparing the predictions with results from data, we only make the weak assumption that Sα is dominated by $e^{\rm I}_\alpha$.

For the linear fit to Eq. (10) we use a set of values of gα and $e^{\rm I}_\alpha$, whose distributions have dispersions σg, α and σe, α, respectively. In Fig. 5 we show these distributions measured on our simulated images, which we describe in more detail in Sect. 5.

Fig. 5. Distributions of $e^{\rm I}_1$ and g1 used for the two million simulated galaxies. The second component shows similar distributions.

The best values of (1 + mα) and cα obtained from a linear regression fit from Eq. (10) are given by Kenney & Keeping (1962) as

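$$1 + m_\alpha = \frac{\langle g_\alpha e^{\rm obs}_\alpha \rangle - \langle g_\alpha \rangle \langle e^{\rm obs}_\alpha \rangle}{\langle g_\alpha^2 \rangle - \langle g_\alpha \rangle^2} \tag{11}$$

$$c_\alpha = \frac{\langle e^{\rm obs}_\alpha \rangle \langle g_\alpha^2 \rangle - \langle g_\alpha \rangle \langle g_\alpha e^{\rm obs}_\alpha \rangle}{\langle g_\alpha^2 \rangle - \langle g_\alpha \rangle^2} \tag{12}$$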

Assuming ⟨gα⟩=0, these relations become

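$$1 + m_\alpha = \frac{\langle g_\alpha e^{\rm obs}_\alpha \rangle}{\langle g_\alpha^2 \rangle} \tag{13}$$

$$c_\alpha = \langle e^{\rm obs}_\alpha \rangle \tag{14}$$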

We assume that Rαα and gα are not correlated, which is a very good approximation since the shear bias is linear with gα. Then, with

(15)

we find

(16)

The estimated mα is consistent with our method if ⟨Sαgα⟩=0. A correlation between these two quantities would effectively modify the slope of the distribution of Eq. (10), resulting in a biased estimate of mα. For our method this condition does not need to be fulfilled.

We can estimate the error σm, α on mα via simple Gaussian error propagation assuming that the uncertainties in Rααi and Sαi are uncorrelated. This assumption would be violated if the shape estimator has a shear bias that depends on ellipticity.

We test our assumptions and approximations in Sect. 6, where we compare the numerical predictions with measurements from simulated images. The sensitivity of the bias with respect to these two quantities is

(17)

Replacing for simplicity the individual galaxies’ dispersions σR, αi and σS, αi by the mean values, we get

(18)

(19)

Compared to Eq. (7), this expression contains an additional term that depends on the stochasticity dispersion σS, α. In most scenarios this is indeed the dominant term for the bias dispersion, which is the main reason why the linear fit achieves a much lower precision in bias estimation compared to our method.

The uncertainty on the additive bias comes directly from the dispersion in the stochasticity,

$$\sigma_{c,\alpha} = \frac{\sigma_{S,\alpha}}{\sqrt{N}} \tag{20}$$

4.3. Linear fit with shape-noise suppression

Here we estimate the uncertainty of the shape-noise suppression estimator (Eq. (5)), which we write in a similar way to Eq. (10) as

(21)

The difference to Eq. (10) is that the index i now denotes a pair of orthogonal galaxies. The stochasticity depends on the sum of the observed ellipticities of the orthogonal pair,

(22)

In the scenario of a perfect shape estimator, the sum vanishes exactly. However, a shape estimator typically has a non-zero ellipticity bias,

(23)

for X = A, B, and gα = 0. If the ellipticity bias depends on the galaxy orientation, or the relative orientation between galaxy and PSF or shear, the two bias values bα, A and bα, B are in general not equal, and we find

(24)

We have measured |bαi, A − bαi, B| and found it can be up to 2% when one of the pairs is aligned with the shear.

The shear bias uncertainties σm, α and σc, α are computed via Eqs. (19) and (20) derived in the previous section, but with σS, α given by the dispersion of Eq. (24). This is a clear improvement, since the pre-factor |bαi, A − bαi, B| can be expected to be smaller than unity. In addition, if the noise realization is different for each of the objects A and B, this measurement is stochastic even if ⟨bαi, A⟩=⟨bαi, B⟩. This stochasticity contributes to σm, α and σc, α, which we denote with σeobs, α. In the general ring estimator case where we simulate n rotated copies of each galaxy to suppress shape noise, with , we can write

(25)

Keeping the total number of galaxies used in the linear fit constant, which is now N/n, we get

(26)

and

(27)

We can see that forcing shape-noise suppression gives a more precise m than the simple linear fit as long as σeobs, α ≲ σe, α. We also see that σc, α does not depend on the number of galaxies used for the shape-noise suppression, but σm, α increases with n. However, in our derivation we neglected the higher-order contributions in the shear estimator (Eq. (1)), which decrease with n. In this paper we do not quantify the optimal n that minimizes both contributions, since the second term in Eq. (26) dominates over these other two quantities. In conclusion, n does not significantly affect σm, α.

5. Simulations

For this analysis we used the public software package GALSIM (Rowe et al. 2015) to generate isolated images of two million galaxies, corresponding to the Control-Space-Constant branch of the GREAT3 challenge (Mandelbaum et al. 2014, 2015). The images are organised into 200 fields, each field with a unique PSF and shear (both constant for each field). The galaxy light distribution follows either a single Sérsic profile or a de Vaucouleurs bulge plus exponential disk.

Each galaxy is simulated twice, the second one being rotated by 90 degrees with respect to the first one to achieve shape-noise suppression. For more details about the simulated images, we refer the reader to Pujol et al. (2017) as well as Mandelbaum et al. (2014). This set of simulations is used for the linear fit methods, with (Sect. 3.3) and without (Sect. 3.2) shape-noise suppression. For the fit without shape-noise suppression, we average the observed ellipticity of all galaxies for a given shear g, not specifically accounting for the orthogonal pairs when calculating the error bars (so we do not keep the galaxy pairs in the same jackknife subsamples). This results in a mean ellipticity in each bin close to zero, but does not reduce the scatter due to the intrinsic shape noise.

For our estimation of R as described in Sect. 3.1, we simulate the two million galaxies three times, with two sheared versions whose shear values are drawn from the cases g = (±0.02, 0) and g = (0, ±0.02). The two values chosen have to differ in both components in order to be able to estimate both R11 and R22. Since both components of g change for each of the shear versions, the estimation of Rαα is affected by the non-diagonal terms as follows:

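$$\frac{\Delta e^{\rm obs}_\alpha}{\Delta g_\alpha} = R_{\alpha\alpha} + R_{\alpha\beta}\, \frac{\Delta g_\beta}{\Delta g_\alpha}, \qquad \beta \neq \alpha \tag{28}$$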

Our estimation of ⟨Rαα⟩ is unbiased as long as ⟨Δgβ⟩ = 0, for β ≠ α, over the entire sample. This is the case since we choose the sign of the shear changes at random. We can also measure the non-diagonal terms of R by using three images with shear values from g = (0, ±0.02), g = (±0.02, 0), and g = (0, 0) (see Appendix A for more details).
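The cancellation of the non-diagonal terms through random signs can be checked with a short toy calculation. The sketch below assumes a purely linear measurement with the same, arbitrarily chosen response matrix for every galaxy, and two sheared images per galaxy, one sheared along each component with a random sign; intrinsic ellipticity and noise are omitted because they cancel in the difference when the noise realization is shared.

```python
import numpy as np

rng = np.random.default_rng(7)
n_gal = 10_000
R = np.array([[0.90, 0.05],
              [-0.04, 0.85]])  # assumed true response, identical for all galaxies
dg = 0.02

# Image A is sheared along component 1, image B along component 2, random signs.
s1 = rng.choice([-1.0, 1.0], n_gal)
s2 = rng.choice([-1.0, 1.0], n_gal)
g_a = np.stack([s1 * dg, np.zeros(n_gal)], axis=1)
g_b = np.stack([np.zeros(n_gal), s2 * dg], axis=1)

# Linear toy measurement e_obs = R g for each image.
e_a, e_b = g_a @ R.T, g_b @ R.T
delta_e, delta_g = e_a - e_b, g_a - g_b

# Element-wise ratio: column 0 estimates R11, column 1 estimates R22.
r_hat = delta_e / delta_g
print("mean estimate of (R11, R22):", r_hat.mean(axis=0))
print("per-galaxy scatter from non-diagonal terms:", r_hat.std(axis=0))
```

Each individual estimate is offset by plus or minus the non-diagonal element, so the per-galaxy values scatter, but the offsets average out over the sample because the relative sign of the two shear changes is random.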

Galaxy shapes are obtained with the method from Kaiser et al. (1995; KSB), using the publicly available code SHAPELENS (Viola et al. 2011). This method estimates the ellipticity of the objects from the surface brightness moments

(29)

defining the ellipticity as

(30)

The implementation details of the shape measurement algorithm are not very relevant for this paper, and we refer the reader to Pujol et al. (2017), where we used the same methodology.

6. Results

In the top panel of Fig. 6 we compare the shear bias obtained with our method to the linear fit technique. As an example of galaxy property we use the input disk flux Fd of the simulated bulge+disk galaxies. We show that both methods give consistent results when using all two million galaxies. However, our method estimates the biases with a significantly better precision. The location of the points on the x-axis corresponds to the centre of the Fd bins. In addition to a small shift that we apply for an easier visual comparison, the bin centres for our method in the lower panel are modified, since the galaxies are now a random subsample. It is remarkable that when using all two million galaxies, the curves of m1 and m2 for our method are almost identical.

Fig. 6. Multiplicative shear bias as a function of the disk flux Fd, measured with our method (black lines) and from the linear fit to Eq. (1) (orange lines). Solid (dashed) lines correspond to m1 (m2). The top panel shows the results using the same number of objects for both methods. In the bottom panel, only 1/1300 of the objects have been used for our method.

We quantify the precision of the different shear bias estimation methods in Fig. 7 as a function of the number of simulated galaxies Nsim. We create different random subsets of galaxies with size Nsim, and measure for each subset the shear bias for the three methods as described in Sect. 3. We compute the rms for each subset by jackknife resampling of the input galaxies for all methods, using 50 subsamples (other numbers of subsamples have given the same results).

Fig. 7. rms of the multiplicative (top panel) and additive (bottom panel) shear bias. We compare our method (red/orange lines) to the linear fit with (green) and without (cyan/blue) shape-noise suppression. The solid lines are measurements from the numerical simulations. Dashed lines show the analytical predictions derived in Sect. 4.

We compare these uncertainties as measured from the simulations to the numerical predictions derived in Sect. 4. For the latter, we measure the parameters σR, α, σa, α, σS, α, σeobs, α, and σg, α directly from the simulations, as illustrated in Figs. 4 and 5. The amplitude and Nsim-dependence of the uncertainty measured from the data show excellent agreement with the analytical calculations for all three methods. This suggests that the assumptions we made to derive these expressions are valid for the system and regime studied here. For the linear fit predictions, we set σS, α = σe, α, assuming that the stochasticity Sα is entirely determined by the intrinsic ellipticity. For the linear fit with shape-noise suppression, we measure σeobs, α directly from the distribution of the sum of observed ellipticities of the orthogonal pairs.

Our method has a much higher precision on the multiplicative shear bias estimation. Compared to the linear fit, σm, α for our method is smaller by a factor of 35.9. This means that for this study our method requires 35.9²/n′ ∼ 1300/n′ times fewer simulated images to obtain the same precision, where n′ is the number of sheared versions used for each object. In Fig. 7 we show our method with n′ = 2, where we used shear values symmetrically distributed around zero, but similar results have been found for n′ = 4.

We demonstrate the high precision of our method in the bottom panel of Fig. 6, where we estimate the shear bias as in the top panel, but now for our method with only a fraction of 1/1300 of the objects, chosen at random. The results are consistent in both mean and error bars, demonstrating that our method reaches the precision of existing methods with three orders of magnitude fewer simulations. We note that some of the noise in the data points for our method comes from the more sparsely sampled galaxy properties in each bin. In the case of Euclid, with a global requirement of σm, α < 2 × 10⁻³, one needs at least 2 × 10⁷ images for the linear fit method, but only ∼10⁴ for our method according to this study.

The ratio of the rms between the two methods is approximately σe, α/(σR, ασg, α) > 1. The quantities σe, α and σg, α in the simulation need to be chosen to match expectations from cosmology and galaxy morphology. Given some basic survey characteristics such as redshift and wavelength coverage, and the survey selection function, these fundamental quantities are fixed. The dispersion σR, α, however, strongly depends on instrumental effects such as the PSF size and on the shape estimator. In this study we used a KSB method to measure the shapes on GREAT3-CSC-like images, and we expect σR, α to change when using other simulations and shape estimators. This provides a strong motivation to choose or develop a shape measurement method that minimizes this dispersion, and therefore minimizes the number of required simulations for calibration.

Applying the shape-noise suppression with orthogonal pairs improves the precision with respect to the simple linear fit by a factor of ∼2.8 for the measurements of both the multiplicative and additive bias. This improvement reduces by a factor of ∼8 the number of simulated images required for the same level of precision. We note, however, that each galaxy needs to be simulated twice for the shape-noise suppression. This is consistent with the factor of ∼9 found in Fenech Conti et al. (2017), where they used n = 4 for the shape-noise suppression.

Comparing our method to the linear fit with shape-noise suppression, we obtain an improvement of a factor of 12.8 for the multiplicative shear bias. This implies that for the same level of precision, we can reduce the number of simulated images required by a factor of 12.8²/n′ ∼ 164/n′.
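The conversion from a precision gain to a reduction in the number of images follows from the rms of a mean scaling as the inverse square root of the number of objects. A minimal sketch of this arithmetic is given below; the function name `image_reduction_factor` is hypothetical and only the quoted precision gains come from the text.

```python
def image_reduction_factor(precision_gain, n_sheared=1):
    """Factor by which the number of simulated images can be reduced.

    precision_gain: ratio of the bias rms of the reference method to that of
                    the new method, at equal number of objects.
    n_sheared:      number of sheared versions simulated per object (n').
    """
    if precision_gain <= 0:
        raise ValueError("precision_gain must be positive")
    if n_sheared < 1:
        raise ValueError("n_sheared must be at least 1")
    # The rms scales as 1/sqrt(N), so N scales as 1/rms^2; each object
    # costs n_sheared images with the response method.
    return precision_gain ** 2 / n_sheared


for label, gain in [("linear fit", 35.9),
                    ("linear fit with shape-noise suppression", 12.8)]:
    for n_prime in (1, 2, 4):
        print(label, "n' =", n_prime, "->", image_reduction_factor(gain, n_prime))
```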

When comparing the additive bias precision, our method shows a factor of 2.26 improvement with respect to the linear fit, and a factor of 0.56 with respect to the shape-noise suppression. Shape-noise suppression performs better because the additive bias is the average ellipticity over all simulated images, while for our method only 1/n′ of the images are used. In principle we could estimate cα with n′ = 1 (e.g. using only the original image). This would, however, unfairly ignore the n′ > 1 images that we have to simulate to measure mα. We conclude that a similar precision is obtained for both our method and shape-noise suppression when estimating additive shear bias.

7. Discussion and applications

The method presented here is a clear improvement on the precision of the shear bias estimation in simulations with respect to the standard linear fit of Eq. (1). It is also more precise compared to the linear fit with shape-noise suppression via pairs of orthogonally aligned galaxies. In the following we discuss potentially useful applications to improve shear bias analyses with simulated images.

7.1. Shear bias validation and calibration

One of the interests of measuring shear bias in simulations is to validate or calibrate the performance of a shear estimation algorithm. In the case of upcoming surveys such as Euclid, LSST, or WFIRST, the requirements on the knowledge of the additive and multiplicative bias imply the generation of a very large volume of simulations, which is computationally very challenging. Our method saves significant computational effort in reaching these requirements. In our case study, we require 2–3 orders of magnitude fewer images to reach the same precision as common approaches, although the exact factor depends on the shear estimator algorithm and on the image and survey specifications.

7.2. Selection biases and weights

Shear bias from selection effects has been found to be of the same order of magnitude as that induced by the shape measurement process (Fenech Conti et al. 2017; Mandelbaum et al. 2018). Such biases arise when the galaxy selection function depends on the shear. This is, for example, the case when detection or shape measurement fails for galaxies that are very elliptical or aligned with the PSF. Such selection effects also arise from necessary cuts imposed on galaxy properties such as the signal-to-noise ratio (SNR) or size, which can favour certain shear values. The resulting shape catalogue then samples the underlying shear field in a non-representative way, which biases the estimated shear if left uncorrected.

Our method does not require shape-noise suppression via tuples of galaxies, and is therefore particularly useful when selection effects and weights are to be simulated and studied. Weights can be applied to the simulated galaxies following an arbitrary distribution, to study the impact on shear bias. When weights are given to the galaxies, the shear bias estimators become

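m_\alpha = \frac{\sum_{i=1}^{N} w_i R_{\alpha\alpha,i}}{\sum_{i=1}^{N} w_i} - 1,    (31)

c_\alpha = \frac{\sum_{i=1}^{N} w_i a_{\alpha,i}}{\sum_{i=1}^{N} w_i},    (32)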

where wi represents the weight of the ith galaxy and N is the total number of galaxies.
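As an illustration of how weights enter, the following sketch computes weighted bias estimates from per-galaxy quantities. It assumes that the weighted estimators are weighted means of the individual responses and individual additive biases, with the multiplicative bias given by the mean response minus one; the function name `weighted_shear_bias` and the example numbers are hypothetical.

```python
import numpy as np


def weighted_shear_bias(R, a, w):
    """Weighted multiplicative and additive bias for one shear component.

    R: individual shear responses (diagonal element for this component)
    a: individual additive biases
    w: galaxy weights
    Assumes m = <R>_w - 1 and c = <a>_w (weighted means).
    """
    R = np.asarray(R, dtype=float)
    a = np.asarray(a, dtype=float)
    w = np.asarray(w, dtype=float)
    if not (R.shape == a.shape == w.shape):
        raise ValueError("R, a and w must have the same shape")
    w_sum = w.sum()
    if w_sum <= 0:
        raise ValueError("the sum of the weights must be positive")
    m = np.sum(w * R) / w_sum - 1.0
    c = np.sum(w * a) / w_sum
    return m, c


# Two galaxies, the second one up-weighted by a factor of three.
m, c = weighted_shear_bias(R=[1.0, 1.1], a=[0.0, 0.01], w=[1.0, 3.0])
print(m, c)
```

Any weighting scheme can be tested this way on the same set of measured responses, without simulating new images.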

Selection effects that are correlated with the shear can be studied as proposed by Sheldon & Huff (2017): we calculate the mean response from Eq. (4) by first averaging the ellipticities of both sheared samples before taking the difference and dividing by the small shear,

\langle R_{\alpha\beta} \rangle \approx \frac{\langle e^{\mathrm{obs},+}_{\alpha} \rangle - \langle e^{\mathrm{obs},-}_{\alpha} \rangle}{2\,\Delta g_{\beta}}.    (33)

Now, the two galaxy samples giving rise to the mean observed ellipticities ⟨e^{obs,+}_α⟩ and ⟨e^{obs,−}_α⟩, respectively, are not only different because of their shear. In addition, a given selection criterion (e.g. a minimum SNR) is applied to the two sheared samples. If the applied shears modify the selection, this results in different mean sample ellipticities, and the selection-induced shear bias translates into the shear response (given by Eq. (33)).
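The following toy example sketches how a shear-dependent selection propagates into the mean response. It assumes a symmetric shear step ±Δg, a perfect shape measurement, Gaussian intrinsic ellipticities, and a hypothetical cut that rejects galaxies with observed ellipticity above `e_max`; none of these choices are taken from the paper.

```python
import numpy as np


def mean_selection_response(e_plus, e_minus, sel_plus, sel_minus, dg):
    """Mean response from two sheared samples, each with its own selection.

    The ellipticities are averaged within each selected sample first, then
    differenced and divided by the total shear step 2 * dg.
    """
    return (e_plus[sel_plus].mean() - e_minus[sel_minus].mean()) / (2.0 * dg)


rng = np.random.default_rng(42)
n_gal, dg = 200_000, 0.02
e_int = rng.normal(0.0, 0.3, n_gal)   # toy intrinsic ellipticities
e_plus = e_int + dg                    # unbiased toy "measurement" at +dg
e_minus = e_int - dg                   # unbiased toy "measurement" at -dg

# No selection: the response of this unbiased toy measurement is 1.
keep_all = np.ones(n_gal, dtype=bool)
r_no_cut = mean_selection_response(e_plus, e_minus, keep_all, keep_all, dg)

# Selection applied separately to each sheared sample (hypothetical cut).
e_max = 0.5
r_cut = mean_selection_response(e_plus, e_minus,
                                np.abs(e_plus) < e_max,
                                np.abs(e_minus) < e_max, dg)
print("response without cut:", r_no_cut)
print("response with cut:   ", r_cut)
```

Because the cut acts on the sheared ellipticities, the two selected samples differ, and the mean response departs from the value obtained without selection.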

This shear bias estimator, however, does not account for selection effects that affect the shear response estimation of the originally selected galaxies. It can happen that the shear response estimation fails because, although the original galaxy is well detected and measured, this is not the case for the sheared version of the image. These selection effects are undesired, since they create additional, spurious selection biases. Such differential detection or shape measurement failures are rare, however, since the shear is very small and thus the images are very similar. Such occurrences can be further reduced: if an image sheared by a value g cannot be measured, for example because its increased observed ellipticity pushed it under the SNR threshold, the opposite shear −g (e.g. making the galaxy rounder) should not affect the measurement success. In the case of n′ = 2 images per original galaxy, we are free to choose the sign of the shear as long as the average shear is zero, largely avoiding such selection-induced measurement failures. If the problem only comes from the detection process, another solution is to apply the detection process to the original images and to assume the same detection for the sheared versions of the images.

7.3. Shot noise

In the methodology described, the shear response is estimated from sheared versions of images with the noise realization kept fixed. This can be generalized to other random or stochastic effects such as cosmic rays or Gaussian noise. However, shot noise is a random process that depends on the flux of the image. Because of this, sheared versions of the same galaxy cannot have exactly the same shot noise realization. Our case study is not affected by this, since the noise is purely Gaussian. We expect other cases to be similarly insensitive to shot noise, but this does not hold in general.

Exploring alternatives to treat shot noise with our method is beyond the scope of this paper, but we propose several options. First, in some cases approximating shot noise with a Gaussian noise can be enough for the required precision of the analysis, which again can be treated as described in this paper. Second, we can keep the random shot noise realization of the original image and rescale it with the changes in the flux produced in the sheared versions. A study should be done to test possible systematics coming from this approach. Finally, we can change the shot noise realization for each of the sheared image versions, but keep the other random processes fixed. This will degrade the precision of the method depending on the contribution of shot noise with respect to the other processes, but it should converge to the same results. In the worst case scenario where the shear response depends completely on shot noise, the precision of the method would be the same as for the method with orthogonal pair shape-noise suppression.

7.4. Individual shear responses

Studying the shear response as a function of galaxy properties for individual galaxies without the need to bin or average can have advantages. For calibration, the shear bias as a function of galaxy properties is typically modelled as a smooth function, either parametric, for example by fitting an analytical, multi-variate function, or non-parametric, such as by interpolating the (smoothed) measured bias values.

For linear fit methods to estimate shear bias we need to compute such a function from data binned into galaxy properties. Then the average shear biases are measured for each bin. However, these average values depend on the galaxy population inside the bin, whose shear responses might not only depend on the binned properties, but also on the properties that have not been used in the binning. As a consequence, our measured shear bias dependencies are sensitive to the property distribution of the galaxy population used. Individual shear responses and biases of simulated galaxies can further serve to learn shear calibration as a complex non-linear function of galaxy and image properties (e.g. using machine learning techniques), where no binning is needed and we can use a larger set of properties so that the function can be less dependent on the population used.
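A minimal sketch of fitting a bias model directly to individual responses, without binning, is shown below. A polynomial least-squares fit stands in here for the machine learning regressor mentioned above; the property name `disk_flux`, the linear trend, and the scatter are all invented for illustration.

```python
import numpy as np


def fit_bias_model(prop, R, degree=1):
    """Fit m(prop) = R - 1 as a polynomial of a galaxy property, unbinned."""
    coeffs = np.polyfit(prop, np.asarray(R) - 1.0, degree)
    return np.poly1d(coeffs)


rng = np.random.default_rng(3)
n_gal = 5000
disk_flux = rng.uniform(0.0, 1.0, n_gal)                    # hypothetical property
R = 1.0 + 0.1 * disk_flux + rng.normal(0.0, 0.05, n_gal)    # toy responses

m_model = fit_bias_model(disk_flux, R)
print("predicted m at disk_flux = 0.5:", m_model(0.5))
```

The same per-galaxy table of responses can be refitted with more properties or a more flexible model without rerunning the simulations.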

7.5. Variable shear and response on shear statistics

Switching from constant to variable shear is possible by imposing a shear field on our simulation. This is potentially interesting to study the scale dependences of shear bias that could come from spatially varying effects such as the PSF variation. Similarly to the constant shear case, the shear bias is derived by computing the shear response to a small shear power spectrum perturbation. For example, with the shear drawn from a Gaussian random field with a certain power spectrum C, the small shear values applied to each galaxy with an intrinsic shear g can be arbitrary, for example they can be proportional to g (see discussion in Sect. 7.6). Going a step further, the same methodology can be applied to derive the influence of shear bias on any shear statistics: the shear two-point correlation function, the shear power spectrum, peak counts, mass maps, higher-order statistics, and so on. As an example we hereafter illustrate this with the shear two-point correlation function ξ±.

First, as described above, we apply a shear field to the simulated galaxies, where each Fourier-space shear coefficient is drawn from a normal distribution with zero mean and variance C. Next, we perturb the shear field by drawing new coefficients from a power spectrum changed by a small amount, δC. From the original and perturbed shear fields, we compute the statistic of our choice, for example the correlation functions ξ± of each field. The difference between the two divided by the perturbation is then the response of the correlation function due to the multiplicative shear bias, which gives us information about the spatially varying shear bias.
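The procedure can be sketched with a one-dimensional toy in Fourier space. The example below replaces the full chain of image simulation and shape measurement by a constant multiplicative bias applied to the shear coefficients, uses the total power as the statistic instead of ξ±, and reuses the same random numbers for the original and perturbed fields; the power spectrum shape and all parameter values are assumptions.

```python
import numpy as np


def power_spectrum_response(m_true=0.05, n_modes=4096, rel_step=0.01, seed=1):
    """Response of a toy shear statistic to a small power spectrum change."""
    rng = np.random.default_rng(seed)
    k = np.arange(1, n_modes + 1)
    C = 1e-4 / k               # toy power spectrum (assumption)
    dC = rel_step * C          # small perturbation delta C

    # Unit-variance complex Gaussian numbers, shared by both fields so that
    # only the power spectrum differs between them.
    z = (rng.normal(size=n_modes) + 1j * rng.normal(size=n_modes)) / np.sqrt(2.0)
    g_k = np.sqrt(C) * z
    g_k_pert = np.sqrt(C + dC) * z

    def measured_power(coeffs):
        # Stand-in for: simulate images, measure shapes, compute the statistic.
        return np.sum(np.abs((1.0 + m_true) * coeffs) ** 2)

    d_measured = measured_power(g_k_pert) - measured_power(g_k)
    d_input = np.sum(np.abs(g_k_pert) ** 2) - np.sum(np.abs(g_k) ** 2)
    return d_measured / d_input


print(power_spectrum_response(m_true=0.05))
```

For a constant multiplicative bias the response of a two-point statistic in this toy is (1 + m)², which is what the ratio returns.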

7.6. Non-isolated images

This analysis has been done using isolated galaxy images. Creating more realistic simulations with blended galaxy images leads to the problem that the shapes of many galaxies are measured in the presence of one or more nearby galaxies at different redshifts and therefore with different shears, if the simulation includes a realistic cosmic shear field as described in the previous section. The same issue arises if shapes of blended galaxies are estimated jointly. The presence of nearby isophotes of other objects is known to affect the shear bias (Hoekstra et al. 2015, 2017).

A common procedure to study these effects is to simulate many combinations of blended objects and close neighbours, and to measure the impact on the shear bias statistics over different populations. We claim that we can more efficiently account for these effects, since our shape-noise insensitive method does not require us to sample the large space of the distribution of N ellipticities and shears p(e1, …, eN, g1, …, gN). One of the questions to address in this situation is how to produce the different sheared versions of the same images. Here we discuss two possibilities:

  • We change the shear of only one of the galaxies (the target) from the N-tuple of blended images. The inconvenience is that we need to generate N times more images compared to isolated galaxies.

  • Alternatively, we can shear every member i of the N-tuple. This shear could be a small additive shear, Δg = const, as applied to isolated images in this paper. Or it could be a function of g, such as a multiplicative factor, Δgi = Cgi, with a constant C ≪ 1. In this case we would preserve the proportions between shears for galaxies at different redshifts. This function of g can be chosen taking into account the statistics or cosmological analysis that we want to do, as discussed in Sect. 7.5.
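The two ways of perturbing every member of a blended N-tuple can be written down in a few lines. In the sketch below shears are represented as complex numbers g1 + i g2, which is a common convention but an assumption here, and the function names and example values are hypothetical.

```python
import numpy as np


def perturb_additive(g, dg):
    """Add the same small shear dg to every member of the blend."""
    return np.asarray(g, dtype=complex) + dg


def perturb_multiplicative(g, C):
    """Apply delta g_i = C * g_i to every member, with constant C << 1."""
    return np.asarray(g, dtype=complex) * (1.0 + C)


# Toy blend of two galaxies at different redshifts, hence different shears.
g_blend = np.array([0.01 + 0.02j, 0.03 - 0.01j])
g_add = perturb_additive(g_blend, 0.02)
g_mul = perturb_multiplicative(g_blend, 0.05)

print("additive:      ", g_add)
print("multiplicative:", g_mul)
print("shear ratio kept:", np.isclose(g_mul[0] / g_mul[1], g_blend[0] / g_blend[1]))
```

Only the multiplicative option keeps the ratio between the shears of the blended galaxies unchanged.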

We leave a comparison of these two approaches, and the estimation of shear bias for blended objects in general, to future work.

7.7. Future simulation challenges

Adopting our method of shear bias estimation in future versions of simulation challenges such as the GRavitational lEnsing Accuracy Testing (GREAT) series (Bridle et al. 2009; Kitching et al. 2011; Mandelbaum et al. 2014) can significantly decrease the number of required image simulations. For GREAT3 the total data volume that had to be downloaded by the participants was 6.5 terabytes (10 000 simulated galaxies × 200 fields × 20 branches). Reducing this volume could make the challenge more accessible and faster to process.

To use our method, for each galaxy two additional sheared versions of the same galaxy would need to be simulated with the same noise realization. The challenge organisers would estimate the shear response for each original galaxy via Eq. (2), and apply some metric on the distribution, such as the mean, to evaluate the submissions.

To guarantee the blind aspect of the challenge, all codes have to be run on the organisers’ server without direct access to the simulation by the participants. (For testing, smaller training sets of simulations could be provided to the teams for download.) Alternatively, the shear values applied to each galaxy have to be random and kept hidden from the users. We note that it would be trivial for participants to identify the sheared versions of each original galaxy since the noise is the same for the sheared versions, even if the image order was randomized. To take the example of GREAT3, a similar challenge using our bias estimator could reduce the 10 000 × 200 simulated galaxies for one branch to a few thousand. If a variable shear field is to be used in the challenge, with a metric operating on the shear correlation function or power spectrum, a similar method as described in Sect. 7.5 can be employed to measure the response to a small and variable shear.

8. Summary

In this paper we present a new method to estimate shear bias from image simulations. Our estimator of the multiplicative shear bias is not affected by shape noise and reduces the noise contribution from the measured shape, removing the dominant uncertainty in bias estimation. Previous methods constrain the multiplicative and additive bias from a linear fit of the observed average ellipticity as a function of shear. The uncertainty of this parameter estimation is dominated by the intrinsic ellipticity distribution. Shape-noise suppression techniques using matched sets of galaxies with net zero intrinsic ellipticity improve the precision of the measurements, but are affected by selection effects, weights, and ellipticity bias that can break the shape-noise suppression.

Our method consists in measuring the shear response and additive bias of individual galaxy images. To that end, we simulate different sheared versions of the same galaxy, and measure the shear response of the image from the numerical derivative of the measured ellipticity with respect to the shear. We also measure the additive bias for the individual images. For each galaxy, all sheared versions share the same noise realization, allowing us to determine the individual responses at a very high precision. Then, the multiplicative and additive bias of a sample of galaxy images is obtained from the average shear response and additive bias, respectively. This method improves the precision of the estimation of shear bias significantly because it is not affected by shape noise or by the stochastic uncertainty of the measured ellipticity.

Using numerical simulations as well as analytical predictions, we quantified the uncertainty of the shear bias estimation for our method as well as for linear fits. For the multiplicative shear bias, our method provides a significant decrease in the shear bias error of a factor of ∼36 compared to the linear fit, and a factor of ∼12 if the latter is used in combination with shape-noise suppression. The additive bias uncertainty improves by a factor of about 2.3 over the linear fit, and under-performs only compared to shape-noise suppression, by a factor of ∼0.5.

This implies that we can reduce the number of simulated images by a factor of ∼1300 and ∼150, respectively, to measure the shear multiplicative bias with the same precision. Our method has the further advantage that it does not need to impose shape-noise suppression, and hence it can easily be applied for analyses where selection biases or weights play an important role.

Our method has many applications, as discussed in the previous section. In particular for shear bias calibration, we require far fewer simulated images to reach a required uncertainty, allowing us to study more extensively the bias dependence as a function of galaxy properties, PSF characteristics, or noise. It also relieves us of the potentially very severe restrictions on computing time for both simulation and shape measurement, allowing us to simulate galaxies with higher complexity and to use computationally expensive shape measurement techniques. Further, it permits us to study the shear bias as a function of galaxy properties that usually have to be averaged over, for example galaxy orientation. We have also outlined ways forward for more complex simulation scenarios, such as variable shear, blended galaxy images, and selection biases, which in principle pose no obstacles for our method.


1

The following commonly used equation ignores the 2 × 2-tensor nature of m. We will use the full expression for the shear response defined below.

Acknowledgments

The authors would like to thank Rachel Mandelbaum, Richard Massey, Arun Kannawadi, Lance Miller, and Henk Hoekstra for very helpful comments and suggestions. We also thank the anonymous referee for the useful feedback that helped improve the paper. AP, FS, and JB acknowledge support from a European Research Council Starting Grant (LENA-678282). AP and MK are supported by the French national programme for cosmology and galaxies (PNCG).

References

Appendix A: Robustness of Method 1

Fig. A.1.

Illustration of the shear response estimations with different shear values used for ten image examples. Each colour represents a different image, and the points show the observed ellipticities obtained with different shear values. The solid lines show for each case.

Fig. A.2.

Distribution of the differences obtained for and for our simulated images. Similar results are found for R22.

In this section we investigate whether a least-squares fit to more than two shear values is more accurate than the use of Eq. (4). To measure R for each individual galaxy, we have generated copies of the same images with the different shears specified in Sect. 3.1. In reality, galaxies can have other values of shear, where both shear components can be different from 0 at the same time. Fixing the other component to 0 can be a simplification of the estimation of R. For this reason, here we measure the impact of different estimations of R using different shear values. In particular, we compare the following estimators:

  • : we obtain R11 from the fit using the shear values g = ( ± 0.02, 0).

  • : we obtain R11 from the fit using the shear values g = ( ± 0.02, 0) and g = [0, 0].

  • : we obtain R11 from the fit using the shear values g = ( ± 0.02, 0) and a random value of g, with both components random.

  • : we obtain R11 from the fit using the shear values g = ( ± 0.02, 0), g = [0, 0] and the random g.

  • : we obtain R11 from the fit using all the previous values and also g = (0, ±0.02).
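The comparison between a two-point difference and a least-squares fit over more shear values can be sketched as follows. The toy ellipticity model, including its deliberately exaggerated quadratic term, and all function names are assumptions made for illustration; they are not the measurements of this appendix.

```python
import numpy as np


def response_two_point(e_plus, e_minus, dg):
    """Finite-difference response from two versions sheared by +dg and -dg."""
    return (e_plus - e_minus) / (2.0 * dg)


def response_least_squares(g_values, e_values):
    """Response as the slope of a straight-line least-squares fit of e versus g."""
    slope, _intercept = np.polyfit(g_values, e_values, 1)
    return slope


def toy_ellipticity(g1, R11=0.9, a1=0.1, quad=5.0):
    """Toy observed ellipticity with a small non-linear term (assumption)."""
    return a1 + R11 * g1 + quad * g1 ** 2


dg = 0.02
r_two = response_two_point(toy_ellipticity(dg), toy_ellipticity(-dg), dg)

# Symmetric shear values: -dg, 0, +dg.
g_sym = np.array([-dg, 0.0, dg])
r_sym = response_least_squares(g_sym, toy_ellipticity(g_sym))

# Adding a single off-centre shear value breaks the symmetry.
g_asym = np.array([-dg, dg, 0.015])
r_asym = response_least_squares(g_asym, toy_ellipticity(g_asym))

print(r_two, r_sym, r_asym)
```

With symmetric shear values the even-order terms cancel and both estimators return the linear response, while a single asymmetric value shifts the fitted slope in this toy.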

If the non-diagonal terms of R are non-zero, and should behave differently from the rest. In Fig. A.1 we show ten cases of R estimations, where each case is represented with a different colour. The solid lines correspond to the fits of (so they connect the points at g1  =   − 0.02 with those at g1  =  0.02). We can see that all the points, even the random ones, tend to be well adjusted to the fitting line, although not always. These cases are an indication of non-diagonal terms of the shear response R, causing changes in when changing g2. In this section we focus on the first component of shear and response, but the same holds for the second.

Fig. A.3.

Top panel: differences between and as a function of the random g1 applied. Bottom panel: resulting average responses from the different estimations as a function of the disk flux for galaxies with a bulge and a disk. Similar results are found for R22.

We have found that the differences between and are negligible. This means that the method is very precise, and the relation is very well described with a linear relation. There is no need to have more than two shear values to estimate R precisely. When the second component is non-zero, sometimes it can affect the ellipticity measurement, in which case is different from . In Fig. A.2 we see the differences between and . The differences between and are very similar. We see that in most of the cases the differences are negligible, uncorrelated with , and they average out because of symmetry. We have found that, when the second component of the shear affects eout, it does so in a symmetric way, so that positive g2 gives the opposite effect to negative g2. For a random distribution of g2, the differences cancel out. In the top panel of Fig. A.3 we show that the differences between the different estimations are consistent with 0 and independent of the random g applied. In the bottom panel we show that the mean response as a function of the disk flux of the galaxies is consistent for all the estimators. This is actually the case as a function of all the properties studied, which means that the method is very precise and that the non-diagonal terms of the shear response do not affect the shear response estimation. As a consequence, our method does not depend on the different shear values used for the fit as long as the shear values used are symmetric or homogeneous.

All Figures

Fig. 1.

Scheme of shear response estimation for a single galaxy for R11.

Fig. 2.

Scheme of the estimation of biases m1 and c1 from the linear fit of the distribution of the observed ellipticity as a function of g1. In the top panel, the white dashed line shows the linear fit of the distribution, represented by the colour map. In the bottom panel, the blue points and error bars show the mean of the galaxies with the same shear. The black dashed line shows the weighted linear fit of the blue points.

Fig. 3.

Stochasticity of the measurement of R due to noise. The upper and lower panel show the distribution of R11 (blue histogram) and R22 (in green) for two different galaxies, respectively, shown as inlaid postage stamps, with different properties.

Fig. 4.

Distribution of R11 (top) and a1 (bottom) for the 2 million simulated galaxies. The second component of the biases shows similar distributions.

Fig. 5.

Distributions of and g1 used for the two million simulated galaxies. The second component shows similar distributions.

Fig. 6.

Multiplicative shear bias as a function of the disk flux Fd, measured with our method (black lines) and (in orange) from the linear fit to Eq. (1). Solid (dashed) lines correspond to m1 (m2). The top panel shows the results using the same number of objects for both methods. In the bottom panel, only 1/1300 of the objects have been used for our method.

Fig. 7.

rms of the multiplicative (top panel) and additive (bottom panel) shear bias. We compare our method (red/orange lines) to the linear fit with (green) and without (cyan/blue) shape-noise suppression. The solid lines are measurements from the numerical simulations. Dashed lines show the analytical predictions derived in Sect. 4.
