Issue 
A&A
Volume 528, April 2011



Article Number  A51  
Number of page(s)  9  
Section  Cosmology (including clusters of galaxies)  
DOI  https://doi.org/10.1051/0004-6361/201015850  
Published online  28 February 2011 
A bias in cosmic shear from galaxy selection: results from ray-tracing simulations
^{1}
Argelander-Institut für Astronomie, Universität Bonn,
Auf dem Hügel 71, 53121
Bonn, Germany
email: hartlap@astro.uni-bonn.de
^{2}
Max Planck Institute for Astrophysics,
Karl-Schwarzschild-Str. 1,
85741
Garching,
Germany
^{3}
Leiden Observatory, Leiden University,
Niels Bohrweg 2, 2333 CA
Leiden, The
Netherlands
Received: 30 September 2010
Accepted: 14 January 2011
Aims. We identify and study a previously unknown systematic effect on cosmic shear measurements, caused by the selection of galaxies used for shape measurement, in particular the rejection of close (blended) galaxy pairs.
Methods. We use ray-tracing simulations based on the Millennium Simulation and a semi-analytical model of galaxy formation to create realistic galaxy catalogues. From these, we quantify the bias in the shear correlation functions by comparing measurements made from galaxy catalogues with and without removal of close pairs. A likelihood analysis is used to quantify the resulting shift in estimates of cosmological parameters.
Results. The filtering of objects with close neighbours: (a) changes the redshift distribution of the galaxies used for correlation function measurements; and (b) correlates the number density of sources in the background with the density field in the foreground. This leads to a scale-dependent bias of the correlation function of several percent, translating into biases of cosmological parameters of similar amplitude. This makes this new systematic effect potentially harmful for upcoming and planned cosmic shear surveys. As a remedy, we propose and test a weighting scheme that can significantly reduce the bias.
Key words: large-scale structure of Universe / cosmological parameters / gravitational lensing: weak
© ESO, 2011
1. Introduction
In preparation for upcoming and planned large cosmic shear surveys, such as Pan-STARRS (Kaiser & Pan-STARRS Collaboration 2005), KIDS^{1} or Euclid (Refregier et al. 2010), it is vital to find and quantify possible sources of systematic effects that can hamper the full exploitation of the information contained in these large data sets. A number of such effects have already been identified. The most fundamental problem on the observational side is to obtain unbiased estimates of the shapes of galaxies. The difficulty of this has been demonstrated, for example, by the STEP programme (Heymans et al. 2006; Massey et al. 2007) and the GREAT08 challenge (Bridle et al. 2010), where several shape measurement methods have been tested on mock data. Furthermore, reliable photometric redshifts are crucial for obtaining an accurate redshift distribution of the galaxy sample under consideration, which is needed for accurate theoretical predictions, and for constructing redshift bins for shear tomography (Hearin et al. 2010; Bernstein & Huterer 2010; Ma et al. 2006). Intrinsic alignments of physically close galaxies and shape-shear alignments probably constitute the most severe physical contaminants of the cosmic shear signal. The first can be reduced by removing physically close pairs (King & Schneider 2002, 2003; Heymans & Heavens 2003; Takada & White 2004), whereas the influence of the latter can be removed either by the so-called nulling technique (Joachimi & Schneider 2008, 2009) or by self-calibration (Zhang 2010; Joachimi & Bridle 2010). Another physical contamination is caused by the magnification effect. Density fluctuations in the foreground can, depending on the slope of the galaxy number count, either enhance or deplete the number of background galaxies, thus correlating the density field in the foreground with the galaxy distribution that is used to estimate its shear field (Schmidt et al. 2009).
In order to achieve percent-level accuracy, the difference between shear and reduced shear also needs to be taken into account. Theoretical predictions for the shear correlation functions can be obtained from the matter power spectrum with relative ease (e.g. Bartelmann & Schneider 2001), but for the computation of the actually observable reduced shear correlation functions it is necessary to include higher-order corrections to the shear power spectrum (White 2005; Krause & Hirata 2010).
Finally, the process of parameter estimation requires great care as well. For example, the likelihood of the shear correlation functions has been shown to be significantly non-Gaussian (Hartlap et al. 2009; Schneider & Hartlap 2009). This may also apply to other two-point statistics derived from the correlation functions. Furthermore, even if a Gaussian likelihood is assumed, the cosmology dependence of the covariance matrix of the statistics under consideration should be taken into account, as was shown in Eifler et al. (2009). Neglecting these issues could introduce non-negligible biases to estimates of cosmological parameters.
In this paper, we add to this list a systematic effect that leads to a biased estimate of the shear correlation functions. This bias is due to the fact that the ellipticity of a galaxy cannot be estimated reliably when its light distribution overlaps with that of a close neighbour. Therefore, it is common to discard pairs of galaxies that appear too close together on the sky. We argue that – while allowing for clean estimates of galaxy shapes – this practice has two adverse side effects: it changes the redshift distribution of source galaxies, and it correlates the lensing mass distribution in the foreground with the source galaxy distribution in the background. Because of the latter issue, the product of the complex ellipticities of a randomly selected pair of galaxies no longer yields an unbiased estimate of the shear correlation function.
The article is organized as follows: after briefly reviewing the cosmic shear two-point statistics relevant for this paper in Sect. 2, we describe in Sect. 3 the ray-tracing simulations and semi-analytic galaxy formation models we use to create our mock galaxy catalogues. In Sect. 4, we quantify the bias in the shear correlation function using our simulation results for various choices of galaxy selection criteria. We then propose a weighting scheme that can help to reduce the bias (Sect. 5) and discuss the impact on cosmological parameter estimation (Sect. 6). We conclude the paper in Sect. 7.
2. The shear correlation functions
Several statistics have been developed to capture the two-point information that is contained in the ellipticities of distant galaxies, such as the shear correlation functions (e.g. Kaiser 1992; Crittenden et al. 2002), the shear dispersion in circular apertures (e.g. Kaiser 1992) or the aperture mass dispersion (e.g. Schneider et al. 1998, 2002a). Recently, Schneider et al. (2010) have proposed the so-called COSEBIs, which allow for a clean E/B-mode decomposition given the shear correlation functions on a finite interval. These statistics are all related to the power spectrum of the weak lensing convergence (see, e.g., Crittenden et al. 2002; Schneider et al. 2002b). Regarding actual measurements, the shear correlation functions ξ_{ ± } are the most convenient of these statistics, since they can be estimated with relative ease from real data sets, even in the presence of gaps and masked regions. Any other two-point statistic of interest is therefore usually computed from an estimate of the shear correlation functions.
A practical estimator for the shear correlation functions is given by (e.g. Schneider et al. 2002a)
$$\hat{\xi}_{\pm}(\theta) = \frac{\sum_{ij} \rho_{i}\,\rho_{j}\,\left(\epsilon_{\mathrm{t},i;j}\,\epsilon_{\mathrm{t},j;i} \pm \epsilon_{\times,i;j}\,\epsilon_{\times,j;i}\right)\Delta_{ij}(\theta)}{\sum_{ij} \rho_{i}\,\rho_{j}\,\Delta_{ij}(\theta)} \tag{1}$$
where the sum runs over all pairs of galaxies, located at the angular positions θ_{i}. The complex ellipticity of the ith galaxy is denoted by ϵ_{i}, and its tangential and cross components with respect to the line joining it to the jth galaxy are given by ϵ_{t,i;j} and ϵ_{ × ,i;j}, respectively. The symbol Δ_{ij}(θ) is equal to one if the angular separation |θ_{i} − θ_{j}| of the ith and jth galaxies lies in the bin centred on θ, and vanishes otherwise. Finally, the ρ_{i} are weights assigned to the galaxies. For the purpose of this work, it is convenient to write them as ρ_{i} = m_{i} s_{i}, where s_{i} is a statistical weight that, for example, reflects the quality of the shape estimate. The “selection weight” m_{i} is zero if the galaxy is too close to its nearest neighbour to allow for a reliable measurement of its shape, and unity otherwise.
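As an illustration of how an estimator of the form of Eq. (1) can be evaluated, the following is a minimal brute-force sketch in Python. It is not the code used for this work: the function name `shear_correlation`, the flat-sky approximation and the sign convention ϵ_t + iϵ_× = −ϵ e^{−2iφ} (with φ the polar angle of the line connecting the pair) are assumptions made for the example.

```python
import numpy as np

def shear_correlation(pos, eps, weights, bin_edges):
    """Brute-force estimate of xi_+ and xi_- in angular bins (flat sky).

    pos       : (N, 2) array of positions (any angular unit)
    eps       : (N,) complex ellipticities, eps = e1 + 1j * e2
    weights   : (N,) weights rho_i = m_i * s_i (m_i = 0 for rejected galaxies)
    bin_edges : (B + 1,) increasing edges of the separation bins
    """
    n_bins = len(bin_edges) - 1
    num_p = np.zeros(n_bins)
    num_m = np.zeros(n_bins)
    den = np.zeros(n_bins)
    for i in range(len(pos) - 1):
        d = pos[i + 1:] - pos[i]            # vectors to all galaxies j > i
        sep = np.hypot(d[:, 0], d[:, 1])
        phi = np.arctan2(d[:, 1], d[:, 0])  # polar angle of the connecting line
        rot = np.exp(-2j * phi)
        # assumed convention: eps_t + 1j * eps_x = -eps * exp(-2j * phi)
        ei = -eps[i] * rot
        ej = -eps[i + 1:] * rot
        w = weights[i] * weights[i + 1:]
        tt = ei.real * ej.real              # tangential-tangential product
        xx = ei.imag * ej.imag              # cross-cross product
        k = np.digitize(sep, bin_edges) - 1
        ok = (k >= 0) & (k < n_bins)
        np.add.at(num_p, k[ok], (w * (tt + xx))[ok])
        np.add.at(num_m, k[ok], (w * (tt - xx))[ok])
        np.add.at(den, k[ok], w[ok])
    with np.errstate(invalid="ignore", divide="ignore"):
        return num_p / den, num_m / den
```

Setting the weight of a galaxy to zero removes all of its pairs from both the numerator and the denominator, which is how the selection weight m_{i} enters. The loop scales as N^{2} and is only meant for small test catalogues.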
3. Ray-tracing simulations
Our ray-tracing simulations are based on the dark matter distribution in the Millennium Simulation (MS, Springel et al. 2005). The cosmological parameters used for the MS (Ω_{m} = 0.25, Ω_{DE} = 0.75, Ω_{b} = 0.045, σ_{8} = 0.9, h = 0.73, w_{0} = −1.0, n_{s} = 1.0) also define the fiducial cosmological model used throughout the paper.
We have used the ray-tracing code described in Hilbert et al. (2009) to obtain 32 realisations of a 4 × 4 deg^{2} field, thus covering 512 deg^{2} in total. The matter distribution along the backwards light cone of the observer is obtained by the periodic continuation of simulation snapshots of increasing redshift. It is then divided into slices of a thickness of ≈ 100 h^{−1} Mpc, which are subsequently projected onto lens planes. The periodic repetition of structures along the line of sight (l.o.s.) is prevented by choosing a l.o.s. direction that is tilted with respect to the boundaries of the simulation box. The advantage of this technique in comparison to the random transformation approach is that the matter distribution is continuous across slice boundaries and that large-scale correlations extending beyond the redshift slices are maintained. The code follows a set of light rays, which form a grid on the first lens plane (the image plane), through the array of lens planes. At the same time, the Jacobian matrices of the lens mapping from the observer to the lens planes are computed using a recursion formula.
To create realistic, lensed mock galaxy catalogues, we combine the ray-tracing with the semi-analytic model of galaxy formation by De Lucia & Blaizot (2007), making extensive use of the public Millennium Simulation database (Lemson & Springel 2006; Lemson & the Virgo Consortium 2006). We use the method outlined in Hilbert et al. (2009) to obtain the lensed positions and observed magnitudes (taking the magnification due to lensing into account) for all galaxies in the semi-analytic model with M_{stellar} ≥ 10^{9} h^{−1} M_{⊙}. In addition, the galaxy formation model yields the masses of the disc and spheroidal (henceforth bulge) components and the disc radius r_{disc} (which can be zero). As described in more detail in Hilbert et al. (2008), we complement this with an estimate of the comoving radius r_{bulge} of the spheroidal component of the galaxy, given by Eq. (2), which combines the size distribution of galaxies measured by Shen et al. (2003) and the redshift evolution of galaxy sizes found by Trujillo et al. (2006). Each galaxy is then assigned an effective radius r_{e} = max(r_{disc},r_{bulge}). The angular radius of the galaxy is given by θ_{e} = √μ r_{e} / f_{K}(w), where f_{K}(w) is the comoving angular diameter distance to the galaxy, and μ is the lensing magnification at the position of the galaxy. The resulting distributions of angular and comoving galaxy radii for a simulated galaxy survey with a magnitude cut of r_{SDSS} = 25 are shown in Fig. 1.
Fig. 1 Distribution of galaxy radii in the simulated catalogue with r_{SDSS} = 25. Upper panel: angular radius (no seeing), lower panel: comoving physical radius. 

We construct our catalogues for cosmic shear measurements by selecting galaxies brighter than three different cuts in the SDSS r-band (r_{SDSS} = 24, 25, 26). Unless otherwise stated, we assume that these cuts coincide with the limiting magnitude of the survey. For comparison, however, we will also consider the case where the limiting magnitude of the survey is fainter than the cut. Since we assume that the check for overlapping light distributions is done before the galaxies are selected for shape measurement, galaxies that are brighter than r_{SDSS} and have faint close neighbours with magnitudes between r_{SDSS} and the limiting magnitude are removed from the lensing catalogue as well.
Furthermore, we use two criteria to identify pairs of objects whose projected angular separation θ is too small to obtain reliable shape measurements of the individual galaxies:

According to the first criterion, two galaxies at θ_{1} and θ_{2} with angular separation θ = |θ_{1} − θ_{2}| are both removed from the catalogue if θ is smaller than α times the sum of their effective angular radii. Here, the effective angular radii are given by Eq. (3), where θ_{see} is the size of the seeing disk. The parameter α can be chosen arbitrarily to tune the strictness of the selection criterion and to compensate for inaccuracies of our modelling of the galaxy radii. Since this criterion depends on the half-light radius of the galaxies, we henceforth denote it with “HLR”.

The second criterion (called “FIX” criterion) is similar to what is used for, e.g., the CFHTLS (see also Van Waerbeke et al. 2000; Maoli et al. 2001). It uses a fixed angular separation threshold: if a pair of galaxies fulfils θ < θ_{fix}, one of the two galaxies is selected at random and removed from the catalogue. The rationale for doing this is the following: even though the light distribution of the remaining galaxy is still affected by the light of the removed neighbour, the resulting error of the shape estimate should be uncorrelated with any other galaxy that remains in the catalogue and should just add to the noise. In addition to this, we remove all galaxies that are members of obviously severely blended pairs by applying the HLR criterion with α = 1. We find, however, that the effect of this second step on the properties of the resulting galaxy catalogue is generally subdominant.
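The two criteria can be sketched as boolean masks over a small catalogue. This is an illustrative Python sketch rather than the pipeline used here: the function names are hypothetical, the HLR threshold is taken to be α times the sum of the two effective angular radii (which the caller supplies, already including any seeing contribution), and for the FIX criterion a pair is skipped if one of its members has already been dropped, which is a choice made only for this example.

```python
import numpy as np

def pair_separations(pos):
    """Index arrays (i, j) with i < j and the separations of all pairs."""
    i, j = np.triu_indices(len(pos), k=1)
    d = pos[i] - pos[j]
    return i, j, np.hypot(d[:, 0], d[:, 1])

def select_hlr(pos, theta_eff, alpha=1.0):
    """HLR-type mask: drop BOTH members of every pair closer than
    alpha * (theta_eff_i + theta_eff_j)."""
    i, j, sep = pair_separations(pos)
    close = sep < alpha * (theta_eff[i] + theta_eff[j])
    keep = np.ones(len(pos), dtype=bool)
    keep[i[close]] = False
    keep[j[close]] = False
    return keep

def select_fix(pos, theta_fix, rng):
    """FIX-type mask: for every pair closer than theta_fix, drop one
    member chosen at random (assumption: pairs with an already dropped
    member are skipped)."""
    i, j, sep = pair_separations(pos)
    close = sep < theta_fix
    keep = np.ones(len(pos), dtype=bool)
    for a, b in zip(i[close], j[close]):
        if keep[a] and keep[b]:
            keep[rng.choice([a, b])] = False
    return keep
```

The full FIX selection described above would combine both masks, for example `select_fix(pos, theta_fix, rng) & select_hlr(pos, theta_eff, alpha=1.0)`. Enumerating all pairs needs memory proportional to N^{2}; a tree-based neighbour search would be preferable for realistic catalogue sizes.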
The choice of the selection criterion and its parameters most likely depends on the quality of the data at hand and the shape measurement pipeline used. In general, one wishes to retain as many galaxies as possible while keeping the bias caused by isophote overlap below a certain threshold.
We remark that a multitude of variants of these selection criteria can be conceived. For the FIX criterion, for example, one could remove not a random galaxy from a close pair, but the galaxy with the lowest signal-to-noise ratio (SNR); a related possibility would be to always keep close pairs when the SNR of one galaxy is considerably larger than the SNR of the second. Using these variants is expected to lead to minor quantitative, but not to qualitative, changes of the results presented in the following sections.
In Tables 1 and 2, we list the galaxy number densities after applying the HLR and FIX criteria, respectively, for various values of α, θ_{see} and θ_{fix}. The values given in parentheses are the fractional decrease of the number density compared to the unfiltered galaxy catalogue. As expected, the deeper the survey and the more restrictive the criterion, the more galaxies are removed, since the probability of overlap is proportional to the projected galaxy density and the square of the threshold radius of the selection criterion.
Table 1. Galaxy number densities for the HLR criterion.
Table 2. Galaxy number densities for the FIX criterion.
We compute the shear correlation functions from our simulated galaxy catalogues using Eq. (1). We obtain the observed galaxy ellipticities ϵ using the relation (Schneider & Seitz 1995)
$$\epsilon = \frac{\epsilon^{(\mathrm{s})} + g}{1 + g^{*}\,\epsilon^{(\mathrm{s})}} \tag{4}$$
(valid for |g| ≤ 1), where g is the reduced shear obtained from the ray-tracing simulations and ϵ^{(s)} is the intrinsic ellipticity. For measuring the bias caused by the selection criteria described above, we set ϵ^{(s)} = 0 in Eq. (4). We include intrinsic ellipticities only when computing the covariance matrix of the shear correlation functions for the discussion in Sect. 6.
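For concreteness, the relation between intrinsic and observed ellipticity can be written as a one-line function. The sketch below assumes the standard form ϵ = (ϵ^{(s)} + g) / (1 + g^{*} ϵ^{(s)}), which holds for |g| ≤ 1; the function name is hypothetical.

```python
import numpy as np

def observed_ellipticity(eps_s, g):
    """Observed complex ellipticity for intrinsic ellipticity eps_s and
    reduced shear g, assuming |g| <= 1 (weak-lensing regime)."""
    return (eps_s + g) / (1.0 + np.conj(g) * eps_s)
```

With `eps_s = 0` the function returns `g` itself, which is the setting used when measuring the selection bias, so that the measured ellipticity correlations are those of the reduced shear.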
4. The effect of object selection
4.1. Effect on the redshift distribution
Using a selection criterion like the ones described in the previous section has two undesirable side effects. The first is that the redshift distribution of the filtered galaxy catalogue differs from the one before object selection. This is illustrated in Fig. 2, where we show in the lower panel the redshift distributions of our three mock surveys with different magnitude cuts without applying any selection criterion. The upper panel displays the ratios of the redshift distributions after and before object selection for the survey with r_{SDSS} = 25. We also consider the case of a limiting magnitude of the survey that is deeper than the cut used to define the sample of galaxies used for shape measurements (r_{SDSS} = 25). Seeing has very little effect on the results obtained with the FIX criterion, because the size of the seeing disk is typically much smaller than the fixed threshold radius θ_{fix}; we therefore only consider the case without seeing (θ_{see} = 0′′) for this criterion.
Fig. 2 Upper panel: ratio of the redshift distributions for r_{SDSS} = 25 after and before object selection for the HLR criterion with α = 2 (thick and thin solid lines, for two different seeing values) and the FIX criterion (thick dashed line). The case with a limiting magnitude deeper than the magnitude cut for the lensing catalogue is represented by the double-dotted blue line. Lower panel: redshift distributions of the unfiltered galaxy catalogues with limiting magnitudes r_{SDSS} = 24 (solid line), r_{SDSS} = 25 (dashed line) and r_{SDSS} = 26 (dotted line). 

In all cases, the largest fraction of the galaxies is removed at low redshifts. These galaxies have the largest apparent radii and thus have the largest probability of isophote overlap. The amplitude of the deviations is highest for the survey with the deeper limiting magnitude, because additional pairs are removed in which one galaxy is fainter than the magnitude cut r_{SDSS} but brighter than the limiting magnitude. For the FIX criterion and the HLR criterion with seeing, a secondary dip occurs, approximately at the redshift of the peak of the redshift distribution. This behaviour is due to the presence of angular clustering.
We illustrate this by constructing a simple analytical model, for which we subdivide the galaxies into redshift slices of width dz. We assume that there are no angular cross-correlations between slices at different redshifts. Furthermore, we use a power-law model for the angular correlation function, so that it is given by ω(θ;z,z′) = A(z) θ^{ − γ} if z = z′ and ω(θ;z,z′) = 0 otherwise. The galaxy radii are given in a deterministic fashion by θ_{e}(z) = r_{e}(z) / f_{K}[w(z)], where r_{e}(z) can be considered to be the mean radius of all galaxies in a thin redshift bin centred on z. The probability of finding a galaxy with redshift z′ within an annulus of radius θ and width dθ around a galaxy at redshift z is thus dp(z,z′) = 2πθ dθ [1 + ω(θ;z,z′)] N(z′) / Ω, where Ω is the total area of the survey, and N(z′) is the number of galaxies in the redshift slice centred on z′. Two galaxies have overlapping isophotes if they are closer than θ_{eff}(z,z′) = θ_{e}(z) + θ_{e}(z′) (corresponding to the HLR criterion with α = 1, which we choose here for simplicity). The total probability of overlap for two galaxies is given by the integral of dp(z,z′) over a circle with radius θ_{eff}(z,z′). Finally, we obtain the total number of galaxies removed from the slice at z, N_{rem}(z), by summing up the contributions from all redshift slices:
$$N_{\mathrm{rem}}(z) = N(z) \sum_{z'} \int_{0}^{\theta_{\mathrm{eff}}(z,z')} \frac{2\pi\theta\,\left[1+\omega(\theta;z,z')\right] N(z')}{\Omega}\,\mathrm{d}\theta \tag{5}$$
Simplifying and inserting our model for the correlation function, we obtain (for γ < 2)
$$N_{\mathrm{rem}}(z) = \frac{\pi N(z)}{\Omega}\left[\sum_{z'} N(z')\,\theta_{\mathrm{eff}}^{2}(z,z') + \frac{2A(z)}{2-\gamma}\,N(z)\,\theta_{\mathrm{eff}}^{2-\gamma}(z,z)\right] \tag{6}$$
where the second term accounts for the effect of galaxy clustering. We can simplify this even more by using a constant clustering amplitude A and a redshift-independent radius ϑ for all galaxies, so that θ_{eff}(z,z′) = 2ϑ. We then find that
$$\frac{N_{\mathrm{rem}}(z)}{N(z)} = \frac{4\pi\vartheta^{2}}{\Omega}\left[N_{\mathrm{tot}} + \frac{2A}{2-\gamma}\,(2\vartheta)^{-\gamma}\,N(z)\right] \tag{7}$$
where N_{tot} is the total number of galaxies in all slices. We see that in the absence of angular correlations, a constant fraction of objects is removed from the total galaxy population, given only by the fraction of the total area covered by galaxies. This leaves the shape of the redshift distribution unchanged. Galaxy clustering increases the probability of overlapping isophotes. However, this is effective only for galaxies in the same redshift slice. The clustering contribution to the fraction of blended objects in a slice is proportional to N(z) (and not N_{tot} as without clustering). This causes the secondary minimum seen in the upper panel of Fig. 2 for those selection criteria where θ_{e}(z) approaches a finite, constant value as z increases. This is the case if seeing is present, as well as for the FIX criterion. If, on the other hand, θ_{e}(z) is allowed to fall to zero, as for the HLR criterion without seeing, this suppresses the clustering term in Eq. (6), because the galaxy radii (and thus θ_{eff}) are already close to zero when N(z) approaches its maximum.
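The simplified version of this model (constant clustering amplitude A and a common radius ϑ) is easy to evaluate numerically. The sketch below integrates dp analytically over a circle of radius 2ϑ for a power-law correlation function with γ < 2; the function name and argument names are hypothetical, θ must be expressed in the unit in which A is defined, and the result is a first-order estimate that is only meaningful while the removed fraction is small.

```python
import numpy as np

def removed_fraction(n_slice, area, radius, amp, gamma):
    """First-order fraction of galaxies removed from each redshift slice.

    n_slice : galaxy counts N(z) per slice
    area    : survey area Omega (same angular unit squared as radius)
    radius  : common angular radius of all galaxies (theta_eff = 2 * radius)
    amp     : constant clustering amplitude A of w(theta) = A * theta**(-gamma)
    gamma   : power-law slope, must be < 2 for the integral to converge
    """
    n_slice = np.asarray(n_slice, dtype=float)
    theta_eff = 2.0 * radius
    # unclustered part: area of the exclusion circle times total density
    unclustered = np.pi * theta_eff**2 * n_slice.sum() / area
    # clustered part: only galaxies in the same slice contribute
    clustered = (2.0 * np.pi * amp * theta_eff**(2.0 - gamma)
                 / (2.0 - gamma) * n_slice / area)
    return unclustered + clustered
```

With `amp=0` the returned fraction is the same in every slice, so the shape of the redshift distribution is unchanged; with `amp > 0` the fraction is largest where N(z) peaks, which is the origin of the secondary dip discussed above.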
The change of the redshift distribution is relevant if the redshifts of the individual galaxies are not available. In such cases, p(z) is usually inferred from a subsample or a similar survey with either spectroscopic or photometric redshifts. In general, the objects used for these calibration samples are selected in a different way than the galaxies for the shear catalogue, and therefore the redshift distribution obtained in this way does not account for the change of p(z) due to object selection. For upcoming lensing surveys incorporating photometric redshifts, this should be less of a concern, since in this case the redshift distribution of the galaxies in the shear catalogue can at least in principle be estimated directly.
4.2. Density-dependent galaxy selection
The second, and probably more severe effect of using selection criteria such as HLR and FIX arises because the selection is density-dependent. Since the galaxy distribution is correlated with the underlying density field, a mass overdensity in the foreground also implies an overdensity of galaxies. This in turn implies a higher probability of isophote overlap and thus of the removal of galaxies. Therefore, the fraction of galaxy pairs that can be formed from galaxies located behind overdensities is decreased relative to all galaxy pairs that contribute to the shear correlation function estimator for a certain angular separation bin. High-density regions are therefore effectively downweighted compared to the case without object selection. The opposite is true for underdense regions: the probability that a galaxy behind the underdensity is filtered is reduced, and more pairs than for a region of average density contribute to the shear correlation functions. This reweighting is further modified by the fact that the fraction of galaxies that is removed from the fore- and background of a specific lens is not constant (see Fig. 2). The ratio of the number of foreground-foreground and foreground-background pairs, which do not carry information about the lens, to the number of background-background pairs, where both galaxies have been sheared by the lens, depends on the lens redshift. If relatively more pairs in the background than in the foreground are removed, the signal of the lens is further suppressed (and vice versa).
The net effect of all this is that the shear correlation estimator given by Eq. (1) is no longer unbiased. The reason for this is that the weights ρ_{i} are no longer uncorrelated with the galaxy ellipticities, because the selection weights m_{i} depend on the projected galaxy density through the mechanism described above.
4.3. Simulation results
We define the bias due to object selection as
$$\Delta\xi_{\pm}^{z+\delta}(\theta) = \xi_{\pm}^{\mathrm{filt}}(\theta) - \xi_{\pm}^{\mathrm{all}}(\theta) \tag{8}$$
where ξ_{ ± }^{filt} are the shear correlation functions after filtering for close pairs, and ξ_{ ± }^{all} are the correlation functions computed from all galaxies in the field of view. The superscript “z + δ” indicates that Δξ_{ ± }^{z + δ} includes the bias both due to the change of the redshift distribution and the density-dependent selection of galaxies.
In the upper panels of Figs. 3 and 4, we show the fractional bias
$$\frac{\Delta\xi_{\pm}^{z+\delta}(\theta)}{\xi_{\pm}^{\mathrm{all}}(\theta)} \tag{9}$$
for the HLR and FIX criterion, respectively. The error bars have been computed from the field-to-field variation between the 32 ray-tracing realisations. For both criteria, the shear correlation function is biased high by several percent on large scales, whereas on small scales this bias can become negative. The behaviour for large θ can be explained by the change of the redshift distribution, which gives more weight to high-redshift galaxies, which carry the strongest shear signal (see Fig. 2). On small scales, the negative bias begins to dominate due to the density dependence of the way galaxy pairs are selected (see below). As discussed before, seeing is only relevant for the HLR criterion and was therefore not considered in Fig. 4.
Fig. 3 Fractional bias of the shear correlation functions for the HLR criterion, without (upper panels) and with (lower panels) correction for the change of the redshift distribution. Thick dashed lines are for α = 1, thick solid lines for α = 3 without seeing. For the respective thin lines, a nonzero seeing was assumed. The shaded region shows the 1σ error. For better visibility, it is shown only for the case of α = 3, θ_{see} = 0′′. The error bars for the other cases are very similar. 

Fig. 4 Same as Fig. 3, but for the FIX criterion with θ_{fix} = 2′′ (solid red curves), θ_{fix} = 3.7′′ (short-dashed blue curves) and θ_{fix} = 5.0′′ (dot-dashed blue curves). 

Fig. 5 Comparison of the fractional bias of ξ_{ ± } for a survey whose limiting magnitude is deeper than the magnitude cut of r_{SDSS} = 25 for the galaxies that are used for shape measurements (solid lines), to the bias for a survey where the limiting magnitude coincides with this cut (dashed lines). For both cases, the HLR criterion with α = 2 was used. Thick curves display the case without seeing, thin curves the case with seeing. 

If photometric redshifts are available for all galaxies, the correct redshift distribution after object selection can be estimated, and one is only interested in the systematic effect caused by the density dependence of the galaxy selection. To quantify this, we would like to compare to fiducial correlation functions that were computed using the correct redshift distribution and with galaxy pairs selected in a fair way, i.e. uncorrelated with the density field. To this end, we take the unfiltered galaxy catalogues (the ones that led to the all-galaxy correlation functions ξ_{ ± }^{all}), sort the galaxies into redshift bins, and randomly remove galaxies from each bin so that the resulting new galaxy catalogue has the same redshift distribution as the catalogue after applying one of the selection criteria. We denote the correlation functions computed from the new catalogues by ξ_{ ± }^{rand}, and define the bias only due to the density dependence of the galaxy selection process (indicated by the superscript “δ”) by
$$\Delta\xi_{\pm}^{\delta}(\theta) = \xi_{\pm}^{\mathrm{filt}}(\theta) - \xi_{\pm}^{\mathrm{rand}}(\theta) \tag{10}$$
where ξ_{ ± }^{filt} are the correlation functions after filtering for close pairs. Accordingly, the corresponding fractional bias is given by
$$\frac{\Delta\xi_{\pm}^{\delta}(\theta)}{\xi_{\pm}^{\mathrm{rand}}(\theta)} \tag{11}$$
We show the simulation results for this fractional bias in the lower panels of Figs. 3 and 4. The bias is now consistent with being negative for all angular separations, as expected from the qualitative picture described in Sect. 4.2. The effect is most severe for small θ, whereas the bias seems to asymptotically approach zero on large scales. Even after correcting for the change of the redshift distribution, the bias is of the order of several percent and therefore constitutes a potentially significant contaminant for present and future cosmic shear surveys.
In Fig. 5, we compare a survey with a limiting magnitude deeper than the cut for the lensing catalogue of r_{SDSS} = 25 to a survey in which the limiting magnitude coincides with this cut, using the HLR criterion with α = 2. While the fractional bias that includes the change of the redshift distribution increases by ≈ 1% for the deeper limiting magnitude (see also Fig. 2), no significant differences between the two surveys can be found if the correct redshift distribution is known. The reason for this is that although more galaxies are removed when the deeper limiting magnitude is used, the bias is primarily due to the change of the relative weights of over- and underdense regions in the correlation function estimator. These weights only depend on the relative change of the number of galaxy pairs behind such structures. The same argument can be used to explain the results displayed in Fig. 6, where we investigate the behaviour of the bias (taking the change of p(z) into account) for various magnitude cuts (using the HLR criterion). We find that, for the magnitude cuts and the resulting redshift distributions of galaxies considered here, the fractional bias depends only very little on the survey depth. The only notable difference occurs on small scales, where the bias for deeper surveys is slightly less severe than for shallower ones.
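The construction of the fair comparison catalogue, i.e. random thinning of the unfiltered catalogue so that it matches the redshift distribution of the filtered one, could look as follows. This is a sketch under assumed inputs: `keep` is the boolean mask produced by a selection criterion, `z_edges` are hypothetical redshift bin edges, and the function name is chosen for this example.

```python
import numpy as np

def match_redshift_distribution(z_all, keep, z_edges, rng):
    """Randomly thin the unfiltered catalogue so that every redshift bin
    holds as many galaxies as the filtered catalogue (keep == True).
    Returns a boolean mask over the unfiltered catalogue."""
    z_all = np.asarray(z_all)
    keep = np.asarray(keep, dtype=bool)
    bins = np.digitize(z_all, z_edges)
    fair = np.zeros(len(z_all), dtype=bool)
    for b in np.unique(bins):
        members = np.flatnonzero(bins == b)
        n_target = np.count_nonzero(keep[members])
        if n_target:
            # draw the survivors without regard to local galaxy density
            fair[rng.choice(members, size=n_target, replace=False)] = True
    return fair
```

Because the survivors in each bin are drawn at random, the thinned catalogue shares the redshift distribution of the filtered catalogue (at the resolution of the bins) but carries no correlation between selection and density field.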
Fig. 6 Fractional bias of the shear correlation functions for various survey depths using the HLR criterion with α = 2, corrected for the change of the redshift distribution. Solid black lines with error bars: r_{SDSS} = 24; blue dashed line: r_{SDSS} = 25; red dotted line: r_{SDSS} = 26. 

5. Weighting scheme
The bias discussed in the previous sections is caused by the removal of galaxy pairs in a way that is correlated with the density field. This suggests that the bias could be reduced by increasing or decreasing the relative pair count behind over- and underdensities, respectively, to “fair” levels. Such a procedure would reduce both the bias due to the change of p(z) and due to the density-dependent selection. We propose that, if photo-z estimates are available also for the galaxies that have been filtered out, this can be achieved by identifying the nearest neighbour of a removed object on the sky, and doubling its weight for the correlation function computation (i.e. using it twice in the shear catalogue). We demand close proximity on the sky and in redshift to ensure that the shear of the neighbour is a reasonable proxy for the shear at the position of the filtered galaxy.
Fig. 7 Upper panel: distribution of the separation of rejected galaxies from their nearest accepted neighbour in a slice of thickness Δw; a magnitude cut of r_{SDSS} = 25 and the HLR criterion with α = 2 were used. Lower panel: same as upper panel, but for slices with thickness given by Δz_{phot}. Vertical lines indicate mean separations. 

Since the geometrical lensing weight functions change relatively slowly with comoving distance, it is sufficient to choose an object from a redshift slice centred on the removed object with a certain width (a few hundred Mpc). This also helps finding a neighbour that is sufficiently close to the removed galaxy on the sky; clearly, the larger the slice width, the smaller the projected nearest-neighbour distance of objects in the slice. This is illustrated in Fig. 7, where we show the distribution of the distance from a rejected galaxy to its nearest neighbour for slices with widths specified in comoving distance (upper panel) or photometric redshift (lower panel). The latter were obtained by simulating the photo-z accuracy in a typical contemporary weak lensing survey. We use the recipe described in Hildebrandt et al. (2007, 2009) to simulate a multi-colour catalogue based on realistic distributions of redshift, spectral type, magnitude and magnitude error, closely resembling the CFHTLS-Wide. Finally, photo-z’s are estimated with the BPZ code (Benítez 2000). Comparisons of the simulated photo-z accuracy to the one obtained from real CFHTLS data (Erben et al. 2009) show good agreement. For the thickest slices considered (Δz_{phot} = 0.2), the mean distance to the nearest neighbour is 44″, and decreases to 22″ for the slice with Δz_{phot} = 0.05.
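The weighting scheme can be sketched as follows for a small catalogue. The function name and arguments are hypothetical; the slice is taken to be |z_{i} − z_{j}| ≤ Δz/2 around the rejected galaxy, and a kept galaxy that is the nearest neighbour of several rejected galaxies has its weight raised once for each of them, which is an assumption of this sketch rather than something specified above.

```python
import numpy as np

def neighbour_weights(pos, z, keep, dz):
    """Weights for the shear catalogue after object selection.

    Every rejected galaxy adds 1 to the weight of its nearest kept
    neighbour on the sky within a redshift slice of full width dz
    centred on the rejected galaxy. Rejected galaxies get weight 0.
    """
    keep = np.asarray(keep, dtype=bool)
    weights = keep.astype(float)
    kept = np.flatnonzero(keep)
    for r in np.flatnonzero(~keep):
        in_slice = kept[np.abs(z[kept] - z[r]) <= 0.5 * dz]
        if in_slice.size == 0:
            continue  # no substitute in the slice; the galaxy is lost
        d = pos[in_slice] - pos[r]
        nearest = in_slice[np.argmin(np.hypot(d[:, 0], d[:, 1]))]
        weights[nearest] += 1.0
    return weights
```

The returned array can be used directly as the weights in a pair-weighted correlation function estimator. When every rejected galaxy finds a substitute, the weights sum to the size of the unfiltered catalogue.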
Fig. 8 Fractional bias of ξ_{ ± } using the weighting scheme described in Sect. 5. Thin curves: fractional bias after applying the weighting scheme for slices with thickness Δz_{phot}; thick solid and thick dot-dashed green curves: the two fractional biases without weighting scheme. Error bars were computed from field-to-field variation and for better visibility are shown only for one case. 

In Fig. 8, we compare the fractional bias of the correlation functions ξ_{ ± }^{w}, measured after applying the weighting scheme,
$$\frac{\Delta\xi_{\pm}^{\mathrm{w}}(\theta)}{\xi_{\pm}^{\mathrm{all}}(\theta)} \tag{12}$$
where
$$\Delta\xi_{\pm}^{\mathrm{w}}(\theta) = \xi_{\pm}^{\mathrm{w}}(\theta) - \xi_{\pm}^{\mathrm{all}}(\theta) \tag{13}$$
and ξ_{ ± }^{all} are the correlation functions computed from all galaxies, to the original fractional biases with and without the contribution from the change of the redshift distribution. We only consider slices defined in terms of photometric redshift; the corresponding results for slices with a given comoving thickness are very similar. The suggested procedure clearly reduces the bias, in particular on large scales. Its performance degrades at angular separations below ≈ 1′. The reason for this is that the selection of the nearest neighbour effectively corresponds to a smoothing of the shear field with smoothing length comparable to the mean nearest-neighbour distance, because it is assumed that the shear of the removed galaxy is the same as the one of its substitute. Since ξ_{ − } is more sensitive to small-scale power than ξ_{ + }, it should be particularly affected, as can indeed be seen in the lower panel of Fig. 8. The deviations seen for large θ are consistent with noise. On the other hand, the actual width of the redshift slice and, related to this, the actual value of the mean nearest-neighbour distance have very little effect on the quality of the method, although there is a slight tendency for thicker slices to yield values of the fractional bias that are larger by a few fractions of a percent. This means that the weighting scheme is relatively insensitive to the quality of the photometric redshifts. This is particularly important for those galaxies which have been filtered out because their light distribution is contaminated by a close neighbour, which also adversely affects the accuracy of their photo-z estimates.
6. Implications for cosmological parameters
Fig. 9 Likelihood analysis of the bias caused by the removal of blended galaxies. Each panel shows the 1σ and 2σ confidence contours obtained by marginalizing over the remaining parameter (computed for the HLR criterion with α = 2). The fiducial parameter values are marked with crosses. Circles indicate the maximum-likelihood estimates assuming that the true redshift distribution after object selection is unknown, squares show the maxima if the correct p(z) is used, and triangles give the estimates after applying the weighting scheme of Sect. 5. Filled symbols are used for the HLR criterion (α = 2), open symbols for the FIX criterion with θ_{fix} = 3.7′′. For better visibility, the estimates for each criterion type have been connected with a line.

To illustrate the importance of the object selection bias for parameter estimation, we perform a likelihood analysis to fit for the parameters π = (Ω_{m},σ_{8},w_{0}) (assuming a flat universe). Our fiducial cosmological model, denoted by π_{0}, is that of the Millennium Simulation (see Sect. 3). We use the galaxy sample with r_{SDSS} = 25 from our ray-tracing simulations to simulate a survey with an area of 1500 deg^{2}. To each galaxy, we assign Gaussian ellipticity noise with dispersion σ_{ϵ} = 0.4. We consider correlation functions given on ten logarithmically spaced bins in the range from 1′ to 80′, and assume a Gaussian likelihood of the form
ℒ(ξ | π) ∝ exp{ −(1/2) [ξ − m(π)]^{t} C^{−1} [ξ − m(π)] }. (14)
Here, ξ = [ξ_{ + }(θ_{1}), ..., ξ_{ + }(θ_{n}), ξ_{ − }(θ_{1}), ..., ξ_{ − }(θ_{n})]^{t} is the measured correlation function (see below), written in vectorial form, and m(π) is the model prediction based on the three-dimensional matter power spectrum as given by Smith et al. (2003). The covariance matrix C has been estimated from the field-to-field variation of the ray-tracing realisations. When computing its inverse, we correct for the bias caused by the noise in the estimate of C by using the correction factor described in Hartlap et al. (2007).
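As a minimal sketch of this likelihood evaluation, the following assumes the standard Gaussian form ln ℒ = −χ²/2 + const. and the commonly quoted correction factor (n − p − 2)/(n − 1) for the inverse of a covariance matrix estimated from n independent realisations with a data vector of length p. The function names and the toy numbers are illustrative assumptions and are not taken from the analysis above.

```python
def hartlap_factor(n_real, n_data):
    """Factor that debiases the inverse of a sample covariance matrix
    estimated from n_real realisations of an n_data-dimensional vector."""
    if n_real <= n_data + 2:
        raise ValueError("need n_real > n_data + 2 for a usable inverse")
    return (n_real - n_data - 2) / (n_real - 1)

def log_likelihood(xi, model, inv_cov_raw, n_real):
    """Gaussian log-likelihood up to an additive constant.

    xi, model   : data vector and model prediction (lists of equal length)
    inv_cov_raw : plain inverse of the estimated covariance (list of lists)
    n_real      : number of realisations used to estimate the covariance
    """
    n_data = len(xi)
    resid = [d - m for d, m in zip(xi, model)]
    chi2 = sum(resid[i] * inv_cov_raw[i][j] * resid[j]
               for i in range(n_data) for j in range(n_data))
    return -0.5 * hartlap_factor(n_real, n_data) * chi2

# Toy example with two data points and a diagonal inverse covariance.
print(log_likelihood(xi=[1.0, 2.0], model=[1.0, 1.0],
                     inv_cov_raw=[[1.0, 0.0], [0.0, 4.0]], n_real=10))
```

For the data vector described above (ten bins each for ξ_{ + } and ξ_{ − }, so p = 20), the factor approaches unity only when the number of realisations is much larger than 20.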
Since we are only interested in the effect of removing close galaxy pairs and not in the bias caused by the mismatch of the theoretical model and the correlation functions in the Millennium Simulation (see Hilbert et al. 2009), we construct the data vectors ξ from the model for our fiducial set of parameters, m(π_{0}), and the fractional bias β due to object selection, measured from the ray-tracing simulations, component by component:
ξ_{i} = m_{i}(π_{0}) (1 + β_{i}). (15)
For β, we consider three cases: (a) assuming no knowledge of the true redshift distribution after removing blended galaxies (i.e. using the fractional bias from Eq. (8)); (b) assuming that the change of the redshift distribution has been taken into account (using the fractional bias from Eq. (10)); and (c) assuming that the weighting scheme described in the previous section has been applied (using the fractional bias from Eq. (13)).
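How a fractional bias in the data vector propagates into a parameter shift can be seen in a deliberately simplified toy case. The sketch below assumes that the bias acts multiplicatively on the fiducial model, that the model depends on a single amplitude parameter A through m(A) = A t with a fixed template t, and that the covariance is diagonal; under these assumptions the maximum-likelihood amplitude has a closed form. All names and numbers are hypothetical and do not correspond to the three-parameter fit described above.

```python
def biased_data_vector(model_fid, beta):
    """Apply a fractional bias beta_i to each component of the fiducial model."""
    return [m * (1.0 + b) for m, b in zip(model_fid, beta)]

def ml_amplitude(xi, template, sigma):
    """Maximum-likelihood amplitude A for the model m(A) = A * template,
    assuming independent Gaussian errors sigma_i on each component."""
    num = sum(t * d / s ** 2 for t, d, s in zip(template, xi, sigma))
    den = sum(t * t / s ** 2 for t, s in zip(template, sigma))
    return num / den

# Toy template with a scale-dependent bias: largest on the first
# (smallest-scale) bin and vanishing on the last one.
template = [4.0, 2.0, 1.0]
beta = [-0.04, -0.02, 0.0]
sigma = [1.0, 1.0, 1.0]

xi = biased_data_vector(template, beta)   # fiducial amplitude A_0 = 1
shift = ml_amplitude(xi, template, sigma) - 1.0
print(f"fractional shift of the amplitude: {shift:.4f}")
```

In this toy case the shift is an inverse-variance-weighted average of the β_{i}, with weights t_{i}^{2}/σ_{i}^{2}, so bins with a high signal-to-noise ratio dominate the resulting parameter bias.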
In Fig. 9, we show the results of this procedure for both the HLR criterion (α = 2; other settings as in Fig. 9) and the FIX criterion with θ_{fix} = 3.7′′. In Table 3, we show the fractional shifts of each cosmological parameter with respect to its fiducial value for the various cases. As expected from the larger amplitude of the bias in the correlation functions for the HLR criterion (see Figs. 3 and 4), the deviation of the maximum-likelihood points from the true values is generally larger for the HLR criterion than for the FIX criterion. In both cases, the parameter estimates are off by several percent. Interestingly, knowledge of the correct redshift distribution does not necessarily improve the parameter estimates, which is particularly striking in the Ω_{m}-σ_{8} plane. The bias of the maximum-likelihood estimates of all parameters is reduced by a significant amount if the proposed weighting scheme is applied, as could be expected from the corresponding reduction of the bias in the shear correlation functions (see Fig. 8). While the method is not accurate enough to be applied to planned large-area surveys, it works sufficiently well to reduce the bias to a level well below the statistical errors for surveys like the CFHTLS (at least for the non-tomographic case considered here). Furthermore, a comparison of the correlation functions measured with and without the weighting scheme may be used to assess the importance of the object selection bias for a given survey.
Table 3 Fractional bias of cosmological parameters.
7. Summary and conclusions
We have described a new, previously unconsidered systematic effect affecting the measurement of the shear correlation functions. The cause of the bias is the common practice of removing galaxies from the lensing catalogue that have very close neighbours, in order to avoid isophote overlap. While this filtering is necessary for obtaining clean shape estimates, it has two adverse effects on the correlation function estimate. First, it alters the redshift distribution of the galaxy catalogue. This is most important for low redshifts (where the angular sizes of galaxies are large) and (as a result of the angular clustering of the galaxies) near the peak of the redshift distribution. Second, such filtering predominantly removes galaxies that lie behind overdense regions. Therefore, fewer pairs of galaxies can be formed that carry the shear signal of the overdensity, effectively downweighting it in the correlation function estimator. For similar reasons, underdense patches of the sky receive a higher weight. As a result, the estimate of the shear correlation functions obtained from a galaxy catalogue from which close pairs have been removed is biased.
In order to quantify this bias, we have run ray-tracing simulations through the Millennium Simulation in conjunction with a semi-analytic model of galaxy formation and observed scaling relations for the radii of the galaxies. We consider two different selection criteria: one removes pairs of galaxies whose separation is below a threshold that depends on the radii of the galaxies, and the other removes one galaxy of each pair whose separation is below a fixed threshold. We find that the change of the redshift distribution due to filtering is of the order of several percent; however, this can in principle be dealt with if photometric redshifts are available for all galaxies.
The effect of the density dependence of the galaxy selection varies with angular separation. We find that on scales of ≈ 1′, the shear correlation functions are biased low by typically several percent; the bias decreases for larger angular separations. The bias seems to be almost independent of the survey depth. While seeing has essentially no effect on the bias when a fixed threshold radius is used to define close pairs, adding seeing to the simulations can significantly increase the bias for the selection criterion depending on the sizes of galaxies.
We note that the bias studied here is different from the effects of the clustering of source galaxies previously discussed in the literature (Bernardeau 1998; Schneider et al. 2002b): the removal of close galaxy pairs creates an anticorrelation between foreground and background galaxies, and thus between the lensing matter distribution and the galaxies that are used to trace the shear field caused by the matter in the foreground. This induces clustering between galaxy populations that are widely separated in redshift, whereas the effect of Schneider et al. (2002b) arises from the clustering of source galaxies that are at very similar redshifts and which need not be related to the dark matter distribution at all.
We have investigated the impact of the new systematic effect on estimates of Ω_{m}, σ_{8} and w_{0}, assuming a flat universe and keeping all other parameters fixed. Irrespective of whether the correct redshift distribution is used or not, we find shifts of the maximum-likelihood estimators of several percent. The situation can be significantly improved by using different weights for the galaxies that are eventually used for measuring the correlation functions. The weighting scheme consists of double-counting the nearest neighbour (from within a redshift slice with a thickness of a few hundred Mpc) of a galaxy that has been removed. This requires photometric redshift estimates to be available also for the galaxies that have been removed by the selection criterion. We find that the method works well even for slices as thick as Δz_{phot} = 0.2, so that the requirements on the quality of these redshift estimates are relatively low. The weighting scheme restores the pair count to fair levels and substitutes the shear of the filtered galaxy with the shear of the nearest neighbour. The scheme is surprisingly insensitive to the actual width of the redshift slice and reduces the bias of the correlation function to levels of ≲ 1% for angular scales ranging from ≈ 2′ to ≈ 80′. Accordingly, the bias of cosmological parameter estimates is also significantly reduced.
Given the amplitude of the bias of the shear correlation function, this new systematic effect has the potential of being very significant. The weighting scheme we propose is a first step towards controlling it, but it probably lacks the accuracy necessary for the next generation of weak lensing experiments.
Acknowledgments
We would like to thank Sherry Suyu, Tim Eifler, Benjamin Joachimi and Simon White for helpful discussions and input during the course of this project. J.H. and S.H. acknowledge support by the Deutsche Forschungsgemeinschaft within the Priority Programme 1177 under the project SCHN 342/6 and the Transregional Collaborative Research Centre TRR 33 “The Dark Universe”. H.H. was supported by the European DUEL RTN, project MRTN-CT-2006-036133. The Millennium Simulation databases used in this paper and the web application providing online access to them were constructed as part of the activities of the German Astrophysical Virtual Observatory.
References
Bartelmann, M., & Schneider, P. 2001, Phys. Rep., 340, 291
Benítez, N. 2000, ApJ, 536, 571
Bernardeau, F. 1998, A&A, 338, 375
Bernstein, G., & Huterer, D. 2010, MNRAS, 401, 1399
Bridle, S., Balan, S. T., Bethge, M., et al. 2010, MNRAS, 405, 2044
Crittenden, R., Natarajan, P., Pen, U., & Theuns, T. 2002, ApJ, 568, 20
De Lucia, G., & Blaizot, J. 2007, MNRAS, 375, 2
Eifler, T., Schneider, P., & Hartlap, J. 2009, A&A, 502, 721
Erben, T., Hildebrandt, H., Lerchster, M., et al. 2009, A&A, 493, 1197
Hartlap, J., Simon, P., & Schneider, P. 2007, A&A, 464, 399
Hartlap, J., Schrabback, T., Simon, P., & Schneider, P. 2009, A&A, 504, 689
Hearin, A. P., Zentner, A. R., Ma, Z., & Huterer, D. 2010, ApJ, 720, 1351
Heymans, C., & Heavens, A. 2003, MNRAS, 339, 711
Heymans, C., Van Waerbeke, L., Bacon, D., et al. 2006, MNRAS, 368, 1323
Hilbert, S., White, S. D. M., Hartlap, J., & Schneider, P. 2008, MNRAS, 386, 1845
Hilbert, S., Hartlap, J., White, S. D. M., & Schneider, P. 2009, A&A, 499, 31
Hildebrandt, H., Pielorz, J., Erben, T., et al. 2007, A&A, 462, 865
Hildebrandt, H., Pielorz, J., Erben, T., et al. 2009, A&A, 498, 725
Joachimi, B., & Bridle, S. L. 2010, A&A, 523, A1
Joachimi, B., & Schneider, P. 2008, A&A, 488, 829
Joachimi, B., & Schneider, P. 2009, A&A, 507, 105
Kaiser, N. 1992, ApJ, 388, 272
Kaiser, N., & Pan-STARRS Collaboration. 2005, in BAAS, 37, 465
King, L., & Schneider, P. 2002, A&A, 396, 411
King, L. J., & Schneider, P. 2003, A&A, 398, 23
Krause, E., & Hirata, C. M. 2010, A&A, 523, A28
Lemson, G., & Springel, V. 2006, in Astronomical Data Analysis Software and Systems XV, ed. C. Gabriel, C. Arviset, D. Ponz, & S. Enrique, ASP Conf. Ser., 351, 212
Lemson, G., & the Virgo Consortium 2006 [arXiv:astro-ph/0608019]
Ma, Z., Hu, W., & Huterer, D. 2006, ApJ, 636, 21
Maoli, R., Van Waerbeke, L., Mellier, Y., et al. 2001, A&A, 368, 766
Massey, R., Heymans, C., Bergé, J., et al. 2007, MNRAS, 376, 13
Refregier, A., Amara, A., Kitching, T. D., et al. 2010 [arXiv:1001.0061]
Schmidt, F., Rozo, E., Dodelson, S., Hui, L., & Sheldon, E. 2009, ApJ, 702, 593
Schneider, P., & Hartlap, J. 2009, A&A, 504, 705
Schneider, P., & Seitz, C. 1995, A&A, 294, 411
Schneider, P., van Waerbeke, L., Jain, B., & Kruse, G. 1998, MNRAS, 296, 873
Schneider, P., van Waerbeke, L., Kilbinger, M., & Mellier, Y. 2002a, A&A, 396, 1
Schneider, P., van Waerbeke, L., & Mellier, Y. 2002b, A&A, 389, 729
Schneider, P., Eifler, T., & Krause, E. 2010, A&A, 520, A116
Shen, S., Mo, H. J., White, S. D. M., et al. 2003, MNRAS, 343, 978
Smith, R. E., Peacock, J. A., Jenkins, A., et al. 2003, MNRAS, 341, 1311
Springel, V., White, S. D. M., Jenkins, A., et al. 2005, Nature, 435, 629
Takada, M., & White, M. 2004, ApJ, 601, 1
Trujillo, I., Förster Schreiber, N. M., Rudnick, G., et al. 2006, ApJ, 650, 18
Van Waerbeke, L., Mellier, Y., Erben, T., et al. 2000, A&A, 358, 30
White, M. 2005, Astroparticle Physics, 23, 349
Zhang, P. 2010, ApJ, 720, 1090