Issue 
A&A
Volume 536, December 2011



Article Number  A95  
Number of page(s)  36  
Section  Cosmology (including clusters of galaxies)  
DOI  https://doi.org/10.1051/00046361/201117117  
Published online  16 December 2011 
Covariance matrices for halo number counts and correlation functions^{⋆}
^{1} Institut de Physique Théorique, CEA Saclay, 91191 GifsurYvette, France
email: patrick.valageas@cea.fr
^{2} Laboratoire AIM, CEA/DSM/IRFU/Sap, CEA Saclay, 91191 GifsurYvette, France
^{3} ArgelanderInstitut für Astronomie, University of Bonn, Auf dem Hügel 71, 53121 Bonn, Germany
Received: 20 April 2011
Accepted: 20 September 2011
Aims. We study the mean number counts and twopoint correlation functions, along with their covariance matrices, of cosmological surveys such as for clusters. In particular, we consider correlation functions averaged over finite redshift intervals, which are well suited to cluster surveys or populations of rare objects, where one needs to integrate over nonzero redshift bins to accumulate enough statistics.
Methods. We develop an analytical formalism to obtain explicit expressions of all contributions to these means and covariance matrices, taking into account both shotnoise and samplevariance effects. We compute loworder as well as highorder (including nonGaussian) terms.
Results. We derive expressions for the number counts per redshift bins both for the general case and for the small window approximation. We estimate the range of validity of Limber’s approximation and the amount of correlation between different redshift bins. We also obtain explicit expressions for the integrated 3D correlation function and the 2D angular correlation. We compare the relative importance of shotnoise and samplevariance contributions, and of loworder and highorder terms. We check the validity of our analytical results through a comparison with the Horizon fullsky numerical simulations, and we obtain forecasts for several future cluster surveys.
Key words: cosmology: observations / largescale structure of Universe / galaxies: clusters: general
Appendices are only available in electronic form at http://www.aanda.org
© ESO, 2011
1. Introduction
The largescale structure of the Universe is a key test of modern cosmological scenarios. Indeed, according to the standard cosmological model, the largescale structures of the present Universe have formed through the amplification by gravitational instability of small almostGaussian primordial fluctuations (Peebles 1980). Then, from observations of the recent Universe, such as galaxy surveys (Cole et al. 2005; Tegmark et al. 2006), cluster surveys (Evrard 1989; Oukbir & Blanchard 1992; Pacaud et al. 2007), weaklensing studies (Massey et al. 2007; Munshi et al. 2008), or measures of baryon acoustic oscillations (Eisenstein et al. 1998, 2005), one can derive constraints on the cosmological parameters (e.g., the mean matter and dark energy contents) and on the properties of the initial perturbations (e.g., possible deviations from Gaussianity). Moreover, one can check whether these structures have really formed through this gravitational instability process.
In the case of welldefined astrophysical objects, such as galaxies or Xray clusters, which form a discrete population, standard probes are the abundance of these objects, that is, “number counts”, and their loworder correlation functions. Galaxies are governed by complex gas physics and star formation processes, which makes it difficult to relate their abundance as a function of optical luminosity to theoretical predictions. However, since they are rather common objects (one can reach almost 10^{6} galaxies in current surveys, e.g. Abazajian et al. 2009) it is possible to reconstruct halo density fields using subsamples of luminous red galaxies and to compare their power spectrum (i.e. the Fourier transform of their twopoint correlation) with theory to derive constraints on cosmology (Reid et al. 2010). In contrast, as the largest nonlinear objects in the present Universe, galaxy clusters are much rarer (current cluster samples have a density of about ten per deg^{2} at most, e.g. Adami et al. 2011), but their relationship with dark matter halo mass is controlled better. This means that their abundance is very sensitive to cosmological parameters (especially Ω_{m} and σ_{8}, Evrard 1989; Oukbir & Blanchard 1992), and has already been used to derive constraints on cosmology, but their clustering has not yet provided much cosmological information (because of low statistics). However, upcoming cluster surveys should allow the use of both number counts and spatial clustering to derive cosmological constraints (Pierre et al. 2011).
To compare observations with theory, one needs either to relate the observed properties of the objects (e.g., optical galaxy luminosity, Xray cluster luminosity or temperature) to the quantities that are predicted by theoretical models (e.g., virialized halo mass) or to use semianalytical models that attempt to build mock catalogs (Harker et al. 2007). For clusters, one can use scaling laws between luminosity, temperature, and mass, calibrated on observations of the local universe (Arnaud et al. 2005, 2010). Then, one must take the selection function of the survey being considered into account, since the probability of detecting the objects is usually more complex than a sharp cutoff on mass or luminosity (Pierre et al. 2011). Finally, one needs to estimate the error bars of the statistical quantities that are measured, in order to derive meaningful constraints on cosmology. In addition to the uncertainties associated with the relationship between observed quantities (e.g., luminosity) and theoretical quantities (e.g., halo mass) discussed above, two unavoidable sources of uncertainty are the “shotnoise” effects of the discrete character of the population and the “sample variance” due to the limited size of the survey. In the case of a fullsky survey, the latter is also known as the “cosmic variance”, due to the fact that we only observe “one sky” so that there is only a limited number of lowk modes to be measured.
In practice (Benoist et al. 1996; Maller et al. 2005; Norberg et al. 2009), one often estimates error bars from the data itself by subsampling the data and by computing the scatter between the means measured within each subsample (e.g., jackknife resampling). However, if one studies rare objects (e.g., clusters) or large scales, it is not possible to obtain reliable estimates from such subsamplings (because of lowquality statistics). Moreover, one often wishes to estimate the signaltonoise ratio of future surveys, even before they have been approved by research agencies, in order to evaluate their scientific possibilities and to compare the efficiency of different probes. Then, one must use numerical simulations (Pierre et al. 2011; Croton et al. 2004; Kazin et al. 2010) or analytic methods. The former have the advantage of greater power (in the sense that one may explicitly introduce complex recipes for the formation of the objects, such as cooling processes and feedback, or intricate survey geometry), but are limited by finite resolution on large scales and for rare objects. Analytical approaches allow one to describe a wider range of scales and halo masses, and usually provide faster computations. Hence they remain a useful complementary method, which we investigate in this paper.
Thus, in this paper, we present a general analytical formalism for computing the shotnoise and samplevariance error bars of estimators of number counts and realspace twopoint correlations for deep surveys that cover a significant range of redshifts. We consider both the 3D correlation function and the 2D angular correlation function on the sky. As explained above, this study is motivated by the need for such covariance matrices to compare any survey with theory. This extends over some previous works, which only considered the sample variance of number counts (Hu & Kravtsov 2003) or neglected highorder or nonGaussian terms in the sample variance of estimators for twopoint correlations or power spectra (Feldman et al. 1994; Majumdar & Mohr 2004; Eisenstein et al. 2005; Cohn 2006; Crocce et al. 2011). Indeed, while the sample variance of number counts (i.e., the mean number of objects per unit volume) only involves the twopoint correlation of the objects, the sample variance of estimators of the twopoint correlation itself also involves the three and fourpoint correlations (and so on for estimators of higher order correlation functions) (Bernstein 1994).
Some previous studies have already included the contributions of such higher order correlations to covariance matrices (Szapudi & Colombi 1996; Meiksin & White 1999; Scoccimarro et al. 1999; Eisenstein & Zaldarriaga 2001; Smith 2009), mostly in the context of galaxy surveys. However, we extend these works by comparing the various contributions to expected error bars in detail, including all shotnoise and samplevariance terms, as well as highorder contributions, and by studying realspace twopoint correlation functions instead of Fourierspace power spectra. Moreover, since we have the application to cluster surveys in mind, and more generally to deep surveys of rare objects, we consider quantities that are defined by integration over finite redshift bins. For instance, number counts may be associated with bins Δz = 0.1 while the twopoint correlation functions are integrated over a significant redshift interval, such as 0 < z < 1, to accumulate enough statistics. Then, the statistical quantities that we consider involve integrations along the lineofsight, rather than local power spectra or twopoint correlations in a small box at a given redshift. Moreover, for number counts we consider arbitrary angular scales, from small angles, where the Limber approximation applies, to fullsky surveys. In these various respects, our study fills a gap in published works.
In view of the application to cluster surveys, we consider a population of objects defined by their mass M, and focusing on the case of dark matter halos, we use the halo mass function and bias measured in previous numerical studies for our numerical computations. To estimate the three and fourpoint correlation functions (needed for the covariance of the twopoint estimator), we use a simple hierarchical model (Peebles 1980; Bernstein 1994), which writes these higher order correlations as products of the twopoint correlation. We give the explicit expressions of our results, which we also compare with numerical simulations, and we provide several realistic illustrations. In particular, we take advantage of our analytical formalism to compare the various contributions to the error bars and derive approximate scalings. Although we eventually apply our results to several future cluster surveys, our method is more general and could be applied to other objects (e.g., galaxies or quasars), defined by other quantities (e.g., luminosity), provided one has a model for their multiplicity function and their twopoint correlation, and the three and fourpoint correlation functions can be described reasonably well by a hierarchical model in the regime where they are relevant.
This paper is organized as follows. In Sect. 2 we first briefly describe the analytic models that we use to estimate the means and covariance matrices of numbers counts and halo correlations, as well as the numerical simulations that we use to check the accuracy of our results. Then, we study the halo number counts per redshift bins in Sect. 3. This allows us to introduce on a simple example our approach to evaluate the mean and the covariance matrix of various estimators. We consider the cases of both small angular windows and arbitrary angular windows (including fullsky surveys), and we estimate the accuracy of smallangle approximations and the decay in correlations between distant redshift bins. Next, we study the realspace 3D halo correlation function in Sect. 4. We consider both the Peebles & Hauser and the Landy & Szalay estimators and compare their covariance matrices. We also discuss the relative importance of different contributions to these covariance matrices (shot noise/sample variance, loworder/highorder terms). Then, we investigate the halo angular correlation in Sect. 5, using the same approach. Finally, we apply our formalism to several real survey cases in Sect. 6 and we conclude in Sect. 7.
We give details of our calculations in several appendices. We discuss shotnoise terms in Appendix A, in the simple case of number counts, and finitesize effects in Appendix B. Then, we describe our computation of the mean and covariance of the estimators of the 3D halo correlation in Appendices C and D, for the Peebles & Hauser estimator, and in Appendix E for the Landy & Szalay estimator. We give further details on highorder contributions to the covariance matrices in Appendix F, for the 3D correlation, and Appendix H, for the angular correlation.
2. Halo density fields
Before we describe our analysis of the covariance matrices for halo number counts and correlation functions, we present in this section the analytic models that we use for the underlying halo distributions (mass and bias function, etc.) and the numerical simulations that we use to validate our results.
2.1. Analytic models
2.1.1. Halo mass function and correlation
To be consistent with the numerical simulations, in Sects. 3.1 to 5, where we develop our formalism and compare our results with simulations, we use the WMAP3 cosmology (Spergel et al. 2007), that is, Ω_{m} = 0.24, Ω_{de} = 0.76, Ω_{b} = 0.042, h = 0.73, σ_{8} = 0.77, n_{s} = 0.958, and w_{de} = −1. In Sect. 6, where we apply our formalism to obtain forecasts for current and future surveys, we use the more recent WMAP7 cosmology (Komatsu et al. 2011), that is, Ω_{m} = 0.274, Ω_{de} = 0.726, Ω_{b} = 0.046, h = 0.702, σ_{8} = 0.816, n_{s} = 0.968, and w_{de} = −1.
In this paper, keeping in mind the study of Xray clusters, we consider the number counts and correlations of dark matter halos defined by the nonlinear density contrast δ = 200. These halos are fully characterized by their mass, and we do not investigate the relationship between this mass and cluster properties such as the gas temperature and Xray luminosity. These scaling laws can be added to our formalism to derive the cluster number counts and correlations, depending on the quantities that are actually measured, but we keep a more general setting in this paper.
We use the halo mass function, dn/dlnM, of Tinker et al. (2008), and the halo bias of Tinker et al. (2010). Thus, the twopoint correlation function between two halos labeled “i” and “j” can be factored^{1} in as (1)where ξ is the matter density correlation, and the bias factors b_{i} and b_{j} do not depend on scale, (2)This approximation of scaleindependent halo bias is valid to better than 10% on scales 20 < r < 130 h^{1} Mpc (Manera & Gaztanaga 2011), with a small feature on the baryon acoustic scale (r ~ 100 h^{1}Mpc) of amplitude of 5% (Desjacques et al. 2010).
Throughout most of this paper we assume that correlations are negligible over cosmological distances (of order c/H_{0}), so that the redshift z on the righthand side of Eq. (1) can be taken at will as z_{i} or z_{j} (or the mean (z_{i} + z_{j})/2). In Sect. 3.2.2, where we consider the case of large angular windows (and go beyond the flat sky and Limber’s approximations), we do not use this approximation but replace the matter density correlation by its linear approximation. This yields the alternative factorization that allows one to handle arbitrary redshifts z_{i} and z_{j}.
For the nonlinear matter correlation ξ(x;z), and the Fourierspace nonlinear power spectrum P(k). defined by (3)we use the popular fitting formula to numerical simulations of Smith et al. (2003).
Throughout this paper, all angular number densities are in units of deg^{2}.
2.1.2. Threepoint and fourpoint halo correlations
The covariance matrices of the estimators for the halo twopoint correlation ξ^{h} also involve the halo threepoint and fourpoint correlation functions, ζ^{h} and η^{h}, so we must define a model for these quantities. On large scales, for Gaussian initial conditions, the three and fourpoint correlation functions of the matter density field behave as ζ ~ ξ^{2} and η ~ ξ^{3} at lowest order over ξ (Bernardeau et al. 2002; Goroff et al. 1986). On small scales, these scaling laws remain a reasonable approximation (Colombi et al. 1996), but with numerical prefactors that are different from the largescale ones (and may slightly vary with scale). On the other hand, for rare massive clusters, using the standard approach of Kaiser (1984) where virialized objects are identified with overdense regions in the linear density field, Politzer & Wise (1984) obtain . Since our goal is only to estimate the magnitude of highorder contributions we consider in this article a simple “hierarchical clustering ansatz”, where the N − point correlation function can be expressed in terms of products of (N − 1) twopoint correlation functions through tree diagrams (Groth & Peebles 1977; Peebles 1980). This is the simplest model^{2} that is in qualitative agreement with largescale theoretical predictions and smallscale numerical results, as well as with observations.
Through a comparison with numerical simulations, we check that the accuracy of this model is sufficient for our purpose, which is to estimate signaltonoise ratios and compare different survey strategies (while we would require a higher accuracy for the computation of the means themselves, that is, the number counts and twopoint correlations that we wish to measure). The advantage of this simple model is that it describes all scales, through the scalings recalled above, and does not require additional free parameters.
Thus, as in Bernstein (1994), Szapudi & Colombi (1996), Meiksin & White (1999), we write the threepoint halo correlation function as (4)where we sum over all three possible configurations over the three halos labeled “1”, “2”, and “3”. This corresponds to the three treediagrams shown in Fig. 1. We use a linear bias model^{3}, as in Eq. (1), and for the matter density normalization factor S_{3}, we take its largescale limit, which is obtained by perturbation theory (Peebles 1980; Fry 1984; Bernardeau et al. 2002), (5)where n is the slope of the linear power spectrum at the scale of interest. Within the same “hierarchical clustering ansatz”, the fourpoint correlation function is expressed in terms of products of three twopoint functions, as shown in Fig. 2. We have two possible topologies, and sixteen different diagrams for four distinct halos. For simplicity, in this work we give the same weight to all sixteen diagrams, independently of their topology, as (6)where “3 cyc.” and “11 cyc.” stand for three and eleven terms that are obtained from the previous one by permutations over the labels “1, 2, 3, 4” of the four halos. Again we take for S_{4} its largescale limit, (7)This is the simplest possible model, where the angular dependence only comes from the decomposition over the terms in brackets in Eqs. (4) and (6). More complex models and exact computations at lowest order of perturbation theory would introduce homogeneous kernels Q_{N}(x_{1},..,x_{N}) (in place of the numbers S_{N}) that also depend on the angles between the vectors x_{i} and x_{j} (Scoccimarro et al. 1999). Observations of galaxy clustering show, for instance, that Q_{3} displays a weak dependence on the triangle shape, while remaining close to unity (Gaztanaga et al. 2005; Kulkarni et al. 2007). Here we simply take the constant value Q_{3} = S_{3}/3.
Fig. 1 The “hierarchical clustering ansatz” for the threepoint correlation function of Eq. (4). Each solid line corresponds to a twopoint correlation ξ, and ζ^{h} is written as the sum of these three diagrams, with a multiplicative factor b_{1}b_{2}b_{3}S_{3}/3. 
Fig. 2 The two topologies of the fourpoint diagrams associated with the “hierarchical clustering ansatz” for the fourpoint correlation, as in Eq. (6). The numbers are the multiplicity factors of each diagram. 
In terms of the halo twopoint correlation, Eqs. (4) and (6) imply halo coefficients that behave as . For b ~ 1 and n ≃ − 1 and from Eqs. (5) and (7), this gives the values and , which roughly agree with observations of galaxy clustering (Szapudi et al. 2001; Croton et al. 2004; Ross et al. 2006; Marin et al. 2007).
2.1.3. Flatsky and Limber’s approximations
In this paper we often encounter quantities, such as the mean matter density correlation over a redshift bin j, z_{j, − } < z′ < z_{j, + }, with respect to some redshift z in a second bin i, integrated over some angular window of area (ΔΩ), (8)where and . Here χ(z) and are the comoving radial and angular distances, and we introduce the factor so that is dimensionless. Equation (8) is a “conical” average, within the observational cone. However, for small angular windows it is possible to use a flatsky approximation and to approximate this “conical” average by a “cylindrical” average . Thus, using Eq. (3) and assuming that the correlation ξ is negligible on cosmological scales, we write for circular angular windows of radius θ_{s}, for a redshift z that also belongs to the jbin, (9)and if z does not belong to the jbin. Here k_{ ∥ } and k_{ ⊥ } are the longitudinal and transverse components of k, with respect to the line of sight, while θ and θ′ are the 2D transverse angular vectors.
For a redshift binning that is not too small, , longitudinal wavenumbers above 1/(Δχ) are suppressed by integrating χ′ along the line of sight, and the integral is dominated by wavenumbers with k ≃ k_{ ⊥ } and . Thus, using the Fourier form of Limber’s approximation (Limber 1953), which is widely used in weaklensing studies (Kaiser 1992; Munshi et al. 2008), the integration over χ′ yields a Dirac term (2π)δ_{D}(k_{ ∥ }), and the integration over k_{ ∥ } gives (10)with (11)which does not depend on the size of the redshift bin j (because we have taken the limit of a very large redshift bin). Introducing the 2D Fourierspace circular window^{4}, (12)we obtain (13)where we defined the 3D power per logarithmic wavenumber, Δ^{2}(k,z), by (14)We will evaluate the accuracy of this approximation, based on the flatsky and Limber’s approximation, in Sect. 3.2.3.
2.2. Numerical simulations
Our analytical formalism allows us to consider a broad range of scales and halo masses, from small angular windows up to fullsky surveys, and to compare the relative contributions to covariance matrices that arise from shotnoise and cosmic variance effects, and from loworder and highorder largescale correlations. Numerical simulations do not easily allow such a detailed analysis; however, in order to validate our approach, we must check whether it agrees with estimates from simulations, wherever a comparison is possible.
We use the highresolution fullsky Horizon simulation (Teyssier et al. 2009), based on the WMAP3 cosmology (Spergel et al. 2007). This is a 68.7 billion particle Nbody simulation, featuring more than 140 billion cells in the AMR grid of the RAMSES code (Teyssier 2002). The simulation consists in a lightcone spanning the entire sky up to redshift ~1, with a mass resolution of 1.1 × 10^{10} M_{⊙}.
Halos in the (2 Gpc)^{3}Nbody simulation are found with the HOP algorithm (Eisenstein & Hut 1998). Their comoving positions are then converted into sky coordinates, taking their radial velocity into account when calculating redshifts. The physical effects of the baryons are neglected and the total mass is given by the number of particles inside the halo. We only consider halos at redshifts z ≤ 0.8, up to which the simulation is complete towards all directions.
We design simulated surveys by extracting rectangular fields in angular coordinates. To minimize the effect of intrinsic sample correlations, we impose a 10 (resp. 20) deg gap between consecutive fields when computing number counts (resp. correlation functions), which yields 138 (resp. 34) nonoverlapping fields that can be cut out in the simulation.
For the purpose of clustering analysis, auxiliary random fields are constructed by shuffling the angular coordinates of halos in the data fields, thus preserving the halo mass and redshift distributions. To gain in computational efficiency, the number of halos per random field is ten times the average number of halos in data fields. The Landy & Szalay estimator is then scaled accordingly to the ratios of pair numbers in data and random fields.
The mean and covariance of all quantities of interest are estimated by sample averaging over the extracted surveys. Because of the uniqueness of the simulation, a residual noise is expected whenever the area of individual fields becomes large and their number diminishes. Thus, there are about 41253 deg^{2}/(ΔΩ) nonoverlapping fields of area (ΔΩ). (For instance, we cut 41 fields of 400 deg^{2} for the analysis of the angular correlation function.)
3. Number density of halos
3.1. Mean number counts in redshift bins
We consider a population of objects defined by some property, such as their mass M, with a mean comoving number density per logarithmic interval of M written as dn/dlnM. Then, the mean number of objects in the redshift interval [z,z + dz] , within the solid angle dΩ on the sky, with a mass in the range [M,M + dM] , reads as (15)where dV/dzdΩ is the cosmological volume factor, which is given by (16)and χ(z) and are the comoving radial and angular distances. In Eq. (15) and in the following we define N as the number density of objects per unit area on the sky, instead of the total number of objects within a given window (ΔΩ). This choice is more convenient for practical purposes because it allows a simpler comparison between different surveys that have different total areas.
We can split the interval of mass^{5} over several bins “α”, [M_{α, − },M_{α, + }] and the observational cone over nonoverlapping redshift intervals “i”, [z_{i, − },z_{i, + }] with z_{i, + } ≤ z_{i + 1, − }. In practice, one usually takes z_{i, + } = z_{i + 1, − } so as to cover a continuous range of redshifts. Then, the number of objects per unit area, in the bin (i,α), reads as (17)where is the observed density of objects. Here and in the following, we note observed quantities by a hat (i.e. in one realization of the sky) to distinguish them from mean quantities, such as the comoving number density of Eq. (15), that correspond to expectation values over many realizations. In practice, assuming ergodicity, these expectation averages are assumed to be identical to volume averages (in the case of statistically homogeneous and isotropic cosmologies).
To simplify notations we define the mean cumulative number density of objects observed at a given redshift, within the mass bin α (with boundaries that may depend on z), (18)and we omit the explicit boundaries on mass in the following. Then, the mean number of objects per unit area in the redshift and mass bins (i,α) reads as (19)
Fig. 3 The mean number density of dark matter halos per square degree, within redshift bins of width Δz = 0.1. We count all halos above the thresholds M_{∗} = 2 × 10^{13},10^{14}, and 5 × 10^{14}h^{1} M_{⊙}, from top down to bottom. We compare our analytical results (solid lines) with numerical simulations (dashed lines). 
We plot the mean number counts of Eq. (19) in Fig. 3, per square degree, for redshift bins of width Δz = 0.1. Here we select all dark matter halos above a mass threshold M_{∗}, with M_{∗} = 2 × 10^{13},10^{14}, and 5 × 10^{14} h^{1} M_{⊙}. The error bars are the 3 − σ statistical errors obtained from the covariance matrices derived in Sect. 3.2.1, for 138 fields of 50 deg^{2} as used in the simulations. We can check that our estimates agree reasonably well with the numerical results. The small discrepancies are probably due to the complex relation between theoretical halo masses and the actual halos found in the simulation box. In particular, it is well known that using different algorithms, such as sphericaloverdensity, HOP or friendsoffriends algorithms, can lead to slightly different results (e.g. Eisenstein & Hut 1998; Tinker et al. 2008). However, this point is beyond the scope of this paper, as we only wish here to check that our analytical results provide reasonable estimates of the mean and covariance of halo number counts and correlations.
3.2. Covariance of number counts
As usual we define the covariance C_{i,α;j,β} of the statistical quantities and by (20)As recalled in Appendix A, following Peebles (1980), it can be decomposed over “shotnoise” and “samplevariance” contributions, (21)which write from Eq. (A.10) as (22)(for nonoverlapping mass binning), and (23)Here we denote and as the integrals over the redshift and mass bins i and α. The superscript “h” refers to the “halo” correlation function, which depends on the two redshifts, angular directions, and masses (or temperatures, etc.), ξ^{h} = ξ^{h}(M,x;M′,x′), with .
The first term is the shotnoise contribution and vanishes for nonoverlapping bins. As expected it decreases with the survey size as 1/(ΔΩ). (We recall that is the angular number density.) The second term is due to the “samplevariance” crosscorrelation ξ^{h} between distant objects (Hu & Kravtsov 2003). Using the approximation (1), that is, the factorization of the dependence on mass and distance of ξ^{h}, we define the mean bias at redshift z, for the mass bin α, through (24)where was defined in Eq. (18), so that Eq. (23) also writes as (25)
3.2.1. Small angular windows
Using the approximation that the correlation function is negligible on cosmological scales, whence , and , we obtain (26)where was defined in Eq. (8). It is a “conical” average, over objects “i” and “j” that are located at unrelated positions (χ,Ω) and (χ′,Ω′). To recall this “conical” integration along χ′ we have added the subscript “con”. This helps distinguishing quantities such as (8) from other averages of ξ over 3D spherical shells, which we encounter in Sect. 4 below. To further distinguish from the quantities encountered in Sects. 4 and 5, we put the label j, which refers to a redshift bin, as a superscript, whereas the labels i or j of the radial or angular bins studied in Sects. 4 and 5 appear as indices. (The parenthesis refer to the fact that in Limber’s approximation of wide redshift bins the dependence on the boundaries of the bin j disappears, because they are pushed to infinity, see Eq. (13).)
In the case where the nonoverlapping redshift bins are large enough to neglect the crosscorrelation between different bins (we evaluate the accuracy of this approximation in Sect. 3.2.4), the integral (8) gives rise to a Kronecker factor δ_{i,j}. Moreover, for small angular windows and large enough redshift bins we can use the flatsky approximation (9), where the “conical” average is approximated by a “cylindrical” average (and spherical harmonics are replaced by plane waves), and Limber’s approximations (10), where longitudinal wavenumbers are neglected over transverse wavenumbers. Substituting into Eq. (26) we recover the results of Hu & Kravtsov (2003), (27)where was defined in Eq. (13). (We evaluate the accuracy of the approximation (27) in Sect. 3.2.3.) Thus, while the shotnoise contribution (22) to the covariance matrix is diagonal, the samplevariance contribution (27) is only blockdiagonal (for large redshift bins) since within the same redshift bin different mass bins are correlated.
Fig. 4 The variance σ_{Ni} of the halo angular number densities of Fig. 3, for redshift bins Δz = 0.1 and an angular window of 50 deg^{2}. We compare our analytical results (solid lines) with numerical simulations (dashed lines). 
In Fig. 4 we compare with the numerical simulations our results for the variance σ_{Ni} of the halo angular number densities, where σ_{Ni} includes both the shotnoise and samplevariance contributions (21), (28)We consider an angular window of 50 deg^{2}, which corresponds for instance to the case of the XXL survey (Pierre et al. 2011). As for the mean densities of Fig. 3, we obtain a reasonable agreement with the simulations, and we correctly reproduce the dependence on halo mass and redshift.
Fig. 5 The shotnoise (dashed lines) and samplevariance (solid lines) errors for the angular number densities shown in Fig. 3, associated with a redshift binning of width Δz = 0.1, but up to z = 2, and an angular window of 50 deg^{2}. 
Taking advantage of our analytical model, we compare in Fig. 5 the shotnoise and samplevariance contributions to the total error that was displayed in Fig. 4, but going up to redshift z = 2. Here we define (29)As expected, we can see that the error of observed number counts is dominated by the shotnoise contribution for rare halos (high mass or high redshift), where effects associated with the discreteness of the halo distribution are very important.
Signaltonoise ratio
Fig. 6 The signaltonoise ratios of number counts for an angular area ΔΩ = 50 deg^{2}, as in Figs. 3 and 4. We compare our analytical results (solid lines) with numerical simulations (dashed lines). 
From the angular number density and its variance σ_{Ni} we define the signaltonoise ratio as (30)which we compute from Eqs. (19) and (28). Thus, combining Figs. 3 and 4, we display in Fig. 6 this signaltonoise ratio. In agreement with these previous figures, we obtain a good match to the numerical simulations. Thus, the analytical results are competitive with the numerical simulations since they appear to be no less reliable and much faster to compute.
Scalings with survey area and number of subfields
For practical purposes it is interesting to compare the signaltonoise ratios associated with a different number of subfields at fixed total area ΔΩ, since this can help for choosing the best observational strategy, whether one should perform a single widefield survey or several smaller scale surveys.
Fig. 7 The signaltonoise ratios of number counts for a total angular area ΔΩ = 50 deg^{2}, divided over independent subfields. We show the results obtained for the numbers of subfields (solid lines), 2 (dashed lines), and 4 (dotted lines). 
Therefore, let us consider a survey with a total angular window of area ΔΩ that can be split over angular subfields, which we assume to be independent and to have equal area . For instance, the survey may be made of smaller regions that are well separated on the sky. Then, the total angular number density of objects in the redshift bin [z_{i, − },z_{i, + }] , summed over the smaller subfields of index α with angular number densities , writes as (31)Since all subfields are independent and have the same depth we have, for any α, (32)which depends neither on (ΔΩ) nor . Without mass binning the covariance matrix remains diagonal, see Eq. (27), with a shotnoise contribution (33)while the samplevariance contribution is (34)where is the typical value of the integral (13) within the small angular window . Then, the signaltonoise ratio scales as (35)In the regime where the covariance is dominated by the shotnoise contribution, the signaltonoise ratio grows as the square root of the total area and does not depend on the number of subfields, .
To estimate the scaling in the regime where the covariance is dominated by the samplevariance contribution, we assume that on the relevant scale , that is, , the power spectrum behaves as P(k) ~ k^{n}, with − 2 < n < 1. (Wavenumbers where n < − 2 for CDM cosmologies would correspond to very small angular windows.) Then, from Eq. (13) we obtain , and the signaltonoise ratio scales as . Thus, in this regime the signaltonoise ratio still grows with the total survey area, but there is a weak dependence on the number of subfields, which may either increase or decrease with depending on the sign of n.
For illustration, we show in Fig. 7 the signaltonoise ratios of the number counts obtained for a total angular area ΔΩ = 50 deg^{2}, divided over subfields with , 2, and 4. The case of a single field, , corresponds to Figs. 3 and 4. The curves in Fig. 7 are exact estimates of the signaltonoise ratios, obtained from and not from the approximate scaling given in Eq. (35).
In agreement with the discussion above and with Fig. 4, we can check that at high redshift or for high mass, the signaltonoise ratio does not depend on , since the error is dominated by the shotnoise contribution, which only depends on the total area as seen in Eqs. (33) and (35).
At low redshift and low mass, where the error is dominated by the samplevariance contribution, the signaltonoise ratio increases slightly with . This can be understood from the fact that the local slope n of the power spectrum is slightly negative on the scales where the halo correlation is significant. For instance, within the ΛCDM cosmology that we consider in this paper, a window of area 50 deg^{2} corresponds at z = 1 to a radius Mpc, and the local slope n(k) of the linear power spectrum, at wavenumber k ≃ 2π/(164 h^{1} Mpc), is n ≃ − 0.6. This means that for the number counts at low redshift and low masses, it is slightly advantageous to choose a survey divided over several independent subfields.
As shown by Figs. I.1 and I.2 in Appendix I, these scalings are approximately satisfied by the results obtained from numerical simulations, for a wide variety of survey area and of number of subfields. Therefore, the scalings derived from Eq. (35) allow a reasonable estimate of the dependence of the signaltonoise ratio of number counts with (ΔΩ) and .
3.2.2. Large angular windows
Expression (27) relies on the approximation of small angular windows for the samplevariance contribution. This allowed us to use the flatsky approximation (9), where the observational cone over the redshift bin [z_{i, − },z_{i, + }] is approximated as a cylinder of radius θ_{s} around the central line of sight. For large angular windows this approximation is no longer valid and we must decompose over spherical harmonics (Hu & Kravtsov 2003), rather than over the plane waves of Eq. (9).
Rather than using the Eqs. (25) or (26), with a new expression for that would remain valid for large angles, it is more convenient to go back to the angular number densities , as in Eq. (15). Thus, for any redshift bin i we expand the observed distribution on the sky over the spherical harmonics, (36)and we define the angular power spectrum as (37)where we only take the samplevariance contribution. To simplify notations we do not include mass binning, but this can be added without difficulty, as in Sect. 3.2.1. Then, writing the twopoint correlation function again under the factored form (1), introducing the Fourierspace power spectrum as in Eq. (3) and expanding the planewave exponential factor over spherical harmonics, a standard calculation gives (Hu 2000; Hu & Kravtsov 2003) (38)where j_{ℓ} is the spherical Bessel function of order ℓ. Here we assumed for simplicity a flat background, which is sufficient for practical purposes, and we approximated the Fourierspace power spectrum by the linear power spectrum, P(k,z) ≃ D_{ + }(z)^{2}P_{L0}(k), where D_{ + }(z) is the linear growth rate (normalized to unity at z = 0).
Fig. 8 The angular power spectrum of the distribution of halos in the redshift bin 0.95 < z < 1.05. We plot both the exact result (38) (solid line) and Limber’s approximation (39) (dotted line). 
Limber’s approximation can be recovered in the limit of large ℓ, for slowly varying kdependent prefactors, by using the property ^{∫}dk k^{2}j_{ℓ}(kχ)j_{ℓ}(kχ′) = π/(2χ^{2})δ_{D}(χ − χ′), and the correspondence k ↔ (ℓ + 1/2)/χ (Hu & Kravtsov 2003; LoVerde & Afshordi 2008). This yields (39)Nevertheless, because the structures of Eqs. (38) and (26) are quite different (the order of the integrations over redshift and wavenumber is exchanged, and the largeangle expression keeps two integrations over redshift, while in the smallangle expression one integral over redshift has already been performed) it is more convenient to treat the smallangle and largeangle derivations separately.
We plot in Fig. 8 the angular power spectrum C_{i,i;ℓ} obtained for halos above three mass thresholds in the redshift bin 0.95 < z < 1.05. We can see that Limber’s approximation (39) significantly underestimates the power at low ℓ, while it slightly overestimates the power at high multipoles, ℓ > 20. It is already rather good at ℓ ~ 20, and becomes increasingly accurate at higher ℓ, although the difference remains on the order of 10% until ℓ ~ 80. These results agree with previous studies of the Limber approximation (LoVerde & Afshordi 2008; Crocce et al. 2011). As noticed in LoVerde & Afshordi (2008), the latter can be extended as a series expansion over (ℓ + 1/2)^{1}, but higher orders behave increasingly badly at low ℓ. Therefore, we do not investigate further this approach here, since using the exact expression (38) is not more difficult (but slower) to compute and ensures a smooth behavior over all ℓ, while the usual Limber approximation (39) is sufficient for our purposes on small scales.
Next, the mean angular number densities of Eq. (17), smoothed over the angular window of radius θ_{s} and filter W_{2}(Ω), read as (40)where W_{2}(Ω) = 1/(ΔΩ) within the angular window and vanishes outside (but we can choose more general filters). Then, the sample variance of these number counts writes as (Hu & Kravtsov 2003) (41)where C_{i,j;ℓ} is the angular power spectrum (38), while are the angular multipoles of the window W_{2}, (42)For a tophat window that is symmetric around the azimuthal axis, we have for ℓ ≥ 1, and (45)where P_{ℓ} are the Legendre polynomials, and for m ≠ 0.
Fig. 9 The shotnoise (dashed line) and samplevariance errors (29) for the angular number densities in the redshift bin 0.95 < z < 1.05, as a function of the radius θ_{s} of the angular window. The solid line is the exact sample variance, from Eqs. (38) and (41), while the dotted line is the result (27), which was used in Fig. 5 and involves both the flatsky and Limber’s approximations. 
In the limit of large ℓ, (Hu 2000), and we obtain , with k = (ℓ + 1/2)/χ and the 2D Fourierspace window (12). This shows that for small angles, where the covariance is dominated by large ℓ, the expression (41) goes to the flatsky approximation (27), using the fact that Limber’s approximation (39) also applies in this limit (see Fig. 8).
We plot in Fig. 9 the shotnoise and samplevariance errors , as in Fig. 4 but as a function of the angular radius θ_{s}, for the angular number densities in the redshift bin 0.95 < z < 1.05. In agreement with Fig. 8, we can check that the combination (27) of the flatsky & Limber’s approximations provides a good approximation to the exact result (41) on small angles, typically θ_{s} < 10 deg, while it underestimates the sample variance on large angles. In agreement with Fig. 5, the shotnoise contribution is dominant for massive and rare halos, and subdominant for small and numerous halos. In our case, the transition between the shotnoise and samplevariance dominated regimes takes place at M_{∗} ~ 10^{14} h^{1} M_{⊙}.
3.2.3. Accuracy of the “flatsky + Limber” approximation
Fig. 10 The ratio of the exact samplevariance error (41) to the approximation (27), which uses both the flatsky and Limber’s approximations. We show this ratio as a function of the radius θ_{s} of the angular window, for several redshift bins, for halos above the mass threshold M > 10^{14} h^{1} M_{⊙}. Higher z corresponds to a higher ratio. 
We plot in Fig. 10 the ratio of the exact samplevariance error (41) to the approximation (27), which used both the flatsky and Limber’s approximations. As in Figs. 8 and 9, we can check that the approximation (27) is reliable for small angular windows but significantly underestimate the samplevariance error for wide angles, above 10 deg. In the extreme case of fullsky surveys (θ_{s} = 180 deg), it can underestimate the samplevariance error by a factor from 2 to 5. The effect is actually greater for higher redshift bins. This may seem somewhat surprising since the “flatsky” approximation (9) is expected to be more accurate at higher redshifts, where large angles θ correspond to large distances that are weakly correlated and should not significantly contribute (i.e., the CDM power spectrum itself yields more weight to pairs separated by small angles). However, Limber’s approximation (10) goes in the opposite direction because it relies on the assumption (i.e., longitudinal wavenumbers are more suppressed by the integration along the line of sight than transverse wavenumbers, which are only integrated over the smaller angular distance ). For instance, for an Einsteinde Sitter universe, where , this constraint on the angle θ_{s} and the redshift bin width Δz writes as θ_{s} ≪ 29(Δz) [(1 + z)^{3/2} − (1 + z)] ^{1} deg. At fixed Δz this upper bound on θ_{s} becomes stronger at higher z. Therefore, Fig. 10 shows that this second effect, associated with Limber’s approximation, dominates over the first effect, associated with the flatsky approximation.
We checked that we obtain very close results for other mass thresholds (not shown in the figure). For instance, the curves obtained for the mass threshold 2 × 10^{13} and 5 × 10^{14} h^{1} M_{⊙} cannot be distinguished from those plotted in Fig. 10. This is not surprising since within our bias model (1) the halo correlations are governed by the same matter density correlation function ξ(r;z).
Figure 10 shows that the smallangle approximation (27) that we used in Sect. 3.2.1 was legitimate, since we considered angular windows of 50 deg^{2} or less (i.e. θ_{s} ≲ 4 deg).
3.2.4. Correlation between different redshift bins
Fig. 11 The correlation matrix of Eq. (46) between redshift bins of width Δz = 0.1. We show as a function of j, for four values of i. In each case, at j = i. We consider halos above 10^{14} h^{1} M_{⊙} and an angular window of area (ΔΩ) = 50 deg^{2}. 
We can use the expression (41) to compute the correlation between different redshift bins i and j. Thus, we show in Fig. 11 the correlation matrix (also called normalized covariance matrix) defined as (46)where we only consider the samplevariance contribution. (The shotnoise contribution (22) is always diagonal for nonoverlapping redshift bins.) Thus, is unity along the diagonal and elements { i,j } where is much smaller than one are weakly correlated. We can check that the decay is always rather fast and correlations between neighboring redshift bins, j = i ± 1, are already below 10%. This shows that it is appropriate to neglect crosscorrelations between redshift bins of width Δz = 0.1, as we did in Sect. 3.2.1. We also checked that we obtain almost identical results for other angular windows, such as (ΔΩ) = 400 deg^{2}. (For very large or fullsky surveys we do not need the approximation of uncorrelated redshift bins since we use Eq. (41).)
4. Realspace twopoint correlation function
In the previous section we have studied the covariance of the estimators , which measure the redshift distribution dn/dz of the population of interest (galaxies, clusters, etc.), over a set of finite redshift bins. This corresponds to onepoint statistics. We now study estimators of the realspace twopoint correlation function ξ(x_{12};M_{1},M_{2};z) of these objects, which corresponds to twopoint statistics, as a function of the comoving distance x_{12}. In this article we do not investigate redshiftspace distortions, which we leave for future works, and we assume that a realspace map of the population under study is available or that redshift distortions can be neglected.
Estimators of 3D correlation functions, or power spectra, have already been studied in many works, mostly in view of their application to galaxy surveys. However, since we have in mind the application to cluster surveys and, more generally, to deep surveys of rare objects, we consider 3D correlation functions averaged over a wide redshift bin (in order to accumulate a large enough number of objects), rather than the usual local 3D correlation functions at a given redshift. This means that the quantities that we consider in this section, while being truly 3D correlations and not 2D angular correlations, nevertheless involve integrations along the line of sight or, more precisely, the observational cone, within a finite redshift interval. This is also why 3D Fourierspace power spectra may not be the most convenient tool for our purposes, since we do not have homogeneous and isotropic distributions since the radial direction plays a special role.
4.1. Mean correlation
4.1.1. Peebles & Hauser estimator
Following Peebles & Hauser (1974), a simple estimator for the twopoint correlation function of a point distribution is given by (47)where D represents the data field and R an independent Poisson distribution, both with the same mean density. More precisely, the estimator introduced in (47) for the mean correlation over the radial bin [R_{i, − },R_{i, + }] corresponds to counting all pairs “DD” in the data field that fall in this pairseparation bin i and all pairs “RR” in the auxiliary Poisson field that fall in the same bin, and to taking the ratio of these two counts.
Before appropriate rescaling, the mean number density of the actual Poisson process R is taken as much higher than the observed one, so that the contribution from fluctuations of the denominator RR to the noise of can be ignored. The advantage of form (47) is that one automatically includes the geometry of the survey (including boundary effects, cuts, etc.), because the auxiliary field R is drawn on the same geometry.
In our case, we write the analog of Eq. (47) for the mean correlation on scales delimited by R_{i, − } and R_{i, + }, integrated over some redshift range and mass intervals, as (48)with (49)Here we denoted as the integral over the 3D spherical shell of radii R_{i, − } < R_{i, + }, and and are the integrals over the mass bins α and β.
The redshift interval Δz is not necessarily small, and to increase the statistics we can choose the whole redshift range of the survey, such as [0,z_{s}] . If we bin the survey over smaller nonoverlapping redshift intervals, which are large enough to neglect crosscorrelations between different bins (see for instance Fig. 11), we can independently study each redshift bin. For simplicity we do not explicitly write the redshift boundaries.
As in Eq. (47), the counting method that underlies Eq. (48) can be understood as follows (Peebles & Hauser 1974). We span all objects in the “volume” (z,Ω,lnM), and count all neighbors at distance r′, within the shell [R_{i, − },R_{i, + }] , with a mass M′. We denote with unprimed letters the quantities associated with the first object, (z,Ω,lnM), and with primed letters the quantities associated with the neighbor of mass M′ at distance r′. Thus, with obvious notations, and are the observed number densities at the first and second (neighboring) points. The difference between the quantities and Q is that in the latter case we use the mean number densities dn/dlnM and dn/dlnM′. Therefore, Q is not a random quantity so it shows no noise. In practice, the mean number densities dn/dlnM may actually be measured from the same survey, as described in Sect. 3. However, since these measures do not involve a distance binning over r′, there are many more objects in a redshift bin than within a small interval [R_{i, − },R_{i, + }] . Then, the onepoint quantities are measured with much greater accuracy than , so that we can indeed neglect their contribution to the noise of the estimator . In terms of Eq. (47) this corresponds to neglecting fluctuations of “RR”. (This is achieved in practice by choosing a much higher density for the field R, which is later rescaled.)
In Eq. (48) we used a simple average over the shell [R_{i, − },R_{i, + }] , because we count all pairs with a uniform weight in r′space. Through the change to spherical coordinates dr′ = dr′r^{′2}dΩ′, this yields a geometrical weight r^{′2} in terms of the radial distance r′. An alternative would be to add a weight r^{′ − 2}, instead of the simple 3D tophat written in Eq. (48), to eventually obtain a uniform weight over the radial distance r′. For simplicity we only consider choice (48) in the following, but such alternative weights could be used with straightforward modifications in the expressions given below.
Thus, we focus on the behavior of the twopoint correlation as a function of distance r′, measured through the binning over the intervals [R_{i, − },R_{i, + }] . We assume that different bins do not overlap, R_{i, + } ≤ R_{i + 1, − }, and in practice one usually has R_{i, + } = R_{i + 1, − }, to cover a continuous range of scales. On the other hand, these intervals may depend on redshift, as long as R_{i, + }(z) ≤ R_{i + 1, − }(z) at each redshift.
Using Eq. (18), Eq. (49) also writes as (50)where the volume of the ishell is (51)which may depend on z. In practice, one would usually choose constant comoving shells, so that does not depend on z. To obtain Eq. (50) we used that dn/dlnM and dn/dlnM′ have no scale dependence (because they correspond to a uniform distribution of objects) and we neglected edge effects. (These finitesize effects are discussed and evaluated in Appendix B.)
Because of the finite distance r′ between the two objects M and M′ in Eq. (48), there is no shotnoise contribution to the average of the quadratic term . Within the framework presented for Eqs. (A.1), (A.2), the integration in Eq. (48) does not contain common small (infinitesimal) cells, because of the finitesize distance r′ > R_{i, − }. Therefore, the average of the statistical estimator (48) reads as (52)where ξ^{h}(r′;M,M′;z) is the twopoint correlation function of the objects, as in Eq. (A.9). Here we denoted ^{∫}_{i}dr′ as the integral over the 3D spherical shell i. Comparing with Eq. (50) we clearly see that is an unbiased estimator of the twopoint correlation function ξ^{h}, averaged over the shell [R_{i, − },R_{i, + }] (with a geometrical weight r^{′2}), whence the name “”.
As in Eq. (50), in Eq. (52) and in the following we neglect finitesize effects, which arise because the integration over r′ should be restricted to the observational cone of the survey. This leads to a smaller available volume than the spherical shell [R_{i, − },R_{i, + }] close to the survey boundaries. This does not affect the mean value of the estimator , because this effect cancels out between the numerator of Eq. (52) and the denominator Q_{i}. However, it will have a small effect on our estimate of the covariance matrix. As described in Appendix B, at z = 1 for a circular survey area ΔΩ = 50 deg^{2}, and for a radial bin at r = 30 h^{1} Mpc, by geometrical counting we overestimate the number of pairs by 10% and the signaltonoise ratio by 5%.
As in Sect. 3.2, in order to make progress we assume that the twopoint correlation can be factored in as in Eq. (1), so that Eq. (52) reads as (53)with (54)We have introduced the superscript “(r)” to recall that Eq. (54) is the radial average of ξ, over the 3D spherical shell associated with the radial bin i, to distinguish it from the angular averages that we encounter in Sect. 5 below. The prime in the subscript “i′” also recalls that we integrate over a neighboring point r′, with respect to a given point of the observational cone, to distinguish it from the integration over an unrelated point within the observational cone as in the “cylindrical” average (8). We give in Eq. (C.3) in Appendix C the Fourierspace expression of , which is more convenient for numerical computations.
Fig. 12 The mean halo correlation, , over ten comoving distance bins within 5 < r < 100 h^{1} Mpc, equally spaced in log (r). We integrate over halos within the redshift interval 0 < z < 0.8 and we compare our analytical results (solid lines) with numerical simulations (dashed lines). 
4.1.2. Landy & Szalay estimator
As shown in Landy & Szalay (1993), a better estimator than (47) is given by (55)which involves the product DR between the data and the auxiliary field. Within our framework, where the mean quantity Q plays the role of R, this second estimator reads as (56)The difference between the terms associated with DD and DR is that in the former we have a product of two observed number densities, , while in the latter we have a crossproduct between the observed and the mean number densities, .
As checked in Appendix E, the mean of this second estimator is equal to the mean of the estimator studied in Sect. 4.1.1, (57)To simplify the notations, in the following we do not consider binning over mass (i.e., we independently consider the correlation functions of halos above some mass thresholds), so that Eq. (53) readily simplifies as (58)and a similar simplification holds for Q_{i}. If needed, it is not difficult to include a mass binning in the expressions given in the following.
4.1.3. Comparison with simulations
We compare in Fig. 12 the mean correlation (58) with results from numerical simulations (which use the Landy & Szalay estimator) for halos above the thresholds M > 2 × 10^{13} and 10^{14} h^{1} M_{⊙}, within the redshift range 0 < z < 0.8. The error bars are the 3 − σ statistical errors obtained from the covariance matrices derived in Sect. 4.2.2 for 34 fields of 50 deg^{2} as used in the simulations. We obtain reasonable agreement with the simulations, although we appear to underestimate the halo correlation of the most massive halos at small radius, r < 7 h^{1} Mpc. This may be due to a scaledependent halo bias or to a small discrepancy in the definition of the halo mass, which depends on the halofinder algorithm (Knebe et al. 2011).
4.2. Covariance matrices for the halo correlation
We now consider the covariance of the estimators and . As described in Appendix D, the covariance of the Peebles & Hauser estimator is given by (59)with (see also Landy & Szalay 1993 for a computation of loworder terms) where ξ^{h}, ζ^{h}, and η^{h}, are the twopoint, threepoint, and fourpoint correlation functions of the objects. To make the expressions compact but easy to understand, we introduced the following notation in Eqs. (60)–(62). Variables associated with the object at the center of the shell are noted by the label i (e.g., χ_{i},M_{i},...) and those associated with the object within the shell are noted by the label i′ (e.g., r_{i′},M_{i′},...). This corresponds to the primed and unprimed variables in Eqs. (48) and (52), and we may speak of objects i,i′,j, and j′. Then, in the indices of the correlation functions, we separate with a semicolon, as in of Eq. (62), objects i and j that are located at unrelated positions (χ_{i},Ω_{i}) and (χ_{j},Ω_{j}) in the observational cone, whereas we separate with a comma, as in of Eq. (60), objects that are located at a fixed distance r′. (More precisely, the distance r′ is restricted to a radial bin .)
The label C^{(n)} refers to quantities that involve n distinct objects. Thus, the contributions C^{(2)} and C^{(3)} arise from shotnoise effects (as is apparent through the prefactors 1/(ΔΩ)), associated with the discreteness of the number density distribution, and they would vanish for continuous distributions. However, they also involve the twopoint and threepoint correlations, and as such they couple discreteness effects with the underlying largescale correlations of the population. In case of zero largescale correlations, they remain nonzero because of the unit factors in the brackets and become purely shotnoise contributions, arising solely from discreteness effects.
More precisely, contribution (60) arises from the coupled identification i = j and i′ = j′ (or i = j′ and i′ = j), whereas contribution (61) arises from the single identification i = j (or either one of i = j′, i′ = j, i′ = j′). Thus, in Eq. (61) the object i is at the center of both shells and .
Contribution C^{(4)} is a pure samplevariance contribution and does not depend on the discreteness of the number density distribution (hence there is no 1/(ΔΩ) prefactor).
As shown in Appendix E, the covariance matrix of the Landy & Szalay estimator reads as (see also Szapudi 2001; Bernardeau et al. 2002) (63)where the first term is equal to Eq. (60), (64)and By comparison with Eqs. (61)–(62) we can see that many terms have been canceled (Landy & Szalay 1993; Szapudi & Szalay 1998). This confirms that the estimator (56) is more efficient than (48), since its covariance will be smaller.
4.2.1. Loworder terms
In this section we assume that the radial bins [R_{i, − },R_{i, + }] are restricted to large enough scales to neglect three and fourpoint correlation functions, as well as products such as . We compute these highorder terms in Sect. 4.2.2 and Figs. 15 and 16 show the range where they can be neglected. Along the diagonal, for halos above 10^{14}h^{1}M_{⊙} this corresponds to the full range 5 < r < 100 h^{1} Mpc. For lower mass halos, M > 2 × 10^{13}h^{1}M_{⊙}, all scales receive significant contributions from highorder terms, but the loworder terms contribute to about 50% for r < 15 h^{1} Mpc. This is sufficient for our purposes in this section, which are to compare the Peebles & Hauser and the Landy & Szalay estimators, the shotnoise and samplevariance effects, and the scalings with survey area and number of subfields. Accurate computation of the covariance matrix requires taking all terms into account, which we do in Sect. 4.2.2.
Thus, in this section we only keep the contributions that are constant or linear over the twopoint correlation function ξ^{h} of the objects, and we again assume that the twopoint correlation function can be factored as in Eq. (1). Then, as shown in Appendix D, for the Peebles & Hauser estimator we obtain from Eqs. (60)–(62), at this order, (67)where we introduced (68)Following the notation explained earlier, below Eq. (62), the comma and the primes in mean that this is a “spherical average”, more precisely the average over the two spherical shells and , in contrast to in Eq. (8), which was a “conical” average within the observational cone. There are two indices, i′ and j′, because we integrate over the two shells and , whereas in of Eq. (54) there was only one index i′ because we integrated over a single shell . The Fourierspace expression of Eq. (68), which can be convenient for numerical computations, is given in Eq. (D.10) in Appendix D.
Again, to obtain Eq. (67) we neglected finitesize effects, that is, we did not take the fact into account that close to the boundaries of the survey part of the shell is not observed. As explained in Appendix B, this only leads to an overestimate of 5% of the signaltonoise ratio, for a radial bin of 30 h^{1} Mpc in a circular survey window of 50 deg^{2}. This error decreases for wider surveys or smaller radial bins.
For the Landy & Szalay estimator, at the same order the covariance matrix reads from Eqs. (64)–(66) as (69)Again, as compared with Eq. (67) several terms have been canceled. Moreover, at this order only shotnoise terms, whether coupled to largescale correlations or not, contribute to the Landy & Szalay covariance (69), as can be seen from the prefactors 1/(ΔΩ). In contrast, at the same order in the Peebles & Hauser covariance (67), we have two more shotnoise terms (coupled to the largescale correlations through the means and ) and one additional samplevarianceonly contribution (i.e., the last term, without the prefactor 1/(ΔΩ)).
Comparison of Peebles & Hauser and Landy & Szalay covariance matrices
Fig. 13 The covariance matrices (solid line) and C_{i,j} (dashed line) of the estimators and , for i = 4 associated with the distance bin 12.3 < r < 16.6 h^{1} Mpc, as a function of j. We show the results obtained for halos in the redshift range 0 < z < 0.8 with an angular window of 50 deg^{2}. Here we only consider the loworder terms given by Eqs. (67) and (69). 
We show in Fig. 13 one row of the covariance matrices C_{i,j} and , as a function of j at fixed i. We consider halos in the redshift range 0 < z < 0.8, for a window of 50 deg^{2}. The covariance is larger for the case of higher mass threshold. In agreement with Eqs. (67) and (69) and with standard results (Kerscher et al. 2000), the covariance of the Landy & Szalay estimator (55) is smaller than for the Peebles & Hauser estimator (47), especially for the lower mass threshold (the higher mass threshold case being more dominated by the common shotnoise contribution (64)).
As shown by Fig. 13, another advantage of the Landy & Szalay estimator is that its covariance matrix is much more diagonal than for the Peebles & Hauser estimator. This can be checked by comparing the left and middle panels of Fig. 19, where we show the correlation matrices ℛ_{i,j} defined as in Eq. (46), but where we include all shotnoise and samplevariance contributions of Eqs. (67) and (69).
Comparison of samplevariance and shotnoise effects
Fig. 14 The contributions C^{(2)} and C^{(3)} to the covariance of the Landy & Szalay estimator, along the diagonal i = j. As in Fig. 13, we only consider the loworder terms, given by Eq. (69). 
We compare in Fig. 14 the contributions C^{(2)} (first term in Eq. (69)) and C^{(3)} (second term in Eq. (69)), again keeping only these loworder terms. We consider the same survey properties as in Fig. 13 but plot these contributions along the diagonal, i = j. Let us recall that both contributions C^{(2)} and C^{(3)} are shotnoise contributions (i.e., they arise from the discreteness of the halo distribution). However, they also involve the underlying largescale correlations, as apparent through the factors ξ. In particular, C^{(3)}, which arises from a single pair identification, vanishes if there are no largescale correlations, whereas C^{(2)}, which arises from two pair identifications, remains nonzero if ξ = 0 (the term associated with the factor 1 is thus a “pure shotnoise” contribution). Therefore, by comparing C^{(2)} and C^{(3)} we can assess the relative importance of shotnoise and samplevariance effects, C^{(2)} involving an extra degree of shot noise (one more pair identification). As expected, C^{(2)} is dominant for small distance bins, which correspond to small volumes, , and contain few halos. It also remains dominant up to larger scales in the case of more massive halos, which are rarer. Since the contribution C^{(2)} is diagonal, as shown by the Kronecker prefactor in Eq. (67), it implies that covariance matrices are more strongly diagonal for highmass halos, as can be checked in Fig. 19 where we show the correlation matrices ℛ_{i,j} of small (upper row) and large (lower row) halos.
Scalings with survey area and number of subfields
As in Sect. 3.2.1, we consider the dependence of the signaltonoise ratio on the total survey area ΔΩ and on the number of subfields. Thus, we define the estimator as the mean of the estimators of Eq. (56) of the subfields, (70)Of course, the expectation value is independent of (ΔΩ) and , (71)From Eq. (50) we can check that does not depend on (ΔΩ) nor , so that for each subfield α, of area , the covariance (69) scales as (72)Both terms in Eq. (69) scale in the same fashion, so that the structure of the covariance matrix does not change with (ΔΩ) nor (i.e. it does not become more or less diagonal), if we neglect boundary effects. Then, the covariance matrix of the averaged estimator (70) scales as (73)so that the signaltonoise ratio scales as (74)Therefore, a single widefield survey and a combination of several independent smaller surveys, with the same total area, show the same efficiency. This is because both terms in Eq. (69) scale in the same way with the survey geometry, as 1/(ΔΩ), because the samplevariance effects involved in these mixed contributions arise from the correlation between objects separated by a distance r < R_{i, + } + R_{j, + }, independently of the angular size of the survey. This is different from the samplevariance contribution (27) to the covariance of the number counts, which explicitly depends on the largescale correlation over the survey angular size θ_{s}, see Eq. (13), because it arises from the correlation between objects located at any position in the survey cone. Of course, result (74) only applies to small length scales, , where it is legitimate to neglect finitesize effects. For long wavelengths a wider survey is clearly more efficient, and the only possible choice for scales that are close to the larger survey diameter.
4.2.2. Highorder terms for the covariance of
We now estimate the highorder terms for the covariance of the Landy & Szalay estimator that we neglected in Eq. (69), where we only kept terms of order zero or one over the twopoint correlation function. To evaluate the contributions associated with the factors in Eq. (65) and in Eq. (66), we use the model for the three and fourpoint halo correlation functions described in Sect. 2.1.2. Then, as shown in Appendix F, the contribution associated with the product in Eq. (66) is given by (75)the term of Eq. (65) yields (76)and the term of Eq. (66) gives (77)where the various factors are given in Appendix F, and we used for Eqs. (76)–(77) the “hierarchical clustering ansatz”, described in Figs. 1 and 2 and given by Eqs. (4) and (6).
The terms and are “pure samplevariance” contributions. Thus, there is no prefactor 1/(ΔΩ) and they involve largescale correlations among four halos, i,i′,j,j′. The term is a coupled shotnoise and samplevariance contribution, as shown by the prefactor 1/(ΔΩ) and the fact that it involves largescale correlations among three halos, i,i′,j′. (The discreteness of the halo distribution has led to the identification i = j, i.e. a shotnoise effect, which leaves three distinct halos.)
Fig. 15 The low and highorder contributions to the covariance matrix along its diagonal. We again consider halos in the redshift range 0 < z < 0.8, with an angular window of 50 deg^{2}, above two mass thresholds. 
Fig. 16 The low and highorder contributions to the covariance matrix , as in Fig. 15, but along one row. This corresponds to the fixed bin i = 4, associated with the distance bin 12.3 < r < 16.6 h^{1} Mpc, as a function of j. 
We compare in Figs. 15 and 16 the loworder contributions (69) with these highorder contributions (75)–(77). We can see that the latter can be nonnegligible on these scales, 5 < r < 100 h^{1} Mpc. Along the diagonal, i = j, shown in Fig. 15, they are always significantly smaller than the loworder contribution (which includes both samplevariance and shotnoise effects) for massive halos, M > 10^{14} h^{1} M_{⊙}, but are close to it or larger for M > 2 × 10^{13} h^{1} M_{⊙}. On large scales the main highorder contribution is the term (75), associated with a product ξξ, while the terms (76) and (77), associated with the three and fourpoint correlation functions, dominate on small scales. Indeed, the former does not increase much on small scales, whereas the latter are very sensitive to the smoothing scales R_{i} and R_{j} and show a steep growth on small scales, even though formally ζ is also of order ξξ within the model (4). This is because the term (75) involves the product of two correlations between two distinct lines of sight, as seen in Eq. (66), so that each ξ is averaged along the radial direction, while the term (76), which arises from one shotnoise contraction that has removed one lineofsight integration, involves the product of two correlations between a central point and two points at distances R_{i} and R_{j}, as seen in Eq. (65).
As seen in Fig. 16, at fixed i the relative importance of these highorder contributions to increases as the bin j shifts to smaller scales. Again, we can see that among these contributions the “ξξ” term (75) dominates on large scales and saturates on small scales, while the “ζ” and “η” terms (76) and (77) dominate on small scales and strongly depend on the smoothing scales.
That highorder contributions can become dominant as one of the bins i and j shifts to small scales agrees with expectations, as one probes deeper into the nonlinear regime where three and fourpoint correlation functions become important, and with some previous studies (Meiksin & White 1999; Scoccimarro et al. 1999). This implies that the covariance matrix is less diagonal once we take these contributions into account, and it decreases the number of effectively independent modes. This can be checked in Fig. 19, where we show the correlation matrices without (middle panels) and with (right panels) these highorder contributions, for the mass thresholds M > 2 × 10^{14} h^{1} M_{⊙} and M > 10^{14} h^{1} M_{⊙}. Therefore, for survey characteristics such as those of Figs. 15, 16, it is necessary to include highorder contributions to the covariance matrix of twopoint estimators for moderatemass halos that are not dominated by shotnoise effects.
4.2.3. Comparison with numerical simulations
Fig. 17 The covariance matrix of the Landy & Szalay estimator, along the diagonal i = j. We show our analytical results including all contributions (solid lines) or only loworder terms (dotted lines), and results from numerical simulations (dashed lines). 
Fig. 18 The covariance matrix , as in Fig. 17, but along one row. This corresponds to the fixed bin i = 4, associated with the distance bin 12.3 < r < 16.6 h^{1} Mpc, as a function of j. 
We display in Fig. 17 the covariance matrix of the Landy & Szalay estimator, along its diagonal. We show our results obtained when we include the highorder contributions of Sect. 4.2.2, see Eqs. (75)–(77), and when we only take the loworder terms of Eq. (69) into account. We obtain a good match to the numerical simulations, especially on the largest scales, which are also more reliable. In particular, we recover the strong dependence on radius and halo mass. We can see that, for moderatemass halos, M > 2 × 10^{13} h^{1} M_{⊙}, the highorder contributions are not negligible (because the loworder shotnoise contribution is relatively smaller).
We show the same covariance matrix along its fourth row in Fig. 18. The results from the numerical simulations are somewhat noisy, especially for the rare massive halos at low radii. However, where they are reliable they show reasonably good agreement with our analytical results. In agreement with Sect. 4.2.2, it is clear that, even more than along the diagonal, the highorder contributions of Eqs. (75)–(77) cannot be neglected in order to obtain a good estimate of the offdiagonal terms of the covariance matrix (see also Fig. 19).
4.2.4. Correlation matrices
Fig. 19 Contour plots for the correlation matrix ℛ_{i,j}, defined as in Eq. (46) but for the full covariance matrix C_{ij} of the halo correlation. There are ten distance bins, over 5 < r < 100 h^{1} Mpc, equally spaced in log (r), as in previous figures. We consider halos in the redshift range 0 < z < 0.8, within an angular window of 50 deg^{2}, above the mass thresholds M > 2 × 10^{13} h^{1} M_{⊙} in the upper row, and M > 10^{14} h^{1} M_{⊙} in the lower row. Left panels: loworder contributions (67) for the Peebles & Hauser estimator. Middle panels: loworder contributions (69) for the Landy & Szalay estimator. Right panels: full correlation matrix, including the highorder contributions of Eqs. (75)–(77), for the Landy & Szalay estimator. 
We show in Fig. 19 the correlation matrices, defined as in Eq. (46) but for the full covariance matrix C_{ij} of the halo correlation. (Although ℛ_{i,j} is a discrete 10 × 10 matrix, it is still possible to draw a contour plot by interpolation. This gives clear figures that are easier to read than a density plot where each cell is colored with a level of gray that depends on the entry ℛ_{i,j}.)
The left and middle panels of Fig. 19 clearly show the strong improvement associated with the use of the Landy & Szalay estimator in place of the Peebles & Hauser estimator. In agreement with Fig. 14 and the discussion in Sect. 4.2.1, the correlation matrix is more diagonal for massive halos, where the diagonal shotnoise contribution C^{(2)} of Eq. (60) is more important. Indeed, shotnoise effects become dominant for rare objects. For the same reason, highorder contributions to the covariance matrix, which are due to samplevariance effects, are more important for lowmass halos, as shown by the comparison between the middle and right panels. The slope of the contour lines in the right panels, especially in the lowmass case, shows that highorder terms are more important on small scales and also increase the correlation between small and large scales while making the matrix less diagonal.
Thus, for lowmass halos there are rather strong correlations between all scales in the range 5 < r < 100 h^{1} Mpc, and to obtain accurate estimates of error bars on cosmological parameters it is necessary to take offdiagonal entries and highorder contributions to the covariance matrix into account.
5. Angular correlation function
In the previous section we considered the realspace 3D correlation function, which requires knowledge of the radial position of the halos (or more generally of the objects of interest). If this information is not available (e.g., redshift estimates are too noisy or distance measures are highly contaminated by redshiftspace distortions), it is still possible to derive some constraints on cosmology from the angular distribution of the objects on the sky (Peebles 1980; Eisenstein & Zaldarriaga 2001; Maller et al. 2005). Therefore, in this section we apply the formalism developed in Sect. 4 to the angular twopoint correlation function w(θ).
5.1. Mean correlation
5.1.1. Peebles & Hauser estimator
As in Sect. 3.2.2, we write the observed number density of objects on the sky as , but we omit the index i of Eq. (40) since we consider a single redshift bin. As in Sect. 4, the width Δz is not necessarily small and may cover the whole redshift range of the survey. Then, using notations that are similar to Eq. (48), we can write the Peebles & Hauser estimator ŵ_{i} as (78)with (79)where is the mean angular number density on the direction Ω, given by (80)Here we used Eq. (18), and we assumed that the sky coverage is the same over the survey window (ΔΩ), so that is actually a constant that does not depend on Ω (but the formalism is readily extended to the more general case where we add a filter that depends on Ω).
Fig. 20 The mean angular correlation, , over eight angular bins within 1.25 < θ < 50 arcmin, equally spaced in log (θ). We compare our analytical results (solid lines) with numerical simulations (dashed lines). 
Here and in the following, the index i of Eqs. (78), (79) refers to the angular bin [θ_{i, − },θ_{i, + }] , over which we estimate the angular correlation w(θ). Again, we denote with unprimed letters the quantities associated with the first object, such as its position Ω on the sky, and with primed letters the quantities associated with the neighbor at distance θ′, such as its position Ω′. We use the flatsky and Limber’s approximations, which are typically valid for angular radii below 10 deg, as seen in Fig. 10.
The quantity introduced in Eq. (79) can be written as (81)using that defined in Eq. (80) does not depend on Ω in our case, and is the area of the iring, (82)Then, we proceed as in Sect. 4. Substituting the observed 3D number density as in Eq. (17), introducing the halo twopoint correlation ξ^{h} when we take the average as in Eq. (52), and using the factorization (1), we obtain (83)with (84)The superscript “(θ)” recalls that Eq. (84) is an average over the angular ring , instead of the 3D spherical shell of Eq. (54). The prime in the subscript “i′” also recalls that we integrate over a neighboring point θ′, with respect to a given point of the observational cone. However, because the two points are only close in the 2D angular space (i.e., in the iring), we also integrate over the longitudinal coordinate χ′ along the full line of sight in Eq. (84).
Explicit expressions for are given in Appendix G. In contrast to the number counts studied in Sect. 3, where, for large angles above a few degrees, it is necessary to go beyond Limber’s approximation, as found in Figs. 9 and 10, for our study of the angular correlation function Limber’s approximation is sufficient because we consider much smaller angular scales of a few arcmin.
5.1.2. Landy & Szalay estimator
As in Sect. 4.1.2, the measure of the angular correlation can be made more accurate by using the Landy & Szalay estimator instead of the Peebles & Hauser estimator (78) (Landy & Szalay 1993; Szapudi & Szalay 1998). As in Eq. (56), this reads as (85)and we can check that its mean is again equal to the average (83).
5.1.3. Comparison with simulations
We compare in Fig. 20 the mean correlation (83) with results from numerical simulations. The error bars are the 3 − σ statistical errors obtained from the covariance matrices derived in Sect. 5.2.2 for 41 fields of 400 deg^{2} as used in the simulations. Above 5 arcmin we obtain a good match between our results and the numerical simulations. This could be expected from Sect. 4 since the angular correlation is a projection of the 3D correlation. On lower angular scales the discrepancy may be due to the finite size of the clusters. This implies that ξ^{h} = −1 at distances below the sum of the two cluster radii (exclusion effect), but we have not included this effect in our bias model (1). Since a typical cluster at z = 0.5 (with a size of 1 h^{1} Mpc) corresponds to an angle of ~ 2.5 arcmin and projection effects are rare (since clusters are rare objects with a surface density ~ 10 deg^{2}), this exclusion effect indeed occurs at θ ≲ 5 arcmin and appears at slightly larger angles for more massive halos. This explains the behavior found on these scales in Fig. 20.
5.2. Covariance matrices for the halo angular correlation
The covariance matrices of the estimators ŵ_{i} and can be computed following the procedure used in Sect. 4 for the 3D correlation. Denoting again the covariance matrices as C_{i,j} and , they decompose as in Eq. (59), (86)where is a pure samplevariance contribution, whereas and are shotnoise contributions that arise when either one pair or two pairs of objects are identified. Again, the contributions and also involve the twopoint and threepoint correlations; i.e., they contain terms that couple discreteness effects with largescale density correlations.
For the Peebles & Hauser estimator (78) we obtain, as in Eqs. (60)–(62), (87)For the Landy & Szalay estimator (85) only a few of these terms remain, as in Eqs. (64)–(66), and we obtain see also Szapudi (2001), and Bernstein (1994) who considers (up to order over the inverse of the mean density) the additional terms associated with fluctuations of the denominator in the estimator (55), when the latter is normalized to the number counts in the same field.
5.2.1. Loworder terms
Keeping only the contributions that are constant or linear over the twopoint halo correlation ξ^{h}, as in Eqs. (67) and (69), we obtain (93)and (94)where, in a fashion similar to Eq. (68), we introduced the average (95)The Fourierspace expression of Eq. (95) is given in Eq. (H.1).
Comparison of Peebles & Hauser and Landy & Szalay covariance matrices
Fig. 21 The covariance matrices (solid line) and C_{i,j} (dashed line) of the estimators and ŵ_{i}, for i = 2 associated with the angular bin 2 < θ < 3.2 arcmin, as a function of j. We show the results obtained for halos in the redshift range 0 < z < 0.8, with an angular window of 400 deg^{2}, above the mass thresholds M_{ ∗ } = 2 × 10^{13} and 10^{14} h^{1} M_{⊙}, from bottom to top. Here we only consider the loworder contributions, given by Eqs. (93) and (94). 
We compare in Fig. 21 the covariances matrices (93) and (94) as a function of j at fixed i. As in Fig. 20, we consider halos in the redshift range 0 < z < 0.8 in a survey of area 400 deg^{2}. As was the case for the 3D realspace correlation ξ shown in Fig. 13, and in agreement with previous works (Kerscher et al. 2000), the covariance is much smaller and more diagonal for the Landy & Szalay estimator (85) than for the Peebles & Hauser estimator (78). This can also be clearly seen from the comparison of the left and middle panels of Fig. 27, where we show the correlation matrices associated with Eqs. (93) and (94).
Comparison of samplevariance and shotnoise effects
Fig. 22 The contributions C^{(2)} and C^{(3)} to the covariance of the Landy & Szalay estimator, along the diagonal i = j. As in Fig. 21, we only consider the loworder terms, given by Eq. (94). 
Next, we compare in Fig. 22 the contributions C^{(2)} (first term in Eq. (94)) and C^{(3)} (second term in Eq. (94)), again keeping only these loworder terms of the covariance of the Landy & Szalay estimator. As compared with C^{(3)}, C^{(2)} involves an extra degree of shot noise (one more pair identification). Taking only these loworder terms into account, the covariance is dominated by C^{(2)} (whence shotnoise effects are dominant) below 10 arcmin for halos above M_{ ∗ } = 2 × 10^{13} h^{1} M_{⊙}, and below 50 arcmin for halos above M_{ ∗ } = 10^{14} h^{1} M_{⊙}. As for the 3D correlation, shotnoise effects dominate up to larger scales for more massive and rare halos, and this also implies that their covariance matrix is more strongly diagonal.
5.2.2. Highorder terms
Fig. 23 The low and highorder contributions to the covariance matrix along its diagonal. We again consider halos in the redshift range 0 < z < 0.8, with an angular window of 400 deg^{2}, above two mass thresholds. 
Fig. 24 The low and highorder contributions to the covariance matrix , as in Fig. 23, but along one row. This corresponds to the fixed bin i = 2, associated with the angular bin 2 < θ < 3.2 arcmin, as a function of j. 
At small angular separations, the highorder terms in Eqs. (91) and (92), associated with the product ξξ and the three and fourpoint correlations ζ and η, are not negligible. As in Sect. 4.2.2 and in Bernstein (1994), to estimate these highorder correlations we use the “hierarchical clustering ansatz” shown in Figs. 1 and 2 and given by Eqs. (4)–(7). As described in Appendix H, we follow the procedure that we have already used in Appendix F to compute the highorder terms associated with the 3D correlation ξ. Then, the contribution associated with the product in Eq. (92) writes as (96)the term of Eq. (91) yields (97)and the term of Eq. (92) gives (98)where the various factors are given in Appendix H.
We compare in Figs. 23 and 24 these highorder contributions (96)–(98) with the loworder contribution (94), for the covariance matrix of the Landy & Szalay estimator. We recover the qualitative behavior encountered in Figs. 15 and 16 for the estimator of the 3D correlation function. The “ζ” and “η” terms (97) and (98) show a strong dependence on the smoothing scales, while the “ξξ” term (96) shows a very weak dependence. Again, this is because the contribution (96) involves the product of two correlations between two distinct lines of sight, so that each ξ is averaged over the angular window θ_{s} of the survey, as seen in Eq. (92), whereas the contribution (97) involves the product of two correlations between a central point and two points at angular distances θ_{i} and θ_{j}, as seen in Eq. (91).
For the case of massive halos, M > 10^{14} h^{1} M_{⊙}, the highorder terms are negligible along the diagonal, which is dominated by the shotnoise term, and only give a modest contribution to offdiagonal entries. Then, the covariance matrix remains strongly diagonal (for the angular bins studied here).
For the case of lowmass halos, M > 2 × 10^{13} h^{1} M_{⊙}, the highorder contribution (96) to the diagonal is no longer negligible for θ > 5 arcmin, while the two other contributions (97) and (98), which involve the three and fourpoint correlation functions, are always subdominant on these scales. This is a convenient property since the modelization of highorder manybody correlations is increasingly difficult. However, this was not the case for the 3D correlation ξ, as seen in Figs. 15 and 16, except on large scales. For offdiagonal entries, the highorder contribution (96) can become dominant for widely separated angular scales, while on small scales, θ ~ 1 arcmin, all contributions are of the same order of magnitude.
5.2.3. Comparison with numerical simulations
Fig. 25 The covariance matrix along its diagonal. We show our analytical results including all contributions (solid lines) or only loworder terms (dotted lines), and results from numerical simulations (dashed lines). 
Fig. 26 The covariance matrix , as in Fig. 25, but along one row. This corresponds to the fixed bin i = 4, associated with the angular bin 5 < θ < 8 arcmin, as a function of j. 
As for the 3D correlation, we show the covariance matrix , along its diagonal and along one row, in Figs. 25 and 26. Again we obtain a reasonable agreement with the numerical simulations. For moderatemass halos, the highorder contributions are again necessary to obtain a good match on large scales for diagonal entries and on most scales for offdiagonal entries. The offdiagonal terms of the covariance matrix obtained from the numerical simulations are rather noisy, and our analytical results are competitive in obtaining reliable estimates.
5.2.4. Correlation matrices
Fig. 27 Contour plots for the correlation matrix ℛ_{i,j}, defined as in Eq. (46) but for the full covariance matrix C_{ij} of the halo angular correlation. There are eight angular bins, over 1.25 < r < 50 arcmin, equally spaced in log (θ), as in previous figures. We consider halos in the redshift range 0 < z < 0.8, with an angular window of 400 deg^{2}, above the mass thresholds M > 2 × 10^{13} h^{1} M_{⊙} in the upper row, and M > 10^{14} h^{1} M_{⊙} in the lower row. Left panels: loworder contributions (93) for the Peebles & Hauser estimator. Middle panels: loworder contributions (94) for the Landy & Szalay estimator. Right panels: full correlation matrix, including the highorder contributions of Eqs. (96)–(98), for the Landy & Szalay estimator. 
We show in Fig. 27 the correlation matrices ℛ_{i,j}, defined as in Eq. (46), but for the full covariance matrices C_{i,j} of the estimators of the halo angular correlation. As for the 3D correlation, we can check that, keeping only loworder terms, the correlation matrix of the Landy & Szalay estimator (85) is much more diagonal than for the Peebles & Hauser estimator (78). Taking highorder contributions into account makes the matrix slightly less diagonal, but it still remains significantly diagonal, in agreement with Fig. 24. As in the 3D case, the correlation matrix is much more diagonal for massive halos, where shotnoise effects are more important. The full angular correlation matrix is more diagonal than its 3D counterpart shown in the right hand panels of Fig. 19, and the correlations between small and large angular scales are not as strong as the correlations between the small and large radii found in Fig. 19. In particular, offdiagonal entries and highorder contributions play a less important role (although for lowmass halos it is still useful to take them into account).
6. Applications to real survey cases
In this section, we compare the statistical significance of the number counts and of the 3D correlation function for future large cosmological cluster surveys (we give in Appendix J the selection functions that we use for some of these surveys). Here we must note that, while redshiftspace distortions only have a low impact on angular number densities (number counts and angular correlations) for wide redshift bins, they can more strongly affect 3D clustering. In principle, redshift distortions could be corrected to recover a realspace map if the velocity field is known, and when also applying a fingerofgod compression, but this would require a rather complete spectroscopic follow up, so it is not very practical. Therefore, observations instead provide redshiftspace 3D correlations. Then, the results discussed in this section for 3D correlations should be seen as a first step toward more accurate computations.
Nevertheless, a simple estimate shows that these redshiftspace distortions should not strongly affect our results. Indeed, we find in the numerical simulations that at z = 0.5 for instance clusters have peculiar velocities v on the order of 300 km s^{1} along each axis. The redshiftspace coordinate s_{∥} along the line of sight is given by s_{∥} = x_{∥} + v_{∥}/(aH). This yields a typical error Δx_{∥} for the cluster comoving coordinate on the order of 3.6 h^{1} Mpc. This is not much larger than the typical size of the clusters, which ranges from 1 to 2 h^{1} Mpc. Then, for distance bins that are larger than 20 h^{1} Mpc we can expect redshift distortions to affect our results on the covariance matrices by about 20%. The net effect should actually be smaller because the 3D estimators also include information on clustering along the transverse directions, which are not contaminated by the cluster peculiar velocities. We leave an explicit computation of these redshiftspace distortions to future works.
6.1. Surveys of limited areas
Fig. 28 The mean angular number densities of Xray clusters per square degree, within redshift bins of width Δz = 0.1, for the XXL, DES, and SPT surveys. Error bars contain both the shotnoise and samplevariance contributions, from Eqs. (22) and (27). For DES we consider the mass thresholds M > 5 × 10^{13} h^{1} M_{⊙} and M > 5 × 10^{14} h^{1} M_{⊙} (smaller error bars), and for SPT the mass threshold M > 5 × 10^{14} h^{1}M_{⊙} (larger error bars shifted to the right). 
We first consider several surveys of clusters of galaxies on limited angular windows.

The XXL survey (Pierre et al. 2011) is an XMM Very Large Programme specifically designed to constrain the equation of state of the dark energy by using clusters of galaxies. It consists of two 5 × 5 deg^{2} areas and probes massive clusters out to a redshift of ~2. The wellcharacterized cluster selection function relies on the fact that clusters of galaxies are the only extended extragalactic sources, so that the selection operates in a twodimensional parameter space (equivalent to flux and spatial extent), allowing for different degrees of contamination by misclassified point sources. We show the mass detection probabilities as a function of redshift in the left hand panel of Fig. J.1, for the C1 selection. The space density of this population is ~6 deg^{2}. This complex selection function F(M,z), which differs from a simple mass or Xray flux threshold (see also Pacaud et al. 2006, 2007), is readily included in our formalism through a redefinition of the halo mass function, n(M,z) → F(M,z)n(M,z).

The Dark Energy Survey (DES) is an optical imaging survey to cover 5000 deg^{2} with the Blanco 4meter telescope at the Cerro Tololo InterAmerican Observatory^{6}. We consider the expected mass threshold M > 5 × 10^{13} h^{1} M_{⊙}, as well as the subset of massive clusters M > 5 × 10^{14} h^{1} M_{⊙}, since a binning over mass should help in deriving tighter constraints on cosmology.

The South Pole Telescope (SPT) operates at millimeter wavelengths^{7}. It will cover some 2500 deg^{2} at three frequencies, aiming at detecting clusters of galaxies from the SunyaevZel’dovich (SZ) effect. A preliminary survey of 178 deg^{2} at 150 GHz reveals some 20 clusters down to a depth of 18 μK. Extensive simulations allow the determination of the mass completeness level, above a given significance for these secondary CMB anisotropies (Vanderlinde et al. 2010). This gives a mass threshold on the order of 5 × 10^{14} h^{1} M_{⊙}.
Fig. 29 The mean correlation, from Eq. (58), over ten comoving distance bins within 5 < r < 100 h^{1} Mpc, equally spaced in log (r). We integrate over halos within the redshift interval 0 < z < 1, for the XXL, DES, and SPT surveys, as in Fig. 28 (again the error bars for SPT are slightly larger and shifted to the right with respect to those of DES, for M > 5 × 10^{14} h^{1} M_{⊙}). The error bars show the diagonal part of the covariance, , for the Landy & Szalay estimator, from Eqs. (69) and (75)–(77). 
Fig. 30 The mean correlation, , for the clusters detected by DES over the redshift interval 1 < z < 2. Here we consider 20 distance bins within 5 < r < 100 h^{1} Mpc, equally spaced in log (r) (i.e. twice as many as in Fig. 29). 
We show in Figs. 28–30 the angular number densities and 3D correlations expected for these various surveys. The error bars include all shotnoise and samplevariance contributions (including highorder terms). For the higher redshift interval, 1 < z < 2, we only show the correlation of clusters above 5 × 10^{13} h^{1} M_{⊙} for DES, because in other cases the error bars are too large to allow an accurate measure. On the other hand, to take advantage of the good expected accuracy of this case we consider in Fig. 30 distance bins that are half the size of those of Fig. 29.
As expected, the DES provides the best measures of cluster number counts and correlations, hence the tightest constraints on cosmology, thanks to its wide size, which provides a large number of objects. However, the much smaller XXL survey already provides a meaningful measure of both the abundance and the correlation of clusters, and appears to be a promising tool. The SPT survey allows a useful measure of the number counts as a function of redshift, but its rather high mass threshold leads to a relatively small number of objects, hence large error bars for the 3D correlations, even though a positive signal should still be within reach. Assuming its expected mass threshold of M > 5 × 10^{13} h^{1} M_{⊙} remains valid over 1 < z < 2, the DES is the only survey among these three that allows an accurate measure of the cluster correlation at high redshift, which should help to further constrain the cosmology.
6.2. Allsky surveys
Following Planck, space missions will map the entire sky in the Xray (EROSITA) and optical (EUCLID) wavebands at unprecedented depth and angular resolution. Corresponding selection functions are still at the tentative or predictive level. It is nevertheless instructive to compare estimates of the statistical significance of the allsky cluster catalogs expected from these forthcoming surveys^{8}. In practice, the total angular area of such surveys is not really 4π sterad since we must remove the galactic plane. In the following, for Planck we consider the twosided cone of angle θ_{s} = 75 deg (i.e., b > 15 deg), which yields a total area ΔΩ ≃ 30576 deg^{2}. For Erosita and Euclid we take θ_{s} = 59 deg (i.e., b > 31 deg), which corresponds to a total area that is about onehalf of the full sky, ΔΩ ≃ 20 000 deg^{2}.

Planck operates at nine frequencies, enabling an efficientdetection of the cluster SZ signature but has a rather large PSF(5′–10′). Some 1625 massive clusters out to z = 1 are expected over the whole sky. We assume the selection function by Melin et al. (2006), shown in middle panel of Fig. J.1.

For Erosita, a simple flux limit is currently assumed as an average over the whole sky: 4 × 10^{14} erg s^{1} cm^{2} in the [0.5 − 2] keV band (Predehl et al. 2009). The associated selection function is shown in the right hand panel of Fig. J.1. This would yield 71,907 clusters out to z = 1.

For Euclid, we follow the prescription of the Euclid Science Book for the cluster optical selection function and adopt a fixed mass threshold of 5 × 10^{13} h^{1} M_{⊙} (Refregier et al. 2010).
We show in Fig. 31 the angular number densities per redshift bin. The error bars contain the shotnoise contribution (22), as well as the samplevariance contribution (41) that holds for any angular window and does not rely on the flatsky and Limber’s approximations^{9}. The 3D correlation functions are shown in Figs. 32 and 33.
Fig. 31 The mean angular number densities of clusters within redshift bins of width Δz = 0.1. From top to bottom, we show a) halos above 5 × 10^{13} h^{1} M_{⊙} in Euclid, b) halos detected by Erosita with the selection function of the right panel in Fig. J.1, c), halos above 5 × 10^{14} h^{1} M_{⊙} in either Erosita or Euclid, and d) halos detected by Planck with the selection function of the middle panel in Fig. J.1. 
Fig. 32 The mean correlation, , integrated over 0 < z < 1, as in Fig. 29. From top to bottom, we show a) halos above 5 × 10^{14} h^{1} M_{⊙} in either Erosita or Euclid, b) halos detected by Planck with the selection function of the middle panel in Fig. J.1, c) halos above 5 × 10^{13} h^{1} M_{⊙} in Euclid, and d) halos detected by Erosita with the selection function of the right panel in Fig. J.1. 
Fig. 33 The mean correlation, , over the redshift interval 1 < z < 2, for the clusters detected by Erosita (upper curve, with ten distance bins) and Euclid (lower curve, with twenty distance bins). 
As compared with the smaller surveys of Sect. 6.1, these (almost) fullsky surveys provide much more accurate measures of the evolution with redshift of cluster abundance, and of twopoint correlation functions, thanks to the greater number of objects. In particular, thanks to its lower mass threshold, Euclid can probe higher redshifts, both for number counts and correlation functions. Although we have only considered two redshift bins for the twopoint correlation function, 0 < z < 1 and 1 < z < 2, Figs. 32 and 33 suggest that for Euclid it should be possible to introduce a smaller redshift binning, such as Δz = 0.5. We leave it to future works to estimate which redshift binning is the most efficient at constraining cosmology.
6.3. Shot noise versus sample variance
Fig. 34 The ratio of the rms shotnoise contribution to the rms samplevariance contribution , of the covariance of the angular number densities N_{i} obtained for various surveys. (For DES and Euclid we only consider the case M > 5 × 10^{13} h^{1} M_{⊙}.) 
Fig. 35 The ratio of the rms contributions and of the covariance matrix of the estimator . This is a measure of shotnoise effects. (For DES and Euclid we only consider the case M > 5 × 10^{13} h^{1} M_{⊙}.) 
We show in Fig. 34 the ratio of the shotnoise to samplevariance contributions to the covariance of number counts, where the rms contributions and are defined by Eq. (29). As expected, shot noise becomes increasingly dominant at higher redshift, as the number of clusters decreases, and it is smaller for Euclid which has a wider sky coverage and a lower mass threshold.
We show in Fig. 35 the ratio of the contribution (64) to the sum of contributions (65) and (66), to the rms error . In contrast to Fig. 14 we include the highorder terms of C^{(3)} and C^{(4)}, but the ratio is again a measure of shotnoise effects. As expected, we can see that the contribution C^{(2)} becomes increasingly dominant for smaller radial bins since they contain fewer clusters. We can see that the ordering between the various surveys is not the same as the one obtained in Fig. 34 for the number counts. This is because the mass thresholds are not the same (and couplings between shotnoise and samplevariance effects in the covariance matrix of halo correlations make the analysis less direct). Since a higher mass means both a larger correlation function and larger discreteness effects (because halos are rarer), it is not always obvious a priori how the relative importance of shotnoise effects changes from one configuration to another.
Here we must recall that Fig. 35 only shows the diagonal part of the covariance matrix C_{i,j} and that offdiagonal terms can be nonnegligible, see Sect. 4.
6.4. Highorder and loworder contributions to the sample variance of
Fig. 36 The ratio of the rms highorder contribution (75)–(77) to the rms loworder contribution (second term in Eq. (69)) of the sample variance of the correlation ξ_{i} obtained for various surveys. (For DES and Euclid we only consider the case M > 5 × 10^{13} h^{1} M_{⊙}.) 
We show in Fig. 36 the ratio of the highorder contributions to the loworder contribution of the sample variance of the 3D correlation ξ_{i}. We consider the Landy & Szalay estimator and we define along the diagonal the rms contributions as , from Eqs. (75)–(77) for the highorder term, and for the loworder term, where is given by the second term in Eq. (69). We do not consider here the shotnoise contribution associated with the first term in Eq. (69), which was studied in Fig. 35; however, while and are pure samplevariance contributions, and are mixed shotnoise and samplevariance contributions. Indeed, they arise from both the discreteness of the halo population (as shown by the power instead of , which comes from the identification of two objects as explained in Eq. (D.2)) and its largescale correlations (as shown by the bias factors and ).
As could be expected, we can see in Fig. 36 that the relative importance of highorder terms increases on smaller scales, deeper in the nonlinear regime where correlations are stronger. However, in some cases there is a flattening on larger scales because the relative importance of highorder terms no longer decreases (and could even increase in the case of lowmass halos as seen in Fig. 15). This is because the loworder contribution is actually a mixed “shotnoise and samplevariance” contribution, as noticed above, and shotnoise effects decrease on larger radii (because of the greater volume), as seen in Fig. 35. In agreement with this explanation, we can see that this upturn appears earlier and is greater for the surveys where shotnoise effects are less, that is, Erosita, Euclid, and DES.
More generally, Fig. 36 shows that highorder contributions to the samplevariance or mixed terms are not negligible (but on small scales along the diagonal the covariance matrix is often dominated by the pure shotnoise contribution). For the variety of cases studied in Fig. 36 they do not grow above five times the loworder contribution along the diagonal, but as shown in Figs. 16 and 18 their importance can be greater far from the diagonal. Then, these contributions should be taken into account if one requires accurate or safe estimates of signaltonoise ratios.
6.5. Dependence of the results on cosmology
We investigate in Appendix K the sensitivity of our results to the value of the cosmological parameters, by comparing the curves obtained in the previous sections with those that are obtained when we change either h, Ω_{m}, or σ_{8} by an amount that corresponds to the current “2 − σ” uncertainty (Komatsu et al. 2011). We find that the main features shown in Figs. 34–36 remain valid, with modest quantitative changes (e.g., shotnoise effects become slightly less important, with respect to samplevariance contributions, when σ_{8} is slightly increased). Therefore, our results and conclusions are not sensitive to the precise value of the cosmological parameters (within their current range of uncertainty).
7. Conclusion
In this paper we have presented a general formalism for obtaining analytical estimates of the means and covariance matrices of number counts and correlation functions, for distributions of cosmological objects such as clusters of galaxies or galaxies. To do so, we assumed that the twopoint correlation function of these objects can be factored in terms of a linear bias model, and this simplifies expressions as spatial and mass (or luminosity, temperature, etc.) integrals factor. To estimate the highorder contributions to the covariance of twopoint estimators, we also assumed that the three and fourpoint correlations can be described by a hierarchical ansatz, that is, that they can be written as products of the twopoint functions. This is the simplest model that agrees reasonably well with realistic distributions (of the dark matter density field, as well as of cosmological objects such as galaxies or clusters that follow the dark matter density on large scales). Although this is only an approximate model and it is known that actual cosmological fields do not exactly obey such a hierarchical clustering, this allows us to derive explicit expressions that provide a reasonably good description of covariance matrices.
The main differences or improvements with respect to previous studies are the following.

Keeping the application to cluster surveys in mind, rather thanthe galaxy surveys that have been the aim of most works, weconsidered twopoint estimators that involve integrations overbroad redshift bins. Thus, we do not work with local 3Dcorrelations within an homogeneous and isotropic box at a givenredshift, but with averages over a redshift interval with explicitintegration along part of the observational cone, where the radialdirection plays a specific role.

We took all shotnoise and samplevariance contributions into account, along with highorder contributions, which in the present case of onepoint and twopoint estimators involve products of two twopoint correlation functions and the three and fourpoint correlations.

Within the framework of the simple hierarchical model recalled above, we gave explicit expressions for all contributions to these means and covariance matrices. They can be readily used for any population of objects and any set of cosmological parameters, provided one is able to compute the mass function (or the luminosity/temperature function), the twopoint correlation and three and fourpoint normalization parameters. In practice, assuming a linear scaleindependent bias model (or a uniform scaledependence that can be absorbed into the twopoint correlation), it is sufficient to give a bias b(M,z) in addition to the mass function.
We first studied the number counts per redshift bins, comparing the relative importance of shotnoise and sample variance contributions and giving scaling laws obeyed by the signaltonoise ratios, as a function of the survey area and the number of fields. We have explicitly considered the case of large angular windows, and estimated the angular scale where the flatsky and Limber’s approximations break down, which occurs at about 10 deg. We also computed the decay of correlations between distant redshift bins. In particular, we checked that a redshift binning of width Δz = 0.1 is broad enough to neglect crosscorrelations between different bins.
Next, we studied estimators of the 3D correlation function, averaged over finite redshift intervals. We compared the Peebles & Hauser estimator with the usual Landy & Szalay estimator, and we evaluated the relative importance of shotnoise and samplevariance, loworder and highorder, contributions to the covariance matrix. We also considered the behavior of the offdiagonal terms, and described how highorder contributions make the covariance matrices less diagonal as correlations develop between different scales (especially as one of the scales becomes smaller and more nonlinear). Then, we performed the same analysis for the 2D angular correlation function.
Throughout we compared our analytical expressions with results from numerical simulations and we obtain a reasonably good match. This makes such analytical results more competitive than simulations, because they are much faster to compute and allow one to describe rare objects that would have lowquality statistics in the simulations. Finally, we applied our formalism to several future cluster surveys, and considered both limitedarea and fullsky missions.
We hope our results can help for estimating the signaltonoise ratio of current and future surveys. This is useful for comparing the efficiency of different probes and different survey configurations, such as the choice of redshift binning, survey area, or number of subfields.
Our study should be extended in several directions. First, it would be interesting to consider the noise associated with photometric redshifts. Second, one should include the effect of redshiftspace distortions, which are likely to be important on small scales. Third, the computation of the means and covariance matrices studied in this paper is only an intermediate tool for comparing theoretical predictions with observations, and the final goal is to derive constraints on cosmological parameters or astrophysical processes (e.g., scaling laws for cluster massluminositytemperature relationships). Our results may be used to further investigate the cosmological information that can be extracted from cluster surveys and to optimize observing configurations so as to improve those constraints. We leave these tasks to future works.
Online material
Appendix A: Mean and covariance of number counts
To avoid introducing numerous Dirac factors, owing to the discreteness of the observed distribution of objects, we follow the simple approach described in Sect. 36 of Peebles (1980) to compute the statistical properties of counts in cells. We illustrate in this section this method for the computation of the mean and covariance of number counts within redshift bins.
We divide the “volume” of the space (z,Ω,lnM), which enters the expression (17) of the angular number density of observed objects in redshift bin i, over small (infinitesimal) cells labeled by the index α, so that Eq. (17) reads as (A.1)where subscript i refers to the redshift bin i. Then, since the cell α_{i} is infinitesimally small it contains at most one object, whence (Peebles 1980) (A.2)Moreover, by definition its average is given by (A.3)Of course, we recover for the mean number of objects in the redshift bin i the expression (19), which could also be read from Eq. (17) using the average (A.4)We now consider the covariance of the angular number densities . From Eq. (A.1) we have In the second line we used the fact that the redshift bins do not overlap, so that for two “volumes” α_{i} and α_{j} to coincide, bins i and j must be the same (and δ_{i,j} is the Kronecker symbol), while in the third line we used Eq. (A.2). The first term in Eq. (A.7) corresponds to the shot noise, due to the discreteness of the object distribution. The second term includes the nonzerodistance correlation between objects, and reads as (for α_{i} ≠ α_{J}) (A.8)where is the “halo” twopoint correlation function between “volumes” α_{i} and α_{j}, see Peebles (1980). This yields (A.9)using obvious notations where we label the quantities associated with and by the subscripts i and j and we integrate over the bins i and j. This could also be directly obtained from Eq. (17) by writing (A.10)where the second term with the Dirac factors gives the shotnoise contribution.
In this derivation we have assumed in Eq. (A.2) that space can be divided into infinitesimal volumes that contain either zero or one object and that each object only appears in one cell. Even though clusters and dark matter halos are actually extended objects, it is still possible to define a point distribution by associating a single point to each cluster or halo, for instance the halo mass center. Thus, this approach, which follows Peebles (1980), applies to these cases as well and to any distribution of discrete objects, as long as we restrict ourselves to count distributions and do not study the internal structure of these objects.
Appendix B: Finitesize effects
Fig. B.1 Geometrical illustration of finitesize effects. Close to the survey boundary, part of the sphere of radius r extends beyond the observational cone and should not be counted. The left plot is a transverse view, orthogonal to the central line of sight, whereas the right plot is a view from a point far away on the line of sight. 
As noticed in Sect. 4, in our computations of the mean and covariance of the estimators and we neglect finitesize effects. Indeed, we do not take the fact into account that when a point i gets close to the survey boundaries the available space for points i′ located in the distance bin [R_{i, − },R_{i, + }] , with respect to point i, is only a fraction of this spherical shell since a part of it extends beyond the observational cone. This means that we overestimate the total number of pairs. This has no impact on the mean, , since this effect cancels out between the numerator and denominator in (48), but it means that we slightly overestimate the signaltonoise ratio.
To estimate the magnitude of this error, we compute the geometrical factor illustrated in Fig. B.1. Approximating the observational cone as a cylinder of radius , a point i at distance ℓ from the central line of sight is the center of a spherical shell of radius r, onto which we count all neighbors i′ to estimate the correlation ξ at this distance r. We denote F(ℓ) as the fraction of this sphere that is enclosed within the observational cone. In our computations elsewhere we used the approximation F = 1, but for R_{s} − r < ℓ < R_{s} we actually have F < 1. As in the transverse view shown in the left hand plot of Fig. B.1, the angle θ_{ℓ} associated with the farthest point of intersection between the cylinder and the sphere satisfies ℓ + rsinθ_{ℓ} = R, whence (B.1)Next, in the plane of each vertical section (i.e., at fixed θ), shown in the right hand plot of Fig. B.1 that corresponds to a projection along the line of sight, the cylinder appears as a circle of radius R_{s}, whereas the section of the sphere of center i appears as a circle of radius rsin(θ). Both circles intersect (again for R_{s} − r < ℓ < R_{s}) at the symmetric polar angles ϕ_{ ± }, with (B.2)Then, the surface of the sphere that extends outside of the observational cylinder writes as (B.3)Thus, for R_{s} − r < ℓ < R_{s} the fraction of the sphere that is enclosed within the observational cylinder reads as (B.4)whereas F(ℓ) = 1 for 0 < ℓ < R_{s} − r. Then, integrating the position of the central point i over the cylinder, the fraction of volume for pairs at distance r, with respect to the approximation F = 1, writes as This gives the ratio of the number of pairs N′, which is measured in the survey, to the number N obtained when we do not take finitesize effects into account. For instance, at z = 1, which corresponds to the angular distance Mpc, and for a survey angular window of area 50 deg^{2}, which corresponds to θ_{s} ≃ 0.0696 rad, we have Mpc. Then, we obtain N′/N ≃ 0.91 for a shell at radius r = 30 h^{1} Mpc. This means that the approximation F = 1 overestimates the number of pairs by about 10% and the signaltonoise ratio by 5%.
Appendix C: Computation of the mean of the estimators and
Defining the 3D Fourierspace tophat as (C.1)the 3D Fourierspace window of the ishell reads as (C.2)where the superscript (3) recalls that we consider a 3D radial bin. Then, writing the twopoint correlation function in terms of the power spectrum, as in Eq. (3), we obtain for its radial average (54) (C.3)(Here i and i′ refer to the same radial bin; the prime only recalls that we are integrating over a neighbor i′ within a small radial shell with respect to another point in the observational cone.)
Appendix D: Derivation of the covariance of the Peebles & Hauser estimator
We compute here the covariance of the estimators , which is identical to the covariance of the quantities . To simplify the expressions we do not consider mass binning here, but it is straightforward to generalize to the case of several mass bins. From the definition (48) we can write with obvious notations the second moment as (D.1)The average in Eq. (D.1) can be written as in Eq. (A.10), with many Dirac factors for the shotnoise contributions. However, as in Appendix A, it may be easier to follow Peebles (1980) and to divide “volumes” over small (infinitesimal) cells that contain objects, with or 1. Then, we can split the average as (D.2)where we have explicitly written the first “pure samplevariance” contribution and the last six “shotnoise” contributions associated with the Kronecker symbols. The remaining averages with the superscript “(s.v.)” denote “samplevariance” averages, that is, without further shotnoise terms. Here we used the fact that the objects i and i′ are separated by the finite distance r_{i′}, with r_{i′} ≥ R_{i, − }, so that the elementary “cells” i and i′ cannot coincide and there is no shotnoise contribution of the form δ_{i,i′}. For the same reason there is no term δ_{j,j′}. Next, the “samplevariance” averages of Eq. (D.2) read as (Peebles 1980) (D.3)(D.4)(D.5)where ξ^{h}, ζ^{h}, and η^{h} are the twopoint, threepoint, and fourpoint correlation functions of the objects. Since we have (D.6)we obtain from Eqs. (D.1)–(D.5) the decomposition (59) of the covariance matrix, with the explicit expressions (60)–(62) of the various “samplevariance” and “shotnoise” contributions. Here we used the symmetries^{10} { i ↔ i′ } and { j ↔ j′ } of Eq. (D.1). In Eq. (61) the object “j′” is at the distance r_{j′} from the object “i”, since this shotnoise contribution comes from the case where the objects i and j are the same object (or from one of the three remaining cases “i = j′”, “i′ = j”, or “i′ = j′”). The shotnoise contribution (60) comes from the identification “i = j and i′ = j′” (or “i = j′ and i′ = j”). This implies that the distances r_{i′} and r_{j′} are equal, which gives rise to the Kronecker symbol δ_{i,j} since we consider the case of nonoverlapping distance bins [R_{i, − },R_{i, + }] .
From Eq. (52) the contribution of Eq. (60) also reads as (D.7)In order to estimate the contributions and we assume that the radial bins [R_{i, − },R_{i, + }] are restricted to large enough scales to neglect three and fourpoint correlation functions, as well as products such as ξ_{i;j′}ξ_{i′;j}. Thus, we only keep in this Appendix the contributions that are constant or linear over the twopoint correlation function ξ_{i;j} of the objects, which we recall with the superscripts “1” and “ξ” below. Moreover, we again assume that the twopoint correlation function can be factored in as in Eq. (1).
The first contribution to , associated with the factor 1 in the brackets in Eq. (61), reads as (D.8)The contributions that are linear over ξ sum up as (D.9)where and are defined as in Eqs. (54) and (C.3), whereas is defined in Eq. (68) and also writes as (D.10)Next, at this order the contribution (62) to the covariance simplifies as (D.11)where is Limber’s approximation (13) to Eq. (8). Then, collecting all terms, we obtain the expression (67) for the covariance.
Appendix E: Derivation of the mean and covariance of the Landy & Szalay estimator
We can relate the Landy & Szalay estimator defined by Eq. (56) to the Peebles & Hauser estimator (48) by (E.1)where we defined the crossterm by (E.2)We obtain at once, using Eqs. (A.4) and (50), (E.3)which leads to Eq. (57).
From the relation (E.1) we have for the covariance of the estimator , (E.4)where C_{i,j} is the covariance of the Peebles & Hauser estimator , defined in Eq. (59). To compute the crossterms in (E.4) we write as in Eq. (D.1), (E.5)Proceeding as in Appendix D, this gives (E.6)with (E.7)(E.8)Next, to compute the last term in Eq. (E.4) we write (E.9)whence (E.10)with (E.11)(E.12)Collecting all terms in Eq. (E.4), which reads as , we obtain the decomposition (63) with the contributions (64)–(66).
Appendix F: Computation of highorder terms for the covariance of
We compute here the highorder terms for the covariance of the LandySzalay estimator that we had neglected in Eq. (69).
For numerical computations, it is often more efficient to express the quantities that we encounter in this work in terms of the realspace correlation ξ(x), instead of the Fourierspace power spectra P(k) or Δ^{2}(k), provided ξ(x) is known (e.g., computed in advance on a fine grid^{11}). Indeed, this replaces oscillatory integrals by integrals with slowlyvarying factors, which allows faster and more accurate computations. This comes from mostly considering various kinds of volume averages of correlation functions, such as Eq. (8), which are more naturally written in configuration space. This yields integrations over bounded or unbounded domains with typically positive and slowlyvarying kernels. In contrast, the transformation to Fourier space yields highly oscillatory kernels as soon as some underlying realspace volumes are finite with a size much larger than some other scales (see for instance the 2D tophat (12) for a window θ_{s} that is much broader than the typical angular scale ). On the other hand, intermediate analytical computations are often easier to perform in Fourier space, mostly because of the convolution theorem. Then, a convenient method is to first write expressions in terms of Fourierspace power spectra, perform integrations over angles, and finally go back to the realspace correlation function, using the fact that from Eq. (3), ξ(x) and Δ^{2}(k) are related by (F.1)(F.2)As shown below, this method also allows partial factorization of most integrals.
A first highorder contribution to the covariance arises from the product ξ_{i;j′}ξ_{i′;j} in Eq. (66), which also writes as Eq. (75) where we introduced the quantity defined by (F.3)Expressing the twopoint correlation functions in terms of the power spectrum, using the flatsky (small angle) approximation, as well as Limber’s approximation as we did for Eq. (9), we obtain after integration over angles and over the two radial shells, (F.4)Again, the factor 2πδ_{D}(k_{1 ∥ } + k_{2 ∥ }) comes from the integration over χ_{j}, which suppresses longitudinal wavelengths. Using the exponential representation of Dirac functions, Eq. (F.4) can be partially factorized as (F.5)and the integration over angles yields (F.6)This also reads as (F.7)where we introduced (F.8)and (F.9)The function A^{(3)}(y) can be written as where K(k) and E(k) are the complete elliptic integrals of the first and second kinds (Gradshteyn & Ryzhik 1965). One can check that A^{(3)}(y) is a positive, nonoscillatory, and continuous function (but not analytic at y = 2), with A^{(3)}(y) ~ 2y for y → 0 and A^{(3)}(y) ~ 1/y for y → ∞.
Going back to configuration space, by substituting Eq. (F.2), the integral (F.9) can be written as (F.12)with (F.13)which for a > 0 and b > 0 is given by (F.14)Thus, using Eq. (F.12), the quantity of Eq. (F.7) involves slowly varying integrals over realspace variables, which partially factor as three factors within the integrand of Eq. (F.7). This makes it more efficient to use Eq. (F.7) than the Fourierspace expressions (F.4) or (F.6).
To evaluate the two remaining contributions, associated with the factors ζ_{i,i′,j′} in Eq. (65) and η_{i,i′;j,j′} in Eq. (66), we use the model for the three and fourpoint correlation functions described in Sect. 2.1.2. Thus, using Eq. (4) for the threepoint correlation function that enters Eq. (65), this contribution to Eq. (65) reads as (F.15)The first term in the bracket in Eq. (F.15) is given by (F.16)where was defined in Eq. (54), because the integrations over r_{i′} and r_{j′} are independent. The second term reads as (F.17)which no longer factors. Introducing an auxiliary wavenumber k and the Dirac factor δ_{D}(k_{1} + k_{2} − k), which we write under its exponential form as in Eq. (F.5), and using the inverse Fourier transform of the 3D shell (C.2), as well as Eq. (F.9), we obtain (F.20)The third term in Eq. (F.15) is obtained from Eq. (F.20) by exchanging the labels “i” and “j”.
We now turn to the fourpoint contribution to Eq. (66), using Eq. (6) for the halo fourpoint correlation function . Thanks to the symmetries { i ↔ i′ } and { j ↔ j′ } we have two different contributions (a) and (b) associated with the topology of the left diagram in Fig. 2, each with a multiplicity factor 2, and four different contributions (c), (d), (e), and (f), associated with the topology of the right diagram, with multiplicity factors 4,4,2, and 2.
The first contribution (a) reads as (F.21)with (F.22)Proceeding as for Eq. (F.3), we obtain (F.23)the contribution is the symmetric one with respect to { i ↔ j } of Eq. (F.21); that is, the product is replaced by .
Next, the contribution reads as (F.24)where the geometrical average writes as (F.25)since integrals over r_{i′} and r_{j′} can be factored.
The contribution (d) involves where no factorization is possible. Proceeding as for Eq. (F.3) we obtain (F.26)The contribution (e) involves that can be written as (F.27)whereas contribution (f) is obtained from (e) by exchanging the labels “i” and “j”.
Collecting all terms, the highorder contributions to the covariance matrix are given by Eqs. (75)–(77).
Appendix G: Computation of the mean of the estimators ŵ and ŵ^{LS}
We give here explicit expressions of the average (84) of the correlation function over an angular ring. As in Sect. 2.1.3, using the flatsky and Limber’s approximations, we obtain (G.1)where we introduced the 2D Fourierspace window of the iring, (G.2)and , associated with a full circular window, was defined in Eq. (12). In terms of the twopoint correlation function, Eq. (G.1) also writes as (G.3)which avoids introducing oscillatory kernels.
Appendix H: Computation of the covariance of ŵ^{LS}
The loworder contribution (94) to the covariance matrix of the estimator ŵ^{LS} involves the angular average (95). Using Limber’s approximation it also reads as (H.1)We now compute the highorder terms of the covariance , which are given in Eqs. (96)–(98). A first contribution (96) arises from the product ξ_{i;j′}ξ_{i′;j} in Eq. (92). Using Limber’s approximation and integrating over angles yields (H.2)Introducing a Dirac factor , which we write with the usual exponential representation in a fashion similar to Eq. (F.5), we obtain after integration over angles (H.3)Then, after a rescaling of variables x and y, and defining the quantities (H.4)(H.5)(H.6)we obtain the expression (96), using the property A^{(2)}(y) = 0 for y > 2. As compared with Eq. (H.2), introducing the Dirac factor and the two auxiliary variables x and y has allowed us to partly factor in the integrals, as seen in Eq. (96), which is convenient for numerical computations. Again, it is useful to express Eq. (H.6) in terms of the realspace twopoint correlation function, which yields (H.7)with (H.8)Although there is no explicit expression for the integral (H.8) for arbitrary (a,b), for a − b > 1 we can use the properties (H.9)In the band a − b < 1 one can check that W_{2}(a,b) is positive and decays as ~ b^{2} for large b, so that the realspace expression (H.7) is again more convenient than the Fourierspace expression (H.6).
The second contribution (97) arises from the threepoint correlation ζ in Eq. (91). Using Eq. (4) it reads as (H.10)where the three terms in the brackets, which correspond to the three diagrams in Fig. 1, are again geometrical averages along the lines of sight, which we compute with Limber’s approximation. In particular, the first term factors as (H.11)where was defined in Eq. (84), while the second term reads as (H.12)With the same factorization method, and using the inverse Fourier transform of the 2D shell (G.2), we obtain (H.15)where we introduced The third term in Eq. (H.10) is obtained from Eq. (H.15) by exchanging the labels “i” and “j”.
The third contribution (98) arises from the fourpoint correlation η in Eq. (92). As in Appendix F, we must compute the various terms associated with the diagrams of Fig. 2, with contributions (a) and (b) associated with the left diagram and contributions (c), (d), (e), and (f) associated with the right diagram. The first contribution (a) leads to (H.18)As in Appendix F, this geometrical average factors as (H.19)with Contribution (b) is the symmetric one of (a) with respect to i ↔ j.
Next, contribution (c) involves the geometrical average , which again factors as (H.22)
The contribution (d) involves the average (H.23)which also writes as (H.24)Contribution (e) involves , which reads as (H.25)whereas contribution (f) is obtained from (e) by exchanging the labels “i” and “j”.
Collecting all terms, the highorder contributions to the covariance matrix are given by Eqs. (96)–(98).
Appendix I: Scaling of the number counts signaltonoise in simulations
Fig. I.1 Scaling of the numbercounts signaltonoise ratio by as computed in the Horizon simulation, see Sect. 2.2. Different configurations are displayed according to the total surveyed area ΔΩ, the number of subfields , and the mass limit. In the right caption, ΔΩ is expressed in deg^{2} and the mass unit is h^{1} M_{⊙}. 
Fig. I.2 Same as Fig. I.1 but with a scaling that depends on the number of subfields: , with n = −0.6 
Fig. J.1 Left panel: cluster mass associated with a 50%, 80%, or 95% detection probability (from bottom to top), for the XXL selection function C1, as a function of redshift. Middle panel: minimum detectable cluster mass, as a function of redshift, for the Planck space mission. Right panel: cluster mass associated with a 50%, 80%, or 95% detection probability (from bottom to top), for the Erosita selection function as a function of redshift (we consider a flux limit of 4 × 10^{14} erg s^{1} cm^{2} in the [0.5 − 2] keV band). 
We present here the result of scaling the number counts with the total surveyed area ΔΩ and the number of subfields . Figures I.1 and I.2 show the scalings expected from Eq. (35) in the shotnoise and samplevariance dominated regimes. Multiple survey configurations are explored by varying the total surveyed area (ΔΩ = 25,50, and 100 deg^{2}), the number of subfields (, and 4), and the mass threshold (M > 2 × 10^{13},10^{14}, and 5 × 10^{14} h^{1} M_{⊙}).
The weak scatter in those plots shows that (35) provides a valid approximation of the signaltonoise scaling with respect to ΔΩ and . In agreement with the discussion in Sect. 3.2.1 and Fig. 7, at high redshift and for high mass, the scaling shown in Fig. I.1 is best, as expected for the shotnoise dominated regime, whereas at low redshift and for low mass the scaling shown in Fig. I.2 (with n = −0.6) is best, as expected for the samplevariance dominated regime.
Appendix J: Selection functions used for various surveys
We give in Fig. J.1 the selection functions that we use for several cluster surveys investigated in Sect. 6. For Planck, the curves shown in the middle panel corresponds to a 100% detection probability.
For the other surveys studied in Sect. 6 we consider simple mass thresholds, rather than detailed selection functions. More precisely, we consider halos above the two thresholds 5 × 10^{13} h^{1} M_{⊙} and 5 × 10^{14} h^{1} M_{⊙} for DES and Euclid, and above 5 × 10^{14} h^{1} M_{⊙} for SPT.
Appendix K: Dependence on cosmology
In this appendix we investigate the dependence of the results obtained in Sect. 6 on the value of the cosmological parameters. Thus, in addition to the WMAP7 cosmology recalled in the first line of Table K.1, which was used in Sect. 6, we also consider the three modified cosmologies where one among the three parameters h, Ω_{m}, and σ_{8} is changed to the values shown in the second line of Table K.1. They correspond to “” deviations from WMAP7 (Komatsu et al. 2011) and describe current uncertainties. (When we vary Ω_{m} we keep a flat ΛCDM universe and we change Ω_{de} according to Ω_{de} = 1−Ω_{m}.)
Thus, we compare in Figs. K.1–K.3, the three curves obtained for these three alternative cosmologies with the curve that was obtained in Sect. 6 for the fiducial WMAP7 cosmology. To avoid overcrowding the figures we only consider the allsky
surveys, Planck, Erosita, and Euclid. We can see that the main features of these figures are not modified when we consider these alternative cosmologies, so our results and conclusions are not sensitive to the precise value of the cosmological parameters. As expected, we can also check that shotnoise effects become less important, with respect to samplevariance contributions, when σ_{8} is increased.
Three alternative cosmologies.
Fig. K.1 The ratio of the rms shotnoise contribution to the rms samplevariance contribution , of the covariance of the angular number densities N_{i}, as in Fig. 34. The fiducial curve that was shown in Fig. 34 is the solid line (mean WMAP7 cosmology), whereas the dashed, dotdashed, and dotted lines correspond to the three cosmologies where either h, Ω_{m}, or σ_{8}, is changed to the value given in the second line of Table K.1. 
Fig. K.2 The ratio of the rms contributions and of the covariance matrix of the estimator , as in Fig. 35. The line styles are as in Fig. K.1 and Table K.1. 
Fig. K.3 The ratio of the rms highorder contribution (75)–(77) to the rms loworder contribution (second term in Eq. (69)) of the sample variance of the correlation ξ_{i}, as in Fig. 36. The line styles are as in Fig. K.1 and Table K.1. 
An alternative would be to use a halo model (Cooray & Sheth 2002), coupled to perturbation theory predictions on large scales. This could provide more accurate estimates, since such an approach is able to describe loworder correlation functions well from very large to small scales (Scoccimarro et al. 2001; Giocoli et al. 2010; Valageas & Nishimichi 2011a,b). However, this would introduce several contributions associated with “1halo” up to “4halo” terms, and would require additional parameters such as halo occupation functions and highorder bias parameters, depending on the objects that one considers. Therefore, we do not investigate this approach in this paper (although it would certainly deserve further attention), and we restrict ourselves to the simpler hierarchical models, as described in Eqs. (4)–(6).
Within a local bias model, one writes the halo density field as δ_{h} = ∑ _{k}b_{k}δ^{k}/k !, where δ is the matter density contrast smoothed on large scales. Then, the halo manybody correlations also depend on the higher order bias coefficients b_{k} (Fry & Gaztanaga 1993), but for simplicity we only consider a linear bias model here (i.e. b_{k} = 0 for k ≥ 2).
For all the considered surveys, the M_{lim}(z) curves were estimated using specific assumptions as to the evolution of the Xray, optical, and SZ properties of the clusters, hence on their detectability. Moreover, the assumed cosmology was either WMAP5 or WMAP7, to be consistent with published analysis of each survey. Therefore, these hypotheses may not be totally selfconsistent with respect to each other, but the main results of the comparison between the expected signals should remain valid.
Because we consider a symmetric twosided angular window (i.e., two cones of angle θ_{s} around the north and south galactic poles), the coefficients vanish for nonzero m and for odd ℓ. For even ℓ, they are still given by Eqs. (44)–(45) where we substitute (ΔΩ) → (ΔΩ)_{half}, where (ΔΩ)_{half} = (ΔΩ)/2 is the area associated with a single side (so that the last expression (45) still applies).
The integration over the points i and i′ in Eq. (D.1) should be understood as , which is more clearly symmetric, where Θ is a tophat window that takes values 0 or 1 with obvious notations. In practice, we actually slightly “break” this symmetry by using the variables (χ_{i},Ω_{i};r_{i′}) as in Eq. (D.1) if we do not take boundary effects into account, as in this paper.
In practice, we compute in advance ξ(x,z) on a 2D grid, over distance and redshift, using Eq. (F.1) and the nonlinear power spectrum from Smith et al. (2003). To obtain meaningful and accurate results, one needs to make sure that ξ(x) is accurately computed, especially on large scales where one should recover linear theory and a smooth twopoint correlation.
Acknowledgments
We thank the anonymous referee for comments that helped to improve the presentation of the paper. F.P. acknowledges support from Grant No. 50 OR 1003 of the Deutsches Zemtrum für Luft und Raumfahrt (DLR) and from the Transregio Programme TR33 of the Deutsche Forschungsgemeinschaft (DfG).
References
 Abazajian, K. N., AdelmanMcCarthy, J. K., Agüeros, M. A., et al. 2009, APJS, 182, 543 [Google Scholar]
 Adami, C., Mazure, A., Pierre, M., et al. 2011, A&A, 526, A18 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Arnaud, M., Pointecouteau, E., & Pratt, G. W. 2005, A&A, 441, 893 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Arnaud, M., Pratt, G. W., Piffaretti, R., et al. 2010, A&A, 517, A92 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Benoist, C., Maurogordato, S., da Costa, L. N., Cappi, A., & Schaeffer, R. 1996, ApJ, 472, 452 [NASA ADS] [CrossRef] [Google Scholar]
 Bernardeau, F., Colombi, S., Gaztañaga, E., & Scoccimarro, R. 2002, Phys. Rep., 367, 1 [NASA ADS] [CrossRef] [EDP Sciences] [MathSciNet] [Google Scholar]
 Bernstein, G. M. 1994, ApJ, 424, 569 [NASA ADS] [CrossRef] [Google Scholar]
 Cohn, J. D. 2006, New Astron., 11, 226 [NASA ADS] [CrossRef] [Google Scholar]
 Cole, S., Percival, W. J., Peacock, J. A., et al. 2005, MNRAS, 362, 505 [NASA ADS] [CrossRef] [Google Scholar]
 Colombi, S., Bouchet, F. R., & Hernquist, L. 1996, ApJ, 465, 14 [NASA ADS] [CrossRef] [Google Scholar]
 Cooray, A., & Sheth, R. 2002, Phys. Rep., 372, 1 [NASA ADS] [CrossRef] [Google Scholar]
 Crocce, M., Cabre, A., & Gaztanaga, E. 2011, MNRAS, 414, 329 [NASA ADS] [CrossRef] [Google Scholar]
 Croton, D. J., Gaztanaga, E., Baugh, C. M., et al. 2004, MNRAS, 352, 1232 [NASA ADS] [CrossRef] [Google Scholar]
 Desjacques, V., Crocce, M., Scoccimarro, R., & Sheth, R. K. 2010, Phys. Rev. D, 82, 103529 [NASA ADS] [CrossRef] [Google Scholar]
 Eisenstein, D. J., & Hut, P. 1998, ApJ, 498, 137 [NASA ADS] [CrossRef] [Google Scholar]
 Eisenstein, D. J., & Zaldarriaga, M. 2001, ApJ, 546, 2 [NASA ADS] [CrossRef] [Google Scholar]
 Eisenstein, D. J., Hu, W., & Tegmark, M. 1998, ApJ, 504, L57 [NASA ADS] [CrossRef] [Google Scholar]
 Eisenstein, D. J., Zehavi, I., Hogg, D. W., et al. 2005, ApJ, 633, 560 [NASA ADS] [CrossRef] [Google Scholar]
 Evrard, A. E. 1989, ApJ, 341, L71 [NASA ADS] [CrossRef] [Google Scholar]
 Feldman, H. A., Kaiser, N., & Peacock, J. A. 1994, ApJ, 426, 23 [NASA ADS] [CrossRef] [Google Scholar]
 Fry, J. N. 1984, ApJ, 279, 499 [NASA ADS] [CrossRef] [Google Scholar]
 Fry, J. N., & Gaztanaga, E. 1993, ApJ, 413, 447 [NASA ADS] [CrossRef] [Google Scholar]
 Gaztanaga, E., Norberg, P., Baugh, C. M., & Croton, D. J. 2005, MNRAS, 364, 620 [NASA ADS] [CrossRef] [Google Scholar]
 Giocoli, C., Bartelmann, M., Sheth, R. K., & Cacciato, M. 2010, MNRAS, 408, 300 [NASA ADS] [CrossRef] [Google Scholar]
 Goroff, M. H., Grinstein, B., Rey, S.J., & Wise, M. B. 1986, ApJ, 311, 6 [NASA ADS] [CrossRef] [Google Scholar]
 Gradshteyn, I. S., & Ryzhik, I. M. 1965, Table of integrals, series, and products (New York: Academic Press) [Google Scholar]
 Groth, E. J., & Peebles, P. J. E. 1977, ApJ, 217, 385 [NASA ADS] [CrossRef] [Google Scholar]
 Harker, G., Cole, S., & Jenkins, A. 2007, MNRAS, 382, 1503 [NASA ADS] [CrossRef] [Google Scholar]
 Hu, W. 2000, Phys. Rev. D, 62, 043007 [NASA ADS] [CrossRef] [Google Scholar]
 Hu, W., & Kravtsov, A. V. 2003, ApJ, 584, 702 [NASA ADS] [CrossRef] [Google Scholar]
 Kaiser, N. 1984, ApJ, 284, L9 [NASA ADS] [CrossRef] [Google Scholar]
 Kaiser, N. 1992, ApJ, 388, 272 [NASA ADS] [CrossRef] [Google Scholar]
 Kazin, E. A., Blanton, M. R., Scoccimarro, R., et al. 2010, ApJ, 710, 1444 [NASA ADS] [CrossRef] [Google Scholar]
 Kerscher, M., Szapudi, I., & Szalay, A. S. 2000, ApJ, 535, L13 [NASA ADS] [CrossRef] [PubMed] [Google Scholar]
 Knebe, A., Knollmann, S. R., Muldrew, S. I., et al. 2011, MNRAS, 415, 2293 [NASA ADS] [CrossRef] [Google Scholar]
 Komatsu, E., Smith, K. M., Dunkley, J., et al. 2011, ApJS, 192, 18 [NASA ADS] [CrossRef] [Google Scholar]
 Kulkarni, G. V., Nichol, R. C., Sheth, R. K., et al. 2007, MNRAS, 378, 1196 [NASA ADS] [CrossRef] [Google Scholar]
 Landy, S. D., & Szalay, A. S. 1993, ApJ, 412, 64 [NASA ADS] [CrossRef] [Google Scholar]
 Limber, D. N. 1953, ApJ, 117, 134 [NASA ADS] [CrossRef] [Google Scholar]
 LoVerde, M., & Afshordi, N. 2008, Phys. Rev. D, 78, 123506 [NASA ADS] [CrossRef] [Google Scholar]
 Majumdar, S., & Mohr, J. J. 2004, ApJ, 613, 41 [NASA ADS] [CrossRef] [Google Scholar]
 Maller, A. H., McIntosh, D. H., Katz, N., & Weinberg, M. D. 2005, ApJ, 619, 147 [NASA ADS] [CrossRef] [Google Scholar]
 Manera, M., & Gaztanaga, E. 2011, MNRAS, 415, 383 [NASA ADS] [CrossRef] [Google Scholar]
 Marin, F. A., Wechsler, R. H., Frieman, J. A., & Nichol, R. C. 2007, ApJ, 172, 849 [Google Scholar]
 Massey, R., Rhodes, J., Leauthaud, A., et al. 2007, ApJS, 172, 239 [NASA ADS] [CrossRef] [MathSciNet] [Google Scholar]
 Meiksin, A., & White, M. 1999, MNRAS, 308, 1179 [NASA ADS] [CrossRef] [Google Scholar]
 Melin, J.B., Bartlett, J. G., & Delabrouille, J. 2006, A&A, 459, 341 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Munshi, D., Valageas, P., van Waerbeke, L., & Heavens, A. 2008, Phys. Rep., 462, 67 [NASA ADS] [CrossRef] [Google Scholar]
 Norberg, P., Baugh, C. M., Gaztañaga, E., & Croton, D. J. 2009, MNRAS, 396, 19 [NASA ADS] [CrossRef] [Google Scholar]
 Oukbir, J., & Blanchard, A. 1992, A&A, 262, L21 [NASA ADS] [Google Scholar]
 Pacaud, F., Pierre, M., Leauthaud, A., et al. 2006, MNRAS, 372, 578 [NASA ADS] [CrossRef] [Google Scholar]
 Pacaud, F., Pierre, M., Adami, C., et al. 2007, MNRAS, 382, 1289 [NASA ADS] [CrossRef] [Google Scholar]
 Peebles, P. J. E. 1980, The large scale structure of the universe (Princeton: Princeton University Press) [Google Scholar]
 Peebles, P. J. E., & Hauser, M. G. 1974, ApJS, 28, 19 [NASA ADS] [CrossRef] [Google Scholar]
 Pierre, M., Pacaud, F., Juin, J. B., et al. 2011, MNRAS, 414, 1732 [NASA ADS] [CrossRef] [Google Scholar]
 Politzer, H. D., & Wise, M. B. 1984, ApJ, 285, L1 [NASA ADS] [CrossRef] [Google Scholar]
 Predehl, P., Boehringer, H., et al. 2009, Proceedings of the conference Xray Astronomy 2009, Bologna, September 2009 [Google Scholar]
 Refregier, A., Amara, A., Kitching, T. D., et al. 2010, Euclid Imaging Consortium Science Book [arXiv:1001.0061] [Google Scholar]
 Reid, B. A., Percival, W. J., Eisenstein, D. J., et al. 2010, MNRAS, 404, 60 [NASA ADS] [CrossRef] [Google Scholar]
 Ross, A. J., Brunner, R. J., & Myers, A. D. 2006, ApJ, 649, 48 [NASA ADS] [CrossRef] [Google Scholar]
 Scoccimarro, R., Zaldarriaga, M., & Hui, L. 1999, ApJ, 527, 1 [NASA ADS] [CrossRef] [Google Scholar]
 Scoccimarro, R., Sheth, R. K., Hui, L., & Jain, B. 2001, ApJ, 546, 20 [NASA ADS] [CrossRef] [Google Scholar]
 Smith, R. E. 2009, MNRAS, 400, 851 [NASA ADS] [CrossRef] [Google Scholar]
 Smith, R. E., Peacock, J. A., Jenkins, A., et al. 2003, MNRAS, 341, 1311 [NASA ADS] [CrossRef] [Google Scholar]
 Spergel, D. N., Bean, R., Doré, O., et al. 2007, ApJS, 170, 377 [NASA ADS] [CrossRef] [Google Scholar]
 Szapudi, I. 2001, in Annals of the New York Academy of Sciences, The Onset of Nonlinearity in Cosmology, ed. J. N. Fry, J. R. Buchler, & H. Kandrup, 927, 94 [Google Scholar]
 Szapudi, I., & Colombi, S. 1996, ApJ, 470, 131 [NASA ADS] [CrossRef] [Google Scholar]
 Szapudi, I., & Szalay, A. 1998, ApJ, 494, L41 [NASA ADS] [CrossRef] [Google Scholar]
 Szapudi, I., Postman, M., Lauer, T. R., & Oegerle, W. 2001, ApJ, 548, 114 [NASA ADS] [CrossRef] [Google Scholar]
 Tegmark, M., Eisenstein, D., Strauss, M. A., et al. 2006, Phys. Rev. D, 74, 123507 [Google Scholar]
 Teyssier, R. 2002, A&A, 385, 337 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Teyssier, R., Pires, S., Prunet, S., et al. 2009, A&A, 497, 335 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Tinker, J., Kravtsov, A. V., Klypin, A., et al. 2008, ApJ, 688, 709 [NASA ADS] [CrossRef] [Google Scholar]
 Tinker, J., Robertson, B. E., Kravtsov, A. V., et al. 2010, ApJ, 724, 878 [NASA ADS] [CrossRef] [Google Scholar]
 Valageas, P., & Nishimichi, T. 2011a, A&A, 527, A87 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Valageas, P., & Nishimichi, T. 2011b, A&A, 532, A4 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Vanderlinde, K., Crawford, T. M., & de Haan, T., et al. 2010, ApJ, 722, 1180 [NASA ADS] [CrossRef] [Google Scholar]
All Tables
All Figures
Fig. 1 The “hierarchical clustering ansatz” for the threepoint correlation function of Eq. (4). Each solid line corresponds to a twopoint correlation ξ, and ζ^{h} is written as the sum of these three diagrams, with a multiplicative factor b_{1}b_{2}b_{3}S_{3}/3. 

In the text 
Fig. 2 The two topologies of the fourpoint diagrams associated with the “hierarchical clustering ansatz” for the fourpoint correlation, as in Eq. (6). The numbers are the multiplicity factors of each diagram. 

In the text 
Fig. 3 The mean number density of dark matter halos per square degree, within redshift bins of width Δz = 0.1. We count all halos above the thresholds M_{∗} = 2 × 10^{13},10^{14}, and 5 × 10^{14}h^{1} M_{⊙}, from top down to bottom. We compare our analytical results (solid lines) with numerical simulations (dashed lines). 

In the text 
Fig. 4 The variance σ_{Ni} of the halo angular number densities of Fig. 3, for redshift bins Δz = 0.1 and an angular window of 50 deg^{2}. We compare our analytical results (solid lines) with numerical simulations (dashed lines). 

In the text 
Fig. 5 The shotnoise (dashed lines) and samplevariance (solid lines) errors for the angular number densities shown in Fig. 3, associated with a redshift binning of width Δz = 0.1, but up to z = 2, and an angular window of 50 deg^{2}. 

In the text 
Fig. 6 The signaltonoise ratios of number counts for an angular area ΔΩ = 50 deg^{2}, as in Figs. 3 and 4. We compare our analytical results (solid lines) with numerical simulations (dashed lines). 

In the text 
Fig. 7 The signaltonoise ratios of number counts for a total angular area ΔΩ = 50 deg^{2}, divided over independent subfields. We show the results obtained for the numbers of subfields (solid lines), 2 (dashed lines), and 4 (dotted lines). 

In the text 
Fig. 8 The angular power spectrum of the distribution of halos in the redshift bin 0.95 < z < 1.05. We plot both the exact result (38) (solid line) and Limber’s approximation (39) (dotted line). 

In the text 
Fig. 9 The shotnoise (dashed line) and samplevariance errors (29) for the angular number densities in the redshift bin 0.95 < z < 1.05, as a function of the radius θ_{s} of the angular window. The solid line is the exact sample variance, from Eqs. (38) and (41), while the dotted line is the result (27), which was used in Fig. 5 and involves both the flatsky and Limber’s approximations. 

In the text 
Fig. 10 The ratio of the exact samplevariance error (41) to the approximation (27), which uses both the flatsky and Limber’s approximations. We show this ratio as a function of the radius θ_{s} of the angular window, for several redshift bins, for halos above the mass threshold M > 10^{14} h^{1} M_{⊙}. Higher z corresponds to a higher ratio. 

In the text 
Fig. 11 The correlation matrix of Eq. (46) between redshift bins of width Δz = 0.1. We show as a function of j, for four values of i. In each case, at j = i. We consider halos above 10^{14} h^{1} M_{⊙} and an angular window of area (ΔΩ) = 50 deg^{2}. 

In the text 
Fig. 12 The mean halo correlation, , over ten comoving distance bins within 5 < r < 100 h^{1} Mpc, equally spaced in log (r). We integrate over halos within the redshift interval 0 < z < 0.8 and we compare our analytical results (solid lines) with numerical simulations (dashed lines). 

In the text 
Fig. 13 The covariance matrices (solid line) and C_{i,j} (dashed line) of the estimators and , for i = 4 associated with the distance bin 12.3 < r < 16.6 h^{1} Mpc, as a function of j. We show the results obtained for halos in the redshift range 0 < z < 0.8 with an angular window of 50 deg^{2}. Here we only consider the loworder terms given by Eqs. (67) and (69). 

In the text 
Fig. 14 The contributions C^{(2)} and C^{(3)} to the covariance of the Landy & Szalay estimator, along the diagonal i = j. As in Fig. 13, we only consider the loworder terms, given by Eq. (69). 

In the text 
Fig. 15 The low and highorder contributions to the covariance matrix along its diagonal. We again consider halos in the redshift range 0 < z < 0.8, with an angular window of 50 deg^{2}, above two mass thresholds. 

In the text 
Fig. 16 The low and highorder contributions to the covariance matrix , as in Fig. 15, but along one row. This corresponds to the fixed bin i = 4, associated with the distance bin 12.3 < r < 16.6 h^{1} Mpc, as a function of j. 

In the text 
Fig. 17 The covariance matrix of the Landy & Szalay estimator, along the diagonal i = j. We show our analytical results including all contributions (solid lines) or only loworder terms (dotted lines), and results from numerical simulations (dashed lines). 

In the text 
Fig. 18 The covariance matrix , as in Fig. 17, but along one row. This corresponds to the fixed bin i = 4, associated with the distance bin 12.3 < r < 16.6 h^{1} Mpc, as a function of j. 

In the text 
Fig. 19 Contour plots for the correlation matrix ℛ_{i,j}, defined as in Eq. (46) but for the full covariance matrix C_{ij} of the halo correlation. There are ten distance bins, over 5 < r < 100 h^{1} Mpc, equally spaced in log (r), as in previous figures. We consider halos in the redshift range 0 < z < 0.8, within an angular window of 50 deg^{2}, above the mass thresholds M > 2 × 10^{13} h^{1} M_{⊙} in the upper row, and M > 10^{14} h^{1} M_{⊙} in the lower row. Left panels: loworder contributions (67) for the Peebles & Hauser estimator. Middle panels: loworder contributions (69) for the Landy & Szalay estimator. Right panels: full correlation matrix, including the highorder contributions of Eqs. (75)–(77), for the Landy & Szalay estimator. 

In the text 
Fig. 20 The mean angular correlation, , over eight angular bins within 1.25 < θ < 50 arcmin, equally spaced in log (θ). We compare our analytical results (solid lines) with numerical simulations (dashed lines). 

In the text 
Fig. 21 The covariance matrices (solid line) and C_{i,j} (dashed line) of the estimators and ŵ_{i}, for i = 2 associated with the angular bin 2 < θ < 3.2 arcmin, as a function of j. We show the results obtained for halos in the redshift range 0 < z < 0.8, with an angular window of 400 deg^{2}, above the mass thresholds M_{ ∗ } = 2 × 10^{13} and 10^{14} h^{1} M_{⊙}, from bottom to top. Here we only consider the loworder contributions, given by Eqs. (93) and (94). 

In the text 
Fig. 22 The contributions C^{(2)} and C^{(3)} to the covariance of the Landy & Szalay estimator, along the diagonal i = j. As in Fig. 21, we only consider the loworder terms, given by Eq. (94). 

In the text 
Fig. 23 The low and highorder contributions to the covariance matrix along its diagonal. We again consider halos in the redshift range 0 < z < 0.8, with an angular window of 400 deg^{2}, above two mass thresholds. 

In the text 
Fig. 24 The low and highorder contributions to the covariance matrix , as in Fig. 23, but along one row. This corresponds to the fixed bin i = 2, associated with the angular bin 2 < θ < 3.2 arcmin, as a function of j. 

In the text 
Fig. 25 The covariance matrix along its diagonal. We show our analytical results including all contributions (solid lines) or only loworder terms (dotted lines), and results from numerical simulations (dashed lines). 

In the text 
Fig. 26 The covariance matrix , as in Fig. 25, but along one row. This corresponds to the fixed bin i = 4, associated with the angular bin 5 < θ < 8 arcmin, as a function of j. 

In the text 
Fig. 27 Contour plots for the correlation matrix ℛ_{i,j}, defined as in Eq. (46) but for the full covariance matrix C_{ij} of the halo angular correlation. There are eight angular bins, over 1.25 < r < 50 arcmin, equally spaced in log (θ), as in previous figures. We consider halos in the redshift range 0 < z < 0.8, with an angular window of 400 deg^{2}, above the mass thresholds M > 2 × 10^{13} h^{1} M_{⊙} in the upper row, and M > 10^{14} h^{1} M_{⊙} in the lower row. Left panels: loworder contributions (93) for the Peebles & Hauser estimator. Middle panels: loworder contributions (94) for the Landy & Szalay estimator. Right panels: full correlation matrix, including the highorder contributions of Eqs. (96)–(98), for the Landy & Szalay estimator. 

In the text 
Fig. 28 The mean angular number densities of Xray clusters per square degree, within redshift bins of width Δz = 0.1, for the XXL, DES, and SPT surveys. Error bars contain both the shotnoise and samplevariance contributions, from Eqs. (22) and (27). For DES we consider the mass thresholds M > 5 × 10^{13} h^{1} M_{⊙} and M > 5 × 10^{14} h^{1} M_{⊙} (smaller error bars), and for SPT the mass threshold M > 5 × 10^{14} h^{1}M_{⊙} (larger error bars shifted to the right). 

In the text 
Fig. 29 The mean correlation, from Eq. (58), over ten comoving distance bins within 5 < r < 100 h^{1} Mpc, equally spaced in log (r). We integrate over halos within the redshift interval 0 < z < 1, for the XXL, DES, and SPT surveys, as in Fig. 28 (again the error bars for SPT are slightly larger and shifted to the right with respect to those of DES, for M > 5 × 10^{14} h^{1} M_{⊙}). The error bars show the diagonal part of the covariance, , for the Landy & Szalay estimator, from Eqs. (69) and (75)–(77). 

In the text 
Fig. 30 The mean correlation, , for the clusters detected by DES over the redshift interval 1 < z < 2. Here we consider 20 distance bins within 5 < r < 100 h^{1} Mpc, equally spaced in log (r) (i.e. twice as many as in Fig. 29). 

In the text 
Fig. 31 The mean angular number densities of clusters within redshift bins of width Δz = 0.1. From top to bottom, we show a) halos above 5 × 10^{13} h^{1} M_{⊙} in Euclid, b) halos detected by Erosita with the selection function of the right panel in Fig. J.1, c), halos above 5 × 10^{14} h^{1} M_{⊙} in either Erosita or Euclid, and d) halos detected by Planck with the selection function of the middle panel in Fig. J.1. 

In the text 
Fig. 32 The mean correlation, , integrated over 0 < z < 1, as in Fig. 29. From top to bottom, we show a) halos above 5 × 10^{14} h^{1} M_{⊙} in either Erosita or Euclid, b) halos detected by Planck with the selection function of the middle panel in Fig. J.1, c) halos above 5 × 10^{13} h^{1} M_{⊙} in Euclid, and d) halos detected by Erosita with the selection function of the right panel in Fig. J.1. 

In the text 
Fig. 33 The mean correlation, , over the redshift interval 1 < z < 2, for the clusters detected by Erosita (upper curve, with ten distance bins) and Euclid (lower curve, with twenty distance bins). 

In the text 
Fig. 34 The ratio of the rms shotnoise contribution to the rms samplevariance contribution , of the covariance of the angular number densities N_{i} obtained for various surveys. (For DES and Euclid we only consider the case M > 5 × 10^{13} h^{1} M_{⊙}.) 

In the text 
Fig. 35 The ratio of the rms contributions and of the covariance matrix of the estimator . This is a measure of shotnoise effects. (For DES and Euclid we only consider the case M > 5 × 10^{13} h^{1} M_{⊙}.) 

In the text 
Fig. 36 The ratio of the rms highorder contribution (75)–(77) to the rms loworder contribution (second term in Eq. (69)) of the sample variance of the correlation ξ_{i} obtained for various surveys. (For DES and Euclid we only consider the case M > 5 × 10^{13} h^{1} M_{⊙}.) 

In the text 
Fig. B.1 Geometrical illustration of finitesize effects. Close to the survey boundary, part of the sphere of radius r extends beyond the observational cone and should not be counted. The left plot is a transverse view, orthogonal to the central line of sight, whereas the right plot is a view from a point far away on the line of sight. 

In the text 
Fig. I.1 Scaling of the numbercounts signaltonoise ratio by as computed in the Horizon simulation, see Sect. 2.2. Different configurations are displayed according to the total surveyed area ΔΩ, the number of subfields , and the mass limit. In the right caption, ΔΩ is expressed in deg^{2} and the mass unit is h^{1} M_{⊙}. 

In the text 
Fig. I.2 Same as Fig. I.1 but with a scaling that depends on the number of subfields: , with n = −0.6 

In the text 
Fig. J.1 Left panel: cluster mass associated with a 50%, 80%, or 95% detection probability (from bottom to top), for the XXL selection function C1, as a function of redshift. Middle panel: minimum detectable cluster mass, as a function of redshift, for the Planck space mission. Right panel: cluster mass associated with a 50%, 80%, or 95% detection probability (from bottom to top), for the Erosita selection function as a function of redshift (we consider a flux limit of 4 × 10^{14} erg s^{1} cm^{2} in the [0.5 − 2] keV band). 

In the text 
Fig. K.1 The ratio of the rms shotnoise contribution to the rms samplevariance contribution , of the covariance of the angular number densities N_{i}, as in Fig. 34. The fiducial curve that was shown in Fig. 34 is the solid line (mean WMAP7 cosmology), whereas the dashed, dotdashed, and dotted lines correspond to the three cosmologies where either h, Ω_{m}, or σ_{8}, is changed to the value given in the second line of Table K.1. 

In the text 
Fig. K.2 The ratio of the rms contributions and of the covariance matrix of the estimator , as in Fig. 35. The line styles are as in Fig. K.1 and Table K.1. 

In the text 
Current usage metrics show cumulative count of Article Views (fulltext article views including HTML views, PDF and ePub downloads, according to the available data) and Abstracts Views on Vision4Press platform.
Data correspond to usage on the plateform after 2015. The current usage metrics is available 4896 hours after online publication and is updated daily on week days.
Initial download of the metrics may take a while.