Free Access
 Issue A&A Volume 508, Number 1, December II 2009 93 - 106 Cosmology (including clusters of galaxies) https://doi.org/10.1051/0004-6361/200912486 15 October 2009

A&A 508, 93-106 (2009)

## Mass functions and bias of dark matter halos

P. Valageas

Institut de Physique Théorique, CEA Saclay, 91191 Gif-sur-Yvette, France

Received 14 May 2009 / Accepted 8 October 2009

Abstract
Aims. We revisit the study of the mass functions and the bias of dark matter halos.
Methods. Focusing on the limit of rare massive halos, we point out that exact analytical results can be obtained for the large-mass tail of the halo mass function. This is most easily seen from a steepest-descent approach, that becomes asymptotically exact for rare events. We also revisit the traditional derivation of the bias of massive halos, associated with overdense regions in the primordial density field.
Results. We check that the theoretical large-mass cutoff agrees with the mass functions measured in numerical simulations. For halos defined by a nonlinear threshold this corresponds to using a linear threshold instead of the traditional value 1.686. We also provide a fitting formula that matches simulations over all mass scales and obeys the exact large-mass tail. Next, paying attention to the Lagrangian-Eulerian mapping (i.e. corrections associated with the motions of halos), we improve the standard analytical formula for the bias of massive halos. We check that our prediction, which contains no free parameter, agrees reasonably well with numerical simulations. In particular, it recovers the steepening of the dependence on scale of the bias that is observed at higher redshifts, which published fitting formulae did not capture. This behavior mostly arises from nonlinear biasing.

Key words: large-scale structure of Universe - methods: analytical - gravitation

## 1 Introduction

The distribution of nonlinear virialized objects, such as galaxies or clusters of galaxies, is a fundamental test of cosmological models. First, this allows us to check the validity of the standard cosmological scenario for the formation of large-scale structures, where nonlinear objects form thanks to the amplification by gravitational instability of small primordial density fluctuations, built for instance by an early inflationary stage (e.g., Peebles 1993; Peacock 1998). For cold dark matter (CDM) scenarios (Peebles 1982), where the amplitude of the initial perturbations grows at smaller scales, this gives rise to a hierarchical process, where increasingly large and massive objects form as time goes on, as increasingly large scales turn nonlinear. This process has been largely confirmed by observations, which find smaller galaxies at very high redshifts (e.g., Trujillo et al. 2007) while massive clusters of galaxies (which are the largest bound objects in the Universe) appear at low redshifts (e.g., Borgani et al. 2001). Second, on a more quantitative level, statistical properties, such as the mass function and the two-point correlation of these objects, provide strong constraints on the cosmological parameters (e.g. through the linear growth factor D+(t) of density perturbations) and on the primordial fluctuations (e.g. through the initial density power spectrum PL(k)). For these purposes, the most reliable constraints come from observations of the most massive objects (rare-event tails) at the largest scales. Indeed, in this regime the formation of large-scale structures is dominated by the gravitational dynamics (baryonic physics, which involves intricate processes associated with pressure effects, cooling and heating, mostly occurs at galactic scales and below), which further simplifies as one probes quasi-linear scales or rare events where effects associated with multiple mergings can be neglected. Moreover, in this regime astrophysical objects, such as galaxies or clusters of galaxies, can be directly related to dark matter halos, and their abundance is highly sensitive to cosmological parameters thanks to the steep decline of the high-mass tail of the mass function (e.g., Evrard 1989).

Thus, the computation of the halo mass function (and especially its large-mass tail) has been the focus of many works, as it is one of the main properties measured in galaxy and cluster surveys that can be compared with theoretical predictions. Most analytical derivations follow the Press-Schechter approach (Press & Schechter 1974; Blanchard et al. 1992) or its main extension, the excursion set theory of Bond et al. (1991). In this framework, one attempts to estimate the number of virialized objects of mass M from the probability to have a linear density contrast  at scale M above some given threshold  . Thus, one identifies current nonlinear halos from positive density fluctuations in the initial (linear) density field, on a one-to-one basis. This is rather well justified for rare massive objects, where one can expect such a link to be valid since such halos should have remained well-defined objects until now (as they should have suffered only minor mergers). By contrast, small and typical objects have experienced many mergers and should be sensitive to highly non-local effects (e.g. tidal forces, mergers), so that such a direct link should no longer hold, as can be checked in numerical simulations (Bond et al. 1991). As noticed by Press & Schechter (1974), the simplest procedure only yields half the mass of the Universe in such objects (essentially because only half of the Gaussian initial fluctuations have a positive density contrast, whatever the smoothing scale), which they corrected by an ad-hoc multiplicative factor 2. In this respect the main result of the excursion set theory was to provide an analytical derivation of this missing factor 2, in the simplified case of a top-hat filter in Fourier space (Bond et al. 1991). Then, it arises from the fact that objects of mass larger than M are associated with configurations such that the linear density contrast goes above the threshold at some scale , which includes cases missed by the Press-Schechter prescription where the linear density contrast decreases below this scale M' so that at scale M (cloud-in-cloud'' problem). The characteristic threshold is usually taken as the linear density contrast reached when the spherical collapse dynamics predicts collapse to a zero radius. In an Einstein-de Sitter universe this corresponds to and to a nonlinear density contrast (assuming full virialization in half the turn-around radius). This linear threshold only shows a very weak dependence on cosmological parameters.

Numerical simulations have shown that the Press-Schechter mass function (PS) is reasonably accurate, especially in view of its simplicity. Thus, it correctly predicts the typical mass scale of virialized halos at any redshift, as could be expected since the large-mass behavior is rather well justified. In addition, it predicts a universal scaling that appears to be satisfied by the mass functions measured in simulations, that is, the dependence on halo mass, redshift and cosmology is fully contained in the ratio , where is the rms linear density fluctuation at scale M. However, as compared with numerical results it overestimates the low-mass tail and it underestimates the high-mass tail. This has led to many numerical studies which have provided various fitting formulae for the mass function of virialized halos, written in terms of the scaling variable  (or ) (Sheth & Tormen 1999; Jenkins et al. 2001; Reed et al. 2003; Warren et al. 2006; Tinker et al. 2008). We may note here that a theoretical model that attempts to improve over the PS mass function is to consider the ellipsoidal collapse dynamics within the excursion-set approach, to take into account the deviations from spherical symmetry for intermediate mass halos (Sheth et al. 2001). Note that, as described in the original paper (and emphasized in Robertson et al. 2009), at large mass this would recover the spherical collapse. However, the halo mass obtained by such methods is generally underestimated for non center-of-mass particles (since analytical computations assume that particles are located at the center of their halo, i.e. they only consider the linear densities within spherical cells centered on the point of interest). To correct for this effect, in practice one treats the threshold  as a free parameter, close to 1.6, to build fitting formulae from numerical simulations. For instance, this correction is contained in the parameter a in the exponential cutoff of Sheth et al. (2001).

A second property of virialized halos that can be used to constrain cosmological models, beyond their number density, is their two-point correlation. Indeed, observations show that galaxies and clusters do not obey a Poisson distribution but show significant large-scale correlations (e.g., McCracken et al. 2008; Padilla et al. 2004). In particular, their two-point correlation roughly follows the underlying matter correlation, up to a multiplicative factor b2, called the bias, that grows for more massive and extreme objects. Following the spirit of the Press-Schechter picture, Kaiser (1984) found that this behavior arises in a natural fashion if halos are associated with large overdensities in the Gaussian initial (linear) density field, above the threshold  . This was further expanded by Bardeen et al. (1986) and Bond & Myers (1996), who considered the clustering of peaks in the Gaussian linear density field. A simpler derivation, based on a peak-background split argument, and taking care of the mapping from Lagrangian to Eulerian space, was presented in Mo & White (1996). It provides a prediction for the bias b(M) as a function of halo mass, in the limit of large distance, , that agrees reasonably well with numerical simulations. However, as for the PS mass function, in order to improve the agreement with numerical results various fitting formulae have been proposed (Sheth & Tormen 1999; Hamana et al. 2001; Pillepich et al. 2009). Again, since the ellipsoidal collapse model reduces to the spherical dynamics for rare massive halos, it also requires free parameters to improve its accuracy, but the latter are consistent with those used for the mass function (Sheth et al. 2001).

In this article we revisit the derivation of the mass function and the bias of rare massive halos, following the spirit of the Press & Schechter (1974) and Kaiser (1984) approaches. That is, we use the fact that large halos can be identified from overdensities in the Gaussian initial (linear) density field. First, we briefly review in Sect. 2 some properties of the growth of linear fluctuations and of the spherical dynamics in CDM cosmologies. Next, we recall in Sect. 3 that in the quasi-linear regime (i.e. at large scales), the probability distribution of the nonlinear density contrast within spherical cells of radius r can be obtained from spherical saddle-points of a specific action , for moderate values of where shell-crossing does not come into play. We also discuss the properties of these saddle-points as a function of mass, scale and redshift. Then, we point out in Sect. 4 that this provides the exact exponential tail of the halo mass function. This applies to any nonlinear density contrast threshold  that is used to define halos, provided it is below the upper bound where shell-crossing comes into play. We compare our results with numerical simulations and we give a fitting formula that applies over all mass scales and satisfies the exact large-mass cutoff. Next, we recall in Sect. 5 that these results also provide the density profile of dark matter halos at outer radii (i.e. beyond the virial radius) in the limit of large mass. Finally, we study the bias of massive halos in Sect. 6, paying attention to some details such as the Lagrangian-Eulerian mapping, and we compare our results with numerical simulations. We conclude in Sect. 7.

## 2 Linear perturbations and spherical dynamics

We consider in this article a flat CDM cosmology with two components, (i) a non-relativistic component (dark and baryonic matter, which we do not distinguish here) that clusters through gravitational instability; and (ii) an uniform dark energy component that does not cluster at the scales of interest, with an equation-of-state parameter . For the numerical computations we shall focus on a CDM cosmology, where the dark energy is associated with a cosmological constant that is exactly uniform with w=-1. However, our results directly extend to curved universes (i.e. ) and to dark energy models with a possibly time-varying w(z), as long as we can neglect the dark energy fluctuations on the scales of interest, which is valid for realistic cases. Focussing on the case of constant w, we first recall in this section the equations that describe the dynamics of the background and of linear matter density perturbations, as well as the nonlinear spherical dynamics.

The evolution of the scale factor a(t) is determined by the Friedmann equation (Wang & Steinhardt 1998),

 (1)

where subscripts 0 denote current values at z=0, when a=1, and a dot denotes the derivative with respect to cosmic time t. On the other hand, the density parameters vary with time as

 (2)

Next, introducing the matter density contrast, , where  is the mean matter density, linear density fluctuations grow as

 (3)

where the subscript L denotes linear quantities. For numerical purposes it is convenient to use the logarithm of the scale factor as the time variable. Then, using the Friedmann Eq. (1) the linear growth factor D+(t) evolves from Eq. (3) as

 (4)

where we note with a prime the derivative with respect to , as . Then, the normalized linear growth factor, g(t), defined as

 (5)

obeys

 (6)

with the initial conditions and at .

In the following we shall also need the dynamics of spherical density fluctuations. For such spherically symmetric initial conditions, the physical radius r(t), that contains the constant mass M until shell-crossing, evolves as

 (7)

Note that is the mean density within the sphere of radius r(and not the local density at radius r). Using again the Friedmann Eq. (1) this reads as

 (8)

As for the linear growth factor D+(t), it is convenient to introduce the normalized radius y(t) defined as

 (9)

Thus, q(t) is the Lagrangian coordinate of the shell r(t), that is, the physical radius that would enclose the same mass M in a uniform universe with the same cosmology. This also implies that the density  writes as

 (10)

 (11)

Of course, we can check that in the linear regime, where , we recover Eq. (4) for the linear growth of . Then, to obtain the nonlinear density contrast, , associated with the linear density contrast, , at a given redshift z, we solve Eq. (11) with the initial condition and at some high redshift , with . For any redshift z this defines a mapping that fully describes the spherical dynamics before shell-crossing.

 Figure 1: The function that describes the spherical dynamics when there is no shell-crossing, at z=0. The solid line corresponds to the CDM cosmology (with ) and the dashed line to the Einstein-de Sitter case . The second peak corresponds to a second collapse to the center, but for our purposes we only need before first collapse. Open with DEXTER

 Figure 2: The linear density contrast associated with the nonlinear density contrast , through the spherical dynamics, as a function of the cosmological parameter at the redshift of interest. We show the three cases and 300. The dashed line is the fit (12) to the case . Open with DEXTER

We compare in Fig. 1 the function obtained at z=0 within the CDM cosmology that we consider in this article (solid line) with the Einstein-de Sitter case (dashed line), where it has a well-known parametric form (Peebles 1980). As is well known, we can check that the dependence on is very weak until full collapse to a point, which occurs at slightly lower values of for low  . We show in Fig. 2 the linear density contrasts, , associated with three nonlinear density contrasts, and 300, as a function of the cosmological parameter . In agreement with Fig. 1, they show a slight decrease for smaller  . For a simple fit (dashed line) is provided by

 (12)

which agrees with the exact curve to better than for , which covers the range of practical interest.

In this article we consider Gaussian initial fluctuations, which are fully defined by the linear density power spectrum

 (13)

where is the Dirac distribution and we normalized the Fourier transform as

 (14)

where and are the comoving spatial coordinate and wavenumber. This gives for the linear density two-point correlation
 = = (15)

We also introduce the smoothed density contrast, , within the sphere of radius r and volume V around position ,

 (16)

with a top-hat window that reads in Fourier space as

 (17)

Then, in the linear regime, the cross-correlation of the smoothed linear density contrast at scales r1 and r2 and positions and reads as
 = = (18)

In particular, is the usual rms linear density contrast at scale r.

## 3 Distribution of the density contrast

We recall here that in the quasi-linear limit, , the distribution of the nonlinear density contrast at scale r can be derived from a steepest-descent method (Valageas 2002a). In agreement with the alternative approach of Bernardeau (1994a), this shows that rare-event tails are dominated by spherical saddle-points, which we use in Sects. 4-6 to obtain the properties of massive halos.

Since the system is statistically homogeneous we can consider the sphere of radius r centered on the origin . Then, we first introduce the cumulant generating function ,

 (19)

which determines the distribution through the inverse Laplace transform

 (20)

In Eq. (19) we rescaled the cumulant generating function by a factor so that it has a finite limit in the quasi-linear limit for the Gaussian initial density fluctuations that we study in this article (Bernardeau et al. 2002). In particular, its expansion at y=0 reads as

 (21)

Then, the average (19) can be written as the path integral

 (22)

where CL-1 is the inverse matrix of the two-point correlation (15) and the action reads as

 (23)

Here is the nonlinear functional that affects to the initial condition defined by the linear density field the nonlinear density contrast within the sphere of radius r, and we note . Note that the action  does not depend on the normalization of the linear power spectrum since both and CL are proportional to PL.

In the quasi-linear limit, , as shown in Valageas (2002a) the path integral (22) is dominated by the minimum of the action ,

 (24)

Using the spherical symmetry of the top-hat window W, in agreement with the approach of Bernardeau (1994a), one obtains a spherical saddle-point with the radial profile

 (25)

where and q is the Lagrangian coordinate associated with the Eulerian radius r through the spherical dynamics recalled in Sect. 2, and q' is a dummy Lagrangian coordinate along the radial profile (hereafter we denote by the letter q Lagrangian radii, which are associated with the linear density field, whereas r denotes Eulerian radii associated with the nonlinear density contrast ). Thus, the radii q' and r' are related by

 (26)

where the function , that describes the spherical dynamics, was obtained below Eq. (11) and shown in Fig. 1.

We can note that the profile (25) is also the mean conditional profile of the linear density contast , under the constraint that is it equal to a given value at a given radius q (e.g., Bernardeau 1994a). The reason why the nonlinear dynamics gives back this result is that the nonlinear density contrast only depends on the linear density contrast at the associated Lagrangian radius q, through the mapping . Indeed, as long as shell-crossing does not modify the mass enclosed within the shell of Lagrangian coordinate q, its dynamics is independent of the motion of inner and outer shells (thanks to Gauss theorem). Then, in order to obtain the minimum of the action we could proceed in two steps. First, for arbitrary Lagrangian radius q and linear contrast , we minimize with respect to the profile over . From the previous argument, only the second Gaussian term in (23) varies so that this partial minimization leads to the profile (25) (indeed, for Gaussian integrals the saddle-point method is exact). Second, we minimize over the Lagrangian radius q (or equivalently over or ), which leads to Eq. (29) below. Here we also use the fact that a spherical saddle-point with respect to spherical fluctuations is automatically a saddle-point with respect to arbitrary non-spherical perturbations and it can be seen that for small y one obtains a minimum (then we assume that at finite ystrong deviations from spherical symmetry do not give rise to deeper minima, which seems natural from physical expectations). We refer to Valageas (2002a, 2009b) for more detailed derivations.

Note that the shape of the linear profile (25) depends on the shape of the linear power spectrum, whence on the mass scale M of the saddle point for a curved CDM linear power spectrum, but not on redshift. We show in Fig. 3 the profile (25) obtained for several masses M. For a power-law linear power spectrum, of slope n, Eq. (25) leads to at large radii, . Then, since for a CDM cosmology n increases at larger scales, the profile shows a steeper falloff at large radii for larger mass, in agreement with Fig. 3 (in this section we consider a CDM cosmology with ).

 Figure 3: The radial profile (25) of the linear density contrast of the saddle point of the action . We show the profiles obtained with a CDM cosmology for the masses M=1011,1012,1013,1014 and . A larger mass corresponds to a lower ratio at large radii q'/q>1. Open with DEXTER
 Figure 4: The Lagrangian map given by the spherical dynamics (neglecting shell-crossing) for the saddle-point (25) with a nonlinear density contrast at the Eulerian radius r. We plot the curves obtained at z=0 for several masses, as in Fig. 3. Inner shells have already gone once through the center (hence their dynamics is no longer exactly given by Eq. (11)) but they have not crossed the radius r yet. Open with DEXTER

We show in Fig. 4 the Lagrangian map, , given by the spherical dynamics (i.e. the function where we neglect shell-crossing) for the saddle-point (25), with a nonlinear density contrast at the Eulerian radius r, at redshift z=0. Inner shells have already gone once through the center but they have not reached radius r yet. Even though their dynamics is no longer exactly given by Eq. (11), an exact computation would give the same property as the increasing mass seen by these particles, as they pass outer shells, should slow them down as compared with the constant-mass dynamics. In agreement with Fig. 3, for larger masses, which have a larger central linear density contrast, shell-crossing has moved to larger radii (the local maximum of r'/r, to the left of r'=0, is higher).

 Figure 5: The nonlinear density contrast , beyond which shell-crossing must be taken into account, as a function of redshift for several mass scales. Open with DEXTER

Next, the amplitude and the minimum are given as functions of y by the implicit equations (Legendre transform)

 (27)

where the function is defined from the spherical dynamics through the parametric system (Valageas 2002a; Bernardeau 1994b)

 (28)

The system (27) also reads as

 (29)

Note that the function , whence the cumulant generating function , depends on the shape of the linear power spectrum through the ratio of linear power at Eulerian and Lagrangian scales r and q. Finally, from the inverse Laplace transform (20), a second steepest-descent integration over the variable y gives (Valageas 2002a)

 (30)

Thus, in the quasi-linear limit, the distribution is governed at leading order by the Gaussian weight of the linear fluctuation that is associated with the nonlinear density contrast  through the spherical dynamics. We can note that Eq. (30) could also be obtained from a Lagrange multiplier method, without introducing the generating function  . Indeed, in the rare-event limit, where  is governed by a single (or a few) initial configuration, we may write

 (31)

That is, is governed by the maximum of the Gaussian weight e subject to the constraint (assuming there are no degenerate maxima). Then, we can obtain this maximum by minimizing the action of Eq. (23), where y plays the role of a Lagrange multiplier. This gives the saddle-point (25), and the amplitude and the radius q are directly obtained from the constraint , as in Eq. (26). Then, we do not need the explicit expression of the Lagrange multiplier y, as this is sufficient to obtain the last expression of the asymptotic tail (30). Nevertheless, it is useful to introduce the generating function  , which makes it clear that the Lagrange multiplier y is also the Laplace conjugate of the nonlinear density contrast  as in Eq. (19), since it is also of interest by itself, as it yields the density cumulants through the expansion (21).

The asymptotic (30) holds in the rare-event limit. This corresponds to both the quasi-linear limit, at fixed density contrast , and to the low-density limit, at fixed , as long as there is no shell-crossing. This latter requirement gives a lower boundary , for linear power spectra with a slope n>-1, and an upper boundary , for any linear spectrum (Valageas 2002b). This upper boundary was shown in Fig. 5 for a CDM cosmology, for several masses.

Indeed, at large positive density contrast, shell-crossing always occurs, as seen in Figs. 4 and 5. This invalidates the mapping obtained from Eq. (11), as mass is no longer conserved within the Lagrangian shell q, so that the asymptotic behavior (30) is no longer exact. In fact, as shown in Valageas (2002b), after shell-crossing it is no longer sufficient to follow the spherical dynamics, even if we take into account shell-crossing. Indeed, a strong radial-orbit instability develops so that the sensitivity to initial perturbations actually diverges when particles cross the center of the halo. Then, the functional is singular at such spherical states (i.e. it is discontinuous as infinitesimal deviations from spherical symmetry lead to a finite change of ) and the path integral (22) is no longer governed by spherical states that have a zero measure. As noticed above, this also means that, in the limit of rare massive halos, the nonlinear density threshold  separates outer shells with a smooth radial flow from inner shells with a significant transverse velocity dispersion. Thus, marks the virialization radius where isotropization of the velocity tensor becomes important, in agreement with numerical simulations (e.g., Cuesta et al. 2008).

It is interesting to note that a similar approach can be developed for the adhesion model'' (Gurbatov et al. 1989), where particles move according to the Zeldovich dynamics (Zeldovich 1970) but do not cross because of an infinitesimal viscosity (i.e. they follow the Burgers dynamics in the inviscid limit). Moreover, in the one-dimensional case, with a linear power-spectrum slope n=-2 or n=0 (i.e. the linear velocity is a Brownian motion or a white noise), it is possible to derive the exact distribution by other techniques and to check that it agrees with the asymptotic tail (30), as seen in Valageas (2009a,b).

## 4 Mass function of collapsed halos

The quasi-linear limit (30) of the distribution clearly governs the large-mass tail of the mass function , where we define halos as spherical objects with a fixed density contrast . Indeed, since the Lagrangian radius q is related to the halo mass by , massive halos correspond to large Eulerian and Lagrangian radii, and the limit corresponds to the quasi-linear limit . Then, going from the distribution to the mass function n(M) can introduce geometrical power-law prefactors since halos are not exactly centered on the cells of a fixed grid, as discussed in Betancort-Rijo & Montero-Dorta (2006), but the exponential cutoff remains the same as in Eq. (30), whence

 (32)

where with . A simple approximation for the mass function that satisfies this large-mass falloff can be obtained from the Press-Schechter method (PS), see Press & Schechter (1974), but using the actual linear threshold associated with the nonlinear threshold that defines the halo radius r, rather than the linear threshold associated with the spherical collapse down to a point. This yields for the number of halos in the range per unit volume,

 (33)

with

 (34)

The mass function (34) includes the usual prefactor 2 that gives the normalization

 (35)

which ensures that all the mass is contained in such halos. However, we must note that the power-law prefactor in (34) has no strong theoretical justification since only the exponential cutoff (32) was exactly derived from (30). In principle, it could be possible to derive subleading terms (whence power-law prefactors) in the quasi-linear limit for , by expanding the path integral (22) around the saddle-point (25). Then, one would also need to take more care of the prefactors that would arise as one goes from the distribution to the mass function n(M). However, this expansion would not allow one to derive the shape of the mass function at low masses, as this regime is far from the quasi-linear limit and involves multiple mergers far from spherical symmetry. Until a new method is devised to handle this regime, one must treat the prefactors as free parameters to be fitted to numerical simulations, in order to describe the mass function from small to large masses.

Here we can note that in the Press-Schechter approach (and in most models) the mass function (34) is obtained from the linear density reached within the sphere centered on each mass element. That is, a particle is assumed to be part of a halo of mass larger than M if the sphere of mass M (or a sphere of mass larger than M in the excursion set approach of Bond et al. 1991), centered on this particle, has a linear density contrast above a threshold . As pointed out in Sheth et al. (2001), and discussed in their Sect. 3, since all particles are not located at the center of their parent halo, the correct criteria should rather be that the particle belongs to a sphere, not necessarily centered on this point, of mass greater than M, that has collapsed by the redshift of interest. Within the excursion set approach of Bond et al. (1991), this means that one should consider the first-crossing distributions associated with center-of-mass particles, rather than with randomly chosen particles, as argued in Sheth et al. (2001). Then, the latter suggest that this could modify the numerical factor in the exponential tail of the mass function, that is, lead to a smaller factor in Eq. (32) than the one associated with spherical (or ellipsoidal) collapse dynamics. We point out here that this effect should only give subleading corrections to the tail (32), so that the factor in Eq. (32) remains exactly given by the spherical collapse dynamics.

This can be seen from the fact that the same effect would apply to the density probability distribution , as randomly placed cells of radius r are typically not centered on halo profiles. Nevertheless, the results (27)-(31) are exact in the quasi-linear limit (as can also be checked by the comparison with standard perturbation theory for the cumulant generating function  ), and off-center effects would be included in the subleading terms, computed as usual by expanding the path integral (22) around its saddle-point, which would typically give the power-law prefactor to the tail (30). In terms of the halo mass function itself, this point was also studied in Betancort-Rijo & Montero-Dorta (2006), who found as expected that at large mass such a geometrical factor only modifies that power-law prefactor.

As for the probability distribution , it is interesting to note that a similar high-mass tail can be derived for the adhesion model'', where halos are defined as zero-size objects (shocks). Again, in the one-dimensional case, for both n=-2 and n=0, where the exact mass function can be obtained by other means, one can check that it agrees with the analog of the asymptotic tail (32) (Valageas 2009a,b,c). Thus, we can check that in these two non-trivial examples the leading-order terms for the large-mass decays of and n(m) are exactly set by saddle-point properties and are not modified by the off-center effects discussed above.

We can note that the explanation of the large-mass tail (32) by the exact asymptotic result of the steepest-descent method described in Sect. 3 agrees nicely with numerical simulations. This is most clearly seen in Figs. 3 and 6 of Robertson et al. (2009), who trace back the linear density contrast of the Lagrangian regions that form halos at z=0. Their results show that the distribution of linear contrasts , measured as a function of mass (or of ), has a roughly constant lower bound , with , and an upper bound that grows with . We must note however that the difference between 1.59 and 1.6754(which would be the standard threshold in their CDM cosmology with ) is too small to discriminate both values from the results shown in their figures, so that these numerical simulations alone do not give the asymptotic value to better than about 0.2. They obtain the same results when they define halos by nonlinear density contrasts or , with a lower bound that grows somewhat with . In terms of the approach described above, this behavior expresses the fact that the most probable way to build a massive halo of nonlinear density contrast  is to start from a Lagrangian region of linear density contrast , which obeys the spherical profile (25) and corresponds to the saddle-point of the action (23). As recalled in Sect. 3, the path integral (22) is increasingly sharply peaked around this initial state at large mass scales, which explains why the dispersion of linear density contrasts measured in the simulations decreases with . At smaller mass, one is sensitive to an increasingly broad region around the saddle-point, which mostly includes non-spherical initial fluctuations. Since these initial conditions are less efficient to concentrate matter in a small region one needs a larger linear density contrast to reach the same nonlinear threshold  within the Eulerian radius r, which is why the distribution is not symmetric and mostly broadens by increasing its typical upper bound . In principles, it may be possible to estimate the width of this distribution, at large masses, by expanding the action around its saddle-point.

 Figure 6: The mass function at redshift z=0 of halos defined by the nonlinear density contrast . The points are the fits to numerical simulations from Sheth & Tormen (1999), Jenkins et al. (2001), Reed et al. (2003), Warren et al. (2006) and Tinker et al. (2008). The dashed line ( '') is the usual Press-Schechter mass function, while the solid line that goes close to the Press-Schechter prediction at small mass is the mass function (34) with the exact cutoff (32). Here we have . The second solid line that agrees with simulations over the whole range is the fit (36), that obeys the same large-mass exponential cutoff. Open with DEXTER

We compare in Fig. 6 the prediction (34) (solid line labeled as ) with results from numerical simulations, for the nonlinear threshold at redshift z=0. In our case, this corresponds to a linear density contrast , that is obtained from Fig. 2 or the fit (12). As in Sect. 3, we consider a CDM cosmology with . This corresponds to the cosmological parameters of the largest-box numerical simulations of Tinker et al. (2008), which allow the best comparison with the theoretical predictions, as they also define halos by the density threshold with a spherical-overdensity algorithm. The numerical results are the fits to the mass function given in Sheth & Tormen (1999) (ST''), Jenkins et al. (2001) (J''), Reed et al. (2003) (R''), Warren et al. (2006) (W'') and Tinker et al. (2008) (T''). Note that these mass functions are defined in slightly different fashions, using either a spherical-overdensity or friends-of-friends algorithm, and density contrast thresholds that vary somewhat about . However, they agree rather well, as the dependence on  is rather weak. This can be understood from Fig. 1, which shows that around the linear density contrast has a very weak dependence on . We also plot the usual Press-Schechter prediction (dashed line labeled ), that amounts to replace by in Eq. (34) (since at z=0). We can see that using the exact value significantly improves the agreement with numerical simulations at large masses. Note that there are no free parameters in Eq. (34). Of course, at small masses the mass function (34) closely follows the usual Press-Schechter prediction and shows the same level of disagreement with numerical simulations. This is expected since only the exponential cutoff (32) has been exactly derived from Sect. 3. Since at large masses the power-law prefactor in Eq. (34) is also unlikely to be correct, as discussed above, we give a simple fit that matches the numerical simulations from small to large masses (solid line that runs through the simulation points) while keeping the exact exponential cutoff:

 (36)

with

 (37)

At higher redshift or for other one simply needs to use the relevant mapping , shown in Fig. 2 or given by the simple fit (12) for . This mass function also satisfies the normalization (35), whatever the value of the threshold .

Thus, we suggest that mass functions of virialized halos should be defined with a fixed nonlinear density threshold, such as , and fits to numerical simulations should use the exact exponential cutoff of Eq. (32), with the appropriate linear density contrast , rather than treating this as a free parameter. This would automatically ensure that the large mass tail has the right form (up to subleading terms such as power-law prefactors), as emphasized by the reasonable agreement with simulations of Eq. (34) at large masses. Moreover, it is best no to introduce unnecessary free parameters that become partly degenerate. Here we must note that Barkana (2004) had already noticed that, taking the spherical collapse at face value and defining halos by a nonlinear threshold (he chose ), one should use the linear threshold for the Press-Schechter mass function. Unfortunately, noticing that the value of given by fits to numerical simulations was even lower, he concluded that this was not sufficient to reconcile theoretical predictions with numerical results. As shown by Fig. 6, this is not the case, as the parameter used in fitting formulae is partly degenerate with the exponents of the power-law prefactors, so that it is possible to match numerical simulations while satisfying the large-mass tail (32). Of course, the actual justification of the asymptote (32) is provided by the analysis of Sect. 3, which shows that below the upper bound the spherical collapse is indeed relevant and asymptotically correct at large masses, as it corresponds to the saddle-point of the action (23).

Note that the result (32) also implies that the mass function  is not exactly universal, since the mapping shows a (very) weak dependence on cosmological parameters, see Fig. 2. Using the exact tail (32) should also improve the robustness of the mass function with respect to changes of cosmological parameters and redshifts.

 Figure 7: The mass functions at z=0 of halos defined by the nonlinear density contrasts ( upper panel) and ( lower panel). The points show the numerical simulations of Tinker et al. (2008). The dashed line is the usual Press-Schechter mass function, while the solid lines correspond to Eqs. (34) and (36), with now ( upper panel) and ( lower panel). Open with DEXTER

Next, we compare in Fig. 7 the mass functions obtained at z=0 for the density thresholds and with the numerical simulations from Tinker et al. (2008), who also considered these density thresholds. We show the usual Press-Schechter mass function (dashed line) and the results obtained from Eqs. (34) and (36) (solid lines), with now and . We can see that the large mass tail remains consistent with the simulations, but for the case it seems that the shape of the mass function at low and intermediate masses is modified and cannot be absorbed through the rescaling of . It appears to follow Eq. (34) rather than the fit (36), but this is likely to be a mere coincidence. We should note that Fig. 5 shows that for massive halos, so that shell-crossing should be taken into account and the tail (32) is no longer exact for , although it should still provide a reasonable approximation, as checked in Fig. 7. Thus, even though Tinker et al. (2008) also studied higher density thresholds, we do not consider such cases here as the tail (32) no longer applies.

## 5 Halo density profile

 Figure 8: The density profile of rare massive halos, for several masses M. For each mass the redshift is such that , as the theoretical prediction from Eq. (25) (solid line) only applies to rare events. The second branch that appears at small radii is due to the shell-crossing, hence the theoretical prediction only holds to the right of this branch. The points are fits to numerical simulations, based on an NFW profile (Dolag et al. 2004; Duffy et al. 2008) or an Einasto profile (Duffy et al. 2008). Note that we show the mean density within radius r', , rather than the local density at radius r', so that the NFW and Einasto profiles are integrated once. Open with DEXTER

As for the mass function n(M), the analysis of Sect. 3 shows that the density profile of rare massive halos is given by the spherical saddle-point (25), see also Barkana (2004) and Prada et al. (2006). This holds for halos selected by some nonlinear density threshold in the limit of rare events, provided shell-crossing has not occurred beyond the associated radius r. In particular, this only applies to the outer part of the halo since in the inner part, at , shell-crossing must be taken into account. Then, as discussed in Sect. 3, a strong radial-orbit instability comes into play and modifies the profile in this inner region, as deviations from spherical symmetry govern the dynamics and the virialization process (Valageas 2002b).

We compare in Fig. 8 the nonlinear density profile obtained from Eq. (25) with fits to numerical simulations. We plot the overdensity within radius r', , as a function of radius r. This is again obtained from Eq. (25) with the mapping and . Since this only applies to the limit of rare events, we choose for each mass M a redshift z such that . Thus, smaller masses are associated with higher redshifts. The results do not significantly depend on the precise value of the criterium used to define rare events, here . The points in Fig. 8 are the results obtained from a Navarro et al. profile (Navarro et al. 1997, NFW),

 (38)

or an Einasto profile (Einasto 1965),

 (39)

For the NFW profile, the characteristic radius rs is obtained from the concentration parameter, c(M,z)=r/rs, where as in the previous sections r is the Eulerian radius where (i.e. r=r200). Then, we use in Fig. 8 the two fits obtained in Dolag et al. (2004) and Duffy et al. (2008) from numerical simulations, of the form c(M,z)= a (M/M0)b (1+z)c. For the Einasto profile, we use the fits obtained in Duffy et al. (2008) for the concentration parameter, c(M,z)=r/r-2, and for the exponent (written as a quadratic polynomial over as in Gao et al. 2008). Then, we integrate over r' the profiles (38)-(39), to obtain the overdensity profiles shown in Fig. 8 (as is the mean density within radius r'and not the local density at radius r' as in Eqs. (38)-(39)).

We can check in Fig. 8 that our results agree reasonably well with these fits to numerical simulations over the range where both predictions are valid. Note that our prediction has no free parameter, since it is given by the saddle-point profile (25). At large radii we recover the mean density of the Universe, while the numerical profiles (38)-(39) go to zero, but this is an artefact of the forms (38)-(39), that were designed to give a sharp boundary for the halos and are mostly used for the high-density regions. For a study of numerical simulations at outer radii, see Figs. 8 and 10 of Prada et al. (2006) which recover the mean density of the Universe at large scales. At small radii, the second branch that makes a turn somewhat below r is due to shell-crossing that makes the function bivaluate. Below the maximum shell-crossing radius, for each Eulerian radius r' there are two Lagrangian radii q', a large one that corresponds to shells that are still falling in, and a smaller one that corresponds to shells that have already gone once through the center (note that in agreement with Fig. 4, shell-crossing appears at slightly smaller relative radii for smaller mass). Then, the theoretical prediction only holds to the right of this second branch, where there is only one branch and no shell-crossing. Note that the theoretical prediction (25) explicitly shows that the halo density profiles are not universal. Within the phenomenological fits (38)-(39) this is parameterized through the dependence on mass and redshift of the concentration parameter and of the exponent . However, we can see that over the regime where Eq. (25) applies the local slope of the halo density is , which explains the validity of the fits (38)-(39) in this domain. Unfortunately, our approach cannot shed light on the inner density profile, where , which is the region of interest for most practical applications of the fits (38)-(39).

The same approach, based on the spherical collapse, was studied in greater details in Betancort-Rijo et al. (2006) and Prada et al. (2006). They consider the typical'' profile (25) that we study here, as well as mean'' and most probable'' profiles. In agreement with the steepest-descent approach of Sect. 3, they find that the most probable profile closely follows the typical profile (25) and they obtain a good match with numerical simulations for massive halos, paying particular attention to outer radii. Therefore, we do not further discuss halo profiles here.

## 6 Halo bias

In addition to their multiplicity and their density profile, a key property of virialized halos is their two-point correlation function. At large scales it is usually proportional to the matter density correlation, up to a multiplicative factor b2, called the bias of the specific halo population. We revisit in this section the derivation of the bias of massive halos, following Kaiser (1984), and we point out that paying attention to some details it is possible to reconcile the theoretical predictions with numerical simulations, without introducing any free parameter.

As seen in the previous sections, rare massive halos can be identified with rare spherical fluctuations in the initial (linear) density field. More precisely, as in Sect. 3 we may consider the bivariate density distribution, , of the density contrasts , in the spheres of radii r1,r2, centered at points . Thus, we introduce as in Eq. (19) the double Laplace transform ,

 = = (40)

where is the linear variance (18) at some scale r. If r1=r2 we may take r=r1=r2, otherwise it can be any intermediate scale, as the results do not depend on this factor. The only requirement is that  should scale as and , which is obtained by choosing for instance a fixed ratio . Then, the probability distribution can be obtained from the cumulant generating function through a double inverse Laplace transform, as in Eq. (20). Next, can be written in terms of the linear density field as a path integral, such as (22), with an action that now reads as

 (41)

Then, for rare events the tail of the distribution is governed by the last Gaussian weight of (41), as in Eq. (30),

 (42)

where is the relevant saddle-point of the action . Unfortunately, since the action (41) is no longer spherically symmetric, it is not possible to obtain an explicit expression of its minimum. Therefore, we must rely on some simple approximations. In the limit of large distance, , between the positions of both halos, the linear field around the Lagrangian positions , of the halos should follow the profile (25) (we note the Lagrangian coordinates to avoid confusion with the Lagrangian radii qi). Thus, we neglect the tidal forces of the halos, but we keep track of the mean displacement of each halo, due to the gravitational attraction from the other one, by distinguishing their Lagrangian positions , from their Eulerian positions . Within this approximation, the distribution can be estimated from the Gaussian distribution of the linear density contrasts , at positions , within the Lagrangian spheres of radii q1 and q2, with the mapping (26). This closely follows the approach introduced in Kaiser (1984), except for the distinction between  and . The joint distribution reads as (see also Kaiser 1984; Politzer & Wise 1984)

 (43)

where M is the linear covariance matrix,

 (44)

Here we defined from Eq. (18), and with . The inverse of the matrix M writes as

 (45)

This yields
 = (46)

Next, defining the real-space halo correlation as the fractional excess of halo pairs (Kaiser 1984; Peebles 1980),
 (47)

which also gives the conditional probability,

 (48)

we write
 (49)

where the mass of each halo is given by and are the linear density contrasts which are associated with the nonlinear density contrasts that define the halos, as in Sect. 4 and Fig. 2. In Eq. (49) the factor models the effects associated with the mapping from Lagrangian to Eulerian space. Indeed, in the limit of rare massive halos and large separation, the number of neighbors is conserved and we write (for an integral form of the conservation of pairs, or neighbors, see Peebles 1980). We take , where we define as the local nonlinear density contrast at Eulerian distance x12 from a halo of mass M, obtained from the profile (25), and we choose (which obeys the symmetry ), whence . This reflects the fact that the gravitational attraction of a massive halo pulls matter towards it center with a strength that depends on distance, so that the Jacobian is different from unity. In a sense this is a tidal effect, as a locally volume-preserving displacement would not affect the number densities, which is why we take for the density contrast at distance x12 and not the density contrast within radius x12. We may note that a local bias model (Mo & White 1996), using a peak-background split argument (Efstathiou et al. 1988), would rather give a quadratic factor . However, we prefer to keep the linear factor (49) as it also remains consistent in case of large displacements. This also expresses the fact that all pairs cannot simultaneously get closer (as fluctuations grow some objects move closer but they also further separate from other emerging groups). Then, defining the halo bias as

 (50)

where r=x12 and is the nonlinear matter correlation, we obtain the bias from Eq. (49). Note that the bias (50) does not factorize, , because of the terms  and .

At large separation, , we are in the linear regime, so that the matter correlation reads as , and the local density contrast is small and close to the linear density contrast . Note that this is the density contrast at Lagrangian radius s, and not the mean density contrast within radius s. Therefore, it is related to Eq. (25) by

 (51)

whence

 (52)

with for instance, and using Eq. (18)

 (53)

Then, from Eq. (50), the bias of halos defined by the same linear threshold, , reads as
 b2M1,M2(r) = (54)

For equal-mass halos this simplifies as
 b2M(r) = (55)

Note that the argument of the exponential is not necessarily small, as stressed in Politzer & Wise (1984). Indeed, we only assumed a large separation limit, i.e. , and a rare-event limit, . Thus, at fixed (small) ratio , we obtain a large exponent in the limit of large-mass halos, . Then, keeping the full expressions (54) or (55) gives a nonlinear bias, since b2M(r) is not a simple number and shows a non-trivial scale dependence, as will be clearly seen in Fig. 11 below. Indeed, these results cannot be recovered through a linear biasing scheme, where the fluctuations of the halo number density field, , are written as (where the matter density field is smoothed over some larger scale R), which would lead to whereas Eqs. (54)-(55) generate powers of all orders over . In the local bias framework (Fry & Gaztanaga 1993; Mo et al. 1997), where one writes , this could be interpreted as non-zero bias coefficients bi for . However, Eqs. (54)-(55) are not equivalent to such a local model, inspired from a peak-background split argument (Efstathiou et al. 1988). Indeed, they do not involve any external smoothing scale R and they depend on the three linear correlations and .

Finally, we must express the Lagrangian separation s in terms of the Eulerian distance r. At lowest order we again consider each halo as a test particle that falls into the potential well built by the other halo (i.e. we neglect backreaction effects). Then, from the analysis of Sect. 3 and the linear profile (25), we know that a test particle at Lagrangian distance q' from one halo has moved to position r', according to the mapping (26). Using Eq. (25) this gives at first order

 (56)

since at large distance we have . Therefore, we obtain the Lagrangian separation s as the solution of the implicit equation

 (57)

where again we only kept the first-order term and we took into account the displacements of both halos. Together with Eq. (54), or Eq. (55), this defines our prediction for the bias of massive collapsed halos. Note that this approach also applies to the cross-correlation between different redshifts.

At large separation, , and fixed mass (i.e. fixed ), we may linearize the bias (55) over as

 (58)

For large masses, where , but not too large, so that the exponent in (55) is still small, this gives

 (59)

Thus we recover the result of Kaiser (1984) and Mo & White (1996), except for the multiplicative factor . It expresses the facts that halos only probe the linear density field smoothed over the Lagrangian scale q (i.e. the formation of a halo does not depend on wavelengths much smaller than its radius) and that halos have moved from distance s to r by their mutual gravitational attraction. This factor yields a weak scale dependence for b(M), as the slope of the linear power spectrum slowly varies with scale r. We can also note that at lowest order over we may write the solution of Eq. (57) as

 (60)

which provides a simple explicit expression for s.

 Figure 9: The halo bias , as a function of , at redshift z=0 and distance r=50  h-1 Mpc. The solid line is the theoretical prediction (55)-(57), and the dashed line is the bias obtained in Mo & White (1996). The points are the fits to numerical simulations, from Sheth et al. (2001) (crosses) and Pillepich (2009) (circles). Open with DEXTER

We compare in Figs. 9-11 the bias obtained from Eqs. (55), (57), with fits to numerical simulations. We first show in Fig. 9 our prediction as a function of halo mass, at redshift z=0 and distance r=50  h-1 Mpc. This is typically the scale that is considered in numerical simulations to compute the large-scale bias, as b(r) is expected to be almost constant at large scales (Kaiser 1984; Mo & White 1996), see also Eq. (59). We also plot the standard theoretical prediction from Mo & White (1996) (dashed line). We can see that our result (55)-(57) agrees rather well with numerical simulations and the popular fit from Sheth et al. (2001). As expected, it follows the trend of the prediction from Mo & White (1996), since both derivations follow the spirit of Kaiser (1984) (i.e. one identifies halos from overdensities in the linear density field) and they agree at large scale for rare massive halos, up to a factor of order unity, as seen in Eq. (59). Note that Eqs. (55)-(57) only apply to the rare-event limit, as for small objects the approximations used in the derivation no longer apply. In particular, halos can no longer be considered as spherical isolated objects, and one should take into account merging effects. Note that this caveat also applies to other analytical approaches, such as Kaiser (1984) and Mo & White (1996). Then, our prediction should only be used for large masses, for instance such that b>1.

 Figure 10: The halo bias b(r), as a function of scale r, at redshift z=0 and for several masses. The solid line is the theoretical prediction (55)-(57). The crosses show the large-scale fit to numerical simulations from Sheth et al. (2001), while the triangles show the fit from Hamana et al. (2001). The dashed line is the linearized bias (58), the dot-dashed line is the nonlinear bias (55) where we set s=r while the dotted line uses Eq. (60) (for in the three cases). We only plot our predictions for . Open with DEXTER

 Figure 11: The halo bias b(r), as a function of scale r, at redshift z=10and for several masses. As in Fig. 10, the solid line is the prediction (55)-(57), while the crosses show the large-scale fit from Sheth et al. (2001), but the squares now show the fit to numerical simulations from Reed et al. (2009). The dashed line is the linearized bias (58), the dot-dashed line is the nonlinear bias (55) where we set s=r while the dotted line uses Eq. (60) (for in the three cases). Again we only plot our predictions for . Open with DEXTER

Next, we compare in Fig. 10 the dependence on r of the bias (55)-(57) with the fit to numerical simulations from Hamana et al. (2001) (using their cosmological parameters). The crosses are the large-scale limit given by Sheth et al. (2001) and we only plot our prediction (solid lines) down to scale r=2q, since it should only apply to large halo separations. We can see that the scale-dependence that we obtain is opposite to the one observed in the simulations. However, both are very weak and the prediction (55)-(57) may still lie within error bars of numerical results. For the largest mass, , we also plot for illustration the linearized bias (58) (lower dashed line), and the nonlinear bias (55) where we set s=r (upper dot-dashed line) or we use Eq. (60) (lower dotted line). As expected, at very large scales the linearized bias (58) agrees with the nonlinear expression (55). Using the simpler Eq. (60) also gives the same results at large scales, but the Lagrangian to Eulerian mapping still gives a non-negligible correction as shown by the upper dot-dashed line where we set s=r. In any case, it is always best to use the full expression (55)-(57).

We compare in Fig. 11 our results at high redshift, z=10, with the fit to numerical simulations from Reed et al. (2009) (using their cosmological parameters). Again, the crosses are the large-scale limit of Sheth et al. (2001) and we only plot our prediction (solid lines) down to scale r=2q. For we also plot the linearized bias (58) (lower dashed line), and the nonlinear bias (55) where we set s=r (upper dot-dashed line) or we use Eq. (60) (dotted line). As noticed in Reed et al. (2009), the scale-dependence is much steeper than the one found at small redshifts and it is not consistent with the fits obtained at low z in Hamana et al. (2001) or Diaferio et al. (2003). This was interpreted as a breakdown of universality for massive halos at high redshift by Reed et al. (2009). However, we can see that our prediction (55)-(57) agrees reasonably well with their numerical results. Therefore, the change of behavior of the bias b(r) between the two regimes studied in Figs. 10 and 11 can be understood from the standard picture of massive halos arising from rare overdensities in the initial (linear) Gaussian density field, by using the same theoretical prediction (55)-(57) that applies to any z. We can note that the linearized bias (58), or the approximation s=r, show strong deviations in this regime and disagree with the simulations. Therefore, one should use the nonlinear bias (55)-(57) (but using Eq. (60) gives similar results) and one cannot neglect the correction due to the Lagrangian to Eulerian mapping that is associated with .

On the other hand, the comparison with the result from (58) shows that the steep dependence on scale is due to the nonlinear term in (55), i.e. keeping the exponential factor. Indeed, as noticed below Eq. (55), and in Politzer & Wise (1984), the derivation of the bias presented above only assumes a large separation between very massive (rare) halos, that is, and . Therefore, for sufficiently massive objects (that correspond to a large bias), the variance  can be small enough to make the exponent in Eq. (55) of order unity or larger. Then, one needs to keep the nonlinear expression (55) rather than expanding the exponential as in Eq. (58). As noticed below Eq. (55), this implies a nonlinear biasing scheme as the halo correlation is not proportional to the matter correlation but shows a steeper scale dependence. Within the local bias framework (Fry & Gaztanaga 1993; Mo et al. 1997), this could be interpreted as non-zero higher order bias parameters bi in the expansion . However, the bias (55) cannot be exactly reduced to such a model (but one could certainly derive within such a framework a good approximation to Eq. (55), restricted to some larger scale R, by using Eq. (60), writing the correlations and in terms of , expanding over and finally smoothing over the external scale R of interest).

We should stress here that our prediction (55)-(57) has no free parameter. This can lead to a slightly larger inaccuracy as compared with fits to simulations in the regime where the latter have been tested, as in Fig. 10, but this improves the robustness of the predictions as one consider other regimes (e.g. other cosmological parameters or other redshifts as in Fig. 11). Therefore, we think that Eqs. (55)-(57) could provide a useful alternative to current fitting formulae, as they can be readily applied to any set of cosmological parameters or redshifts.

Finally, we show in Fig. 12 the bias ratio b2(M1,M2)/[b(M1)b(M2)], as a function of the mass ratio M2/M1, at fixed geometrical mean . We consider several mass scales M, at distance and redshift (solid lines) and (dashed lines). This shows that making the factorized approximation, , can lead to an error of up to  for a mass ratio . Therefore, it is best to use the full Eq. (54).

 Figure 12: The ratio b2(M1,M2)/[b(M1)b(M2)], from Eqs. (54) and (55), at fixed geometrical mean . Solid lines are for distance and redshift , whereas dashed lines are for . Curves are labelled by their fixed geometrical mean M. Open with DEXTER

## 7 Conclusion

We have pointed out in this article that the large-mass exponential tail of the mass function of collapsed halos is exactly known, provided halos are defined as spherical overdensities above a nonlinear density contrast threshold  (i.e. using a spherical overdensity algorithm in terms of numerical simulations). This arises from the fact that massive rare events are governed by (almost) spherical fluctuations in the initial (linear) Gaussian density field (if one does not explicitly breaks statistical isotropy by looking for non-spherical quantities). This is most easily seen from a steepest-descent approach, which becomes asymptotically exact in the large-scale limit, applied to the action  associated with the probability distribution of the nonlinear density contrast within spherical cells. This result holds for any nonlinear threshold  used to define halos, provided it is below an upper bound that marks the point where shell-crossing comes into play. For a standard CDM cosmology, typically grows from 200 to 600 as one goes from to (which also corresponds to increasing redshift). This dependence on mass is due to the change of slope of the linear power spectrum with scale.

We have also noted that in two similar systems, the one-dimensional adhesion models with Brownian or white-noise initial (linear) velocity, the same method can be used for both the density distribution and the mass function n(M), and one can check that this yields rare-event tails that agree with the exact distributions, which can be derived by other techniques (Valageas 2009a,b).

Therefore, defining collapsed halos by a threshold to follow the common practice, the large-mass tail of the halo mass function is of the form e , up to subleading prefactors such as power laws, where is the linear density contrast associated to  through the spherical collapse dynamics. In particular, we obtain for . We checked that this value, which is slightly lower than the commonly used value of associated with complete collapse, gives a good match with numerical simulations (at large masses) when we simply use the Press-Schechter functional form. We also give a fitting formula, which obeys this exact exponential cutoff, that agrees with simulations over all mass scales.

We suggest that halos should be defined by such a nonlinear density threshold (i.e. friends-of-friends algorithms are not so clearly related to theoretical computations) and fits to numerical simulations should use this exact exponential tail, rather than treating  as a free parameter. This would avoid introducing unnecessary degeneracies between fitting parameters and it would make the fits more robust.

Next, we have briefly recalled that in the large-mass limit the outer density profile of collapsed halos is given by the radial profile of the relevant spherical saddle-point. This applies to radii beyond the density threshold , where shell-crossing comes into play. In agreement with numerical simulations, for rare massive halos this separates an outer region dominated by a radial flow from an inner region where virialization takes place and a strong transverse velocity dispersion quickly builds up. We have recalled that this can be explained from a strong radial-orbit instability, which implies that infinitesimal deviations from spherical symmetry are sufficient to govern the dynamics.

Finally, following the approach of Kaiser (1984), we have obtained an analytical formula for the bias of massive halos that improves the match with numerical simulations. In particular, it captures the steepening of the scale dependence that is observed for large-mass halos at higher redshifts. This requires keeping the bias in its nonlinear form and taking care of the Lagrangian-Eulerian mapping. We also note that using a factorization approximation, , may lead to non-negligible inaccuracies. We stress that this analytical estimate of the bias contains no free parameter. Although this can yield a match to numerical simulations that is not as good as fitting formulae derived from the same set of simulations, it provides a more robust prediction for general cases, as shown by the good agreement obtained at both low and high redshifts (z=0 and z=10), whereas published fitting formulae cannot reproduce both cases. We think this makes such a model useful for cosmological purposes, where it is desirable to have versatile analytical estimates that follow the correct trends as one varies cosmological parameters or redshifts. In particular, the scale-dependence of the bias of massive halos has recently been proposed as a test of primordial non-Gaussianity (e.g., Dalal et al. 2008), which requires robust theoretical models.

Our results are exact (for the mass function) or are expected to provide a good approximation (for the bias) in the limit of rare massive halos. However, this remains of interest as large-mass tails are also the most sensitive to cosmological parameters (e.g., through the linear growth factor and the primordial powerspectrum), thanks to their steep dependence on mass or scale. Moreover, we think that more general fitting formulae (such as the one we provide for the mass function) should follow such theoretical predictions in their relevant limits, so as to reduce the number of free parameters and improve their robustness.

## Footnotes

... minimum
As described in details in Valageas (2002a), depending on the slope of the linear power spectrum the saddle-point (25) may only be a local minimum or maximum, but it still governs the rare-event limit, in agreement with physical expectations.
... displacements
For instance, let us consider a large system which can be subdivided into cells of two classes, , with matter densities and volume fractions , and , . Then, from the conservation of volume ( ) and mass ( ), we obtain in the limits and , .

## All Figures

 Figure 1: The function that describes the spherical dynamics when there is no shell-crossing, at z=0. The solid line corresponds to the CDM cosmology (with ) and the dashed line to the Einstein-de Sitter case . The second peak corresponds to a second collapse to the center, but for our purposes we only need before first collapse. Open with DEXTER In the text

 Figure 2: The linear density contrast associated with the nonlinear density contrast , through the spherical dynamics, as a function of the cosmological parameter at the redshift of interest. We show the three cases and 300. The dashed line is the fit (12) to the case . Open with DEXTER In the text

 Figure 3: The radial profile (25) of the linear density contrast of the saddle point of the action . We show the profiles obtained with a CDM cosmology for the masses M=1011,1012,1013,1014 and . A larger mass corresponds to a lower ratio at large radii q'/q>1. Open with DEXTER In the text

 Figure 4: The Lagrangian map given by the spherical dynamics (neglecting shell-crossing) for the saddle-point (25) with a nonlinear density contrast at the Eulerian radius r. We plot the curves obtained at z=0 for several masses, as in Fig. 3. Inner shells have already gone once through the center (hence their dynamics is no longer exactly given by Eq. (11)) but they have not crossed the radius r yet. Open with DEXTER In the text

 Figure 5: The nonlinear density contrast , beyond which shell-crossing must be taken into account, as a function of redshift for several mass scales. Open with DEXTER In the text

 Figure 6: The mass function at redshift z=0 of halos defined by the nonlinear density contrast . The points are the fits to numerical simulations from Sheth & Tormen (1999), Jenkins et al. (2001), Reed et al. (2003), Warren et al. (2006) and Tinker et al. (2008). The dashed line ( '') is the usual Press-Schechter mass function, while the solid line that goes close to the Press-Schechter prediction at small mass is the mass function (34) with the exact cutoff (32). Here we have . The second solid line that agrees with simulations over the whole range is the fit (36), that obeys the same large-mass exponential cutoff. Open with DEXTER In the text

 Figure 7: The mass functions at z=0 of halos defined by the nonlinear density contrasts ( upper panel) and ( lower panel). The points show the numerical simulations of Tinker et al. (2008). The dashed line is the usual Press-Schechter mass function, while the solid lines correspond to Eqs. (34) and (36), with now ( upper panel) and ( lower panel). Open with DEXTER In the text

 Figure 8: The density profile of rare massive halos, for several masses M. For each mass the redshift is such that , as the theoretical prediction from Eq. (25) (solid line) only applies to rare events. The second branch that appears at small radii is due to the shell-crossing, hence the theoretical prediction only holds to the right of this branch. The points are fits to numerical simulations, based on an NFW profile (Dolag et al. 2004; Duffy et al. 2008) or an Einasto profile (Duffy et al. 2008). Note that we show the mean density within radius r', , rather than the local density at radius r', so that the NFW and Einasto profiles are integrated once. Open with DEXTER In the text

 Figure 9: The halo bias , as a function of , at redshift z=0 and distance r=50  h-1 Mpc. The solid line is the theoretical prediction (55)-(57), and the dashed line is the bias obtained in Mo & White (1996). The points are the fits to numerical simulations, from Sheth et al. (2001) (crosses) and Pillepich (2009) (circles). Open with DEXTER In the text

 Figure 10: The halo bias b(r), as a function of scale r, at redshift z=0 and for several masses. The solid line is the theoretical prediction (55)-(57). The crosses show the large-scale fit to numerical simulations from Sheth et al. (2001), while the triangles show the fit from Hamana et al. (2001). The dashed line is the linearized bias (58), the dot-dashed line is the nonlinear bias (55) where we set s=r while the dotted line uses Eq. (60) (for in the three cases). We only plot our predictions for . Open with DEXTER In the text

 Figure 11: The halo bias b(r), as a function of scale r, at redshift z=10and for several masses. As in Fig. 10, the solid line is the prediction (55)-(57), while the crosses show the large-scale fit from Sheth et al. (2001), but the squares now show the fit to numerical simulations from Reed et al. (2009). The dashed line is the linearized bias (58), the dot-dashed line is the nonlinear bias (55) where we set s=r while the dotted line uses Eq. (60) (for in the three cases). Again we only plot our predictions for . Open with DEXTER In the text

 Figure 12: The ratio b2(M1,M2)/[b(M1)b(M2)], from Eqs. (54) and (55), at fixed geometrical mean . Solid lines are for distance and redshift , whereas dashed lines are for . Curves are labelled by their fixed geometrical mean M. Open with DEXTER In the text