Open Access
Issue
A&A
Volume 694, February 2025
Article Number A29
Number of page(s) 20
Section Cosmology (including clusters of galaxies)
DOI https://doi.org/10.1051/0004-6361/202451178
Published online 30 January 2025

© The Authors 2025

Licence Creative CommonsOpen Access article, published by EDP Sciences, under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

This article is published in open access under the Subscribe to Open model. Subscribe to A&A to support open access publication.

1. Introduction

The spatial distribution of galaxies is one of the most promising probes of the cosmology of our Universe. While the clustering of galaxies through gravity can be modelled accurately through perturbation theory (see Bernardeau et al. 2002, for a review) and N-body simulations (see Angulo & Hahn 2022, for a review), their formation and morphological evolution is a highly complicated process that is so far difficult to predict reliably. Accounting for this appropriately is one of the central challenges to optimally interpreting current and future large-scale surveys, such as the Dark Energy Instrument (DESI, Levi et al. 2013) or Euclid (Amendola et al. 2018).

Despite the complexity of galaxy formation, the clustering of galaxies appears rather simple on sufficiently large scales. For example, on linear scales, the two-point correlation function of galaxies can be described well by a linear bias factor times the correlation function of matter (Kaiser 1984). Such simplicity arises because the large-scale clustering is mostly driven (beyond the gravitational movement) by the average response of large ensembles of galaxies to small perturbations in the linear density field. This response can be modelled through a simple bias relation (see Desjacques et al. 2018, for a review).

In bias expansion approaches, the bias function is expanded as a Taylor series in terms of small perturbations to the linear fields. This approach offers great flexibility, allowing a single model to describe biased tracers with vastly different properties down to k ≈ 0.2 h/Mpc (Baumann et al. 2012; Baldauf et al. 2016; Vlah et al. 2016). Bias expansion approaches have been applied successfully to extract robust cosmological constraints from surveys (Ivanov et al. 2020; d’Amico et al. 2020; Colas et al. 2020; Nishimichi et al. 2020; Chen et al. 2020; Philcox & Ivanov 2022).

However, traditional bias methods also exhibit some noteworthy inconsistencies. Bias expansion models generically predict negative galaxy densities for some values of the underlying matter densities (e.g. Wu et al. 2022). While the amplitude of this problem can in principle be controlled by limiting the investigation to very large scales with small variance, it is almost inevitable that some high-σ-outlier regions of space are predicted to be negative. This fact is commonly ignored, since the summary statistic of interest – for example, the power spectrum or the field level error spectrum (Schmittfull et al. 2019) – may not be significantly affected by it. However, the predicted negative densities automatically exclude some possible applications of bias models. For example, they prevent one from describing the locations of galaxies through their joint probability distribution, or make it impossible to consistently sample discrete tracers from the predicted galaxy density field.

Further, it is not always clear how well the bias expansion converges. For example, the coefficients of higher-order terms may actually grow so that not every perturbative series is well convergent (Stücker et al. 2024). This can severely limit the ability to describe highly biased objects even at quite large scales.

It is the goal of this article to investigate the properties of the bias function in a ‘non-parametric’ way and to propose a solution to the mentioned shortcomings through a Gaussian Lagrangian bias model. Non-perturbative and strictly positive formulations of the bias function have already been considered in previous studies (Szalay 1988; Matsubara 1995; Sigad et al. 2000; Matsubara 2011; Neyrinck et al. 2014; Friedrich et al. 2022). Such approaches have been particularly popular in reconstruction methods in which a likelihood for the joint distribution of all observed galaxies needs to be modelled (Ata et al. 2015; Jasche & Lavaux 2019; Hernández-Sánchez et al. 2021). For example, the BORG and COSMIC BIRTH algorithms (Jasche & Lavaux 2019; Kitaura et al. 2021) uses a power law bias function with exponential truncation in low-density regions developed by Neyrinck et al. (2014). These bias models are considerably more complicated than the one that we propose here, because they are formulated in Eulerian space where the matter distribution is already quite complex.

Analytical models of structure formation naturally give rise to strictly positive formulations of the Lagrangian bias function: in excursion set models, bias parameters can be estimated through the probability of crossing a large-scale barrier given a large-scale perturbation (Bond et al. 1991; Mo & White 1996; Musso et al. 2012). In peak theory, the response of the number density of peaks can be investigated as a function of a larger-scale density perturbation (Bardeen et al. 1986). In both cases the resulting bias relations are closely related to the Gaussian statistics of the linear density field, and therefore close to a Gaussian. From an analytical perspective, it seems natural to consider parametric approaches that easily recover a Gaussian form for the bias function.

The bias relation has already been investigated by non-parametric methods in previous studies. On the one hand, there exist approaches that smooth both the matter density field and the galaxy density field (both in Eulerian space) and study the relation through the ‘scatter-plot method’ (Manera & Gaztañaga 2011; Uhlemann et al. 2018; Desjacques et al. 2018; Balaguera-Antolínez et al. 2019; Pellejero-Ibañez et al. 2020; Kitaura et al. 2022; Friedrich et al. 2022). This approach is relatively complex, because the bias relation depends on both smoothing scales and further stochasticity needs to be described simultaneously to the deterministic aspects of the bias relation. On the other hand, recently another non-parametric approach has been proposed by Wu et al. (2022), whereby a high-dimensional Lagrangian bias function, f, is defined on a grid and is fitted to reproduce the Eulerian galaxy density field (see also Wu et al. 2023). The authors show that this approach gives a consistent bias model that follows the physical constraints (f > 0) and accurately describes the galaxy field. However, this comes at the cost of a very high-dimensional parameter space and a very complex optimisation problem.

Here, we present a new method of measuring the Lagrangian bias function, f. We define f as the excess probability of forming a galaxy in an infinitesimal Lagrangian volume element as a function of the properties of the linear density field. The measurement of f is done through the distribution of linear densities at the Lagrangian locations of galaxies (or haloes) p(δ|g) – that is, the ‘galaxy environment distribution’. Our approach is considerably simpler than previous approaches and it only requires smoothing the linear density field, but not the galaxy field. Further, we show that while the bias relation f is dependent on the smoothing scale that it is measured at, it is possible to define and measure a re-normalised bias function, F, which captures those aspects of f that are independent of the measurement scale.

Further, we introduce a new Lagrangian Gaussian bias model. We show that the Gaussian bias model has intriguing properties that distinguish it from a traditional bias expansion, the most noteworthy being that it predicts only strictly positive probabilities, f > 0. We show that the Gaussian bias model qualifies as a valid bias model, since it can be written in a re-normalised form that is mutually consistent across different smoothing scales. Further, we show that in the multi-variate case (e.g. considering additionally the tidal field or the Laplacian of the density field) a straightforward generalisation is given by a multi-variate Gaussian.

We measure the bias function, f, the galaxy environment distribution, p(δ|g), and the re-normalised bias function, F, for a large variety of cases, including haloes of different mass selections and mock galaxy catalogues. We find that the bias function of haloes is extremely close to a Gaussian and almost perfectly described by the Gaussian bias model, both in the mono-variate case and the multi-variate one. We show that a Gaussian can give a much improved description compared to an expansion bias model with the same number of parameters, especially for highly biased objects that are very challenging to describe with the traditional approach.

The considerations in this article are limited to Lagrangian space, since only in this case is the distribution of matter known exactly and given by a multi-variate Gaussian. They combine therefore optimally with hybrid bias approaches whereby bias functions are set up in Lagrangian space and advected to Eulerian space through the displacement field from N-body simulations (Modi et al. 2020; Kokron et al. 2021; Zennaro et al. 2023; Pellejero Ibañez et al. 2022, 2023; DeRose et al. 2023).

We publish this article jointly with a companion paper, Stücker et al. (2024), in which we investigate bias parameters through the moments and cumulants of the galaxy environment distribution. One of the main results of that article is that cumulant biases beyond order three are close to zero, which additionally motivates the consideration of the Gaussian bias function that we present here.

The article is structured as follows, In Sect. 2, we introduce for the mono-variate density-only case the necessary definitions and theory to understand and measure bias at a functional level. Further, we introduce the Gaussian bias model and show that it can be written in a self-consistent re-normalised form. In Sect. 3, we show measurements of mono-variate bias functions and compare them to the Gaussian model and a second-order expansion model. In Sect. 4, we extend the theory to multiple variables and in Sect. 5 we show measurements of the multi-variate bias function. In Sect. 6, we show how bias functions across multiple scales can be described by a single multi-variate model and show how this can be summarised through the scale-independence of the re-normalised bias function. In Sect. 7, we give a physical interpretation of our measurements and speculate about the reason why the bias function appears to be Gaussian in so many scenarios. Finally, in Sect. 8 we summarise our discoveries and suggest possible applications.

2. Theory

In this section, we show how to measure the (scale-dependent) bias function in a non-parametric way through the galaxy environment distribution. Further, we show how to relate the scale-dependent bias function to its ‘re-normalised’ large-scale limit and how to define parametric forms with scale independent bias parameters. We introduce the new Gaussian bias model as a compelling parametric form of the bias function. For simplicity, the considerations in this section are limited to the Lagrangian density bias; we shall consider the more general multi-variate case later in Sect. 4. Many of the considerations here are brief summaries of the more elaborate derivations presented in Stücker et al. (2024).

2.1. Measuring the bias function

Arguably, the most insightful way to understand galaxy bias is by measuring the bias function directly. Here, we propose a new non-parametric method for measuring the Lagrangian bias function, f, from cosmological simulations.

We consider an infinitesimally small Lagrangian volume element which nothing is known about except the linear density contrast δ smoothed at some scale (and possibly other features of the linear field like the Laplacian L or the tidal field). Neglecting primordial non-Gaussianity, the density contrast follows a Gaussian distribution,

p ( δ ) = 1 2 π σ exp ( δ 2 2 σ 2 ) . $$ \begin{aligned} p(\delta )&= \frac{1}{\sqrt{2 \pi } \sigma } \exp \left( - \frac{\delta ^2}{2 \sigma ^2} \right). \end{aligned} $$(1)

For simplicity, we assume throughout this article that the smoothed density contrast is defined with a sharp Fourier-space filter. However, we discuss in the appendix of Stücker et al. (2024) how other filters can easily be incorporated by taking additionally into account the effect of a slightly different correlation between large and small scales.

We call the average probability that a galaxy forms in such a volume element p(g) and we call the conditional probability, given the knowledge of the linear density contrast, p(g|δ). The excess probability,

f ( δ ) : = p ( g | δ ) p ( g ) , $$ \begin{aligned} f(\delta ) := \frac{p(\mathrm{g} | \delta )}{p(\mathrm{g} )} ,\end{aligned} $$(2)

is parameterised through a function, f(δ), which we refer to as the ‘scale-dependent bias function’ or just the ‘bias function’ throughout this article. The bias function depends in a predictable manner on the variance of δ at the considered scale, as we shall see in Sect. 2.2.

Further, we define the ‘galaxy environment distribution’ as the distribution of (smoothed) linear densities at the Lagrangian locations of galaxies p(δ|g). This distribution is related to the bias function through Bayes’ theorem:

f ( δ ) = p ( g | δ ) p ( g ) = p ( g δ ) p ( g ) p ( δ ) = p ( δ | g ) p ( δ ) . $$ \begin{aligned} f(\delta ) = \frac{p(\mathrm {g}|\delta )}{p(\mathrm {g})} = \frac{p(\mathrm {g} \cap \delta )}{p(\mathrm {g}) p(\delta )} = \frac{p(\delta | \mathrm {g})}{p(\delta )} . \end{aligned} $$(3)

That is, the bias function is the ratio between the galaxy environment distribution p(δ|g) and the background distribution p(δ). We can use this fact to measure f in the following steps:

In a simulation, we can trace galaxies or haloes back to Lagrangian space qi. For example, we consider the set of tracers that is given by the locations of the most bound particles of haloes with a mass close to M200b ∼ 2 × 1014 h M. The Lagrangian positions qi of these tracers can be inferred from the ids of the most bound particles. We show the Lagrangian positions of such tracers in the left panel of Fig. 1. Further, we consider the smoothed linear density field1δ(q) which is shown as coloured contours in the left panel of Fig. 1.

thumbnail Fig. 1.

Illustration of the steps necessary for measuring the bias function, f. Left: the smoothed linear density field and the initial (Lagrangian) locations of a set of tracers – here the most bound particles of haloes with M200b ∼ 2 × 1014 h−1 M. Center: Distribution of the linear densities at the tracer locations (orange histogram) which is biased relative to the distribution at random locations (blue histogram). Right: The bias function, which is inferred by dividing the orange histogram by the background distribution. The solid lines show different bias models out of which the Gaussian bias model describes the data clearly the best.

The galaxy environment distribution p(δ|g) is now simply given by the distribution of the linear density contrast at the locations of these tracers δi = δ(qi). We show this as an orange histogram in the central panel of Fig. 1. If the tracers were uniformly distributed in Lagrangian space, they would follow a Gaussian distribution as in Eq. (1) – indicated as a dashed black line and as a blue histogram in the central panel of Fig. 1. However, the distribution of galaxies is notably biased with respect to the distribution of matter.

As is shown in Eq. (3), we can divide the galaxy environment distribution by the background distribution to infer the bias function. We show this as black data points in the right panel of Fig. 1 (with jackknife error bars as will be explained in Sect. 3.3). We show the bias function in comparison to a linear, a quadratic and a Gaussian approximation. Strikingly, the bias function for the selected set of haloes is very close to a Gaussian. We shall show in Sect. 3 that this is the case for many different scenarios. This motivates to approximate the bias function through a Gaussian bias model.

In the remainder of this section, we show how to write such a Gaussian bias model in a re-normalised form with scale independent bias parameters, by relating the scale-dependent bias function, f, to its re-normalised large-scale limit.

We note that our method for inferring the bias function through a histogram is considerably simpler than previously presented non-parametric approaches (Wu et al. 2022, 2023). For example, Wu et al. (2022), have inferred the bias function by fitting a large number of function values as free parameters to optimally recover the Eulerian galaxy density. In this article, we mostly use our new method to investigate what are good parametric approximations to the bias function. However, this method can also have important other applications. For example, one could measure the non-parametric bias function, f, in a small hydrodynamical simulation and use it in larger N-body simulations to create mock catalogues of galaxies that are biased in a similar way.

2.2. The re-normalised bias function, F

The bias function, f, is strongly dependent on the smoothing scale at which it is inferred. However, it is possible to define a scale independent bias function, F, which corresponds to the large-scale limit of f.

The considerations here are based on the idea of the peak-background split (PBS) which states that ‘a long-wavelength density perturbation acts like a local modification of the background density for the purposes of the formation of haloes and galaxies’ (Kaiser 1984; Bardeen et al. 1986; Desjacques et al. 2018). Bias parameters describe the response of the galaxy number to such long-wavelength perturbations.

An exact implementation of the PBS is given by the separate universe approach. In this approach, one considers some universe in which a measurable number of galaxies ng, 0 forms. If one were to increase the background density of the universe by a relative amount δ0 (e.g. in a separate universe simulation, Li et al. 2014; Wagner et al. 2015), then in the new universe a different number of galaxies ng(δ0) forms. We call their ratio,

F ( δ 0 ) = n g ( δ 0 ) n g , 0 = 1 + b 1 δ 0 + 1 2 b 2 δ 0 2 + , $$ \begin{aligned} F(\delta _0)&= \frac{n_g(\delta _0)}{n_{g,0}} = 1 + b_1 \delta _0 + \frac{1}{2} b_2 \delta _0^2 + \ldots ,\end{aligned} $$(4)

the ‘large-scale limit of the bias function’ or the ‘re-normalised bias function’. This function can directly be measured with separate universe simulations (e.g. Lazeyras et al. 2016) and we refer to the coefficients of the indicated expansion as the ‘canonical bias parameters’ or just ‘the bias parameters’:

b n = n F ( δ 0 ) δ 0 n | δ 0 = 0 . $$ \begin{aligned} b_n&= \left. \frac{\partial ^{n} F(\delta _0)}{\partial \delta _0^{n}} \right|_{\delta _0=0}. \end{aligned} $$(5)

These bias parameters physically describe the response of the galaxy number density to small perturbations at infinitely large scales. However, in practice, perturbations do not originate from infinitely large scales, but from finite large scales, so that it is necessary to relate F to the scale-dependent bias function, f, which is observable at finite scales.

Since densities at different scales add up linearly, a separate-universe style modification of the large-scale density contrast from 0 to δ0 will immediately translate to a modification of the linear density in every volume element δ → δ + δ0. Therefore F and f should be related through

F ( δ 0 ) = f ( δ + δ 0 ) , $$ \begin{aligned} F(\delta _0)&= \langle f(\delta + \delta _0) \rangle ,\end{aligned} $$(6)

where the angled brackets indicate an expectation value taken over the Lagrangian volume (see also Desjacques et al. 2018) so that

F ( δ 0 ) = p ( δ ) f ( δ + δ 0 ) d δ . $$ \begin{aligned} F(\delta _0)&= \int _{-\infty }^\infty p(\delta ) f(\delta + \delta _0) \mathrm{d} \delta . \end{aligned} $$(7)

The relation indicates that, in a separate universe experiment, the number of galaxies should change according to the average change in probability of forming galaxies when changing the linear density contrast everywhere in space. Importantly, this relation defines how to re-normalise bias at the functional level, which we use a lot throughout this article. We note that it corresponds to a convolution with the Gaussian background distribution. Therefore, parametric models which maintain (the same) parameteric form after convolution with a Gaussian qualify as particularly convenient bias models. We shall see examples of this in the next two subsections.

2.3. Expansion bias

At the second order (in density only), the canonical bias expansion of the large-scale bias function, F, reads

F quad ( δ 0 ) = 1 + b 1 δ 0 + 1 2 b 2 δ 0 2 . $$ \begin{aligned} F_{\mathrm{quad} }(\delta _0)&= 1 + b_1 \delta _0 + \frac{1}{2} b_2 \delta _0^2. \end{aligned} $$(8)

To find a form for the scale-dependent bias function, f, we made a quadratic Ansatz,

f quad ( δ ) = c 0 + c 1 δ + 1 2 c 2 δ 2 , $$ \begin{aligned} f_{\mathrm{quad} }(\delta )&= c_0 + c_1 \delta + \frac{1}{2} c_2 \delta ^2 ,\end{aligned} $$(9)

and we applied Eq. (6) to obtain

F quad ( δ 0 ) = c 0 + c 1 ( δ + δ 0 ) + 1 2 c 2 ( δ 2 + 2 δ δ 0 + δ 0 2 ) = c 0 + c 1 δ 0 + 1 2 c 2 σ 2 + 1 2 c 2 δ 0 2 , $$ \begin{aligned} F_{\mathrm{quad} }(\delta _0)&= \left\langle {c_0 + c_1 (\delta + \delta _0) + \frac{1}{2} c_2 (\delta ^2 + 2 \delta \delta _0 + \delta _0^2)}\right\rangle \nonumber \\&= c_0 + c_1 \delta _0 + \frac{1}{2} c_2 \sigma ^2 + \frac{1}{2} c_2 \delta _0^2 ,\end{aligned} $$(10)

for the Gaussian background distribution with ⟨δ⟩ = 0 and ⟨δ2⟩=σ2. By identifying coefficients with Eq. (8), we find c1 = b1, c2 = b2, and c 0 = 1 1 2 b 2 δ 0 2 $ c_0 = 1 - \frac{1}{2} b_2 \delta_0^2 $, leading to

f quad ( δ ) = 1 + b 1 δ + 1 2 b 2 ( δ 2 σ 2 ) $$ \begin{aligned} f_{\mathrm{quad} }(\delta )&= 1 + b_1 \delta + \frac{1}{2} b_2 (\delta ^2 - \sigma ^2) \end{aligned} $$(11)

as the re-normalised form of the quadratic bias function. Polynomials of any degree maintain their degree after convolution with a Gaussian which makes it possible to re-normalise any polynomial bias model in a simple manner. Further, it is worth emphasizing that the re-normalisation procedure is quite simple here, because δ is the linear density field in Lagrangian space. In Eulerian bias schemes the re-normalisation procedure can be considerably more complex (e.g. Assassi et al. 2015).

2.4. Gaussian bias

In Stücker et al. (2024) we have introduced ‘cumulant bias parameters’ as an alternative way of phrasing the bias expansion. These are defined as

β n = n δ 0 n log ( F ( δ 0 ) ) | δ 0 = 0 $$ \begin{aligned} \beta _n&= \left. \frac{\partial ^n}{\partial \delta _0^n} \log \left( F(\delta _0) \right) \right|_{\delta _0=0} \end{aligned} $$(12)

and they are related to canonical bias parameters in the same way that cumulants are related to moments; for example,

β 1 = b 1 $$ \begin{aligned} \beta _{1}&= b_{1} \end{aligned} $$(13)

β 2 = b 2 b 1 2 $$ \begin{aligned} \beta _{2}&= b_{2} - b_{1}^{2}\end{aligned} $$(14)

β 3 = b 3 3 b 1 b 2 + 2 b 1 3 . $$ \begin{aligned} \beta _{3}&= b_{3} - 3 b_{1} b_{2} + 2 b_{1}^{3} . \end{aligned} $$(15)

In Stücker et al. (2024) we have found that cumulant biases are very close to zero at orders of n ≥ 3. This motivates to consider a cumulant bias expansion that is truncated at the second order,

log F gaus = β 1 δ 0 + 1 2 β 2 δ 0 2 , $$ \begin{aligned} \log F_{\mathrm{gaus} }&= \beta _1 \delta _0 + \frac{1}{2} \beta _2 \delta _0^2 ,\end{aligned} $$(16)

as a particularly interesting case. Under the constraint β2 < 0, which is generally fulfilled as shown in Stücker et al. (2024), Fgaus is a Gaussian function that is normalised to Fgaus(0) = 1, that has its maximum at −β1/β2 and a width of 1 / β 2 $ 1/\sqrt{-\beta_2} $. Again, we can find an explicit form for the scale-dependent bias function fgaus through an Ansatz,

f gaus ( δ ) = N b exp ( ( δ μ b ) 2 2 σ b 2 ) $$ \begin{aligned} f_{\mathrm{gaus} }(\delta )&= N_b \exp \left(- \frac{(\delta - \mu _{b})^2}{2 \sigma _b^2} \right)\end{aligned} $$(17)

F gaus = f gaus ( δ + δ 0 ) 1 2 π σ exp ( δ 2 σ 2 ) d δ $$ \begin{aligned} F_{\mathrm{gaus} }&= \int _{-\infty }^{\infty } f_{\mathrm{gaus} }(\delta + \delta _0) \frac{1}{\sqrt{2 \pi } \sigma } \exp \left( -\frac{\delta }{2 \sigma ^2} \right) \mathrm{d} \delta \end{aligned} $$(18)

= N b 1 + σ 2 / σ b 2 exp ( ( δ 0 μ b ) 2 2 ( σ 2 + σ b 2 ) ) . $$ \begin{aligned}&= \frac{N_b}{\sqrt{1 + \sigma ^{2} / \sigma _b^2}} \exp \left(-\frac{ \left(\delta _0 - \mu _b\right)^2}{2 (\sigma ^2 + \sigma _b^2)}\right). \end{aligned} $$(19)

By taking the logarithm and identifying coefficients with Eq. (16), we find

N b = exp ( β 1 2 2 β 2 ) β 2 σ 2 + 1 $$ \begin{aligned} N_b&= \frac{\exp \left(- \frac{\beta _{1}^{2}}{2 \beta _{2}}\right)}{\sqrt{\beta _{2} \sigma ^{2} + 1}} \end{aligned} $$(20)

μ b = β 1 β 2 $$ \begin{aligned} \mu _{b}&=- \frac{\beta _{1}}{\beta _{2}} \end{aligned} $$(21)

σ b 2 = 1 β 2 σ 2 , $$ \begin{aligned} \sigma _b^2&= - \frac{1}{\beta _{2}} - \sigma ^{2} ,\end{aligned} $$(22)

which leads to

f gaus = exp ( β 1 2 2 β 2 ) 1 + β 2 σ 2 exp ( β 2 ( β 1 β 2 + δ ) 2 2 ( 1 + β 2 σ 2 ) ) $$ \begin{aligned} f_{\mathrm{gaus} }&= \frac{\exp \left(- \frac{\beta _{1}^{2}}{2 \beta _{2}}\right)}{\sqrt{1 + \beta _{2} \sigma ^{2}}} \exp \left(\frac{\beta _{2} \left(\frac{\beta _{1}}{\beta _{2}} + \delta \right)^{2}}{2 (1 + \beta _{2} \sigma ^{2})} \right) \end{aligned} $$(23)

as the Gaussian density bias model in re-normalised form – our first important theoretical result.

First of all, we note a few consistency properties of this Gaussian bias model. The large-scale limit of fgaus yields Fgaus:

lim σ 0 f gaus = exp ( β 1 δ + β 2 δ 2 2 ) = F gaus ( δ ) . $$ \begin{aligned} \lim _{\sigma \rightarrow 0} f_{\mathrm{gaus} }&= \exp \left( \beta _1 \delta + \frac{\beta _2 \delta ^2}{2} \right) = F_{\mathrm{gaus} }(\delta ). \end{aligned} $$(24)

So, under the Gaussian bias assumption, both the bias function that can be deduced from finite scales and the limiting function at infinite scales are Gaussians and mutually consistent with each other. This is an important result, since it ensures that a bias model that is set up to be Gaussian at some small scale, will also appear Gaussian at any larger scale.

Further, we note the Taylor expansion of F around δ0 = 0 is consistent with the canonical bias expansion at the second order and many of the results that are valid for second-order expansion models translate directly to the Gaussian model. However, the Gaussian makes a more graceful assumption about unmodelled higher-order terms which ensures improved behavior outside of the |δ0|∼0 regime. For example, the Gaussian ensures f > 0 for any amplitudes of density perturbations. Since we should only observe positive probabilities and positive galaxy number densities in the real universe, this is a desirable property. Ensuring it may allow additional applications for bias models, as we shall discuss in Sect. 8.

Regarding the parameters of the Gaussian, we note that the location of the maximum μb is independent of σ and gets larger for larger β1 (since β2 < 0). The effective width of the bias function σb decreases as σ increases. This is so, since a larger fraction of the information, which is necessary to decide whether a volume element collapses into a halo, is resolved.

It is worth noting that this can lead to undefined behavior beyond σ max = 1 / β 2 $ \sigma_{\mathrm{max}} = 1 / \sqrt{-\beta_2} $. If a model is extrapolated with fixed parameters to the corresponding scale, the bias model becomes a Dirac delta function and galaxy formation becomes formally deterministic. Latest at that scale the PBS assumption has to break down, since density perturbations from smaller scales have to impact galaxy formation in a different way (e.g. they become irrelevant). As we explain in Stücker et al. (2024), this behavior is not unique to the Gaussian bias model, but it must happen for any model that is (physically correctly) restricted to positive probability densities: The PBS predicts that the galaxy environment distribution has zero variance at the corresponding scale and negative variance beyond that scale. Expansion biases simply hide this problematic fact by allowing negative probability densities and a negative variance. When considering additional variables (like the Laplacian) the corresponding scale can be shifted to smaller scales, but generally there has always to be a scale where the PBS breaks down, because all information has been accounted for. We expect that fitting bias models beyond this point will always lead to scale-dependent bias parameters, which ensure for example that 0 > β2 > −1/σ2 for the pure density bias. We discuss this in more detail in Stücker et al. (2024).

We summarise that a Gaussian bias model is a well defined bias model that has a simple re-normalised form and therefore makes consistent predictions across different damping scales. In the limit of large scales and small values of δ, it gets arbitrarily close to a quadratic bias model, but it ensures additionally f > 0 outside the perturbative range.

Throughout this article, we always compare the performance of a Gaussian bias model to a quadratic expansion bias, since these two models use both two free parameters and have the same limiting behaviour on large scales.

2.5. Bias parameters

To compare non-parametric measurements of the bias function with the bias models, we have to infer the bias parameters b1 and b2 that are used to parameterise them. As is shown in Stücker et al. (2024), these can easily be found through the moments of the galaxy environment distribution,

b n = ( 1 ) n p ( n ) ( δ ) p ( δ ) p ( δ | g ) d δ = ( 1 ) n p ( n ) ( δ ) p ( δ ) g , $$ \begin{aligned} b_n&= (-1)^n \int _{-\infty }^\infty \frac{p^{(n)}(\delta )}{p(\delta )} p(\delta |g) \mathrm{d} \delta \nonumber \\&= (-1)^n \left\langle {\frac{p^{(n)} (\delta )}{p(\delta )} } \right\rangle _{\mathrm{g} } ,\end{aligned} $$(25)

where (n) in the superscript indicates an n-th derivative with respect to δ and the expectation value with a g in the subscript denotes that this expectation is taken over the galaxy environment distribution. Therefore, this estimator only has to be evaluated at the location of galaxies. Inserting the Gaussian distribution, we find

b n = H n ( δ / σ ) σ n g $$ \begin{aligned} b_n&= \left\langle { \frac{H_n \left( \delta / \sigma \right)}{\sigma ^n} } \right\rangle _{\mathrm{g} }\end{aligned} $$(26)

b 1 = δ σ 2 g $$ \begin{aligned} b_1&= \left\langle { \frac{\delta }{\sigma ^2} }\right\rangle _{\mathrm{g} } \end{aligned} $$(27)

b 2 = δ 2 σ 2 σ 4 g , $$ \begin{aligned} b_2&= \left\langle { \frac{\delta ^2 - \sigma ^2}{\sigma ^4} }\right\rangle _{\mathrm{g} } ,\end{aligned} $$(28)

where Hn is the n-th probabilist’s Hermite polynomial (see also Szalay 1988). These estimators have already been derived and used in Musso et al. (2012), Paranjape et al. (2013a,b) and we have extended them in Stücker et al. (2024) through scale-dependent corrections, as we shall review in Sect. 4.

In this paper, we always measure the bias parameters independently of the model and then use the same values of (b1, b2) to compare quadratic bias models with Gaussian bias models2. This is also the case for the bias models that we have shown in Fig. 1, so that this is a fair comparison – rather independently of the fitting method.

It is worth noting that the assumption of a Gaussian bias model for f also results in a Gaussian for the galaxy environment distribution:

p gaus ( δ | g ) = p ( δ ) f gaus ( δ ) $$ \begin{aligned} p_{\mathrm{gaus} }(\delta |g)&= p(\delta ) f_{\mathrm{gaus} }(\delta )\end{aligned} $$(29)

= 1 2 π σ g exp ( ( δ μ g ) 2 2 σ g 2 ) , $$ \begin{aligned}&= \frac{1}{\sqrt{2 \pi } \sigma _{\mathrm{g} }} \exp \left( - \frac{(\delta - \mu _{\mathrm{g} })^2}{2 \sigma _{\mathrm{g} }^2} \right) ,\end{aligned} $$(30)

where

μ g = β 1 σ 2 = δ g $$ \begin{aligned} \mu _{\mathrm{g} }&= \beta _1 \sigma ^2 = \langle {\delta } \rangle _{\mathrm{g} }\end{aligned} $$(31)

σ g 2 = β 2 σ 4 + σ 2 = ( δ μ g ) 2 g . $$ \begin{aligned} \sigma _{\mathrm{g} }^2&= \beta _2 \sigma ^4 + \sigma ^2 = \langle {(\delta - \mu _{\mathrm{g} })^2}\rangle _{\mathrm{g} } .\end{aligned} $$(32)

These parameters directly correspond to the mean and the variance of the galaxy environment distribution. Therefore, the selected Gaussian bias model also optimally fits the galaxy environment distribution. It is an intriguingly simple property of the Gaussian bias model that all three, the bias function, f, its large-scale limit, F, and the galaxy environment distribution, p(δ|g), are Gaussian functions under this assumption.

3. Measurements of the scale-dependent density bias function

In this section, we show measurements of the scale-dependent bias function, f, for different set-ups and develop a variety of statistics to systematically test how well different models approximate the function.

3.1. Simulations

Throughout this paper, we consider a single high-resolution cosmological simulation box. The simulation was run as part of the ‘BACCO simulation project’ Angulo et al. (2021). It has a box size of L = 1440 h−1 Mpc with 43203 particles corresponding to a mass resolution of mp = 3.2 × 109h−1 M. The cosmological parameters are Ωm = 0.307, ΩΛ = 0.693, Ωb = 0.048., ns = 0.9611, σ8 = 0.9, h = 0.677 which are similar to the Planck Collaboration VI (2020) cosmology except for the roughly 10% larger value of σ8. To have a halo-mass versus bias relations similar to the more commonly used Planck cosmology, we use a snaphshot of the simulation at a scale-factor of a = 0.8 where σ8 has the effective value D(a)σ8 = 0.79 where D(a) is the linear growth factor normalised to 1 at a = 1.

3.2. Tracer catalogues

To identify haloes, the simulation code uses a modified version of SUBFIND (Springel et al. 2001) that first identifies haloes through a friends-of-friends (FoF) algorithm and subsequently calculates for each FoF group the mass, M200b, in a region that encloses 200 times the mean density of the universe.

We considered haloes in narrow mass bins as tracers. For this, we selected all tracers that are in a 25% range above and below a stated target mass,

M 200 b [ M / 1.25 , M · 1.25 ] . $$ \begin{aligned} M_{200b} \in [M / 1.25, M \cdot 1.25 ]. \end{aligned} $$(33)

Halo selections throughout this article always use a = 0.8, as was explained above.

Further, we considered two catalogues of mock galaxies that are created based on subhalo abundance matching techniques in the same dark matter only simulation. The first set of ‘stellar mass’ (SM) galaxies is optimised to mimic galaxies that may be observed in a survey like BOSS that use a cut in stellar mass. We choose a number density of n = 2 × 10−3h3 Mpc−3, a redshift of z ∼ 1 leading to a Lagrangian bias of order b1 ∼ 0.5. This catalogue is created with the SHAMe model that was introduced by Contreras et al. (2021). For a detailed discussion, we refer the reader to that paper. However, in short, this method uses an abundance matching technique based on the value of vpeak – the highest circular velocity in the history of an object – and uses additional prescriptions to model dynamical friction induced mergers and tidal stripping. The free parameters are tuned to mimic the clustering of stellar-mass-selected galaxies in the TNG300 hydrodynamical simulation (Nelson et al. 2018; Springel et al. 2018; Marinacci et al. 2018; Pillepich et al. 2018; Naiman et al. 2018).

As a second catalogue of galaxies, we considered ‘star formation rate’ (SFR) selected galaxies as they would be observed by future surveys like EUCLID. For these, we assumed the same target parameters: n = 2 × 10−3h3 Mpc−3, a redshift of z ∼ 1, and a Lagrangian bias of order b1 ∼ 0.5. However, for the abundance matching, we used a novel method developed by Ortega-Martinez et al. (2024) named ‘SHAMe-SF’. This is an extension to the SHAMe method, in which additional adjustments have been made to accurately describe the redshift space clustering of star forming galaxies in the TNG300 simulations. We again adopted the set of parameters that optimally describes SFR-selected galaxies from the TNG300.

3.3. Lagrangian fields

To measure the galaxy environment distribution, p(δ|g), and the bias function, f, we need to know the linear field evaluated at the Lagrangian locations of the tracers. We approximate the Lagrangian location of each halo through the Lagrangian origin of its most bound particle. Since the simulation started from a Lagrangian grid, the Lagrangian origin of the most bound particle can easily be inferred from its id imb as

q mb = L N grid ( i x i y i z ) $$ \begin{aligned} \boldsymbol{q}_{\mathrm{mb} }&= \frac{L}{N_{\mathrm{grid} }} \begin{pmatrix} i_x \\ i_y \\ i_z \end{pmatrix} \end{aligned} $$(34)

i mb = i x N grid 2 + i y N grid + i z , $$ \begin{aligned} i_{\mathrm{mb} }&= i_x N_{\mathrm{grid} }^2 + i_y N_{\mathrm{grid} } + i_z ,\end{aligned} $$(35)

where Ngrid = 4320 is the number of particles per dimension.

We know the linear density field of the simulation through the initial condition generator. To save computation time, we created a low-resolution grid representation of the linear density field with Nlin3 grid points. For fields different than the linear density field, we additionally multiplied by the correct operator (e.g. −k2 for the Laplacian) in Fourier space. We created a smoothed version of this field by multiplying with a sharp-k filter in Fourier space,

δ k = δ lin , k · Θ ( k d k ) , $$ \begin{aligned} \delta _k&= \delta _{\rm {lin},k} \cdot \Theta (k_{\rm {d}} - k) ,\end{aligned} $$(36)

with a damping scale, kd, where Θ is the Heavyside function. We then deconvolved this field with a linear interpolation kernel and interpolated it to the Lagrangian locations of our tracer set. We chose Nlin to be sufficiently large that the resulting interpolated values are virtually independent of this discretisation; for example, Nlin = 183 at kd = 0.1h−1Mpc and Nlin = 549 at kd = 0.3h−1Mpc.

Given the linear densities at the tracer locations, we can infer the galaxy environment distribution through a histogram of the densities and the bias function, f, through a weighted histogram, where each tracer is weighted by 1/p(δ). We always used 50 bins equally spaced in the range δ ∈ [ − 5σ, 5σ]. We wrote the measurement of f(δ) as a vector, f, where each component corresponds to the inferred value in a different bin. We then estimated the covariance matrix of the measurement through a Jackknife technique. For this, we divided the box in Lagrangian space into Njk3 sub-boxes with Njk = 4. We performed 64 measurements of fi by subsequently leaving out all tracers in one of the sub-boxes. Then we estimated the covariance matrix of the measurements,

C f = 1 N jk 3 1 i ( f i f 0 ) ( f i f 0 ) $$ \begin{aligned} \mathsf C _{\boldsymbol{f}}&= \frac{1}{N_{\mathrm{jk} }^3 -1} \sum _i (\boldsymbol{f}_i - \boldsymbol{f}_0) \otimes (\boldsymbol{f}_i - \boldsymbol{f}_0)\end{aligned} $$(37)

f 0 = 1 N jk 3 i f i . $$ \begin{aligned} \boldsymbol{f}_0&= \frac{1}{N_{\mathrm{jk} }^3} \sum _i \boldsymbol{f}_i. \end{aligned} $$(38)

Further, we estimated the bias parameters with the estimators from Eqs. (27) and (28) and used these to parametrise the different bias models. The statistical uncertainties of these estimators are so small that we have neglected them here.

3.4. Measured functions

We show examples of bias function measurements for different scenarios in Fig. 2. The first three panels use haloes of M200b ∼ 4 × 1013 h−1 M at different damping scales. At the largest scale, kd = 0.07 h Mpc−1, all bias models are indistinguishable and approximate the data well. At a smaller smoothing scale, kd = 0.15 h Mpc−1, small differences emerge and the Gaussian bias is a slightly better approximation than the quadratic bias. At very small damping scales, kd = 0.4 h Mpc−1, the expansion bias models completely fail to capture the data, whereas the Gaussian bias is still a very good approximation. We note that in all cases the bias models use the same re-normalised bias parameters and the failure of expansion biases cannot be attributed to fitting techniques. It seems generally that expansion bias struggles to capture scenarios in which the bias function, f(δ), already gets close to zero at intermediate density values, |δ|/σ ≲ 1.

thumbnail Fig. 2.

Galaxy environment distribution (left) and Lagrangian bias function (right) of mass selected haloes (M200b ∼ 4 × 1013 h−1 M) at different damping scales (kd = 0.07 h Mpc−1, 0.15 h M−1 and 0.4 h Mpc−1 in top, center and bottom respectively). At all scales, the Gaussian bias provides a description that is as good as quadratic bias or significantly better – especially so at small smoothing scales.

Beyond the accuracy of the models, we note a few relevant details in the scale dependence of the bias function. As is explained in Sect. 2.4, under the assumption of a Gaussian density bias with scale-independent bias parameters, the location of the maximum of the bias function should be scale independent, but the width may be scale-dependent. While the maximum of the bias function, f, seems approximately at a consistent location between the two larger scales at δmax ≈ 0.5, at the smaller scale kd = 0.4 h Mpc−1 the maximum has clearly shifted away to δmax ≈ 1.5. Such scale dependencies can be explained by the dependence on secondary variables; for example the Laplacian, as we shall discuss in Sects. 5 and 6.

Finally, we considered the bias functions of the two mock galaxy catalogues. We show the bias function of the stellar-mass-selected galaxy sample in the top panel of Fig. 3, and of the SFR-selected galaxies in the bottom panel, both at a damping scale of kd = 0.4 h Mpc−1. For the stellar-mass-selected galaxies, the Gaussian is again much closer to the measured bias function than the expansion biases. For the SFR-selected galaxies, none of the models optimally approximate the bias function, especially at high densities. However, we might still favour the Gaussian approximation over the expansion in this scenario, since it approximates quite well how the bias function approaches zero at small δ. Despite these inaccuracies, it is worth noting that it is more relevant how well the bias models approximate the galaxy environment distribution, p(δ|g), which is well recovered by all models.

thumbnail Fig. 3.

Galaxy environment distribution (left) and the Lagrangian bias function (right) of stellar-mass-selected galaxies (top) and SFR-selected galaxies (bottom).

We conclude that generally the Gaussian bias offers a much improved approximation to the bias function over a quadratic bias for many plausible scenarios. For scenarios in which the Gaussian is not an optimal description, it still performs at a similar level to a quadratic bias function. We shall show this systematically for a variety of metrics and tracer selections in the next section.

3.5. Systematic evaluation

Here, we present metrics to systematically evaluate the performance of the considered bias models. The metrics are meant to compare the true galaxy environment distribution with the modelled one. It makes more sense to compare the galaxy environment distributions than the bias functions, since this automatically weights the bias function appropriately by the Lagrangian volume. In principle, it would be desirable to use common probability theory measures for comparing the distributions, like for example the Kullback-Leibler divergence etc.. However, the expansion bias models generically predict distributions that are negatively valued for some values of δ, so that the majority of probability metrics are not well defined. To alleviate this, we present as a first metric the Wasserstein metric, which is well defined even for negative values and as a second metric the likelihood with an appropriate method for clipping negative values.

The Wasserstein distance – also known as the ‘Earth-movers’ metric – is the average absolute amount that the δ values of one distribution would need to be shifted to transform it into the other distribution. Conveniently, this concept can also be applied for distributions that get negative so that it provides a fair comparison between the models. Further, it is simple to measure in one dimension and does not require coarse-graining the data (e.g. through a histogram). In one dimension, the Wasserstein distance between two distributions is given by

δ ws = 0 1 | F 1 1 ( F ) F 2 1 ( F ) | d F , $$ \begin{aligned} \delta _{ws}&= \int _0^1 \left|F^{-1}_1(F) - F^{-1}_2(F)\right| \mathrm{d} F ,\end{aligned} $$(39)

where F1−1 and F2−1 are the quantile functions – that is, the inverse of the cumulative distribution functions, F1 and F2. However, since the measured area between the two curves is equivalent in F-space and in δ-space, it can more conveniently be measured as

δ ws = | F 1 ( δ ) F 2 ( δ ) | d δ . $$ \begin{aligned} \delta _{ws}&= \int _{-\infty }^{\infty } \left|F_1(\delta ) - F_2(\delta )\right| \mathrm{d} \delta . \end{aligned} $$(40)

For the negative distributions, Eq. (40) can still be applied, but with a cumulative distribution function that is non-monotonic. For the discrete galaxy distribution, we approximate the cumulative distribution function with linear splines between the ordered data points (δn, n/Ngal) where n is the rank in the ordered data points.

We show measurements of δws in the top panel of Fig. 4 evaluated for halo selections of different masses and for the stellar mass and SFR-selected galaxies at several different damping scales. The solid brighter lines correspond to quadratic bias model, whereas the darker dashed lines correspond to the Gaussian bias. Strikingly, δws is quite small for the Gaussian bias model at all smoothing scales and for all tracer populations – never getting beyond δws ∼ 0.03. That means that using a Gaussian bias model a galaxy will be born at a smoothed linear density that is typically only off by δws = 0.03 or less. On the other hand, the performance of the quadratic bias model depends strongly on the selected set of tracers. It only behaves similarly well to the Gaussian bias for the SFR galaxies, but significantly worse than the Gaussian bias at large kd for all other considered selections, easily differing on δws by an order of magnitude; for example, for SM galaxies or even two orders of magnitude for extremely massive haloes, M200b ∼ 1015 h M.

thumbnail Fig. 4.

Metrics to compare the performance of the Gaussian bias model to expansion biases. The top panel shows the Wasserstein distance between the actual and the modelled galaxy environment distribution (smaller is better), the central panel the normalised log likelihood (larger is better) and the bottom panel the amplitude of the first unmodelled term (closer to zero is better). Brighter solid lines show the quadratic bias model and darker solid lines the Gaussian bias. All metrics and selections show the Gaussian bias performing either similar to a quadratic model or significantly better.

As the second metric, we consider the likelihood of observing a given set of environment densities δi, given the model. This metric is only well defined for the Gaussian bias model, since only for that case is p(δ|g) a well defined (non-negative) probability density. Assuming that the linear densities δi of each tracer are randomly drawn from the distribution p(δ|g), the likelihood of observing the set of tracers is given by

L = i N gal p ( δ i | g ) . $$ \begin{aligned} L&= \prod _i^{N_{\rm {gal}}} p(\delta _i | g) . \end{aligned} $$(41)

To reduce the scaling with tracer numbers, we considered the log-likelihood per tracer. Further, the likelihood scales quite significantly with the width of the considered distribution. The log likelihood per tracer of a Gaussian that follows the background distribution would be given by

log p ( δ ) = log ( 1 / 2 π σ 2 ) δ 2 2 σ 2 = 1 2 log ( 2 π σ 2 ) 1 2 . $$ \begin{aligned} \langle {\log p(\delta )} \rangle&= \log (1/\sqrt{2 \pi \sigma ^2}) - \left\langle {\frac{\delta ^2}{2\sigma ^2}}\right\rangle \nonumber \\&= -\frac{1}{2} \log (2 \pi \sigma ^2) - \frac{1}{2}. \end{aligned} $$(42)

We subtracted this term from the log likelihood to reduce the scaling with σ and obtained a normalised log likelihood,

log L n = log p ( δ | g ) g + 1 2 log ( 2 π σ 2 ) + 1 2 , $$ \begin{aligned} \log L_n&= \left\langle {\log p(\delta | \mathrm{g} ) }\right\rangle _{\mathrm{g} } + \frac{1}{2} \log (2 \pi \sigma ^2) + \frac{1}{2} ,\end{aligned} $$(43)

which should reach zero in the large-scale limit, σ → 0, where the galaxy environment distribution approaches the background distribution.

To be able to evaluate this metric for the quadratic bias model, we defined a slightly modified distribution that is strictly positive, by clipping the bias function at a minimum value fmin. The new distribution reads

p ( δ | g ) = N p ( δ ) { f ( δ ) if f ( δ ) f min f min else , $$ \begin{aligned} p^*(\delta | g)&= N p(\delta ) {\left\{ \begin{array}{ll} f(\delta )&\text{ if} f(\delta ) \ge f_{\mathrm{min} } \\ f_{\mathrm{min} }&\text{ else}, \end{array}\right.} \end{aligned} $$(44)

where the normalisation constant, N, is adjusted so that the area of p* is appropriately normalised to 1. By testing different clipping values, fmin = {0.001, 0.003, 0.01, 0.03, 0.1, 0.3}, we verified that the value of log Ln is very insensitive to the choice of fmin, except for the largest case, fmin = 0.3, where it gets slightly reduced. However, for the comparison we simply chose the best case scenario for the quadratic expansion, where we always chose the clipping value from the above choices that produced the largest likelihood at a given scale. For the Gaussian bias, we did not employ any clipping so that the clipping procedure could only enhance the likelihood of the expansion bias.

We show the normalised log-likelihood in the central panel of Fig. 4. The likelihood seems to be almost identical between the Gaussian and quadratic bias for low-mass haloes M200b ≲ 1012 h M and for the SFR galaxies. For SM galaxies and intermediate-mass haloes, 1013 h MM200b ≲ 1014 h M, the Gaussian bias has a slightly improved log-likelihood per tracer. The differences are smaller at smaller kd and larger at larger values of kd, which is expected, since for smaller values of σ the bias function is mostly probed close to zero where the Taylor expansion is accurate. However, even small differences in the log likelihood per tracer may imply quite large differences in the full likelihood L which scales with the number of tracers. Finally, we note that for the largest considered mass selection, M200b ∼ 1015 h M, the Gaussian bias drastically outperforms the quadratic bias. As was mentioned earlier, the likelihood is quite ill-defined for the quadratic bias in this case, but it is remarkable that such extreme cases are so well described by a Gaussian.

As a final measure of the difference of the true bias function with respect to the quadratic and Gaussian assumption, we consider the values of b3 and β3 respectively. The value of b3 quantifies the lowest (third) order term that is not modelled in the quadratic model whereas the value of β3 quantifies the lowest order term that is not modelled by the Gaussian. Recall that the Gaussian implicitly models some aspects of b3 through its implicit dependence on b1 and b2, as can be seen from Eq. (15) under the assumption β3 = 0.

We show the obtained values of b3 and β3 in the bottom panel of Fig. 4. First of all, we note that these parameters are quite scale-dependent which is due to neglecting the Laplacian bias, as we discuss in detail in Stücker et al. (2024). However, these parameters still describe, how well the third moment is captured by the bias models at the considered scale. Clearly β3 is very small, with |β3|< 1 for almost all considered cases and generally getting even smaller for large kd. On the other hand we see that for some selections b3 can be very large, for example b3 ≫ 10 for the largest mass haloes, indicating a catastrophic behaviour of the canonical bias expansion for such high-mass objects or β3 ∼ −5 for M200b ∼ 1014 h M at kd ≲ 0.2 h Mpc−1. See Stücker et al. (2024), for a more detailed discussion of the behaviour of b3 versus β3.

We conclude that in all cases the Gaussian bias model approximates the density bias function at least as well as a quadratic bias model, and for most cases it does so significantly better. It is worth noting that the Gaussian bias still poses a good approximation for scenarios in which the canonical expansion is typically expected to break down, for example for very large-scale haloes or for very small smoothing scales.

While this shows that a Gaussian is always a good description for the galaxy density-environment distribution at a given scale, this does not yet show that it is always sufficient to fully describe all aspects of the galaxy field. For this it is also necessary to consider the ability of the model to jointly describe all scales that are larger than the considered smoothing scale and to show that other variables, such as the Laplacian and the tidal field, can be appropriately incorporated. We shall make the necessary theoretical considerations in the next section and we shall evaluate it quantitatively in Sects. 5 and 6.

4. The multi-variate bias function

Galaxy formation does not only get affected by a uniform large-scale density, but it can also get affected by other aspects of the environment, such as the tidal field or the Laplacian of the density field,

L = 2 δ . $$ \begin{aligned} L&= \nabla ^2 \delta . \end{aligned} $$(45)

Therefore, multiple variables may be necessary to fully characterise the bias function (see e.g. Bartlett et al. 2024 for a clear demonstration that local density-only bias models (in Eulerian space) are already an insufficient description of the halo density field on fairly large scales). In this section, we generalise the concepts introduced in Sect. 2 to the multi-variate case. Additionally, we show how to reconstruct the large-scale bias function, F, from measurements at finite damping scales in Sect. 4.5. Further, in Sect. 4.6, we shall briefly mention how a dependence on the tidal field can be included into a Gaussian bias model (though we do not use it in this article).

4.1. Definitions

We characterise the linear field through a vector, such as x = (δ, L)T, which contains all scalar variables of interest. Our examples mainly focus on the joint treatment of density and Laplacian, but the considerations in this section generalise to other (linear) variables as well.

Most equations that were derived in Sect. 2 generalise in a straightforward way to the multi-variate case. The large-scale bias function, F, and the scale-dependent bias function, f, are again related through

F ( x 0 ) = f ( x + x 0 ) , $$ \begin{aligned} F(\boldsymbol{x}_0)&= \langle {f(\boldsymbol{x} + \boldsymbol{x}_0)}\rangle ,\end{aligned} $$(46)

where the average now goes over a multi-variate Gaussian distribution,

p ( x ) = 1 2 π det ( C ) exp ( 1 2 x T C 1 x ) , $$ \begin{aligned} p(\boldsymbol{x})&= \frac{1}{2 \pi \sqrt{\det (\mathsf C )}} \exp \left(- \frac{1}{2} \boldsymbol{x}^T \mathsf C ^{-1} \boldsymbol{x} \right) ,\end{aligned} $$(47)

where C is the covariance matrix between the components of x. For example, for the joint distribution of density and Laplacian, we have

C δ , L = [ σ 0 2 σ 1 2 σ 1 2 σ 2 2 ] , $$ \begin{aligned} \mathsf{C }_{\delta ,L}&= \left[\begin{matrix}\sigma _{0}^{2}&- \sigma _{1}^{2}\\ - \sigma _{1}^{2}&\sigma _{2}^{2}\end{matrix}\right] ,\end{aligned} $$(48)

where σ02 = ⟨δ2⟩=σ2, σ22 = ⟨L2⟩ and σ12 = ⟨−δL⟩.

The bias parameter, b1, now becomes a vector and b2 a matrix:

b 1 = x F | x = 0 $$ \begin{aligned} \boldsymbol{b}_1&= \left. \nabla _{\boldsymbol{x}} F \, \right|_{\boldsymbol{x} = 0} \end{aligned} $$(49)

b 2 = ( x x ) F | x = 0 , $$ \begin{aligned} \mathsf b _2&= \left. (\nabla _{\boldsymbol{x}} \otimes \nabla _{\boldsymbol{x}}) F \, \right|_{\boldsymbol{x} = 0} ,\end{aligned} $$(50)

where ⊗ denotes an outer product. For the case x = (δ, L)T, the components of b1 correspond to the linear density bias parameter, which we label bδ in this section to avoid confusion (before, it was called b1), and the Laplacian bias bL. The components of b2 correspond to second derivative biases, including bδδ (before called “b2”), bδL and bLL.

The terms bδL and bLL are often considered to be of a higher order than two in perturbation theory. How orders are assigned to different terms depends on the assumption of how spatial derivatives are discounted versus the appearance of higher-order powers of the same variable in the expansion. For mathematical simplicity, we only count orders by the powers of the variables here so that, for example, the term δL is counted as second-order.

4.2. Multi-variate Gaussian bias

We again define cumulant biases through the derivatives of log F:

β 1 = x log F | x = 0 = b 1 $$ \begin{aligned} \boldsymbol{\beta }_1&= \left. \nabla _{\boldsymbol{x}} \log F \, \right|_{\boldsymbol{x} = 0} = b_1 \end{aligned} $$(51)

β 2 = ( x x ) F | x = 0 = b 2 b 1 b 1 , $$ \begin{aligned} {\beta }_2&= \left. (\nabla _{\boldsymbol{x}} \otimes \nabla _{\boldsymbol{x}}) F \, \right|_{\boldsymbol{x} = 0} = \mathsf b _2 - \boldsymbol{b}_1 \otimes \boldsymbol{b}_1 ,\end{aligned} $$(52)

and write the large-scale bias function,

log F = β 1 x + 1 2 x T β 2 x , $$ \begin{aligned} \log F&= \boldsymbol{\beta }_1 \boldsymbol{x} + \frac{1}{2} \boldsymbol{x}^T {\beta }_2 \boldsymbol{x} ,\end{aligned} $$(53)

and assume a multi-variate Gaussian bias model,

f ( x ) = N b exp ( 1 2 ( x μ b ) T C b 1 ( x μ b ) ) . $$ \begin{aligned} f(\boldsymbol{x})&= N_b \exp \left(- \frac{1}{2} (\boldsymbol{x} - \boldsymbol{\mu }_b)^T \mathsf C _{b}^{-1} (\boldsymbol{x} - \boldsymbol{\mu }_b) \right). \end{aligned} $$(54)

By applying Eq. (46), we find the coefficients

μ b = β 2 1 β 1 $$ \begin{aligned} \boldsymbol{\mu }_b&= -\beta _2^{-1} \boldsymbol{\beta }_1 \end{aligned} $$(55)

C b = β 2 1 C $$ \begin{aligned} \mathsf C _b&= -\beta _2^{-1} - \mathsf C \end{aligned} $$(56)

N b = exp ( 1 2 β 1 T β 2 1 β 1 ) det ( 1 + C β 2 ) , $$ \begin{aligned} N_b&= \frac{\exp \left(- \frac{1}{2} \boldsymbol{\beta }_1^T \beta _2^{-1} \boldsymbol{\beta }_1 \right)}{\sqrt{\det (\mathsf 1 + \mathsf C \beta _2 )}} ,\end{aligned} $$(57)

which are simple generalisations of the results from Sect. 2. We note that the multi-variate Gaussian bias of two variables has by default five free parameters – two describing the location of the maximum and three describing the covariance matrix.

4.3. Multi-variate expansion

The re-normalised form of the quadratic expansion model can then be written as

f ( x 0 ) = 1 + b 1 x + 1 2 ( x T b 2 x tr ( C b 2 ) ) $$ \begin{aligned} f(\boldsymbol{x}_0)&= 1 + \boldsymbol{b}_1 \boldsymbol{x} + \frac{1}{2} (\boldsymbol{x}^T \mathsf b _2 \boldsymbol{x} - \mathrm{tr} (\mathsf{C \mathsf b _2}))\end{aligned} $$(58)

= 1 + b δ δ + b L L + 1 2 b δ δ ( δ 2 σ 0 2 ) + b δ L ( δ L + σ 1 2 ) + 1 2 b LL ( L 2 σ 2 2 ) ) . $$ \begin{aligned}&= 1 + b_{\delta } \delta + b_L L + \frac{1}{2} b_{\delta \delta } (\delta ^2 - \sigma _0^2) \nonumber \\&{\quad } + b_{\delta L} (\delta L + \sigma _1^2) + \frac{1}{2} b_{LL} (L^2 - \sigma _2^2)). \end{aligned} $$(59)

The last two terms are not considered as second-order terms in most studies, because terms that include spatial derivatives (like the Laplacian L) are often counted as higher-order terms than those which just include δ. However, since it is most natural to include the corresponding terms in the multi-variate Gaussian bias model, we here use this five parameter model as the ‘multi-variate bias expansion model’ as a fair comparison with the Gaussian bias.

4.4. Measuring bias parameters

Just as in Sect. 2, we can measure the multi-variate bias function, f, by measuring the galaxy environment distribution p(x|g) through a multi-variate histogram and dividing by p(x). Further, we can measure bias parameters through the same method as in Sect. 3 – that is through the moments of the galaxy environment distribution. As was derived in Stücker et al. (2024), the bias parameters can be measured through

b 1 = C 1 x g $$ \begin{aligned} \boldsymbol{b}_1&= \langle \mathsf{C ^{-1} \boldsymbol{x} } \rangle _{\mathrm{g} }\end{aligned} $$(60)

b 2 = C 1 ( x x C ) C 1 g . $$ \begin{aligned} \mathsf b _2&= \left\langle \mathsf{C ^{-1} (\boldsymbol{x} \otimes \boldsymbol{x} - \mathsf C ) \mathsf C ^{-1} } \right\rangle _{\mathrm{g} }. \end{aligned} $$(61)

We note that if we consider x = (δ, L), the obtained density bias estimator is different to the pure density case from Eq. (27):

b δ = δ σ 2 2 + L σ 1 2 σ 0 2 σ 2 2 σ 1 4 g . $$ \begin{aligned} b_{\delta }&= \left\langle {\frac{\delta \sigma _{2}^{2} + L \sigma _{1}^{2}}{\sigma _{0}^{2} \sigma _{2}^{2} - \sigma _{1}^{4}}}\right\rangle _{\mathrm{g} }. \end{aligned} $$(62)

As we investigate in large detail in Stücker et al. (2024), this estimator is much more independent of the scale it is evaluated at than Eq. (27), since the Laplacian term corrects for the scale dependence.

We note again that for the Gaussian bias model the galaxy environment distribution p(x|g) = p(x)f(x) is a (conventionally normalised) multi-variate Gaussian and that our fitting method ensures that its mean and covariance matrix correspond exactly to the mean and covariance of the galaxy environment distribution,

μ g = x g $$ \begin{aligned} \boldsymbol{\mu }_{\mathrm{g} }&= \left\langle {\boldsymbol{x}}\right\rangle _{\mathrm{g} }\end{aligned} $$(63)

C g = ( x μ g ) ( x μ g ) g . $$ \begin{aligned} \mathsf C _{\mathrm{g} }&= \left\langle {(\boldsymbol{x} - \boldsymbol{\mu }_{\mathrm{g} }) \otimes (\boldsymbol{x} - \boldsymbol{\mu }_{\mathrm{g} })} \right\rangle _{\mathrm{g} }. \end{aligned} $$(64)

Arguably, the simplest way to fit and parameterise the multi-variate Gaussian bias model is by inferring the mean and covariance of the galaxy environment distribution as above and then write f through the ratio p(x|g)/p(x). However, the explicit representation through bias parameters is still useful to predict the scale dependence and to compare it fairly with an expansion bias model.

4.5. Measuring the large-scale bias function

It is not only possible to measure f in a non-parametric way, but we can also directly reconstruct the large-scale bias function, F, at finite damping scales:

F ( x 0 ) = f ( x + x 0 ) = p ( x ) f ( x + x 0 ) d x = p ( x x 0 ) f ( x ) d x . $$ \begin{aligned} F(\boldsymbol{x}_0)&= \langle f(\boldsymbol{x} + \boldsymbol{x}_0) \rangle \nonumber \\&= \int _{-\infty }^\infty p(\boldsymbol{x}) f(\boldsymbol{x} + \boldsymbol{x}_0) \mathrm{d} \boldsymbol{x} \nonumber \\&= \int _{-\infty }^\infty p(\boldsymbol{x} - \boldsymbol{x}_0) f(\boldsymbol{x}) \mathrm{d} \boldsymbol{x} . \end{aligned} $$(65)

Therefore, we could in principle obtain F by first measuring f and then convolving it with a Gaussian. However, there is a more accurate method of measuring F that requires fewer discretisation steps. For the multi-variate background distribution, it is:

p ( x x 0 ) = 1 2 π det ( C ) ( 1 2 x T C 1 x + x T C 1 x 0 1 2 x 0 T C 1 x 0 ) = p ( x ) exp ( x T C 1 x 0 ) exp ( 1 2 x 0 T C 1 x 0 ) , $$ \begin{aligned} p(\boldsymbol{x} - \boldsymbol{x}_0)&= \frac{1}{2 \pi \sqrt{\det (\mathsf C )}} \left(- \frac{1}{2} \boldsymbol{x}^T \mathsf C ^{-1} \boldsymbol{x} + \boldsymbol{x}^T \mathsf C ^{-1} \boldsymbol{x}_0 - \frac{1}{2} \boldsymbol{x}_0^T \mathsf C ^{-1} \boldsymbol{x}_0 \right) \nonumber \\&= p(\boldsymbol{x}) \exp \left(\boldsymbol{x}^T \mathsf C ^{-1} \boldsymbol{x}_0 \right) \exp \left(- \frac{1}{2} \boldsymbol{x}_0 ^T \mathsf C ^{-1} \boldsymbol{x}_0 \right) \nonumber ,\end{aligned} $$

where we have used the symmetry of the covariance matrix xTC−1x0 = x0TC−1x. Therefore, we find

F ( x 0 ) = exp ( 1 2 x 0 T C 1 x 0 ) p ( x | g ) exp ( x T C 1 x 0 ) d x = exp ( 1 2 x 0 T C 1 x 0 ) exp ( x T C 1 x 0 ) g . $$ \begin{aligned} F(\boldsymbol{x}_0)&= \exp \left(- \frac{1}{2} \boldsymbol{x}_0 ^T \mathsf C ^{-1} \boldsymbol{x}_0 \right) \int _{-\infty }^\infty p(\boldsymbol{x}|\mathrm{g} ) \exp \left(\boldsymbol{x}^T \mathsf C ^{-1} \boldsymbol{x}_0 \right) \mathrm{d} \boldsymbol{x} \nonumber \\&= \exp \left(- \frac{1}{2} \boldsymbol{x}_0 ^T \mathsf C ^{-1} \boldsymbol{x}_0 \right) \left\langle {\exp \left(\boldsymbol{x}^T \mathsf C ^{-1} \boldsymbol{x}_0 \right)}\right\rangle _{\mathrm{g} } . \end{aligned} $$(66)

The appearing expectation value corresponds to a moment generating function (Stücker et al. 2024). If we consider only the mono-variate distribution of densities x = δ, we have

F ( δ 0 ) = exp ( δ 0 2 2 σ 2 ) exp ( δ δ 0 σ 2 ) g . $$ \begin{aligned} F(\delta _0)&= \exp \left(- \frac{\delta _0^2}{2 \sigma ^2} \right) \left\langle {\exp \left(\frac{\delta \delta _0}{\sigma ^2} \right)}\right\rangle _{\mathrm{g} }. \end{aligned} $$(67)

We call this the ‘spatial order 0’ estimator of the re-normalised density bias function. However, if we consider the joint distribution of density and Laplacian, we get a different estimator, even for pure density displacements, x0 = (δ0, 0)T:

F ( δ 0 ) = exp ( δ 0 2 σ 2 2 2 σ 4 ) exp ( δ 0 δ σ 2 2 + L σ 1 2 σ 4 ) g , $$ \begin{aligned} F(\delta _0)&= \exp \left(- \frac{\delta _0^2 \sigma _2^2}{2 \sigma _*^4} \right) \left\langle {\exp \left(\delta _0 \frac{ \delta \sigma _2^2 + L \sigma _1^2 }{\sigma _*^4} \right)}\right\rangle _{\mathrm{g} } ,\end{aligned} $$(68)

which we call the ‘spatial order 2’ estimator of the re-normalised density bias function, since it contains corrections based on the Laplacian of the density field.

We want to emphasise here the important observation that the re-normalised bias function is not only an abstract theoretical concept, but can also directly be measured in simulations. Further, it means that it is possible to define non-parametric bias approaches at some smoothing scale and to rescale them to different smoothing scales; for example, by directly discretizing F.

While we shall focus in Sect. 5 on measurements of the scale-dependent bias function, f, we shall show reconstructions of the scale-independent large-scale bias function in Sect. 6. Whether both f and F can be consistently described through the same bias model can be used as a test of whether a bias model accurately captures the deterministic dependencies on scales larger than the damping scale. Further, whether the non-parametric measurement F is consistent across different smoothing scales can be used as a test of the range of validity of the PBS assumption for a given set of variables – independently of any assumptions about the functional form.

4.6. Gaussian tidal bias

For the sake of simplicity, and since the tidal bias of haloes is generally very low, we shall not consider tidal fields for the measurements in the remainder of this paper. However, since it is very common to include the tidal field in a bias expansion at the second order, we want to briefly mention how it can be included in the Gaussian bias framework for the benefit of future studies. We define the traceless tidal tensor

K = ( ) ϕ δ 3 J 2 , $$ \begin{aligned} \mathsf K&= (\boldsymbol{\nabla } \otimes \boldsymbol{\nabla }) \phi - \frac{\delta }{3} \mathsf J _2 ,\end{aligned} $$(69)

where J2 is the unit matrix. We show in Appendix B that (at order two) the joint bias function of (δ, L, K) factorises as

f ( δ , L , K ) = f 0 ( δ , L ) f K ( K ) , $$ \begin{aligned} f(\delta , L, \mathsf K )&= f_0(\delta , L) f_\mathsf{K }(\mathsf K ) ,\end{aligned} $$(70)

where the bias function of density and Laplacian f0 is given as earlier by Eq. (54) for x = (δ, L)T and the tidal component of the bias function is given by

f K ( K ) = | 1 + 4 σ 2 15 b K 2 | 5 / 2 exp ( b K 2 K 2 1 + 4 15 σ 2 b K 2 ) , $$ \begin{aligned} f_\mathsf{K }(\mathsf K )&= \left|1 + \frac{4 \sigma ^2}{15} b_{K^2} \right|^{-5/2} \exp \left(\frac{b_{K^2} K^2}{1 + \frac{4}{15} \sigma ^2 b_{K^2}} \right) ,\end{aligned} $$(71)

where K2 = tr(KK) and fK has the large-scale limit

F K ( K 0 ) = f K ( K + K 0 ) = exp ( b K 2 K 0 2 ) . $$ \begin{aligned} F_\mathsf{K }(\mathsf K _0)&= \langle {f_\mathsf{K }(\mathsf K + \mathsf K _0)} \rangle \nonumber \\&= \exp \left(b_{K^2} K_0^2 \right) . \end{aligned} $$(72)

The form of fK can be understood as follows: the normalisation factor originates from the normalisation of a five-dimensional Gaussian, since the distribution p(K) is effectively five-dimensional. Further, the distribution of the tidal tensor has per degree of freedom the variance

1 5 K 2 = 2 σ 2 15 , $$ \begin{aligned} \frac{1}{5} \langle K^2 \rangle&= \frac{2\sigma ^2}{15} ,\end{aligned} $$(73)

which naturally appears in Eq. (71).

We summarise that it is possible to phrase a re-normalised Gaussian bias model in the multi-variate case – for example, using the density, Laplacian, and tidal field as variables. Therefore, it is possible to encode all terms that are traditionally considered in a second-order expansion in a fully self consistent Gaussian bias model that is strictly positive and for which f(δ, L, K), F(δ, L, K) and p(δ, L, K|g) are all simple multi-variate Gaussians.

5. Measurements of the multi-variate scale-dependent bias function

In this section, we show measurements of the multi-variate bias function considering the density and the Laplacian of the density field. Here, we shall only check whether the models describe the bias relation at a single scale, whereas we shall consider the problem across multiple scales in Sect. 6.

5.1. Example function

First, we visualise the multi-variate bias function for one example case. Here, we select haloes with masses M200b ∼ 2 × 1014 h M at a damping scale kd = 0.2h−1Mpc. We measure the galaxy environment distribution p(δ, L|g) through a normalised histogram of (δ, L) at the locations of galaxies and we measure the bias function through the histogram weighted by 1/p(δ, L).

In the left column of Fig. 5, we show the galaxy environment distribution for different cases whereas the right panels show the bias function for the respective cases. The matter distribution is – as expected – a multi-variate Gaussian and leads to a bias function with f = 1 in the measured regime. Considering the distribution of haloes in the second row of Fig. 5, we find that the bias function is measured reasonably well within the 3σ region (outermost grey contour). The distribution of haloes is offset from the background distribution, clearly preferring higher densities and more negative values of the Laplacian.

thumbnail Fig. 5.

Distribution of Lagrangian over-densities, δ, and the Laplacian, L (in units of h2 Mpc−2) p(δ, L|g) (left), and the multi-variate bias function (right). The top row shows the distribution for matter particles for kd = 0.2 h Mpc−1 and the second row of haloes selected in the mass-range M200b ∼ 2 × 1014 h M. The remaining rows show approximations to the halo distribution through a 3 parameter second-order expansion, a 5 parameter second-order expansion and a Gaussian bias model. The contours show the 1-,2- and 3-sigma regions of the matter distribution. The measurement for haloes has only reasonable statistics roughly inside of the 3-sigma region of the matter distribution (outermost grey contour). The multi-variate Gaussian bias seems to be a good approximation to the actual distribution and adequately respects the positivity constraint.

We consider three different approximations to the bias function – a quadratic expansion bias with bδ, bδ, δ and bL as parameters, a quadratic bias model with the five parameters bδ, bδ, δ, bL, bδ, L, bL, L, and a Gaussian bias with the same parameters. All of these parameters are inferred as in Eqs. (60) and (61) (through moments of the galaxy environment distribution) independently of the considered model so that this is a fair comparison independent of fitting technique. While the quadratic biases both capture the rough shape of the environment distribution and of the bias function reasonably well, they also have some severe shortcomings. Most significantly they predict negative values for significant fraction of space. On the other hand, the Gaussian bias is as expected strictly positive and it describes the shape of the actual bias function almost perfectly. It is quite plausible that the Gaussian would also be a significantly better description of the bias function in this scenario than expansions of significantly higher order. The built-in positivity constraint makes it significantly easier to approximate the actual bias function which is very close to zero over a significant fraction of space.

5.2. Systematic evaluation

Here, we systematically evaluate the performance of the quadratic versus Gaussian bias model for the same set of damping scales and tracer selections as in Sect. 3.5. We again consider the Wasserstein distance between the two distributions. In the two-dimensional scenario, it is necessary to assume a metric between δ and L. The most natural choice is the distance notion,

d ( x 1 , x 2 ) = ( x 1 x 2 ) T C δ , L 1 ( x 1 x 2 ) , $$ \begin{aligned} d(\boldsymbol{x}_1, \boldsymbol{x}_2)&= \sqrt{(\boldsymbol{x}_1 - \boldsymbol{x}_2)^T C^{-1}_{\delta , L} (\boldsymbol{x}_1 - \boldsymbol{x}_2)} ,\end{aligned} $$(74)

which measures the distance in units of standard deviations. The Wasserstein distance is then again the minimal average amount that probability mass has to be transported to transform the first distribution into the second. In two dimensions, calculating the Wasserstein distance is a rather complicated optimisation problem. We used the POT (Python Optimal Transport) library to estimate the Wasserstein distance between the actual and the measured distribution. To obtain a manageable computation time, we discretised the distributions to a grid that covers the 5σ-region (in the principal component frame) through 502 bins. We checked that increasing the region or the number of bins does not affect the results notably. Since the Wasserstein distance is invariant under adding constant offsets to both distributions, it is also well defined for the case with negative probability densities.

We show the thus-obtained Wasserstein distances in the top panel of Fig. 6. The overall picture is very similar to what we have found in Sect. 3: The multi-variate Gaussian bias always has smaller or equal values of the Wasserstein distance compared to the quadratic (five parmeter) expansion. The difference is most significant for high-mass, high-bias objects, M200b ≳ 1014 h M, reaching more than an order of magnitude at large kd. For intermediate-mass objects and stellar-mass-selected galaxies the difference is still quite significantly smaller for the Gaussian bias. For stellar-mass-selected galaxies, the Wasserstein metric is almost equal between the Gaussian and the expansion bias, though it is slightly smaller for the expansion bias at some intermediate scales; for example, at kd = 0.3 h Mpc−1.

thumbnail Fig. 6.

Metrics for evaluating the performance of the different bias models in the multi-variate case. The top panel shows the Wasserstein metric (smaller is better), the central panel the normalised log likelihood (larger is better) and the bottom panel the fraction of volume for which negative galaxy densities are predicted. Generally, the multi-variate Gaussian bias model appears to describe the data either significantly better or at least equally well to a quadratic bias model and it never predicts negative galaxy densities, which can get quite significant for the expansion bias.

As a secondary metric, we again consider the log-likelihood, with the likelihood now normalised to the multi-variate Gaussian case:

log L n = log p ( δ , L | g ) g + 1 2 log ( 4 π 2 det C ) + 1 . $$ \begin{aligned} \log L_n&= \left\langle {\log p(\delta , L | \mathrm{g} ) }\right\rangle _{\mathrm{g} } + \frac{1}{2} \log (4 \pi ^2 \det {C}) + 1 . \end{aligned} $$(75)

For the quadratic bias we use the same clipping procedure as described in Sect. 3.5. We have again verified that results are quite independent of the clipping value, and we always use the one that leads to the largest likelihood.

We show the likelihood values in the second panel of Fig. 6. The picture is very similar to the mono-variate case from Fig. 4 with some differences being additionally enhanced.

Finally, we measure the fraction of Lagrangian Volume that is assigned a negative probability (and therefore galaxy density) by the expansion bias

V neg = 1 H ( f ( x ) ) , $$ \begin{aligned} V_{\mathrm{neg} }&= 1 - \left\langle {H(f(\boldsymbol{x}))}\right\rangle ,\end{aligned} $$(76)

where H is the Heaviside step function. We show the negative volume fraction in the bottom panel of Fig. 6. While the Gaussian bias has always Vneg = 0, the expansion bias predicts varying amounts of negative volume. Generally, the problem gets worse with larger kd and with stronger biased tracers; for example, for the SFR galaxies the fraction seems to be below 1% at all scales, but about half of the volume is predicted to be negative for the haloes with M200b ∼ 1014 M at the largest kd.

We conclude that the joint bias function of (δ, L) behaves very similar to the mono-variate case that only considers δ: Generally, the multi-variate Gaussian bias model describes the data either significantly better or at least equally well to a quadratic bias model. The benefits are most significant for highly biased tracers and large kd.

6. Bias functions across scales

6.1. The scale dependence of the bias function, f

We have so far focussed on measurements of the bias function, f, at finite smoothing scales and tested how well different bias models describe f if they are optimised to reproduce the moments of the galaxy environment distribution. However, these tests did not yet ensure that the bias function at smoothing scales larger than the considered scale is well described by the same model.

Bias functions at different smoothing scales can be unified in a single model through the core assumption of the bias scheme, the Peak-background split (PBS), stating that the response to perturbations is independent of the scale of the perturbation (as long as the scale is sufficiently large). The PBS ensures that if the bias function fC1 is defined at some scale with some covariance matrix C1 of the perturbations, the bias function at any other scale fC2 (with another covariance matrix) follows automatically from fC1.

Consider Fig. 7 where we show the density bias function of haloes with M200b ∼ 1014 h M evaluated at different damping scales. In comparison we show the predicted density distribution of a multi-variate (δ, L) Gaussian bias model with parameters estimated at kd = 0.2 h Mpc−1, but evaluated with the covariances of the different scales. One can recover the projected density-only version of a multi-variate model by projecting the galaxy environment distribution,

thumbnail Fig. 7.

Bias function measured at different smoothing scales in comparison to a multi-variate Gaussian bias model fitted at kd = 0.2 h Mpc−1 and rescaled under the assumption of the PBS. The PBS assumption recovers the bias function at larger scales kd ≲ 0.2 h Mpc−1 very well, but seems to be less accurate at smaller scales.

f ( δ ) = p ( δ , L | g ) d L p ( δ ) , $$ \begin{aligned} f(\delta )&= \frac{\int p(\delta , L | \mathrm{g} ) \mathrm{d} L}{p(\delta )} \nonumber ,\end{aligned} $$

which leads to a Gaussian with scale-dependent mean and variance depending on the Laplacian bias terms. We notice that the bias functions that are predicted at the scales kd ≲ 0.2 h Mpc−1 are very well reconstructed by the PBS assumption. At the scales kd = 0.25 h Mpc−1 and kd = 0.3 h Mpc−1 the reconstruction is still reasonably close to the actual function, but clearly not optimal. This means either that the PBS assumption becomes inaccurate at those scales or that additional variables (e.g. ∇4δ) become relevant. This observation is relatively independent of the scale where we have performed the fit (as following considerations show).

Here, we want to understand whether the bias function is well described by a single multi-variate Gaussian model across multiple scales. There is a convenient alternative to measuring the bias function at each scale individually and rescaling the bias model accordingly. We may instead measure the aspects of the bias function that should be invariant across the different scales. While this is usually done through the large-scale bias parameters, it is also possible to recover this information in a non-parameteric way by directly measuring the re-normalised large-scale bias function, F.

6.2. Measurements of the re-normalised bias function, F

As we have shown in Sect. 4.5, we can directly measure the large-scale bias function through Eq. (66). We shall use this to directly measure F(δ0). While the equation should be valid for any value of δ0, the expectation can be quite dominated by rare outliers with large weights for large values of δ0/σ0. We discuss this in detail for a toy model with known solution in Appendix A, where we also devise a technique to identify the regime that is unaffected by rare outlier statistics. In plots throughout this section, we only show this regime that is statistically reliable.

In Fig. 8, we show measurements of the re-normalised bias function F(δ0), comparing the spatial order 0 estimator from Eq. (67) with the spatial order 2 estimator from Eq. (68). The spatial order 0 estimator only reconstructs the function for δ0 ≲ 0.2 in a reasonably scale-independent manner, but it clearly yields a very scale-dependent behavior at larger δ0. This implies that any bias model that is only a function of density (no matter which order) that is fitted at some scale, such as kd = 0.2 h Mpc−1, would not reliably capture the bias at larger scales. Lagrangian local in matter density (LLIMD) models are therefore limited in validity to very large scales, even if they consider polynomials of arbitrary high orders in density.

thumbnail Fig. 8.

Re-normalised bias function of haloes with M200b ∼ 1014 h M reconstructed at zeroth spatial order (left) and second spatial order (right) for different damping scales. At second spatial order the reconstructed function is independent of the measurement scale for kd ≲ 0.2 h Mpc−1 and it is well described by a Gaussian bias model.

On the other hand, the spatial order two reconstruction yields an almost perfectly scale-independent behavior at scales kd ≲ 0.2 h Mpc−1. Only at smaller scales, kd = 0.25 h Mpc−1 and kd = 0.3 h Mpc−1, does the reconstruction become scale-dependent, consistent with the observations from Fig. 8. At such small scales, the agreement may be improved by considering higher-order variables in a reconstruction of fourth spatial order. However, it seems clear that we can reconstruct the re-normalised bias function very well on sufficiently large scales with the spatial order 2 estimator.

Where F is reliably reconstructed, it yields a direct estimate of the bias function that can also be measured with separate universe simulations (Li et al. 2014; Wagner et al. 2015; Lazeyras et al. 2016). The region δ0 < −1 is well defined and well behaved, because the δ0 value corresponds only to a linearly extrapolated density contrast – describing aspects of the initial conditions, but not the final density. However, we should expect ill-defined behavior for δ0 ≳ 1.68, since a hypothetical separate universe with such a density contrast would collapse to a single point. This point lies anyways far beyond the region that we can reliably reconstruct in a scale-independent manner, so that this limitation is not relevant for the presented measurements.

We compare the measured re-normalised bias function with the quadratic and the Gaussian model that is using the bias parameters measured at kd = 0.2 h Mpc−1. Clearly, both models match the function well at small absolute values |δ0|≲0.2. However, while the expansion bias strongly deviates at larger values and catastrophically fails at very small δ0 ≲ −0.5, the Gaussian bias seems to be an excellent approximation to the actual function everywhere where it is well measured.

We measure the re-normalised bias function in the same (spatial order 2) manner for three different sets of haloes with masses 1013, 3 × 1014 and 1015 h M in Fig. 9. The reconstructions up to kd ≲ 0.2 h Mpc−1 seem again reliably scale independent. The Gaussian seems to be a better approximation than the quadratic bias for all three sets of haloes. The difference is most significant in regions where the bias function gets close to zero F ≲ 0.5.

thumbnail Fig. 9.

Measurements of the re-normalised bias function, F, for haloes of three different mass selections. A reliable scale-independent reconstruction seems generally possible from damping scales up to kd ≲ 0.2 h Mpc−1. The function looks generally better approximated by a Gaussian than a quadratic bias model, especially in regions where the function gets close to zero F ≲ 0.5.

Finally, we show in Fig. 10 the equivalent measurements for galaxies. Similar to our observations from the measurements of the scale-dependent bias functions, the stellar-mass-selected galaxies are slightly better approximated by a Gaussian, whereas the SFR galaxies are roughly equally well reconstructed by both models. However, it seems that these galaxies have a sufficiently low bias that even a linear bias model would be a good approximation.

thumbnail Fig. 10.

Measurements of the re-normalised bias function, F, for stellar-mass-selected galaxies (top) and SFR-selected galaxies (bottom).

We conclude that a multi-variate Gaussian bias based on the density and Laplacian model may consistently and accurately describe the bias relation at all scales kd ≲ 0.2 h Mpc−1. While we have seen in Sect. 5 that a Gaussian is still a good approximation to f at scales far beyond kd > 0.2 h Mpc−1, such scales seem not perfectly reconcilable with the PBS assumption that only assumes δ and L as variables. It might be possible to push the scale-independence to smaller scales by considering additional variables like ∇4δ, which is, however, beyond the scope of this paper.

7. Why the bias function should be Gaussian

We have so far shown through measurements in simulations that the bias function appears close to a Gaussian. A priori it is not obvious why the bias function should appear Gaussian, since it is possible to imagine very complicated galaxy formation scenarios.

We suggest that the main reason that the large-scale bias function appears Gaussian, is that the statistics of the linear density field is itself Gaussian. If the smooth linear density δ1 on some large length scale is specified, then the conditional distribution of density at any other length scale p(δ2|δ1) will follow a Gaussian distribution. Therefore, if we had some non-Gaussian bias function, f2, at some small scale, the bias function at larger scales will be f2 convolved with a Gaussian. The resulting bias function f1 will ultimately be closer to Gaussian.

To illustrate this, we consider a set of non-Gaussian selection functions as a toy model. We measure the linear density contrast at a damping scale of kd = 0.3 h Mpc−1 where σ = 1.1 and we define different ways to select biased populations based on this. To define these tracers, we use a rejection sampling approach with different probability densities:

p Dirac Delta = { 1 if 0.9 < δ < 1.1 0 else $$ \begin{aligned} p_{\mathrm{Dirac Delta} }&= {\left\{ \begin{array}{ll} 1 \,\,&\mathrm{if} \,\,\quad 0.9 < \delta < 1.1 \\ 0&\mathrm{else} \end{array}\right.} \end{aligned} $$(77)

p Tophat = { 1 if 2 < δ < 0 0 else $$ \begin{aligned} p_{\mathrm{Tophat} }&= {\left\{ \begin{array}{ll} 1 \,\,&\mathrm{if} \,\,\quad -2 < \delta < 0 \\ 0&\mathrm{else} \end{array}\right.} \end{aligned} $$(78)

p Heavyside = { 1 if δ > 0 0 else $$ \begin{aligned} p_{\mathrm{Heavyside} }&= {\left\{ \begin{array}{ll} 1 \,\,&\mathrm{if} \,\,\quad \delta > 0 \\ 0&\mathrm{else} \end{array}\right.} \end{aligned} $$(79)

p Double linear = { 1 2 + δ if 1 2 < δ < 1 2 3 2 δ if 1 2 < δ < 3 2 0 else $$ \begin{aligned} p_{\mathrm{Double linear} }&= {\left\{ \begin{array}{ll} \frac{1}{2} + \delta \,\,&\mathrm{if} \quad -\frac{1}{2} < \delta < \frac{1}{2} \\ \frac{3}{2} - \delta \,\,&\mathrm{if} \,\,\quad \frac{1}{2} < \delta < \frac{3}{2} \\ 0&\mathrm{else} \end{array}\right.} \end{aligned} $$(80)

p Quadratic = { 1 1 5 ( δ + 1 2 ) 2 if 1 2 < δ < 1 2 0 else . $$ \begin{aligned} p_{\mathrm{Quadratic} }&= {\left\{ \begin{array}{ll} 1 - \frac{1}{5} (\delta + \frac{1}{2})^2 \,\,&\mathrm{if} \quad -\frac{1}{2} < \delta < \frac{1}{2} \\ 0&\mathrm{else} . \end{array}\right.} \end{aligned} $$(81)

After sampling the tracers, we measure the bias function, f, at the corresponding scale as shown in the top panel of Fig. 11. Further, we measure the re-normalised bias function through Eq. (67) – representing the bias relation that would be observed at much larger scales. The spatial order 0 reconstruction is perfect in the considered scenario, since the actual selection function depends only on density. Further, recall that F corresponds exactly to f convolved by the Gaussian background distribution at the measured scale:

thumbnail Fig. 11.

Bias function, f (top panel), vs the re-normalised large-scale bias function, F (bottom), for a set of toy models. Selection functions that are very non-Gaussian on small scales naturally lead to bias relations that are close to Gaussian on large scales, since their relation is given by a convolution with a Gaussian.

F = f ° p . $$ \begin{aligned} F&= f \circ p . \end{aligned} $$(82)

We show the re-normalised bias function in the bottom panel of Fig. 11 alongside with Gaussian bias models that use the bias parameters measured from the moments of the environment distribution, as in Eq. (27).

While all the large-scale selection functions are quite different from Gaussians, the large-scale bias function, F, appears much closer to a Gaussian in almost all scenarios. To understand this, we first considered the case of the ‘Dirac Delta’ selection function, where F has to be exactly Gaussian. Mathematically, this is the case because a Dirac delta distribution convolved with a Gaussian distribution simply gives a Gaussian distribution centered at the location of the Dirac delta distribution (at δ = 1). Physically, we can interpret F in this situation as the probability to reach exactly the density δ = 1 at kd = 0.3 h Mpc−1, given that the density contrast is δ0 at infinitely large scales (normalised to F = 1 at δ0 = 0).

While for none of the other selection functions it is expected that F is exactly Gaussian, it is striking that F is quite close to a Gaussian for almost all of them. The convolution simply erases all sharp features of the selection and leaves us with functions that are smooth on the scale σ and that are very well approximated by Gaussians. The only case that quite significantly deviates from a Gaussian is the Heavyside selection function. In this scenario, F corresponds exactly to an error-function. However, even in this case, major parts of F (δ0 ≲ 0.5) are still very well described.

We may expect that the more compact the selection function is on small scales, the more closely the large-scale bias function resembles a Gaussian. From simple toy models – for example, the spherical collapse excursion set formalism (Press & Schechter 1974; Bond et al. 1991) – we can plausibly understand why the selection function for haloes should be relatively compact. In these models, a volume element is considered to be part of a halo of mass M if the density contrast crosses a barrier of δc = 1.68 for the first time at the Lagrangian smoothing scale of the halo. The bias function has to appear as a Dirac-delta function at that scale. Deriving the actual bias function on larger scales is slightly more complicated, because it receives some additional scale dependence due to the first-crossing requirement. However, for the case of a sharp k filter the resulting function is

F ES = ( 1 δ δ c ) exp ( ( 2 δ c δ ) δ 2 σ M 2 ) , $$ \begin{aligned} F_{\mathrm{ES} }&= \left(1 - \frac{\delta }{\delta _c}\right) \exp \left( \frac{(2 \delta _c - \delta ) \delta }{2 \sigma _{M}^2} \right) ,\end{aligned} $$(83)

as is explained, for example, in Desjacques et al. (2018), where σM is the variance of the linear density field at the scale of the halo. Given our argument from above, we understand that this is quite close to a Gaussian.

We may further speculate about the reason why the bias function of our mock-galaxies seems to be much wider than that of haloes. Galaxies of some particular selection may occupy haloes of many different masses and appear therefore as a superposition of the bias function of different haloes. It appears that our mocked SFR galaxies occupy a sufficiently wide range in halo masses that the bias function is so smooth that it is hard to tell the difference between, for example, a linear, quadratic, or Gaussian bias model.

We conclude that large-scale bias functions are likely to be close to Gaussian even if galaxy formation acts in a very non-Gaussian manner on small scales, because large-scale and large-scale bias are related through a convolution with a Gaussian. More elaborate models beyond the Gaussian paradigm may consider more general basis functions that are optimised to describe the space of positive functions that are smooth at some scale σ.

8. Conclusions

In this article, we have presented a new non-parametric method of measuring the galaxy environment distribution, p(δ|g), the scale-dependent bias function, f(δ), and its re-normalised counterpart, F(δ0), in Lagrangian space. The new method is very simple and robust, only requiring one to evaluate simple expectation values and histograms. We have used the new method to measure the considered functions and we have found that all three functions are strikingly close to a Gaussian.

Therefore, we have introduced a new Gaussian Lagrangian bias model. This bias model has the remarkably simple property that all three functions, p(δ|g), f(δ), and F(δ0), are given by Gaussian functions with parameters that can easily be expressed in terms of re-normalised biases, b1 and b2. Further, we have shown that in the multi-variate case – considering additionally for example the Laplacian of the density field or the tidal field as variables – a multi-variate Gaussian bias model can be formulated that has the same merits.

We have systematically tested how well the scale-dependent bias function is approximated through a Gaussian bias in comparison to a (quadratic) expansion bias model with the same number of free parameters. While the quadratic model generally only works well for some carefully selected regime (e.g. large scales and low-bias objects), the Gaussian seems to be a good approximation in all regimes. In general, the Gaussian appears to be at least as accurate as a quadratic bias (e.g. for low-mass haloes and SFR galaxies) or significantly more accurate (e.g. for high-mass haloes and small scales), but never notably worse. This is so in both the mono-variate and multi-variate cases.

Further, we have investigated the scale dependence of the bias function and tested whether a multi-variate (δ, L) Gaussian bias model that fits the bias function well at one damping scale also accurately captures the behavior on larger scales. We have found that such consistency between scales can be reliably achieved up to a damping scale of kd ∼ 0.2 h Mpc−1 for any set of tracers. For even larger kd, some additional care must be taken; for example, by additionally considering higher-order derivatives and by carefully testing the reliability of the PBS assumption.

Beyond the improved accuracy, the Gaussian bias model also has the desirable property that it only predicts positive function values over the whole input domain. This is a physical constraint that is obviously the case for realistic galaxy catalogues, but that is universally violated by the canonical bias expansion approach. This encoded constraint likely contributes significantly to the accuracy of the bias model, especially at small smoothing scales where much of the space may be predicted to be with negative galaxy densities by canonical bias expansion models. Further, it opens up a new set of possibilities that presuppose the positivity of the bias function:

  • Probability theory can be applied in a rigorous manner with a Gaussian bias model. For example, the galaxy environment distribution forms a well-defined probability distribution and the likelihood of observing a set of galaxy environments, given a Gaussian bias model, is a well-defined metric and could potentially be used by initial condition reconstruction methods.

  • With strictly positive predicted galaxy densities, the Gaussian bias could be used to create mock galaxy catalogues with discrete tracers from a given set of bias parameters, by sampling the corresponding probability density. Predictions of discrete tracers would mimic actual galaxy catalogues with a notably larger degree of realism than the continuous fields that are typically predicted by bias models. However, it would be important to develop a correlated way of sampling that allows one to create super- or sub-Poissonian statistics.

For the latter point, we note that such an option also exists when using non-parametric measurements of the bias function that we have presented in this article. For example, one could measure the bias function of galaxies from some small-volume hydro-simulation and use it to populate a large-scale dark-matter-only simulation.

While the Gaussian bias model seems to be a promising alternative to polynomial expressions, there still remain a few open questions that can be addressed in future studies. An important aspect of bias models that describe most measurable statistics (e.g. the power spectrum) is a description of the large-scale contribution of large-scale stochastic components. Conveniently, the galaxy environment distribution that we have investigated here does not include any unaccounted stochastic component from small scales, so that we could investigate the deterministic part of the bias relation without modelling stochastic terms. For other statistics, the stochastic components may probably be modelled for a Gaussian model in the same way in which they are typically modelled for expansion biases, but this aspect deserves some additional theoretical consideration. In particular, it would be an interesting alternative option to try mimicking the stochastic components through a discrete sampling process, as was mentioned above.

Further, it would be interesting to consider higher-order generalisations that naturally recover a Gaussian bias at the second order. In Stücker et al. (2024), we have introduced the cumulant bias expansion as a general description of the large-scale bias function, F, and suggested a few practical suggestions to implement it into actual bias functions. However, it is difficult to find a general method to describe probability distributions with more than two cumulants in a way that ensures the positivity, f > 0. Therefore, the question of what is an optimal basis at orders of n > 2 remains open and can hopefully be addressed by future studies.

We conclude that a Gaussian bias model provides a mathematically simple, well-behaved, and accurate description of the bias function. If computationally feasible, it should generally be preferred over a second-order expansion. In cases in which practical aspects may favour an expansion approach3, this knowledge can still be used by re-expressing canonical higher-order parameters through the first two cumulant biases, as is explained in Stücker et al. (2024).


1

With a damping scale kd = 0.15 h Mpc−1 as explained in Sect. 3.

2

Where for the Gaussian bias we additionally use that β1 = b1 and β2 = b2 − b12.

3

For example, for a bias expansion, the power spectrum can be decomposed into a finite set of cross spectra that only need to be calculated once, but not for every set of parameters.

Acknowledgments

The authors thank Sergio Contreras and Sara Ortega Martinez for providing mocks of galaxies. The authors thank Oliver Philcox for helpful discussions and comments to the draft and Simon White and Oliver Hahn for helpful discussions. We acknowledge funding from the Spanish Ministry of Science and Innovation through grant number PID2021-128338NB-I00. RV acknowledges the support of the Juan de la Cierva fellowship (FJC2021-048002-I). MPI is supported by STFC consolidated grant no. RA5496.

References

  1. Amendola, L., Appleby, S., Avgoustidis, A., et al. 2018, Liv. Rev. Rel., 21, 2 [Google Scholar]
  2. Angulo, R. E., & Hahn, O. 2022, Liv. Rev. Comput. Astrophys., 8, 1 [CrossRef] [Google Scholar]
  3. Angulo, R. E., & Pontzen, A. 2016, MNRAS, 462, L1 [NASA ADS] [CrossRef] [Google Scholar]
  4. Angulo, R. E., Zennaro, M., Contreras, S., et al. 2021, MNRAS, 507, 5869 [NASA ADS] [CrossRef] [Google Scholar]
  5. Assassi, V., Baumann, D., & Schmidt, F. 2015, JCAP, 2015, 043 [CrossRef] [Google Scholar]
  6. Ata, M., Kitaura, F.-S., & Müller, V. 2015, MNRAS, 446, 4250 [Google Scholar]
  7. Balaguera-Antolínez, A., Kitaura, F.-S., Pellejero-Ibáñez, M., Zhao, C., & Abel, T. 2019, MNRAS, 483, L58 [CrossRef] [Google Scholar]
  8. Baldauf, T., Mirbabayi, M., Simonović, M., & Zaldarriaga, M. 2016, arXiv e-prints [arXiv:1602.00674] [Google Scholar]
  9. Bardeen, J. M., Bond, J. R., Kaiser, N., & Szalay, A. S. 1986, ApJ, 304, 15 [Google Scholar]
  10. Bartlett, D. J., Ho, M., & Wandelt, B. D. 2024, arXiv e-prints [arXiv:2405.00635] [Google Scholar]
  11. Baumann, D., Nicolis, A., Senatore, L., & Zaldarriaga, M. 2012, JCAP, 2012, 051 [Google Scholar]
  12. Bernardeau, F., Colombi, S., Gaztañaga, E., & Scoccimarro, R. 2002, Phys. Rep., 367, 1 [Google Scholar]
  13. Bond, J. R., Cole, S., Efstathiou, G., & Kaiser, N. 1991, ApJ, 379, 440 [NASA ADS] [CrossRef] [Google Scholar]
  14. Chen, S.-F., Vlah, Z., & White, M. 2020, JCAP, 2020, 062 [CrossRef] [Google Scholar]
  15. Colas, T., d’Amico, G., Senatore, L., Zhang, P., & Beutler, F. 2020, JCAP, 2020, 001 [Google Scholar]
  16. Contreras, S., Angulo, R. E., & Zennaro, M. 2021, MNRAS, 508, 175 [NASA ADS] [CrossRef] [Google Scholar]
  17. d’Amico, G., Gleyzes, J., Kokron, N., et al. 2020, JCAP, 2020, 005 [Google Scholar]
  18. DeRose, J., Kokron, N., Banerjee, A., et al. 2023, JCAP, 2023, 054 [CrossRef] [Google Scholar]
  19. Desjacques, V., Jeong, D., & Schmidt, F. 2018, Phys. Rep., 733, 1 [Google Scholar]
  20. Friedrich, O., Halder, A., Boyle, A., et al. 2022, MNRAS, 510, 5069 [NASA ADS] [CrossRef] [Google Scholar]
  21. Hernández-Sánchez, M., Kitaura, F.-S., Ata, M., & Dalla Vecchia, C. 2021, MNRAS, 502, 3976 [CrossRef] [Google Scholar]
  22. Ivanov, M. M., Simonović, M., & Zaldarriaga, M. 2020, JCAP, 2020, 042 [Google Scholar]
  23. Jasche, J., & Lavaux, G. 2019, A&A, 625, A64 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  24. Kaiser, N. 1984, ApJ, 284, L9 [NASA ADS] [CrossRef] [Google Scholar]
  25. Kitaura, F.-S., Ata, M., Rodríguez-Torres, S. A., et al. 2021, MNRAS, 502, 3456 [NASA ADS] [CrossRef] [Google Scholar]
  26. Kitaura, F. S., Balaguera-Antolínez, A., Sinigaglia, F., & Pellejero-Ibáñez, M. 2022, MNRAS, 512, 2245 [NASA ADS] [CrossRef] [Google Scholar]
  27. Kokron, N., DeRose, J., Chen, S.-F., White, M., & Wechsler, R. H. 2021, MNRAS, 505, 1422 [NASA ADS] [CrossRef] [Google Scholar]
  28. Lazeyras, T., Wagner, C., Baldauf, T., & Schmidt, F. 2016, JCAP, 2016, 018 [CrossRef] [Google Scholar]
  29. Levi, M., Bebek, C., Beers, T., et al. 2013, arXiv e-prints [arXiv:1308.0847] [Google Scholar]
  30. Li, Y., Hu, W., & Takada, M. 2014, Phys. Rev. D, 89, 083519 [NASA ADS] [CrossRef] [Google Scholar]
  31. Manera, M., & Gaztañaga, E. 2011, MNRAS, 415, 383 [NASA ADS] [CrossRef] [Google Scholar]
  32. Marinacci, F., Vogelsberger, M., Pakmor, R., et al. 2018, MNRAS, 480, 5113 [NASA ADS] [Google Scholar]
  33. Matsubara, T. 1995, ApJS, 101, 1 [NASA ADS] [CrossRef] [Google Scholar]
  34. Matsubara, T. 2011, Phys. Rev. D, 83, 083518 [NASA ADS] [CrossRef] [Google Scholar]
  35. Mo, H. J., & White, S. D. M. 1996, MNRAS, 282, 347 [Google Scholar]
  36. Modi, C., Chen, S.-F., & White, M. 2020, MNRAS, 492, 5754 [NASA ADS] [CrossRef] [Google Scholar]
  37. Musso, M., Paranjape, A., & Sheth, R. K. 2012, MNRAS, 427, 3145 [CrossRef] [Google Scholar]
  38. Naiman, J. P., Pillepich, A., Springel, V., et al. 2018, MNRAS, 477, 1206 [Google Scholar]
  39. Nelson, D., Pillepich, A., Springel, V., et al. 2018, MNRAS, 475, 624 [Google Scholar]
  40. Neyrinck, M. C., Aragón-Calvo, M. A., Jeong, D., & Wang, X. 2014, MNRAS, 441, 646 [Google Scholar]
  41. Nishimichi, T., D’Amico, G., Ivanov, M. M., et al. 2020, Phys. Rev. D, 102, 123541 [NASA ADS] [CrossRef] [Google Scholar]
  42. Ortega-Martinez, S., Contreras, S., & Angulo, R. 2024, A&A, 689, A66 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  43. Paranjape, A., Sefusatti, E., Chan, K. C., et al. 2013a, MNRAS, 436, 449 [CrossRef] [Google Scholar]
  44. Paranjape, A., Sheth, R. K., & Desjacques, V. 2013b, MNRAS, 431, 1503 [CrossRef] [Google Scholar]
  45. Pellejero-Ibañez, M., Balaguera-Antolínez, A., Kitaura, F.-S., et al. 2020, MNRAS, 493, 586 [CrossRef] [Google Scholar]
  46. Pellejero Ibañez, M., Stücker, J., Angulo, R. E., et al. 2022, MNRAS, 514, 3993 [Google Scholar]
  47. Pellejero Ibañez, M., Angulo, R. E., Zennaro, M., et al. 2023, MNRAS, 520, 3725 [Google Scholar]
  48. Philcox, O. H. E., & Ivanov, M. M. 2022, Phys. Rev. D, 105, 043517 [CrossRef] [Google Scholar]
  49. Pillepich, A., Nelson, D., Hernquist, L., et al. 2018, MNRAS, 475, 648 [Google Scholar]
  50. Planck Collaboration VI. 2020, A&A, 641, A6 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  51. Press, W. H., & Schechter, P. 1974, ApJ, 187, 425 [Google Scholar]
  52. Schmittfull, M., Simonović, M., Assassi, V., & Zaldarriaga, M. 2019, Phys. Rev. D, 100, 043514 [NASA ADS] [CrossRef] [Google Scholar]
  53. Sigad, Y., Dekel, A., & Branchini, E. 2000, ASP Conf. Ser., 201, 360 [NASA ADS] [Google Scholar]
  54. Springel, V., White, S. D. M., Tormen, G., & Kauffmann, G. 2001, MNRAS, 328, 726 [Google Scholar]
  55. Springel, V., Pakmor, R., Pillepich, A., et al. 2018, MNRAS, 475, 676 [Google Scholar]
  56. Stücker, J., Pellejero-Ibáñez, M., Angulo, R. E., Maion, F., & Voivodic, R. 2024, A&A, submitted [arXiv:2405.01950] [Google Scholar]
  57. Szalay, A. S. 1988, ApJ, 333, 21 [NASA ADS] [CrossRef] [Google Scholar]
  58. Uhlemann, C., Feix, M., Codis, S., et al. 2018, MNRAS, 473, 5098 [NASA ADS] [CrossRef] [Google Scholar]
  59. Vlah, Z., Castorina, E., & White, M. 2016, JCAP, 2016, 007 [Google Scholar]
  60. Wagner, C., Schmidt, F., Chiang, C. T., & Komatsu, E. 2015, MNRAS, 448, L11 [CrossRef] [Google Scholar]
  61. Wu, X., Muñoz, J. B., & Eisenstein, D. 2022, JCAP, 2022, 002 [CrossRef] [Google Scholar]
  62. Wu, X., Muñoz, J. B., & Eisenstein, D. J. 2023, JCAP, 2023, 040 [CrossRef] [Google Scholar]
  63. Zennaro, M., Angulo, R. E., Pellejero-Ibáñez, M., et al. 2023, MNRAS, 524, 2407 [Google Scholar]

Appendix A: Measuring the bias function for a toy model

Here, we consider a toy model to establish the range of validity of our measurements of the large-scale bias function, F. As is explained in Sect. 4.5, the estimator of F is given by

F ( δ 0 ) = exp ( δ 0 2 2 σ 2 ) exp ( δ δ 0 σ 2 ) g $$ \begin{aligned} F(\delta _0)&= \exp \left(\frac{-\delta _0^2}{2 \sigma ^2} \right) \left\langle {\exp \left(\frac{\delta \delta _0}{\sigma ^2} \right)}\right\rangle _{\mathrm{g} } \end{aligned} $$(A.1)

for the mono-variate density-only scenario. This expectation value is very sensitive to rare outliers. For example, at δ0 = 1 and σ = 0.5 a 5σ outlier galaxy with δ = 2.5 would contribute with a weight that is exp(5)∼150 times higher than the weight of a galaxy with δ = 0. What makes matters worse is that a simulation does not perfectly follow Gaussian statistics. The density distribution at scales close to the fundamental frequency of the box is quite non-Gaussian and it is relatively unlikely to exhibit a realistic number of outliers.

To investigate this effect we set up an example set of biased tracers. For this we select all particles that have at a small damping scale kd = 0.3 h Mpc−1 a density δ higher than 0, corresponding to exactly half of the tracers. Therefore, f = 2Θ(δ) is given at this scale by a heavy-side function Θ and the large-scale bias function is then given by

F ( δ 0 ) = 0 p ( δ δ 0 ) f ( δ ) d δ = 1 + erf ( δ 0 2 σ ) $$ \begin{aligned} F(\delta _0)&= \int _{0}^{\infty } p(\delta - \delta _0) f(\delta ) \mathrm{d} \delta \nonumber \\&= 1 + \mathrm{erf} \left( \frac{\delta _0}{\sqrt{2} \sigma } \right) \end{aligned} $$(A.2)

where the factor 2 comes from the normalisation constraint F(0) = 1 and where σ2 is the variance at kd = 0.3 h Mpc−1. Since the set of biased tracers depends exactly only on density, we should be able to recover them exactly by applying Eq. (A.1) at any damping scale (keeping the set of tracers fixed). We show such measurements for different scales in the top panel of Fig. A.1. The measurements at different scales agree excellently at small δ0. However, the smaller kd, the earlier in |δ0| the measurements tend to deviate from the correct function. It is noteworthy that the deviations are much larger than the jackknife error estimates. However, given the extreme sensitivity to outliers and the effectively low number of truely independent samples (due to the limited number of modes at large scales), it is very likely that the Jackknife sampling never sees the rare outliers that carry the expectation value, and therefore strongly underestimates the statistical error. To highlight this, we draw as grey contours the uncertainty ranges of 12 different realisations of this set-up evaluated at kd = 0.1 h Mpc−1. Most of these go down at a similar value of δ and also with a small error estimate. However, there is one realisation which gets extremely large, with a maximum of F ∼ 26 at δ0 = 2.7 which is far outside the range of the plot. The average of a large number of realisations will still give the correct value at δ0 = 2.7. However, the distribution over realisations is so imbalanced that it is very difficult to access the true statistical uncertainty from only a single realisation.

thumbnail Fig. A.1.

The large-scale bias function, F, measured for a toy model with known solution (top) and for unbiased tracers (bottom). Coloured shaded regions show the measurement (and jackknife error) at different damping scales and the grey regions show different realisations at kd = 0.1 h Mpc−1. At larger damping scales the measurement gets generally biased at smaller values of δ. The jackknife errors underestimate the uncertainty, since the measurement is sensitive to rare outliers, but we can estimate the validity range of the measurements as where unbiased tracers have F ∼ 1.

Therefore, to estimate the range of validity of our bias function measurements, we access the values of F that we would determine for an unbiased set of tracers. For this we average Eq. (A.1) over the full Lagrangian grid. For the pure density case the result is shown in the bottom panel of Fig. A.1. If the distribution on the grid was a perfect Gaussian, it would be F = 1, but due to the aforementioned non-Gaussianity and the sensitivity to outliers, the measurements deviate from the correct F in a similar way to the top panel. Therefore, we define as the range of validity, the range where |F − 1|< ϵ when F is averaged over the whole grid and where we choose ϵ = 0.02. We mark the corresponding points in Fig. A.1 and we find that F is measured quite reliably for this selection across all damping scales. For all figures in the main-text we only show measurements in this reliable regime.

While we use only simulations with fixed amplitudes of modes as in Angulo & Pontzen (2016), we have repeated the experiment in this section for both fixed and unfixed initial conditions and we found no significant difference in the range of validity.

Appendix B: Derivation of the re-normalised Gaussian tidal bias

We want to find the re-normalised form for a multi-variate Gaussian bias function, f, that uses δ, K and L as variables. We make the Ansatz for its large-scale limit

log F ( δ 0 , L 0 , K 0 ) = log F 0 ( δ 0 , L 0 ) + log F K ( K 0 ) $$ \begin{aligned} \log F(\delta _0, L_0, \mathsf K _0)&= \log F_0(\delta _0, L_0) + \log F_\mathsf{K }(\mathsf K _0)\end{aligned} $$(B.1)

log F K ( K 0 ) = b K 2 K 0 2 $$ \begin{aligned} \log F_\mathsf{K }(\mathsf K _0)&= b_{K^2} K_0^2 \end{aligned} $$(B.2)

where F0 is the function that we have already derived in Sect. 4.2 and where K2 = tr(KK). We note that this is the only possible Ansatz that is at most second order in the considered variables and obeys the isotropy constraint. Since both F and the background distribution p(δ, L, K) = p(δ, L)p(K) factorise, also the scale-dependent bias function factorises in the same manner

f ( δ , L , K ) = f 0 ( δ , L ) f K ( K ) $$ \begin{aligned} f(\delta , L, \mathsf K )&= f_0(\delta , L) f_\mathsf{K }(\mathsf K ) \end{aligned} $$(B.3)

and it is sufficient to enforce the re-normalisation constraint in (δ, L) and in K individually. Therefore, we only have to find a function fK such that

F K ( K 0 ) = f K ( K + K 0 ) . $$ \begin{aligned} F_\mathsf{K }(\mathsf K _0)&= \langle {f_\mathsf{K }(\mathsf K + \mathsf K _0)} \rangle . \end{aligned} $$(B.4)

We start with the assumption that fK corresponds to a multi-variate Gaussian with a flexible normalisation, A:

f K ( K ) = A · N ( K , μ = 0 , C f ) $$ \begin{aligned} f_\mathsf{K }(\mathsf K )&= A \cdot N(\mathsf K , {\mu }=0, \mathsf C _f) \end{aligned} $$(B.5)

where Cf is a rank four tensor and the mean μ has to be zero, because of the isotropy + tracelessness constraint and we are to determine A and Cf. The pdf of a zero-mean multi-variate normal distribution over a symmetric traceless tensor K (with five degrees of freedom) can be written as

N ( K , μ = 0 , C ) = 1 ( 2 π ) 5 | det C | exp ( 1 2 K T C + K ) $$ \begin{aligned} N(\mathsf K , {\mu }=0, \mathsf C )&= \frac{1}{\sqrt{(2\pi )^5 |\det \mathsf C |}} \exp \left( -\frac{1}{2} \mathsf K ^T \mathsf C ^+ \mathsf K \right) \end{aligned} $$(B.6)

where C+ is the pseudo inverse of C and det indicates a pseudo-determinant – that is the product of the non-zero eigenvalues of C. Formally this pdf is not a distribution over all nine components of K, but only describes the variation in the probability density in the five-dimensional subspace where K is symmetric and traceless. The background distribution also corresponds to a multi-variate Gaussian

p ( K ) = N ( K , 0 , C K ) $$ \begin{aligned} p(\mathsf K )&= N(\mathsf K , 0, \mathsf C _\mathsf{K }) \end{aligned} $$(B.7)

C K = 2 σ 2 15 J 2 = 2 $$ \begin{aligned} \mathsf C _\mathsf{K }&= \frac{2 \sigma ^2}{15} \mathsf J _{2=2} \end{aligned} $$(B.8)

where J2 = 2 is the isotropic rank four tensor with indices

J 2 = 2 , i j k l = 1 2 ( δ ik δ jl + δ il δ jk ) 1 3 δ ij δ kl $$ \begin{aligned} \mathsf J _{2=2, ijkl}&= \frac{1}{2} (\delta _{ik} \delta _{jl} + \delta _{il} \delta _{jk}) - \frac{1}{3} \delta _{ij} \delta _{kl} \end{aligned} $$(B.9)

where here δij is the Kronecker delta symbol. The symmetry requirements that arise to this form are explained in more detail in Sect. 4 of Stücker et al. (2024). Therefore, we have

F K = p ( K ) f K ( K + K 0 ) d 5 K = p ( K 0 K ) f K ( K ) d 5 K $$ \begin{aligned} F_\mathsf{K }&= \int p(\mathsf K ) f_\mathsf{K }(\mathsf K + \mathsf K _0) \mathrm{d} ^5 \mathsf K \nonumber \\&= \int p(\mathsf K _0 - \mathsf K ) f_\mathsf{K }(\mathsf K ) \mathrm{d} ^5 \mathsf K \end{aligned} $$(B.10)

where we have used a substitution plus the symmetry of p to show that this corresponds to a convolution between p and fK. The convolution of two multi-variate Gaussian leads to a new multi-variate Gaussian with their mean and their covariances added together:

F K ( K 0 ) = A N ( K 0 , 0 , C F = C f + C K ) $$ \begin{aligned} F_\mathsf{K }(\mathsf K _0)&= A N(\mathsf K _0, 0, \mathsf C _F = \mathsf C _{f} + \mathsf C _\mathsf{K })\end{aligned} $$(B.11)

= A ( 2 π ) 5 | det ( C F ) | exp ( 1 2 K 0 T C F + K 0 ) . $$ \begin{aligned}&= \frac{A}{\sqrt{(2\pi )^5 \left| \det (\mathsf C _{F}) \right|}} \exp \left(- \frac{1}{2} \mathsf K _0^T \mathsf C _F^{+} \mathsf K _0 \right) \,\,. \end{aligned} $$(B.12)

Identifying terms with Eq. (B.2) leads to

C F + = 2 b K 2 J 2 = 2 $$ \begin{aligned} \mathsf C _\mathsf{F }^{+}&= - 2 b_{K^2} \mathsf J _{2=2} \end{aligned} $$(B.13)

A = | ( 2 π ) 5 det ( C F ) | $$ \begin{aligned} A&= \sqrt{\left| (2 \pi )^5 \det (\mathsf C _{F}) \right|}\end{aligned} $$(B.14)

C b = C F C K = 1 2 b K 2 J 2 = 2 2 σ 2 15 J 2 = 2 = 1 + 4 σ 2 15 b K 2 2 b K 2 J 2 = 2 $$ \begin{aligned} \mathsf C _\mathsf{b }&= \mathsf C _\mathsf{F } - \mathsf C _\mathsf{K } \nonumber \\&= - \frac{1}{2 b_{K^2}} \mathsf J _{2=2} - \frac{2 \sigma ^2}{15} \mathsf J _{2=2} \nonumber \\&= - \frac{1 + \frac{4\sigma ^2}{15} b_{K^2}}{2 b_{K^2}} \mathsf J _{2=2} \end{aligned} $$(B.15)

where we have used that J2 = 2 is its own pseudo-inverse. We find

f K = | det C F det C b | exp ( 1 2 K T C b + K ) = | 1 + 4 σ 2 15 b K 2 | 5 / 2 exp ( b K 2 K 2 1 + 4 15 σ 2 b K 2 ) $$ \begin{aligned} f_\mathsf{K }&= \sqrt{\left| \frac{\det \mathsf C _F}{ \det \mathsf C _\mathsf{b } } \right|} \exp \left(- \frac{1}{2} \mathsf K ^T \mathsf C _b^{+} \mathsf K \right) \nonumber \\&= \left|1 + \frac{4 \sigma ^2}{15} b_{K^2} \right|^{-5/2} \exp \left(\frac{b_{K^2} K^2}{1 + \frac{4}{15} \sigma ^2 b_{K^2}} \right) \end{aligned} $$(B.16)

where we have used that J2 = 2 has five non-zero eigenvalues that are equal to 1 so that the pseudo determinant is

det ( a J 2 = 2 ) = a 5 . $$ \begin{aligned} \det (a \mathsf J _{2=2}) = a^5. \end{aligned} $$(B.17)

All Figures

thumbnail Fig. 1.

Illustration of the steps necessary for measuring the bias function, f. Left: the smoothed linear density field and the initial (Lagrangian) locations of a set of tracers – here the most bound particles of haloes with M200b ∼ 2 × 1014 h−1 M. Center: Distribution of the linear densities at the tracer locations (orange histogram) which is biased relative to the distribution at random locations (blue histogram). Right: The bias function, which is inferred by dividing the orange histogram by the background distribution. The solid lines show different bias models out of which the Gaussian bias model describes the data clearly the best.

In the text
thumbnail Fig. 2.

Galaxy environment distribution (left) and Lagrangian bias function (right) of mass selected haloes (M200b ∼ 4 × 1013 h−1 M) at different damping scales (kd = 0.07 h Mpc−1, 0.15 h M−1 and 0.4 h Mpc−1 in top, center and bottom respectively). At all scales, the Gaussian bias provides a description that is as good as quadratic bias or significantly better – especially so at small smoothing scales.

In the text
thumbnail Fig. 3.

Galaxy environment distribution (left) and the Lagrangian bias function (right) of stellar-mass-selected galaxies (top) and SFR-selected galaxies (bottom).

In the text
thumbnail Fig. 4.

Metrics to compare the performance of the Gaussian bias model to expansion biases. The top panel shows the Wasserstein distance between the actual and the modelled galaxy environment distribution (smaller is better), the central panel the normalised log likelihood (larger is better) and the bottom panel the amplitude of the first unmodelled term (closer to zero is better). Brighter solid lines show the quadratic bias model and darker solid lines the Gaussian bias. All metrics and selections show the Gaussian bias performing either similar to a quadratic model or significantly better.

In the text
thumbnail Fig. 5.

Distribution of Lagrangian over-densities, δ, and the Laplacian, L (in units of h2 Mpc−2) p(δ, L|g) (left), and the multi-variate bias function (right). The top row shows the distribution for matter particles for kd = 0.2 h Mpc−1 and the second row of haloes selected in the mass-range M200b ∼ 2 × 1014 h M. The remaining rows show approximations to the halo distribution through a 3 parameter second-order expansion, a 5 parameter second-order expansion and a Gaussian bias model. The contours show the 1-,2- and 3-sigma regions of the matter distribution. The measurement for haloes has only reasonable statistics roughly inside of the 3-sigma region of the matter distribution (outermost grey contour). The multi-variate Gaussian bias seems to be a good approximation to the actual distribution and adequately respects the positivity constraint.

In the text
thumbnail Fig. 6.

Metrics for evaluating the performance of the different bias models in the multi-variate case. The top panel shows the Wasserstein metric (smaller is better), the central panel the normalised log likelihood (larger is better) and the bottom panel the fraction of volume for which negative galaxy densities are predicted. Generally, the multi-variate Gaussian bias model appears to describe the data either significantly better or at least equally well to a quadratic bias model and it never predicts negative galaxy densities, which can get quite significant for the expansion bias.

In the text
thumbnail Fig. 7.

Bias function measured at different smoothing scales in comparison to a multi-variate Gaussian bias model fitted at kd = 0.2 h Mpc−1 and rescaled under the assumption of the PBS. The PBS assumption recovers the bias function at larger scales kd ≲ 0.2 h Mpc−1 very well, but seems to be less accurate at smaller scales.

In the text
thumbnail Fig. 8.

Re-normalised bias function of haloes with M200b ∼ 1014 h M reconstructed at zeroth spatial order (left) and second spatial order (right) for different damping scales. At second spatial order the reconstructed function is independent of the measurement scale for kd ≲ 0.2 h Mpc−1 and it is well described by a Gaussian bias model.

In the text
thumbnail Fig. 9.

Measurements of the re-normalised bias function, F, for haloes of three different mass selections. A reliable scale-independent reconstruction seems generally possible from damping scales up to kd ≲ 0.2 h Mpc−1. The function looks generally better approximated by a Gaussian than a quadratic bias model, especially in regions where the function gets close to zero F ≲ 0.5.

In the text
thumbnail Fig. 10.

Measurements of the re-normalised bias function, F, for stellar-mass-selected galaxies (top) and SFR-selected galaxies (bottom).

In the text
thumbnail Fig. 11.

Bias function, f (top panel), vs the re-normalised large-scale bias function, F (bottom), for a set of toy models. Selection functions that are very non-Gaussian on small scales naturally lead to bias relations that are close to Gaussian on large scales, since their relation is given by a convolution with a Gaussian.

In the text
thumbnail Fig. A.1.

The large-scale bias function, F, measured for a toy model with known solution (top) and for unbiased tracers (bottom). Coloured shaded regions show the measurement (and jackknife error) at different damping scales and the grey regions show different realisations at kd = 0.1 h Mpc−1. At larger damping scales the measurement gets generally biased at smaller values of δ. The jackknife errors underestimate the uncertainty, since the measurement is sensitive to rare outliers, but we can estimate the validity range of the measurements as where unbiased tracers have F ∼ 1.

In the text

Current usage metrics show cumulative count of Article Views (full-text article views including HTML views, PDF and ePub downloads, according to the available data) and Abstracts Views on Vision4Press platform.

Data correspond to usage on the plateform after 2015. The current usage metrics is available 48-96 hours after online publication and is updated daily on week days.

Initial download of the metrics may take a while.