A&A, Volume 694, February 2025
Article Number: A253
Number of pages: 29
Section: Cosmology (including clusters of galaxies)
DOI: https://doi.org/10.1051/0004-6361/202450977
Published online: 19 February 2025
Towards an optimal marked correlation function analysis for the detection of modified gravity
1 Aix-Marseille Université, CNRS, CNES, LAM, Marseille, France
2 Aix-Marseille Université, Université de Toulon, CNRS, CPT, Marseille, France
⋆ Corresponding author; martin.karcher@lam.fr
Received: 3 June 2024
Accepted: 13 December 2024
Modified gravity (MG) theories have emerged as a promising alternative to explain the late-time acceleration of the Universe. However, the detection of MG in observations of the large-scale structure remains challenging due to the screening mechanisms that obscure any deviations from general relativity (GR) in high-density regions. The marked two-point correlation function, which is particularly sensitive to the surrounding environment, offers a promising approach to enhancing the discriminating power in clustering analyses and to potentially detecting MG signals. This work investigates novel marks based on large-scale environment estimates, which also exploit the anti-correlation between objects in low- and high-density regions. This is the first time that the propagation of discreteness effects in marked correlation functions is investigated in depth. In contrast to standard correlation functions, the density-dependent marked correlation function estimated from catalogues is affected by shot noise in a non-trivial way. We assess the performance of various marks to distinguish GR from MG. This is achieved through the use of the ELEPHANT suite of simulations, which comprises five realisations of GR and two different MG theories: f(R) and nDGP. In addition, discreteness effects are thoroughly studied using the high-density Covmos catalogues. We have established a robust method to correct for shot-noise effects that can be used in practical analyses. This method allows the recovery of the true signal, with an accuracy below 5%, over scales from 5 h−1 Mpc up to 150 h−1 Mpc. We find that such a correction is absolutely crucial to measure the amplitude of the marked correlation function in an unbiased manner. Furthermore, we demonstrate that marks that anti-correlate objects in low- and high-density regions are among the most effective in distinguishing between MG and GR; they also uniquely provide visible deviations on large scales, up to about 80 h−1 Mpc. We report differences in the marked correlation function between f(R) with |fR0| = 10−6 and GR simulations of the order of 3–5σ in real space. The redshift-space monopole of the marked correlation function in this MG scenario exhibits similar features and performance as the real-space marked correlation function. The combination of the proposed tanh-mark with shot-noise correction paves the way towards an optimal approach for the detection of MG in current and future spectroscopic galaxy surveys.
Key words: large-scale structure of Universe
© The Authors 2025
Open Access article, published by EDP Sciences, under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
This article is published in open access under the Subscribe to Open model. Subscribe to A&A to support open access publication.
1. Introduction
The seminal works of Riess et al. (1998) and Perlmutter et al. (1999) revived the cosmological constant Λ as a form of dark energy to explain the late-time accelerated expansion of the Universe. Together with cold dark matter (CDM), this established the ΛCDM model as the current concordance model of cosmology. Upon closer examination, however, the ΛCDM model is found to exhibit certain inherent problems. On the theoretical side, the fine-tuning problem of Λ, as extensively studied in Martin (2012), represents a significant challenge. On the observational side, most recent cosmological results show a growing tension between early- and late-time measurements of the Hubble constant H0, respectively, extracted from the cosmic microwave background (CMB) anisotropies (Planck Collaboration VI 2020) and local distance ladders (Riess et al. 2022). Another source of contention, although not as significant, comes from an apparent mismatch in the measured variance of matter fluctuations, σ8, between early- (Planck Collaboration VI 2020) and late-time large-scale structure measurements (Tröster et al. 2020).
To address the aforementioned issues, numerous attempts have been made on the theoretical level. Of particular interest, to circumvent the introduction of a cosmological constant or dark energy component in the first place, are modified gravity (MG) theories (see Clifton et al. 2012). One of the most popular modifications is the theory of inflation by Guth (1981), accommodated with scalar-tensor models, to resolve the classical flatness and horizon problems in ΛCDM. MG theories are commonly defined and compared through their respective action. Possibly the most straightforward extension to the Einstein-Hilbert action in general relativity (GR) consists of replacing the Ricci scalar by a free function of it, dubbed f(R) theories (see De Felice & Tsujikawa 2010, for a review). The latter have been subjected to comprehensive analysis, resulting in tight constraints on viable f(R) functions to ensure the realisation of accelerated expansion without the necessity of a cosmological constant, while simultaneously satisfying solar system GR tests (Cognola et al. 2008). The f(R) model proposed by Hu & Sawicki (2007) is of particular importance, as it simultaneously realises accelerated expansion and evades Solar System tests through the use of a so-called screening mechanism.
For MG theories to be a viable replacement or extension of GR they have to fulfil very stringent tests coming from solar system observations (see Bertotti et al. 2003; Williams et al. 2004, 2012). On larger scales, the situation is more complex, but there is an increasing effort to tighten constraints on MG theories by using CMB data (Planck Collaboration XIV 2016) in combination with large-scale structure and supernova observables (Lombriser et al. 2009; Battye et al. 2018). In order for a modification of standard gravity to hide its effects on small scales and recover GR, screening mechanisms are invoked. In general terms, a screening mechanism describes a suppression of any fifth force to a negligible level, such that gravity follows GR in certain environments. Screening can happen in different ways, such as the chameleon screening in scalar-tensor theories of gravity (see Khoury & Weltman 2004a,b). It also emerges in some f(R) theories due to the equivalence between scalar-tensor and f(R) theories (Sotiriou & Faraoni 2010). The DGP gravity model, originally developed by Dvali et al. (2000), exhibits the screening mechanism first introduced by Vainshtein (1972). A third popular screening mechanism, by Damour & Polyakov (1994), is present in the symmetron model (Hinterbichler & Khoury 2010). For a detailed description of the field of screening mechanisms we refer the reader to Brax et al. (2022). Intuitively, screening mechanisms in the cosmological context, particularly the chameleon one, can be understood as a density dependency, where modifications to GR should appear only in regions of low density compared to the mean density of the Universe. In high-density regions, inside galaxies or stellar systems for instance, any modification should be negligible. This imprints a fundamental environmental dependency on the clustering of matter predicted in those theories.
Since the modifications to GR are expected to be small, the observational detection of MG on cosmological scales poses a major challenge. Guzzo et al. (2008) advocated the use of the growth rate of structure f measured from redshift-space distortions (RSD) in the galaxy clustering pattern as an indicator of the validity of GR in the large-scale structure. Since then, f has become a quantity of major interest and has been measured in large galaxy redshift surveys (e.g. Blake et al. 2011; Beutler et al. 2012; de la Torre et al. 2013; Bautista et al. 2021). It is now a standard probe that will be measured with exquisite precision by ongoing surveys, in particular the Dark Energy Spectroscopic Instrument (DESI; DESI Collaboration 2016) and the Euclid mission (Euclid Collaboration: Mellier et al. 2025). It is worth mentioning the Eg statistic developed by Zhang et al. (2007), a combination of galaxy clustering and weak lensing measurements that probes the properties of the underlying gravity theory and has been measured several times (e.g. Reyes et al. 2010; de la Torre et al. 2017; Jullo et al. 2019; Blake et al. 2020). Other quantities that can in principle be measured from observations are the gravitational slip parameter η and the growth index γ (see Ishak 2019, for a review). At the present time, none of the aforementioned observables has enabled the detection of a deviation from standard gravity.
To improve on existing approaches and to exploit the additional environmental dependency of MG in clustering analyses, White (2016) proposed the marked correlation function as a tool to increase the difference in the clustering signal between MG and GR. In that case, the marked correlation function is a weighted correlation function normalised to the unweighted correlation function, where object weights, or marks, are a function of the local density. The latter is estimated from the density field inferred from dark matter or galaxies. With this methodology, Hernández-Aguayo et al. (2018), Armijo et al. (2018), and Alam et al. (2021) investigated marked correlation functions in N-body simulations of MG. In addition to examining different mark functions based on density, they also considered marks based on the local gravitational potential or the host halo mass of the galaxy. They observed significant differences between MG and GR for marks based on density on small scales, below about 20 h−1 Mpc. White & Padmanabhan (2009) also showed the potential of marked correlation functions to break the degeneracy between the halo occupation distribution (HOD) and cosmological parameters. Similar approaches using weighted statistics or transformations of the density field have further been proposed. Llinares & McCullagh (2017) used logarithmic transformations of the density field and computed power spectra of the transformed field in N-body simulations to improve on the detection of MG. Valogiannis & Bean (2018) used Fisher forecasts to quantify the boost in constraining power on cosmological parameters from power spectra, comparing the field transformation of Llinares & McCullagh (2017), the clipping strategy that masks out high-density regions (Simpson et al. 2011, 2013), and the mark proposed in White (2016). Lombriser et al. (2015) applied clipping to the power spectrum in order to better detect f(R) theories with chameleon screening. Recently, the use of marked power spectra has been extended to constrain massive neutrinos (Massara et al. 2021) and to tighten constraints on cosmological parameters (Yang et al. 2020; Xiao et al. 2022).
While most of the effort on marked statistics for MG has been carried out on simulations, there have also been several applications to observational data. Satpathy et al. (2019), for the first time, measured marked correlation functions from observations in the context of MG. They used the original mark introduced by White (2016) and investigated the monopole and quadrupole of the marked correlation function measured over the LOWZ sample of the Sloan Digital Sky Survey (SDSS) data release 12 (DR12) dataset (Alam et al. 2015). They could not detect significant differences between MG and GR, which they attributed to modelling uncertainties of the two-point correlation function (2PCF) on scales of 6 h−1 Mpc < s < 69 h−1 Mpc. Armijo et al. (2024b) applied the strategy introduced in Armijo et al. (2024a) to the LOWZ and CMASS catalogues of SDSS, thereby incorporating uncertainties of the HOD in the projected weighted clustering. They compare predictions from GR and f(R) but find no significant differences: both fit the LOWZ data within the uncertainties over the investigated scales between 0.5 h−1 Mpc and 40 h−1 Mpc. For the CMASS catalogue, the predictions for both the GR and f(R) models fail to properly follow the data in the first place.
A number of the issues encountered in the literature regarding the use of marked correlation functions to distinguish MG from GR can be identified as arising from two main sources. The first is the choice of the mark function, which, in the majority of cases, results in significant differences on small scales only. On those scales, a thorough theoretical modelling is difficult, as a proper inclusion of non-linear effects of redshift-space distortions is needed as well. The second issue is the propagation of discreteness effects in the mark estimation, namely, computing the local density from a finite point set, into the measurement of the marked correlation function. To the best of our knowledge, this has not been done so far and can lead to biased measurements if not accounted for. The present work therefore aims at identifying an optimal mark function that is able to significantly discriminate GR from MG on larger scales where theoretical modelling is more tractable. For this, we develop new ways to include environmental information into weighted statistics, as well as investigate new algebraic functions of the density contrast to be used as a mark. Furthermore, we investigate the discreteness effects and devise a new methodology to correct marked correlation function measurements for the bias induced by estimating density-dependent marks on discrete point sets. We demonstrate that by applying this methodology we are able to robustly measure the amplitude of the marked correlation function and mitigate possible artefacts in the subsequent analysis of MG signatures.
This article is structured as follows. Section 2 describes the f(R) and nDGP gravity models that are later investigated and tested. Section 3 introduces the basics of weighted two-point statistics and marked correlation function. Section 4 presents the MG simulations used in this work and measurements of unweighted statistics, which serve as a reference for comparison with the marked correlation function. Section 5 presents new marks to be used in the analysis of MG. This is followed, in Sect. 6, by the study of the effects of shot noise in weighted two-point statistics. Section 7 shows the main results of this article, which are obtained by applying the previously-defined methodology to MG simulations. Section 8 comprises a discussion on the optimal methodology for marked correlation function and the conclusions are provided in Sect. 9.
2. Modified gravity
We provide in this section a brief review of the theory behind the two classes of MG models that are used later in this work. In particular, we report the respective actions alongside the equation of motion for the additional scalar degree of freedom, which elucidates the different screening mechanisms incorporated in these gravity theories.
2.1. f(R) gravity
A general extension to the Einstein-Hilbert action in GR is accomplished by adding a general function of the Ricci scalar, f(R), so that the action takes the form

S = ∫ d⁴x √(−g) [ (R + f(R)) / (16πG) + ℒm ]
when including a matter Lagrangian ℒm. This leads to the field equations

Gμν + fR Rμν − ( f(R)/2 − □fR ) gμν − ∇μ∇ν fR = 8πG Tμν ,
where □ denotes the d’Alembertian operator and Greek indices run from 1 to 4. From these field equations an equation of motion for the scalaron field fR = ∂f/∂R can be deduced by taking the trace. The Ricci scalar is given by

R = 12H² + 6HH′ ,
where a prime denotes a differentiation with respect to the natural logarithm of the scale factor. In a ΛCDM universe, today’s Ricci scalar is

R̄₀ = 3H₀² (Ωm + 4ΩΛ) .
Although the f(R) function is completely general, there are several constraints concerning its derivatives with respect to R to obtain a theory that is free from ghost instabilities (see Tsujikawa 2010, for a derivation of those stability conditions). Furthermore, specific functions can be chosen depending on the context of the theory. Here we focus on a cosmological model with a late-time accelerated expansion, for which the Hu-Sawicki theory (Hu & Sawicki 2007) is the most promising. The f(R) function in this model takes the form

f(R) = −m² c₁ (R/m²)ⁿ / [ c₂ (R/m²)ⁿ + 1 ] ,
with m² = 8πGρ̄₀/3, and c₁, c₂, n being constants. In the simulations presented in the next sections, a value of n = 1 was used. To produce a background expansion as dictated by ΛCDM, the ratio between c₁ and c₂ has to be chosen such that

c₁/c₂ = 6 ΩΛ/Ωm .
From this follows a Lagrangian of the form ℒ = (R − 2Λ)/16πG for the gravitational sector in the R ≫ m² limit where f(R) ≈ −m²c₁/c₂, which corresponds to the well-known Einstein-Hilbert action with cosmological constant. Furthermore, by expanding the f(R) function in the aforementioned limit but keeping the next-to-leading order term we arrive at

f(R) ≈ −m² c₁/c₂ + m² (c₁/c₂²) (m²/R)ⁿ = −6 (ΩΛ/Ωm) m² − (fR0/n) R̄₀ⁿ⁺¹/Rⁿ ,
where, in the second equality, we used the expression of the scalaron field,

fR ≈ −n (c₁/c₂²) (m²/R)ⁿ⁺¹ ,
evaluated for the background Ricci scalar value today (fR0). We replaced c1/c2 with the previous expression to obtain a ΛCDM background. In this approximation, and by fixing n = 1, the f(R) function depends solely on the cosmological parameters and fR0, the latter encoding the strength of the modification to GR.
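To make the parameter dependence explicit, the following minimal sketch (in Python; the function name, default values, and the WMAP9-like Ωm are our own illustrative choices) evaluates the large-curvature expansion above:

```python
import numpy as np

def hu_sawicki_f(R_over_m2, fR0=-1e-6, Omega_m=0.281, n=1):
    """f(R) in units of m^2 in the large-curvature limit R >> m^2.

    A LambdaCDM background fixes c1/c2 = 6 Omega_L / Omega_m, and fR0
    (negative by convention) sets the strength of the modification to GR.
    """
    Omega_L = 1.0 - Omega_m
    c1_over_c2 = 6.0 * Omega_L / Omega_m
    # Background Ricci scalar today in units of m^2 = H0^2 Omega_m:
    # R0 = 3 H0^2 (Omega_m + 4 Omega_L)
    R0_over_m2 = 3.0 * (Omega_m + 4.0 * Omega_L) / Omega_m
    # Next-to-leading-order expansion: f(R) = -m^2 c1/c2 - (fR0/n) R0^(n+1)/R^n
    return -c1_over_c2 - (fR0 / n) * R0_over_m2 ** (n + 1) / R_over_m2 ** n
```

In the limit fR0 → 0 the function returns −6ΩΛ/Ωm, i.e. f(R) → −2Λ in units of m², recovering the ΛCDM Lagrangian.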
Having an f(R) modification in the Lagrangian will introduce additional force terms into the Poisson equation in the quasi-static and weak-field limit, as can be derived from the perturbed field equations (Bose et al. 2015)

∇²Φ = (16πG/3) a² δρm + (a²/6) δR(fR)
and

∇² fR = −(a²/3) [ δR(fR) + 8πG δρm ] ,
where ρ̄m and R̄ are the matter density and Ricci scalar at the background level, and δρm = ρm − ρ̄m and δR = R − R̄ denote the corresponding perturbations. These additional terms should be suppressed in the vicinity of massive objects, otherwise solar system tests would have detected the fifth force. When f(R) gravity is rewritten as a scalar-tensor gravity, the potential of the scalar field receives a contribution from the matter density (Khoury & Weltman 2004a) as

Veff(φ) = V(φ) + ρ exp(βφ/Mpl) ,
with β being a dimensionless constant and Mpl = (8πG)^(−1/2) the reduced Planck mass. This leads in turn to a modified equation of motion for the scalar field φ that includes a density-dependent potential. In this context a thin-shell condition can be derived, stating that the difference between the scalar field far away from the source, φ∞, and inside the object, φc, should be small compared to the gravitational potential on the surface of the object (Khoury & Weltman 2004a). Exterior solutions for φ around compact objects satisfying the thin-shell condition will reach the solution φ∞ at larger distances, thereby suppressing the effect of the scalar field close to the object.
2.2. nDGP gravity
The modification to standard gravity devised by Dvali, Gabadadze and Porrati (Dvali et al. 2000), hereafter DGP gravity, is of a radically different kind compared to f(R) gravity. The setup is a 4D brane embedded in a 5D bulk and the modification to gravity comes from the fifth-dimensional contribution. The action is given by (Clifton et al. 2012)

S = (M₅³/2) ∫ d⁵x √(−g₅) R₅ + ∫ d⁴x √(−g₄) [ (M₄²/2) R₄ + M₅³ K − σ + ℒm ] ,
where g₅ and g₄ are the 5D and 4D metric, respectively. The matter Lagrangian ℒm lives on the 4D brane, as does the brane tension σ, which can act as a cosmological constant. Furthermore, there is both a 5D Ricci scalar R₅ and its 4D counterpart R₄, and the brane has an extrinsic curvature term K. Generally, both the brane and bulk have their individual mass scales M₄ and M₅, and they give rise to a specific cross-over scale rc defined as

rc = M₄² / (2M₅³) ,
which regulates the contribution of 4D with respect to 5D gravity.
The modified Poisson equation for the gravitational potential and the equation for the additional scalar degree of freedom φ (also called the brane-bending mode, as it describes the displacement of the brane) lead to the fifth force. They are given in the quasi-static approximation by (see Koyama & Silva 2007)

∇²Φ = 4πG a² δρ + (1/2) ∇²φ ,
∇²φ + ( rc²/(3β a²) ) [ (∇²φ)² − (∇i∇jφ)(∇i∇jφ) ] = ( 8πG a²/(3β) ) δρ ,
where β is

β(a) = 1 ± 2H rc ( 1 + Ḣ/(3H²) ) .
The dot refers to a derivative with respect to cosmic time t. One important feature of the DGP model is the existence of a normal branch and of a self-accelerating branch, indicated respectively by the + and − signs in the equation for β. While the self-accelerating branch appears appealing for cosmology at first sight, as it can generate accelerated expansion without a cosmological constant (in the limit of vanishing brane tension), it contains unphysical ghost instabilities (Clifton et al. 2012). Hence the model used in the simulations analysed in this work implements the normal branch, which does need a non-vanishing brane tension to produce accelerated expansion. It is interesting to study normal-branch DGP models as they exhibit the Vainshtein screening mechanism (see Schmidt 2009; Barreira et al. 2015). To illustrate that mechanism, the equation for the scalar field has to be studied around a mass source. Far away from the source, only the linear term ∇²φ will dominate and this will contribute substantially to the usual gravitational force as it will also scale ∝1/r. However, non-linear terms start to dominate once we are closer to the source than the Vainshtein radius rV, defined by

rV = ( 16 rs rc² / (9β²) )^(1/3) ,
with rs being the Schwarzschild radius of the source. At some point, non-linear terms will dominate and the resulting force will scale as ∝(r/rV)^(3/2) relative to the Newtonian force, and hence will be suppressed with respect to the gravitational force. A derivation of the solution for φ can be found in Koyama & Silva (2007) for the general case, which includes linear and non-linear terms, and where the same scalings are recovered in the respective regimes.
At fixed Schwarzschild radius, the cross-over scale determines the Vainshtein radius, so by running simulations with different rc one will obtain different strengths of the Vainshtein screening. Therefore, varying rc allows for the tuning of the amount of deviation from GR that is required.
3. Weighted statistics and estimators
In Table 1 we summarise the notation used throughout this work to ease distinguishing between the different discrete and continuous quantities.
Table 1. Notations used in this article.
3.1. Unweighted statistics
The density contrast δ(x), which encodes the relative change of the density field ρ(x), is defined as

δ(x) = ρ(x)/ρ̄ − 1 ,
where ρ̄ is the mean density. To study the matter clustering in the cosmological context, one of the most common summary statistics to characterise the density field is the two-point correlation function ξ(x, y) or its Fourier counterpart, the power spectrum. The 2PCF is the cumulant ⟨δ(x)δ(y)⟩c of the density contrast at positions x and y. For two-point correlations, the cumulant and standard ensemble average are the same quantity. They remain the same up to three-point correlations but start to differ from four-point correlations onwards. Due to the assumed statistical invariance under translations, the correlation function depends only on the separation vector r = x − y. By inserting the definition of the density contrast we have that

ξ(r) = ⟨ρ(x)ρ(x + r)⟩/ρ̄² − 1 .
From the last equation one can see that the 2PCF is zero if the field is totally uncorrelated at two different positions.
To estimate the 2PCF, we can deploy the commonly used pair-counting estimator proposed by Landy & Szalay (1993) to minimise the variance, which takes the form

ξ̂LS(r) = [ DD(r) − 2DR(r) + RR(r) ] / RR(r) .
The terms DD(r) and RR(r) are the normalised pair counts measured in the data sample and in a random sample following the geometry of the data sample, respectively. In addition, a cross term with pairs consisting of one point in the data sample and the other in the random sample is given by DR(r). In this work, we only compute two-point correlation functions in periodic boxes without a selection function. In this case, the term DR converges to the term RR in the limit of many realisations of random catalogues, and we can use the natural estimator given by Peebles & Hauser (1974)

ξ̂(r) = DD(r)/RR(r) − 1 .
The distribution of pairs in real space is isotropic and, together with periodic boundary conditions, leads the correlation function to depend only on the modulus r of the pair separation vector.
In redshift space, it is useful to compute the anisotropic correlation function ξ(s, μ), binned in the norm of the pair separation vector s and in the cosine μ of the angle between the line of sight (LOS) and the pair separation vector. The 2PCF estimator for a periodic box is hence

ξ̂(s, μ) = DD(s, μ)/RR(s, μ) − 1 ,
and the normalised RR counts are given by

RR(s, μ) = (4π/3) (s³max − s³min) Δμ / V ,
which can be derived by calculating the volume covered by the respective bins in s and μ relative to the total volume of the box, with smin and smax the bin edges in s, Δμ the μ-bin width, and V the box volume. For real-space measurements, the RR counts can be evaluated analytically in a similar fashion as in Eq. (22).
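As an illustration, these analytic counts can be tabulated directly from the bin edges. The sketch below (names are ours) assumes μ has already been folded onto [0, 1], as done in this work:

```python
import numpy as np

def rr_analytic(s_edges, mu_edges, box_size):
    """Normalised analytic RR(s, mu) counts for a periodic box.

    Each bin receives the fraction of the box volume covered by its
    spherical-shell segment, with mu folded onto [0, 1].
    """
    s_lo, s_hi = s_edges[:-1], s_edges[1:]
    shell = 4.0 * np.pi / 3.0 * (s_hi**3 - s_lo**3)  # folded shell volume per unit mu
    return np.outer(shell, np.diff(mu_edges)) / box_size**3
```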
The 2PCF in redshift space, ξ(s, μ), can be decomposed into multipole moments, a basis encoding the different angular dependencies of the full 2PCF. Usually the decomposition is done into the first three non-vanishing multipole moments, being the monopole, quadrupole, and hexadecapole. In the following, we focus on the first two since the hexadecapole can be quite noisy for small point sets. The multipole moment correlation functions are obtained by decomposing ξ(s, μ) in the basis of Legendre polynomials Lℓ(μ) as

ξℓ(s) = (2ℓ + 1)/2 ∫₋₁¹ ξ(s, μ) Lℓ(μ) dμ ,
yielding for the monopole and quadrupole

ξ₀(s) = ∫₀¹ ξ(s, μ) dμ ,
ξ₂(s) = 5 ∫₀¹ ξ(s, μ) (3μ² − 1)/2 dμ .
In practice, these integrals are discretised and we measure ξ(s, μ) in 100 bins from μ = 0 to μ = 1 using the symmetry under interchange of galaxies for a given pair, which is fulfilled in our periodic box simulations. The discretised correlation function is then integrated by approximating the integral as a Riemann sum.
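A minimal sketch of this discretised Legendre projection, assuming ξ(s, μ) is stored as an array of shape (ns, nμ) with μ binned over [0, 1] (function and variable names are ours):

```python
import numpy as np
from numpy.polynomial.legendre import Legendre

def multipole(xi_smu, mu_edges, ell):
    """xi_ell(s) from xi(s, mu) binned over mu in [0, 1], via a Riemann sum.

    The prefactor (2*ell + 1), instead of (2*ell + 1)/2, absorbs the folding
    of mu in [-1, 1] onto [0, 1]; this is valid for even multipoles only.
    """
    mu_mid = 0.5 * (mu_edges[:-1] + mu_edges[1:])
    weights = Legendre.basis(ell)(mu_mid) * np.diff(mu_edges)
    return (2 * ell + 1) * np.sum(xi_smu * weights, axis=1)
```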
3.2. Weighted statistics
Let us now define the weighted density contrast

δM(x) = ρM(x)/ρ̄M − 1 ,
where the weighted density field is given by ρM(x) = m(x)ρ(x), its mean is ρ̄M = m̄ρ̄ with m̄ the mean mark, and m(x) is the mark field. The weighted correlation function is the ensemble average of the weighted density contrast correlation,

W(r) = ⟨δM(x)δM(x + r)⟩c ,
which, when substituted with the definition of the density contrast, takes the form

1 + W(r) = ⟨m(x)ρ(x) m(x + r)ρ(x + r)⟩ / (m̄ρ̄)² .
The mark field m(x) can be continuous in space, or defined on the point set (galaxy or halo catalogue) in a discrete way. Each object in the catalogue can be assigned a mark from the mark field, e.g. the i-th object has a mark mi. The normalised weighted pair counts are obtained as

WW(r) = ( 1/(Np m̄²) ) Σ(i, j) mi mj ,
where the sum is computed over all pairs with a separation inside the bin centred on r, and Np is the total number of pairs entering the normalisation. The marked correlation function is then defined as (Beisbart & Kerscher 2000; Sheth 2005)

ℳ(r) = ( 1 + W(r) ) / ( 1 + ξ(r) ) .
It converges to ℳ(r) = 1 on large scales as W(r) and ξ(r) approach zero.
To estimate the weighted correlation function from a catalogue, the natural estimator can be generalised to include weights, so that we can simply replace the DD(r) with WW(r) counts, arriving at

Ŵ(r) = WW(r)/RR(r) − 1 .
Inserting this into the definition of the marked correlation function ℳ(r) we have

ℳ̂(r) = WW(r)/DD(r) .
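For illustration, a brute-force version of this estimator in a periodic box could look as follows (a sketch for small catalogues only; in practice a dedicated pair counter such as Corrfunc is used):

```python
import numpy as np

def marked_cf(pos, marks, r_edges, box_size):
    """Natural estimator M(r) = WW(r)/DD(r), computed over all O(N^2) pairs."""
    d = pos[:, None, :] - pos[None, :, :]
    d -= box_size * np.round(d / box_size)       # minimum-image convention
    r = np.sqrt((d ** 2).sum(axis=-1))
    i, j = np.triu_indices(len(pos), k=1)        # each pair counted once
    r_pair, w_pair = r[i, j], marks[i] * marks[j]
    dd, _ = np.histogram(r_pair, bins=r_edges)
    ww, _ = np.histogram(r_pair, bins=r_edges, weights=w_pair)
    # WW normalised by the mean mark squared; empty bins yield NaN
    return ww / (marks.mean() ** 2 * dd)
```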
If the LS estimator is employed instead, one has to compute WR(r) and DR(r) terms in addition.
Computing the multipoles of the weighted correlation function is analogous to the unweighted case. However, the multipoles of the marked correlation function can be defined in two ways. The most intuitive definition is obtained by decomposing the marked correlation function ℳ(s, μ) in the basis of Legendre polynomials, yielding

ℳℓ(s) = (2ℓ + 1)/2 ∫₋₁¹ ℳ(s, μ) Lℓ(μ) dμ .
The second approach uses the following definition,

ℳℓ(s) = ( 1 + Wℓ(s) ) / ( 1 + ξℓ(s) ) ,
which is motivated by the fact that the denominator is the actual multipole of the unweighted 2PCF. This is not the case in the first definition in Eq. (32). The second definition has been used for instance by White (2016) and Satpathy et al. (2019). Throughout this work we use the form given in Eq. (32).
4. Simulations
4.1. Characteristics
To investigate different marked correlation functions and assess their discriminating power regarding GR and MG, we use the Extended LEnsing PHysics using ANalytic ray Tracing (ELEPHANT) simulation suite, thoroughly discussed in Sect. II B. of Alam et al. (2021). We only provide a brief description of it in the following. This simulation suite consists of five realisations of GR with ΛCDM cosmology, f(R) gravity with three different values of |fR0|=[10−6, 10−5, 10−4], and nDGP gravity with H0rc = [5.0, 1.0]. Henceforth, we refer to the different simulations as GR, F6, F5, F4, N5, and N1, respectively. The background cosmology is summarised in Table 2 and resembles the best-fitting cosmology obtained from the nine-year Wilkinson Microwave Anisotropy Probe (WMAP) CMB analysis presented in Hinshaw et al. (2013).
Table 2. Reference cosmology of the ELEPHANT (first column) and DEMNUni (second column) simulations.
Key simulation parameters are summarised in Table 3. The dark matter halos have been identified with the ROCKSTAR algorithm (Behroozi et al. 2013) and have been subsequently populated with galaxies using the 5-parameter HOD model of Zheng et al. (2007). For each realisation, redshift-space coordinates have been calculated by fixing the LOS to one of the three simulation box axes, individually, and by ‘observing’ the box from a distance equal to 100 times the box side length. One crucial property of this suite of simulations, which makes it particularly suitable for our studies, is the matching of the projected 2PCF of galaxies wp(rp) predicted by GR in the MG simulations. The latter was done by tuning the HOD parameters of the MG simulations. For the GR simulation, the best-fit HOD parameters were taken from Manera et al. (2013).
Table 3. Characteristics of the ELEPHANT simulation suite.
We use a second set of simulations to assess discreteness effects in the estimation of the mark and how they propagate into the marked correlation function. For this, we make use of the Covmos realisations from Baratta et al. (2023). These are not full N-body simulations, rather they reproduce dark-matter particle one- and two-point statistics following the technique described in Baratta et al. (2020). This procedure consists of applying a local transformation to a Gaussian density field such that it follows a target PDF and power spectrum. The point set is then obtained by a local Poisson sampling on the linearly interpolated density values. For the set of Covmos realisations, the target PDF and power spectrum were set by the DEMNUni N-body simulation (Castorina et al. 2015) statistics. The DEMNUni simulation assumes a ΛCDM cosmology with parameters presented in Table 2. The Covmos catalogues contain about 20 × 106 points in a box of 1 h−1 Gpc side length, resulting in a number density of about 0.02 h3 Mpc−3. This high density level enables us to treat those catalogues as if they were (almost) free from shot noise.
4.2. Two-point correlation function
Although the galaxy projected correlation function is matched in the ELEPHANT suite, it is instructive to assess residual deviations in other statistics, particularly for the interpretation of differences arising in the analysis of marked correlation functions.
We measured both the real- and redshift-space correlation functions in 30 linear bins in r and s, respectively, ranging from 10−3 h−1 Mpc to 150 h−1 Mpc. For the redshift-space measurements, we used the ELEPHANT catalogues with the LOS fixed to the x-direction. All correlation function measurements in this work have been performed using the publicly available package Corrfunc (Sinha & Garrison 2019, 2020). In the upper panel of Fig. 1, we show the standard correlation function in real space for the different gravity simulations. The measurements appear to be within the respective uncertainties over all scales, although on very small scales a more careful assessment of possible deviations is advised, as the error bars are very small. In Fig. 2, the monopole (left) and quadrupole (right) of the anisotropic 2PCF in redshift space are presented in the upper panels. Similarly to the real-space correlation function, the multipoles are within the respective uncertainties on large scales, although the N1 measurement appears to deviate from the others in the quadrupole. On smaller scales, discrepancies seem to appear as uncertainties get very small and a visual inspection is not sufficient to quantify those differences.
To properly assess the difference between MG and GR in weighted or unweighted correlation functions, we define the difference between MG and GR for a given statistic X as the mean

ΔX(r) = (1/N) Σi [ Xi,MG(r) − Xi,GR(r) ] ,
where i ranges over the N realisations. However, the mean difference alone does not inform about the significance, as the data might fluctuate much more than the differences. We therefore divide the mean difference by the standard deviation

σavg(r) = √[ 1/(N(N − 1)) Σi ( ΔXi(r) − ΔX(r) )² ] ,

with ΔXi(r) = Xi,MG(r) − Xi,GR(r).
The factor of 1/(N − 1) is necessary in order to compute an unbiased standard deviation, since we only have five realisations at hand. Furthermore, the additional factor of 1/N comes from the fact that we want the error on the mean and not of a single measurement. In a similar manner, we compute the standard deviation of a single marked correlation function as

σs(r) = √[ 1/(N − 1) Σi ( Xi(r) − X̄(r) )² ] .
In the end, the ratio of interest is the signal-to-noise ratio (S/N) defined by

S/N(r) = ΔX(r) / σavg(r) ,
giving directly the difference in terms of standard deviations. If the absolute value of this S/N is larger than 3 then we would advocate a significant deviation between MG and GR.
Another quantity of interest that we use throughout this work is the ratio between the error on a single measurement of the marked correlation function and the noise σavg, as used in the signal-to-noise ratio. We refer to this ratio as

α(r) = σs(r) / σavg(r) ,
and will include it in the figures as shaded regions. The error of a single measurement σs(r) is hereby taken for the GR case. This ratio gives an indication of the statistical significance of a difference if we had only one simulation/measurement at hand. To assess this, we have to compare S/N(r) with α(r), and if S/N(r) > 3α(r) then we can claim a 3σ difference to be detectable with a single measurement. Of course, care must be taken if the error of a single measurement is significantly different between GR and MG, since α(r) will differ depending on which simulations are used to estimate the error, therefore possibly affecting conclusions.
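These definitions translate directly into code; a sketch, assuming the statistic of interest has been measured in matching bins for N paired realisations (array shapes and names are ours):

```python
import numpy as np

def snr_and_alpha(stat_mg, stat_gr):
    """S/N(r) and alpha(r) from arrays of shape (N, n_bins)."""
    diff = stat_mg - stat_gr                     # per-realisation difference
    n_real = diff.shape[0]
    mean_diff = diff.mean(axis=0)
    # ddof=1 gives the unbiased standard deviation (factor 1/(N-1));
    # the extra 1/sqrt(N) turns it into the error on the mean difference.
    sigma_avg = diff.std(axis=0, ddof=1) / np.sqrt(n_real)
    snr = mean_diff / sigma_avg
    sigma_single = stat_gr.std(axis=0, ddof=1)   # error of a single (GR) measurement
    return snr, sigma_single / sigma_avg
```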
In the lower panel of Fig. 1, we display the S/N(r) as introduced above. The differences rarely cross the limit of 3σ, except for the very lowest scales below 20 h−1 Mpc, or for F4 at intermediate scales where deviations can reach up to 6σ. However, these large deviations occur only at isolated scales and there is no general trend. This suggests that the crossing of the 3σ border might be caused by sample variance, as the statistical power is limited with only five realisations. In addition, when considering the error of a single realisation volume, as displayed by the blue shaded region, we can see that the deviations for F4 are within the uncertainty.
Fig. 1. Difference in the measured standard correlation function ξ(r) between GR and MG in real space. In the upper panel, the correlation functions themselves are plotted, where different colours indicate the underlying gravity theory. The curves show the average over five realisations and the errorbars correspond to the mean standard deviation over these realisations. The lower panel quantifies possible differences in terms of the S/N, as introduced in Sect. 4. Black dashed lines indicate a S/N of ±3. The shaded region refers to the error of a single measurement divided by the mean error of the difference, as described in Eq. (38).
In the lower panels of Fig. 2 we show the S/N(s) for the multipoles in redshift space. Except for the smallest scales, the simulations F5, F6 and N5 show generally no significant differences to GR. In the case of the monopoles, the S/N is close to 0 for almost all scales. For F4 and N1 significant differences are present although mainly on scales below 20 h−1 Mpc. On larger scales both simulations show S/N varying around 3σ. These differences are mostly within the uncertainty, if the error of a single volume is considered. This confirms that by considering the standard correlation function only, in real or redshift space, we cannot really distinguish between GR and those MG models.
Fig. 2. Differences in the measured standard correlation function multipoles in redshift space between GR and MG. The upper panels present the mean correlation function multipoles taken over five realisations, with the monopole on the left side and the quadrupole on the right side. The errorbars correspond to the mean standard deviation over five realisations. The lower panels show the respective S/N, with 3σ indicated by the black dashed lines. The colour coding refers to the different gravity simulations and the corresponding shaded regions refer to the error of a single measurement divided by the mean error of the difference (see Eq. (38)).
5. Marks for modified gravity
There is a vast space of possible marks that can be used and the specific choice strongly depends on the context in which the marked correlation function is studied. The most popular mark function M[ρ(x)] in the literature within the context of detecting MG was introduced by White (2016) and takes the form

M[ρ(x)] = [ (ρ* + 1) / (ρ* + ρ(x)/ρ̄) ]^p ,
where ρ* and p are free parameters used to control the mark’s upweighting of low- versus high-density regions. We refer to this mark in the following as the White mark, indicated by a W subscript. The White mark can be seen as a local transformation of the density field. Choices for the free parameters range from (ρ*, p) = (10.0, 7.0) (Aviles et al. 2020) and (ρ*, p) = (4.0, 10.0) (Alam et al. 2021; Valogiannis & Bean 2018) to (ρ*, p) = (10−6, 1.0) (Hernández-Aguayo et al. 2018). Upweighting galaxies in high-density regions using (ρ*, p) = (1.0, −1.0) has also been explored in Alam et al. (2021). Other values have also been investigated by Satpathy et al. (2019) and by Massara et al. (2021) for the marked power spectrum. This underlines the wide range of possible mark functions and configurations to be used and the amount of freedom this can introduce in the analysis.
Marks based on the local density require an estimation of the latter from a finite point set in the first place, and there exist several different approaches to do so. While we use an estimation based on mass assignment schemes (MAS), adaptive approaches such as Delaunay (Schaap & van de Weygaert 2000) or Voronoi tessellations, as used in void finders (e.g. Neyrinck 2008), could also be used. With a MAS applied to a discrete density field (subscript f for finite)

ρf(x) = m Σi δD(x − xi) ,
where the i-th point is located at position xi, the estimated density field on the grid takes the form (Sefusatti et al. 2016)

1 + δ̂(x) = (1/n̄c) Σi F( (x − xi)/a ) ,
where n̄c = Na³/V is the mean density of points per grid cell, N is the number of points, a is the size of one grid cell, and F(x) is the MAS kernel. The coordinate x is only evaluated at grid points but in principle can be placed anywhere. For this derivation, we assumed all points to have the same mass m and we made use of the simplified notation F(x) ≡ F(x₁)F(x₂)F(x₃). The density field obtained in this way is related to the true density field by a convolution with the MAS kernel. In this work, we mainly use a piece-wise cubic spline (PCS) for the MAS, but higher- and lower-order kernels are employed for specific tests. The explicit form of the kernels used, up to septic order, can be found in the appendix of Chaniotis & Poulikakos (2004).
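As an illustration of such a grid assignment, the sketch below uses the lower-order cloud-in-cell (CIC) kernel instead of the PCS kernel employed in this work; the structure is identical, only the kernel weights and support differ:

```python
import numpy as np

def cic_density_contrast(pos, n_grid, box_size):
    """Density contrast on a cubic grid with a CIC mass assignment."""
    counts = np.zeros((n_grid,) * 3)
    x = pos / (box_size / n_grid)                # positions in grid units
    i0 = np.floor(x - 0.5).astype(int)           # lower of the two neighbour cells
    w_hi = x - 0.5 - i0                          # weight of the upper neighbour
    for dx in (0, 1):
        for dy in (0, 1):
            for dz in (0, 1):
                w = (np.where(dx, w_hi[:, 0], 1 - w_hi[:, 0])
                     * np.where(dy, w_hi[:, 1], 1 - w_hi[:, 1])
                     * np.where(dz, w_hi[:, 2], 1 - w_hi[:, 2]))
                idx = (i0 + [dx, dy, dz]) % n_grid   # periodic wrapping
                np.add.at(counts, (idx[:, 0], idx[:, 1], idx[:, 2]), w)
    return counts / counts.mean() - 1.0
```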
5.1. Beyond local density
A way to include information beyond the local density field is by using the large-scale environment, which can be divided into clusters, filaments, walls and voids. Generally, there are different ways to define these structures from a galaxy catalogue, ranging from the sophisticated approach by Sousbie (2011) based on topological considerations to the work of Falck et al. (2012) using phase-space information. One of the most straightforward approaches utilises the T-web formalism (Forero-Romero et al. 2009) based on the Hessian of the gravitational potential. For a thorough comparison of the above-mentioned cosmic web classifications and many more we refer the reader to Libeskind et al. (2018). In this analysis we deploy the T-web classification that uses the relation of the eigenvalues λ1, λ2 and λ3 of the tidal tensor to the density evolution, as given in Cautun et al. (2014),

ρ(x, t)/ρ̄ = [ (1 − D(t)λ1)(1 − D(t)λ2)(1 − D(t)λ3) ]⁻¹ ,
where D(t) is the growth factor. This expression can be derived from Lagrangian perturbation theory to linear order (Zel’dovich 1970). The dimensionality of the structure then depends on the number of eigenvalues with positive sign. Three positive eigenvalues correspond to a cluster, as they encode a collapse along all three spatial directions. Two or one positive eigenvalues result in a filament or wall, respectively. If all eigenvalues are negative then ρ(x) will never diverge and we can interpret this as a void. A pitfall of this classification appears if some of the eigenvalues are very small but positive, as the corresponding structure might not collapse in a Hubble time. To circumvent this issue, while not having to rely on thresholds for the eigenvalues, we use the scheme proposed in Cautun et al. (2013). They give environmental signatures 𝒮 for ordered eigenvalues λ1 ≤ λ2 ≤ λ3 in their Eqs. (6) and (7) as signatures for clusters, filaments, walls and voids, respectively. We adapted this scheme to be used on the eigenvalues of the tidal tensor instead of the Hessian of the density contrast as proposed in Cautun et al. (2013).
To obtain the tidal tensor in a simulation we follow the grid-based approach as used for the density field. With a density field on a grid at hand, the gravitational potential or tidal tensor can be straightforwardly deduced by a series of fast Fourier transforms (FT). For this we use the Poisson equation to relate the density field to the gravitational potential,

∇²Φ(x) = 4πG ρ̄ a² δ(x) .
We absorb the spatial constant into the definition of the gravitational potential. There exists a singularity when the wavevector is equal to zero, which we evade by simply setting the zeroth mode of Φ(k) to zero, as we expect the gravitational potential sourced by the density contrast to have a zero mean. The components of the tidal tensor Tij can then be derived by taking successive derivatives in the respective directions as

Tij(k) = −ki kj Φ(k) = ki kj δ(k)/k² .
The off-diagonal terms suffer from a break in the Fourier symmetry pairs when evaluated on a finite grid. This leads to a non-vanishing imaginary part once the tidal tensor in configuration space is obtained by inverse FT. We circumvent this issue by setting the imaginary part to zero, applying a filter that sets the symmetry-breaking modes at the Nyquist frequency to zero. In our implementation we evaluate the environmental signatures on the grid over which the eigenvalues of the tidal tensor have been computed. For each grid cell the largest signature defines the corresponding environment, and if all signatures are zero then the environment is set to be a void.
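A compact sketch of this FFT chain (density contrast to potential, to tidal tensor, to eigenvalues); for brevity it simply takes the real part of the inverse transforms rather than explicitly filtering the symmetry-breaking Nyquist modes described above:

```python
import numpy as np

def tidal_eigenvalues(delta, box_size):
    """Ordered eigenvalues of T_ij = d_i d_j Phi per grid cell,
    with nabla^2 Phi = delta (constants absorbed into Phi)."""
    n = delta.shape[0]
    k = 2.0 * np.pi * np.fft.fftfreq(n, d=box_size / n)
    kx, ky, kz = np.meshgrid(k, k, k, indexing="ij")
    k2 = kx**2 + ky**2 + kz**2
    k2[0, 0, 0] = 1.0                      # dummy value; zero mode set below
    phi_k = -np.fft.fftn(delta) / k2
    phi_k[0, 0, 0] = 0.0                   # zero-mean potential
    kvec = (kx, ky, kz)
    T = np.empty(delta.shape + (3, 3))
    for i in range(3):
        for j in range(i, 3):
            # T_ij(k) = -k_i k_j Phi(k); taking the real part drops the
            # spurious imaginary residue of the off-diagonal terms
            tij = np.fft.ifftn(-kvec[i] * kvec[j] * phi_k).real
            T[..., i, j] = T[..., j, i] = tij
    return np.linalg.eigvalsh(T)           # ascending eigenvalues per cell
```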
Once each galaxy has been classified, this information can be used to enhance effects of MG in clustering measurements. The simplest use of the environmental classification is to divide the catalogue into sub-catalogues consisting of galaxies located in voids, walls, filaments and clusters, respectively, and compute auto-correlation functions. The difference in clustering amplitude in the different environments is expected to be stronger in MG, particularly in the correlation function of void galaxies. Bonnaire et al. (2022) computed the power spectra of density fields that were obtained by splitting the original density field into the respective contributions from galaxies in voids, walls, filaments and clusters. They applied this approach to the dark matter particles of the Quijote simulation (Villaescusa-Navarro et al. 2020), thereby having a much larger set of points. A drawback in the analysis of these galaxy mock catalogues, but also when using limited survey samples, is the loss of information due to discarding many galaxies, leading to an increase in the uncertainty of the measurements. Computing environmental correlation functions is the same as computing weighted correlation functions, but with a mark set to one for all galaxies living in the respective environment and zero otherwise. We refer the interested reader to Appendix A for some notes on the structure of weighted correlation functions for those kinds of weights. In the following we denote such marked correlation functions, consisting of the environmental weighted correlation function divided by the total unweighted correlation function as in Eq. (29), via their respective environment; for instance, the void marked correlation function.
Conversely, we can use the full catalogue of objects and assign larger weights to progressively more unscreened galaxies, as done by the following mark field,

mLEM(x) = 4 if x is in a void, 3 if in a wall, 2 if in a filament, 1 if in a cluster,
where LEM stands for linear environment mark. This approach is similar to the density-split technique as used in Paillas et al. (2021), where cross-correlation functions of galaxies living in differently dense regions have been investigated. Our proposed mark can also be divided into a specific combination of auto- and cross-correlations. This mark could in principle be extended to a WallLEM mark, where wall galaxies get assigned a weight of 4 and void galaxies a weight of 3, as well as similarly peaked functions for filaments and clusters. However, as we expect MG to be the strongest in low-density regions, we restrict ourselves to upweighting void or wall galaxies only.
Yet another idea of using the environmental classification of galaxies as a mark would be to further increase the anti-correlation present in low-density regions. This can be accommodated with the following mark,

mAC(x) = −1 if x is in a void, +1 otherwise,
where we abbreviated anti-correlation with AC. In principle, there is no difference if we switch signs of this mark because from Eq. (28) it is clear that any overall factor of the marks would be cancelled by the division of the normalisation. This mark leaves galaxy pairs that are in voids unweighted as well as galaxy pairs not in voids. However, if one galaxy is in a void and the other is not, the weight will be -1 thereby creating an anti-correlation.
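Given a per-galaxy environment label, both marks reduce to simple look-ups. A sketch, assuming an integer encoding 0 = void, 1 = wall, 2 = filament, 3 = cluster (the encoding itself is our choice):

```python
import numpy as np

def lem_mark(env):
    """VoidLEM: weight 4 for void galaxies down to 1 for cluster galaxies."""
    return 4 - env

def void_ac_mark(env):
    """VoidAC: -1 for void galaxies and +1 otherwise, so that mixed
    void/non-void pairs carry a pair weight of -1."""
    return np.where(env == 0, -1, 1)
```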
Marks based on the tidal tensor components may appear promising to go beyond the local density. An interesting quantity, first introduced by Heavens & Peacock (1988) and then used by Alam et al. (2019), is the tidal torque, which is defined in terms of the differences between the eigenvalues λ1, λ2 and λ3 of the tidal tensor: the larger the differences between the eigenvalues, the more anisotropic is the structure. Hence we expect the tidal torque to be large for filaments and walls, and small for clusters or voids. Another field depending directly on tidal tensor components is the tidal field, also known as the second Galileon 𝒢2 (Nicolis et al. 2009), which was used extensively for the emergence of non-local bias between galaxies and dark matter (Chan et al. 2012). The tidal field is defined as 𝒢2 = (∂i∂jΦ)² − (∂i∂iΦ)², where we can identify the components of the tidal tensor as introduced in Eq. (44). In practice, we investigate two separate marks consisting of the tidal field and the tidal torque used directly as mark fields. Using these fields in this way should give an insight into the suitability of the tidal field or tidal torque to disentangle MG from GR.
In Fig. 3, we present the different marked correlation functions introduced in this section. For the cluster marked correlation function we see a strong signal on very small scales, which relates to the correlation between galaxies inside clusters. The compensation feature on scales between 20 h−1 Mpc and 60 h−1 Mpc comes from less clustered regions around clusters and is similar, although reversed, to the compensation seen in the void-galaxy cross-correlation function (e.g. Aubert et al. 2022; Hamaus et al. 2022). The filament and wall marked correlation functions show progressively less signal, as the clustering of galaxies inside walls and filaments is closer to the total clustering of all galaxies. Notably, if voids are considered, the observed signal below unity implies that void galaxies are less clustered compared to the total clustering. The large signal of the cluster marked correlation function comes at the cost of larger errors, due to the small number of galaxies residing in clusters. In general, we aim for marked correlation functions with a signal different from unity over a wide range of scales, as this might lead to differences at those scales between MG and GR. On the other hand, if the marked correlation function stays very close to unity on most scales, then any possible difference between MG and GR can only originate from the clustering itself, provided the marked correlation function of MG is also close to unity. However, the 2PCF is matched between MG and GR in the ELEPHANT simulations to fit observations, as described in Sect. 4. For these reasons, the VoidAC mark and particularly the WallAC mark are of strong interest, as they exhibit a signal up to large scales. Looking at the lower panel of Fig. 3, we can see a strong signal when the tidal field is used, which extends to large scales. The same, although with considerably less amplitude on small scales, is found for the tidal torque. Hence, these marks are also interesting candidates to be investigated to discriminate between MG and GR. Although an impact of the mark is a necessary prerequisite, it is not sufficient to guarantee a disentanglement of GR from MG, because the signal could nevertheless be the same in MG and GR.
Fig. 3. Summary of the different marked correlation functions using marks based on the tidal field, tidal torque (both in the lower panel) or large-scale environment (upper panel). The two panels show measurements made using the ELEPHANT simulations of GR. The black dashed line indicates an amplitude of 1. Curves represent the mean taken over five realisations and the error corresponds to the mean standard deviation over five realisations. The measurements have not been corrected for any form of bias due to shot-noise effects.
It has to be noted that the marked correlation functions shown in Fig. 3 are not corrected for a possible bias due to the estimation of the mark on a discrete catalogue, as we discuss in the next section. Hence, the exact amplitude of the measurements might be subject to changes if such a correction is applied. For the marks based on the environmental classification we do not expect this bias to be particularly strong, because possible misclassifications, originating from a biased estimate of the density field, should not affect every galaxy in a catalogue.
5.2. Anti-correlating galaxies using local density
Until now, we have seen that marks based on the local density are particularly simple, and introducing an anti-correlation with the mark appears to be promising regarding the discrimination between GR and MG. Therefore, we propose the following mark function based on the hyperbolic tangent, satisfying both aforementioned advantages,

M[δR(x)] = tanh( a [δR(x) − b] ) ,
where a and b are parameters controlling how steeply the transition from −1 to 1 takes place and where the transition happens, respectively. In general, we could use a third parameter c as an overall factor in front of the hyperbolic tangent, but constant factors can be pulled out of the mean and hence are cancelled by the normalisation in Eq. (28). It is worth mentioning that theoretical modelling of marks based on the environmental classification, such as the VoidAC mark, might be particularly challenging, as it is not straightforward to express the mark in terms of the density contrast. From a theoretical perspective, marked correlation functions with marks based on the density are more tractable. Furthermore, discreteness effects arising in the density estimation itself can be more easily corrected for in the measurement of the marked correlation function, as elucidated in the next section.
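A sketch of the tanh mark as parameterised above (the default a and b are placeholders, not the tuned values investigated later):

```python
import numpy as np

def tanh_mark(delta_R, a=1.0, b=0.0):
    """Hyperbolic-tangent mark on the smoothed density contrast delta_R:
    a controls the steepness of the -1 -> 1 transition, b its location.
    The overall sign is irrelevant, as it cancels in the normalisation."""
    return np.tanh(a * (delta_R - b))
```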
6. Propagation of discreteness effects of the mark estimation into weighted correlation functions
When dealing with finite point sets we can assume the sampling process to be locally of Poisson nature (Layzer 1956). This means that the number of points found in small-enough grid cells appears to be drawn from a Poisson distribution with some expectation value. However, the expectation value of the local Poisson process has a PDF of its own. The PDF from which the expectation values are drawn is continuous and describes the density field globally. If this PDF is a Dirac delta function then the expectation value of the Poisson process is the same everywhere and the moments estimated from the sample points coincide with the moments of the continuous PDF. However, if the continuous PDF is not a Dirac delta function, as is the case for the cosmological density field, then the estimated moments contain a bias with respect to the true moments of the continuous PDF. This bias is usually called shot noise or Poisson noise in the literature. In the power spectrum estimation, the shot noise appears as an additive constant on all scales in k-space. In the 2PCF instead, the shot noise emerges only at zero lag, that is, at a pair separation of zero. Hence, shot noise is inherently a problem of correlating a point with itself. When we use the density field inside a mark function, we use a smoothed version of the true field. We spread points over a finite volume, leading to self-correlations also at non-zero pair separations and in turn to shot-noise effects. Intuitively, this can be understood in the following manner: in the unsmoothed case all points are infinitely small dots, while in the smoothed case the points are represented by circles with a non-zero radius. Inside this radius a point can be correlated with itself.
To precisely understand how shot noise affects marked correlation functions we have to make a small detour and carefully distinguish between the statistical properties of the true density contrast δ(x), the smoothed true density contrast δR(x), and the respective quantities estimated from finite point sets, hereby denoted with an f in the subscript, δf(x) and δRf(x). The weighted correlation function estimated from a finite point set can be written as

1 + Wf(r) = wf(r) / m̄f² ,

where we defined the quantity wf(r) as

wf(r) = ⟨ M[δRf(x)] (1 + δf(x)) M[δRf(x + r)] (1 + δf(x + r)) ⟩ ,

and m̄f = ⟨M[δRf(x)](1 + δf(x))⟩ is the mean mark taken over the points, i.e. weighted by the density. In Eq. (49), both wf(r) and m̄f are expected to be sensitive to the noise induced by the auto-correlation of objects with themselves, and we denote the corresponding shot-noise free signals w and m̄. Indeed, Eq. (50) shows that there is a mark function M of the smoothed density field that is multiplied by the density field itself. This constitutes the main source of shot noise, which is expected to occur even at large separation r, where there is no overlap between the smoothing kernels. In this section, we first show the effect of shot noise on the marked correlation function for a specific mark and then devise a general method to correct for shot noise.
6.1. A toy model
To understand how the shot noise propagates into wf(r) and m̄f, we focus on a very simple mark function defined by M[δRf(x)] = δRf(x) and M[δRf(x + r)] = 1. This corresponds to a marked correlation function in which only one point of the pair is weighted by the density contrast and the other point stays unweighted. In this instructive case we can split wf(r) into three terms,

wf(r) = w1(r) + w2(r) + w3(r) ,
where the individual contributions are given by

w1(r) = ⟨δRf(x) δf(x)⟩ ,
w2(r) = ⟨δRf(x) δf(x + r)⟩ ,
w3(r) = ⟨δRf(x) δf(x) δf(x + r)⟩ ,
and consist of correlators between the smoothed and unsmoothed density fields estimated on a finite point set. Given that the smoothed density field δRf(x) is related to the density field δf(x) through the convolution

δRf(x) = (1/a³) ∫ d³y F( (x − y)/a ) δf(y) ,
one can immediately see that the three contributions in Eq. (51) involve integrals over two- and three-point correlation functions of the density field. In general, N-point correlation functions are affected by shot noise as (see Chan & Blot 2017)

ξf(n)(x1, …, xn) = ξ(n)(x1, …, xn) + Σ(m = 1 … n − 1) n̄⁻ᵐ 𝒜m(n)(x1, …, xn) ,
where the function 𝒜m(n) contains all the scale dependency of the shot-noise contribution to the N-point correlation function at the respective order in n̄, the mean density of points in the volume V. Therefore, the shot noise takes the form of a power series in 1/n̄. In particular, for the 2PCF we have

ξf(x1, x2) = ξ(x1, x2) + (1/n̄) δD(x1 − x2) ,
and for the three-point correlation function

ζf(x1, x2, x3) = ζ(x1, x2, x3) + (1/n̄) [ δD(x1 − x2) ξ(x1 − x3) + δD(x2 − x3) ξ(x2 − x1) + δD(x3 − x1) ξ(x3 − x2) ] + (1/n̄²) δD(x1 − x2) δD(x2 − x3) .
As a result, one can express each individual term of Eq. (51) in terms of the true signal and a shot-noise contribution (depending on the number density of objects) as

w1 = ⟨δR(x) δ(x)⟩ + F(0)/n̄c ,
w2(r) = ⟨δR(x) δ(x + r)⟩ + F(r/a)/n̄c ,
w3(r) = ⟨δR(x) δ(x) δ(x + r)⟩ + [ F(0) + F(r/a) ] ξ(r)/n̄c ,
where we introduce n̄c = n̄a³, which corresponds to the mean number of objects per grid cell. These noise contributions are obtained by using Eq. (53) in Eq. (52), inserting Eqs. (55) and (56), and integrating out the Dirac delta functions where applicable. It has to be noted that we only report noise contributions in Eq. (57) that are not proportional to Dirac delta functions, as these would appear at zero lag only and hence be irrelevant for our considerations. By utilising the aforementioned splitting into signal and noise we can write Eq. (51) as

wf(r) = w(r) + wSN(r) ,
where w is the true signal and the shot-noise contribution is formally expressed as

wSN(r) = (1/n̄c) [ F(0) + F(r/a) ] [ 1 + ξ(r) ] .
Equation (59) shows that even if on a scale r larger than the smoothing scale (when F(r/a) = 0) there is still a large-scale contribution to the shot noise due to F(0). In addition, the large-scale contribution is expected to decrease when increasing the order of the MAS (F(0) decreases). That is the reason why, in general, increasing the order of the MAS reduces the intrinsic shot-noise contribution to the signal.
Following the same reasoning, it is straightforward to show that, within the toy model, the shot-noise-affected mean mark m̄f, estimated from a discrete set of objects, can be related to the true mean mark m̄ via

m̄f = m̄ + ϵm̄, (60)

where

ϵm̄ = F(0)/N̄. (61)

Finally, by combining Eqs. (58) and (60), we can show that the shot-noise-corrected weighted correlation function 1 + W(r) can be expressed as

1 + W(r) = [wf(r) − ϵw(r)]/(m̄f − ϵm̄). (62)

This demonstrates that the shot-noise correction on the marked correlation function implies correcting both the numerator wf(r) and the denominator m̄f in order to properly extract the true signal. This is at odds with the usual shot-noise corrections on N-point correlation functions, which are purely additive.
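To make the toy-model correction concrete, the following sketch (our own code) implements Eqs. (58)-(62) for measurements on a grid; the triangular kernel profile and the input 2PCF are placeholders for illustration only, not the actual kernel and measured ξ(r) used in this work.

```python
import numpy as np

def kernel_F(x):
    """Stand-in smoothing-kernel profile: peaks at F(0), zero beyond |x| = 1."""
    return np.clip(1.0 - np.abs(x), 0.0, None)

def toy_model_correct(r, w_f, mbar_f, xi_r, N_cell, a):
    """Remove the linear 1/N shot noise of Eqs. (59) and (61).

    r, w_f, xi_r : arrays of separations, measured w_f(r), and the 2PCF
    mbar_f       : measured (noisy) mean mark
    N_cell       : mean number of objects per grid cell
    a            : grid-cell size
    """
    eps_w = (kernel_F(0.0) + kernel_F(r / a)) * (1.0 + xi_r) / N_cell  # Eq. (59)
    eps_m = kernel_F(0.0) / N_cell                                     # Eq. (61)
    return (w_f - eps_w) / (mbar_f - eps_m)                            # Eq. (62)
```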
There is another subtlety, due to the fact that we assign the mark field back onto the galaxies to measure the weighted correlation function. This back-assignment is done with a specific scheme, in the sense that we check in which grid cell a galaxy is located and assign the mark corresponding to that grid cell, thereby introducing another smoothing of the field with a nearest-grid-point (NGP) kernel. In our computation this leads to an additional convolution for the field δRf(x). We show in Appendix B that this additional convolution is equivalent to a single one with a kernel that is the convolution of a PCS and an NGP kernel, that is, a quartic kernel. Hence, in the actual calculation of the analytic shot noise as in Eqs. (59) and (61), a quartic kernel has to be used, whose explicit expression can be found in the appendix of Chaniotis & Poulikakos (2004).
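This statement can be cross-checked numerically with a short sketch of our own (one-dimensional, using the standard B-spline forms of the NGP and PCS kernels): convolving the cubic B-spline with the unit top-hat yields the quartic B-spline, whose support extends to 2.5 grid cells on either side.

```python
import numpy as np

dx = 1e-3
x = np.arange(-3.0, 3.0, dx)

def ngp(x):                        # B0: top-hat of unit (cell) width
    return np.where(np.abs(x) < 0.5, 1.0, 0.0)

def pcs(x):                        # B3: piecewise-cubic B-spline kernel
    ax = np.abs(x)
    inner = (4.0 - 6.0 * ax**2 + 3.0 * ax**3) / 6.0
    outer = np.where(ax < 2.0, (2.0 - ax)**3 / 6.0, 0.0)
    return np.where(ax < 1.0, inner, outer)

quartic = np.convolve(pcs(x), ngp(x), mode="same") * dx     # B4 = B3 * B0
print("normalisation :", quartic.sum() * dx)                # ~1
print("support radius:", np.abs(x[quartic > 1e-10]).max())  # ~2.5 cells
```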
To validate the analytic prediction of the noise in wf(r) and m̄f, we use five realisations of Covmos as introduced in Sect. 4. The goal is to have realisations at different number densities to assess the behaviour of shot noise as a function of 1/N̄. Therefore, for each realisation we deplete the catalogue by randomly discarding points down to the desired density. The exact densities are motivated by applying the shot-noise correction later to the ELEPHANT simulation suite, which has much lower point densities compared to Covmos. Hence, by depleting the Covmos realisations down to {1.7%, 1.53%, 1.36%, 1.19%, 1.02%, 0.85%, 0.68%, 0.51%} we generate catalogues with the same N̄ as the ELEPHANT suite would have if it were depleted down to {100%, 90%, 80%, 70%, 60%, 50%, 40%, 30%}, using 64 grid cells per dimension. The depletion is done to match the density of points in grid cells, and not in the full volume, because the shot noise behaves as a power series in 1/N̄. The depletion is repeated 100 times, followed by taking the mean, in order to minimise the sample variance coming from the stochasticity of the random depletion process. We need to carefully distinguish the five independent realisations of Covmos from the depletion realisations used to obtain a converged result for a depleted catalogue, which have to be generated for each of the five independent realisations.
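The depletion step itself is a plain random subsampling; a minimal sketch (our own code, with hypothetical function names) that averages an arbitrary statistic over repeated depletion draws could look as follows.

```python
import numpy as np

def deplete(positions, fraction, rng):
    """Randomly keep `fraction` of the points of a catalogue."""
    n_keep = int(fraction * len(positions))
    idx = rng.choice(len(positions), size=n_keep, replace=False)
    return positions[idx]

def mean_over_depletions(positions, fraction, statistic, n_real=100, seed=0):
    """Average a clustering statistic over `n_real` random depletions."""
    rng = np.random.default_rng(seed)
    results = [statistic(deplete(positions, fraction, rng))
               for _ in range(n_real)]
    return np.mean(results, axis=0)
```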
As we have seen in Eq. (62), we need to measure wf(r) as well as m̄f, and those can be straightforwardly computed from the weighted correlation function at each level of depletion. In the upper panel of Fig. 4 we present the measurements in one Covmos realisation of wf(r) (blue points) and m̄f (orange points) as a function of 1/N̄. The scale at which we plot wf(r) is fixed to a bin close to 20 h−1 Mpc. We can already see that there is a linear relation with 1/N̄, as predicted by the expressions in Eqs. (59) and (61). The solid and dashed curves refer to the analytical prediction using the depletion down to 1.7% (the second data point from the left) as an anchor. That anchor is needed to obtain a noiseless signal by correcting for shot noise; the noise contribution, as a function of 1/N̄, is then added back to the true signal to obtain the curves. By doing so, the relative difference between the prediction and the measurement is exactly zero by construction for this depletion level, as can be seen in the lower panel of Fig. 4. Moreover, even for the other data points at different levels of depletion we can predict the expected signal with high accuracy. The relative difference is at the sub-percent level for both wf(r) and m̄f.
Fig. 4. Analytical correction for wf(r) and m̄f.
Now that we have established the correctness of our analytical predictions for wf(r) and m̄f individually, we can check how well they perform when combined into 1 + Wf(r), as shown in Fig. 5. In the upper panel we present the mean over the five Covmos realisations at different levels of depletion, as indicated with different colours in the legend. As expected, since the kernel and the 2PCF drop off at large separations, differences between the curves are only evident on small scales. This is further underlined by the lower panel, where the relative difference between the depletion down to 1.7% and the undepleted case is shown in black. This curve refers to the difference between the two if we had not applied any correction and only data with a depleted number density of 1.7% were available. For 1 + Wf(r) the relative difference can reach more than 10% on small scales, but at scales above around 60 h−1 Mpc the depleted and undepleted cases lie within 1% and the effect of shot noise becomes negligible. This is somewhat expected due to the smoothing of the density field, as the quartic kernel decreases down to zero over the course of 2.5 grid cells, which in Covmos corresponds to ≈40 h−1 Mpc. It is important to note here that, although we expect shot noise to be stronger when correlating within the volumes of the smoothing kernels, it is peculiar to the toy model that the shot-noise contribution contains only linear factors of the kernel, with and without the 2PCF. It can be shown in the more general case that if one weights both galaxies in a pair by the associated density field, then the shot noise will contain contributions from a convolution of two quartic kernels, resulting in a nonic kernel, which is much more extended in configuration space. In contrast to the black curve with no correction, we show in red the relative difference between the analytical correction of 1 + Wf(r) and the undepleted case. To have a fair comparison, we corrected the 1.7% case down to the density of the undepleted realisations, as these still have a finite, yet very high, density. The analytical correction reproduces the undepleted measurements to within 1% relative difference on all scales. We conclude that for the toy model we are able to analytically predict the shot noise. Moreover, we have shown that even with this simple toy model the shot noise acquires a non-trivial scale dependency. In the next section we extend this formalism to general weighted correlation functions and describe a procedure to estimate the signal without having to rely on an analytical model.
Fig. 5. Results of the analytic shot-noise correction applied to the toy model, for which only one point in a pair is weighted by the density contrast δRf. The upper panel shows the full weighted correlation function 1 + Wf as a mean over the five Covmos realisations, where different colours denote different levels of depletion. Errorbars are computed by taking the mean standard deviation over five realisations. To obtain the depleted catalogues we took the mean over 100 depletions. The lower panel shows the relative difference of the analytically corrected result to the undepleted case as a red dashed line. The solid black line refers to the relative difference of the 1.7% depletion level to the undepleted case, which illustrates the effect of applying no correction. Horizontal dashed lines in black indicate relative differences of ±1% and the vertical dashed line in grey refers to the side length of one grid cell. We used 64 grid cells per dimension and a PCS MAS to obtain the density field on the grid.
6.2. A general model
Building on the results obtained with the toy model, we can devise a general model with a mark function that is expandable in powers of the density contrast as

M[δRf(x)] = Σi ci δRf(x)^i, (63)

where ci are the coefficients of the Taylor series. Plugging the series expansion of Eq. (63) into Eq. (49), we arrive at

wf(r) = Σi, j ci cj ⟨[1 + δf(x)] δRf(x)^i [1 + δf(x + r)] δRf(x + r)^j⟩ (64)

and

m̄f = Σi ci ⟨[1 + δf(x)] δRf(x)^i⟩. (65)

It is evident from these expressions that, by weighting both galaxies in a given pair, the resulting marked correlation function will contain auto-correlation contributions of the mark with itself. If, for example, the weight is constructed from an external catalogue of voids, then the weighted correlation function will consist of a smoothed version of the void auto-correlation and void-galaxy cross-correlation functions. In Appendix A, we give some further insight into how the weighted correlation function can be split up into two auto- and one cross-correlation function for certain weighting schemes.
To work out the shot-noise contribution to wf(r) and m̄f we can use, analogously to Eq. (54), the relation

⟨[1 + δf(x)] δRf(x)^i [1 + δf(x + r)] δRf(x + r)^j⟩ = w(i, j)(r) + Σp ℬp(i, j)(r)/N̄^p, (66)

where w(i, j)(r) is the corresponding shot-noise-free correlator and ℬp(i, j)(r) contains the shot-noise contribution for a given (i, j) proportional to the inverse of N̄ to the power of p. Inserting this expression into Eq. (64) we obtain

wf(r) = Σi, j ci cj w(i, j)(r) + Σi, j Σp ci cj ℬp(i, j)(r)/N̄^p = w(r) + ϵw(r), (67)

where we identified in the second equality the first sum as the desired true signal w(r) and the second sum as the shot-noise contribution ϵw(r). Similarly, for m̄f we obtain

m̄f = m̄ + ϵm̄. (68)

At this point it is clear that the double sum can be written as a power series in 1/N̄, such that

ϵw(r) = Σp ϵw, p(r)/N̄^p and ϵm̄ = Σp ϵm̄, p/N̄^p, (69)

with correspondingly defined ϵw, p(r) and ϵm̄, p. The correction in the general case is therefore, analogously to Eq. (62),

1 + W(r) = [wf(r) − ϵw(r)]/(m̄f − ϵm̄)². (70)
The power series in 1/N̄ (which in principle extends to infinite order), together with the fact that, following Eqs. (55) and (56), the shot noise of N-point correlation functions is scale-dependent and contains (N − 1)-point correlation functions, makes an analytic correction in the general case intractable. It would require, in particular, the computation of higher-order correlation functions, which are computationally expensive. It might appear at first glance that a simple truncation of the Taylor expansion would solve the problem, but to avoid computing four-point correlators and above, the Taylor expansion would need to be cut already at linear order. Moreover, the conversion of moments into cumulants might lead to significant contributions from higher-order correlators at low order in 1/N̄. In the following we outline an approach that circumvents the analytical computation and uses the resummation of contributions into a power series in 1/N̄.
The quantities wf(r) as well as m̄f are directly measurable from simulations for a given mark. We therefore propose an algorithm consisting of a polynomial fit through measurements of wf(r) and m̄f made at different levels of depletion, that is, at different values of 1/N̄. For wf(r), the fit is done with the same polynomial order for each bin in r but with separate coefficients, which is necessary since the shot noise ϵw(r) is scale-dependent. With such a polynomial we can simply read off the noiseless signal from the y-axis intercept, as this gives the extrapolation to 1/N̄ = 0, i.e. infinite densities. It is important to note that truncating the fit at some polynomial order is not the same as truncating the Taylor expansion in δRf, as the linear coefficient in the power series contains the resummed contributions from all higher-order correlators as well. To test this approach and find the best polynomial order for the fit, we used the same depletion levels as described in the previous section; a sketch of the fitting procedure is given below.
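The sketch below is our own implementation of this procedure (array shapes and names are assumptions): one polynomial per separation bin for wf(r), one for the mean mark, with the true signals read off as the fitted intercepts at 1/N̄ = 0 and combined following Eqs. (49) and (70).

```python
import numpy as np
from numpy.polynomial import polynomial as P

def shot_noise_correct(inv_N, w_f, mbar_f, order_w=3, order_m=2):
    """
    inv_N  : (n_depl,) inverse mean number of objects per grid cell
    w_f    : (n_depl, n_bins) measured w_f(r) at each depletion level
    mbar_f : (n_depl,) measured mean mark at each depletion level
    """
    # One polynomial per r bin, since the shot noise is scale-dependent;
    # the zeroth coefficient is the value at 1/N = 0, i.e. the true signal.
    w_true = P.polyfit(inv_N, w_f, order_w)[0]
    mbar_true = P.polyfit(inv_N, mbar_f, order_m)[0]
    return w_true, mbar_true, w_true / mbar_true**2   # cf. Eqs. (49), (70)
```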
In Fig. 6, we present the results from polynomial fits to the quantities wf(r) and m̄f as appearing in Eq. (70). It is evident that in the general case higher-order shot-noise contributions play an important role, resulting in a more curved shape due to quadratic and cubic dependencies on 1/N̄. Therefore, a simple linear fit is no longer sufficient, and at least second- or third-order polynomials are to be used. Going to even higher orders, as we show with a fourth-order polynomial in purple, the behaviour outside of the fitted range becomes more unstable and can lead to severe over- or under-estimation of the true signal at the y-axis intercept. Moreover, since we only employ 8 data points, it is crucial to keep the polynomial order as low as possible, as otherwise overfitting of the data might occur. In contrast to the behaviour of wf(r), the dependency of m̄f on 1/N̄ appears to be much more linear, and fits with first- or second-order polynomials should be sufficient to recover the true signal accurately.
Fig. 6. Fitting procedure to obtain the shot-noise-corrected signal in the case of wf(r) (upper panel) and m̄f (lower panel).
In Fig. 7, the performance of the different correction orders, as well as the weighted correlation function 1 + Wf(r) and the quantity wf(r), are shown. The diverging behaviour of 1 + Wf(r) in the upper left panel for the depletions down to 0.68% and 0.85% (olive green and brown lines, respectively) can be understood when looking at the corresponding points of the mean mark in Fig. 6, that is, the second- and third-to-last data points at 1/N̄ of around 2.0 and 1.6. The mean mark is very close to zero in those cases, and therefore 1 + Wf(r) acquires a very large amplitude due to the division by m̄f. This behaviour is somewhat peculiar to marks that can switch sign, such as the tanh-mark, because in certain cases this can lead to a mean mark very close to zero. Moreover, this can result in a turn-around of the dependency on 1/N̄, as seen for the last two depletions, 0.68% and 0.51%, in both Figs. 6 and 7. While a very small mean mark does not appear to be problematic for the fit, it can be an issue if the true mean mark is very close to zero. Since we use only 8 data points in the fit, we have a limited accuracy on the recovery of m̄. This can become problematic as soon as the amplitude of the recovered mean mark approaches the accuracy of the fit, leading to very large relative uncertainties on m̄ and on 1 + Wf(r). In this case, the accuracy of the polynomial fit is not sufficient to properly recover small mean marks and the results should not be trusted. Even though one could try to mitigate this issue and improve the accuracy of the fit with better estimates of the data points from larger sets of depleted catalogues, we generally advise against using marks whose recovered mean mark is very close to zero.
Fig. 7. Results of the fitted shot-noise correction for the tanh-mark with (a, b) = (0.6, −0.5). The left panels present 1 + Wf while the right panels show wf.
In the upper right panel of Fig. 7, we show the measurements of wf(r). While the noise behaviour on small scales appears to be less severe compared to 1 + Wf(r), on large scales we can observe a constant offset. Focusing on the relative differences in the lower left panel of Fig. 7, it is evident that for this particular mark the effect of shot noise diminishes at scales of around 60 h−1 Mpc, as the relative difference between the undepleted case and the uncorrected measurement at 1.7% depletion shrinks to below 5%. This is in contrast to the toy model in Fig. 5, where the 5% threshold is already crossed at scales of around 40 h−1 Mpc. This illustrates the fact that the shot-noise behaviour can have a different amplitude and scale dependency depending on the chosen mark. The coloured lines in the lower panels refer to the different orders of the polynomial fits used for wf(r) and m̄f. From this we can conclude that fitting the behaviour of wf(r) with a third-order and m̄f with a second-order polynomial results in a satisfactory performance; this combination is used hereafter as the adequate shot-noise correction. With this choice, the relative difference in 1 + Wf(r), depicted by the orange line in Fig. 7, is within 5% across all scales, all the way up to 150 h−1 Mpc.
Now that we have an optimal choice for the polynomial orders to describe the shot noise as a function of 1/N̄, we need to assess how many depletion realisations yield converged results for the polynomial fits. Since the process of depletion consists in randomly discarding points such that we end up with some desired percentage of the original points, it is inherently noisy and should be repeated several times. The aim is to obtain a representation of the original catalogue at a lower density that looks as if the simulation had been run with fewer points in the first place. In Fig. 8 we show the best correction as obtained from Fig. 7, namely the 3rd/2nd-order polynomial, and compute relative differences for different numbers of depletion realisations. As we can see, using 30 realisations or more, the curves do not differ substantially and the results can be considered converged. Even with only 10 or 20 realisations at hand, the performance is nevertheless acceptable and well within 5%, except for the lowest bin in r. As a conservative choice, we use 30 realisations in the following. This is expected to allow the sample variance to be mitigated without affecting the resulting corrections.
Fig. 8. Test of convergence of the shot-noise correction for different numbers of depletion realisations. Black dashed lines refer to relative differences of ±5% and colours denote the number of depletion realisations. Relative differences are shown for the total 1 + Wf(r), where both wf(r) and m̄f are corrected.
6.3. Shot noise in redshift space
In real observations, the quantities of interest in clustering analyses are usually the multipoles of the anisotropic 2PCF. In the following, we show that the correction to the marked correlation function in redshift space is similar to that in real space. We indicate redshift-space quantities via their explicit dependency on the separation s and angle μ; the mean mark m̄f also has to be understood as measured in redshift space. For the full anisotropic marked correlation function ℳf(s, μ), the shot noise enters in the following way:

ℳf(s, μ) = wf(s, μ)/{m̄f² [1 + ξ(s, μ)]}.

Here, we take advantage of the fact that the unweighted 2PCF is not affected by shot noise and, hence, does not carry the subscript f. Since the shot noise does contain N-point correlators, it acquires an angle dependency as well. After decomposing the anisotropic marked correlation function into multipoles ℳf, ℓ(s) we obtain

ℳf, ℓ(s) = wf, ℓ(s)/m̄f²,

with

wf, ℓ(s) = (2ℓ + 1)/2 ∫ dμ Lℓ(μ) wf(s, μ)/[1 + ξ(s, μ)],

where Lℓ is the Legendre polynomial of order ℓ and the integral runs over −1 ≤ μ ≤ 1. Solving the former expression for the true signal ℳℓ(s), the shot-noise correction takes the same form analogous to Eq. (70),

ℳℓ(s) = [wf, ℓ(s) − ϵw, ℓ(s)]/(m̄f − ϵm̄)²,

with

ϵw, ℓ(s) = (2ℓ + 1)/2 ∫ dμ Lℓ(μ) ϵw(s, μ)/[1 + ξ(s, μ)]

being the redefined shot-noise contribution to wf in redshift space. Under the assumption of a mark that can be Taylor expanded in powers of δRf(x), the shot noise will still be a power series in 1/N̄, because the integral over μ is additive. This means that the previously introduced methodology of fitting polynomials to wf(r) and m̄f is applicable to the redshift-space multipoles of the marked correlation function as well. In the case of redshift-space multipoles of the weighted correlation function, the formulas are the same except for the division by 1 + ξ(s, μ). It is important to note here that, since the shot noise acquires a non-trivial angle dependency and hence has to be corrected for in each multipole, the shot noise will also appear in higher multipoles of the marked power spectrum. This is in contrast to the ordinary power spectrum with constant shot noise, which only affects the monopole.
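The multipole projection used here can be sketched compactly (our own helper, assuming ℳf(s, μ) has been measured on a regular μ grid; integration by the trapezoidal rule):

```python
import numpy as np
from numpy.polynomial import legendre

def multipole(M_s_mu, mu, ell):
    """(2*ell+1)/2 * integral over mu in [-1, 1] of M(s, mu) * L_ell(mu)."""
    L_ell = legendre.legval(mu, np.eye(ell + 1)[ell])     # Legendre polynomial
    integrand = M_s_mu * L_ell                            # (n_s, n_mu) * (n_mu,)
    trap = 0.5 * (integrand[:, 1:] + integrand[:, :-1]) * np.diff(mu)
    return 0.5 * (2 * ell + 1) * trap.sum(axis=1)
```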
To test our derived shot-noise correction in redshift space, we compute the analytical correction to the toy model in redshift space and compare it with our fitting methodology. As we have seen in Eqs. (59) and (61), the shot noise in the toy model is described by a linear polynomial in 1/N̄. It has to be noted that in this case, analogously to the real-space weighted correlation function, we correct wf with a linear factor of the mean mark, since only one of the galaxies in each pair is actually weighted. The results can be seen in Fig. 9, where we plot both the result from the polynomial fit and the analytical correction for the monopole and quadrupole. We find very good agreement between the two methods, and the relative difference in the monopole is below 1% over all scales up to 150 h−1 Mpc. For the quadrupole, the agreement is worse but still within around 2% for most of the scales. There are specific spikes in the relative difference, caused by the quadrupole crossing zero at about 15 h−1 Mpc and approaching zero on large scales.
Fig. 9. Redshift-space multipoles of the toy model, measured in the GR simulations of ELEPHANT and corrected for shot noise. The upper panel shows both the monopole and quadrupole in different colours, corrected via the polynomial fit (solid lines) and analytically (dashed lines). For better visualisation we offset the quadrupole by +1. Errorbars refer to the mean standard deviation over five realisations. The lower panel shows the relative difference between the analytic and polynomial corrections in percent.
6.4. Limits of the shot-noise correction
It is important to assess where the shot-noise correction breaks down because its underlying assumptions are no longer valid. While an exhaustive investigation is beyond the scope of this work, we discuss here several points in order to give conservative limits on when it is safe to apply the proposed method. The most crucial assumption comes from approximating the mark functional by a Taylor expansion in the density contrast, together with approximating the derived power series in 1/N̄ by a low-order polynomial. A Taylor series of a function f(x) has a convergence radius B, inside which the series converges to the true function, that is, for |x| ≤ B. There is a straightforward way to compute the convergence radius of the Taylor series of tanh(x), for which techniques such as the ratio test might not be applicable in certain cases: we can simply compute the distance to the singularity closest to the expansion point (x = 0), which gives the convergence radius. While tanh(x) is non-singular on the real axis, it has singularities in the complex plane. Since tanh(x) = sinh(x)/cosh(x), singularities appear when cosh(x) = 0, which, if we have both a shift b and a factor a as in the mark of Eq. (48), is the case at x = −b − (1/a) iπ(1/2 + c) with c ∈ ℤ. The closest singularity to x = 0 is obtained by setting c = 0, leading to a convergence radius of B = [b² + π²/(4a²)]^(1/2). For the White mark, the convergence radius is simply given by B = 1 + ρ* in the case of a positive exponent p in Eq. (39).
To assess the effective validity of our correction based on the convergence radius, we can reformulate it into a criterion involving a statistical quantity that is measurable from the catalogues. For this, we follow a similar approach as in Philcox et al. (2020). Starting from the mathematical convergence criterion for the Taylor expansion, |δRf| ≤ B, we take the density-weighted average2 of the square on both sides. After taking the square root we obtain

⟨δRf²⟩^(1/2) ≤ B.

The left-hand side can be straightforwardly estimated by taking the arithmetic mean of the square of δRf at the galaxy positions. For the Covmos realisations using 64 grid cells per dimension, ⟨δRf²⟩^(1/2) ranges from 0.61 to 1.01 over the different levels of depletion (1.7% to 0.048%). In contrast, for the ELEPHANT simulations of GR, ⟨δRf²⟩^(1/2) takes values from 1.32 to 1.72 for no depletion down to 30%, respectively. Choosing (a, b) = (0.6, −0.5) for the tanh-mark gives a convergence radius of B ≈ 2.67, which satisfies the convergence criterion, and thus we can trust the Taylor expansion. Furthermore, the White mark with (ρ*, p) = (4.0, 10.0) and with (ρ*, p) = (1.0, −1.0) has convergence radii of B = 5 and B = ∞, respectively, so these configurations also fulfil our convergence criterion.
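In practice, this check amounts to a few lines of code; the sketch below contains our own helper functions, with the mark definitions following our reading of Eqs. (39) and (48).

```python
import numpy as np

def tanh_mark(delta, a, b):
    """tanh-mark with amplitude parameter a and shift b."""
    return np.tanh(a * (delta + b))

def white_mark(delta, rho_star, p):
    """White mark of Eq. (39), written in terms of rho = 1 + delta."""
    return ((1.0 + rho_star) / (rho_star + 1.0 + delta)) ** p

def radius_tanh(a, b):
    """Distance from 0 to the nearest complex singularity of tanh(a(x + b))."""
    return np.hypot(b, np.pi / (2.0 * a))

def criterion_fulfilled(delta_at_galaxies, B):
    """Density-weighted rms of delta_Rf must lie inside the convergence radius."""
    return np.sqrt(np.mean(delta_at_galaxies**2)) <= B

print(radius_tanh(0.6, -0.5))    # ~2.67, as quoted above
print(radius_tanh(10.6, -0.5))   # ~0.52, used in the discussion below
```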
The differences between the catalogues used in this analysis that affect the convergence criterion are illustrated in Fig. 10, which shows the density-weighted PDF of δRf. The first thing to note is that for the black dashed curve, depicting the undepleted ELEPHANT PDF, no depletion is involved, leading to more noise compared to the solid black curve for Covmos. For the latter, we took the mean over 30 realisations of depletion down to 1.7% of the points of the full catalogue, to match the undepleted N̄ of ELEPHANT. It is evident from the figure that the PDF for Covmos is more peaked around zero, with a less pronounced tail towards high densities than exhibited by the ELEPHANT PDF. That difference can be explained by the fact that the Covmos realisations are meant to reproduce the distribution of dark matter particles, while the ELEPHANT catalogues contain galaxies and hence are biased tracers. Due to the higher skewness of the PDF in the ELEPHANT simulation, care has to be taken to select appropriate marks that do not violate the convergence criterion of the Taylor expansion.
Fig. 10. Comparison of the PDF of the density contrast weighted by the particle/galaxy density, as measured in Covmos and ELEPHANT (GR), respectively. Colours encode the level of depletion, resulting in the number densities per grid cell indicated in the legend. Dashed lines refer to the ELEPHANT simulation, while solid lines refer to the Covmos catalogues. We show the mean over five realisations; depleted measurements were obtained by taking the mean over 30 depletion realisations.
The careful reader might have noticed that for the White mark with (ρ*, p) = (10−6, 1.0) the radius of convergence is only around unity, so the above criterion would not be satisfied for the ELEPHANT simulation and only partially satisfied for the Covmos catalogues. However, we do find good recovery of the undepleted signal when applying the shot-noise correction to this configuration in the Covmos catalogues. Moreover, we find good recovery for the tanh-mark with parameters (a, b) = (10.6, −0.5), which has a convergence radius of only B ≈ 0.52, thereby violating the criterion completely. This shows that even if the convergence criterion is not fully satisfied, this does not directly imply a failure of the shot-noise correction. Due to the strong skewness of the distributions shown in Fig. 10, it might make sense to look directly at the percentage of points with assigned densities located inside the convergence radius. While this percentage ranges from 70% down to 46% in the Covmos realisations for the tanh-mark with (a, b) = (10.6, −0.5), for the White mark with (ρ*, p) = (10−6, 1.0) in ELEPHANT it still ranges from around 64% down to 51%, comprising at least half of the points. Therefore, we would still trust the results of the applied shot-noise correction in this configuration, as analysed in the next section. After all, even without a Taylor expansion, the shot-noise behaviour might be well described by a polynomial, since almost any function can be locally fitted with one.
As mentioned earlier, next to the Taylor expansion, another limiting factor are possible contributions of higher powers of 1/N̄, which are not captured by, e.g., third-order polynomial fits. This is connected to the amplitude of the Taylor coefficients, which only grow if the convergence radius is smaller than 1. In general, the coefficients of the polynomial are expected to decrease at some power, as otherwise the shot noise would look very noisy as a function of 1/N̄. However, the coefficients might not decrease fast enough for a low-order polynomial to be sufficient. This situation is expected to worsen when the Taylor coefficients grow. Unaccounted-for higher-order polynomial terms in the fit should manifest as a bias in the recovered signal. A way to circumvent this issue is to extend the polynomials to higher order, which, however, is not recommended in the case of only 8 data points due to overfitting. Although we do find good performance of the shot-noise correction for some marks with growing coefficients, we also find worse performance if, e.g., the b parameter is set to zero in the tanh-mark. In the latter case the Taylor coefficients are non-zero only for odd powers of δRf; hence, we suspect that higher-order correlators are more important, leading to higher-order polynomial contributions. While a thorough assessment of the impact of the Taylor expansion coefficients on the polynomial fit is not done in this work, we conservatively advise using only marks with a convergence radius larger than one, so as to have decreasing Taylor coefficients. Furthermore, the shift parameter b in the tanh-mark should be non-zero. In Sect. 8 we elaborate further on the connection between polynomial amplitudes and the smoothing induced by the MAS.
6.5. Shot noise in the White mark
With the previously developed methodology for correcting shot-noise effects in the weighted correlation function, we can evaluate the shot-noise contributions to marks used in the literature so far. One particularly widely used mark is the White mark given in Eq. (39), which has been used with the parameter combinations (ρ*, p) = (4.0, 10.0) and (ρ*, p) = (1.0, −1.0) in the work of Alam et al. (2021), and in the combination (ρ*, p) = (10−6, 1.0) in the work of Hernández-Aguayo et al. (2018). Both studies used an NGP scheme on the ELEPHANT simulations to compute the density field on the grid. In contrast, for most of our analysis we used a PCS MAS, which is much wider in configuration space and produces a smooth, continuous, and differentiable field.
Before we come to the shot-noise-corrected differences between GR and MG in the White mark, it is instructive to check how the shot-noise behaviour changes when a different MAS is used. This can be investigated even without a shot-noise correction by studying the marked correlation functions for the undepleted point set and the point set depleted down to 1.7% in the Covmos catalogues, as depicted in Fig. 11. For this particular case, we use 60 grid cells per dimension, instead of 64, to mimic more closely the procedure in Alam et al. (2021) and Hernández-Aguayo et al. (2018). By comparing the upper and middle panels in Fig. 11, it can be immediately seen that a more extended MAS, meaning a smoother density field, decreases the amplitude of the marked correlation function. Intuitively this makes sense, as we can expect a stronger small-scale correlation of the mark if the density field used for the mark is less smooth. However, this comes at the price of stronger shot-noise effects at small separations, as can be seen in the lowest panel. Below scales of around 10 h−1 Mpc, the relative difference between the depleted and undepleted catalogues is larger if an NGP MAS is used compared to a PCS scheme. However, while the PCS scheme certainly reduces shot-noise effects on the smallest scales, it shows effects extending up to larger scales, which can be seen as the solid lines (PCS) crossing the 5% limit at larger scales compared to the dashed lines (NGP). We have already seen such a feature in the toy model in Eqs. (59) and (61), where the shot noise is regulated to some extent by the MAS kernels. In our test on the Covmos realisations, we find the shot-noise correction to give biased results when an NGP MAS is used, due to the very small smoothing size and the shape of the NGP MAS. In Sect. 8 we give a more extended discussion of this aspect; in the following we refrain from using the NGP scheme for the White mark and use the PCS MAS throughout, unless otherwise indicated.
Fig. 11. Marked correlation functions for the White mark measured in the Covmos catalogues. The upper and middle panel show the marked correlation functions for different configurations of the parameters (ρ*, p) using NGP and PCS MAS, respectively. The dashed line refers to the undepleted case, whereas the solid line shows the measurement for a depleted catalogue down to 1.7%, which corresponds to the same mean density of points per grid cell as in the undepleted ELEPHANT simulations. The lowest panel displays the relative difference between the marked correlation function as measured in the full data set and the depleted one. We used 60 grid cells per side length for the density field and the vertical dashed line in grey corresponds to the side length of a grid cell.
Let us now focus on the impact of shot noise on the differences between MG and GR in the ELEPHANT simulations. We present in Fig. 12 the uncorrected marked correlation function as measured in ELEPHANT for different configurations of the White mark, alongside the shot-noise-corrected version. The MAS is fixed to PCS and we again use 60 grid cells per dimension. The shot noise appears to significantly affect the measurement only below around 20 h−1 Mpc, but relative differences to the case with no correction can reach up to 40% at these scales (middle panel). This is similar to what we reported for the effect of shot noise in the Covmos simulations presented in Fig. 11. The shot noise has the smallest contributions for the configuration (ρ*, p) = (1.0, −1.0), where galaxies in high-density regions are upweighted, as compared to the configurations with positive p, for which galaxies in low-density regions are upweighted. In general, the effect of shot noise as shown in Fig. 12 is expected to be even stronger on the smallest scales when an NGP MAS is used. In the studies of Alam et al. (2021) and Hernández-Aguayo et al. (2018), relative differences between MG and GR are particularly pronounced on scales below 20 h−1 Mpc, where we see that shot noise has a significant effect. However, our findings do not nullify those claimed differences, as they might still be present after correcting both GR and MG for shot noise. Rather, the amplitude of the marked correlation function itself will be different when correcting for shot noise, which is particularly important for the modelling of marked statistics as done, e.g., in Aviles et al. (2020) and Philcox et al. (2020, 2021). Indeed, as can be seen in the lowest panel of Fig. 12, the relative differences are largely unaffected by the correction for shot noise. Nevertheless, caution is advised, as this does not have to be universally valid for every mark. Furthermore, as we have shown in the previous paragraph on shot-noise effects in the Covmos catalogues (see Fig. 11), if an NGP MAS is used, the larger contributions at small scales might impact relative differences more strongly on those scales.
Fig. 12. Shot noise in the White mark for the ELEPHANT suite. The different configurations of parameters are colour coded and the solid line refers to the corrected case while the dashed line refers to the uncorrected case. The upper panel displays the marked correlation function both corrected and uncorrected. The middle panel shows the relative difference between the corrected and the uncorrected marked correlation function in percent. The MG model is fixed to F4 in the upper and the middle panel. In the lower panel we show the relative differences for the different configurations between the F4 model and GR in percent. We used 60 grid cells per dimension and the vertical dashed line in grey refers to the side length of one grid cell.
7. Results
7.1. Performance of marks based on the large-scale environment
It has already been argued in Sect. 5 that, even though we do not have a method for correcting the bias introduced by shot noise in this case, it is instructive to assess the performance of marks based on the environmental classification and on the tidal field/torque in distinguishing GR from MG.
In Fig. 13 we present the S/N, as defined in Eq. (37), for marks based on the environmental classification introduced in Sect. 5. While marks such as the Void and Wall correlation functions only exhibit significant differences on small scales, below 10 h−1 Mpc and 40 h−1 Mpc, respectively, the marks introducing anti-correlation produce significant differences up to scales of around 80 h−1 Mpc. Particularly the weaker modifications of gravity, such as F5 and F6, appear to profit from anti-correlation, as the VoidAC mark has an S/N of 3 up to around 40 h−1 Mpc and the WallAC mark shows a similar behaviour for F5. Furthermore, N1 can be well distinguished with the WallAC mark up to scales of around 60 h−1 Mpc. The VoidLEM mark performs well on the F4 simulations up to scales of around ∼60 h−1 Mpc, but the S/N is only around 3 from 30 h−1 Mpc onwards. N1 and F5 show an S/N of around 3 or higher only for scales up to around 20 h−1 Mpc.
Fig. 13. S/N measured in the ELEPHANT suite for the marked correlation functions using marks based on the large-scale environment as introduced in Sect. 5. Colours refer to MG simulations and the panels show different marks as indicated on the labels. The horizontal dashed lines in black indicate a S/N of ±3.
The situation is different for marks using the tidal field/torque, as depicted in Fig. 14. Using the tidal field as is does not seem to yield any significant difference for the investigated MG theories. Interestingly, by taking just a linear function of the tidal torque, we can report significant differences for F6 up to scales of ∼60 h−1 Mpc. As the tidal torque is small for symmetric large-scale environments, taking it as a mark essentially upweights filaments and walls. Since many galaxies are located in walls, this appears to compensate for the fact that MG effects are expected to be more strongly screened in walls than in voids, and it leads to significant differences with respect to GR when used as a mark.
Although shot-noise effects are expected to decrease at larger scales and might not strongly affect differences between two measurements, there is no guarantee that the significant differences seen in Figs. 13 and 14 would still be present after a correction for shot noise. In principle, we could assess the overall impact of shot noise on these marks by looking at differences in the corresponding measurements for the Covmos catalogues, as we did in Fig. 11 for the White mark. However, due to the peculiar way the Covmos catalogues have been set up, we do not expect them to represent large-scale structure environments in a realistic way. Furthermore, qualitative differences are not enough to precisely assess how the S/N will change after a proper correction. We relegate, therefore, a thorough investigation of shot-noise effects for these marks to future work, for which high-resolution full N-body simulations of MG are necessary.
7.2. Performance of the White mark
In Fig. 15, we present the performance of the White mark, corrected for shot noise, to be compared with the other marks to follow. We used 64 grid cells per dimension and a PCS MAS to obtain the density field on the grid, and the parameters were fixed to (ρ*, p) = (10−6, 1.0). As described in Sect. 6, we use third- and second-order polynomials to fit the shot-noise dependency of wf(r) and m̄f, respectively. Overall, the amplitude of the marked correlation function does not differ much from unity, implying a minor impact of the mark. Although we can see differences from unity up to scales of around 70 h−1 Mpc, the signal is very similar between GR and MG on most scales, except below 20 h−1 Mpc. This can also be deduced from the lower plot, where we show the S/N, directly quantifying the difference between GR and MG. The S/N lies inside the 3σ region, with only occasional peaks outside that range, which can be attributed to sample variance. Only for F4 can we report significant differences, for the first four bins in r, ranging up to ∼20 h−1 Mpc. The S/N for the case (ρ*, p) = (4.0, 10.0), although not shown here, exhibits a similar overall structure. This makes the White mark with these configurations in real space not particularly powerful in distinguishing MG from GR, as possible differences only show up on very small scales.
Fig. 15. Performance of the White mark for fixed parameters (ρ*, p) = (10−6, 1.0). The upper panel displays the mean marked correlation function taken over five realisations, together with errorbars estimated as the mean standard deviation. Colours refer to the different gravity simulations. The lower panel shows the S/N, as introduced in Eq. (37), and the blue shaded region refers to the error of a single measurement divided by the error of the mean difference (see Eq. (38)) for F4.
In Fig. 16 we present the monopole and quadrupole of the White mark in redshift space, with the same parameter configuration as before, namely (ρ*, p) = (10−6, 1.0). The left panel depicts the monopole, exhibiting a very similar amplitude and shape as the real-space measurements in Fig. 15. The monopole converges to unity at larger scales, which is expected since, even in redshift space, the marks should become uncorrelated at large scales, leading to ℳ(s, μ) = 1. The S/N in the monopole shows no significant difference over all scales, except for F4 below 20 h−1 Mpc, as we have also seen in real space. The quadrupole (right panel) has a much weaker signal compared to the monopole and converges to zero at large scales. This can be explained with the same reasoning as for the convergence of the monopole to one: ℳ(s, μ) is independent of μ on large scales and therefore does not possess higher-order multipoles. Even though the amplitude is very small, we nevertheless see interesting differences for N1 in the S/N on moderate to large scales. However, the S/N is barely above 3 and the bin-to-bin variance is fairly high. Lastly, we did not find significant differences for the configuration (ρ*, p) = (4.0, 10.0) over extended scales. Similar to what we have seen in real space, the White mark in these configurations is overall not very promising in redshift space.
Fig. 16. Redshift-space monopole (left side) and quadrupole (right side) of the marked correlation function for the White mark with parameters fixed to (ρ*, p) = (10−6, 1.0), corrected for shot noise. Upper panels show the multipoles themselves, with colours referring to the different gravity simulations. Displayed is the mean over five realisations, with the respective mean standard deviations as error bars. For the monopole the horizontal dashed line in black marks an amplitude of 1, while for the quadrupole it marks an amplitude of 0. Lower panels show the S/N, as in Eq. (37), with corresponding colours for the different MG models. Shaded regions refer to the error of a single measurement divided by the mean error of the difference, as introduced in Eq. (38). Dashed black lines in the lower panels indicate an S/N of ±3.
7.3. Performance of the tanh-mark
In Fig. 17, we present the marked correlation function before and after shot-noise correction, in the left and right panels, respectively, for the tanh-mark as introduced in Sect. 5.2. The configuration is set to (a, b) = (0.6, −0.5), for which we showed in Sect. 6 that our correction algorithm can be safely applied. We use polynomials of order three and two for correcting wf(r) and m̄f, respectively. The marked correlation function is largely featureless and converges to 1 on large scales, regardless of whether a correction for shot noise is applied or not. This convergence is also present for the White mark in Fig. 15 and is due to the mark becoming uncorrelated at large scales. It is striking how the amplitudes differ on smaller scales between the uncorrected (left panel) and corrected (right panel) cases, underlining again the importance of applying a shot-noise correction in order to measure unbiased amplitudes. In the lower panels we show the S/N, as defined in Eq. (37). The general trend of the S/N for the different MG models appears to be similar regardless of whether a shot-noise correction is applied or not. However, F4, for example, shows a much larger significance on small scales in the case of no correction. Most importantly, the tanh-mark in this configuration leads to significant differences for F6 in the corrected case, up to scales of around 80 h−1 Mpc. It is worth noticing that significant differences are also found for F4 and N5, but only on scales smaller than roughly 30 and 40 h−1 Mpc, respectively. While differences on these scales might be sufficient to be captured appropriately by theoretical models, this has yet to be tested, as modelling becomes increasingly challenging at small scales. In Sect. 8 we discuss our results on the tanh-mark in the light of current constraints in the literature on the fR0 parameter.
Fig. 17. Marked correlation function for the tanh-mark with (a, b) = (0.6, −0.5). The left panels depict the case where we do not apply any correction for shot noise, while in the right panels we apply our shot-noise correction methodology as described in Sect. 6. Upper panels show the marked correlation functions, with colours encoding the different MG models. Displayed is the mean over five realisations and errorbars are obtained by taking the mean standard deviation. The lower panels display the S/N, where the black dashed lines indicate an S/N of ±3. Shaded areas mark the error of a single measurement divided by the mean error of the difference. The black dashed line in the upper panels indicates an amplitude of 1.
Since our S/N as defined in Eq. (37) computes the error over five realisations, any significant differences can only be claimed for the volume of five realisations. It is therefore instructive to elaborate on how the difference compares to the error of a single measurement, as indicated by the shaded area in the plot. It appears that, particularly at larger scales, the difference is of the same size as the error itself, rendering a detection at the current volume of around 1 h−3 Gpc3 with only one measurement at hand impossible. However, this could be alleviated by considering simulations in larger volumes. Assuming that the error of a single measurement is Gaussian, it scales as 1/√V, where V is the volume of the survey/simulation. An increase in volume by a factor of 9 would be necessary for F6 in order to detect the difference with just one measurement at scales between 60 h−1 Mpc and 80 h−1 Mpc. This increase in volume translates into a box side length larger by a factor of around 2.1. On smaller scales, below around 40 h−1 Mpc, the reported differences would be significant even with only a single measurement.
To better compare our results with the literature, where often only relative differences between GR and MG are reported (Hernández-Aguayo et al. 2018; Armijo et al. 2018; Alam et al. 2021), we show a corresponding plot in Fig. 18. Relative differences beyond 5% are reached for F4 only on smaller scales, below around 20 h−1 Mpc, while N5 shows larger relative differences all the way up to around 50 h−1 Mpc. This is topped by F6, which exhibits relative differences above 15% over almost all scales, decreasing only above ∼80 h−1 Mpc. However, the relative error for the GR simulation increases towards those scales, rendering a detection with only one realisation at those large scales unfeasible. As shown in Fig. 17, more volume is needed to shrink the uncertainties to a level at which the large reported differences between GR and F6 at large scales can be taken advantage of. These results can be compared with Fig. 5 in the work of Hernández-Aguayo et al. (2018) and Fig. 15 in Alam et al. (2021), where marks based on the Newtonian gravitational potential were used and appear to be the most performant in terms of relative differences between GR and MG. While those marks certainly perform very well for F4, and to some extent for F5, we can report much larger relative differences for F6, reaching up to larger scales, if a tanh-mark in the configuration (a, b) = (0.6, −0.5) is used. Furthermore, our mark is very easy to compute and does not need information from halos, as is the case if the Newtonian potential is computed in the way defined in the mentioned studies.
Fig. 18. Relative differences between GR and MG for the tanh-mark with parameters fixed to (a, b) = (0.6, −0.5). We show the mean relative difference over five realisations and the shaded area corresponds to the relative standard deviation (relative error of single measurement) for the GR realisations. Grey dashed lines indicate relative differences of ±5%.
Finally, in Fig. 19 we present the shot-noise-corrected monopole and quadrupole of the marked correlation function in redshift space for the tanh-mark with parameters fixed to (a, b) = (0.6, −0.5). In general, compared to the White mark in Fig. 16, the amplitude of both the monopole and the quadrupole is much larger, but the large-scale behaviour is very similar. Looking at the monopole in the left panel, the similarities with the real-space measurements in Fig. 17 are striking, both in the shape of the measurements and in the S/N. However, N5 has a reduced S/N in redshift space while, conversely, N1 now shows significant differences on scales below 40 h−1 Mpc. F6 and F4 look largely the same as in real space; most importantly, the former still shows significant differences up to around 80 h−1 Mpc. The shaded region increases in size for F6, making an even larger volume necessary to detect the difference with a single observation only. The amplitude of the quadrupole in the upper right panel is smaller, and the S/N in the lower right panel stays within 3σ, rendering the quadrupole unsuitable as a statistic to detect MG with this mark.
Fig. 19. Redshift-space monopole (left side) and quadrupole (right side) of the marked correlation function for the tanh-mark with parameters fixed to (a, b) = (0.6, −0.5), corrected for shot noise. Upper panels show the multipoles themselves, with colours referring to the different gravity simulations. Displayed is the mean over five realisations, with the respective mean standard deviations as errorbars. For the monopole the horizontal dashed line in black indicates an amplitude of 1, and for the quadrupole an amplitude of 0. Lower panels show the S/N, as in Eq. (37), with corresponding colours for the different MG models. Shaded regions refer to the error of a single measurement divided by the mean error of the difference, and dashed lines in the lower panels indicate an S/N of ±3.
8. Discussion
We have found the tanh-mark to be particularly promising for distinguishing MG from GR. Of course, studying possible differences between f(R) theories and GR has to be done in the light of existing constraints on fR0. A somewhat older compilation of constraints can be found in Table 1 of Lombriser (2014), where the strongest limits come from dwarf galaxies and the solar system, imposing |fR0| ≤ 10−7 − 10−6. In a more recent analysis, Liu et al. (2021) found similar limits using Fisher forecasts on cluster abundances and galaxy clustering. Even tighter constraints, fR0 < 1.4 × 10−8, were found in the work of Desmond & Ferreira (2020) using galaxy morphology. Although F6 might therefore not be a viable MG theory after all and may have to be replaced with weaker modifications such as F7 or F8, finding significant differences for F6 makes the tanh-mark promising for distinguishing these weaker models as well.
In Sect. 6, we presented a robust technique to correct for shot-noise effects for general marks whose mark function can be expressed as a Taylor expansion in powers of the density contrast. Since the error on the measurements plays a crucial role in our performance metric in Eq. (37), it is instructive to discuss how the relative error of the measurements is impacted when the shot-noise correction is applied. In particular, due to the approximations made in estimating the shot-noise-corrected signal, additional uncertainties might be introduced. In Fig. 20 we present the relative error for the undepleted case and the corrected case, both for the tanh-mark with (a, b) = (0.6, −0.5) and for the toy model. Since we can capture the shot-noise behaviour in the toy model very accurately, the relative error stays almost the same and does not significantly change. However, in the case of the tanh-mark, where the correction is only approximate, the relative error does increase, to around 1% on scales larger than 25 h−1 Mpc. At very large scales, however, the shot-noise correction does not greatly impact the uncertainty, since the overall contribution of shot noise on those scales is marginal. It is evident from the figure that most of the uncertainty from the correction is introduced on scales below 25 h−1 Mpc. Here, we are within the smoothing radii and shot noise is expected to be the strongest. It is important to note that this does not capture the effect on the relative error between not applying a correction at all and applying one. We can only conclude that, while we might be able to accurately recover the true signal, the step of applying the correction via the fitting introduces additional uncertainties that increase towards smaller scales. This uncertainty is expected to be smaller the higher N̄ is, since the fitting process should then be less prone to uncertainties. Intuitively, this means that the points that we need to fit, as depicted in Figs. 4 and 6, are distributed closer to 1/N̄ = 0, and hence the extrapolation to the y-axis is more robust.
Fig. 20. Relative error of the weighted correlation function in Covmos. The mark is set to the tanh-mark with parameters fixed to (a, b) = (0.6, −0.5) for the solid line and to the toy model for the dashed line. Green colours refer to the correction using 30 depletion realisations and blue colours indicate the undepleted case. The relative error is computed as the mean standard deviation over five realisations divided by the mean.
We have briefly touched upon the impact of the MAS kernel on our methodology in Sect. 6.5 with Fig. 11, and it is crucial to assess this in more detail. One can show that the shot-noise correction is smaller the higher the order of the MAS, which can be understood as a larger smoothing scale, since higher-order MAS kernels are more extended in configuration space. In Figs. 21 and 22, we present an illustration of how the smoothing scale as well as the shape of the MAS affect the behaviour of shot noise, in particular the amplitudes of the different powers of 1/N̄. We show the result of the standard fitting procedure for obtaining the shot-noise-corrected signal; the fits do include the undepleted catalogue, which is not accessible in real data. This figure illustrates the contributions from the different powers of 1/N̄ in the shot-noise polynomial. Including the undepleted case enables us to obtain an accurate estimate of the shot-noise behaviour and to identify any breakdown at a given polynomial order. As can already be seen from the fits in Fig. 21, once an NGP MAS is employed, low polynomial orders no longer seem sufficient to describe the data. This is further underlined by Fig. 22, showing an increase in the amplitude of the polynomial coefficients when lowering the MAS order. This means that the larger the smoothing scale, the smaller the contributions from higher-order shot-noise expressions, and the better we can fit the dependency with a low-order polynomial. Intuitively this makes sense if we hypothetically increase the smoothing scale to infinity. In that case, the obtained density field would be constant in space and therefore all galaxies would have the same weight. In that scenario, the weighted correlation function reduces to the unweighted correlation function, which is only affected by shot noise at zero lag. Hence, our measurements would carry no contamination and the polynomial fits would just be constant. Such a trend can also be seen to some extent in Fig. 21, where the curves become more and more linear and converge to a vertical line the higher the order of the MAS.
Fig. 21. Impact of the smoothing scale on the polynomial behaviour of the shot noise. The mark is fixed to the White mark with (ρ*, p) = (4.0, 10.0). Left and right panels show the fits of wf and m̄f, respectively.
Using two different types of catalogues for gauging the shot-noise correction has its limitations, which we discuss in the following. The Covmos catalogues were absolutely necessary in the first place in order to have an almost noise-free signal to test the method. An inherent difference, which has to be kept in mind when interpreting our findings, is the fact that the Covmos realisations are not simply an upscaled version of the ELEPHANT suite, but rather a completely different set of catalogues. First and foremost, the Covmos catalogues consist of dark matter particles instead of galaxies; second, the redshift and assumed cosmology are different compared to ELEPHANT. Most importantly, the Covmos catalogues are not extracted from full N-body simulations. This means that the features and the general shape of the weighted correlation functions look different between the Covmos and ELEPHANT simulations. Nevertheless, the functional form of the Taylor expansion of the mark stays the same, and having access to realisations with very large number densities of points served the purpose of validating the method for estimating the corrected signal. Care has to be taken, however, in the choice of the mark, to comply with the convergence criterion of the Taylor expansion as described in Sect. 6.4, which is universally valid for both sets of catalogues. By construction, the ELEPHANT suite is limited by the small number of galaxies in the catalogues. This results in the shot-noise correction introducing larger uncertainties than it would in denser catalogues. Having only twice as many objects in the simulations would already halve the distance to extrapolate from the undepleted case to 1/N̄ = 0 on a linear scale. The ELEPHANT simulations are therefore a particularly difficult case, and we expect better results when the method is applied to state-of-the-art N-body simulations with higher densities. Nevertheless, as long as the mark is chosen appropriately, a robust recovery of the true signal is possible. For future analyses we recommend using simulations with higher resolution in order to mitigate the impact of shot noise.
In this work, we have studied numerical simulations with a cubic geometry, periodic boundary conditions, and a homogeneous sampling of matter tracers. In real observations, by contrast, galaxy surveys can have non-trivial geometries and be affected by incompleteness and selection effects, which makes such an analysis more challenging. In unweighted two-point statistics analyses, the estimation usually relies on the Landy–Szalay estimator (Landy & Szalay 1993), which makes explicit use of a random catalogue to incorporate the effects of a non-trivial survey geometry and of discreteness. These effects can also be accounted for in a marked correlation function analysis by using a similar estimator. The additional aspect that needs to be accounted for in weighted two-point statistics is the estimation of the weight or mark and, in turn, of the local density contrast. A random sample, since it follows the survey geometry, can help both to properly estimate the local density contrast free of border effects (see e.g. Satpathy et al. 2019, for an application of marked correlation functions to real data) and to incorporate the effect of the survey selection function. An accurate estimation of the density contrast is also necessary for a correct recovery of the tidal tensor for marks utilising the large-scale environment.
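To fix ideas, here is a deliberately naive, O(N²) sketch of one possible marked analogue of the Landy–Szalay estimator (a hypothetical illustration with brute-force pair counting; the estimator and normalisation conventions used on real data may differ):

```python
import numpy as np
from scipy.spatial.distance import cdist

def marked_landy_szalay(data, marks, randoms, bins):
    """Sketch of a marked Landy-Szalay-type estimator, brute force O(N^2).

    WW: data-data counts weighted by the product of marks; WR: data-random
    counts weighted by the data mark; RR: plain random-random counts.
    Self-pairs sit at zero separation and fall below the first bin edge,
    so bins[0] must be > 0.
    """
    ones = np.ones(len(randoms))

    def counts(pa, pb, wa, wb, auto):
        seps = cdist(pa, pb).ravel()
        prod = np.outer(wa, wb).ravel()
        hist = np.histogram(seps, bins=bins, weights=prod)[0]
        # Distinct-pair normalisation: (sum w)^2 - sum w^2 for auto counts.
        norm = wa.sum() * wb.sum() - ((wa * wb).sum() if auto else 0.0)
        return hist / norm

    ww = counts(data, data, marks, marks, auto=True)
    wr = counts(data, randoms, marks, ones, auto=False)
    rr = counts(randoms, randoms, ones, ones, auto=True)
    return (ww - 2.0 * wr + rr) / rr
```

Note that for marks taking negative values, the weighted normalisation (∑w)² − ∑w² can approach zero; this is another face of the instability of marks with near-zero mean discussed in the summary below.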
A crucial effect that can potentially lead to biased estimates of two-point statistics and density contrasts is the impurity and incompleteness of the sample: observed galaxy catalogues might not contain all targeted galaxies, and redshift estimation might not be perfect for all objects. These effects are usually corrected for in unweighted two-point statistics by using a set of single-object or pair weights that reproduce any known variation of completeness and purity in the survey (e.g. Ross et al. 2011, 2012; de la Torre et al. 2013). Such weights depend strongly on the observational strategy and characteristics of the survey under consideration. A similar strategy can in principle be used for the density contrast estimation, that is, applying completeness weights in the counts entering the mark estimation. In the work of Satpathy et al. (2019), weights were used that account for, among other effects, fibre collisions and angular systematics. For real applications, realistic simulations of the observed catalogues are indispensable to develop optimal mitigation strategies, and this applies to both unweighted and weighted two-point statistics.
Still, survey incompleteness can lead to an increased shot-noise contribution in the estimation of weighted correlation functions. Our proposed methodology remains agnostic to the exact form of the shot noise at play; it is thus useful for both super- and sub-Poissonian shot noise. Furthermore, since the shot noise in weighted correlation functions is already scale-dependent, the method should also be able to capture possible additional scale-dependent contributions, such as the halo-exclusion effect (Baldauf et al. 2013). We therefore expect our methodology to correct for shot noise to be directly applicable to real surveys.
We conclude the discussion with a brief summary of the main points, raised in this section and in Sect. 6, to be considered when applying our methodology to correct for shot noise in marked correlation functions:

- Sufficient realisations of the depletion should be used in order to obtain converged depleted point sets; in our setting, 30 depletions appear to be sufficient.
- The polynomial order should be chosen such that the shot-noise dependency is well modelled without overfitting.
- A higher-order MAS (e.g. PCS) is preferred, to reduce a possible shot-noise bias on small scales.
- For our shot-noise correction methodology to be applicable, a Taylor expansion of the mark function in powers of δRf has to exist.
- The convergence radius B of the Taylor series should be larger than the typical values taken by δRf in the catalogue, so that the assumption of a convergent Taylor expansion is satisfied.
- In addition to the criterion in the previous point, the fraction of points lying inside the convergence radius can be checked, which is more agnostic about the actual PDF of δRf in the catalogues (see the sketch after this list).
- A Taylor expansion with non-zero and decreasing coefficients for both odd and even powers of δRf is preferred, in order to allow for a robust estimation of the shot noise.
- We do not recommend marks whose mean mark is very close to zero after the shot-noise correction, because the measurements become very unstable and the uncertainties diverge.
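The fraction-of-points check mentioned above is straightforward to implement; a minimal sketch follows (the lognormal placeholder for δRf and the radius B = 1 are purely illustrative):

```python
import numpy as np

def fraction_inside_radius(delta_rf: np.ndarray, B: float) -> float:
    """Fraction of tracers whose smoothed density contrast delta_Rf lies
    strictly inside the convergence radius B of the mark's Taylor series."""
    return np.mean(np.abs(delta_rf) < B)

# Placeholder field: a shifted lognormal mimicking a positively skewed delta_Rf PDF.
rng = np.random.default_rng(0)
delta_rf = rng.lognormal(mean=0.0, sigma=0.7, size=100_000) - 1.0
print(f"fraction inside |delta| < B: {fraction_inside_radius(delta_rf, B=1.0):.3f}")
```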
9. Conclusions
In this work, we have studied marked correlation functions in the context of detecting MG, and how discreteness effects, which result from estimating marks on a finite point set, propagate into the measurement of marked correlation functions. We utilised the Covmos realisations (Baratta et al. 2023), whose particularly high density of points makes them the most suitable for this purpose, along with the ELEPHANT simulations with HOD galaxies (Alam et al. 2021), to investigate the discriminating power of marked correlation functions between MG and GR. The latter suite comprises several realisations of GR as well as of the f(R) and nDGP gravity theories. These are two particularly interesting modifications of GR to study with marked correlation functions, because they exhibit screening mechanisms that make the fifth force dependent on the environment. We proposed several marks based on the large-scale environment, using the T-Web formalism, as well as on the local density. This includes marks that create an anti-correlation between galaxies in different environments or between galaxies in low- and high-density regions.
For the first time, we have undertaken a thorough investigation of a possible bias due to shot noise in marked correlation functions. We used a toy model and showed that the effect of shot noise can be treated analytically by computing a small number of terms; in this case, we were able to recover the signal from the undepleted catalogue to sub-percent precision. For general marks, under the assumption that the mark function can be Taylor-expanded in powers of the density contrast, we showed that a fully analytic treatment is hopeless, as it requires computing an infinity of higher-order correlators. Instead, we developed a methodology for estimating the shot-noise-corrected signal from measurements of the weighted correlation function and mean mark in catalogues depleted to different densities. This is possible by noting that the shot-noise contributions to wf(r) and to the mean mark resum as a power series in the reciprocal of the mean number of points per grid cell, 1/N̄c. By applying our method to the tanh-mark in the Covmos realisations, we were able to recover an unbiased signal of 1 + Wf(r) to within 5% accuracy on all tested scales. We then extended the formalism to redshift space, where we found the same method to be applicable. Furthermore, we derived a measurable criterion, based on the work of Philcox et al. (2020), to assess the validity of assuming a Taylor expansion of the mark, and we provided guidelines for the application of our methodology for shot-noise correction. We found effects of shot noise mostly on scales below 20–30 h−1 Mpc when using the White mark, which might be important for the modelling of marked correlation functions, although the impact on the relative difference between GR and MG appears to be mild. Moreover, we found the NGP MAS to give biased results, owing to non-negligible higher-order terms in the 1/N̄c series. This makes the NGP MAS a sub-optimal choice compared to higher-order schemes, such as PCS.
Equipped with a robust method to recover the true signal in measured marked correlation functions, we tested the performance of the previously proposed marks on the ELEPHANT simulations. Concerning marks based on the local density, we did not find the White mark to be particularly powerful on large scales; only on the smallest scales, below 20 h−1 Mpc, did we report significant differences. In redshift space, the situation changes slightly for the N1 model, for which we found differences in the quadrupole at s > 20 h−1 Mpc. In contrast, we found the novel tanh-mark that we introduced to be very effective: it yields significant differences between f(R) gravity with log(|fR0|) = −6 and GR up to scales of 80 h−1 Mpc. These differences also propagate into the monopole of the marked correlation function in redshift space. At those scales, modelling the weighted correlation function is more tractable, making this mark an excellent candidate to test for deviations from GR in real surveys. Furthermore, we found promising results when using the tidal torque as a mark, with significant differences up to scales of r ≈ 60 h−1 Mpc. The use of anti-correlation together with large-scale environments, as in the VoidAC- and WallAC-marks, exhibits significant discriminating power for several MG theories on moderate scales. However, these findings have to be tested with high-density simulations to assess whether they are biased by discreteness effects, as our correction method cannot be applied straightforwardly to these types of marks.
In summary, this work demonstrates that correcting for shot noise in marked correlation functions is of paramount importance in order to measure unbiased amplitudes and, in turn, to distinguish MG from GR. This is also particularly important for the modelling of the weighted correlation function, in the same way as it is for the power spectrum. Generally, we found shot noise to have the strongest impact on small to intermediate scales. Marks that incorporate an anti-correlation between objects in high- and low-density regions, by switching signs in the mark, are found to be the most effective for distinguishing GR from MG, including on scales beyond 20 h−1 Mpc. In the future, extending the concept of anti-correlation in weighted correlation functions to different marks could alleviate the current limitation set by the convergence radius of the Taylor expansion, as is the case for the tanh-mark. In general, a consolidation of the tanh-mark performance on improved MG and GR simulations with higher densities would be desirable. Moreover, a thorough investigation of model-independent shot-noise effects on general marked correlation functions would enable an appropriate correction for marks based on the large-scale environment or on the tidal torque or field, which we showed to be interesting candidates. A future study should assess the capability of the Lagrangian perturbation theory model to capture the behaviour of our novel mark based on the local density on intermediate scales. The tanh-mark could then serve as an optimal choice for a weighted clustering analysis in current and future galaxy surveys, since no accurate modelling of small scales is required. Having access to a working model of marked correlation functions, together with a high-performance mark, should add a powerful observable to help find the needle in the haystack of gravity theories.
Acknowledgments
This work received support from the French government under the France 2030 investment plan, as part of the Excellence Initiative of Aix Marseille University – amidex (AMX-19-IET-008 – IPhU). This research made use of matplotlib, a Python library for publication quality graphics (Hunter 2007).
References
- Alam, S., Albareti, F. D., Allende Prieto, C., et al. 2015, ApJS, 219, 12
- Alam, S., Zu, Y., Peacock, J. A., & Mandelbaum, R. 2019, MNRAS, 483, 4501
- Alam, S., Arnold, C., Aviles, A., et al. 2021, JCAP, 2021, 050
- Armijo, J., Cai, Y.-C., Padilla, N., Li, B., & Peacock, J. A. 2018, MNRAS, 478, 3627
- Armijo, J., Baugh, C. M., Norberg, P., & Padilla, N. D. 2024a, MNRAS, 529, 2866
- Armijo, J., Baugh, C. M., Norberg, P., & Padilla, N. D. 2024b, MNRAS, 528, 6631
- Aubert, M., Cousinou, M.-C., Escoffier, S., et al. 2022, MNRAS, 513, 186
- Aviles, A., Koyama, K., Cervantes-Cota, J. L., Winther, H. A., & Li, B. 2020, JCAP, 2020, 006
- Baldauf, T., Seljak, U., Smith, R. E., Hamaus, N., & Desjacques, V. 2013, Phys. Rev. D, 88, 083507
- Baratta, P., Bel, J., Plaszczynski, S., & Ealet, A. 2020, A&A, 633, A26
- Baratta, P., Bel, J., Gouyou Beauchamps, S., & Carbone, C. 2023, A&A, 673, A1
- Barreira, A., Bose, S., & Li, B. 2015, JCAP, 2015, 059
- Battye, R. A., Bolliet, B., & Pace, F. 2018, Phys. Rev. D, 97, 104070
- Bautista, J. E., Paviot, R., Vargas Magaña, M., et al. 2021, MNRAS, 500, 736
- Behroozi, P. S., Wechsler, R. H., & Wu, H.-Y. 2013, ApJ, 762, 109
- Beisbart, C., & Kerscher, M. 2000, ApJ, 545, 6
- Bertotti, B., Iess, L., & Tortora, P. 2003, Nature, 425, 374
- Beutler, F., Blake, C., Colless, M., et al. 2012, MNRAS, 423, 3430
- Blake, C., Brough, S., Colless, M., et al. 2011, MNRAS, 415, 2876
- Blake, C., Amon, A., Asgari, M., et al. 2020, A&A, 642, A158
- Bonnaire, T., Aghanim, N., Kuruvilla, J., & Decelle, A. 2022, A&A, 661, A146
- Bose, S., Hellwing, W. A., & Li, B. 2015, JCAP, 2015, 034
- Brax, P., Davis, A.-C., & Elder, B. 2022, Phys. Rev. D, 106, 044040
- Castorina, E., Carbone, C., Bel, J., Sefusatti, E., & Dolag, K. 2015, JCAP, 2015, 043
- Cautun, M., van de Weygaert, R., & Jones, B. J. T. 2013, MNRAS, 429, 1286
- Cautun, M., van de Weygaert, R., Jones, B. J. T., & Frenk, C. S. 2014, MNRAS, 441, 2923
- Chan, K. C., & Blot, L. 2017, Phys. Rev. D, 96, 023528
- Chan, K. C., Scoccimarro, R., & Sheth, R. K. 2012, Phys. Rev. D, 85, 083509
- Chaniotis, A. K., & Poulikakos, D. 2004, J. Comput. Phys., 197, 253
- Clifton, T., Ferreira, P. G., Padilla, A., & Skordis, C. 2012, Phys. Rep., 513, 1
- Cognola, G., Elizalde, E., Nojiri, S., et al. 2008, Phys. Rev. D, 77, 046009
- Damour, T., & Polyakov, A. M. 1994, Nucl. Phys. B, 423, 532
- De Felice, A., & Tsujikawa, S. 2010, Liv. Rev. Rel., 13, 3
- de la Torre, S., Guzzo, L., Peacock, J. A., et al. 2013, A&A, 557, A54
- de la Torre, S., Jullo, E., Giocoli, C., et al. 2017, A&A, 608, A44
- DESI Collaboration (Aghamousa, A., et al.) 2016, arXiv e-prints [arXiv:1611.00036]
- Desmond, H., & Ferreira, P. G. 2020, Phys. Rev. D, 102, 104060
- Dvali, G., Gabadadze, G., & Porrati, M. 2000, Phys. Lett. B, 485, 208
- Euclid Collaboration (Mellier, Y., et al.) 2025, A&A, in press, https://doi.org/10.1051/0004-6361/202450810
- Falck, B. L., Neyrinck, M. C., & Szalay, A. S. 2012, ApJ, 754, 126
- Forero-Romero, J. E., Hoffman, Y., Gottlöber, S., Klypin, A., & Yepes, G. 2009, MNRAS, 396, 1815
- Guth, A. H. 1981, Phys. Rev. D, 23, 347
- Guzzo, L., Pierleoni, M., Meneux, B., et al. 2008, Nature, 451, 541
- Hamaus, N., Aubert, M., Pisani, A., et al. 2022, A&A, 658, A20
- Heavens, A., & Peacock, J. 1988, MNRAS, 232, 339
- Hernández-Aguayo, C., Baugh, C. M., & Li, B. 2018, MNRAS, 479, 4824
- Hinshaw, G., Larson, D., Komatsu, E., et al. 2013, ApJS, 208, 19
- Hinterbichler, K., & Khoury, J. 2010, Phys. Rev. Lett., 104, 231301
- Hu, W., & Sawicki, I. 2007, Phys. Rev. D, 76, 064004
- Hunter, J. D. 2007, Comput. Sci. Eng., 9, 90
- Ishak, M. 2019, Liv. Rev. Rel., 22, 1
- Jullo, E., de la Torre, S., Cousinou, M. C., et al. 2019, A&A, 627, A137
- Khoury, J., & Weltman, A. 2004a, Phys. Rev. D, 69, 044026
- Khoury, J., & Weltman, A. 2004b, Phys. Rev. Lett., 93, 171104
- Koyama, K., & Silva, F. P. 2007, Phys. Rev. D, 75, 084040
- Landy, S. D., & Szalay, A. S. 1993, ApJ, 412, 64
- Layzer, D. 1956, AJ, 61, 383
- Libeskind, N. I., van de Weygaert, R., Cautun, M., et al. 2018, MNRAS, 473, 1195
- Liu, R., Valogiannis, G., Battaglia, N., & Bean, R. 2021, Phys. Rev. D, 104, 103519
- Llinares, C., & McCullagh, N. 2017, MNRAS, 472, L80
- Lombriser, L. 2014, Ann. Phys., 264, 259
- Lombriser, L., Hu, W., Fang, W., & Seljak, U. 2009, Phys. Rev. D, 80, 063536
- Lombriser, L., Simpson, F., & Mead, A. 2015, Phys. Rev. Lett., 114, 251101
- Manera, M., Scoccimarro, R., Percival, W. J., et al. 2013, MNRAS, 428, 1036
- Martin, J. 2012, C. R. Phys., 13, 566
- Massara, E., Villaescusa-Navarro, F., Ho, S., Dalal, N., & Spergel, D. N. 2021, Phys. Rev. Lett., 126, 011301
- Neyrinck, M. C. 2008, MNRAS, 386, 2101
- Nicolis, A., Rattazzi, R., & Trincherini, E. 2009, Phys. Rev. D, 79, 064036
- Paillas, E., Cai, Y.-C., Padilla, N., & Sánchez, A. G. 2021, MNRAS, 505, 5731
- Peebles, P. J. E., & Hauser, M. G. 1974, ApJS, 28, 19
- Perlmutter, S., Aldering, G., Goldhaber, G., et al. 1999, ApJ, 517, 565
- Philcox, O. H. E., Massara, E., & Spergel, D. N. 2020, Phys. Rev. D, 102, 043516
- Philcox, O. H. E., Aviles, A., & Massara, E. 2021, JCAP, 2021, 038
- Planck Collaboration XIV. 2016, A&A, 594, A14
- Planck Collaboration VI. 2020, A&A, 641, A6
- Reyes, R., Mandelbaum, R., Seljak, U., et al. 2010, Nature, 464, 256
- Riess, A. G., Filippenko, A. V., Challis, P., et al. 1998, AJ, 116, 1009
- Riess, A. G., Yuan, W., Macri, L. M., et al. 2022, ApJ, 934, L7
- Ross, A. J., Ho, S., Cuesta, A. J., et al. 2011, MNRAS, 417, 1350
- Ross, A. J., Percival, W. J., Sánchez, A. G., et al. 2012, MNRAS, 424, 564
- Satpathy, S., Croft, R. A. C., Ho, S., & Li, B. 2019, MNRAS, 484, 2148
- Schaap, W. E., & van de Weygaert, R. 2000, A&A, 363, L29
- Schmidt, F. 2009, Phys. Rev. D, 80, 123003
- Sefusatti, E., Crocce, M., Scoccimarro, R., & Couchman, H. M. P. 2016, MNRAS, 460, 3624
- Sheth, R. K. 2005, MNRAS, 364, 796
- Simpson, F., James, J. B., Heavens, A. F., & Heymans, C. 2011, Phys. Rev. Lett., 107, 271301
- Simpson, F., Heavens, A. F., & Heymans, C. 2013, Phys. Rev. D, 88, 083510
- Sinha, M., & Garrison, L. 2019, in Software Challenges to Exascale Computing, eds. A. Majumdar, & R. Arora (Singapore: Springer Singapore), 3
- Sinha, M., & Garrison, L. H. 2020, MNRAS, 491, 3022
- Sotiriou, T. P., & Faraoni, V. 2010, Rev. Mod. Phys., 82, 451
- Sousbie, T. 2011, MNRAS, 414, 350
- Tröster, T., Sánchez, A. G., Asgari, M., et al. 2020, A&A, 633, L10
- Tsujikawa, S. 2010, in Lecture Notes in Physics, ed. G. Wolschin (Berlin: Springer Verlag), 800, 99
- Vainshtein, A. I. 1972, Phys. Lett. B, 39, 393
- Valogiannis, G., & Bean, R. 2018, Phys. Rev. D, 97, 023535
- Villaescusa-Navarro, F., Hahn, C., Massara, E., et al. 2020, ApJS, 250, 2
- White, M. 2016, JCAP, 2016, 057
- White, M., & Padmanabhan, N. 2009, MNRAS, 395, 2381
- Williams, J. G., Turyshev, S. G., & Boggs, D. H. 2004, Phys. Rev. Lett., 93, 261101
- Williams, J. G., Turyshev, S. G., & Boggs, D. H. 2012, CQG, 29, 184004
- Xiao, X., Yang, Y., Luo, X., et al. 2022, MNRAS, 513, 595
- Yang, Y., Miao, H., Ma, Q., et al. 2020, ApJ, 900, 6
- Zel'dovich, Y. B. 1970, A&A, 5, 84
- Zhang, P., Liguori, M., Bean, R., & Dodelson, S. 2007, Phys. Rev. Lett., 99, 141302
- Zheng, Z., Coil, A. L., & Zehavi, I. 2007, ApJ, 667, 760
Appendix A: Marks and cross-correlation
We assume a population of N galaxies that can be split into a 1-population and a 2-population with respective numbers N1 and N2, such that N = N1 + N2. This split could, for example, be performed using the VoidAC mark, assigning a mark of −1 to all galaxies residing in voids and +1 otherwise. We assume boxes with periodic boundary conditions, as in the main text, and RRn denotes the normalised RR counts, RRn = RR/(NR(NR − 1)). We then define the correlation functions of each population of galaxies, as well as the cross-correlation between the two sub-populations, to be

1 + ξ11(r) = D1D1(r) / [N1(N1 − 1) RRn(r)],  1 + ξ22(r) = D2D2(r) / [N2(N2 − 1) RRn(r)],  1 + ξ12(r) = D1D2(r) / [N1N2 RRn(r)].
We note that when we cross-correlate, pairs are not double counted; hence the normalisation is by the total number of possible pairs, N1N2, only. Terms of the form DiDj refer to unnormalised counts of pairs between the i- and j-populations.
The total, double-counted pair counts can be split into contributions such that

DD(r) = D1D1(r) + D2D2(r) + 2 D1D2(r),

where the factor of 2 is necessary because the cross-counts are not double counted on their own. This allows for the following split of the total correlation function,

1 + ξ(r) = f11 [1 + ξ11(r)] + f22 [1 + ξ22(r)] + 2 f12 [1 + ξ12(r)],

where we defined

f11 = N1(N1 − 1) / [N(N − 1)],  f22 = N2(N2 − 1) / [N(N − 1)],  f12 = N1N2 / [N(N − 1)].

By realising that f11 + f22 + 2 f12 = 1, we can finally write

ξ(r) = f11 ξ11(r) + f22 ξ22(r) + 2 f12 ξ12(r).
One can proceed completely analogously for weighted correlation functions in which the mark can take only two values, as for the VoidAC mark. We define the individual weighted correlation functions to be

1 + W11(r) = W1W1(r) / {[(∑1 wi)² − ∑1 wi²] RRn(r)},  1 + W22(r) = W2W2(r) / {[(∑2 wi)² − ∑2 wi²] RRn(r)},  1 + W12(r) = W1W2(r) / [∑1 wi ∑2 wi RRn(r)],

where we use the shorthand notation ∑1 and ∑2 to indicate sums over weights belonging solely to galaxies of population 1 or population 2, respectively, with ∑ = ∑1 + ∑2. The total sum WW over products of weights can be split into contributions from W1W1, W2W2, and W1W2 in a way analogous to the DD counts in the unweighted case. Defining the prefactors

g11 = [(∑1 wi)² − ∑1 wi²] / [(∑ wi)² − ∑ wi²],  g22 = [(∑2 wi)² − ∑2 wi²] / [(∑ wi)² − ∑ wi²],  g12 = ∑1 wi ∑2 wi / [(∑ wi)² − ∑ wi²],

and noting that (∑ wi)² − ∑ wi² = (∑1 wi)² − ∑1 wi² + (∑2 wi)² − ∑2 wi² + 2 ∑1 wi ∑2 wi, we arrive at

1 + W(r) = g11 [1 + W11(r)] + g22 [1 + W22(r)] + 2 g12 [1 + W12(r)].
This shows that, for specific marks, the weighted correlation function can be split into a sum of auto-correlations and a cross-correlation. The result can be generalised to marks taking, for example, three or more different values, in which case it includes contributions from all possible auto- and cross-correlations.
One particularly interesting case is the VoidAC mark, for which the two values the mark can take are simply −1 and +1. Let us assume that the 1-population carries the mark −1 and the 2-population the mark +1. First, we realise that in this case W1W1 = D1D1 and W2W2 = D2D2, because the sums of pair-product weights are simply sums of 1's. Furthermore, W1W2 = −D1D2, as the product of the two weights is always −1 for pairs in the cross-correlation. The normalisation also simplifies, yielding (∑1 wi)² − ∑1 wi² = N1(N1 − 1), and analogously for the 2-population; for the cross-correlation we get ∑1 wi ∑2 wi = −N1N2. Hence, the individual auto- and cross-correlations are the same in the weighted and unweighted cases: W11 = ξ11, W22 = ξ22, and W12 = ξ12. This implies that the total weighted correlation function has the same individual contributions as the unweighted one, but with different prefactors,

1 + W(r) = g11 [1 + ξ11(r)] + g22 [1 + ξ22(r)] + 2 g12 [1 + ξ12(r)],

since the prefactors simplify to

g11 = N1(N1 − 1) / [(N1 − N2)² − N],  g22 = N2(N2 − 1) / [(N1 − N2)² − N],  g12 = −N1N2 / [(N1 − N2)² − N].

If we define the constant 𝒞 = [(N1 − N2)² − N] / [N(N − 1)], then we can write the weighted correlation function for the VoidAC-mark as

1 + W(r) = (1/𝒞) {f11 [1 + ξ11(r)] + f22 [1 + ξ22(r)] − 2 f12 [1 + ξ12(r)]}.
This illustrates that, in this case, the marked correlation function is nothing other than a specific combination of unweighted auto- and cross-correlations.
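The pair-count split underlying this decomposition is easy to verify numerically. The following self-contained sketch (toy data in a unit box, no periodic boundaries) checks that the mark-weighted pair counts for a ±1 mark equal D1D1 + D2D2 − 2 D1D2:

```python
import numpy as np
from scipy.spatial.distance import cdist

rng = np.random.default_rng(1)
pos = rng.random((500, 3))                              # toy catalogue in a unit box
marks = np.where(rng.random(500) < 0.3, -1.0, 1.0)      # -1 for "void" galaxies, +1 otherwise
bins = np.linspace(0.05, 0.5, 10)                       # first edge > 0 excludes self-pairs

def counts(pa, pb, weights=None):
    """Raw pair-separation histogram; auto counts are double counted, cross counts are not."""
    return np.histogram(cdist(pa, pb).ravel(), bins=bins, weights=weights)[0]

ww = counts(pos, pos, np.outer(marks, marks).ravel())   # mark-weighted WW counts

in_void = marks < 0
d1d1 = counts(pos[in_void], pos[in_void])
d2d2 = counts(pos[~in_void], pos[~in_void])
d1d2 = counts(pos[in_void], pos[~in_void])

# WW = D1D1 + D2D2 - 2 D1D2 for a mark taking only the values -1 and +1.
assert np.allclose(ww, d1d1 + d2d2 - 2.0 * d1d2)
```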
Appendix B: Convolution of the density contrast
In this section, we show that an additional convolution of the already smoothed density contrast can be treated as a single convolution with a higher-order smoothing kernel. Let us start with

(δRf ∗ G)(x) = ∫ d³x′ G(x − x′) [∫ d³x″ F(x′ − x″) δ(x″)],

where we can identify the smoothed density contrast δRf inside the square brackets, F being the kernel of the first smoothing and G that of the second. This can be rewritten as an integral over x″ involving only the two kernels, with the coordinate transformation y = x − x″:

(δRf ∗ G)(x) = ∫ d³x″ δ(x″) ∫ d³x′ G(x − x′) F(x′ − x″) = ∫ d³x″ H(x − x″) δ(x″),

where we identify in the first equality that, after the coordinate change, the two kernels are convolved in the variable y, resulting in the new kernel H = G ∗ F evaluated at x − x″. Hence the density field is convolved only once, with the kernel H that is the convolution of G and F. Now, if the two kernels are a PCS and an NGP kernel, respectively, then the H-kernel is a quartic kernel.
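A quick 1D numerical check of this kernel-order arithmetic (an illustrative discretisation, not part of any analysis pipeline) builds the B-spline family by repeated self-convolution of the top-hat and confirms that NGP ∗ PCS is a quartic (B4) kernel with half-width 2.5 cells and unit integral:

```python
import numpy as np

# MAS kernels are B-splines: NGP = B0 (top hat), CIC = B1, TSC = B2, PCS = B3.
# Convolving two B-splines of orders m and n yields a B-spline of order m + n + 1,
# so NGP * PCS gives the quartic B4.
dx = 1e-3
x = np.arange(-3, 3, dx)

ngp = (np.abs(x) < 0.5).astype(float)                   # B0: top hat of unit width

def bspline(order):
    """Build the order-n B-spline by convolving the top hat with itself n times."""
    k = ngp.copy()
    for _ in range(order):
        k = np.convolve(k, ngp, mode="same") * dx       # each convolution raises the order by 1
    return k

pcs = bspline(3)                                        # B3 (PCS) kernel
quartic = np.convolve(ngp, pcs, mode="same") * dx       # NGP * PCS -> B4

print(quartic.sum() * dx)          # ~1.0: unit integral preserved
print(x[quartic > 1e-9].max())     # ~2.5: half-width of the quartic B-spline
```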