The VIMOS Public Extragalactic Redshift Survey (VIPERS)

J. Bel; E. Branchini; C. Di Porto; O. Cucciati; B. R. Granett; A. Iovino; S. de la Torre; C. Marinoni; L. Guzzo; L. Moscardini; A. Cappi; U. Abbas; C. Adami; S. Arnouts; M. Bolzonella; D. Bottini; J. Coupon; I. Davidzon; G. De Lucia; A. Fritz; P. Franzetti; M. Fumana; B. Garilli; O. Ilbert; J. Krywult; V. Le Brun; O. Le Fèvre; D. Maccagni; K. Małek; F. Marulli; H. J. McCracken; L. Paioro; M. Polletta; A. Pollo; H. Schlagenhaufer; M. Scodeggio; L. A. M. Tasca; R. Tojeiro; D. Vergani; A. Zanichelli; A. Burden; A. Marchetti; Y. Mellier; R. C. Nichol; J. A. Peacock; W. J. Percival; S. Phleps; M. Wolk

doi:10.1051/0004-6361/201526455

Home

All issues

Volume 588 (April 2016)

A&A, 588 (2016) A51

Full HTML

Free Access

Issue		A&A Volume 588, April 2016


Article Number		A51
Number of page(s)		15
Section		Cosmology (including clusters of galaxies)
DOI		https://doi.org/10.1051/0004-6361/201526455
Published online		17 March 2016

A&A 588, A51 (2016)

On the recovery of the count-in-cell probability distribution function^⋆

J. Bel²^,7^,31^,⋆⋆, E. Branchini¹⁰^,28^,29, C. Di Porto⁹, O. Cucciati¹⁷^,9, B. R. Granett², A. Iovino², S. de la Torre⁴, C. Marinoni⁷^,30^,31, L. Guzzo²^,27, L. Moscardini¹⁷^,18^,9, A. Cappi⁹^,21, U. Abbas⁵, C. Adami⁴, S. Arnouts⁶, M. Bolzonella⁹, D. Bottini³, J. Coupon³², I. Davidzon⁹^,17, G. De Lucia¹³, A. Fritz³, P. Franzetti³, M. Fumana³, B. Garilli³^,4, O. Ilbert⁴, J. Krywult¹⁵, V. Le Brun⁴, O. Le Fèvre⁴, D. Maccagni³, K. Małek²³, F. Marulli¹⁷^,18^,9, H. J. McCracken¹⁹, L. Paioro³, M. Polletta³, A. Pollo²²^,23, H. Schlagenhaufer²⁴^,20, M. Scodeggio³, L. A. M. Tasca⁴, R. Tojeiro¹¹, D. Vergani²⁵^,9, A. Zanichelli²⁶, A. Burden¹¹, A. Marchetti¹^,2, Y. Mellier¹⁹, R. C. Nichol¹¹, J. A. Peacock¹⁴, W. J. Percival¹¹, S. Phleps²⁰ and M. Wolk¹⁹

¹ Università degli Studi di Milano, via G. Celoria 16, 20130 Milano, Italy
² INAF–Osservatorio Astronomico di Brera, via Brera 28, 20122 Milano, via E. Bianchi 46, 23807 Merate, Italy
³ INAF–Istituto di Astrofisica Spaziale e Fisica Cosmica Milano, via Bassini 15, 20133 Milano, Italy
⁴ Aix Marseille Université, CNRS, LAM (Laboratoire d’Astrophysique de Marseille) UMR 7326, 13388 Marseille, France
⁵ INAF–Osservatorio Astronomico di Torino, 10025 Pino Torinese, Italy
⁶ Canada-France-Hawaii Telescope, 65–1238 Mamalahoa Highway, Kamuela, HI 96743, USA
⁷ Aix Marseille Université, CNRS, CPT, UMR 7332, 13288 Marseille, France
⁸ Université de Lyon, 69003 Lyon, France
⁹ INAF–Osservatorio Astronomico di Bologna, via Ranzani 1, 40127 Bologna, Italy
¹⁰ Dipartimento di Matematica e Fisica, Università degli Studi Roma Tre, via della Vasca Navale 84, 00146 Roma, Italy
¹¹ Institute of Cosmology and Gravitation, Dennis Sciama Building, University of Portsmouth, Burnaby Road, Portsmouth, PO1 3FX, UK
¹² Institute of Astronomy and Astrophysics, Academia Sinica, PO Box 23-141, 10617 Taipei, Taiwan
¹³ INAF–Osservatorio Astronomico di Trieste, via G. B. Tiepolo 11, 34143 Trieste, Italy
¹⁴ SUPA, Institute for Astronomy, University of Edinburgh, Royal Observatory, Blackford Hill, Edinburgh EH9 3HJ, UK
¹⁵ Institute of Physics, Jan Kochanowski University, ul. Swietokrzyska 15, 25-406 Kielce, Poland
¹⁶ Department of Particle and Astrophysical Science, Nagoya University, Furo-cho, Chikusa-ku, 464-8602 Nagoya, Japan
¹⁷ Dipartimento di Fisica e Astronomia – Alma Mater Studiorum Università di Bologna, viale Berti Pichat 6/2, 40127 Bologna, Italy
¹⁸ INFN, Sezione di Bologna, viale Berti Pichat 6/2, 40127 Bologna, Italy
¹⁹ Institute d’Astrophysique de Paris, UMR7095 CNRS, Université Pierre et Marie Curie, 98bisboulevard Arago, 75014 Paris, France
²⁰ Max-Planck-Institut für Extraterrestrische Physik, 84571 Garching b. München, Germany
²¹ Laboratoire Lagrange, UMR7293, Université de Nice Sophia Antipolis, CNRS, Observatoire de la Côte d’Azur, 06300 Nice, France
²² Astronomical Observatory of the Jagiellonian University, Orla 171, 30-001 Cracow, Poland
²³ National Centre for Nuclear Research, ul. Hoza 69, 00-681 Warszawa, Poland
²⁴ Universitätssternwarte München, Ludwig-Maximillians Universität, Scheinerstr. 1, 81679 München, Germany
²⁵ INAF–Istituto di Astrofisica Spaziale e Fisica Cosmica Bologna, via Gobetti 101, 40129 Bologna, Italy
²⁶ INAF–Istituto di Radioastronomia, via Gobetti 101, 40129 Bologna, Italy
²⁷ Dipartimento di Fisica, Università di Milano-Bicocca, P.zza della Scienza 3, 20126 Milano, Italy
²⁸ INFN, Sezione di Roma Tre, via della Vasca Navale 84, 00146 Roma, Italy
²⁹ INAF–Osservatorio Astronomico di Roma, via Frascati 33, 00040 Monte Porzio Catone ( RM), Italy
³⁰ Institut Universitaire de France, 75231 Paris Cedex 05, France
³¹ Université de Toulon, CNRS, CPT, UMR 7332, 83957 La Garde, France
³² Astronomical Observatory of the University of Geneva, ch. d’Ecogia 16, 1290 Versoix, Switzerland

Received: 3 May 2015
Accepted: 21 January 2016

Abstract

We compare three methods to measure the count-in-cell probability density function of galaxies in a spectroscopic redshift survey. From this comparison we found that, when the sampling is low (the average number of object per cell is around unity), it is necessary to use a parametric method to model the galaxy distribution. We used a set of mock catalogues of VIPERS to verify if we were able to reconstruct the cell-count probability distribution once the observational strategy is applied. We find that, in the simulated catalogues, the probability distribution of galaxies is better represented by a Gamma expansion than a skewed log-normal distribution. Finally, we correct the cell-count probability distribution function from the angular selection effect of the VIMOS instrument and study the redshift and absolute magnitude dependency of the underlying galaxy density function in VIPERS from redshift 0.5 to 1.1. We found a very weak evolution of the probability density distribution function and that it is well approximated by a Gamma distribution, independently of the chosen tracers.

Key words: large-scale structure of Universe / cosmology: observations / galaxies: high-redshift

^⋆

Based on observations collected at the European Southern Observatory, Cerro Paranal, Chile, using the Very Large Telescope under programmes 182.A-0886 and partly 070.A-9007. Also based on observations obtained with MegaPrime/MegaCam, a joint project of CFHT and CEA/DAPNIA, at the Canada-France-Hawaii Telescope (CFHT), which is operated by the National Research Council (NRC) of Canada, the Institut National des Sciences de l’Univers of the Centre National de la Recherche Scientifique (CNRS) of France, and the University of Hawaii. This work is based in part on data products produced at TERAPIX and the Canadian Astronomy Data Centre as part of the Canada-France-Hawaii Telescope Legacy Survey, a collaborative project of NRC and CNRS. The VIPERS web site is http://www.vipers.inaf.it/

^⋆⋆

Corresponding author: J. Bel, This email address is being protected from spambots. You need JavaScript enabled to view it.

© ESO, 2016

1. Introduction

Galaxy clustering offers a formidable playground in which to try to understand how structures have grown during since. the evolution of the Universe. A number of statistical tools have been developed and used over the past 30 years (see Bernardeau et al. 2002, for a review). In general, these statistical methods use the fact that the clustering of galaxies is the result of the gravitational pull of the underlying matter distribution. Hence, the study of the spatial distribution of galaxies in the Universe allows us to get information about the statistical properties of its content matter. As a result, it is of paramount importance to be able to measure the statistical quantities that describe the galaxy distribution from a redshift survey. In particular, we focus on the probability distribution of galaxy cell count which has also been measured in previous redshift surveys (Bouchet et al. 1993; Szapudi et al. 1996; Yang & Saslaw 2011).

The development of multi-object spectrographs on 8-m class telescopes during the 1990s triggered a number of deep redshift surveys with measured distances beyond z ~ 0.5 over areas of 1–2 deg² (e.g. VVDS Le Fevre et al. 2005; DEEP2 Newman et al. 2013; and zCOSMOS Lilly et al. 2009). Even so, it was not until the wide extension of VVDS was produced (Garilli et al. 2008), that a survey existed with sufficient volume to attempt cosmologically meaningful computations at z ~ 1 (Guzzo et al. 2008). In general, clustering measurements at z ≃ 1 from these samples remained dominated by cosmic variance, as is dramatically shown by the discrepancy observed between the VVDS and zCOSMOS correlation functions at z ≃ 0.8 (de la Torre et al. 2010).

The VIMOS Public Extragalactic Redshift Survey (VIPERS) is part of a global attempt to take cosmological measurements at z ~ 1 to a new level in terms of statistical significance. In contrast to the BOSS and WiggleZ surveys, which use large field-of-view (~1 deg²) fibre optic positioners to probe huge volumes at low sampling density, VIPERS exploits the features of VIMOS at the ESO VLT to yield a dense galaxy sampling over a moderately large field-of-view (~0.08 deg²). It reaches a volume at 0.5 <z< 1.2, comparable to that of the 2dFGRS (Colless et al. 2001) at z ~ 0.1, allowing the cosmological evolution to be tested with few statistical errors.

The VIPERS redshifts are being collected by tiling the selected sky areas with a uniform mosaic of VIMOS fields. The area covered is not contiguous, but presents regular gaps owing to the specific footprint of the instrument field of view, in addition to intrinsic unobserved areas, which are due to bright stars or defects in the original photometric catalogue. The VIMOS field of view has four rectangular regions of about 8 × 7 square arcminutes each, separated by an unobserved cross (Guzzo et al. 2014; de la Torre et al. 2013). This creates a regular pattern of gaps in the angular distribution of the measured galaxies. Additionally, the target sampling rate and the survey success rate vary among the quadrants, and a few of the latter were lost because of mechanical problems within VIMOS (Garilli et al. 2014). Finally, the slit-positioning algorithm, SPOC (see Bottini et al. 2005), also introduces some small-scale angular selection effects, with different constraints along the dispersion and spatial directions of the spectra, as thoroughly discussed in de la Torre et al. (2013). Clearly, this combination of angular selection effects has to be properly taken into account when estimating any clustering statistics.

In this paper we measure the probability distribution function of galaxy fluctuations from the VIPERS Public Data Release 1 (PDR-1) redshift catalogue, including ~64% of the final number of redshifts expected at completion (see Guzzo et al. 2014; Garilli et al. 2014, for a detailed description of the survey data set). The paper is organized as follows: in Sect. 2, we introduce the VIPERS survey and the features of the PDR-1 sample. In Sect. 3, we review the basics of the three methods that we compared. In Sect. 4, we present a null test of the three method on a synthetic galaxy catalogue. In Sect. 5, we use galaxy mock catalogues to assess the performances of two of the methods. Magnitude and redshift dependance of the probability distribution function of VIPERS PDR-1 galaxies are presented in Sect. 6 and conclusions are drawn in Sect. 7.

Throughout, the Hubble constant is parameterized via h = H₀/ 100 km s^-1 Mpc^-1, all magnitudes in this paper are in the AB system (Oke & Gunn 1983), and we will not give an explicit AB suffix. To convert redshifts into comoving distances, we assume that the matter density parameter is Ω_m = 0.27, and that the Universe is spatially flat with a ΛCDM cosmology without radiations.

2. Data

The VIMOS Public Extragalactic Redshift Survey (VIPERS) is a spectroscopic redshift survey, which is being built using the VIMOS spectrograph at the ESO VLT. The survey target sample has been selected from the Canada-France-Hawaii Telescope Legacy Survey Wide (CFHTLS-Wide) optical photometric catalogues (Mellier et al. 2009). The final VIPERS will cover ~24 deg² on the sky, divided over two areas within the W1 and W4 CFHTLS fields. Galaxies are selected to a limit of i_AB< 22.5, further applying a simple and robust gri colour pre-selection, to effectively remove galaxies at z< 0.5. Coupled to an aggressive observing strategy (Scodeggio et al. 2009), this allows us to double the galaxy sampling rate in the redshift range of interest, with respect to a pure magnitude-limited sample (~40%). At the same time, the area, and depth of the survey result in a fairly large volume, ~5 × 10⁷h^-3 Mpc³, analogous to that of the 2dFGRS at z ~ 0.1. This combination of sampling and depth is quite unique over current redshift surveys at z> 0.5. The VIPERS spectra are collected with the VIMOS multi-object spectrograph (Le Fevre et al. 2003) at moderate resolution (R = 210), using the LR Red grism, which provides a wavelength coverage of 5500–9500 Å and a typical redshift error of 141(1 + z) km s^-1. The full VIPERS area is covered through a mosaic of 288 VIMOS pointings (192 in the W1 area, and 96 in the W4 area). A discussion of the survey data-reduction and management infrastructure is presented in Garilli et al. (2012). An early subset of the spectra that is used here is analysed and classified through a principal component analysis (PCA) in Marchetti et al. (2013).

A quality flag is assigned to each measured redshift, based on the quality of the corresponding spectrum. Here and in all parallel VIPERS science analyses we use only galaxies with flags 2 to 9 inclusive, corresponding to a global redshift confidence level of 98%. The redshift confirmation rate and redshift accuracy have been estimated using repeated spectroscopic observations in the VIPERS fields. A more complete description of the survey construction, from the definition of the target sample to the actual spectra and redshift measurements, is given in the parallel survey description paper (Guzzo et al. 2014).

Fig. 1

Upper: expected mean number count in spheres (solid line, from Eq. (2)) with respect to the observed one (symbols) for the various luminosity cuts and for the three redshift bins [ 0.5,0.7 ] (left panel), [ 0.7,0.9 ] (central panel), and [ 0.9,1.1 ] (right panel). The selection in absolute magnitude M_B in B-band, corresponding to each symbols/lines and colors, are indicated in the inset. The dotted line displays the $Mathematical equation: \hbox{$\bar N=1$}$ . Lower: deviation α (see Eq. (3)) between the expected mean number $Mathematical equation: \hbox{$\bar{\mathcal{N}}_R$}$ and the observed one $Mathematical equation: \hbox{$\bar N$}$ with respect to the radius R of the cells.

The data set used in this paper and the other papers of this early science release is the VIPERS Public Data Release 1 (PDR-1) catalogue, which has been made publicly available in September 2013. This includes 55 359 objects, spread over a global area of 8.6 × 1.0 deg² and 5.3 × 1.5 deg² in W1 and W4, respectively. This corresponds to the data frozen in the VIPERS database at the end of the 2011/2012 observing campaign, i.e. 64% of the final expected survey. For the specific analysis presented here, the sample has been further limited to its higher-redshift part, selecting only galaxies with 0.55 <z< 1.1. The reason for this selection is related to minimizing the shot noise and maximizing the volume. This reduces the usable sample to 18 135 and 16 879 galaxies in W1 and W4, respectively (always with quality flags between 2 and 9). The corresponding effective volume of the two samples are 6.57 and 6.14 × 10⁶h^-3 Mpc³. At redshift, z = 1.1 the two volumes span the angular comoving distances ~370 and 230 h^-1 Mpc, respectively. We divide the W1 and W4 fields into three redshift bins and we build magnitude limited subsamples in each of them. For convenience, we use the magnitude limits that are listed in Table 1 of di Porto et al. (2014), which we recall in Table 1.

Table 1

Magnitude selected objects (in B-band) in the VIPERS PDR-1.

The VIMOS footprint has an important impact on the observed probability of finding N galaxies in a randomly placed spherical cell in the survey volume. As a matter of fact, a direct appreciation of the masked area can be shown on the first moment of the probability distribution, i.e. the expectation value of the number count $N̅ \equiv \sum_{N = 0}^{\infty} N P_{N}$ $Mathematical equation: \hbox{$\bar N\equiv \sum_{N=0}^{\infty} N P_N$}$ . On the one hand, we can predict the mean number of objects per cells from the knowledge of the number density in each considered redshift bins and, on the other hand, we can estimate it by placing a regular grid of spherical cells of radius R into the volume surveyed by VIPERS. In fact, given the solid angle of W1 and W4 and the corresponding number of galaxies N₁ and N₄ contained in a redshift bin that has been extracted from each field, we can estimate the total number density as $ρ̅ = \frac{N_{1} + N_{4}}{Ω_{1} + Ω_{4}} \frac{1}{V_{k}},$ $Mathematical equation: \begin{equation} \bar\rho=\frac{N_1+N_4}{\Omega_1 + \Omega_4}\frac{1}{V_k}, \label{density} \end{equation}$ (1)where V_k is defined as the volume corresponding to a sector of a spherical shell with a solid angle equal to unity. In the case of VIPERS PDR-1, the effective solid angles that correspond to W1 and W4 are Ω₁ = 1.6651683 × 10^-3 and Ω₄ = 1.5573021 × 10^-3 (in square radians), respectively. The corresponding expected number of objects in each cell can be predicted by multiplying the averaged number density by the volume of a cell. This reads as $𝒩̅ R = \frac{4}{3} π R^{3} ρ̅,$ $Mathematical equation: \begin{equation} \bar{\mathcal{N}}_R=\frac{4}{3}\pi R^3 \bar\rho, \label{nbarth} \end{equation}$ (2)in the case of the spherical cells of radius R, as considered in this work. The expectation value $Mathematical equation: \hbox{$\bar{\mathcal{N}}_R$}$ , with respect to the radius of the cells corresponding to each luminosity sub-sample extracted from VIPERS-PDR1, is represented by lines in Fig. 1. In the same figure, we display the measured mean number of object $Mathematical equation: \hbox{$\bar N$}$ in each redshift bins. We note that, to perform this measurement, we place a grid of equally separated (4 h^-1 Mpc) spheres of radius R = 4,6,8 h^-1 Mpc and we reject spheres with more than 40% of their volume outside the observed region (see Bel et al. 2014). We quantify the effect of the mask using the quantity $α \equiv \frac{N̅}{𝒩̅ R} \cdot$ $Mathematical equation: \begin{equation} \alpha\equiv \frac{\bar N}{\bar{\mathcal{N}}_R}\cdot \label{alpha} \end{equation}$ (3)In fact the bottom panels of Fig. 1 show that, for all subsamples and at all redshifts, the neat effect of the masks is to under-sample the galaxy field by roughly 72%. It also shows that the correction factor α depends on the redshift that is considered, on the luminosity, and on the cell-size. The scale dependency can be explained by the fact that the correction parameter α depends on how the cells overlap with the masked regions. The left panel of Fig. 1 suggests that, at low redshift, the mask effect behaves in the same way for all the luminosity samples, while the middle panel shows a clear dependency with respect to luminosity. The correction factor α depends on the redshift distribution. As a result, the apparent dependency with respect to the luminosity is due to the dependence of the number density with respect to the luminosity of the considered objects.

The mask not only modifies the mean number of object, it also modifies the higher order moments of the distribution in such a way that the measured P_N will be systematically altered. In this paper, we show that this systematic effect can be taken into account by measuring the underlying probability density function of the galaxy density contrast δ. It has been shown (see Fig. 8 of Bel et al. 2014) that, after rejecting spheres with more than 40% of their volume outside the survey, the local poisson process approximation holds. The same kind of rejection criteria is implemented by Cappi et al. (2015) to measure the moments of the galaxy distribution function. In our case, it allows us to use the ‘wrong’ probability distribution function to get reliable information on the underlying probability density function p(δ). By then applying the Poisson sampling, we can recover the unaltered P_N using that $Mathematical equation: \hbox{$\bar N=\bar N(masked)/\alpha$}$ . For the sake of completeness, we provide tthe measured probability function that was obtained after rejecting the cells with more than 40% of their volume outside the survey (see Fig. 8).

In particular, let P_M and P_N, respectively, be the observed and the true counting probability distribution function (CPDF). Assuming that, from the knowledge of P_M, there exists a process to get the underlying probability density function of the stochastic field Λ, which is associated with the random variable N, one can compute the true CPDF applying $P_{N} = \int_{0}^{\infty} P [N | Λ] p (Λ) dΛ,$ $Mathematical equation: \begin{equation} P_N=\int_0^\infty P[N|\Lambda]p(\Lambda)\dif\Lambda, \label{sampling} \end{equation}$ (4)where P [ N | Λ ] is called the sampling conditional probability; this determines the sampling process from which the discrete cell-count arises. In the following, we assume that this sampling conditional probability follows a Poisson law (Layzer 1956), and as a result in Eq. (4) we substitute $P [N | Λ] = K [N, Λ] \equiv \frac{Λ^{N}}{N!} e^{- Λ} .$ $Mathematical equation: \begin{equation} P[N|\Lambda]=K[N, \Lambda]\equiv \frac{\Lambda^N}{N!}{\rm e}^{-\Lambda}. \label{pkernel} \end{equation}$ (5)It is also convenient to express Eq. (4) in terms of the density contrast of the stochastic field Λ, $Mathematical equation: \hbox{$\delta\equiv \Lambda/\bar\Lambda -1$}$ , it follows that $P_{N} = \int_{-1}^{\infty} K [N | N̅ (1 + δ)] p (δ) d δ,$ $Mathematical equation: \begin{equation} P_N=\int_{-1}^\infty K[N|\bar N(1+\delta)]p(\delta)\dif\delta, \label{samplingd} \end{equation}$ (6)where we used $Mathematical equation: \hbox{$\bar \Lambda=\bar N$}$ , which is a property of the Poisson sampling.

Continuing in this direction, we propose to compare three methods which aim at extracting the underlying probability density function (PDF) to correct the observed CPDF from the angular selection effects of VIPERS.

3. Methods

In this section we review the PDF estimators that we use and compare them with each other in this paper. The purpose is to select the method that is more adapted to the characteristics of VIPERS.

3.1. The Richardson-Lucy deconvolution

This is an iterative method that aims at inverting Eq. (6) without parametrising the underlying PDF, and it has been investigated by Szapudi & Pan (2004). This method starts with an initial guess p₀ for the probability density function p, which is used to compute the corresponding expected observed P_N,0 via $\begin{matrix} P_{N, 0} = \int_{-1}^{\infty} K̂ [N, N̅ (1 + δ)] p_{0} (δ) d δ, \end{matrix}$ $Mathematical equation: \begin{eqnarray*} P_{N,0}=\int_{-1}^{\infty} \hat K\left[ N, \bar N(1+\delta)\right]p_0(\delta)\dif\delta, \end{eqnarray*}$ where $Mathematical equation: \hbox{$\hat K\left[ N, \bar N(1+\delta)\right]\equiv K/\sum_N K$}$ . The probability density function used at the next step is obtained using $\begin{matrix} p̂ i + 1 (δ) = p̂ i (δ) \sum_{N = 0}^{N_{\max}} \frac{P_{N}}{P_{N,i}} K̂ [N, N̅ (1 + δ)], \end{matrix}$ $Mathematical equation: \begin{eqnarray*} \hat p_{i+1}(\delta)=\hat p_{i}(\delta) \sum_{N=0}^{N_{\max}}\frac{P_N}{P_{N,i}} \hat K\left[ N, \bar N(1+\delta)\right], \end{eqnarray*}$ where $Mathematical equation: \hbox{$\hat p \equiv p\sum_N K$}$ . For each step, the agreement between the expected observed probability distribution P_N,i and the true P_N is quantified by $\begin{matrix} χ_{i}^{2} \equiv \sum_{N = 0}^{N_{\max}} {(\frac{P_{N}}{P_{N,i}} - 1)}^{2} . \end{matrix}$ $Mathematical equation: \begin{eqnarray*} \chi_i^2\equiv \sum_{N=0}^{N_{\max}}\left ( \frac{P_N}{P_{N,i}}-1 \right )^2. \end{eqnarray*}$ It is therefore possible to know the evolution of the cost function χ² with respect to the steps i.

In fact it has been shown by Szapudi & Pan (2004) that the cost function converges toward a constant value that corresponds with the best evaluation of the probability density function p, given the observed probability distribution P_N. Since these authors have shown that this convergence occurs after around 30 iterations, we did our own convergence tests, which show that adopting a value of 30 iterations is enough. However, the evolution of the χ² is not always monotonic. In practice, we store the χ² result of each step and we look for the step for which the χ² is minimum, i.e. p(δ) = p_{i_min}(δ). As an initial guess, we make sure that the discret CPDF is equal to the continuous one (p₀(δ) = p).

3.2. The skewed log-normal distribution

This is a parametric method where the shape of the probability density depends on a given number of parameters, in this case the probability density function is assumed to be well described by a skewed log-normal (SLN; Colombi 1994) distribution. It is derived from the log-normal distribution (Coles & Jones 1991; Kim & Strauss 1998) but is more flexible. It is, in fact, built upon an Edgeworth expansion; if the stochastic field Φ ≡ ln(1 + δ), is following a normal distribution then the density contrast δ instead follows a log-normal distribution. In the case of the SLN density function, the field Φ follows an Edgeworth expanded normal distribution $P_{Φ} (Φ) \equiv {1 + \frac{⟨ ν^{3} ⟩_{c}}{6} H_{3} (ν) + \frac{⟨ ν^{4} ⟩_{c}}{24} H_{4} (ν) + \frac{5}{72} ⟨ ν^{3} ⟩_{c}^{2} H_{6} (ν)} \frac{G (ν)}{σ_{Φ}},$ $Mathematical equation: \begin{equation} P_\Phi (\Phi)\equiv \left\{ 1 + \frac{\langle\nu^3\rangle_{\rm c}}{6} H_3(\nu) + \frac{\langle\nu^4\rangle_{\rm c}}{24} H_4(\nu) + \frac{5}{72}\langle\nu^3\rangle_{\rm c}^2 H_6(\nu) \right\}\frac{G(\nu)}{\sigma_\Phi}, \label{edgeworth} \end{equation}$ (7)where $ν \equiv \frac{Φ - μ_{φ}}{σ_{Φ}}$ $Mathematical equation: \hbox{$\nu\equiv \frac{\Phi-\mu_\phi}{\sigma_{\Phi}}$}$ , G is the central reduced normal distribution $G (ν) \equiv \frac{e^{- \frac{ν^{2}}{2}}}{\sqrt{2 π}}$ $Mathematical equation: \hbox{$G(\nu)\equiv\frac{{\rm e}^{-\frac{\nu^2}{2}}}{\sqrt{2\pi}}$}$ , and ⟨ νⁿ ⟩ _c denotes the cumulant expectation value of ν. As a result, the SLN is parameterised by the four parameters μ_Φ, σ_Φ, ⟨ ν³ ⟩ _c, and ⟨ ν⁴ ⟩ _c which are related, respectively to the mean, the dispersion, the skewness, and the kurtosis of the stochastic variable Φ. They can all be expressed in terms of cumulants ⟨ Φⁿ ⟩ _c of order n of the weakly non-Gaussian field Φ. In Szapudi & Pan (2004), they use a best-fit approach and determine these parameters by minimizing the difference between the measured counting probability P_N and the one obtained from $\begin{matrix} P_{N}^{th} & = & \int \begin{matrix} \infty \\ -1 \end{matrix} K [N, N̅ (1 + δ)] P_{Φ} [\ln (1 + δ), μ_{Φ}, σ_{Φ}^{2}, ⟨ Φ^{3} ⟩_{c}, ⟨ Φ^{4} ⟩_{c}] \end{matrix}$ $Mathematical equation: \begin{eqnarray} P_N^{\rm th}&= & \int_{-1}^{\infty}K\left[N, \bar N(1+\delta)\right] P_\Phi\left[\ln(1+\delta), \mu_\Phi, \sigma_\Phi^2, \langle\Phi^3\rangle_{\rm c},\langle\Phi^4\rangle_{\rm c}\right] \nonumber \\ & &\times \dif\ln(1+\delta). \label{pth} \end{eqnarray}$ (8)However, this requires us to perform the integral (Eq. (8)) in a four-dimensional parameter space which is numerically expensive.

In this paper, we use an alternative implementation, which is computationally more efficient. Instead of trying to maximize the likelihood of the model given the observations, we instead use the observations to predict the parameters of the SLN. To do so, we use the property of the local Poisson sampling (Bel & Marinoni 2012); the factorial moments $⟨ (N)_{f}^{n} ⟩$ $Mathematical equation: \hbox{$\langle(N)_{\rm f}^n\rangle$}$ of the discrete counts are equal to the moments of the underlying continuous distribution ⟨ Λⁿ ⟩. Since the transformation between the density contrast δ and the Edgeworth expanded field Φ is local and deterministic, it is possible to find a relation between the moments ⟨ Λⁿ ⟩ and the cumulants ⟨ Φⁿ ⟩ _c.

By definition, the moments of the positive continuous field Λ are given by $\begin{matrix} ⟨ Λ^{n} ⟩ \equiv \int_{0}^{\infty} Λ^{n} P (Λ) dΛ . \end{matrix}$ $Mathematical equation: \begin{eqnarray*} \langle\Lambda^n\rangle\equiv \int_0^\infty\Lambda^n P(\Lambda)\dif\Lambda. \end{eqnarray*}$ Since, for a local deterministic transformation the conservation of probability imposes P(Λ)dΛ = P_Φ(Φ)dΦ, it follows that the moments of Λ can be recast in terms of Φ; $\begin{matrix} ⟨ Λ^{n} ⟩ = {Λ̅}^{n} \int_{0}^{\infty} e^{n Φ} P_{Φ} (Φ) dΦ . \end{matrix}$ $Mathematical equation: \begin{eqnarray*} \langle\Lambda^n\rangle={\bar\Lambda}^n\int_0^\infty {\rm e}^{n\Phi} P_\Phi(\Phi)\dif\Phi. \end{eqnarray*}$ On the right hand side, we recognise the definition of the moment that generates function ℳ_Φ(t) ≡ ⟨ e^tΦ ⟩ , we therefore obtain that $ℳ_{Φ} (t = n) = \frac{⟨ Λ^{n} ⟩}{{Λ̅}^{n}} \equiv A_{n} .$ $Mathematical equation: \begin{equation} \mm_\Phi(t=n)=\frac{\langle\Lambda^n\rangle}{{\bar\Lambda}^n}\equiv A_n. \label{cumumom} \end{equation}$ (9)This equation allows us to link the moment of Λ to the cumulants of Φ via the moment generating function ℳ_Φ.

Moreover, since the probability density P_φ is the product of a sum of Hermite polynomials with a Gaussian function, it is straightforward to compute the explicit expression of the moment-generating function so that we obtain $ℳ_{Φ} (t) = {1 + ⟨ Φ^{3} ⟩_{c} \frac{t^{3}}{6} + ⟨ Φ^{4} ⟩_{c} \frac{t^{4}}{24} + ⟨ Φ^{3} ⟩_{c}^{2} \frac{5}{72} t^{6}} e^{t μ_{Φ} + t^{2} \frac{σ_{Φ}^{2}}{2}} .$ $Mathematical equation: \begin{equation} \mm_\Phi(t)=\left\{ 1 + \langle\Phi^3\rangle_{\rm c}\frac{t^3}{6} + \langle\Phi^4\rangle_{\rm c}\frac{t^4}{24} + \langle\Phi^3\rangle_{\rm c}^2\frac{5}{72}t^6 \right\} {\rm e}^{t\mu_\Phi + t^2\frac{\sigma_\Phi^2}{2}}. \label{mphi} \end{equation}$ (10)In fact, Eqs. (10) and (9) together allow us to set up a system of four equations, so that for n = 1,2,3,4 it reads $Y^{n^{2}} X^{n} B_{n} = A_{n},$ $Mathematical equation: \begin{equation} Y^{n^2}X^nB_n=A_n, \label{system} \end{equation}$ (11)where $Y \equiv e^{\frac{σ_{Φ}^{2}}{2}}$ $Mathematical equation: \hbox{$Y\equiv {\rm e}^{\frac{\sigma_\Phi^2}{2}}$}$ , X ≡ e^μ_Φ and B_n ≡ ℳ_Φ(t = n,μ_Φ = 0,σ_Φ = 0). In the system of equations (Eq. (11)), the right hand side is given by observations and the left hand side depends on the cumulants μ_Φ, $σ_{Φ}^{2}$ $Mathematical equation: \hbox{$\sigma_\Phi^2$}$ , $⟨ Φ ⟩_{c}^{3}$ $Mathematical equation: \hbox{$\langle\Phi\rangle_{\rm c}^3$}$ , and $⟨ Φ ⟩_{c}^{4}$ $Mathematical equation: \hbox{$\langle\Phi\rangle_{\rm c}^4$}$ parameterised in terms of X, Y, $x \equiv ⟨ Φ ⟩_{c}^{3}$ $Mathematical equation: \hbox{$x\equiv \langle\Phi\rangle_{\rm c}^3 $}$ and $y \equiv ⟨ Φ ⟩_{c}^{4}$ $Mathematical equation: \hbox{$y\equiv \langle\Phi\rangle_{\rm c}^4 $}$ . In Appendix A, we detail the procedure to solve this non-linear system of equations. We therefore get the values of the four parameters of the SLN by simply measuring the moments of the counting variable N up to the fourth order.

3.3. The Gamma expansion

The Gamma expansion method follows the same idea as described in Sect. 3.2 but uses a Gamma distribution instead of a Gaussian one. Thit uses the orthogonality properties of the Laguerre polynomialsto modify the moments of the Gamma PDF. This type of an expansion has been investigated in Gaztañaga, Fosalba & Elizalde (2000) where they compared it to the Edgeworth expansion to model the one-point PDF of the matter-density field. Since then it has been extended further by Mustapha & Dimitrakopoulos (2010), in a more general context, to multi-point distributions.

As mentioned above the Gamma expansion requires the use of the Gamma distribution φ_G defined as $φ_{G} (u) \equiv \frac{u^{k - 1}}{θ Γ (k)} e^{- u},$ $Mathematical equation: \begin{equation} \phi_{\rm G}(u)\equiv \frac{u^{k-1}}{\theta\Gamma(k)}{\rm e}^{-u}, \label{defgam} \end{equation}$ (12)where Γ is the Gamma function (for an integer n, Γ(n + 1) = n !, θ and k are two parameters, which are related to the first two moments of the PDF. If the galaxy probability density function is well described by a Gamma expansion at order n then it can be formally written as $P (Λ) = φ_{G} (u) f_{n}^{(k - 1)} (u),$ $Mathematical equation: \begin{equation} P(\Lambda)=\phi_{\rm G}(u)f_n^{(k-1)}(u), \label{gamexp} \end{equation}$ (13)where, by definition, $u \equiv \frac{Λ}{θ}$ $Mathematical equation: \hbox{$u\equiv \frac{\Lambda}{\theta}$}$ , $k = \frac{{Λ̅}^{2}}{σ_{Λ}^{2}}$ $Mathematical equation: \hbox{$k=\frac{{\bar \Lambda}^2}{\sigma_\Lambda^2}$}$ , $θ \equiv \frac{Λ̅}{k} = \frac{σ_{Λ}^{2}}{Λ̅}$ $Mathematical equation: \hbox{$\theta\equiv \frac{\bar \Lambda}{k}=\frac{\sigma_\Lambda^2}{\bar \Lambda}$}$ . The function $f_{n}^{(k - 1)}$ $Mathematical equation: \hbox{$f_n^{(k-1)}$}$ represents the expansion. which aims at tuning the moments of the Gamma distribution; we note that the exponent (k − 1) is not the derivative of order k − 1. Since this expansion is built upon the orthogonal properties of products of Laguerre polynomials with the Gamma distribution, the function $f_{n}^{(k - 1)}$ $Mathematical equation: \hbox{$f_n^{(k-1)}$}$ is given by the sum $f_{n}^{(k - 1)} (x) \equiv \sum_{i = 0}^{n} c_{i} L_{i}^{(k - 1)} (x),$ $Mathematical equation: \begin{equation} f_n^{(k-1)}(x)\equiv \sum_{i=0}^nc_iL_i^{(k-1)}(x), \label{fn} \end{equation}$ (14)where $L_{i}^{(k - 1)}$ $Mathematical equation: \hbox{$L_i^{(k-1)}$}$ are the generalised Laguerre polynomials of order i and the coefficients c_i represent the coefficients of the Gamma expansion and therefore depend on the moments of the galaxy field Λ: $c_{n} \equiv \sum_{i = 0}^{n} (\begin{matrix} n \\ i \end{matrix}) \frac{Γ (k)}{Γ (k + i)} (- 1)^{i} \frac{⟨ Λ^{i} ⟩}{θ^{i}} \cdot$ $Mathematical equation: \begin{equation} c_n\equiv \sum_{i=0}^n {n \choose i} \frac{\Gamma(k)}{\Gamma(k+i)}(-1)^i\frac{\langle\Lambda^i\rangle}{\theta^i}\cdot \label{cn} \end{equation}$ (15)The main interrest of the Gamma expansion with respect to the SLN is that the coefficients of the expansion are directly related to the moments of the distribution we want to model, i.e. it is not necessary to solve a complicated non-linear system of equations, or perform a likelihood estimation of the coefficients. Moreover, it can be easily performed at higher order to describe, as well as possible, the underlying probability-density function of galaxies.

Another advantage of describing the galaxy field Λ by a Gamma expansion probability-density function is that the corresponding observed P_N can be expressed analytically, which is not the case for the SLN, which must be integrated numerically.

In Appendix B we demonstrate the previous statement, which follows that the CPDF P_N can be calculated from $P_{N} = \frac{(- θ)^{N}}{N!} \sum_{i = 0}^{n} c_{i} \frac{Γ (i + k)}{Γ (k)} h_{i}^{(N)} (θ),$ $Mathematical equation: \begin{equation} P_N=\frac{(-\theta)^N}{N!}\sum_{i=0}^nc_i\frac{\Gamma(i+k)}{\Gamma(k)}h_i^{(N)}(\theta), \label{pngam} \end{equation}$ (16)where $h_{i} \equiv \frac{1}{i!} \frac{θ^{i}}{(1 + θ)^{i + k}}$ $Mathematical equation: \hbox{$h_i\equiv \frac{1}{i!}\frac{\theta^i}{(1+\theta)^{i+k}}$}$ and, in this case, we use the notation $h_{i}^{(N)} = \frac{d^{N} h_{i}}{d γ^{N}}$ $Mathematical equation: \hbox{$h_i^{(N)}=\frac{\dif^N h_i}{\dif\gamma^N}$}$ . The successive derivatives of h_i can be obtained from the recursive relation $\begin{matrix} h_{i}^{(N)} (θ) = \frac{(i)_{f}^{N}}{θ^{N}} h_{i} (θ) - \sum_{m = 1}^{N} (\begin{matrix} N \\ m \end{matrix}) \frac{(i + k)_{f}^{m}}{(1 + θ)^{m}} h_{i}^{(N - m)} (θ) . \end{matrix}$ $Mathematical equation: \begin{eqnarray*} h_i^{(N)}(\theta)=\frac{(i)_{\rm f}^N}{\theta^N}h_i(\theta) - \sum_{m=1}^{N} {N \choose m} \frac{(i+k)_{\rm f}^m}{(1+\theta)^m}h_i^{(N-m)}(\theta). \end{eqnarray*}$ In addition to the fact that having the possibility of computing the corresponding observed P_N without requiring an infinite integral for each number N is computationally more efficient, it is also practical to have the analytical calculation for some particular values of the k parameter of the distribution. In fact, when k is lower than 1, which occurs on small scales (4 h^-1 Mpc), the probability-density function goes to infinity when Λ goes to 0 (although the distribution is still well defined). In particular, this numerical divergence would induce large numerical uncertainties in the computation of the void probability P₀. Moreover, one can see that, for the void probability, we have the simple relation $P_{0} = \sum_{i = 0}^{n} c_{i} \frac{Γ (k + i)}{Γ (k)} h_{i} (θ),$ $Mathematical equation: \begin{equation} P_0=\sum_{i=0}^{n}c_i\frac{\Gamma(k+i)}{\Gamma(k)}h_i(\theta), \label{po} \end{equation}$ (17)which can be used to recover the true void probability in VIPERS.

4. Application of the methods on a synthetic galaxy distribution

In this section we analyse a suite of synthetic galaxy distributions generated from 20 realizations of a Gaussian stochastic field. The full process involved in generating these benchmark catalogues is detailed in Appendix C. Each comoving volume has a cubical geometry of size 500 h^-1 Mpc. We generate the galaxies by discretizing the density field according to the sampling conditional probability P [ N | Λ ] , which we assume to be a Poisson distribution with mean Λ. In this way, we know the true underlying galaxy density contrast δ. We can therefore perform a reasonable comparison between the methods introduced in Sect. 3.

To avoid the effect of the grid (0.95 h^-1 Mpc), we smooth both the density field and the discrete field using a spherical top-hat filter of radius R = 8 h^-1 Mpc. We apply the three methods mentioned in Sect. 3 and compare the reconstructed probability-density function to the one expected to be obtained directly from the density field δ.

The discrete distribution of points contains an average number of object per cell $Mathematical equation: \hbox{$\bar N=8,$}$ which is the one that is expected, according to our sampling process. The corresponding P_N is given by the black histogram in the lower panel of Fig. 2. From this measurement we apply the three methods R-L, SLN, and Γ_e and obtain an estimation of the probability density function that corresponds to each method. In the upper panel of Fig. 2, we compare the performance of the three methods at recovering the true probability-density function (the black histogram referred to as a reference in the inset). We note that, for this test case, we use a Gamma expansion at order 4 to be coherent with the order of the expansion of the skewed log-normal. We have also represented the probability-density function, estimated when neglecting the shot noise (red dotted line), which is used as the initial guess in the case of the R-L method.

From the top panel of Fig. 2, we can conclude that the three methods perform reasonably well. It seems that the Γ_e method reproduces the density distribution of under-dense regions (δ ~ − 1) better but this is expected in the sense that the distribution used to generate the synthetic catalogues is a Gamma distribution (see Appendix C). However, this is not obvious because the scale on which the density field has been set up is one order of magnitude smaller than the scale of the reconstruction R = 8 h^-1 Mpc.

The performance of the three methods is also represented in the bottom panel of Fig. 2, in which we compare the expected observed P_N we compare the PN that is expected from the underlying density distribution obtained from each method to the true probability distribution. It can be seen that they all agree at the 15% level, hence it is not possible to conclude that one is better than an other. This scenario was actually expected, based on a comparison of the underlying density field (Fig. 2). Indeed, if one of the methods did not agree with the PDF, then we would also expect a disagreement on the observed CPDF (see Sect. 6).

Below, we investigate the sensitivity of the three methods with respect to the shot noise. In fact, as shown in Fig. 1, we will work with a high shot noise level ( $Mathematical equation: \hbox{$\bar N \le 1$}$ ) in most of the sub-samples of VIPERS PDR-1. We therefore randomly under-sample the fake galaxy distribution by keeping only 10% of the total number of objects contained in each comoving volume. This process gives an average number per cell of 0.8, which is more representative in the context of the application of the reconstruction method. We perform the same comparison as in the ideal case ( $Mathematical equation: \hbox{$\bar N \simeq 8$}$ ) and find that the R-L method appears to be highly sensitive to shot noise. In fact, if the mean number of objects per cell is too few then the output of the method depends too much on the initial guess. It follows that, if it is too far from the true PDF, the process does not converge (see top panel of Fig. 3) and the corresponding P_N does not match the observed P_N (see bottom panel of Fig. 3). We note that we explicitly checked this effect by increasing the number of iterations from 30 to 200. While in the case of both the SLN and the Gamma expansion, in Fig. 3 we can see the output probability-density function is in agreement (with a larger scatter) to the one obtained in the $Mathematical equation: \hbox{$\bar N \simeq 8$}$ case. This means that the sensitivity regarding to the shot noise is much smaller when considering parametric methods.

Fig. 2

Upper: black histogram with error bars showing the true underlying probability-density function (referred to as reference in the inset) compared to the reconstruction obtained with the R-L (red dashed line), the SLN (green dot-dashed line), and the Γ_e (blue long dashed line) methods. The red dotted histogram shows the PDF used as the initial guess for the R-L method and the coloured dotted lines around each method line represent the dispersion of the reconstruction among the 20 fake galaxy catalogues. We also display the relative difference of the result obtained from each method with respect to the true PDF. Lower: the black histogram with error bars shows the observed probability-density function (referred to as reference in the inset) compared to the reconstruction obtained with the R-L (red dashed line), the SLN (green dot-dashed line), and the Γ_e (blue long dashed line) methods. We also display the relative difference in the result obtained from each method with respect to the observed P_N.

Fig. 3

Same as in Fig. 2, but we use only 10% of the galaxies contained in the fake galaxy catalogues. As a result, the average number of galaxies per cell drops from $Mathematical equation: \hbox{$\bar N=8$}$ to $Mathematical equation: \hbox{$\bar N=0.8$}$ .

Considering the sensitivity of the R-L method to the initial guess, knowing that the average number of galaxies per cell can be lower than unity and, finally, taking computational time into account, we continue our analysis using only the two parametric methods SLN and Γ_e. In the following, we compare them using more realistic mock catalogues for which we don’t know, apriori, the true underlying PDF.

5. Performances in realistic conditions

In this section, we discuss how observational effects have been accounted for in our analysis and test the robustness of the reconstruction methods SLN and Gamma expansion. For this purpose we use a suite of mock catalogues created from the Millenium simulation, which are also used in the analysis performed by di Porto et al. (2014).

We compare the reconstruction methods between two catalogues, namely REFERENCE and MOCK. The reference is a galaxy catalogue that was obtained from semi-analytical models. We simulate the redshift errors of VIPERS PDR-1 by perturbing the redshift (including distortions owing to peculiar motions) with a normally distributed error with rms 0.00047(1 + z). Each MOCK catalogue is built from the corresponding REFERENCE catalogue by applying the same observational strategy (de la Torre et al. 2013) which is applied to VIPERS PDR-1; spectroscopic targets are selected from the REFERENCE catalogue by applying the slit-positioning algorithm (SPOC, Bottini et al. 2005) with the same settings as for the PDR-1. This allows us to reproduce the VIPERS footprint on the sky, the small-scale angular incompleteness that is due to spectra collisions, and the variation of the target sampling rate across the fields. Finally, we deplete each quadrant to reproduce the effect of the survey success rate (SSR, see de la Torre et al. 2013). In this way, we end up with 50 realistic mock catalogues, which simulate the detailed survey completeness function and observational biases of VIPERS in the W1 and W4 fields.

To perform a similar analysis to the one we aim at doing for VIPERS PDR-1, we construct subsamples of galaxies selected according to their absolute magnitude M_B in B-band; we take all objects brighter than a given luminosity. We list these samples in Table 2, having a total of six galaxy samples. The highest luminosity cut (M_B − 5log (h) < 19.72 − z) allows us to follow a single population of galaxies at three cosmic epochs.

Table 2

List of the magnitude selected objects (in B-band) in the mock catalogues.

Fig. 4

Comparison between the SLN and Γ_e methods at 0.9 <z< 1.1. Each panel corresponds to a cell radius R of 4, 6, and 8 h^-1 Mpc from left to right. Top: the red histogram shows the observed PDF in the MOCK catalogues while the black histogram displays the PDF extracted from the REFERENCE catalogues. The blue diamonds with lines and the magenta triangles each show the Γ_e expansion performed in the REFERENCE and MOCK catalogues, respectively. On the other hand, the cyan diamonds with lines and the orange triangles show, respectively, the SLN expansion performed in the REFERENCE and MOCK catalogues. Bottom: relative deviation of the Γ_e and SLN expansions applied to both the REFERENCE and MOCK catalogues with respect to the PDF of the REFERENCE catalogues.

Fig. 5

Comparison between the SLN and Γ_e methods at 0.7 <z< 0.9. Each panel corresponds to a cell radius R of 4, 6, and 8 h^-1 Mpc from left to right.

Fig. 6

Comparison between the SLN and Γ_e methods at 0.5 <z< 0.7. Each panel corresponds to a cell radius R of 4, 6, and 8 h^-1 Mpc from left to right.

Fig. 7

Comparison between the SLN and Γ_e methods. Each column corresponds to a cell radius R of 4, 6, and 8 h^-1 Mpc from left to right, and each row corresponds to a combination of redshift and magnitude cut.

In Figs. 4–6, we show the reconstruction performances for the SLN and the Γ_e method. We consider the same population (M_b − 5log h + z< −19.72) but in three redshift bins, 0.9 <z< 1.1, 0.7 <z< 0.9, and 0.5 <z< 0.7. To test the stability of the methods, we perform the reconstruction using three smoothing scales, R = 4, 6, and 8 h^-1 Mpc. The comparison is done as follows, on the one hand we estimate the true P_N from the REFERENCE catalogue (before applying the observational selection) and we perform the reconstruction on it so that we can test the intrinsic biases that are the result of the assumed parametric method (SLN or Γ_e). On the other hand, we estimate the observed P_M in the MOCK catalogues, from which we perform the reconstruction to verify if we recover the expected P_N from the REFERENCE catalogue.

Looking more closely at Fig. 4, firstly, we can see that the intrinsic error that is due to the specific modelling of the methods is much greater for the SLN (cyan diamonds compared to the black histogram) than for the Γ_e (magenta diamonds compared to the black histogram). From the top panel we see that the SLN does not reproduce the tail of the CPDF, and from the bottom panel we see that, even for low counts, it shows deviations as large as 20%. This intrinsic limitation propagates when performing the reconstruction on the MOCK catalogue (orange triangles compared to the black histogram) while, for the Γ_e, we see that the agreement is better than 10% (magenta triangles compared to the black histogram) in the low count regime and the tail is fairly well reproduced. In the second place, comparing the Γ_e that was performed on the REFERENCE and the MOCK catalogues (blue diamonds with respect to magenta triangles), we can see the loss of information owing to the observational strategy that has, at most, an impact of 10% on the reconstructed CPDF, which is reduced when considering larger cells (less shot noise).

In general, Figs. 5 and 6 confirm that for the considered galaxy population the same results hold at lower redshifts. However, the reconstruction at R = 4 h^-1 Mpc can, in particular, exhibit deviations larger than 20%, which is at odds with the fact that the shot noise contribution is expected to be the same for the three redshift bins (magnitude limited). We attribute this larger instability to the fact that, not only is the shot noise contribution higher for R = 4 h^-1 Mpc, but the volume probed is also smaller when decreasing the redshift.

The performances of the reconstruction for the last three galaxy samples are shown in Fig. 7, where each row corresponds to a galaxy sample (we only show the residual with respect to the REFERENCE). This comparison allows us to claim that the reconstruction instability at 4 h^-1 Mpc was indeed due to the high level of shot noise. We can conclude that, in the HOD galaxy mock catalogues, the galaxy distribution is more likely to be modelled by a Γ_e instead of an SLN. Finally, for a chosen reconstruction method, the information contained in the MOCK catalogues is enough to be able to reconstruct the CPDF of the REFERENCE catalogue at the 10% level.

6. VIPERS PDR-1 data

In this section, we apply the reconstruction method to the VIPERS PDR-1. In the previous sections, we saw that the SLN and Γ_e methods are sensitive to the assumptions we make about the underlying PDF. In fact, we saw in Sect. 4 that, if the underlying PDF is close to the chosen model, then the reconstruction works. In Sect. 5, we found that the galaxy distribution arising from semi-analytic models is better described by a Γ_e than an SLN distribution. However, in the following we do not take for granted that the same property holds for galaxies in the PDR-1.

We want to choose which one of the two distributions (log-normal or gamma) best describes the observed galaxy distribution in VIPERS PDR-1, when no expansion is applied. Thus, we compare the observed PDF to the one that is expected from the Poisson sampling of the log-normal probability density function (PS-LN) and to the one that is expected from the Poisson sampling of the Gamma distribution (the so-called negative binomial). Error bars are obtained by performing a jack-knife resampling of 3 × 7 subregions in each of the fields, W1 and W4.

Fig. 8

Observed count-in-cell probability distribution function P_N (histograms) from VIPERS PDR-1 for various luminosity cuts (indicated in the inset). Each row corresponds to a redshift bin, from the bottom to the top, 0.5 <z< 0.7, 0.7 <z< 0.9, and 0.9 <z< 1.1. Each column corresponds to a cell radius R = 4,6,8 h^-1 Mpc from left to right. Moreover we added the expected PDF from two models which match the two first moments of the observed distribution; the red solid line shows the prediction for a Poisson-sampled log-normal (PS-LN) CPDF, while the green dashed line indicates the negative binomial model for the CPDF.

The SP-LN distribution does not have an analytic expression and must be obtained by numerically integrating Eq. (6), while the Poisson sampling of the Gamma distribution leads to the negative binomial distribution defined as $P_{N} = \frac{θ^{N}}{N!} \frac{r (r + 1) ... (r + N - 1)}{(1 + θ)^{N + r}},$ $Mathematical equation: \begin{equation} P_N=\frac{\theta^N}{N!}\frac{r(r+1)...(r+N-1)}{(1+\theta)^{N+r}}, \label{negbin} \end{equation}$ (18)where $θ = \frac{N̅}{r}$ $Mathematical equation: \hbox{$\theta=\frac{\bar N}{r}$}$ and $r = \frac{N̅^{2}}{σ_{N}^{2} - N̅}$ $Mathematical equation: \hbox{$r=\frac{\bar N^2}{\sigma_N^2-\bar N}$}$ to ensure that the first two moments of the negative binomial match those of the observed distribution. We note that the applicability of the negative binomial to galaxies was first suggested by Carruthers & Duong-van (1983). In Fig. 8, we show the outcome of this comparison; it follows that the negative binomial is much closer to the observed PDF than the PS-LN. This is in agreement with what has been found by Yang & Saslaw (2011), who compared the negative binomial and the gravitational equilibrium distribution (Saslaw & Hamilton 1984) to model the galaxy clustering. As a result, the underlying galaxy distribution is more likely to be described by a Gamma distribution than by a log-normal. Hence, we only use the Gamma expansion to model the galaxy distribution of VIPERS PDR-1.

Moreover, the use of the Gamma expansion instead of the SLN substantially simplifies the analysis. In Fig. 9 we provide the reconstructed probability-distribution function of VIPERS PDR-1, together with the corresponding underlying probability-density function for each redshift bin and luminosity cut. Each panel of Fig. 9 shows how the choice of a particular class of tracers (selected according to their absolute magnitude in B-band) influences the PDF of galaxies. When measuring specific properties of the intrinsic galaxy distribution for each luminosity cut, it is enough to look at the CPDF. However, when comparing the distributions with each other, it is necessary to take account of the averaged number of objects per cell, which varies from sample to sample. As a result, it appears more useful to compare the properties of the different galaxy samples using their underlying probability-density function which, assuming Poisson sampling, is free from sampling-rate variation between different type of tracers.

For the two first redshift bins, we can see that the probability density function is broadening when selecting more luminous galaxies, this goes in the direction of increasing the linear bias with respect to the matter distribution. However, for the highest redshift bin, it seems that this goes in the opposite direction, despite a less significant trend. This trend might be an artifact; indeed by analysing Fig. 1, we see that, for all these samples, the averaged number of object per cell is between 0.2–0.4, which shows that theses samples could be highly affected by shot noise effects. Consequently, particular care should be taken when interpreting these three high redshift samples.

In the following, we focus on the evolution of the underlying PDF for a particular class of objects on the wide redshift range probed by VIPERS PDR-1. Figure 10 displays the outcome of this study and shows how the PDF evolves, with regard to the redshift at which it is measured, for three populations (the three highest magnitude cuts). The three populations (top, middle, and bottom panels) exhibit non-monotonic evolution in relation to the redshift. In particular, the more luminous population shows that the PDF at 0.9 <z< 1.1 appears to be systematically different to that is the two lower redshift bins. However, we see also that some instabilities appear in the reconstruction (see wiggles at high 1 + δ). This might be due to the fact that we have fewer galaxies in this sample, giving rise to a large shot-noise contribution ( $Mathematical equation: \hbox{$\bar N <0.3$}$ ). Indeed, we verified that, for the high mass bin and the two other galaxy populations, if we vary the order of the expansion from 6 to 4, the resulting PDF changes by less than 1σ, while for the most luminous population, truncating the expansion at order 4 only removes the instability without changing the overall behaviour of the PDF significantly. This consistency test shows that the radical change in the measured PDF for the highest redshift bin appears to be the true feature. Probably only the final VIPERS data set will be able to give a robust conclusion.

Fig. 9

Top: reconstructed PDF applying the Γ_e method in three redshift bins (from left to right) at the intermediate smoothing scale R = 6 h^-1 Mpc. Bottom: underlying PDF corresponding to the CPDF in the top panel, for each luminosity cut the 1-sigma uncertainty is represented by the dotted lines.

Fig. 10

Evolution of three galaxy populations, selected according to their luminosity (from bottom to top). On each panel, the black solid, red dashed, and cyan dot-dashed lines represent, respectively, the three redshift bins 0.5 <z< 0.7, 0.7 <z< 0.9, and, 0.9 <z< 1.1.

Table 3

Coefficients of the Γ_e expansion, which describe the VIPERS PDR-1 data for R = 6 h^-1 Mpc.

Finally, in Table 3, we list the relevant coefficients of the Gamma expansion, which we measured from the VIPERS PDR-1 at the scale R = 6 h^-1 Mpc. These can be used to model both the CPDF (Eq. (16)) and the PDF (Eq. (13)).

7. Summary

The main goal of the present paper is to measure the probability of finding N galaxies falling into a spherical cell that is randomly placed inside a sparsely sampled (i.e. with masked areas or with low-sampling rate) spectroscopic survey. Our overall approach to this problem has been to use the underlying probability-density distribution of the density contrast of galaxies to recover the counting probability that has been corrected from sparseness effects. We therefore compared three ways (R-L, SLN and Γ_e) of measuring the probability density of galaxies that are classified in two categories: direct and parametric. We found that, when the sampling is high ( $Mathematical equation: \hbox{$\bar N \simeq 10$}$ ), the direct method (Rychardson-Lucy deconvolution) performs well and avoids putting any prior on the shape of the distribution. On the other hand, we saw that, when the sampling is low ( $Mathematical equation: \hbox{$\bar N \simeq 1$}$ ), the direct method fails to converge to the true underlying distribution. We thus concluded that, in such cases, the only alternative is to use a parametric method.

We presented two parametric forms that are aimed at describing the galaxy density distribution, the SLN, which is often used in the literature to model the matter distribution and the Γ_e. Despite the fact that the two distributions used in this paper already have been investigated in previous works, the approach we propose to estimate their parameters is completely new. Previously, fitting procedures were used to estimate parameters. Here we propose to measure the parameters of the distributions directly from the observations. The method can be applied to both SLN and Γ_e distributions and decreases the computational time of the process considerably.

Relying on simulated galaxy catalogues of VIPERS PDR1, we tested the reconstruction scheme of the counting probability (P_N) under realistic conditions in the case of the SLN and Γ_e expansions. We found, that the reconstruction depends on the choice of the model for the galaxy distribution. However, we have also shown that it is possible to test which distribution better describes the observations.

Using VIPERS PDR1, on the relevant scales that are investigated in this paper (R = 4,6,8 h^-1 Mpc), we found that the Γ distribution gives a better description of the observed P_N than that provided by the log-normal (see Fig. 8). We therefore adopted the Γ_e parametric form to reconstruct the probability-density functions of galaxies. From these reconstruction we studied how their PDF changes according to their absolute luminosity in B-band and we also studied their redshift evolution. We found that little evolution has been detected in the first two redshift bins, while it seems that the density distribution of the galaxy field is strongly evolving in the last redshift bin.

Finally, we used the measured PDF to reconstruct the counting probability (CPDF) that would be observed if VIPERS was not masked by gaps between the VIMOS quadrants.

Acknowledgments

J.B. acknowledges useful discussions with E. Gaztañaga. We acknowledge the crucial contribution of the ESO staff for the management of service observations. In particular, we are deeply grateful to M. Hilker for his constant help and support of this programme. Italian participation in VIPERS has been funded by INAF through PRIN 2008 and 2010 programmes. J.B., L.G. and B.J.G. acknowledge support of the European Research Council through the Darklight ERC Advanced Research Grant (# 291521). OLF acknowledges support of the European Research Council through the EARLY ERC Advanced Research Grant (# 268107). A.P., K.M., and J.K. have been supported by the National Science Centre (Grants UMO-2012/07/B/ST9/04425 and UMO-2013/09/D/ST9/04030), the Polish-Swiss Astro Project (co-financed by a grant from Switzerland, through the Swiss Contribution to the enlarged European Union), the European Associated Laboratory Astrophysics Poland-France HECOLS and a Japan Society for the Promotion of Science (JSPS) Postdoctoral Fellowship for Foreign Researchers (P11802). G.D.L. acknowledges financial support from the European Research Council under the European Community’s Seventh Framework Programme (FP7/2007-2013)/ERC grant agreement no. 202781. W.J.P. and R.T. acknowledge financial support from the European Research Council under the European Community’s Seventh Framework Programme (FP7/2007-2013)/ERC grant agreement no. 202686. W.J.P. is also grateful for support from the UK Science and Technology Facilities Council through Grant ST/I001204/1. E.B., F.M. and L.M. acknowledge the support from grants ASI-INAF I/023/12/0 and PRIN MIUR 2010-2011. C.M. is grateful for support from specific project funding of the Institut Universitaire de France and the LABEX OCEVU.

References

Bel, J., & Marinoni, C. 2012, MNRAS, 424, 971 [NASA ADS] [CrossRef] [Google Scholar]
Bel, J., Marinoni, C., Granett, B. R., et al. (the VIPERS Team) 2014, A&A, 563, A37 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Bernardeau, F., Colombi, S., Gaztaãga, E., & Scoccimarro, R. 2002, Phys. Rep., 367, 1 [NASA ADS] [CrossRef] [EDP Sciences] [MathSciNet] [Google Scholar]
Bottini, D., Garilli, B., Maccagni, D., et al. 2005, PASP, 117, 996 [NASA ADS] [CrossRef] [Google Scholar]
Bouchet, F. R., Strauss, M. A., Davis, M., et al. 1993, ApJ, 417, 36 [NASA ADS] [CrossRef] [Google Scholar]
Cappi, A., Marulli, F., Bel, J., et al. (the VIPERS team) 2015, A&A, 579, A70 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Carruthers, P., Duong-van, M. 1983, Phys. Lett. B, 131, 116 [NASA ADS] [CrossRef] [Google Scholar]
Coles, P., & Jones, B. 1991, MNRAS, 248, 1 [NASA ADS] [CrossRef] [Google Scholar]
Colless, M., Dalton, G., Maddox, S., et al. 2001, MNRAS, 328, 1039 [NASA ADS] [CrossRef] [Google Scholar]
Colombi, S. 1994, ApJ, 435, 536 [NASA ADS] [CrossRef] [Google Scholar]
de la Torre, S., Guzzo, L., Kovac, K., et al. (the ZCOSMOS collaboration) 2010, MNRAS, 409, 867 [NASA ADS] [CrossRef] [Google Scholar]
de la Torre, S., Guzzo, L., Peacock, J. A., et al. (VIPERS team) 2013, A&A, 557, A54 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
di Porto, C., Branchini, E., Bel, J., et al. (VIPERS team) 2014, A&A, submitted [arXiv:1406.6692] [Google Scholar]
Eisenstein, D. J., & Hu, W. 1998, ApJ, 496, 605 [NASA ADS] [CrossRef] [Google Scholar]
Garilli, B., Le Fèvre, O., Guzzo, L., et al. (the VVDS collaboration) 2008, A&A, 486, 683 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Garilli, B., Paioro, L., Scodeggio, M., et al. 2012, PASP, 124, 1232 [NASA ADS] [CrossRef] [Google Scholar]
Garilli, B., Guzzo, L., Scodeggio, M., et al. (the VIPERS team) 2014, A&A, 562, A23 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Gaztañaga, E., Fosalba, P., & Elizalde, E. 2000, ApJ, 539, 522 [NASA ADS] [CrossRef] [Google Scholar]
Greiner, M., & Enβlin, T. A. 2015, A&A, 574, A86 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Guzzo, L., Pierleoni, M., Meneux, B., et al. (the VVDS team) 2008, Nature, 451, 541 [NASA ADS] [CrossRef] [PubMed] [Google Scholar]
Guzzo, L., Scodeggio, M., Garilli, B., et al. (the VIPERS team) 2014, A&A, 566, A108 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Kim, R. S. J., & Strauss, M. A. 1998, ApJ, 493, 39 [NASA ADS] [CrossRef] [Google Scholar]
Layzer, D. 1956, AJ, 61, 383 [NASA ADS] [CrossRef] [Google Scholar]
LeFèvre, O., Saisse, M., Mancini, D., et al. 2003, Proc. SPIE, 4841, 1670 [NASA ADS] [CrossRef] [Google Scholar]
Le Fèvre, O., Vettolani, G., Garilli, B., et al. 2005, A&A, 439, 845 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Lilly, S. J., Le Brun, V., Maier, C., et al. (the ZCOSMOS collaboration) 2009, ApJS, 184, 218 [Google Scholar]
Marchetti, A., Granett, B. R., Guzzo, L., et al. (the VIPERS team) 2013, MNRAS, 428, 1424 [NASA ADS] [CrossRef] [Google Scholar]
Mellier, Y., Bertin, E., Hudelot, P., et al. 2008, The CFHTLS T0005 Release, http://terapix.iap.fr/cplt/oldSite/Descart/CFHTLS-T0005-Release.pdf [Google Scholar]
Mustapha, H., & Dimitrakopoulos, R. 2010, Int. Conf. of Numerical Analysis and Appl. Math., AIP Conf. Proc., 60, 2178 [Google Scholar]
Newman, J. A., Cooper, M. C., Davis, M., et al. (the DEEP2 collaboration) 2013, ApJS, 208, 5 [Google Scholar]
Oke, J. B., & Gunn, J. E. 1983, ApJ, 266, 713 [NASA ADS] [CrossRef] [Google Scholar]
Saslaw, W. C., & Hamilton, A. J. S. 1984, ApJ, 276, 13 [NASA ADS] [CrossRef] [Google Scholar]
Scodeggio, M., Franzetti, P., Garilli, B., et al. 2009, The Messenger, 135, 13 [NASA ADS] [Google Scholar]
Szapudi, I., & Pan, J. 2004, ApJ, 602, 26 [NASA ADS] [CrossRef] [Google Scholar]
Szapudi, I., Meiksin, A., Nichol, R. C., et al. 1996, ApJ, 473, 15 [NASA ADS] [CrossRef] [Google Scholar]
Yang, A., & Saslaw, W. C. 2011, ApJ, 729, 123 [NASA ADS] [CrossRef] [Google Scholar]

Appendix A: Non-linear system

The problem with this system of equations is that it is non-linear, and is therefore difficult to solve. However, it can be reduced to a one-dimensional equation which can be solved numerically.

The two first equations (n = 1 and n = 2) can be used to express the first two cumulants with respect to the third and fourth order ones: $\begin{matrix} σ_{Φ}^{2} & = \\ μ_{Φ} & = \end{matrix}$ $Mathematical equation: \appendix \setcounter{section}{1} \begin{eqnarray} \sigma_\Phi^2 & = & \ln(A_2) + \ln\left( \frac{B_1^2}{B_2}\right) \label{sig} \\ \mu_\Phi & = & -\frac{1}{2}\left [\ln(A_2) + \ln\left( \frac{B_1^4}{B_2}\right) \right ] \label{mu} , \end{eqnarray}$ where B₁ and B₂ are both functions of x and y. Then, using other combinations of the equation, one can express a system of two equations for x and y alone: $\begin{matrix} B_{3}^{2} & = \\ B_{3} B_{1}^{3} & = \end{matrix}$ $Mathematical equation: \appendix \setcounter{section}{1} \begin{eqnarray} B_3^2 & = & a_1B_1^2B_4 \label{eq1}\\ B_3B_1^3 & = & a_2B_2^3,\label{eq2} \end{eqnarray}$ where $a_{1} \equiv \frac{A_{3}^{2}}{A_{4}}$ $Mathematical equation: \hbox{$a_1\equiv \frac{A_3^2}{A_4}$}$ and $a_{2} \equiv \frac{A_{3}}{A_{2}^{3}}$ $Mathematical equation: \hbox{$a_2\equiv \frac{A_3}{A_2^3}$}$ . To properly solve the system, we prefer to express it in terms of one parameter η ≡ B₂/B₁, moreover one can see that polynomials B₁ to B₄ are not independent, as a result. $\begin{matrix} B_{4} = d + a B_{1} + b B_{2} + c B_{3}, \end{matrix}$ $Mathematical equation: \appendix \setcounter{section}{1} \begin{eqnarray*} B_4=d + aB_1 + bB_2 + cB_3, \end{eqnarray*}$ where $a = 96,b = - 32,c = \frac{224}{27},d = - \frac{1925}{27}$ $Mathematical equation: \hbox{$a=96, b=-32, c=\frac{224}{27}, d=-\frac{1925}{27}$}$ and which can be substituted in Eq. (A.3). Combining Eqs. (A.3) and (A.4) one obtains a parametric equation for B₁ $(a + bη) B_{1}^{3} + (d + cf (η)) B_{1}^{2} - g (η) = 0,$ $Mathematical equation: \appendix \setcounter{section}{1} \begin{equation} (a+b\eta)B_1^3 + (d+cf(\eta))B_1^2 - g(\eta)=0, \label{b1param} \end{equation}$ (A.5)which can be solved for each value of the parameter η and an independent parametric equation for B₃ $\begin{matrix} B_{3} = f (η) . \end{matrix}$ $Mathematical equation: \appendix \setcounter{section}{1} \begin{eqnarray*} B_3=f(\eta). \end{eqnarray*}$ As a result we can find a couple B₁, B₃ for each value of the parameter η. It follows that one can express x and y with respect to η and, given the definition of η, the possible solution x and y must satisfy the condition $\begin{matrix} B_{2} [x (η),y (η)] - η B_{1} [x (η),y (η)] = 0, \end{matrix}$ $Mathematical equation: \appendix \setcounter{section}{1} \begin{eqnarray*} B_2[x(\eta),y(\eta)]-\eta B_1[x(\eta),y(\eta)]=0, \end{eqnarray*}$ which gives the possible values of η from which one can recover x and y. Finally, from Eqs. (A.1) and (A.2) we can compute the values of σ_Φ and μ_Φ that correspond to each couple (x, y) of the solutions. This allows us to select the solution which provides a value of A₅ that is closer to the observed one.

Once the values of the cumulants μ_Φ, $σ_{Φ}^{2}$ $Mathematical equation: \hbox{$\sigma_\Phi^2$}$ , ⟨ Φ³ ⟩ _c and ⟨ Φ⁴ ⟩ _c are known from the process detailed above, we know that the moments of the corresponding $P_{N}^{th}$ $Mathematical equation: \hbox{$P_N^{\rm th}$}$ will match those of the observed up to order 4. In the end, we can check whether the SLN distribution provides a good match to data by numerically integrating the probability-density function that was convolved with the Poisson kernel K (see Eq. (5)).

Appendix B: Generating function

We show that the CPDF that was associated with a Gamma expanded PDF can be calculated analytically from an expression that depends explicitly on the coefficients c_i of the Gamma expansion.

Be $Mathematical equation: \hbox{$\mathcal{G}_N$}$ the generating function associated to the probability distribution P_N, it is defined as $𝒢_{N} (λ) \equiv \sum_{i = 0}^{\infty} λ^{N} P_{N} .$ $Mathematical equation: \appendix \setcounter{section}{2} \begin{equation} \mathcal{G}_N(\lambda)\equiv\sum_{i=0}^{\infty}\lambda^NP_N. \label{gener} \end{equation}$ (B.1)In case of the Poisson sampling of a Gamma distribution, after some algebra, one can show that it can be expressed with respect to the coefficients of the Gamma expansion as $𝒢_{N} (λ) = \frac{1}{Γ (k)} \sum_{i = 0}^{n} c_{i} F_{i} (γ),$ $Mathematical equation: \appendix \setcounter{section}{2} \begin{equation} \mathcal{G}_N(\lambda)=\frac{1}{\Gamma(k)}\sum_{i=0}^nc_iF_i(\gamma), \label{gn} \end{equation}$ (B.2)where γ ≡ (1 − λ)θ and $\begin{matrix} F_{i} (γ) \equiv \int_{0}^{\infty} x^{k - 1} e^{- x} L_{i}^{(k - 1)} (x) e^{- γx} d x . \end{matrix}$ $Mathematical equation: \appendix \setcounter{section}{2} \begin{eqnarray*} F_i(\gamma)\equiv \int_{0}^{\infty}x^{k-1}{\rm e}^{-x}L_i^{(k-1)}(x){\rm e}^{-\gamma x}\dif x. \end{eqnarray*}$ Nevertheless, this integral can be computed using the Laguerre expansion of the exponential $\begin{matrix} e^{- γx} = \sum_{i = 0}^{\infty} \frac{γ^{i}}{(1 + γ)^{i + α + 1}} L_{i}^{(α)} (x), \end{matrix}$ $Mathematical equation: \appendix \setcounter{section}{2} \begin{eqnarray*} {\rm e}^{-\gamma x}=\sum_{i=0}^{\infty}\frac{\gamma^i}{(1+\gamma)^{i+\alpha+1}}L_i^{(\alpha)}(x), \end{eqnarray*}$ where it reads to $F_{i} (γ) = \frac{γ^{i}}{(1 + γ)^{i + k}} \frac{Γ (i + k)}{i!} \cdot$ $Mathematical equation: \appendix \setcounter{section}{2} \begin{equation} F_i(\gamma)=\frac{\gamma^i}{(1+\gamma)^{i+k}}\frac{\Gamma(i+k)}{i!}\cdot \label{fi} \end{equation}$ (B.3)The formal expression of the generating function is therefore given by $𝒢_{N} (λ) = \frac{(1 + γ)^{- k}}{Γ (k)} \sum_{i = 0}^{n} c_{i} \frac{Γ (i + k)}{i!} (\frac{γ}{1 + γ}),$ $Mathematical equation: \appendix \setcounter{section}{2} \begin{equation} \mathcal{G}_N(\lambda)=\frac{(1+\gamma)^{-k}}{\Gamma(k)}\sum_{i=0}^nc_i\frac{\Gamma(i+k)}{i!}\left (\frac{\gamma}{1+\gamma} \right ), \label{expli} \end{equation}$ (B.4)where we still use γ = (1 − λ)θ. From the explicit expression of the moment-generating function (Eq. (B.4)) one can get the probability distribution P_N by iteratively deriving the generating function with respect to γ: $\begin{matrix} P_{N} \equiv \frac{1}{N!} {\frac{d^{N} 𝒢_{N} (λ)}{d λ^{N}} |}_{λ = 0} = \frac{(- θ)^{N}}{N!} {\frac{d^{N} 𝒢_{N} (γ)}{d γ^{N}} |}_{γ = θ} \cdot \end{matrix}$ $Mathematical equation: \appendix \setcounter{section}{2} \begin{eqnarray*} P_N\equiv \frac{1}{N!}\left.\frac{\dif^N \mathcal{G}_N(\lambda)}{\dif \lambda^N}\right|_{\lambda=0}=\frac{(-\theta)^N}{N!}\left.\frac{\dif^N \mathcal{G}_N(\gamma)}{\dif \gamma^N}\right|_{\gamma=\theta}\cdot \end{eqnarray*}$ These derivatives can be calculated explicitly.

Appendix C: Synthetic galaxy catalogues

We describe here how we generate synthetic galaxy catalogues from Gaussian realizations. The first requirement of these catalogues is that they must be characterised by a known power spectrum and 1-point probability-distribution function. The second requirement is that the probability-distribution function must be measurable.

The basic idea is simple, we generate a Gaussian random field in Fourier space (assuming a power spectrum), we inverse Fourier transform it to get its analog in configuration space. We further apply a local transform to map the Gaussian field into a stochastic field that is characterised by the target PDF. The two crucial steps of this process are the choice of the input power spectrum and the choice of the local transform.

Be ν a stochastic field following a centered (⟨ ν ⟩ = 0) reduced ( $σ_{ν}^{2} \equiv ⟨ ν^{2} ⟩_{c} = 1$ $Mathematical equation: \hbox{$\sigma_\nu^2\equiv\langle\nu^2\rangle_{\rm c}=1$}$ ) Gaussian distribution. From a realization of this field, one can generate a non-Gaussian density field δ by applying a local mapping L between the two, hence $δ = L (ν) .$ $Mathematical equation: \appendix \setcounter{section}{3} \begin{equation} \delta = L(\nu). \label{local} \end{equation}$ (C.1)The local transform L must be chosen to match some target PDF P_δ for the density contrast δ. Assuming that the local transform is a monotonic function, which maps the ensemble ]−∞, +∞[ into ]−1, +∞[ then, owing to the probability conservation P_δ(δ)dδ = P_ν(ν)dν, the local transform must verify the following matching: $𝒞_{δ} [δ] = 𝒞_{ν} [ν],$ $Mathematical equation: \appendix \setcounter{section}{3} \begin{equation} \mathcal{C}_\delta[\delta] = \mathcal{C}_\nu[\nu], \label{matching} \end{equation}$ (C.2)where $Mathematical equation: \hbox{$\mathcal{C}_x$}$ stands for the cumulative probability distribution function. If [ a,b ] is the definition assemble of the variable x, then its cumulative probability distribution function is defined as $𝒞_{x} [x] \equiv {^{\int}}_{a}^{x} P_{x} (x^{'}) d x^{'}$ $Mathematical equation: \hbox{$\mathcal{C}_x[x]\equiv \int_{a}^xP_x(x^\prime)\dif x^\prime$}$ , where P_x is the PDF of x. By definition a probability-density function is positive, it follows that its cumulative is a monotonic function and therefore Eq. (C.2) can always be inverted to read $\begin{matrix} δ = 𝒞_{δ}^{-1} [𝒞_{ν} (ν)], \end{matrix}$ $Mathematical equation: \appendix \setcounter{section}{3} \begin{eqnarray*} \delta=\mathcal{C}_\delta^{-1}\left [ \mathcal{C}_\nu (\nu ) \right], \end{eqnarray*}$ where the exponent −1 stands for the reciprocal function such that F^-1[F(x)] = x. For example, by defining the local mapping L, this allows a normal distribution to transform into a log-normal distribution, which is δ = e^ν − 1. We note that, depending on the PDF to be matched, this inversion can require a numerical evaluation, which can be tabulated.

Once a local transform is chosen, we need to address the question of finding the appropriate power spectrum of the Gaussian field ν which, once locally mapped into the density field δ, will match the expected power spectrum. Following Greiner & Enβlin (2015), who considered a log-transform, we generalised their result to a generic local transformation. This mapping is not directly in Fourier space although it is in configuration. Writing the two-point moment of order two of the density field δ and assuming the probability conservation leads to $ξ_{δ} \equiv ⟨ δ_{1} δ_{2} ⟩ = \int L (ν_{1}) L (ν_{2}) ℬ (ν_{1}, ν_{2}, ξ_{ν}) d ν_{1} d ν_{2},$ $Mathematical equation: \appendix \setcounter{section}{3} \begin{equation} \xi_\delta\equiv \langle\delta_1\delta_2\rangle=\int {L}(\nu_1) {L}(\nu_2) \mathcal{B}(\nu_1,\nu_2,\xi_\nu)\dif \nu_1\dif \nu_2, \label{gaussint} \end{equation}$ (C.3)where ℬ is a bivariate Gaussian defined as $ℬ (ν_{1}, ν_{2}, ξ_{ν}) \equiv \frac{1}{2 π | C_{ν} |^{1 / 2}} \exp {- \frac{1}{2} ν^{T} C_{ν}^{-1} ν} .$ $Mathematical equation: \appendix \setcounter{section}{3} \begin{equation} \mathcal{B}(\nu_1,\nu_2,\xi_\nu)\equiv \frac{1}{2\pi|C_\nu|^{1/2}}\mathrm{exp}\left\{ -\frac{1}{2} {\bf \nu}^T C_\nu^{-1} {\bf \nu}\right\}. \label{biv} \end{equation}$ (C.4)We notice that, in our case (central reduced Gaussian), the covariance matrix C_ν takes the simple form $C_{ν} = {}^{[}{\begin{matrix} 1 ξ_{ν} \\ ξ_{ν} 1 \end{matrix}}^{]}$ $Mathematical equation: \hbox{$C_\nu = \left[ \substack{1 \,\,\, \xi_\nu \\ \xi_\nu \,\,\, 1} \right]$}$ . Once integrated over the definition domain of ν₁ and ν₂, Eq. (C.3) provides a mapping between the two-point correlation function of the Gaussian field ν and the two-point correlation function of the density field δ. However, we prefer to rotate the coordinate system before performing the integral (C.3) because, in the case of high correlation (~1), the Gaussian will be comparable with a straight line – and most of the sampling of this function will be useless. That is why we look for the rotation that allows us to diagonalise the matrix C_ν and therefore convert ν into a new variable x. It follows that $\begin{matrix} C_{x} = [\begin{matrix} \end{matrix}] \end{matrix}$ $Mathematical equation: \appendix \setcounter{section}{3} \begin{eqnarray*} C_x=\left [ \begin{array}{cc} 1 -\xi_\nu & 0\\ 0 & 1+\xi_\nu \end{array} \right] \end{eqnarray*}$ and the integral becomes $ξ_{δ} = \frac{1}{2 π \sqrt{1 - ξ_{ν}^{2}}} \int L (\frac{x_{2} - x_{1}}{2}) L (\frac{x_{2} + x_{1}}{2}) e^{- \frac{1}{2} (\frac{x_{1}^{2}}{σ_{1}^{2}} + \frac{x_{2}^{2}}{σ_{2}^{2}})} d x_{1} d x_{2},$ $Mathematical equation: \appendix \setcounter{section}{3} \begin{equation} \xi_\delta=\frac{1}{2\pi\sqrt{1-\xi_\nu^2}}\int {L}\left(\frac{x_2-x_1}{2}\right) {L}\left(\frac{x_2+x_1}{2}\right) {\rm e}^{-\frac{1}{2} \left( \frac{x_1^2}{\sigma_1^2} + \frac{x_2^2}{\sigma_2^2} \right)}\dif x_1\dif x_2, \label{gaussintx} \end{equation}$ (C.5)where $σ_{1}^{2} = 1 - ξ_{ν}$ $Mathematical equation: \hbox{$\sigma_1^2=1-\xi_\nu$}$ and $σ_{2}^{2} = 1 + ξ_{ν}$ $Mathematical equation: \hbox{$\sigma_2^2=1+\xi_\nu$}$ . We can therefore integrate over a bounded domain that corresponds to the −8σ₁, 8σ₁ along the x₁ axis and −8σ₂, 8σ₂ along the x₂ axis. Another possibility for perform the integral (C.3) is to use Mehler’s formula. By doing so, one can show that the two-point correlation of the density field can be expressed as a Taylor expansion on the two-point correlation function of the ν field. Thihs reads $ξ_{δ} = λ (ξ_{ν}) \equiv \sum_{n = 0}^{\infty} n! c_{n}^{2} ξ_{ν}^{n},$ $Mathematical equation: \appendix \setcounter{section}{3} \begin{equation} \xi_\delta=\lambda(\xi_\nu)\equiv\sum_{n=0}^\infty n!c_n^2\xi_\nu^n, \label{xiexp} \end{equation}$ (C.6)where the c_n are the coefficients of the Hermit transform of the local mapping $L (ν) = \sum_{n = 0}^{\infty} c_{n} H_{n} (ν)$ $Mathematical equation: \hbox{$L(\nu)=\sum_{n=0}^\infty c_nH_n(\nu)$}$ . The c_n coefficients can be calculated using the orthogonal properties of Hermit polynomials $c_{n} = \frac{1}{n!} \int_{- \infty}^{+ \infty} L (ν) H_{n} (ν) P_{ν} (ν) d ν .$ $Mathematical equation: \appendix \setcounter{section}{3} \begin{equation} c_n=\frac{1}{n!}\int_{-\infty}^{+\infty}L(\nu)H_n(\nu)P_\nu(\nu)\dif\nu. \label{cn2} \end{equation}$ (C.7)The latter approach speeds up the numerical evaluation of Eq. (C.5) considerably. It allows us to compute the 2D integral as a finite sum of 1D integrals. It also allows us to verify that, when the two-point function of the field ν is positive, then the derivative of ξ_δ with respect to ξ_ν is positive. Moreover, from Eq. (C.3) we can see that ξ_ν = 0 implies ξ_δ = 0. This means that the function that transforms ξ_ν into ξ_δ is invertible as long as ξ_δ is positive. On the other hand, we know that the zero-crossing of the two-point correlation function occurs at very large scales at which one can safely assume that | ξ_δ | ≪ 1. Thus, continuing along this train of thought, we can truncate Eq. (C.6) at order one, which provides a linear relation between ξ_δ and ξ_ν. As a result, we can take the reciprocal of the function λ such that ξ_ν = λ^-1(ξ_δ).

Once the local transform L and the two-point correlation mapping λ are known, then the input power spectrum of the Gaussian field ν can be obtained as follows. We choose a power spectrum P(k), in the present case Eisenstein & Hu (1998), for the density field δ. We calculate its corresponding 2-point correlation function $ξ_{δ} = \int P (k) e^{i k \cdot r} d^{3} k .$ $Mathematical equation: \appendix \setcounter{section}{3} \begin{equation} \xi_\delta=\int P(k){\rm e}^{{\rm i} \vec k\cdot \vec r}\dif^3\vec k. \label{fback} \end{equation}$ (C.8)At each scale r, we can deduce the two-point correlation function of the Gaussian field ξ_ν = λ^-1(ξ_δ) and finally, using a Fourier transform, we obtain the input power spectrum $P_{in} (k) = \frac{1}{(2 π)^{3}} \int ξ_{ν} (r) e^{- i k \cdot r} d^{3} r .$ $Mathematical equation: \appendix \setcounter{section}{3} \begin{equation} P_{\rm in}(k)=\frac{1}{(2\pi)^3}\int \xi_\nu(r){\rm e}^{-{\rm i} \vec k\cdot \vec r}\dif^3\vec r. \label{fback2} \end{equation}$ (C.9)Finally, to make sure that the PDF target will be reproduced, we need to verify that its integral is indeed equal to the expected variance in the size of the mesh, once the input power spectrum P_in(k) has been set up on a regular k-space grid, which will be used to generate the Gaussian field. Thus, $σ̂ \begin{matrix} 2 \\ a \end{matrix} = (\frac{2 π}{L})^{3} \sum_{n} P (k_{n})$ $Mathematical equation: \hbox{$\hat\sigma_a^2=(\frac{2\pi}{L})^3\sum_{n}P(\vec k_n)$}$ should be equal to $σ_{a}^{2} =^{\int} P (k) d^{3} k$ $Mathematical equation: \hbox{$\sigma_a^2=\int P(k)\dif^3\vec k$}$ . In general, σ_a and $Mathematical equation: \hbox{$\hat\sigma_a$}$ are not equal, in which case we renormalise the target power spectrum by the quantity $S = σ̂ \begin{matrix} 2 \\ a \end{matrix} / σ_{a}^{2}$ $Mathematical equation: \hbox{$S=\hat\sigma_a^2/\sigma_a^2$}$ , $Mathematical equation: \hbox{$\hat P_{\rm in}(k)=SP_{\rm lin}(k)$}$ .

We generate a Gaussian field (with a flat power spectrum) on a regular mesh of a = 0.95 h^-1 Mpc and a comoving box of 500³h^-3 Mpc³. We then Fourier transform with an FFT and keep only the phases of the field ν_k = e^iθ(k). At each position k_n, we generate the value of the modulus of $ν_{k} = \sqrt{X_{k}} e^{i θ (k)}$ $Mathematical equation: \hbox{$\nu_{\vec k}=\sqrt{X_k}{\rm e}^{{\rm i}\theta(\vec k)}$}$ , where $Mathematical equation: \hbox{$X_k=-\hat P_{\rm in}(k)\ln(1-\epsilon)$}$ and ϵ is a random number with a uniform probability distribution between 0 and 1. We then inverse Fourier transform the field to get a centred reduced Gaussian field. In Fig. C.1, we show the input power spectrum of the Gaussian field ν compared to the one that was measured using a FFT, and to the one expected from the local transformation, which was applied to the ν field to generate the density field δ.

Fig. C.1

Upper: grey dotted lines show the power spectrum measured in each of the 20 fake galaxy distributions, the black solid line represent their average and the errors display the dispersion of the measurements. The blue long dashed line displays the input power spectrum used too generate the Gaussian stochastic field nu and the red dashed line shows the corresponding expectation value for the power spectrum of the density contrast δ. Lower: shows the deviation between the measured power spectrum of the δ-field and the expected one.

All Tables

Table 1

Magnitude selected objects (in B-band) in the VIPERS PDR-1.

In the text

Table 2

List of the magnitude selected objects (in B-band) in the mock catalogues.

In the text

Table 3

Coefficients of the Γ_e expansion, which describe the VIPERS PDR-1 data for R = 6 h^-1 Mpc.

In the text

All Figures

	Fig. 3 Same as in Fig. 2, but we use only 10% of the galaxies contained in the fake galaxy catalogues. As a result, the average number of galaxies per cell drops from $Mathematical equation: \hbox{$\bar N=8$}$ to $Mathematical equation: \hbox{$\bar N=0.8$}$ .
In the text

Fig. 4

In the text

	Fig. 5 Comparison between the SLN and Γ_e methods at 0.7 <z< 0.9. Each panel corresponds to a cell radius R of 4, 6, and 8 h^-1 Mpc from left to right.
In the text

	Fig. 6 Comparison between the SLN and Γ_e methods at 0.5 <z< 0.7. Each panel corresponds to a cell radius R of 4, 6, and 8 h^-1 Mpc from left to right.
In the text

	Fig. 7 Comparison between the SLN and Γ_e methods. Each column corresponds to a cell radius R of 4, 6, and 8 h^-1 Mpc from left to right, and each row corresponds to a combination of redshift and magnitude cut.
In the text

Fig. 8

In the text

	Fig. 9 Top: reconstructed PDF applying the Γ_e method in three redshift bins (from left to right) at the intermediate smoothing scale R = 6 h^-1 Mpc. Bottom: underlying PDF corresponding to the CPDF in the top panel, for each luminosity cut the 1-sigma uncertainty is represented by the dotted lines.
In the text

	Fig. 10 Evolution of three galaxy populations, selected according to their luminosity (from bottom to top). On each panel, the black solid, red dashed, and cyan dot-dashed lines represent, respectively, the three redshift bins 0.5 <z< 0.7, 0.7 <z< 0.9, and, 0.9 <z< 1.1.
In the text

Fig. C.1

In the text

Current usage metrics show cumulative count of Article Views (full-text article views including HTML views, PDF and ePub downloads, according to the available data) and Abstracts Views on Vision4Press platform.

Data correspond to usage on the plateform after 2015. The current usage metrics is available 48-96 hours after online publication and is updated daily on week days.

Initial download of the metrics may take a while.

[R1] Bel, J., & Marinoni, C. 2012, MNRAS, 424, 971 [NASA ADS] [CrossRef] [Google Scholar]

[R2] Bel, J., Marinoni, C., Granett, B. R., et al. (the VIPERS Team) 2014, A&A, 563, A37 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[R3] Bernardeau, F., Colombi, S., Gaztaãga, E., & Scoccimarro, R. 2002, Phys. Rep., 367, 1 [NASA ADS] [CrossRef] [EDP Sciences] [MathSciNet] [Google Scholar]

[R4] Bottini, D., Garilli, B., Maccagni, D., et al. 2005, PASP, 117, 996 [NASA ADS] [CrossRef] [Google Scholar]

[R5] Bouchet, F. R., Strauss, M. A., Davis, M., et al. 1993, ApJ, 417, 36 [NASA ADS] [CrossRef] [Google Scholar]

[R6] Cappi, A., Marulli, F., Bel, J., et al. (the VIPERS team) 2015, A&A, 579, A70 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[R7] Carruthers, P., Duong-van, M. 1983, Phys. Lett. B, 131, 116 [NASA ADS] [CrossRef] [Google Scholar]

[R8] Coles, P., & Jones, B. 1991, MNRAS, 248, 1 [NASA ADS] [CrossRef] [Google Scholar]

[R9] Colless, M., Dalton, G., Maddox, S., et al. 2001, MNRAS, 328, 1039 [NASA ADS] [CrossRef] [Google Scholar]

[R10] Colombi, S. 1994, ApJ, 435, 536 [NASA ADS] [CrossRef] [Google Scholar]

[R11] de la Torre, S., Guzzo, L., Kovac, K., et al. (the ZCOSMOS collaboration) 2010, MNRAS, 409, 867 [NASA ADS] [CrossRef] [Google Scholar]

[R12] de la Torre, S., Guzzo, L., Peacock, J. A., et al. (VIPERS team) 2013, A&A, 557, A54 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[R13] di Porto, C., Branchini, E., Bel, J., et al. (VIPERS team) 2014, A&A, submitted [arXiv:1406.6692] [Google Scholar]

[R14] Eisenstein, D. J., & Hu, W. 1998, ApJ, 496, 605 [NASA ADS] [CrossRef] [Google Scholar]

[R15] Garilli, B., Le Fèvre, O., Guzzo, L., et al. (the VVDS collaboration) 2008, A&A, 486, 683 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[R16] Garilli, B., Paioro, L., Scodeggio, M., et al. 2012, PASP, 124, 1232 [NASA ADS] [CrossRef] [Google Scholar]

[R17] Garilli, B., Guzzo, L., Scodeggio, M., et al. (the VIPERS team) 2014, A&A, 562, A23 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[R18] Gaztañaga, E., Fosalba, P., & Elizalde, E. 2000, ApJ, 539, 522 [NASA ADS] [CrossRef] [Google Scholar]

[R19] Greiner, M., & Enβlin, T. A. 2015, A&A, 574, A86 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[R20] Guzzo, L., Pierleoni, M., Meneux, B., et al. (the VVDS team) 2008, Nature, 451, 541 [NASA ADS] [CrossRef] [PubMed] [Google Scholar]

[R21] Guzzo, L., Scodeggio, M., Garilli, B., et al. (the VIPERS team) 2014, A&A, 566, A108 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[R22] Kim, R. S. J., & Strauss, M. A. 1998, ApJ, 493, 39 [NASA ADS] [CrossRef] [Google Scholar]

[R23] Layzer, D. 1956, AJ, 61, 383 [NASA ADS] [CrossRef] [Google Scholar]

[R24] LeFèvre, O., Saisse, M., Mancini, D., et al. 2003, Proc. SPIE, 4841, 1670 [NASA ADS] [CrossRef] [Google Scholar]

[R25] Le Fèvre, O., Vettolani, G., Garilli, B., et al. 2005, A&A, 439, 845 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[R26] Lilly, S. J., Le Brun, V., Maier, C., et al. (the ZCOSMOS collaboration) 2009, ApJS, 184, 218 [Google Scholar]

[R27] Marchetti, A., Granett, B. R., Guzzo, L., et al. (the VIPERS team) 2013, MNRAS, 428, 1424 [NASA ADS] [CrossRef] [Google Scholar]

[R28] Mellier, Y., Bertin, E., Hudelot, P., et al. 2008, The CFHTLS T0005 Release, http://terapix.iap.fr/cplt/oldSite/Descart/CFHTLS-T0005-Release.pdf [Google Scholar]

[R29] Mustapha, H., & Dimitrakopoulos, R. 2010, Int. Conf. of Numerical Analysis and Appl. Math., AIP Conf. Proc., 60, 2178 [Google Scholar]

[R30] Newman, J. A., Cooper, M. C., Davis, M., et al. (the DEEP2 collaboration) 2013, ApJS, 208, 5 [Google Scholar]

[R31] Oke, J. B., & Gunn, J. E. 1983, ApJ, 266, 713 [NASA ADS] [CrossRef] [Google Scholar]

[R32] Saslaw, W. C., & Hamilton, A. J. S. 1984, ApJ, 276, 13 [NASA ADS] [CrossRef] [Google Scholar]