Issue | A&A, Volume 578, June 2015
---|---
Article Number | A50
Number of page(s) | 10
Section | Cosmology (including clusters of galaxies)
DOI | https://doi.org/10.1051/0004-6361/201424905
Published online | 02 June 2015
A new data compression method and its application to cosmic shear analysis
1 SUPA, Institute for Astronomy, University of Edinburgh, Royal Observatory, Blackford Hill, Edinburgh EH9 3HJ, UK
e-mail: ma@roe.ac.uk
2 Argelander-Institut für Astronomie, Universität Bonn, Auf dem Hügel 71, 53121 Bonn, Germany
Received: 2 September 2014
Accepted: 14 January 2015
Context. Future large scale cosmological surveys will provide huge data sets whose analysis requires efficient data compression. In particular, the calculation of accurate covariances is extremely challenging with an increasing number of observables used in the statistical analysis.
Aims. The primary aim of this paper is to introduce a formalism for achieving efficient data compression, based on a local expansion of the observables around a fiducial cosmological model. We specifically apply and test this approach for the case of cosmic shear statistics. In addition, we study how well band powers can be obtained from measuring shear correlation functions over a finite interval of separations.
Methods. We demonstrate the performance of our approach, using a Fisher analysis on cosmic shear tomography described in terms of E-/B-mode separating statistics (COSEBIs).
Results. We show that our data compression is highly effective in extracting essentially the full cosmological information from a strongly reduced number of observables. Specifically, the number of observables needed decreases by at least one order of magnitude relative to the COSEBIs, which already compress the data substantially compared to the shear two-point correlation functions. The efficiency appears to be affected only slightly if a highly inaccurate covariance is used for defining the compressed data vector, showing the robustness of the method. In addition, we show the strong limitations on the possibility of constructing top-hat filters in Fourier space, for which the real-space analog has a finite support, yielding strong bounds on the accuracy of band power estimates.
Conclusions. We conclude that efficient data compression is achievable and that the number of compressed data points depends on the number of model parameters. Furthermore, a band convergence power spectrum inferred from a finite angular range cannot be accurately estimated. The error on an estimated band power is larger for a narrower filter and a narrower angular range, which for relevant cases can be as large as 10% for Δℓ/ℓ ~ 0.1.
Key words: gravitational lensing: weak / methods: numerical / methods: statistical / cosmology: observations
© ESO, 2015
1. Introduction
Future cosmological surveys are faced with the difficulty of extracting cosmological parameters from their wealth of observables. Taking Euclid as an example, statistics to be obtained from the data include second-order shear statistics across several populations of source galaxies, which, following common usage, will be termed “redshift bins” throughout this paper. As shown in Schneider et al. (2010) and Asgari et al. (2012), the COSEBIs (Complete Orthogonal Sets of E-/B-mode Integrals) form appropriate combinations of the shear two-point correlation functions ξ±(θ) that cleanly separate E- and B-mode shear (see, e.g., Crittenden et al. 2002; Schneider et al. 2002b). In addition, COSEBIs are highly efficient in terms of data compression, since essentially all cosmological information is contained in a few COSEBIs modes (see, e.g., Kilbinger et al. 2013; Huff et al. 2014, for applications of COSEBIs to cosmic shear data sets).
The efficiency of data compression decreases, however, if several populations of sources are used. For example, with about ten redshift bins, the total number of COSEBIs that should be used to extract cosmological information is approximately 500. Furthermore, higher-order shear statistics contain additional, valuable information, both regarding cosmological parameters and for calibrating the shear data, and should be taken into account. Since third-order shear statistics depend on three variables (say, three sides of a triangle) and on combinations of three redshift bins, the number of observables for third-order shear statistics that need to be considered is almost certainly considerably larger than for second-order shear statistics. In addition, shear-peak statistics have been shown to yield powerful constraints and should likewise be considered (see, e.g., Marian et al. 2013, and references therein). Therefore, the number of pure shear observables will be several thousand, although the number of cosmological parameters to be determined is around a dozen.
In practice, issues are even more complicated in that astrophysical and other systematics need to be accounted for. For example, the effects of intrinsic alignments (see, e.g., Joachimi & Bridle 2010, and references therein) need to be mitigated by including further observables, i.e., the galaxy-galaxy lensing signal and the galaxy correlation functions. Even if one uses a COSEBIs-like data compression for them (e.g., Eifler et al. 2014), the number of redshift combinations will still lead to a strongly enhanced number of observables.
One of the major difficulties in analyzing these data is to determine the expectation values for these observables as a function of the parameters and, in particular, to estimate their covariance matrix. If one determines the covariance as a sample variance of different numerical realizations, one needs many more realizations than the dimension of the data vector in order to get a reliable estimate of the covariance matrix and its inverse (see, e.g., Hartlap et al. 2007). Because of this difficulty, data compression is mandatory for any analysis of survey data. Taylor & Joachimi (2014) have found relations between the accuracy of the estimated covariance, the number of simulations, and the number of observables for Gaussian-distributed data and underlying parameters. Their work also suggests that data compression is essential for reducing the number of simulations required for reaching a certain accuracy in the covariance.
In this paper, we suggest a form of data compression that is based on the sensitivity of the various observables to the parameters that are to be estimated. The cosmological parameters are already tightly constrained and will be even more strongly constrained by the time Euclid is launched, so only a relatively small volume in parameter space needs to be explored. We therefore assume that the relevant parameter region is small. This allows us to define linear combinations of observables, based on a low-order Taylor expansion of the dependence of these observables on the parameters, that should contain almost all the cosmological information in the data.
In the following section, we introduce our data compression formalism for general observables (statistics). We then apply this method to COSEBIs in Sect. 3 to study how this strategy works for compressing them. In Sect. 4 we specify our cosmological model, which is used for the results section. In Sect. 5 we first illustrate the weight functions for the compressed statistics made of COSEBIs, and then use a Fisher formalism to compare the efficiency of the compressed and the regular COSEBIs. Section 6 is dedicated to mimicking a band power spectrum using linear combinations of COSEBIs. Finally, we conclude in Sect. 7.
2. Formalism
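Let $\hat X_n$, 1 ≤ n ≤ N, be the statistics obtained from the data, with expectation values $X_n(\phi_\mu)$, where the φμ, 1 ≤ μ ≤ P, denote the parameters of the model, including the cosmological parameters and others. Assuming that the uncertainty in the parameters is “small”, we consider an expansion of the functions Xn(φμ) around the fiducial values $\phi_\mu^{\rm f}$,

$$X_n(\phi_\mu) = X_n^{\rm f} + D_{n\mu}\,p_\mu + \frac{1}{2}\,Z_{n\mu\nu}\,p_\mu\,p_\nu + \mathcal{O}(p^3), \tag{1}$$

where $p_\mu = \phi_\mu - \phi_\mu^{\rm f}$, $X_n^{\rm f} = X_n(\phi_\mu^{\rm f})$, and

$$D_{n\mu} = \left.\frac{\partial X_n}{\partial \phi_\mu}\right|_{\phi^{\rm f}}, \qquad Z_{n\mu\nu} = \left.\frac{\partial^2 X_n}{\partial \phi_\mu\,\partial \phi_\nu}\right|_{\phi^{\rm f}} \tag{2}$$

are the first and second derivatives of the expectation values with respect to the model parameters, taken at the fiducial point in parameter space. Here and below, summation over repeated indices is implied, unless noted otherwise.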
We assume that the likelihood ℒ(χ2) is a monotonically decreasing function of

$$\chi^2 = \left(\hat X_m - X_m\right) C^{-1}_{mn} \left(\hat X_n - X_n\right), \tag{3}$$

where C is the covariance matrix of the observables $\hat X_n$. Maximizing the likelihood then requires finding the minimum of χ2 with respect to the parameters; using Eq. (1), we obtain

$$\left(D_{m\mu} + Z_{m\mu\nu}\,p_\nu\right) C^{-1}_{mn} \left(\hat X_n - X_n^{\rm f} - D_{n\kappa}\,p_\kappa - \frac{1}{2}\,Z_{n\kappa\lambda}\,p_\kappa\,p_\lambda\right) = 0. \tag{4}$$

In this equation, we have neglected the dependence of the covariance matrix on the parameters, either because C is determined from the data itself or because the dependence of C on the parameters is assumed to be weak. From Eq. (4), we see that the determination of the parameters pμ involves the observables $\hat X_n$ only in the linear combinations

$$\hat F_\mu = D_{m\mu}\,C^{-1}_{mn}\,\hat X_n, \qquad \hat S_{\mu\nu} = Z_{m\mu\nu}\,C^{-1}_{mn}\,\hat X_n, \tag{5}$$

with expectation values

$$F_\mu = D_{m\mu}\,C^{-1}_{mn}\,X_n, \qquad S_{\mu\nu} = Z_{m\mu\nu}\,C^{-1}_{mn}\,X_n. \tag{6}$$

Thus, the expansion of the expectation values of the original observables $\hat X_n$ around a fiducial model motivates the definition of linear combinations of observables that contain all the information about the parameters φκ, provided the second-order expansion is accurate. The set (5) of P + P(P + 1) / 2 = P(P + 3) / 2 observables is thus expected to allow for an efficient data compression (note that Ŝμν = Ŝνμ).
To obtain the new observables $\hat F_\mu$ and Ŝκν, one first needs to estimate the covariance C of the original observables, which is a real challenge owing to the high dimensionality of the data in future cosmological surveys. However, the covariance C is needed here for the definition of appropriate combinations of observables, and not for parameter estimates. As a result, an approximation for C may be expected to be sufficient for this purpose. Disregarding the parameter dependence of C in the derivation of Eq. (4) provides such an approximation, which avoids the need to obtain a large covariance matrix for more than one cosmological model. If the approximation for C deviates substantially from the true covariance, we expect that the new observables do not contain the full information about the parameters, since they deviate from the “optimal” combination of the original $\hat X_n$. Hence, the better the initial estimate of C, the more efficient the new observables will be.
We therefore propose a strategy of first obtaining an approximation for the covariance C, based on which the new observables $\hat F_\mu$ and Ŝκν are defined. The number of these observables is substantially smaller than the number of original ones, so an accurate estimate of their covariance can be obtained from fewer simulations than are needed for C. On the other hand, the number of new observables is substantially larger than the number of parameters, which is expected to mitigate the effect of choosing non-optimal combinations from an approximate form of C. It is for this reason that we consider the second-order derivatives of the original observables; the first-order ones coincide with those of the Karhunen–Loève method for the case of known covariance (see, e.g., Tegmark et al. 1997).
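We now combine the new observables $\hat F_\mu$ and Ŝκν into the N′ = P(P + 3) / 2 compressed quantities $\hat{\boldsymbol X}^{\rm c}$. According to Eq. (5), we can write

$$\hat{\boldsymbol X}^{\rm c} = H\,C^{-1}\,\hat{\boldsymbol X} \equiv B\,\hat{\boldsymbol X}, \tag{7}$$

where we use vectorial notation for the $\hat X_n$ and the compressed quantities. The N′ × N (rows × columns) matrix H is given in terms of first and second partial derivatives of the functions Xn(φκ) at the fiducial point in parameter space and B = HC-1 is the compression matrix. Accordingly, the covariance matrix of $\hat{\boldsymbol X}^{\rm c}$ is given as

$$C^{\rm c} = B\,C\,B^{\rm t} = H\,C^{-1}\,H^{\rm t}, \tag{8}$$

where the superscript “t” denotes the transpose of a matrix. The χ2-function in terms of the new observables is

$$\chi^2 = \left(\hat{\boldsymbol X}^{\rm c} - \boldsymbol X^{\rm c}\right)^{\rm t} \left(C^{\rm c}\right)^{-1} \left(\hat{\boldsymbol X}^{\rm c} - \boldsymbol X^{\rm c}\right). \tag{9}$$

From what was discussed above, the covariance Cc should be calculated from C only if an accurate estimate of the latter can be obtained; in general, it will be much more practical to determine Cc directly, such as from simulations.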
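The construction of the compressed data vector can be illustrated numerically. The following is a minimal sketch, assuming that the rows of H are the P first-derivative vectors followed by the P(P + 1) / 2 independent second-derivative vectors; the function name, the row ordering, and the random toy inputs are hypothetical and serve only as an illustration.

```python
import numpy as np

def compression_matrices(D, Z, C):
    """Build H, B = H C^{-1}, and Cc = H C^{-1} H^t.

    D : (N, P) first derivatives, Z : (N, P, P) second derivatives,
    C : (N, N) covariance of the original observables (may be approximate).
    """
    P = D.shape[1]
    mu, nu = np.triu_indices(P)           # S_{mu nu} is symmetric: keep mu <= nu
    H = np.vstack([D.T, Z[:, mu, nu].T])  # shape (P(P+3)/2, N)
    B = np.linalg.solve(C, H.T).T         # H C^{-1}, valid because C is symmetric
    Cc = B @ C @ B.T                      # covariance of the compressed data
    return H, B, Cc

# Toy numbers (hypothetical): N = 40 observables, P = 3 parameters.
rng = np.random.default_rng(0)
N, P = 40, 3
D = rng.normal(size=(N, P))
Z = rng.normal(size=(N, P, P))
Z = 0.5 * (Z + Z.transpose(0, 2, 1))      # symmetrize in (mu, nu)
A = rng.normal(size=(N, N))
C = A @ A.T + N * np.eye(N)               # symmetric positive definite

H, B, Cc = compression_matrices(D, Z, C)
x_hat = rng.normal(size=N)                # stand-in for a measured data vector
x_c = B @ x_hat                           # compressed vector, length P(P+3)/2
```

In this toy setup the 40 original observables are reduced to 9 compressed ones, and the first P × P block of the compressed covariance equals the matrix obtained by contracting the first derivatives with the inverse covariance on both sides.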
For completeness, we mention that, in the special case where the covariance C can indeed be determined accurately, we can solve Eq. (4) for the parameters pμ. Writing it in terms of the new observables, Eq. (4) becomes
(10)with
,
. If we then expand
, where
(
) is first (second) order in the ΔFκ, ΔSκν, we obtain to first order
(11)from which we can easily obtain
from the inverse of the symmetric matrix U,
. The second-order terms lead to the equation
(12)where we defined
(13)With the foregoing solution for
and the inverse of U, this can be immediately solved for
.
3. Application to COSEBIs
We now apply the method of the previous section to specific statistics for cosmic shear measurements, the COSEBIs (see Schneider et al. 2010). They provide a complete representation of the shear two-point correlation functions (2PCFs) in a given finite interval of angular scales, chosen such that they cleanly separate E and B modes (Crittenden et al. 2002; Schneider et al. 2002b). In our previous work (Asgari et al. 2012), we showed that COSEBIs also provide an efficient means of data compression, since the full cosmological information contained in the 2PCFs can be recovered with a small number of COSEBIs. However, in the case of several redshift bins for the source galaxies, the number of components grows with the number of tomographic redshift bins, r, by a factor of r(r + 1) / 2. In this section we use the formalism explained in Sect. 2 to compress the number of relevant statistical quantities and compare the results with a full COSEBIs analysis. The E-mode COSEBIs are related to the 2PCFs via

$$E_n^{(ij)} = \frac{1}{2} \int_{\theta_{\min}}^{\theta_{\max}} {\rm d}\vartheta\,\vartheta \left[T_{+n}(\vartheta)\,\xi_+^{(ij)}(\vartheta) + T_{-n}(\vartheta)\,\xi_-^{(ij)}(\vartheta)\right], \tag{14}$$

where 1 ≤ i,j ≤ r label the redshift bins considered. The COSEBIs are defined for a given range of angular separations, [ θmin,θmax ], i.e., the T± n(ϑ) are zero outside this interval. They form a complete basis for all filter functions that are defined on a finite angular range and satisfy the conditions

$$\int_{\theta_{\min}}^{\theta_{\max}} {\rm d}\vartheta\,\vartheta\,T_{+n}(\vartheta) = 0 = \int_{\theta_{\min}}^{\theta_{\max}} {\rm d}\vartheta\,\vartheta^3\,T_{+n}(\vartheta), \tag{15}$$

which are the necessary and sufficient conditions for separating the E- and B-modes obtained from the shear two-point correlation function measured on a finite interval and for removing ambiguous E-/B-modes (Schneider & Kilbinger 2007). As a result, any allowed filter function is a linear combination of them. E-mode COSEBIs are related to the power spectrum by

$$E_n^{(ij)} = \int_0^\infty \frac{{\rm d}\ell\,\ell}{2\pi}\,P_{\rm E}^{(ij)}(\ell)\,W_n(\ell), \tag{16}$$

where $P_{\rm E}^{(ij)}(\ell)$ is the E-mode convergence cross-power spectrum of redshift bins i and j and

$$W_n(\ell) = \int_{\theta_{\min}}^{\theta_{\max}} {\rm d}\vartheta\,\vartheta\,T_{+n}(\vartheta)\,J_0(\ell\vartheta), \tag{17}$$

where J0 is the zeroth-order Bessel function of the first kind (see Schneider et al. 2010 and Asgari et al. 2012, where the filters are defined and shown).
In the following, we use the logarithmic COSEBIs, which yield a more efficient data compression than the linear COSEBIs. The Log-COSEBIs T+ n(ϑ) filters are polynomials in ln(ϑ) (see Schneider et al. 2010); i.e., they have more oscillations on small scales so are more sensitive to variations in the shear 2PCFs on those scales. As it turned out, an approximately uniform distribution of roots of the weight function on logarithmic angular scales covers the cosmological information in the shear 2PCFs with a smaller number of components.
To apply the method of the previous section for obtaining a compressed version of COSEBIs, we need to find their (approximate) covariance matrix for a given cosmology, in addition to their first- and second-order derivatives with respect to the cosmological parameters. The new set of statistics is related to the COSEBIs via the compression matrix, B, defined before in Eq. (7),

$$E^{\rm c}_\mu = B_{\mu N}\,E_N, \tag{18}$$

where the new index

$$N = \left[(i-1)\left(r - \frac{i}{2}\right) + (j-1)\right] n_{\max} + n \tag{19}$$

is a combination of the three indices i, j (with i ≤ j), and n, where nmax is the maximum order of COSEBIs considered and r is the total number of redshift bins.
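The combined index can be sketched in a few lines. The specific mapping used here, N = [(i − 1)(r − i/2) + (j − 1)] nmax + n with i ≤ j, is an assumption chosen to be consistent with the description above; the function name is hypothetical. Indices are 1-based, as in the text.

```python
def flat_index(i, j, n, r, n_max):
    """Map (i, j, n), with 1 <= i <= j <= r and 1 <= n <= n_max,
    to a single index N running from 1 to n_max * r * (r + 1) / 2."""
    if not (1 <= i <= j <= r and 1 <= n <= n_max):
        raise ValueError("need 1 <= i <= j <= r and 1 <= n <= n_max")
    # (i - 1) * (r - i / 2) is always an integer, so integer division is exact.
    pair = (i - 1) * (2 * r - i) // 2 + (j - 1)
    return pair * n_max + n

# With r = 8 redshift bins and n_max = 10 modes there are 36 * 10 = 360 entries.
last = flat_index(8, 8, 10, r=8, n_max=10)
```

Looping over all allowed (i, j, n) visits every value of N exactly once, so the COSEBIs of all redshift-bin pairs can be stored in one contiguous data vector.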
4. Cosmological model, survey parameters, and covariance
Table 1. Fiducial cosmological parameters consistent with the WMAP 7-year results, and the underlying true parameters consistent with Planck.
A cold dark matter (CDM) cosmological model with dynamical dark energy, characterized by its equation-of-state parameter, w0, is used throughout this work (for references to wCDM models, see Peebles & Ratra 2003, and references therein). Table 1 contains the two sets of parameter values considered here. The fiducial model is used for obtaining the compressed COSEBIs (CCOSEBIs hereafter), while the assumed “true” underlying cosmology is slightly different. That means that we calculate the CCOSEBIs according to the equations of Sect. 2, using the covariance and parameter derivatives of the COSEBIs, C, D, and Z, for the fiducial cosmology, but these new observables are applied using the “true” cosmological model. The linear matter power spectrum is calculated using the Bond & Efstathiou (1984) transfer function and a primordial power-law power spectrum with spectral index ns. The halofit formula of Smith et al. (2003) is adopted for nonlinear scales.
For a cosmic shear analysis, we need the survey parameters and the redshift distribution of the galaxies. The latter is characterized by (see, e.g., Brainerd et al. 1996)

$$p(z) \propto \left(\frac{z}{z_0}\right)^{\alpha} \exp\left[-\left(\frac{z}{z_0}\right)^{\beta}\right] \tag{20}$$

for zmin ≤ z ≤ zmax, where the parameters α, β, z0, zmin, and zmax depend on the survey. Table 2 summarizes the survey and redshift parameters assumed in our analysis.
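As an illustration of how such a redshift distribution can be split into bins containing equal numbers of galaxies, the following sketch evaluates p(z) ∝ (z/z0)^α exp[−(z/z0)^β] on a grid and inverts its cumulative distribution. The parameter values and function names are hypothetical placeholders, not the survey values of Table 2.

```python
import numpy as np

def redshift_pdf(z, alpha, beta, z0):
    """Unnormalized p(z) = (z/z0)^alpha * exp(-(z/z0)^beta)."""
    x = np.asarray(z, dtype=float) / z0
    return x**alpha * np.exp(-x**beta)

def equal_count_edges(n_bins, alpha, beta, z0, z_min, z_max, n_grid=20001):
    """Bin edges such that each redshift bin holds the same galaxy fraction."""
    z = np.linspace(z_min, z_max, n_grid)
    p = redshift_pdf(z, alpha, beta, z0)
    # Cumulative trapezoidal integral, normalized to 1 at z_max.
    steps = 0.5 * (p[1:] + p[:-1]) * np.diff(z)
    cdf = np.concatenate([[0.0], np.cumsum(steps)])
    cdf /= cdf[-1]
    targets = np.linspace(0.0, 1.0, n_bins + 1)
    return np.interp(targets, cdf, z)     # invert the CDF by interpolation

# Hypothetical parameter values, for illustration only.
edges = equal_count_edges(3, alpha=2.0, beta=1.5, z0=0.7, z_min=0.2, z_max=2.0)
```

The returned array has n_bins + 1 entries, starting at zmin and ending at zmax.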
Table 2. Parameters of a fiducial large future survey.
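We assume Gaussian shear fields to find the covariances needed for obtaining CCOSEBIs and also for the Fisher analysis (see Joachimi et al. 2008). The relation between the E-COSEBIs covariance and the convergence power spectrum for redshift bin pairs ij and kl is

$$C^{(ij,kl)}_{mn} \equiv \left\langle E_m^{(ij)} E_n^{(kl)} \right\rangle - \left\langle E_m^{(ij)} \right\rangle \left\langle E_n^{(kl)} \right\rangle = \frac{1}{2\pi A} \int_0^\infty {\rm d}\ell\,\ell\,W_m(\ell)\,W_n(\ell) \left[\bar P_{\rm E}^{(ik)}(\ell)\,\bar P_{\rm E}^{(jl)}(\ell) + \bar P_{\rm E}^{(il)}(\ell)\,\bar P_{\rm E}^{(jk)}(\ell)\right], \tag{21}$$

where

$$\bar P_{\rm E}^{(ik)}(\ell) = P_{\rm E}^{(ik)}(\ell) + \delta_{ik}\,\frac{\sigma_\epsilon^2}{2\bar n_i}, \tag{22}$$

and A is the survey area, σϵ is the galaxy intrinsic ellipticity dispersion, and $\bar n_i$ is the average galaxy number density in redshift bin i. The overall shape of the COSEBIs covariance is shown in Asgari et al. (2012).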
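A Gaussian covariance of this kind can be evaluated by direct quadrature. The sketch below assumes the standard Gaussian form, in which the covariance between bin pairs (i, j) and (k, l) is 1/(2πA) times the ℓ-integral of ℓ Wm Wn [P̄(ik) P̄(jl) + P̄(il) P̄(jk)], with the shape-noise term σϵ²/(2n̄i) added to the auto-spectra. The function name, the array layout, the 0-based bin labels, and the toy inputs are all hypothetical.

```python
import numpy as np

def cosebis_gaussian_cov(ell, W, P, sigma_eps, nbar, area, pair_a, pair_b):
    """Gaussian covariance block between bin pairs pair_a = (i, j), pair_b = (k, l).

    ell : (L,) multipole grid; W : (n_max, L) Fourier-space filters;
    P : (r, r, L) E-mode power spectra; nbar : (r,) number densities per steradian;
    area : survey area in steradian. Bin labels are 0-based here.
    """
    r = P.shape[0]
    noise = sigma_eps**2 / (2.0 * np.asarray(nbar, dtype=float))
    # Add shape noise to the auto-spectra only (Kronecker delta in the bins).
    Pbar = P + np.eye(r)[:, :, None] * noise[:, None, None]
    (i, j), (k, l) = pair_a, pair_b
    spec = Pbar[i, k] * Pbar[j, l] + Pbar[i, l] * Pbar[j, k]
    integrand = ell * spec * W[:, None, :] * W[None, :, :]   # (n_max, n_max, L)
    # Trapezoidal rule along the ell axis.
    integral = np.sum(0.5 * (integrand[..., 1:] + integrand[..., :-1]) * np.diff(ell), axis=-1)
    return integral / (2.0 * np.pi * area)

# Toy inputs (hypothetical): one redshift bin, four stand-in filters.
ell = np.geomspace(10.0, 1.0e4, 500)
W = np.array([np.sin((n + 1) * np.log(ell)) / ell for n in range(4)])
P = (1.0e-9 * (ell / 100.0) ** -1.2)[None, None, :]
cov = cosebis_gaussian_cov(ell, W, P, 0.3, [1.0e8], 1.0, (0, 0), (0, 0))
```

The stand-in filters above are not COSEBIs weight functions; in practice W would hold the Wn(ℓ) of Eq. (17) tabulated on the ℓ grid.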
5. Results
This section is dedicated to our results. The filter functions of the CCOSEBIs for the fiducial cosmology are shown first, followed by a figure-of-merit analysis. We compare the figure-of-merit values for cases where the covariance is known versus the use of a wrong covariance in constructing the CCOSEBIs.
5.1. Weight functions of compact COSEBIs
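Inserting Eqs. (14) and (16) into Eq. (18) results in relations between Ec and the COSEBIs filters,

$$E^{\rm c}_\mu = \frac{1}{2} \int_{\theta_{\min}}^{\theta_{\max}} {\rm d}\vartheta\,\vartheta \sum_{i \le j} \sum_{n=1}^{n_{\max}} B_{\mu N} \left[T_{+n}(\vartheta)\,\xi_+^{(ij)}(\vartheta) + T_{-n}(\vartheta)\,\xi_-^{(ij)}(\vartheta)\right] \tag{23}$$

and

$$E^{\rm c}_\mu = \int_0^\infty \frac{{\rm d}\ell\,\ell}{2\pi} \sum_{i \le j} \sum_{n=1}^{n_{\max}} B_{\mu N}\,W_n(\ell)\,P_{\rm E}^{(ij)}(\ell). \tag{24}$$

For each redshift bin pair, ij, a set of N′ = P(P + 3) / 2 filters exists, where P is the number of free parameters. The new filters in real and Fourier space, respectively, are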
(25)With the above definitions we can rewrite the compressed statistics, Ec, in terms of the compressed filter functions,
(26)and
(27)Multiplying each
by a constant has no effect on the information level. We can therefore normalize the filter functions for each compressed statistic separately, so that
(28)
Figure 1 shows the first-order filter functions, TF and WF, with 1 ≤ μ ≤ P = 7, for our fiducial cosmology described in Sect. 4. Here we assume three redshift bins and seven free parameters, using 20 COSEBIs filters defined between θmin = 1′ and θmax = 400′. The redshift bins were chosen such that they contain an equal number of galaxies. Since the filters are designed to maximize the information obtained, their shape shows where most of the information in ξ+(ϑ) or PE(ℓ) lies. Here we choose to only show the first-order filters, although later on we use the first and second order, as well as the combination of both, to obtain the figure-of-merit. The general trend of TF shows that there is more information about all of the parameters in the higher-redshift bins and on smaller angular scales.
Fig. 1 Filter functions
However, each individual parameter shows a different pattern for each of the redshift pairs. For example, the real space filters, TF, for ΩΛ have significantly higher amplitudes for combinations of redshift bins two and three compared to combinations with the lowest redshift bin. The σ8 and Ωm filters follow each other closely, although in Fourier space, i.e., WF, the differences are more pronounced. A closer look at the plots shows that, since these curves are not exactly the same and also evolve with redshift, it is possible to break the degeneracy between these parameters, which is present in single-redshift studies (e.g., van Waerbeke et al. 2001; Hoekstra et al. 2002; Jarvis et al. 2003; Hetterscheidt et al. 2007; Kilbinger et al. 2013). The oscillations of the WF are a real feature of the CCOSEBIs weights and do not vanish when more COSEBIs are incorporated in calculating them.
Tables 3 and 4 show the elements of the compression matrix, B, for CCOSEBIs. According to Eq. (18), each row of B contains the coefficients for building one of the CCOSEBIs statistics, Ec, by linearly combining the COSEBIs En. The values of the elements of B show how important each COSEBIs mode is for building a CCOSEBIs mode. In both tables, the element values are much lower for large n than for small n. As a result, we can safely (conservatively) take just the first 20 COSEBIs to build the compressed statistics.
Table 3. Elements of the normalized compression matrix in percentage, 100 × Bμn, for 1 redshift bin and 3 parameters.
Table 4. Normalized compression matrix elements in percentage, 100 × B, for 2 redshift bins and 3 parameters.
5.2. Fisher analysis
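The Fisher matrix depends on the data, Ec, via

$$F_{\mu\nu} = \frac{1}{2}\,{\rm Tr}\left[\left(C^{\rm c}\right)^{-1} C^{\rm c}_{,\mu} \left(C^{\rm c}\right)^{-1} C^{\rm c}_{,\nu} + \left(C^{\rm c}\right)^{-1} M_{\mu\nu}\right], \tag{29}$$

where Cc is the data covariance, $M_{\mu\nu} = \boldsymbol E^{\rm c}_{,\mu} \left(\boldsymbol E^{\rm c}_{,\nu}\right)^{\rm t} + \boldsymbol E^{\rm c}_{,\nu} \left(\boldsymbol E^{\rm c}_{,\mu}\right)^{\rm t}$, and the commas followed by subscripts indicate partial derivatives with respect to the cosmological parameters (see Tegmark et al. 1997, for example). We use the same figure-of-merit, f, which gives a measure of the mean error on parameters, as in Asgari et al. (2012),

$$f = \left(\frac{1}{\sqrt{\det F}}\right)^{1/P}, \tag{30}$$

where P is the number of free parameters considered. Furthermore, we have shown in Asgari et al. (2012) that for a sufficiently large survey one can neglect the first term in Eq. (29), since it does not depend on the survey area (see Eq. (21)), while the second term is proportional to the survey area. Therefore, we neglect the first term in this study.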
Figure 2 shows the dependence of f on the number of COSEBIs modes, nmax, for eight redshift bins and seven free parameters. The constraints get tighter as nmax increases and reach a saturation level for all cases. The solid curve shows how much information can be gained if the 8nmax × 9 / 2 = 36nmax COSEBIs are used, i.e., the maximum information, or minimum f value. The points show the amount of information in the first- (F) and second-order (S) CCOSEBIs, as well as the combination of both, denoted by Ec as before. The parameters used for calculating the covariance matrix to build F, S, and Ec are those of the fiducial cosmology, which are slightly different from the assumed true parameters (see Table 1). Nevertheless, the first-order CCOSEBIs are sufficient to reach a similar Fisher information level. The F data vector for this case has seven components, S has 28, and Ec has 35, while for nmax = 10 the COSEBIs have 360 components; i.e., there is at least an order-of-magnitude difference between the number of observables for CCOSEBIs and COSEBIs. As a result, we can obtain the same accuracy of derived parameters with a highly significant reduction of observables.
The strong reduction of observables needed to cover all the cosmological information is of great interest for obtaining accurate covariances, and hence reliable confidence regions for cosmological parameters. Whereas analytical methods may be able to obtain approximate covariances (see, e.g., Takada & Jain 2009; Sato et al. 2009; Pielorz et al. 2010; Hilbert et al. 2011; Takahashi et al. 2011, 2014, and references therein), an accurate covariance accounting for the complex survey geometry will probably require extensive simulations. Obtaining the covariance as sample variance from independent realizations of the simulated cosmology requires a number of realizations that is roughly proportional to the number of observables (see Hartlap et al. 2007; Taylor & Joachimi 2014). Even a modest reduction in the number of relevant observables is therefore useful. As we have seen above, the CCOSEBIs serve this purpose very well.
Whereas the construction of the CCOSEBIs requires information about the covariance, this does not have to be very accurate. To show how using a substantially wrong covariance in defining the compressed data vector affects the constraints, we artificially change the value of σϵ, which affects the diagonals of the covariance matrix according to Eqs. (21) and (22). Figure 3 shows f for seven free parameters, five redshift bins, and 20 COSEBIs modes, as a function of the change in the parameter σϵ. The parameter f is normalized by its minimum value, i.e., the value obtained using COSEBIs with their true covariance, while σϵ is normalized by its true value. The first-order statistics, F, which have the same dimension as the parameter space, rapidly diverge from the true Fisher information limit, while the second-order statistics, S, and Ec, which span a larger dimensional space, are much less sensitive to errors in the COSEBIs covariance used for constructing the CCOSEBIs. Even for a 16 times larger σϵ, the fractional difference between the optimal f and the one measured from Ec is small. As a result, the consideration of the second-order statistics indeed provides a powerful mitigation for inaccurate covariances.
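The effect of a wrong covariance on the compressed Fisher information can be explored with a toy model. The sketch below assumes that only the mean-dependent term of the Fisher matrix is kept and that the figure-of-merit takes the form f = det(F)^(−1/(2P)); the random derivatives, the split of the covariance into a low-rank part and a diagonal part, and the inflation factor are hypothetical stand-ins, not the COSEBIs quantities used for Fig. 3.

```python
import numpy as np

def figure_of_merit(F):
    """f = det(F)^(-1/(2P)) (assumed form); smaller f means tighter constraints."""
    P = F.shape[0]
    _, logdet = np.linalg.slogdet(F)
    return float(np.exp(-logdet / (2.0 * P)))

def compressed_fisher(D, H, C_assumed, C_true):
    """Fisher matrix of x_c = H C_assumed^{-1} x when the data really follow C_true."""
    B = np.linalg.solve(C_assumed, H.T).T   # compression built from the assumed covariance
    Dc = B @ D                              # derivatives of the compressed means
    Cc = B @ C_true @ B.T                   # actual covariance of the compressed data
    return Dc.T @ np.linalg.pinv(Cc) @ Dc

rng = np.random.default_rng(1)
N, P = 60, 3
D = rng.normal(size=(N, P))
Z = rng.normal(size=(N, P, P))
Z = 0.5 * (Z + Z.transpose(0, 2, 1))
mu, nu = np.triu_indices(P)
H_F = D.T                                   # first-order statistics only
H_Ec = np.vstack([D.T, Z[:, mu, nu].T])     # first- plus second-order statistics

A = rng.normal(size=(N, 5))
signal = A @ A.T                            # low-rank stand-in for the signal part
noise = np.diag(rng.uniform(0.5, 2.0, size=N))  # diagonal stand-in for shape noise
C_true = signal + noise
C_wrong = signal + 16.0 * noise             # deliberately inflated noise term

f_full = figure_of_merit(D.T @ np.linalg.solve(C_true, D))
f_F = figure_of_merit(compressed_fisher(D, H_F, C_wrong, C_true))
f_Ec = figure_of_merit(compressed_fisher(D, H_Ec, C_wrong, C_true))
```

Because the first-order statistics are a subset of the combined set, the combined set can never carry less Fisher information than the first-order set alone, and neither can exceed the information in the uncompressed data. How close f_Ec comes to f_full in this toy depends on the random second-derivative directions, so it should not be read as a reproduction of Fig. 3.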
Fig. 2 Figure-of-merit, f, as a function of the number of COSEBIs, nmax, used. Seven free parameters, listed in Table 1, and eight tomographic redshift bins are considered here. The solid line shows the result for using Log-COSEBIs with the true underlying cosmology. It also represents the maximum information level for a given nmax. The circles, stars, and Y-shaped symbols represent the f-values for first order, second order, and their combination Ec, where nmax COSEBIs modes with the fiducial cosmological parameters are utilized in making them.
We stress again that the accuracy with which the original covariance can be determined affects the efficiency of our data compression, in the sense that it determines what fraction of the cosmological information contained in the original COSEBIs is preserved in the compressed COSEBIs. However, it does not bias the parameter determination, because the CCOSEBIs are linear combinations of the original ones.
We therefore conclude that the method proposed here – constructing new observables using an approximate covariance, and employing them for cosmological parameter studies – yields a very promising tool for effectively reducing the necessary efforts for constructing accurate covariances. This data compression will also be of great help if the covariances are to be obtained from the data themselves, e.g., by subdividing the survey region, calculating the sample variance on each subsurvey, and scaling the result with the ratio of subsurvey to survey area.
Fig. 3 Figure-of-merit, f, as a function of σϵ. f is normalized by its minimum value, which corresponds to using COSEBIs with the correct covariance (the solid line). The intrinsic ellipticity dispersion of galaxies, σϵ, is varied with respect to its true value, 0.3, to show the effects of using a wrong covariance. The markers show the value of f for the first-order, F, second-order, S, and combined, Ec, CCOSEBIs.
6. Band power
As mentioned in Sect. 3, any filter function defined on a finite angular interval that satisfies the constraints (15) can be expressed in terms of the COSEBIs filters. A particular filter one might be interested in is a top-hat function in Fourier space, corresponding to a band power (e.g., Brown et al. 2003; Hikage et al. 2011). In this section we study how well band powers can be approximated from correlation functions measured on a finite interval with clean E-/B-mode separation.
Thus, let Ŵ(ℓ) be a target filter function in Fourier space, and let us design a filter that approximates Ŵ(ℓ) as closely as possible. That means we want to find a filter that minimizes

$$\Delta = \int_0^\infty {\rm d}\ell\,\ell \left[\hat W(\ell) - W(\ell)\right]^2, \tag{31}$$

where W(ℓ) is a linear combination of the Wn(ℓ),

$$W(\ell) = \sum_{n=1}^{n_{\max}} c_n\,W_n(\ell). \tag{32}$$

The Δ integral can be defined with a different weighting of ℓ; for example, one can replace dℓℓ in Eq. (31) by dlnℓ or simply dℓ. Doing so does not significantly affect the accuracy with which Ŵ(ℓ) is approximated, but may be numerically advantageous.
We want to find the coefficients cn that minimize Δ; setting the derivatives of Δ with respect to cm to zero, we find

$$\int_0^\infty {\rm d}\ell\,\ell\,W_m(\ell)\,\hat W(\ell) = \sum_{n=1}^{n_{\max}} c_n \int_0^\infty {\rm d}\ell\,\ell\,W_m(\ell)\,W_n(\ell). \tag{33}$$

By defining the matrix
(34)and the vector
(35)we can rewrite Eq.(33)in matrix form,
(36)The minimum of Δ for this solution is
(37)and we quantify the relative deviation of the closest filter W to Ŵ by
(38) Filters that satisfy Eq. (15) and vanish outside of the angular range [ θmin,θmax ] are the only ones that can be represented by COSEBIs. Applying any filter that does not satisfy these conditions, on either power spectra or 2PCFs, therefore results in spillage from outside of the measured angular range. A top-hat function in Fourier space is an example of a filter that is not easily represented by weight functions that correspond to a finite range in real space. A top-hat function in Fourier space is defined as Ŵ(ℓ) = 1 between ℓmin and ℓmax, and zero otherwise. The real space version,
, of such a function is
(39)This equation shows that the function
is manifestly non-zero for ϑ<θmin and ϑ>θmax, which implies that the band power cannot be represented in terms of E-/B-mode separating combinations of correlation functions over a finite angular interval.
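The least-squares construction of the closest filter can be sketched numerically. The code below assumes a filter basis tabulated on an ℓ grid, solves the normal equations obtained by setting the derivatives of Δ with respect to the coefficients to zero, and reports a relative deviation defined here as the square root of Δmin divided by the weighted integral of Ŵ². That normalization, the function name, and the stand-in basis functions are assumptions made for illustration; the stand-in basis is not the set of COSEBIs weights Wn(ℓ).

```python
import numpy as np

def fit_filter(ell, W_basis, W_target, weight=None):
    """Coefficients c_n minimizing the weighted integral of [W_target - sum_n c_n W_n]^2.

    ell : (L,) grid; W_basis : (n_max, L); W_target : (L,).
    weight defaults to ell, i.e. the integration measure d ell * ell.
    """
    ell = np.asarray(ell, dtype=float)
    if weight is None:
        weight = ell
    # Trapezoidal quadrature weights on a possibly non-uniform grid.
    q = np.empty_like(ell)
    q[1:-1] = 0.5 * (ell[2:] - ell[:-2])
    q[0] = 0.5 * (ell[1] - ell[0])
    q[-1] = 0.5 * (ell[-1] - ell[-2])
    q = q * weight
    A = (W_basis * q) @ W_basis.T          # A_mn = int w W_m W_n
    b = (W_basis * q) @ W_target           # b_m  = int w W_m W_target
    c, *_ = np.linalg.lstsq(A, b, rcond=None)
    norm = np.sum(q * W_target**2)
    delta_min = norm - b @ c               # value of Delta at the minimum
    rel = np.sqrt(max(delta_min, 0.0) / norm)
    return c, delta_min, rel

# Hypothetical stand-in basis and a top hat between ell = 200 and ell = 400.
ell = np.geomspace(10.0, 1.0e4, 3000)
basis = np.array([np.sin((n + 1) * np.log(ell)) / ell for n in range(4)])
top_hat = ((ell >= 200.0) & (ell <= 400.0)).astype(float)
c_hat, delta_hat, rel_hat = fit_filter(ell, basis, top_hat)
```

Passing `weight=np.ones_like(ell)` or `weight=1.0 / ell` switches to the dℓ or dlnℓ weighting mentioned in the text. With actual COSEBIs weights in `W_basis`, the returned coefficients would be the ones used to combine measured COSEBIs into a band power estimate.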
Using Parseval’s theorem, we can find a lower bound for Δ, (40) where T+(ϑ) is the real space form of W(ℓ). The sum of the first two integrals in (40) is the absolute lower bound on Δ, since the last integral in that equation is non-negative, irrespective of the choice of T+(ϑ). Hence, the lower bound for δmin is
(41)To reach the absolute lower bound, the last integral in (40)should vanish. It is necessary and sufficient for
to satisfy the conditions (15)for that to happen, since then
in the range θmin ≤ ϑ ≤ θmax can be represented as a sum over the COSEBIs weights T+ n. Inserting the analytic form of
from Eq.(39)into Eq.(15)results in the following two conditions:
(42)and
(43) which should be simultaneously true for δmin = δLB. However, most combinations of θmin, θmax, ℓmin, and ℓmax do not satisfy these conditions. As a result, δmin > δLB in most cases.
Once the COSEBIs, En, are measured from the data, the band power can be estimated by linearly combining them,

$$\hat E = \sum_{n=1}^{n_{\max}} c_n\,E_n. \tag{44}$$

Figure 4 depicts the convergence to an estimated W(ℓ) by increasing the number of COSEBIs modes. The number of COSEBIs needed for convergence is substantially higher than the number needed for constraining parameters with one redshift bin.
Fig. 4 Estimated top-hat filter function with ℓmin = 200 and ℓmax = 400, from nmax COSEBIs filters defined on 1′<ϑ< 400′. The changes between using 40 and 80 COSEBIs filters are small, so that no better representation is obtained by using an even higher value of nmax.
Figure 5 shows the dependence of δmin on the number nmax of COSEBIs. Here we demonstrate that for all ranges considered in ℓ, a saturation level is reached; i.e., adding more COSEBIs filters will not lead to a smaller difference between the estimated W and the top hat. Furthermore, in Table 5 we show the value of δmin for 80 COSEBIs, which can be compared to its lower bound, δLB (see Eq. (41)), and the relative difference between the estimated band power and its true value, δband = | (Ê − E) /Ê |. The saturated δmin values are higher than, but close to, δLB. The difference between the two arises from violating conditions (42) and (43). In the table, we use three ℓ-weighting schemes, which do not change the δmin values significantly. However, the estimated band power deviations, δband, can vary by much more than a few percent between the cases. This is due to the spillage of the estimated band power and the fact that the ℓ-weighting scheme determines the direction of the spillage. The δband values are cosmology dependent and can be very different for a power spectrum with more features.
It is interesting to note that the deviations δband of the estimated band powers from their true values are in some cases smaller than the relative deviation δmin between the top-hat filter and the best representation of the top hat by COSEBIs weight functions. This, however, is an effect of the properties of the power spectrum in our assumed cosmological model: the power spectrum is smooth enough that the spilling caused by the effective weight W(ℓ) out of, and into the range of the top hat, can compensate for each other (see Schneider et al. 2002a, for a related discussion on band powers in cosmic shear analysis). That δband is relatively small is therefore not a statement about the accuracy of the method of band power estimates, but rather a consequence of the properties of the power spectrum. But the latter should be probed by estimating the band power. Thus, it would be strongly misleading to judge the accuracy of the method on presumed properties that ought to be investigated instead. Indeed, it is the quantity δmin that yields an estimate of the accuracy with which band powers can be obtained.
Becker & Rozo (2014) constructed slightly different band powers, with a weight function in ℓ-space of a log-normal form. As expected, their smoother weight Ŵ, which does not impose strict ℓ cuts in Fourier space, compared to the top-hat considered here, leads to less spilling of the best-fitting weight function in ϑ-space. In other words, the log-normal weight Ŵ can be more accurately represented by the Fourier transform of the shear correlation function on a finite interval, at the price of a non-diagonal covariance of band powers even in the case of Gaussian shear fields.
Fig. 5 Relative difference, δmin, between the estimated top hat and the input as a function of the number of COSEBIs filters, nmax, utilized for a few ℓ-ranges. In all cases the saturation level is reached before nmax = 80. The minimum value of δmin is shown in Table 5. In general, a higher number of modes is needed for a narrower top-hat filter, which is due to the spillage beyond the observed angular range (see Eq. (41)).
Examples of band power estimation from 80 COSEBIs for ξ±(ϑ) in the interval ϑ ∈ [ 1′,400′ ].
7. Summary and discussion
Data compression is an important challenge for future cosmological surveys, and it is essential for estimating accurate covariances. Current cosmological surveys, such as Planck, provide us with tight constraints on most cosmological parameters. This motivated us to define combinations of statistics inspired by their low-order Taylor expansion around a fiducial cosmological model. The strategy for finding the compressed statistics involves first- and second-order derivatives of parent statistics with respect to the free parameters, as well as their covariance. The statistics corresponding to the first-order derivatives, F, have the same dimension as the parameter space, while the statistics derived from second-order derivatives, S, provide a possibility to span a larger dimensional space. Consequently, F is more sensitive to the choice of the fiducial cosmology and covariance. The combination of F and S enables one to use well-defined and motivated sets of statistics that alleviate many of the data analysis problems. In total the number of compressed statistics is P(P + 3) / 2, where P is the number of free parameters in the model.
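As a quick check of this count, assuming that the matrix of second-order derivatives is symmetric so that only P(P + 1)/2 of its entries are independent, the two sets of statistics add up as follows. The value P = 7 is the number of free parameters used for Fig. 2.

```latex
\underbrace{P}_{\text{first order},\ F}
+ \underbrace{\frac{P(P+1)}{2}}_{\text{second order},\ S}
= \frac{P(P+3)}{2},
\qquad
P = 7:\quad 7 + 28 = 35 .
```

For comparison, 8 tomographic bins give 36 distinct bin pairs, so the uncompressed data vector would have 36 nmax entries if every pair is used.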
In the case of a cosmic shear analysis, the COSEBIs already provide an effective compression compared to other two-point statistics, such as the shear two-point correlation functions. However, adding tomographic bins, which is necessary for intrinsic alignment corrections, substantially increases the number of observables. As a result, further data compression is required. We applied our compression formalism to Log-COSEBIs to study its properties. We found that for a well-estimated COSEBIs covariance matrix, the first-order compressed statistics are sufficient. However, as mentioned above, the accuracy of covariance estimates from simulations depends on the number of observables incorporated. The higher this number is, the more simulations are needed, which rapidly becomes too expensive. Consequently, we used highly inaccurate covariances for defining the compressed COSEBIs (CCOSEBIs), to test their efficiency in such cases. We found that the figure-of-merit obtained from the first-order CCOSEBIs deviates substantially from the optimal information level as the difference between the assumed COSEBIs covariance and their true covariance increases. In contrast, the set of second-order CCOSEBIs is far less sensitive to the choice of covariance, owing to its larger dimensionality. The combination of both is basically insensitive to the accuracy of the covariance, at least in the framework of the simple model that we have tested here. Consequently, we propose this strategy for future data analyses. We note that our first-order CCOSEBIs are equivalent to the Karhunen–Loève data compression (with parameter-independent covariance) in the case that the covariance is accurately known (Tegmark et al. 1997).
In this paper we used a Fisher analysis, which assumes the parameters have a normal distribution, to compare the constraints from COSEBIs and CCOSEBIs. Both the Fisher matrix and F, the first-order compressed statistics, depend only on the first-order derivatives and the covariance. If the fiducial cosmology coincides with the truth and the covariance is exact, then F is equivalent to a Fisher formalism, since in this case the derivative matrix of F is equal to its covariance matrix, which is consequently equal to the Fisher matrix. However, when the covariance deviates from the truth, the differences become visible. For our future studies we plan to use a likelihood analysis, which does not make assumptions about the Gaussianity of the likelihood with respect to the model parameters.
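The equivalence stated above can be checked numerically. The following is a minimal sketch under stated assumptions: the first-order compression is taken to be the linear map B = Dᵀ C⁻¹, with D the matrix of first-order derivatives and C the covariance, and D and C are filled with random toy numbers, not COSEBIs. All names and sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_obs, n_par = 40, 3  # hypothetical numbers of observables and parameters

# Toy derivative matrix D[n, mu] = dE_n / dphi_mu and a symmetric
# positive-definite toy covariance C (neither comes from the paper).
D = rng.normal(size=(n_obs, n_par))
A = rng.normal(size=(n_obs, n_obs))
C_true = A @ A.T + n_obs * np.eye(n_obs)


def compression_matrix(D, C_assumed):
    """Assumed first-order compression B = D^T C^-1 (C symmetric)."""
    return np.linalg.solve(C_assumed, D).T


def fisher(D, C):
    """Fisher matrix D^T C^-1 D for a parameter-independent covariance."""
    return D.T @ np.linalg.solve(C, D)


def compressed_fisher(B, D, C):
    """Fisher matrix of the compressed data B x, whose true covariance is B C B^T."""
    return fisher(B @ D, B @ C @ B.T)


fisher_full = fisher(D, C_true)

# Exact covariance: derivative and covariance of F both equal the Fisher matrix.
B_exact = compression_matrix(D, C_true)
deriv_F = B_exact @ D
cov_F = B_exact @ C_true @ B_exact.T
fisher_exact = compressed_fisher(B_exact, D, C_true)

# Wrong covariance used to build B (here: off-diagonal terms ignored),
# while the compressed data still scatter according to C_true.
C_wrong = np.diag(np.diag(C_true))
B_wrong = compression_matrix(D, C_wrong)
fisher_wrong = compressed_fisher(B_wrong, D, C_true)

print("exact covariance, det ratio:", np.linalg.det(fisher_exact) / np.linalg.det(fisher_full))
print("wrong covariance, det ratio:", np.linalg.det(fisher_wrong) / np.linalg.det(fisher_full))
```

With the exact covariance the compressed statistics retain the full Fisher information. With the diagonal-only covariance the determinant ratio cannot exceed one, because any linear compression can only lose information.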
The COSEBIs filter functions form a complete basis for any filter that satisfies Eq. (15), which gives the necessary and sufficient conditions for a clean E-/B-separation on a finite interval, together with the condition that the filters should also vanish outside of the finite angular range. Consequently, any filter that satisfies these conditions can be represented by a linear combination of the COSEBIs filters. In this paper we showed how any given weight function can be mimicked by COSEBIs weights. In particular, we tried to represent top-hat filters in Fourier space using this strategy. We found that, owing to the infinite support of a Fourier top hat in real space, an accurate representation of them is impossible. This task becomes harder as the top hat and the angular range get narrower. Consequently, band convergence power spectra estimated from finite angular range information will suffer from spillage, so they will be inaccurate and biased, in a way that depends on the power spectrum, which is the very quantity to be probed. We therefore caution against using narrow-band power spectra for cosmic shear analysis. The estimated powers are relatively accurate if the power spectra are instead smooth functions of ℓ. However, for such smooth functions, there are better ways to characterize them than using band powers, such as representing them by a set of basis functions. We thus see no advantage in using power spectra for cosmic shear analysis on a finite angular range.
Acknowledgments
We thank Andy Taylor for interesting discussions, and Andrea Ferrara and an anonymous referee for constructive comments. This work was supported in part by the Deutsche Forschungsgemeinschaft under the TR33 “The Dark Universe”.
References
- Asgari, M., Schneider, P., & Simon, P. 2012, A&A, 542, A122
- Becker, M. R., & Rozo, E. 2014 [arXiv:1412.3851]
- Bond, J. R., & Efstathiou, G. 1984, ApJ, 285, L45
- Brainerd, T. G., Blandford, R. D., & Smail, I. 1996, ApJ, 466, 623
- Brown, M. L., Taylor, A. N., Bacon, D. J., et al. 2003, MNRAS, 341, 100
- Crittenden, R. G., Natarajan, P., Pen, U.-L., & Theuns, T. 2002, ApJ, 568, 20
- Eifler, T., Krause, E., Schneider, P., & Honscheid, K. 2014, MNRAS, 440, 1379
- Hartlap, J., Simon, P., & Schneider, P. 2007, A&A, 464, 399
- Hetterscheidt, M., Simon, P., Schirmer, M., et al. 2007, A&A, 468, 859
- Hikage, C., Takada, M., Hamana, T., & Spergel, D. 2011, MNRAS, 412, 65
- Hilbert, S., Hartlap, J., & Schneider, P. 2011, A&A, 536, A85
- Hoekstra, H., Yee, H. K. C., & Gladders, M. D. 2002, ApJ, 577, 595
- Huff, E. M., Eifler, T., Hirata, C. M., et al. 2014, MNRAS, 440, 1322
- Jarvis, M., Bernstein, G. M., Fischer, P., et al. 2003, AJ, 125, 1014
- Joachimi, B., & Bridle, S. L. 2010, A&A, 523, A1
- Joachimi, B., Schneider, P., & Eifler, T. 2008, A&A, 477, 43
- Kilbinger, M., Fu, L., Heymans, C., et al. 2013, MNRAS, 430, 2200
- Laureijs, R., Amiaux, J., Arduini, S., et al. 2011, ArXiv e-prints [arXiv:1110.3193]
- Marian, L., Smith, R. E., Hilbert, S., & Schneider, P. 2013, MNRAS, 432, 1338
- Peebles, P. J., & Ratra, B. 2003, Rev. Mod. Phys., 75, 559
- Pielorz, J., Rödiger, J., Tereno, I., & Schneider, P. 2010, A&A, 514, A79
- Sato, M., Hamana, T., Takahashi, R., et al. 2009, ApJ, 701, 945
- Schneider, P., & Kilbinger, M. 2007, A&A, 462, 841
- Schneider, P., van Waerbeke, L., Kilbinger, M., & Mellier, Y. 2002a, A&A, 396, 1
- Schneider, P., van Waerbeke, L., & Mellier, Y. 2002b, A&A, 389, 729
- Schneider, P., Eifler, T., & Krause, E. 2010, A&A, 520, A116
- Smith, R. E., Peacock, J. A., Jenkins, A., et al. 2003, MNRAS, 341, 1311
- Takada, M., & Jain, B. 2009, MNRAS, 395, 2065
- Takahashi, R., Yoshida, N., Takada, M., et al. 2011, ApJ, 726, 7
- Takahashi, R., Soma, S., Takada, M., & Kayo, I. 2014, MNRAS, 444, 3473
- Taylor, A., & Joachimi, B. 2014, MNRAS, 442, 2728
- Tegmark, M., Taylor, A. N., & Heavens, A. F. 1997, ApJ, 480, 22
- van Waerbeke, L., Mellier, Y., Radovich, M., et al. 2001, A&A, 374, 757