A&A 407, 385-392 (2003)
DOI: 10.1051/0004-6361:20030849
M. Lombardi - P. Schneider
Institut für Astrophysik und Extraterrestrische Forschung, Universität Bonn, Auf dem Hügel 71, 53121 Bonn, Germany
Received 30 September 2002 / Accepted 28 May 2003
Abstract
In a series of papers (Lombardi & Schneider 2001, 2002) we studied in detail the statistical
properties of an interpolation technique widely used in astronomy.
In particular, we considered the average interpolated map and its
covariance under the hypotheses that the map is obtained by
smoothing unbiased measurements of an unknown field, and that the
measurements are uniformly distributed on the sky. In this paper we
generalize the results obtained to the case of observations carried
out only on a finite field and distributed on the field with a
non-uniform density. These generalizations, which are required in
many astronomically relevant cases, still allow an exact, analytical
solution of the problem. We also consider a number of properties of
the interpolated map, and provide asymptotic expressions for the
average map and the two-point correlation function which are valid
at high densities.
Key words: methods: statistical - methods: analytical - methods: data analysis - gravitational lensing
Interpolation techniques play a central role in many physical sciences. In fact, experimental data can often only be obtained at discrete points, while quantitative, global analyses can normally be performed only on a field. A classical example of such a situation is given by meteorological data, such as temperature, pressure, or humidity: these data are collected by a large number of ground-based weather stations, and then need to be interpolated in order to obtain a continuous field.
The situation is, apparently, very similar in Astronomy. Indeed, many astronomical observations are carried out "discretely," i.e. data are available only at some locations on the sky (typically corresponding to some astronomically significant objects, such as stars, galaxies, quasars). If there is some reason to think that the data represent discrete measurements of a continuous field, then the observer will want to interpolate the data in order to obtain a smooth map of the quantity being investigated.
In reality, astronomical observations have a characteristic that makes them quite peculiar with respect to other physical experiments: in most cases, it is not possible to choose where to perform the measurements. A meteorologist, for example, can always decide to put a weather station in a particular location, or to have weather stations regularly spaced; this, clearly, is impossible for an astronomer. As a result, it is sensible to perform a statistical analysis of interpolation techniques by considering the measurement locations as random variables, i.e. by performing an ensemble average on the positions of the astronomical objects used for the analysis (this technique has already been used by several authors; see, e.g., Lombardi & Bertin 1998; van Waerbeke 2000).
In a series of previous papers (Lombardi & Schneider 2001,
hereafter Paper I, and Lombardi & Schneider 2002, hereafter
Paper II), we analyzed the statistical properties of a widely used
interpolation technique. In particular, we considered a set of
measurements
performed at locations
.
The measurements were taken to be unbiased estimates of a field
at the relative positions, i.e.
In this paper we study the expectation value and the two-point
correlation of the smoothed map
under the hypotheses
that:
It should be stressed that the generalizations carried out here are very important in the astrophysical context. Indeed, astronomical data will normally be available only on a limited area of the sky, and so boundary effects have to be taken into account. Moreover, often the measurements will not be uniformly distributed on the observed area. This happens, for example, for data based on stars, which have a higher density when one approaches the galactic equator. However, even for astronomical objects that are, in principle, uniformly distributed on the sky (e.g., distant galaxies or quasars), we might need to deal with a non-uniform distribution because of observational effects (e.g., because of a non-constant sensitivity on the field of the detector used, of dithering patterns, or of the presence on the field of bright objects that interfere with the measurements).
The paper is organized as follows. In Sect. 2 we carry out the various generalizations in turn. The properties of the average and the two-point correlation function of the smoothed field are discussed in Sect. 3. Finally, we summarize the results of this paper in Sect. 4.
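The interpolation technique under study can be illustrated with a minimal sketch. It assumes the weighted-average form used in Paper I, in which the smoothed map at a point is the sum of the measurements multiplied by a weight function, divided by the sum of the weights. The Gaussian weight, the function names, and the numerical values below are illustrative assumptions, not part of the original analysis.

```python
import numpy as np

def gaussian_weight(theta, theta_n, sigma=1.0):
    """Gaussian weight depending only on the separation theta - theta_n (assumed form)."""
    d2 = np.sum((theta - theta_n) ** 2, axis=-1)
    return np.exp(-d2 / (2.0 * sigma**2))

def smooth_map(theta, theta_n, f_hat, sigma=1.0):
    """Weighted average of the measurements f_hat, evaluated at the point theta."""
    w = gaussian_weight(theta, theta_n, sigma)
    return np.sum(w * f_hat) / np.sum(w)

rng = np.random.default_rng(0)
theta_n = rng.uniform(-5.0, 5.0, size=(200, 2))  # random measurement locations
f_hat = np.full(200, 3.0)                        # error-free measurements of a constant field
value = smooth_map(np.zeros(2), theta_n, f_hat)  # smoothed map at the origin
```

Because the estimator is a normalized weighted average, a constant field measured without errors is returned unchanged, whatever the positions of the measurements.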
Looking at Eq. (2), we can note that the interpolation point
does not directly enter the problem, but is basically only
used to "label" the interpolated point. For our analysis, indeed,
it is convenient to fix that point to, say,
,
and to
rewrite Eq. (2) as
Equations (4)-(8) show explicitly that the location
does not enter our problem. As a result, looking again at Eq. (2), there is no need to take the weight function to be
of the form
,
but we can instead consider in the
same framework the more general form
.
As we will
see below, the trivial generalization described in this subsection is
actually a fundamental step for more interesting results.
We now focus on a slightly different generalization, namely the use of
finite fields. We observe that having no data outside a given field
is totally equivalent to having data everywhere and using a vanishing
weight for locations outside the field. In other words, even if our
observations are confined to a small part of the sky, we can always
imagine having data over the whole sky, by generating arbitrary
data locations and values, and then discarding the arbitrary data
by using a vanishing weight for them. In turn, from the form of the
integrands in Eqs. (5) and (8), we see that the
integrals actually need to be performed only on the support of the
weight function (the integrands, indeed, vanish at the points where wA vanishes). As a result, Eqs. (5)-(8) can still
be used, provided we interpret
as the observation field.
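The equivalence between a finite field and a vanishing weight outside the field can be checked numerically. The sketch below assumes the weighted-average estimator of Paper I, a Gaussian weight centered on the origin, and a square field of half-side 2; all names and values are hypothetical choices made for illustration.

```python
import numpy as np

def weighted_average(f_hat, weights):
    """Smoothed value at the origin for given measurements and weights."""
    return np.sum(weights * f_hat) / np.sum(weights)

rng = np.random.default_rng(1)
theta_all = rng.uniform(-10.0, 10.0, size=(2000, 2))  # locations over a large area
f_all = rng.normal(size=2000)                         # arbitrary data values

# Hypothetical observation field: the square |theta_1|, |theta_2| <= 2.
inside = np.all(np.abs(theta_all) <= 2.0, axis=1)

w_all = np.exp(-0.5 * np.sum(theta_all**2, axis=1))   # Gaussian weight around the origin
w_masked = np.where(inside, w_all, 0.0)               # vanishing weight outside the field

# Data everywhere, but zero weight outside the field ...
full_sky = weighted_average(f_all, w_masked)
# ... versus data available only inside the field.
field_only = weighted_average(f_all[inside], w_all[inside])
```

The two values agree up to rounding, since the points outside the field contribute zero to both the numerator and the denominator.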
We can now finally generalize Eqs. (5)-(8) to
non-constant densities. We first observe that the remaining spatial
variable
that appears in the definition (4) is
a dummy variable. Indeed,
is basically used to "name"
locations, but does not really play any role in the interpolation process.
For example, performing an arbitrary bijective (i.e., one-to-one)
mapping
described by the function
,
will not change the value of
,
provided that we use the "mapped" weight function
.
On the other hand, when performing the
transformation
,
we are bound to change
the density distribution of objects. More precisely, if the objects
are uniformly distributed on the plane
with density
,
they will be distributed according to a
non-uniform density
on the
plane. The final density, indeed, is given by
The method described above clearly allows us to solve a much more
general problem, but it also has two main difficulties. From the
theoretical side, one has to show that it is possible to find a
one-to-one mapping that satisfies our needs (namely, that
is uniform). From the practical side, it might be
non-trivial to find the function
;
moreover, for
every point
,
one needs to transform the weight function
into a
weight function on the
plane (see above). We now address
both problems, showing that our equations can be reformulated in a way
that naturally allows for non-uniform densities.
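A one-dimensional analogue may help to visualize how a mapping changes the density of objects. The sketch below is an illustration under stated assumptions, not the two-dimensional transformation of the paper: it takes a hypothetical density proportional to 1 + theta on the interval [0, 1] and maps each location to the cumulative integral of the density, which in one dimension yields a uniform density in the new coordinate.

```python
import numpy as np

rng = np.random.default_rng(2)

def rho(theta):
    """Hypothetical non-uniform density on the interval [0, 1]."""
    return 1.0 + theta

def to_uniform(theta):
    """Cumulative integral of rho from 0 to theta: x = theta + theta**2 / 2."""
    return theta + 0.5 * theta**2

# Draw locations with density proportional to rho by rejection sampling
# (rho is bounded by 2 on the interval).
proposal = rng.uniform(0.0, 1.0, size=200_000)
accept = rng.uniform(0.0, 2.0, size=proposal.size) < rho(proposal)
theta = proposal[accept]

# Mapped locations lie in [0, 1.5]; their histogram should be flat.
x = to_uniform(theta)
counts, _ = np.histogram(x, bins=5, range=(0.0, 1.5))
rel_dev = np.abs(counts / counts.mean() - 1.0).max()
```

The mapped density equals the original density divided by the derivative of the mapping, which is the one-dimensional counterpart of dividing by the Jacobian determinant; here the two cancel and the result is constant.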
First, we explicitly show that, for every density distribution
it is always possible to find a
one-to-one function
such that the corresponding
,
evaluated from Eq. (9), is constant.
Let us, in fact, consider the transformation
We now turn to the second problem, namely the practical difficulties
in applying the technique discussed in this section. Suppose again
that we are interested in evaluating the expectation value of
of Eq. (4) with a non-uniform density
.
Then, we can use Eq. (10)
to convert the problem into the
plane, so that the
corresponding density is unity. We can then finally apply
Eqs. (5)-(8) on
,
using
.
In particular, for Eq. (5) we have
Before closing this subsection, we consider the case of a
non-continuous density field
.
We recall,
indeed, that with Eq. (10) we have been able to provide a
one-to-one, continuous transformation
only
under the hypothesis that
be continuous.
Although this hypothesis is not needed, it is non-trivial to exhibit a
mapping with the required properties in the general case of a
non-continuous density. In reality, this problem is only apparent.
Note, in fact, that the density
enters
Eqs. (13) and (15) only as a term inside an integral
and thus the continuity of this function does not play any role in our
problem. For example, if we convolve a discontinuous density
with a Gaussian,
Figure 2: Effective weight function in the presence of boundaries. Three Gaussian weight functions (shown as solid lines) centered on different parts of the field [...]
We summarize here the results obtained in this section. We have shown
that the expectation value of
can be evaluated from the
set of equations
Similarly to what was done in Paper I, we call
the effective weight function, so that we can
write Eq. (18) as
We now turn to the generalization of the results of Paper II
concerning the covariance of
,
i.e. its two-point
correlation function. Since the generalization procedure closely
follows the one used in Sect. 2 for the average,
we skip here many details and mainly report the final result only.
We first recall that in Paper II we have defined
and wB similarly with
and wA, with the only
difference that now these quantities are calculated with respect to a
different point
.
We then have defined the two-point
correlation function of
as
Using an argument similar to the one adopted in
Sect. 2, we can generalize the results of
Paper II to the hypotheses discussed in the items of
Sect. 1. We show here only the final results and
skip the proof, which is a trivial repetition of what was done above for
the average.
In this section we will consider some interesting properties of the
average (Sect. 3.1) and of the two-point correlation
function (Sect. 3.2) of
.
Hence, here we basically generalize Sect. 5 of Paper I and Sect. 6 of
Paper II.
By construction, for a constant field
the smoothed
function
defined in Eq. (4) returns on average 1, a property related to the normalization of the effective weight
(see Lombardi 2002). Indeed, if
,
we
find
Suppose we rescale the weight function
into
,
and at the same time the density
into
;
then we can verify using Eqs. (15)-(18)
that the effective weight is rescaled similarly to wA, i.e.
.
This scaling property suggests the following definition:
A study of
wA CA(wA) can be carried out using the same technique
adopted in Paper I. Since
YA(s) > 0 for every s, CA(wA) decreases as wA increases. Regarding
wA CA(wA), from the
properties of Laplace transform (see, e.g., Arfken 1985; see also
Appendix D of Paper II) we have
We now consider the limits of
for small and
large values of wA,
Regarding the other limit we have
At high densities (
)
only values of QA(s) close to s = 0 are important, because for large s, YA(s) vanishes. Hence, we
expand QA(s) by writing
At large densities,
we can expand CA(wA) in terms of the
moments of wA defined in Eq. (41). The calculations are
basically identical to those provided in Paper I (see Eq. (66) of
that paper), with only minor corrections due to the different
definition of CA. Hence, we skip the derivation and report here
only the final result (up to the fifth term):
The generalization of the properties of the covariance terms
and TP
is, in most cases, trivial and closely follows the
generalization carried out in Sect. 3.1. Hence,
here we will skip most of the proofs and just outline the main
results.
It can be shown that the Poisson noise satisfies a simple
normalization property: if
is constant, then
TP1 + TP2 = TP3 = 1, and thus the
Poisson noise vanishes. A proof of this property can be carried out
either with the technique described in Paper II, or, more easily,
using the following argument, taken from Lombardi (2002).
If
is constant on the field, we will on average measure
for each point. Let us now assume for a moment
that we do not have any measurement error, so that
for every n. Then, we will always measure
,
and thus
.
In this case, thus, we find
TP1 + TP2 = TP3 = 1. The situation is
actually the same even if the measurements are affected by errors:
these, in fact, appear only in the evaluation of
,
and thus the
Poisson noise is left unaffected.
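The argument above can be checked with a small Monte Carlo experiment. The sketch assumes the weighted-average estimator of Paper I, a Gaussian weight centered on the origin, a square field, and a Poisson-distributed number of uniformly placed locations; the field value, the error dispersion, and all names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)

def smoothed_value(theta_n, f_hat, sigma=0.5):
    """Weighted average at the origin with a Gaussian weight (assumed form)."""
    w = np.exp(-0.5 * np.sum(theta_n**2, axis=1) / sigma**2)
    return np.sum(w * f_hat) / np.sum(w)

f0 = 2.0  # constant field value
no_error, with_error = [], []
for _ in range(500):
    # Random number of locations; the guard against n = 0 is an assumption
    # made only to keep the ratio well defined.
    n = max(1, rng.poisson(50))
    theta_n = rng.uniform(-1.0, 1.0, size=(n, 2))
    # Error-free measurements: only the random positions could add scatter.
    no_error.append(smoothed_value(theta_n, np.full(n, f0)))
    # Same positions, measurements affected by Gaussian errors.
    with_error.append(smoothed_value(theta_n, f0 + rng.normal(0.0, 0.3, size=n)))

var_no_error = np.var(no_error)
var_with_error = np.var(with_error)
```

Without measurement errors the scatter over realizations of the positions vanishes up to rounding, which is the statement that the Poisson noise is zero for a constant field. The scatter found with errors comes entirely from the measurement noise.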
In the limit
,
the expressions for
and TP take a particularly simple form. Indeed, since
,
to evaluate
,
TP1, and TP2 we just
need
C(wA, wA); this quantity, in turn, can be easily shown to be
C(wA, wA) = -C'A(wA), where CA(wA) is given by
Eq. (17).
If instead
is large compared to the
scale lengths of the weight functions wA and wB, then
The normalization of the Poisson noise terms derived in
Sect. 3.2.1 can be used to derive an upper limit
for .
Indeed, from the expression of
one sees that this
quantity is very similar to TP1, provided we replace
with
.
On the other hand, from
Sect. 3.2.1 we know that, if
,
then
TP1 < 1, because in this case
TP1 +
TP2 = 1 and TP2 is positive. Hence we find
,
where
is an upper limit
for
.
A lower limit for
can be obtained from the inequality (50):
Figure 3: The variance [...]
At high densities we can expand
Q(sA, sB) as done in
Sect. 3.1.4 for QA:
If instead
,
then
,
and
thus
.
Note that here we are assuming
and
.
In the same limit, we have
(see Eqs. (19) and (30))
In this paper we have studied the statistical properties of a
smoothing technique widely used in Astronomy and in other physical
sciences. In particular, we have provided simple analytical
expressions to calculate the average and the two-point correlation
function of the smoothed field
defined in Eq. (1). The results generalize what was already obtained in
Paper I and II to the case where observations are carried out in a
finite field, with a non-uniform spatial density for the measurements,
and with non-uniform measurement errors
.
These
generalizations together greatly widen the range of applicability of
our results in the astronomical context. Finally, we have shown
several interesting properties of the average map and of the two-point
correlation function, and we have considered the behavior of these
quantities in some relevant limiting cases.
Acknowledgements
This work was partially supported by a grant from the Deutsche Forschungsgemeinschaft, and the TMR Network "Gravitational Lensing: New constraints on Cosmology and the Distribution of Dark Matter."
In this appendix, we derive the same results obtained in Sect. 2 using a more direct method. Although not necessary, this alternative derivation is helpful in order to fully understand the whole problem and also clarifies some of the peculiarities of the equations derived in Paper I (cf., in particular, the case of vanishing weights).
The derivation will follow quite closely the one adopted in Paper I,
with the needed modifications due to the finite-field and the
non-constant density. The only significant exception will be the use
of a different strategy in performing the so-called "continuous
limit" (because of the finite field, we cannot perform the limit
,
but we must rather take N as a random
variable). Note that throughout this appendix we will drop everywhere
the index A, so that, e.g.,
will be written just
as
.
This simplification should not create
ambiguities, since anyway here we are concerned only with the value of
at
.
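Treating N as a random variable can be made concrete with a short simulation. The sketch below assumes that the number of locations in the field follows a Poisson distribution whose mean is the integral of the density over the field, and that each location is drawn independently with probability density proportional to the local density. The unit-square field, the density, and the thinning procedure are hypothetical choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)

def rho(theta):
    """Hypothetical density on the unit square, increasing along the first axis."""
    return 100.0 * (1.0 + theta[:, 0])

RHO_MAX = 200.0  # upper bound of rho on the unit square
N_BAR = 150.0    # integral of rho over the unit square

def draw_locations():
    """Draw one realization of the locations by thinning a uniform Poisson sample."""
    n_prop = rng.poisson(RHO_MAX)                  # field area is 1
    prop = rng.uniform(0.0, 1.0, size=(n_prop, 2))
    keep = rng.uniform(0.0, RHO_MAX, size=n_prop) < rho(prop)
    return prop[keep]                              # N = len(result) varies per realization

counts = np.array([len(draw_locations()) for _ in range(2000)])
mean_n = counts.mean()
fano = counts.var() / counts.mean()  # close to 1 for a Poisson-distributed N
```

In this construction the number of locations differs from one realization to the next, with mean close to `N_BAR` and variance close to the mean, as expected for a Poisson variable.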
Let us consider a field
and locations randomly distributed on
this field with density
.
Let us assume, for
simplicity, that
is strictly positive in
;
if this is not the case, we can always redefine
to include
only points inside the support of w (cf. discussion in
Sect. 2.2). The expected average number of
locations in
is given by
Since the locations are distributed inside
according to the
density
,
a single location follows the probability
distribution
;
note that the factor
is needed here in order to satisfy the normalization of
probabilities (the integral on
must be unity). Hence, the
probability of having exactly N locations inside
at the
positions
(with
)
is
given by