A&A 392, 1153-1174 (2002)
DOI: 10.1051/0004-6361:20020965
Smooth maps from clumpy data: Covariance analysis
M. Lombardi and P. Schneider
Institut für Astrophysik und Extraterrestrische Forschung,
Universität Bonn, Auf dem Hügel 71, 53121 Bonn, Germany
Received 23 January 2002 / Accepted 27 June 2002
Abstract
Interpolation techniques play a central role in Astronomy, where one
often needs to convert irregularly sampled data into a smooth map.
In a previous article (Lombardi & Schneider 2001, hereafter
Paper I), we considered a widely used smoothing technique and
evaluated the expectation value of the smoothed map under a
number of natural hypotheses. Here we proceed further with this
analysis and consider the variance of the smoothed map, represented
by a two-point correlation function. We show that two main sources
of noise contribute to the total error budget, and we derive several
interesting properties of these two noise terms. The expressions
obtained are also specialized to the limiting cases of low and high
densities of measurements. A number of examples are used to illustrate
in practice some of the results obtained.
Key words: methods: statistical – methods: analytical – methods: data analysis –
gravitational lensing
1 Introduction
Raw astronomical data are very often discrete, in the sense that
measurements are performed along a finite number of directions on the
sky. In many cases, the discrete data are believed to be single
measurements of a smooth underlying field. In such cases, it is
desirable to reconstruct the original field using interpolation
techniques. A typical example of the general situation just described
is given by weak lensing mass reconstructions in clusters of galaxies.
In this case, thousands of noisy estimates of the tidal field of the
cluster (shear) can be obtained from the observed shapes of background
galaxies whose images are distorted by the gravitational field of the
cluster. All these measurements can then be combined to produce a
smooth map of the cluster shear, which in turn is subsequently
converted into a projected density map of the cluster mass
distribution.
One of the most widely used interpolation techniques in Astronomy is
based on a weighted average. More precisely, a positive weight
function, describing the relative weight of a datum at the position
$\vec\theta_n$ on the point $\vec\theta$, is introduced.
The weight function is often chosen to be of the form
$w(\vec\theta, \vec\theta_n) = w(\vec\theta - \vec\theta_n)$,
i.e. it depends only on the separation $\vec\theta - \vec\theta_n$
of the two points considered. Normally, w is also a decreasing
function of $|\vec\theta - \vec\theta_n|$, in order to ensure that the largest
contributions to the interpolated value at $\vec\theta$ come from
nearby measurements. Then, the data are averaged using a weighted
mean with the weights given by the function w. More precisely,
calling $\hat f_n$ the nth datum obtained at the position $\vec\theta_n$,
the smooth map is defined as
$$\tilde f(\vec\theta) \equiv \frac{\sum_{n=1}^N w(\vec\theta - \vec\theta_n)\, \hat f_n}{\sum_{n=1}^N w(\vec\theta - \vec\theta_n)}\,, \qquad (1)$$
where N is the total number of objects. In a previous paper
(Lombardi & Schneider 2001, hereafter Paper I) we have evaluated
the expectation value of this expression under the following
hypotheses: (i) the measurements $\hat f_n$ are unbiased estimates of an
underlying field f, i.e. $\langle \hat f_n \rangle = f(\vec\theta_n)$
(Eq. (2)); (ii) the total number N of measurements follows a Poisson
distribution of mean $\rho A$, where A is the area of the field
(Eq. (3)); (iii) the positions $\vec\theta_n$ are independent random
variables, uniformly distributed on the field (Eq. (4)).
In Paper I we have shown that
$$\bigl\langle \tilde f(\vec\theta) \bigr\rangle = \int w_{\rm eff}(\vec\theta - \vec\theta')\, f(\vec\theta')\, {\rm d}^2\theta'\,. \qquad (5)$$
Thus, $\langle \tilde f \rangle$ is the convolution of the
unknown field f with an effective weight $w_{\rm eff}$
which, in general, differs from the weight function w. We have also
shown that $w_{\rm eff}$ has a "similar" shape as w and
converges to w when the object density $\rho$ is large; for
finite $\rho$, however, $w_{\rm eff}$ is broader than w.
Here we proceed further with the statistical analysis by obtaining an
expression for the two-point correlation function (covariance) of this
estimator. More precisely, given two points $\vec\theta_A$ and $\vec\theta_B$,
we consider the two-point correlation function of $\tilde f$,
defined as
$${\rm Cov}\bigl(\tilde f; \vec\theta_A, \vec\theta_B\bigr) \equiv \bigl\langle \tilde f(\vec\theta_A)\, \tilde f(\vec\theta_B) \bigr\rangle - \bigl\langle \tilde f(\vec\theta_A) \bigr\rangle \bigl\langle \tilde f(\vec\theta_B) \bigr\rangle\,. \qquad (6)$$
In our calculations, similarly to Paper I, we assume that the $\hat f_n$ are
unbiased and mutually independent estimates of $f(\vec\theta_n)$
(cf. Eq. (2)). We also assume that the $\hat f_n$
have fixed variance $\sigma^2$, so that
$$\bigl\langle \hat f_n \hat f_m \bigr\rangle = f(\vec\theta_n)\, f(\vec\theta_m) + \sigma^2\, \delta_{nm}\,. \qquad (7)$$
The paper is organized as follows. In Sect. 2 we
summarize the results obtained in this paper. In
Sect. 3 we derive the general expression for the
covariance of the interpolation technique and we show that two main
noise terms contribute to the total error. These results are then
generalized in Sect. 4 to include the case
of weight functions that are not strictly positive. A useful
expansion of the covariance at high densities $\rho$ is obtained in
Sect. 5. Section 6 is
devoted to the study of several interesting properties of the
expressions obtained in the paper. Finally, in
Sect. 7 we consider three simple weight functions and
derive (analytically or numerically) the covariance for these cases.
Four appendices on more technical topics complete the paper.
2 Summary
As mentioned in the introduction, the primary aim of this paper is the
evaluation of the covariance (two-point correlation function) of the
smoothing estimator (1) under the hypotheses that the
measurements $\hat f_n$ are unbiased estimates of a field $f(\vec\theta)$
(Eq. (2)) and that the measurement locations $\vec\theta_n$
are independent, uniformly distributed variables with
density $\rho$. Hence, we do not allow for angular clustering of the
positions $\vec\theta_n$, and we also do not include the effects
of a finite field in our calculations (these effects are expected to
play a role at points close to the boundary of the region where data
are available). Moreover, we suppose that the noise on the
measurements $\hat f_n$ is uncorrelated with the signal (i.e.,
that the variance $\sigma^2$ is constant on the field), and that
different measurements are mutually uncorrelated. Finally, we stress that
in the whole paper we assume a non-negative (i.e., positive or
vanishing) weight function $w(\vec\theta)$. Surprisingly,
weight functions with arbitrary sign cannot be studied in our
framework (see the discussion at the end of
Sect. 4).
The results obtained in this paper can be summarized in the following
points.
1. We evaluate analytically the two-point correlation function of
$\tilde f$, showing that it is composed of two main
terms:
$${\rm Cov}\bigl(\tilde f; \vec\theta_A, \vec\theta_B\bigr) = T_\sigma\bigl(\vec\theta_A, \vec\theta_B\bigr) + T_P\bigl(\vec\theta_A, \vec\theta_B\bigr)\,. \qquad (8)$$
The term $T_\sigma$ is proportional to $\sigma^2$ and can thus be
interpreted as the contribution to the covariance from measurement
errors; the term $T_P$ depends on the signal $f$
and can be interpreted as Poisson noise. These
terms can be evaluated using the following set of equations:
$$Q(s_A, s_B) = \rho \int \Bigl[1 - {\rm e}^{-s_A w_A(\vec\theta') - s_B w_B(\vec\theta')}\Bigr]\, {\rm d}^2\theta'\,, \qquad (9)$$
$$Y(s_A, s_B) = \exp\bigl[-Q(s_A, s_B)\bigr]\,, \qquad (10)$$
$$C(w_A, w_B) = \int_0^\infty {\rm d}s_A \int_0^\infty {\rm d}s_B\, {\rm e}^{-s_A w_A - s_B w_B}\, Y(s_A, s_B)\,, \qquad (11)$$
$$T_\sigma = \sigma^2 \rho \int w_A'\, w_B'\, C\bigl(w_A', w_B'\bigr)\, {\rm d}^2\theta'\,, \qquad (12)$$
$$T_P = \rho \int \bigl[f(\vec\theta')\bigr]^2 w_A'\, w_B'\, C\bigl(w_A', w_B'\bigr)\, {\rm d}^2\theta' + \rho^2 \int {\rm d}^2\theta' \int {\rm d}^2\theta''\, f(\vec\theta')\, f(\vec\theta'')\, w_A'\, w_B''\, C\bigl(w_A' + w_A'', w_B' + w_B''\bigr) - \bigl\langle \tilde f(\vec\theta_A) \bigr\rangle \bigl\langle \tilde f(\vec\theta_B) \bigr\rangle\,. \qquad (13)$$
In the last two equations we used the notation
$w_A' = w_A(\vec\theta')$ and $w_A'' = w_A(\vec\theta'')$,
and similarly for $w_B'$ and $w_B''$;
moreover, the two functions C_A and C_B can be obtained from the
following limits:
$$C_A(w_A) = \lim_{w_B \to \infty} w_B\, C(w_A, w_B)\,, \qquad C_B(w_B) = \lim_{w_A \to \infty} w_A\, C(w_A, w_B)\,. \qquad (14)$$
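The set of equations above lends itself to direct numerical evaluation. The sketch below (our own illustration; a one-dimensional, normalized Gaussian weight and simple trapezoidal integration are assumed) computes Q, Y, and C for two identical weight functions, and checks numerically two properties derived later in the paper: C(w_A, w_B) decreases when either weight increases, while w_A w_B C(w_A, w_B) increases and stays below unity (Sect. 6.3):

```python
import numpy as np

# np.trapezoid in NumPy >= 2.0, np.trapz before
trapz = getattr(np, "trapezoid", None) or np.trapz

rho = 2.0                                       # density of measurements
x = np.linspace(-8.0, 8.0, 1601)                # spatial grid (1-D case)
w = np.exp(-x**2 / 2.0) / np.sqrt(2.0 * np.pi)  # normalized Gaussian weight

# With w_A = w_B = w, Q of Eq. (9) depends on s_A + s_B only.
t = np.linspace(0.0, 80.0, 8001)
q = np.array([rho * trapz(1.0 - np.exp(-tt * w), x) for tt in t])

s = np.linspace(0.0, 40.0, 801)                 # Laplace variables s_A, s_B
SA, SB = np.meshgrid(s, s, indexing="ij")
Y = np.exp(-np.interp(SA + SB, t, q))           # Eq. (10) on the (s_A, s_B) grid

def C(wA, wB):
    """Eq. (11), evaluated by a double trapezoidal rule."""
    integrand = np.exp(-SA * wA - SB * wB) * Y
    return trapz(trapz(integrand, s, axis=1), s)
```

For instance, C(0.2, 0.1) < C(0.1, 0.1), since C decreases with the weights, while 0.2 * 0.1 * C(0.2, 0.1) > 0.1 * 0.1 * C(0.1, 0.1).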
2. We show that the quantity C(w_A, w_B) of Eq. (11), in
the limit of high density $\rho$, converges to
$$C(w_A, w_B) \simeq \frac{1}{\rho^2} \biggl[1 + \frac{S_{20} + S_{11} + S_{02} - w_A - w_B}{\rho} + {\cal O}\bigl(\rho^{-2}\bigr)\biggr]\,, \qquad (15)$$
where S_{ij} are the moments of the functions (w_A, w_B):
$$S_{ij} \equiv \int \bigl[w_A(\vec\theta)\bigr]^i\, \bigl[w_B(\vec\theta)\bigr]^j\, {\rm d}^2\theta\,. \qquad (16)$$
3. We derive a number of properties for the noise terms and the
function C(w_A, w_B). In particular, we show (1) that
$w_A\, w_B\, C(w_A, w_B) \le 1$ at every point; (2)
that the measurement error $T_\sigma$ has $\sigma^2$ as upper bound;
(3) that the same error has as lower bound the
convolution of the two effective weights
$w_{\rm eff}(\vec\theta_A - \vec\theta')$ and $w_{\rm eff}(\vec\theta_B - \vec\theta')$
(cf. Lombardi & Schneider 2001); (4) that
the measurement noise converges to $\sigma^2$ at low
densities ($\rho \to 0$) and to the convolution of the two effective
weights at high densities ($\rho \to \infty$).
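The covariance itself can also be illustrated by a brute-force Monte Carlo experiment (ours, not from the paper): generate many realizations of uniformly scattered positions carrying pure measurement noise (f = 0), smooth each realization, and correlate the smoothed values at two points. The empirical variance at zero separation dominates the covariance at a few smoothing lengths:

```python
import numpy as np

rng = np.random.default_rng(1)
rho, L, sigma, sigma_w = 5.0, 8.0, 0.3, 1.0
thetas = np.array([0.0, 3.0])          # evaluation points theta_A and theta_B

def smooth(theta, pos, data):
    w = np.exp(-(theta[:, None] - pos[None, :])**2 / (2.0 * sigma_w**2))
    return w @ data / w.sum(axis=1)

maps = []
for _ in range(4000):
    n = rng.poisson(rho * 2 * L)               # Poisson number of objects
    pos = rng.uniform(-L, L, size=n)           # independent uniform positions
    data = rng.normal(0.0, sigma, size=n)      # f = 0 plus measurement noise
    maps.append(smooth(thetas, pos, data))
cov = np.cov(np.asarray(maps).T)
var_A, cov_AB = cov[0, 0], cov[0, 1]
```

Here var_A estimates the measurement-noise variance at theta_A, while cov_AB, the covariance at a separation of three smoothing lengths, is strongly suppressed.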
3 Evaluation of the covariance
3.1 Preliminaries
Before starting the analysis, let us introduce a simpler notation. In
the following we will often drop the arguments $\vec\theta_A$ and $\vec\theta_B$
in ${\rm Cov}(\tilde f; \vec\theta_A, \vec\theta_B)$ and
other related quantities. Note, in fact, that the problem is
completely defined by the introduction of the two "shifted" weight
functions $w_A(\vec\theta) \equiv w(\vec\theta_A - \vec\theta)$ and
$w_B(\vec\theta) \equiv w(\vec\theta_B - \vec\theta)$.
We also call $\tilde f_A$ and $\tilde f_B$ the values of $\tilde f$
at the two points of interest $\vec\theta_A$ and $\vec\theta_B$, so that
$$\tilde f_A \equiv \tilde f(\vec\theta_A)\,, \qquad \tilde f_B \equiv \tilde f(\vec\theta_B)\,. \qquad (17)$$
Hence, Eq. (6) can be rewritten in this notation as
$${\rm Cov}\bigl(\tilde f; \vec\theta_A, \vec\theta_B\bigr) = \bigl\langle \tilde f_A \tilde f_B \bigr\rangle - \bigl\langle \tilde f_A \bigr\rangle \bigl\langle \tilde f_B \bigr\rangle\,. \qquad (18)$$
Note that, using this notation, we are not taking advantage of the
invariance upon translation of w in Eq. (1); in
other words, we are not using the fact that w_A and w_B are
basically the same function shifted by $\vec\theta_B - \vec\theta_A$.
Actually, all calculations can be carried out without using this
property; however, we will explicitly point out simplifications that
can be made using the invariance upon translation.
We would also like to spend a few words on averages. Note that, as
anticipated in Sect. 1, we need to carry out two
averages, one with respect to the measurements $\hat f_n$ (Eqs. (2) and
(7)), and one with respect to the positions $\vec\theta_n$
(Eqs. (3) and (4)). Taking the $\vec\theta_n$ to
be random variables is often reasonable because in Astronomy one does
not have direct control over the positions where observations are
made (this happens because measurements are normally performed in the
direction of astronomical objects such as stars and galaxies, and thus
at "almost random" directions); it also has the advantage of letting
us obtain general results, independent of any particular configuration
of positions. Note, however, that taking the $\vec\theta_n$ to be
independent variables is a strong simplification which
might produce inaccurate results in some contexts (e.g., in the case of a
direction-dependent density, or in the case of clustering; see
Lombardi et al. 2001). Finally, since the number of observations N is
itself a random variable, we need to perform first the average on the
positions $\vec\theta_n$ at fixed N, and then the average on N.
In closing this section, we observe that in this paper, similarly to
Paper I, we will almost always consider the smoothing problem on the
plane, i.e. we will assume that the positions $\vec\theta_n$ are vectors of
$\mathbb{R}^2$. We proceed this way because in Astronomy the
smoothing process often takes place on small regions of the celestial
sphere, and thus on sets that can be well approximated by subsets of
the plane. However, we stress that all the results stated here can be
easily applied to smoothing processes that take place on different
sets, such as the real axis $\mathbb{R}$ or the space $\mathbb{R}^3$.
3.2 Analytical solution
Let us now focus on the first term on the r.h.s. of
Eq. (18). We have
$$\bigl\langle \tilde f_A \tilde f_B \bigr\rangle = \Biggl\langle \frac{\sum_{n,m} w_A(\vec\theta_n)\, w_B(\vec\theta_m)\, \bigl\langle \hat f_n \hat f_m \bigr\rangle}{\sum_k w_A(\vec\theta_k) \sum_l w_B(\vec\theta_l)} \Biggr\rangle\,. \qquad (19)$$
Note that the average on the r.h.s. of this equation is only with
respect to the positions $\vec\theta_n$.
Expanding the numerator in the integrand
of this equation, we obtain $N^2$ terms, N of which have n = m and
N(N - 1) of which have $n \ne m$.
We can then rewrite Eq. (19)
above as
$$\bigl\langle \tilde f_A \tilde f_B \bigr\rangle = T_1 + T_2\,, \qquad (20)$$
where
$$T_1 = \Biggl\langle \frac{\sum_n w_A(\vec\theta_n)\, w_B(\vec\theta_n)\, \bigl\langle \hat f_n^2 \bigr\rangle}{\sum_k w_A(\vec\theta_k) \sum_l w_B(\vec\theta_l)} \Biggr\rangle\,, \qquad (21)$$
$$T_2 = \Biggl\langle \frac{\sum_{n \ne m} w_A(\vec\theta_n)\, w_B(\vec\theta_m)\, \bigl\langle \hat f_n \hat f_m \bigr\rangle}{\sum_k w_A(\vec\theta_k) \sum_l w_B(\vec\theta_l)} \Biggr\rangle\,. \qquad (22)$$
Despite the apparent differences, these two terms can be simplified in
a similar manner. Let us consider first T_1. Using
Eq. (7), we can evaluate the average
$\langle \hat f_n^2 \rangle = [f(\vec\theta_n)]^2 + \sigma^2$.
Since the positions $\vec\theta_n$ appear as "dummy variables" in
Eq. (21), we can relabel them as follows:
$$T_1 = N \Biggl\langle \frac{w_A(\vec\theta_1)\, w_B(\vec\theta_1)\, \bigl\{\bigl[f(\vec\theta_1)\bigr]^2 + \sigma^2\bigr\}}{\sum_k w_A(\vec\theta_k) \sum_l w_B(\vec\theta_l)} \Biggr\rangle\,. \qquad (23)$$
In order to simplify this equation, we use a technique similar to the
one adopted in Paper I. More precisely, we split the two sums in the
denominator of the integrand of Eq. (23), taking away the
terms $w_A(\vec\theta_1)$ and $w_B(\vec\theta_1)$.
Hence, we write
$$T_1 = \frac{N}{A} \int w_A(\vec\theta')\, w_B(\vec\theta')\, \bigl\{\bigl[f(\vec\theta')\bigr]^2 + \sigma^2\bigr\}\, C\bigl(w_A(\vec\theta'), w_B(\vec\theta')\bigr)\, {\rm d}^2\theta'\,, \qquad (24)$$
where C(w_A, w_B) is a corrective factor given by
$$C(w_A, w_B) = \Biggl\langle \frac{1}{\bigl[w_A + y_A\bigr]\bigl[w_B + y_B\bigr]} \Biggr\rangle\,. \qquad (25)$$
Note that in the definition of C, w_A and w_B are formally taken to be two real variables (instead
of two real functions of argument $\vec\theta$).
The definition of C above suggests defining two new random
variables y_A and y_B:
$$y_X \equiv \sum_{n=2}^N w_X(\vec\theta_n)\,, \qquad X \in \{A, B\}\,. \qquad (26)$$
Note that the sum runs from n = 2 to n = N. If we could evaluate
the combined probability distribution function p_y(y_A, y_B)
of y_A and y_B, we would have solved our problem: in fact,
we could use this probability to write
C(w_A, w_B) as follows:
$$C(w_A, w_B) = \int_0^\infty {\rm d}y_A \int_0^\infty {\rm d}y_B\, \frac{p_y(y_A, y_B)}{\bigl(w_A + y_A\bigr)\bigl(w_B + y_B\bigr)}\,. \qquad (27)$$
To obtain the probability distribution
p_y(y_A, y_B), we need the combined probability distribution
p_w(w_A, w_B) of w_A and w_B. This distribution is implicitly
defined by requiring that the probability that $w_A(\vec\theta_n)$ be in
the range $[w_A, w_A + {\rm d}w_A]$ and $w_B(\vec\theta_n)$ be in the range
$[w_B, w_B + {\rm d}w_B]$ is $p_w(w_A, w_B)\, {\rm d}w_A\, {\rm d}w_B$.
We can evaluate p_w(w_A, w_B) using
$$p_w(w_A, w_B) = \frac{1}{A} \int_A \delta\bigl(w_A - w_A(\vec\theta)\bigr)\, \delta\bigl(w_B - w_B(\vec\theta)\bigr)\, {\rm d}^2\theta\,. \qquad (28)$$
Turning back to (y_A, y_B), we can write a similar expression for
p_y:
$$p_y(y_A, y_B) = \frac{1}{A^{N-1}} \int_A {\rm d}^2\theta_2 \cdots \int_A {\rm d}^2\theta_N\, \delta\Bigl(y_A - \sum_{n=2}^N w_{An}\Bigr)\, \delta\Bigl(y_B - \sum_{n=2}^N w_{Bn}\Bigr)\,, \qquad (29)$$
where for simplicity we have called $w_{Xn} \equiv w_X(\vec\theta_n)$.
Note that inserting this equation into Eq. (27) we recover
Eq. (25), as expected. Actually, for our purposes it is more
useful to consider y_X as the sum of N - 1 random variables
$w_{Xn}$. In other words, we consider the set of couples
$(w_{An}, w_{Bn})$, made of the two weight functions at the
various positions, as a set of N - 1 independent
two-dimensional random variables (w_A, w_B) with probability
distribution p_w(w_A, w_B). (Hence, similarly to Eq. (25),
we consider the weight functions w_X to be real variables instead of
real functions; the independence of the positions $\vec\theta_n$ then
implies the independence of the couples (w_{An}, w_{Bn}).)
Taking this point of view, we can rewrite Eq. (29) as
$$p_y(y_A, y_B) = \int \prod_{n=2}^N \Bigl[{\rm d}w_{An}\, {\rm d}w_{Bn}\, p_w(w_{An}, w_{Bn})\Bigr]\, \delta\Bigl(y_A - \sum_{n=2}^N w_{An}\Bigr)\, \delta\Bigl(y_B - \sum_{n=2}^N w_{Bn}\Bigr)\,. \qquad (30)$$
It is well known in Statistics that the sum of independent random
variables with the same probability distribution can be better studied
using Markov's method (see, e.g., Chandrasekhar 1989; see
also Deguchi & Watson 1987 for an application to microlensing
studies). This method is based on the use of Fourier transforms of
the probability distributions p_w and p_y. However, since we are
dealing with non-negative quantities (we recall that we assumed
$w(\vec\theta) \ge 0$), we can replace the Fourier transform with the
Laplace transform, which turns out to be more appropriate for our
problem (see Appendix D for a summary of the
properties of Laplace transforms). Hence, we define
W(s_A, s_B) and Y(s_A, s_B) to be the Laplace transforms of
p_w(w_A, w_B) and p_y(y_A, y_B), respectively. Note that, since both functions
p_w and p_y have two arguments, we need two arguments for the Laplace
transforms as well:
$$W(s_A, s_B) \equiv \int_0^\infty {\rm d}w_A \int_0^\infty {\rm d}w_B\, p_w(w_A, w_B)\, {\rm e}^{-s_A w_A - s_B w_B}\,, \qquad (31)$$
$$Y(s_A, s_B) \equiv \int_0^\infty {\rm d}y_A \int_0^\infty {\rm d}y_B\, p_y(y_A, y_B)\, {\rm e}^{-s_A y_A - s_B y_B}\,. \qquad (32)$$
We now use in these expressions Eq. (28) for p_w and
Eq. (30) for p_y, thus obtaining
$$W(s_A, s_B) = \frac{1}{A} \int_A {\rm e}^{-s_A w_A(\vec\theta) - s_B w_B(\vec\theta)}\, {\rm d}^2\theta\,, \qquad (33)$$
$$Y(s_A, s_B) = \bigl[W(s_A, s_B)\bigr]^{N-1}\,. \qquad (34)$$
Hence, p_y can in principle be obtained from the following scheme.
First, we evaluate W(s_A, s_B) using Eq. (33); then we
calculate Y(s_A, s_B) from Eq. (34); and finally we
back-transform this function to obtain p_y(y_A, y_B).
Actually, another, more convenient technique is viable. Following
the path of Paper I, we now take the "continuous limit" and treat
N as a random variable. As explained in
Sect. 1, we can take this limit using two
equivalent approaches:
 We keep the area A fixed and consider N to be a random
variable with the Poisson distribution given by Eq. (3). We
then average over all possible configurations obtained.
 We take the limit $N \to \infty$, $A \to \infty$, keeping the density
$\rho = N / A$ fixed.
The equivalence of the two methods can be shown as follows. Let us
consider a large area $A' \supset A$, and let us suppose that the
number N' of objects inside A' is fixed. Since objects
are randomly distributed inside A', the probability for each object
to fall inside A is just A / A'. Hence N, the number of objects
inside A, follows a binomial distribution:
$$p(N) = \binom{N'}{N} \Bigl(\frac{A}{A'}\Bigr)^N \Bigl(1 - \frac{A}{A'}\Bigr)^{N'-N}\,. \qquad (35)$$
If we now let N' go to infinity with $\rho = N'/A'$ fixed, the
probability distribution of N converges (see, e.g., Eadie et al. 1971)
to the Poisson distribution in Eq. (3).
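The binomial-to-Poisson convergence invoked here is easy to verify numerically; the following short sketch (ours) measures the total-variation distance between the two distributions as N' grows at fixed mean ρA (the sums are truncated at k = 40, where both distributions are utterly negligible for a mean of 5):

```python
import math

def binom_pmf(k, n, p):
    return math.comb(n, k) * p**k * (1.0 - p)**(n - k)

def poisson_pmf(k, lam):
    return math.exp(-lam) * lam**k / math.factorial(k)

def tv_distance(n_prime, lam, k_max=40):
    """Total-variation distance between Bin(N', A/A') and Poisson(rho A),
    with the mean lam = N' A / A' held fixed."""
    p = lam / n_prime
    return 0.5 * sum(abs(binom_pmf(k, n_prime, p) - poisson_pmf(k, lam))
                     for k in range(k_max + 1))

lam = 5.0                               # expected number of objects rho * A
tv_small, tv_large = tv_distance(50, lam), tv_distance(5000, lam)
```

The distance shrinks roughly like 1/N', in line with the convergence to the Poisson distribution of Eq. (3).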
We will follow here the second strategy, i.e. we will take the limit
$N \to \infty$ keeping $\rho$ constant. In the limit $N \to \infty$
the quantity W(s_A, s_B) goes to unity and
thus is not useful for our purposes. Instead, it is convenient to
define
$$Q(s_A, s_B) \equiv N \bigl[1 - W(s_A, s_B)\bigr]\,. \qquad (36)$$
This definition is sensible because, this way, Q remains finite for
$N \to \infty$. In the continuous limit, Eq. (34)
becomes
$$Y(s_A, s_B) = \exp\bigl[-Q(s_A, s_B)\bigr]\,. \qquad (37)$$
In order to evaluate C(w_A, w_B), we rewrite its definition (27) as
$$C(w_A, w_B) = \int_0^\infty {\rm d}y_A \int_0^\infty {\rm d}y_B\, \frac{\bar p_y(y_A, y_B)}{y_A\, y_B}\,, \qquad (38)$$
where $\bar p_y$ is a "shifted" version of p_y,
$$\bar p_y(y_A, y_B) \equiv p_y(y_A - w_A, y_B - w_B)\, H_{w_A}(y_A)\, H_{w_B}(y_B)\,. \qquad (39)$$
Here $H_{w_X}$ are Heaviside functions at the positions
w_X, i.e.
$$H_{w_X}(y_X) = \begin{cases} 0 & \mbox{if } y_X < w_X\,, \\ 1 & \mbox{if } y_X \ge w_X\,. \end{cases} \qquad (40)$$
Looking back at Eq. (38), we can interpret the integration
present in this equation as a very particular case of a
Laplace transform with vanishing argument. In other words, we can
write
$$C(w_A, w_B) = {\cal L}\biggl[\frac{\bar p_y(y_A, y_B)}{y_A\, y_B}\biggr](0, 0)\,. \qquad (41)$$
Thus our problem is solved if we can obtain the Laplace transform of
$\bar p_y(y_A, y_B)/(y_A y_B)$ evaluated at s_A = s_B = 0. From the properties
of the Laplace transform (cf. Eq. (D.7)) we find
$${\cal L}\biggl[\frac{\bar p_y(y_A, y_B)}{y_A\, y_B}\biggr](s_A, s_B) = \int_{s_A}^\infty {\rm d}s_A' \int_{s_B}^\infty {\rm d}s_B'\, Z_w(s_A', s_B')\,, \qquad (42)$$
where Z_w is the Laplace transform of $\bar p_y$:
$$Z_w(s_A, s_B) = {\rm e}^{-s_A w_A - s_B w_B}\, Y(s_A, s_B)\,. \qquad (43)$$
Combining together Eqs. (41)-(43) we finally obtain
$$C(w_A, w_B) = \int_0^\infty {\rm d}s_A \int_0^\infty {\rm d}s_B\, {\rm e}^{-s_A w_A - s_B w_B}\, Y(s_A, s_B)\,. \qquad (44)$$
In summary, the set of equations that can be used to evaluate T_1 is
$$Q(s_A, s_B) = \rho \int \Bigl[1 - {\rm e}^{-s_A w_A(\vec\theta) - s_B w_B(\vec\theta)}\Bigr]\, {\rm d}^2\theta\,, \qquad (45)$$
$$Y(s_A, s_B) = \exp\bigl[-Q(s_A, s_B)\bigr]\,, \qquad (46)$$
$$C(w_A, w_B) = \int_0^\infty {\rm d}s_A \int_0^\infty {\rm d}s_B\, {\rm e}^{-s_A w_A - s_B w_B}\, Y(s_A, s_B)\,, \qquad (47)$$
$$T_1 = \rho \int w_A(\vec\theta)\, w_B(\vec\theta)\, \bigl\{\bigl[f(\vec\theta)\bigr]^2 + \sigma^2\bigr\}\, C\bigl(w_A(\vec\theta), w_B(\vec\theta)\bigr)\, {\rm d}^2\theta\,. \qquad (48)$$
These equations solve completely the first part of our problem,
the determination of T_1.
Let us now consider the second term of Eq. (20), namely T_2
(see Eq. (22)). We first evaluate the average
$\langle \hat f_n \hat f_m \rangle$ that appears in the numerator of the integrand of
Eq. (22), obtaining $\langle \hat f_n \hat f_m \rangle = f(\vec\theta_n)\, f(\vec\theta_m)$
(cf. Eq. (7) with $n \ne m$). Then we relabel the "dummy" variables
similarly to what has been done for T_1, thus obtaining
$$T_2 = N(N-1) \Biggl\langle \frac{w_A(\vec\theta_1)\, w_B(\vec\theta_2)\, f(\vec\theta_1)\, f(\vec\theta_2)}{\sum_k w_A(\vec\theta_k) \sum_l w_B(\vec\theta_l)} \Biggr\rangle\,. \qquad (49)$$
We now split, in the two sums in the denominator, the terms n = 1
and n = 2, and define the new random variables
$$z_X \equiv \sum_{n=3}^N w_X(\vec\theta_n)\,, \qquad X \in \{A, B\}\,. \qquad (50)$$
Again, if we know the combined probability distribution
p_z(z_A, z_B) of z_A and z_B our problem is solved, since we can
write (cf. Eqs. (24) and (27))
$$T_2 = \frac{N(N-1)}{A^2} \int {\rm d}^2\theta' \int {\rm d}^2\theta''\, w_A(\vec\theta')\, w_B(\vec\theta'')\, f(\vec\theta')\, f(\vec\theta'') \int_0^\infty {\rm d}z_A \int_0^\infty {\rm d}z_B\, \frac{p_z(z_A, z_B)}{\bigl[w_A(\vec\theta') + w_A(\vec\theta'') + z_A\bigr]\bigl[w_B(\vec\theta') + w_B(\vec\theta'') + z_B\bigr]}\,. \qquad (51)$$
Actually, in the continuous limit, z_X is indistinguishable from
y_X (z_X differs from y_X only in the fact that it is the sum of
N - 2 "weights" instead of N - 1; however, N goes to infinity in
the continuous limit and thus y_X and z_X converge to the same
quantity). Thus we can rewrite Eq. (51) as
$$T_2 = \rho^2 \int {\rm d}^2\theta' \int {\rm d}^2\theta''\, w_A(\vec\theta')\, w_B(\vec\theta'')\, f(\vec\theta')\, f(\vec\theta'')\, C\bigl(w_A(\vec\theta') + w_A(\vec\theta''),\, w_B(\vec\theta') + w_B(\vec\theta'')\bigr)\,, \qquad (52)$$
where C is still given by Eq. (47).
Finally, in order to evaluate ${\rm Cov}(\tilde f; \vec\theta_A, \vec\theta_B)$,
we still need the simple averages $\langle \tilde f_A \rangle$ and
$\langle \tilde f_B \rangle$. These can be obtained
directly using the technique described in Paper I, where we have shown
that the set of equations to be used is
$$Q_X(s_X) = \rho \int \Bigl[1 - {\rm e}^{-s_X w_X(\vec\theta)}\Bigr]\, {\rm d}^2\theta\,, \qquad (53)$$
$$Y_X(s_X) = \exp\bigl[-Q_X(s_X)\bigr]\,, \qquad (54)$$
$$C_X(w_X) = \int_0^\infty {\rm e}^{-s_X w_X}\, Y_X(s_X)\, {\rm d}s_X\,, \qquad (55)$$
$$\bigl\langle \tilde f(\vec\theta_X) \bigr\rangle = \rho \int w_X(\vec\theta)\, f(\vec\theta)\, C_X\bigl(w_X(\vec\theta)\bigr)\, {\rm d}^2\theta\,, \qquad (56)$$
with $X \in \{A, B\}$.
We recall that in Paper I we called the combination
$\rho\, w_X(\vec\theta)\, C_X\bigl(w_X(\vec\theta)\bigr)$ the effective weight (cf.
Eq. (5) in the introduction). Alternatively, we can use the
quantities Y(s_A, s_B) and C(w_A, w_B) to calculate the correcting
factors C_A and C_B. From Eqs. (53) and (54) we
immediately find
$$Q_A(s_A) = Q(s_A, 0)\,, \qquad (57)$$
$$Q_B(s_B) = Q(0, s_B)\,. \qquad (58)$$
Then, using the properties of Laplace transforms (cf.
Eq. (D.10)), and comparing the definition of
C(w_A, w_B) (Eq. (44)) with that of C_X(w_X) (Eq. (55)), we
find
$$C_A(w_A) = \lim_{w_B \to \infty} w_B\, C(w_A, w_B)\,, \qquad C_B(w_B) = \lim_{w_A \to \infty} w_A\, C(w_A, w_B)\,. \qquad (59)$$
We now have at our disposal the complete set of equations that can be
used to determine the covariance of $\tilde f$.
In closing this subsection we make a few comments on the translation
invariance of w_X (see Sect. 3.1). Since $w_A$ and $w_B$
differ by an angular shift $\vec\theta_B - \vec\theta_A$
only, the two functions Q_A and Q_B are the same, so that C_A
coincides with C_B. Not surprisingly, the two effective weights
also differ only by a shift.
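The Paper I relations recalled above are also easy to evaluate numerically. The sketch below (ours; one-dimensional Gaussian weight, trapezoidal integrals) computes the correcting factor C_A and the effective weight ρ w C_A(w), and checks that the latter integrates to unity, as required by the normalization property of Sect. 6.1, while being broader than w itself:

```python
import numpy as np

# np.trapezoid in NumPy >= 2.0, np.trapz before
trapz = getattr(np, "trapezoid", None) or np.trapz

rho = 2.0
x = np.linspace(-10.0, 10.0, 2001)
w = np.exp(-x**2 / 2.0) / np.sqrt(2.0 * np.pi)     # normalized weight
s = np.linspace(0.0, 200.0, 4001)

# Eqs. (53)-(55): Q_A, Y_A, and the correcting factor C_A
QA = np.array([rho * trapz(1.0 - np.exp(-ss * w), x) for ss in s])
YA = np.exp(-QA)

def C_A(wval):
    return trapz(np.exp(-s * wval) * YA, s)

w_eff = rho * w * np.array([C_A(wv) for wv in w])   # effective weight
norm = trapz(w_eff, x)
var_w = trapz(x**2 * w, x)                  # variance of w (unity here)
var_eff = trapz(x**2 * w_eff, x) / norm     # variance of w_eff
```

The effective weight integrates to unity to good numerical accuracy, and its variance exceeds that of w, illustrating the broadening at finite density found in Paper I.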
3.3 Noise contributions
A simple preliminary analysis of Eqs. (48) and (52)
allows us to recognize two main sources of noise. In
fact, one term in Eq. (48) is proportional to $\sigma^2$ and
is clearly related to the measurement errors of f, namely
$$T_\sigma = \sigma^2 \rho \int w_A(\vec\theta)\, w_B(\vec\theta)\, C\bigl(w_A(\vec\theta), w_B(\vec\theta)\bigr)\, {\rm d}^2\theta\,. \qquad (60)$$
The other factors entering ${\rm Cov}(\tilde f)$ can be interpreted as Poisson
noise. Hence, we collect the remaining term of Eq. (48), the term
T_2 of Eq. (52), and the product $-\langle \tilde f_A \rangle \langle \tilde f_B \rangle$
into a single contribution T_P, the total Poisson noise.
Note that the Poisson noise T_P, in contrast with the measurement
noise $T_\sigma$, strongly depends on the signal $f(\vec\theta)$.
The noise term $T_\sigma$ is quite intuitive and does not require a
long explanation. We note here only that this term is independent of
the field f because we assumed measurements with fixed variance
$\sigma^2$ (see Eq. (7)).
The Poisson noise T_P can be better understood with a
simple example. Suppose that $f(\vec\theta)$ is not
constant, and let us focus on a point where this function has a strong
gradient. Then, when measuring $\tilde f$ at this point, we could
obtain an excess of signal because of an overdensity of objects in the
region where f is large; the opposite happens if we have
an overdensity of objects in the region where f is
small. This noise source, called Poisson noise, vanishes if the
function f is flat.
In the rest of this paper we will study the properties of the
twopoint correlation function. Before proceeding, however, we need
to consider an important generalization of the results obtained here
to the case of vanishing weights.
4 Vanishing weights
So far we have implicitly assumed that both w_A and w_B are always
positive. In some cases, however, it might be interesting to consider
vanishing weight functions (for example, functions with finite
support). We then need to modify our equations accordingly.
When using vanishing weights, we might encounter situations where the
denominator of Eq. (1) vanishes because all weight functions
vanish as well. In this case, the
estimator $\tilde f(\vec\theta)$ cannot even be defined (we encounter
the ratio 0 / 0), and any further statistical analysis is
meaningless. In practice, when smoothing data using a vanishing
weight function, one could just ignore the points $\vec\theta$ where
the smoothed function $\tilde f(\vec\theta)$ is not defined, i.e. the
points $\vec\theta$ for which $w(\vec\theta - \vec\theta_n) = 0$ for
every n. This simple approach leads to smoothed maps with
"holes", i.e. maps defined only on subsets of the plane. Hence, if we
choose this approach, we need to modify accordingly the statistical
analysis that we carry out in this paper.
This problem was already encountered in Paper I, where we used the
following prescription. When using a finite-field weight function, we
discard, for every configuration of measurement points $\{\vec\theta_n\}$,
the points $\vec\theta$ on the plane for which the
smoothing $\tilde f(\vec\theta)$ is not defined. Then, when taking
the average with respect to all possible configurations
of $\{\vec\theta_n\}$, we just exclude these
configurations. We stress that, this way, the averages
$\langle \tilde f(\vec\theta_A) \rangle$ and $\langle \tilde f(\vec\theta_B) \rangle$
of the smoothing (1) at two
different points $\vec\theta_A$ and $\vec\theta_B$ are effectively
carried out using different ensembles: in one case we exclude the
"bad configurations" for $\vec\theta_A$, in the other case the "bad
configurations" for $\vec\theta_B$.
The same prescription is also adopted here to evaluate the covariance
of our estimator. Hence, when performing the ensemble average to
estimate the covariance ${\rm Cov}(\tilde f; \vec\theta_A, \vec\theta_B)$,
we explicitly exclude configurations for which either $\tilde f(\vec\theta_A)$
or $\tilde f(\vec\theta_B)$ cannot be evaluated. This is implemented with a slight
change in the definition of p_y, which in turn implies a change in
Eq. (46) for Y. A rigorous generalization of the relevant
equations can now be carried out without significant difficulties.
However, the equations obtained are quite cumbersome and present some
technical peculiarities. Hence, we prefer to postpone a complete
discussion of vanishing weights until
Appendix A; we report here only the main
results.
As mentioned above, the basic problem of having vanishing weights is
that in some cases the estimator is not defined. Hence, it is
convenient to define three probabilities, namely P_A and P_B, the
probabilities, respectively, that $\tilde f(\vec\theta_A)$ and $\tilde f(\vec\theta_B)$ are
not defined, and P_{AB}, the probability that both quantities are
not defined. Note that, because of the invariance upon translation
of w, we have P_A = P_B. These probabilities can be estimated
without difficulty. In fact, the quantity $\tilde f(\vec\theta_X)$ is not
defined if and only if there is no object inside the support of w_X.
Since the number of points inside the support of w_X follows a
Poisson probability, we have $P_X = \exp\bigl[-\rho\, {\cal A}(w_X)\bigr]$,
where ${\cal A}(w_X)$ is the area of the support of w_X. Similarly, calling
${\cal A}(w_A \cup w_B)$ the area of the union of the supports of w_A and w_B, we
find $P_{AB} = \exp\bigl[-\rho\, {\cal A}(w_A \cup w_B)\bigr]$.
Using Eqs. (45)
and (46) we can also verify the following relations:
$$P_{AB} = \lim_{s_A \to \infty} \lim_{s_B \to \infty} Y(s_A, s_B)\,, \qquad (61)$$
$$P_A = \lim_{s_A \to \infty} Y(s_A, 0)\,, \qquad P_B = \lim_{s_B \to \infty} Y(0, s_B)\,. \qquad (62)$$
Appendix A further clarifies the
relationship between the limiting values of Y and the probabilities
defined above. In the following we will use a simplified notation for
limits, writing for example $P_A = Y(\infty, 0)$ for
the left equation in (62).
The only significant modification to the equations obtained above for
vanishing weights is an overall factor in Eq. (47), which now
becomes
$$C(w_A, w_B) = \frac{1}{1 - P_A - P_B + P_{AB}} \int_0^\infty {\rm d}s_A \int_0^\infty {\rm d}s_B\, {\rm e}^{-s_A w_A - s_B w_B}\, Y(s_A, s_B)\,. \qquad (63)$$
The factor 1/(1 - P_A - P_B + P_{AB}) is basically a
renormalization; more precisely, it is introduced to take into account
the fact that we are discarding cases where either $\tilde f(\vec\theta_A)$
or $\tilde f(\vec\theta_B)$ is not defined. Note, in fact, that in agreement with
the inclusion-exclusion principle, (1 - P_A - P_B + P_{AB}) is the
probability that both $\tilde f(\vec\theta_A)$ and $\tilde f(\vec\theta_B)$ are defined.
Since the combination (1 - P_A - P_B + P_{AB}) enters several
equations, we define
$$\Pi \equiv 1 - P_A - P_B + P_{AB}\,. \qquad (64)$$
Equation (63) is the most important correction to take into
account for vanishing weights. Actually, there are also a number of
peculiarities to consider when dealing with the probability p_y and
its Laplace transform Y. Fortunately, however, these peculiarities
have no significant consequences for our purposes and thus we can still
safely use Eqs. (45) and (46). Again, we refer to
Appendix A for a complete explanation.
In closing this section, we spend a few words on weight functions with
arbitrary sign (i.e., functions $w(\vec\theta)$ that can be positive,
vanishing, or negative depending on $\vec\theta$). As mentioned in
Sect. 2, in this case a statistical study of the
smoothing (1) cannot be carried out using our framework. In
order to understand why this happens, let us consider the weight
function

(65) 
This function is continuous, positive for small separations $|\vec\theta|$,
and quickly vanishes for large $|\vec\theta|$. Let us
then consider separately the numerator and denominator of
Eq. (1). The denominator can clearly be positive or
negative; more precisely, the denominator is positive for points
$\vec\theta$ close to at least one of the locations $\vec\theta_n$,
and negative for points $\vec\theta$ which are in "voids" (i.e., far
away from the locations $\vec\theta_n$). Hence, the lines where
the denominator vanishes separate the regions of high density of
locations from the regions of low density. Note that, even for very
large average densities $\rho$, we still expect to find "voids" of
arbitrary size (in other words, for every finite density $\rho$, there
is a non-vanishing probability of having no point inside an
arbitrarily large region). As a result, there will always be regions
where the denominator vanishes. The discussion for the numerator is
similar but, in this case, we also need to take into account the field
$f(\vec\theta)$. Hence, we still expect to have regions where the
numerator is positive and regions where it is negative but, clearly,
these regions will in general be different from the analogous regions
for the denominator. As a result, when evaluating the ratio between
the numerator and the denominator, we will obtain arbitrarily large
values close to the lines where the denominator vanishes. Note also
that these lines will change for different configurations of the locations
$\{\vec\theta_n\}$. In summary, if the weight function is allowed
to be negative, the denominator of Eq. (1) is no longer
guaranteed to be positive, and infinities are expected when performing
the ensemble average.
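The sign problem can be made concrete with a toy one-dimensional weight of the kind just described (the explicit function used in the paper is Eq. (65); here we simply pick an illustrative function of the same type, w(x) = (1 − x²) e^{−x²}, which is continuous, positive for |x| < 1, negative beyond, and rapidly vanishing at large |x|). For a configuration of points with a void, the denominator of Eq. (1) indeed takes both signs:

```python
import numpy as np

def w(d):
    # Illustrative weight with arbitrary sign: positive for |d| < 1,
    # negative for |d| > 1, quickly vanishing at large separations.
    return (1.0 - d**2) * np.exp(-d**2)

positions = np.array([2.0, 2.5, 9.0])     # a "void" lies around x ~ 6

def denominator(xval):
    return float(np.sum(w(xval - positions)))

d_near = denominator(2.0)   # close to two measurement locations: positive
d_void = denominator(6.0)   # deep inside the void: negative
```

Somewhere between the two points the denominator must cross zero, so the estimator (1) diverges there, illustrating why negative weights cannot be handled in this framework.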
5 Moments expansion
In most applications, the density of objects is rather large. Hence,
it is interesting to obtain an expansion for
C(w_{A}, w_{B}) valid at
high densities.
In Paper I we already obtained an expansion for C_A(w_A) (or,
equivalently, C_B(w_B)) for $\rho \to \infty$:
$$C_A(w_A) \simeq \frac{1}{\rho} \biggl[1 + \frac{S_{20} - w_A}{\rho} + \frac{w_A^2 - 3 w_A S_{20} + 3 S_{20}^2 - S_{30}}{\rho^2} + {\cal O}\bigl(\rho^{-3}\bigr)\biggr]\,. \qquad (66)$$
In this equation, S_{ij} are the moments of the functions (w_A, w_B), defined as
$$S_{ij} \equiv \int \bigl[w_A(\vec\theta)\bigr]^i\, \bigl[w_B(\vec\theta)\bigr]^j\, {\rm d}^2\theta\,. \qquad (67)$$
Clearly, only the moments S_{i0} enter Eq. (66), since the
form of w_B is not relevant for C_A(w_A). Similarly, the
expression for C_B(w_B) contains only the moments S_{0j}. Note
that for weight functions invariant upon translation we have
S_{ij} = S_{ji}.

Figure 1:
The moment expansion of C(w_A, w_B) for
one-dimensional Gaussian weight functions
w_A(x) = w_B(x) centered on 0 and with unit variance. The plot shows the
various order approximations obtained using the method
described in Sect. 5 (equations for
the orders n = 3 and n = 4 are not explicitly reported in the
text; see however Table B.1 in
Appendix B). The density used is given in the text.

Figure 2:
The function C(w_A, w_B) is
monotonically decreasing with w_A and w_B, while
w_A w_B C(w_A, w_B) (scaled in this plot) is monotonically
increasing. The parameters used for this figure are the same
as in Fig. 1. Note that, since P_A = P_B = 0, the small-weight
limits of w_A w_B C(w_A, w_B) vanish, in
agreement with Eqs. (82) and (83); moreover
$w_A w_B\, C(w_A, w_B) \to 1$ for large weights, as expected from
Eq. (84).
A similar expansion can be obtained for C(w_A, w_B). Calculations
are basically a generalization of what was done in Paper I for C(w)
and can be found in Appendix B. Here we
report only the first terms of the final result:
$$C(w_A, w_B) \simeq \frac{1}{\rho^2} \biggl[1 + \frac{S_{20} + S_{11} + S_{02} - w_A - w_B}{\rho} + {\cal O}\bigl(\rho^{-2}\bigr)\biggr]\,. \qquad (68)$$
We note that using this expansion and Eqs. (59) we can
recover the first terms of Eq. (66), as expected.
Figure 1 shows the results of applying this expansion
to a Gaussian weight. For clarity, we have considered in this figure
(and in others shown below) a one-dimensional smoothing instead of the
two-dimensional case discussed in the text, and we have used x as the
spatial variable instead of $\vec\theta$. The figure refers to two
identical Gaussian weight functions with vanishing average and unit
variance. A comparison of this figure with Fig. 2 of Paper I shows
that the convergence here is much slower. Nevertheless,
Eq. (68) will be very useful to investigate some important
limiting cases in the next section.
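The rate of convergence can also be probed numerically. The sketch below (ours; trapezoidal integration, identical one-dimensional Gaussian weights, and only the first-order term of the expansion in the form given above) compares the exact C of Eq. (47) with the expansion at a fairly high density, where the two agree to within a few percent:

```python
import numpy as np

# np.trapezoid in NumPy >= 2.0, np.trapz before
trapz = getattr(np, "trapezoid", None) or np.trapz

rho = 10.0
x = np.linspace(-8.0, 8.0, 3201)
w = np.exp(-x**2 / 2.0) / np.sqrt(2.0 * np.pi)   # w_A = w_B, normalized
S2 = trapz(w**2, x)                              # S_20 = S_11 = S_02 here

# Exact C(w, w): with identical weights Q depends only on t = s_A + s_B,
# and the double integral of Eq. (47) reduces to int_0^inf t exp(-t*w - Q(t)) dt.
t = np.linspace(0.0, 30.0, 6001)
Q = np.array([rho * trapz(1.0 - np.exp(-tt * w), x) for tt in t])

wv = w.max()                                     # evaluate at the peak, w(0)
exact = rho**2 * trapz(t * np.exp(-t * wv - Q), t)
approx = 1.0 + (3.0 * S2 - 2.0 * wv) / rho       # first-order expansion
```

At lower densities the agreement degrades quickly, consistent with the slow convergence visible in Fig. 1.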
6 Properties
In this section we will study in detail the two noise terms $T_\sigma$ and
T_P introduced in Sect. 3.3,
showing their properties and considering several limiting cases. The
results obtained are of clear interest in themselves; for example, we
will derive here upper and lower limits for the measurement error
$T_\sigma$ that can be used at low and high densities. Moreover, this
section helps us understand the results obtained so far, and in
particular the peculiarities of vanishing weights.
6.1 Normalization
A simple normalization property for C(w_A, w_B) can be derived,
similarly to what we have already done for the average of $\tilde f$
in Paper I. Suppose that $f(\vec\theta) = 1$ everywhere, and that no errors are
present on the measurements, so that $\sigma = 0$.
In this case we will always measure $\tilde f(\vec\theta) = 1$ (see Eq. (1)),
so that $\langle \tilde f_A \tilde f_B \rangle = 1$,
$\langle \tilde f_A \rangle \langle \tilde f_B \rangle = 1$,
and no error is expected on $\tilde f$.
This result can also be recovered using the analytical expressions obtained
so far. Let us first consider the simpler case of non-vanishing
weights.
Using Eqs. (47) and (48), we can write the term
T_1 in the case $f(\vec\theta) = 1$, $\sigma = 0$ as
$$T_1 = \int_0^\infty {\rm d}s_A \int_0^\infty {\rm d}s_B\, Y(s_A, s_B)\, \rho \int w_A(\vec\theta)\, w_B(\vec\theta)\, {\rm e}^{-s_A w_A(\vec\theta) - s_B w_B(\vec\theta)}\, {\rm d}^2\theta\,. \qquad (69)$$
The last integrand in this equation can be rewritten as
(cf. the definition of Q, Eq. (45)):
$$\rho \int w_A\, w_B\, {\rm e}^{-s_A w_A - s_B w_B}\, {\rm d}^2\theta = -\frac{\partial^2 Q(s_A, s_B)}{\partial s_A\, \partial s_B}\,. \qquad (70)$$
Analogously, for T_2 we obtain (cf. Eq. (52))
$$T_2 = \int_0^\infty {\rm d}s_A \int_0^\infty {\rm d}s_B\, Y(s_A, s_B)\, \frac{\partial Q}{\partial s_A}\, \frac{\partial Q}{\partial s_B}\,. \qquad (71)$$
We can integrate this expression by parts taking
$Y\, \partial Q/\partial s_A = -\partial Y/\partial s_A$ as the differential term:
$$T_2 = \int_0^\infty {\rm d}s_B \biggl[Y(0, s_B)\, \frac{\partial Q}{\partial s_B}(0, s_B) - Y(\infty, s_B)\, \frac{\partial Q}{\partial s_B}(\infty, s_B)\biggr] + \int_0^\infty {\rm d}s_A \int_0^\infty {\rm d}s_B\, Y\, \frac{\partial^2 Q}{\partial s_A\, \partial s_B}\,. \qquad (72)$$
We now observe that the last term in Eq. (72) is, apart from the sign,
identical to what we found in Eq. (70). Hence, the sum T_1 + T_2 is
$$T_1 + T_2 = Y(0^+, 0^+) - Y(0^+, \infty) - Y(\infty, 0^+) + Y(\infty, \infty) = 1\,. \qquad (73)$$
The last equation holds because, for non-vanishing weights,
$Y(0^+, 0^+) = 1$ and all other terms vanish (cf.
Eqs. (61)-(62)). Hence, as expected,
${\rm Cov}(\tilde f; \vec\theta_A, \vec\theta_B) = T_1 + T_2 - \langle \tilde f_A \rangle \langle \tilde f_B \rangle = 0$.
In the case of vanishing weights, we can still use Eqs. (70)
and (72) with an additional factor $1/(1 - P_A - P_B + P_{AB})$ (due to the
extra factor in Eq. (63)). The last step in
Eq. (73) thus now becomes
$$T_1 + T_2 = \frac{Y(0^+, 0^+) - Y(0^+, \infty) - Y(\infty, 0^+) + Y(\infty, \infty)}{1 - P_A - P_B + P_{AB}} = \frac{1 - P_B - P_A + P_{AB}}{1 - P_A - P_B + P_{AB}} = 1\,. \qquad (74)$$
The last equality holds since now Y does not vanish for large
(s_A, s_B) (see again Eqs. (61)-(62)).
6.2 Scaling
Similarly to what was already shown in Paper I, some scaling
invariance properties hold for all expressions encountered so far.
First, we note that, although we have assumed that the weight
functions w_A and w_B are normalized to unity, all results are
clearly independent of their actual normalization. Hence, a trivial
scaling property holds: all results (and in particular the final
expression for ${\rm Cov}(\tilde f; \vec\theta_A, \vec\theta_B)$)
are left unchanged by the
transformation $w(\vec\theta) \mapsto k\, w(\vec\theta)$ or,
equivalently,
$$w_A(\vec\theta) \mapsto k\, w_A(\vec\theta)\,, \qquad w_B(\vec\theta) \mapsto k\, w_B(\vec\theta)\,. \qquad (75)$$
A more interesting scaling property is the following. Consider the
transformation
$$\rho \mapsto k^2 \rho\,, \qquad w_X(\vec\theta) \mapsto k^2\, w_X(k \vec\theta)\,, \qquad (76)$$
where both factors $k^2$ must be changed according to the dimension of
the $\vec\theta$ vector space. If we apply this transformation, then
the expression for ${\rm Cov}(\tilde f)$ is transformed according to
$${\rm Cov}\bigl(\tilde f; \vec\theta_A, \vec\theta_B\bigr) \mapsto {\rm Cov}\bigl(\tilde f; k \vec\theta_A, k \vec\theta_B\bigr)\,. \qquad (77)$$
This invariance suggests that the shape of ${\rm Cov}(\tilde f)$ is
controlled by the expected number of objects for which the two weight
functions are significantly different from zero. Hence, similarly to
what was done in Paper I, we define the two weight areas
${\cal A}_A$ and ${\cal A}_B$ as
$${\cal A}_X \equiv \frac{\Bigl[\int w_X(\vec\theta)\, {\rm d}^2\theta\Bigr]^2}{\int \bigl[w_X(\vec\theta)\bigr]^2\, {\rm d}^2\theta}\,. \qquad (78)$$
For weight functions invariant upon translation we have
${\cal A}_A = {\cal A}_B \equiv {\cal A}$.
We call ${\cal N} \equiv \rho {\cal A}$ the weight number of objects (again,
${\cal N}_A = {\cal N}_B$ because of the invariance upon translation). Note that
this quantity is left unchanged by the scaling (76). Similar
definitions hold for the effective weight $w_{\rm eff}$
and the effective number of objects ${\cal N}_{\rm eff}$.
6.3 Behavior of C
In order to better understand the properties of C, it is useful to
briefly consider its behavior as a function of the weights w_A and w_B.
We observe that, since Y(s_A, s_B) > 0 for every (s_A, s_B) (see
Eq. (46)), C(w_A, w_B) decreases if either w_A or w_B
increases. In order to study the behavior of the quantity
w_A w_B C(w_A, w_B) that enters the noise term T_1, we consider the
quantity w_A C(w_A, w_B):
$$w_A\, C(w_A, w_B) = \int_0^\infty {\rm d}s_B\, {\rm e}^{-s_B w_B} \biggl[Y(0, s_B) + \int_0^\infty {\rm d}s_A\, {\rm e}^{-s_A w_A}\, \frac{\partial Y(s_A, s_B)}{\partial s_A}\biggr]\,. \qquad (79)$$
This equation can be shown by integrating by parts the integral over
s_A. The partial derivative required in Eq. (79) can be
evaluated from Eq. (46):
$$\frac{\partial Y(s_A, s_B)}{\partial s_A} = -\rho\, Y(s_A, s_B) \int w_A(\vec\theta)\, {\rm e}^{-s_A w_A(\vec\theta) - s_B w_B(\vec\theta)}\, {\rm d}^2\theta\,. \qquad (80)$$
Since this derivative is negative, we can deduce that the integral
over s_{A} in Eq. (79) increases with w_{A}, and thus
w_{A}
C(w_{A}, w_{B}) also increases as w_{A} increases. Similarly, it can be
shown that
w_{B} C(w_{A}, w_{B}) increases as w_{B} increases. In
summary, the quantity
w_{A} w_{B} C(w_{A}, w_{B}) behaves as w_{A} w_{B}, in
the sense that its partial derivatives have the same sign as the
partial derivatives of w_{A} w_{B} (see Fig. 2). Also,
since
C(w_{A}, w_{B}) decreases if either w_{A} or w_{B} increase, we
can deduce that
w_{A} w_{B} C(w_{A}, w_{B}) is "broader'' than w_{A} w_{B}.
Since
is positive,
the function
shares the same support as
.
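The interplay between the quantities discussed here can be illustrated by direct simulation. The sketch below is illustrative only: the field f(x) = sin x, the Gaussian weight, and all parameters are assumptions of this example, not taken from the text. It builds the smoothed map of Eq. (1) at two points x_A and x_B over many Poisson realizations and estimates their covariance:

```python
import numpy as np

rng = np.random.default_rng(0)
rho, L, sigma = 5.0, 20.0, 0.3          # object density, field length, noise std
f = lambda x: np.sin(x)                  # illustrative underlying field
w = lambda x: np.exp(-x**2 / 2.0)        # Gaussian weight of unit variance
xA, xB = 9.0, 10.0                       # the two evaluation points

def smooth_at(x, pts, vals):
    # Estimator (1): weighted average of the measured values
    ww = w(x - pts)
    return np.sum(ww * vals) / np.sum(ww)

samples = []
for _ in range(4000):
    pts = rng.uniform(0.0, L, rng.poisson(rho * L))
    vals = f(pts) + sigma * rng.normal(size=pts.size)
    samples.append((smooth_at(xA, pts, vals), smooth_at(xB, pts, vals)))
cov = np.cov(np.array(samples).T)
print(cov)   # diagonal: variances of the smoothed map; off-diagonal: covariance
```

The diagonal entries estimate the variance of the smoothed map at each point; the off-diagonal entry is the two-point covariance whose structure is analyzed in this section.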
It is also interesting to study the limits of
w_{A}
w_{B} C(w_{A}, w_{B}) at high and low values for w_{A} and w_{B}. From the
properties of the Laplace transform (see Eq. (D.10)), we have

(81) 
where Eq. (61) has been used in the second equality. Hence,
the quantity
w_{A} w_{B} C(w_{A}, w_{B}) goes to zero only if
P_{AB} = 0.
In other cases, we expect a discontinuity at
w_{A} = w_{B} = 0.
Similarly, using Eqs. (61) and (62) we find
Since
w_{A} w_{B} C(w_{A}, w_{B}) increases with both w_{A} and w_{B}, the
last equation above places an upper limit on this quantity:

(85) 
6.4 Limit of large separations
Suppose that the two points
and
are far
away from each other, so that
is
very close to zero everywhere. In this situation we can greatly
simplify our equations.
If
is far away from
,
then
and
are never significantly
different from zero at the same position
.
In this case,
the integral in the definition of
Q(s_{A}, s_{B}) (see
Eq. (45)) can be split into two integrals that correspond to
Q_{A} and Q_{B} (Eq. (53)):



(86) 
Hence, if the two weight functions w_{A} and w_{B} do not have
significant overlap, the function
C(w_{A}, w_{B}) reduces to the product
of the two correcting functions C_{A} and C_{B}.
In general, it can be shown that
.
In fact, we have

(87) 
We now observe that

(88) 
Hence,
and the difference
between the two terms of this inequality is an indication of overlap
between the two weight functions w_{A} and w_{B}. Since the
exponential function is monotonic, we find
and thus

(89) 
6.5 Upper and lower limits for
The normalization property shown in Sect. 6.1 can
also be used to obtain an upper limit for .
We observe, in
fact, that
is indistinguishable from
for a constant function
.
This
case, however, has already been considered above in
Sect. 6.1: There we have shown that
.
Since
,
we find the
relation
.
The property just obtained has a simple interpretation. As shown by
Eq. (60),
is proportional to
and thus we
would expect that this quantity is unbounded from above. In reality,
even when we are dealing with a very small density of objects, the
estimator (1) "forces'' us to use at least one object. This
point has already been discussed in Paper I, where we showed that the
number of effective objects,
,
is always
larger than unity. The upper limit found for
can be
interpreted using the same argument. Note that this result also holds
for weight functions with finite support.
A lower limit for ,
instead, can be obtained from the
inequality (89):

(90) 
Hence, the error
is larger than a convolution of the two
effective weight functions. In the case of finite-field weight functions,
the limit just obtained must be corrected with a factor .
The
argument to derive Eq. (90) is then slightly more complicated
because of the presence of the P_{X} probabilities. However, using
the relation
,
we can recover Eq. (90)
with the aforementioned corrective factor.
6.6 Limit of low and high densities
In the limit
we can obtain simple expressions for
the noise terms. If
vanishes, we have
Y(s_{A}, s_{B}) = 1 (cf.
Eq. (46)) and thus

(91) 
In this equation we have assumed
w_{A} w_{B} > 0. Note that we have
reached here the upper limit indicated by Eq. (85). In
the same limit,
,
and
,
where
is the area of the intersection of the
supports of w_{A} and w_{B}. Hence we find

(92) 
Analogously, in the same limit, we have found in Paper I

(93) 
where w_{X} > 0 has been assumed. We can then proceed to evaluate the
various terms. For
we obtain the expression

(94) 
Note that the integral has been evaluated only on the subset of the
plane where
w_{A} w_{B} > 0; the case where this product vanishes, in
fact, need not be considered because the quantity
w_{A} w_{B} C(w_{A},
w_{B}) vanishes as well. Exactly the same result holds for weight
functions with infinite support. Hence, when
we
reach the upper limit discussed in
Sect. 6.5 for .
Equation (94) can be better appreciated with the following
argument. As the density
approaches zero, the probability of
having two objects on
vanishes. Because of the
prescription regarding vanishing weights (cf. beginning of
Sect. 4), the ensemble average in our limit
is performed with one and only one object in
.
Since
we have only one object, it is effectively used with unit weight in
the average (17), and thus the measurement noise is just given
by
.
Let us now consider the limit at low densities of the Poisson noise,
which, we recall, has been split into three terms,
,
,
and
(see
Sect. 3.3). Inserting Eq. (92)
into Eq. (24), we find for

(95) 
where
denotes the
simple average of f^{2} on the set
.
Hence,
converges to the average of f^{2} on the intersection
of the supports of w_{A} and w_{B}. Again, we can explain this result
using an argument similar to the one used for Eq. (94).
Regarding
,
we observe that this term is of
first order in
because
C(w_{A}, w_{B}) is of first order (cf.
Eqs. (92) and (52)). We can then safely ignore this
term in our limit
.
Finally, as shown in Paper I,
at low densities the expectation value for
is a simple
average of f on the support of w_{X}, i.e.
.
Hence,
and the Poisson
noise in the limit of small densities is given by

(96) 
In the case of a constant function
,
this expression
vanishes as expected. Surprisingly, in general, we cannot say that
.
Rather, if
,
and if in particular
the two weight functions have different supports, we might have a
negative .
Suppose, for example, that f vanishes on
the intersection of the two supports
,
but is
otherwise positive. In this case, the first term in the r.h.s. of
Eq. (96) vanishes, while the second term contributes with a
negative sign, and thus
.
On the other hand, if
w_{A} = w_{B} then
has to be nonnegative.
We now consider the opposite limiting case, namely high density. In
this case, it is useful to use the moment expansion (68).
Since
and
have an overall factor in their definitions (cf. Eq. (60)), we can simply take the
0th order for
C(w_{A}, w_{B}), thus obtaining



(97) 
For
and
,
instead, we need to use a
first order expansion in
for
C(w_{A}, w_{B}). This can be done
by using the first terms in series (66), and by expanding all
fractions in terms of powers of .
Inserting the result into
the definitions of
and
we obtain
Note that we have dropped, in these equations, terms of order higher
than .
The difference
is

(100) 
Using Eqs. (100) and (97), we can verify that
vanishes if f is constant, as expected:
where the normalization of w has been used. Also, it is apparent
that all noise sources, including Poisson noise, are proportional to
at high densities.
In order to further investigate the properties of Poisson noise at
high densities, we write it in a more compact form. Let us define the
average of a function
weighted with
as

(102) 
Using this definition we can rearrange Eqs. (97) and
(100) in the form

(103) 
This expression suggests that the Poisson noise is actually made of
two different terms,
and
.
The
first term is proportional to the difference between the average of
f^{2} and the squared average of f; both averages are performed using w_{A} w_{B} as the weight.
Hence, this term is controlled by the "internal scatter'' of f on
points where both weight functions are significantly different from
zero; it is always positive. The second term is made of averages of f
using different weight functions. It can be either positive or
negative if
.
Actually, as already seen in the limiting
case
,
the overall Poisson noise does not need to
be positive, and anticorrelation can be present in some cases.
6.7 Limit of high and low frequencies
The strong dependence of the Poisson noise on the function
makes an analytical estimate of this noise
contribution extremely difficult in the general case. However, it is
still possible to study the behavior of
in two important
limiting cases, which we now describe.
Suppose that the function
does not change
significantly on the scale length of the weight functions
and
(or, in other words, that the
power spectrum of f has a peak at significantly lower frequencies
than the power spectra of w_{A} and w_{B}). In this case, we can take
the function f as a constant in the integrals of Eq. (13),
and apply the results of Sect. 6.1. Hence, in the
limit of low frequencies, the Poisson noise vanishes.
Suppose now, instead, that the function
does not have
any general trend on the scale length of the weight
functions, but instead changes at significantly smaller scales
(again, this behavior is better described in terms of power spectra:
We require here that the power spectrum of f has a peak at high
frequencies, while it vanishes for the frequencies where the power
spectra of w_{A} and w_{B} are significantly different from zero).
In this case, we can assume that integrals such as



(104) 
vanish approximately, because the average of f on large scales
vanishes (remember that we are assuming that f has no general trend
on large scales). Similarly, the integrals that appear in
and
vanish as well. In this case, then, the only
contribution to the Poisson noise arises from
.
This can easily be
evaluated:

(105) 
where we have denoted with
the
average of f^{2} on large scales. Hence we finally obtain

(106) 
The results discussed in this section can also be numerically verified
in simple cases. Figure 8, for example, shows the Poisson
noise expected in the measurement of a periodic field when using
two Gaussian weight functions (see Sect. 7.2 for details).
From this figure, we see that the Poisson noise increases with the
frequency of the field f, and quickly attains a maximum value at
high frequencies. Moreover, the same figure shows that, in agreement
with Eq. (106), the Poisson noise at the maximum is simply
related to the measurement noise
(cf. Fig. 7 for
).
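The frequency dependence described in this section is easy to reproduce by simulation. In the following sketch the density, field length, evaluation point, and Gaussian weight are illustrative choices; the Poisson noise is estimated as the variance, over Poisson realizations with no measurement noise, of the smoothed map of f(x) = sin(kx):

```python
import numpy as np

rng = np.random.default_rng(2)
rho, L = 2.0, 40.0                       # illustrative density and field length
w = lambda x: np.exp(-x**2 / 2.0)        # Gaussian weight of unit variance
x0 = L / 2.0                             # evaluation point

def poisson_noise(k, trials=3000):
    """Variance of the smoothed map of f(x) = sin(kx), no measurement noise."""
    out = np.empty(trials)
    for i in range(trials):
        pts = rng.uniform(0.0, L, rng.poisson(rho * L))
        ww = w(x0 - pts)
        out[i] = np.sum(ww * np.sin(k * pts)) / np.sum(ww)
    return out.var()

low, high = poisson_noise(0.1), poisson_noise(5.0)
print(low, high)   # the Poisson noise grows with the frequency k
```

At k = 0.1 the field is nearly constant over the weight scale and the Poisson noise is tiny, while at k = 5 it has saturated near the high-frequency limit, in line with the behavior described above.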
7 Examples
Similarly to what has been done in Paper I, in this section we
consider three typical weight functions, namely a tophat, a Gaussian,
and a parabolic weight. For simplicity, we will consider
one-dimensional cases only; this also simplifies the graphical
representation of the results. Hence, we will use
x instead of
as the spatial variable.
7.1 Top-hat

Figure 3:
The value of
for top-hat weights as a
function of the density .
Both weight functions w_{A} and
w_{B} are top-hats (see Eq. (107)) centered on zero.
With Eq. (108), we can use this graph to obtain
as a function of the density and the point separation
x_{A} - x_{B}.

Figure 4:
The noise term
for two top-hat weights as a
function of the point separation
for
two densities,
and .
The plot also shows
the quantity
,
which at high densities
approximates
(since then
). Note that
S_{11} for a top-hat function is just given by
. 
The simplest weight that we can consider is a top-hat function,
defined as

(107) 
Since w is either 1 or 0, we just need to consider C(1,1) to
evaluate .
Regarding the Poisson noise, from
Eq. (52) we deduce that C(1,2), C(2,1), and C(2,2) are
also required.
Figure 3 shows C(1,1) and
as functions of
the density
for two identical tophat weight functions centered
on the origin. From this plot we can recognize some of the limiting
cases studied above. In particular, the fact that
goes
to unity at low densities is related to Eq. (92); similarly,
the limit of C(1,1) at high densities is consistent with
Eq. (68). The same figure also shows the moments expansion
of C(1,1) up to fourth order. As expected, the expansion completely
fails at low densities, while it is quite accurate for .
Curves in Fig. 3 have been calculated using the standard
approach described by Eqs. (45), (46) and
(63). Actually, in the simple case of top-hat weight
functions, we can evaluate C(1,1) using a more direct statistical
argument. We start by observing that in our case, for x_{A} = x_{B}, we have

(108) 
On the other hand, a top-hat weight function basically acts by
taking a simple average of all objects that fall inside its support.
This suggests that, for x_{A} = x_{B}, we can evaluate its measurement
noise as

(109) 
where p(N) is the probability of having N objects inside the
support. This probability is basically a Poisson probability
distribution with average .
However, since we are adopting the
prescription of "avoiding'' weight functions without objects in their
support, we must explicitly discard the case N = 0 and consequently
renormalize the probability. In summary, we have

(110) 
This expression combined with Eq. (109) allows us to evaluate
:

(111) 
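The statistical argument above is straightforward to check numerically. The sketch below assumes a unit measurement variance and an illustrative mean number λ of objects in the top-hat support; it evaluates the sum Σ_N p(N)/N of Eq. (109), with p(N) the renormalized Poisson probability of Eq. (110), and compares it with a direct Monte Carlo average:

```python
import math

import numpy as np

def p_trunc(n, lam):
    # Eq. (110): Poisson probability renormalized to exclude the case N = 0
    log_p = -lam + n * math.log(lam) - math.lgamma(n + 1)
    return math.exp(log_p) / (1.0 - math.exp(-lam))

def noise_factor(lam, nmax=200):
    # Sum in Eq. (109) for unit measurement variance: Sum_{N>=1} p(N) / N
    return sum(p_trunc(n, lam) / n for n in range(1, nmax + 1))

lam = 4.0                                # expected number of objects in the support
analytic = noise_factor(lam)

# Monte Carlo check: variance of the simple average of N unit-variance
# measurements, with N Poisson-distributed but conditioned on N >= 1
rng = np.random.default_rng(1)
ns = rng.poisson(lam, size=60000)
ns = ns[ns >= 1][:30000]
mc = np.mean([rng.normal(size=n).mean() ** 2 for n in ns])
print(analytic, mc)
```

Note that noise_factor(λ) tends to unity as λ → 0, consistent with the low-density limit discussed in Sect. 6.6.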
We can directly verify this result using Eqs. (45),
(46) and (63). In fact, for the tophat function we
find
Q(s_{A}, s_{B}) 
= 

(112) 
Y(s_{A}, s_{B}) 
= 

(113) 
C(1, 1) 
= 



= 

(114) 
Finally, with a change of the dummy variable
we
recover Eq. (111).

Figure 5:
Numerical calculations for one-dimensional
Gaussian weight functions w_{A} = w_{B} centered on 0 and with
unit variance. The various curves show the function
w_{A} w_{B}
C(w_{A}, w_{B}) for different densities .
Note that, as
expected,
C(w_{A}, w_{B}) approaches unity for large densities.

Figure 6:
Same as Fig. 5, but for two Gaussian weight
functions centered on 0 and 1 and with unit variance. 

Figure 7:
The noise term
for two Gaussian weights (of
unit variance) as a function of their separation. Similarly to
Fig. 4, the plot also shows the highdensity
approximations
.
Note that in this case S_{11} is also a Gaussian (with double variance). 

Figure 8:
The Poisson noise T_{P} for two Gaussian weights (of unit
variance) for a periodic function of the form
as a function of the weight separation
,
for a density .
Note that, as expected, the Poisson
noise increases with k and approaches the limit discussed in
Sect. 6.7 for high frequencies. More
precisely, since for a sine function we have
,
Eq. (106) gives
(this can indeed be verified by a comparison with
Fig. 7). Note also that, while T_{P} is strictly
positive for
,
it can become negative (see curve for
k = 0.5) at larger separations. 
The other terms needed for the Poisson noise can be evaluated using a
calculation similar to the one performed in Eq. (114).
Actually, it can be shown that for any positive integer values of w_{A} and
w_{B} we have

(115) 
Figure 4 shows the expected measurement noise
as
a function of the point separation
.
Note that,
for densities of order
or larger, a good approximation is
obtained by just taking
C(w_{A}, w_{B}) = 1 (cf. the moments expansion
(68)), so that
;
we also
observe that, for a top-hat weight function, S_{11} is a linear
function.
7.2 Gaussian
Frequently, a Gaussian weight function of the form

(116) 
is used. Although it is not possible to obtain
C(w_{A}, w_{B}) analytically, numerical integration does not
pose any problem. Figure 5 shows, for different densities,
the function
w_{A} w_{B} C(w_{A}, w_{B}) for two identical weights w_{A} =
w_{B} centered on zero; Fig. 6 shows the same quantity when
one of the weight functions is centered at unity. Note that, in this
last figure, the largest covariance is at x = 0.5, as expected.
Figure 7 shows the expected measurement noise
as
a function of the weight separation. Similarly to the top-hat weight,
an approximation valid for high density is
.
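The statement about S_{11} is easy to verify numerically. The sketch below assumes S_{11}(d) = ∫ w(x) w(x - d) dx for two normalized, unit-variance Gaussian weights at separation d (cf. Eq. (67)) and compares the numerical integral with a Gaussian of variance 2 in d:

```python
import numpy as np

# S_11(d) = integral of w(x) w(x - d) over x, for two normalized Gaussian
# weights of unit variance; analytically this is a Gaussian of variance 2 in d.
x = np.linspace(-20.0, 20.0, 40001)
dx = x[1] - x[0]
g = lambda u: np.exp(-u**2 / 2.0) / np.sqrt(2.0 * np.pi)

for d in (0.0, 1.0, 2.0):
    numeric = np.sum(g(x) * g(x - d)) * dx
    analytic = np.exp(-d**2 / 4.0) / np.sqrt(4.0 * np.pi)
    print(d, numeric, analytic)
```

The two columns agree to the accuracy of the numerical quadrature, confirming the doubling of the variance quoted in the caption of Fig. 7.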
Figure 8 shows, instead, the Poisson noise
expected
for a field f of the form
,
for different values of
k. Note that the noise, as expected, increases with k, and
quickly reaches the "saturation'' value discussed in
Sect. 6.7. Note also that the noise is, at the lowest
density, negative for
.
7.3 Parabolic weight

Figure 9:
Numerical calculations for one-dimensional
parabolic weight functions w_{A} = w_{B} centered on 0 and with
unit variance. The various curves show the function
w_{A} w_{B}
C(w_{A}, w_{B}) for different densities . 

Figure 10:
The noise term
for two parabolic weights as a
function of their separation (see also Figs. 4 and
7). 
Finally, we study a parabolic weight function of the form

(117) 
This function illustrates well some of the peculiarities of
finite-support weight functions. Figure 9 shows the results of
numerical integrations for
w_{A} w_{B} C(w_{A}, w_{B}) at different
densities .
A first interesting point to note is the
discontinuity observed at x = 1, which is in agreement with
Eq. (81). Moreover, as expected from Eq. (92), the
function plotted clearly approaches a constant at low densities
.
Finally, the measurement noise
is plotted in
Fig. 10.
8 Conclusions
In this article we have studied in detail the covariance of a widely
used smoothing technique. The main results obtained are summarized in
the following items.
 1.
 The covariance is composed of two main terms,
and
,
representing measurement errors and Poisson noise,
respectively; the latter one depends on the field f on which the
smoothing is performed.
 2.
 Expressions to compute
and
have been
provided. In particular, it has been shown that both terms can be
obtained in terms of a kernel
C(w_{A}, w_{B}), which in turn can be
evaluated from the weight function
.
 3.
 We have obtained an expansion of the kernel
C(w_{A}, w_{B}) valid
at high densities .
 4.
 We have shown that
has an upper limit, given by
,
and a lower limit, provided by Eq. (90).
 5.
 We have evaluated the form of the noise contributions in the
limiting cases of high and low densities.
 6.
 We have considered three typical cases of weight functions and
we have evaluated
C(w_{A}, w_{B}) for them.
Finally, we note that although the smoothing technique considered in
this paper is by far the most widely used in Astronomy, alternative
methods are available. A statistical characterization of these
methods, using a completely different approach, will be presented in a
future paper (Lombardi & Schneider, in preparation).
Acknowledgements
This work was partially supported by a grant from
the Deutsche Forschungsgemeinschaft, and the TMR Network
"Gravitational Lensing: New constraints on Cosmology and the
Distribution of Dark Matter.''
Appendix A: Vanishing weights
In Sect. 3.2 we have obtained the solution
of the covariance problem under the hypothesis that the weight
function
is strictly positive. In this appendix we
will generalize the results obtained there to non-negative weight
functions (see also Sect. 4).
If w_{A} is allowed to vanish, then we might have a finite probability
that y_{A} vanishes, i.e. a finite probability that no point
is inside the support of w_{A}. A finite probability for a single
value appears in the probability distribution function as a Dirac delta
distribution. Since this point is quite important for our discussion,
let us consider a simple example. Suppose that is a real random
is a real random
variable with the following characteristics:

has probability 1/3 to vanish.

has probability 2/3 to be in the range
;
in
this range
has an exponential distribution.
Then we can write the probability distribution function for as

(A.1) 
where
is the Heaviside function (see Eq. (40)).
In other words, the probability distribution for
includes the
contribution from a Dirac delta distribution centered on .
If
is known, the probability that
is exactly zero (1/3 in this example) can be obtained using

(A.2) 
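The example above can be reproduced with a short simulation: we draw samples of a variable that vanishes with probability 1/3 and otherwise follows a unit exponential distribution, and recover the weight of the delta distribution from the empirical Laplace transform at large s, as prescribed by Eq. (A.2) (sample size and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100_000
is_zero = rng.random(n) < 1.0 / 3.0          # delta component: P(y = 0) = 1/3
y = np.where(is_zero, 0.0, rng.exponential(1.0, n))

# Empirical Laplace transform E[exp(-s y)]: as s grows it tends to P(y = 0)
laplace = {s: np.mean(np.exp(-s * y)) for s in (1.0, 10.0, 100.0)}
print(laplace)
```

At s = 100 the empirical transform is already close to 1/3, the weight of the delta distribution.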
Let us now turn to our problem. As mentioned above, for vanishing
weights we expect that y_{A} might vanish, i.e. its probability might
include the contribution from a delta distribution centered on y_{A} =
0; similarly, if w_{B} is allowed to vanish, the probability
distribution for y_{B} might include a delta centered on y_{B} = 0.
For a given y_{B}, the probability P_{A}(y_{B}) that y_{A} vanishes is
given by

(A.3) 
where the properties of the Laplace transform have been used in the last
equality (see Appendix D). A similar
equation holds for the probability that y_{B} vanishes, P_{B}(y_{A}).
Note that the Laplace transform in Eq. (A.3) is performed only
with respect to the first variable. The joint probability P_{AB} that both y_{A} and y_{B} vanish is (cf. Eq. (61))

(A.4) 
We then also define (cf. Eqs. (62))
Using Eq. (45), we find
,
,
and
,
where
is the area of the support of w_{A},
is the area of
the support of w_{B}, and
is the area of the union of
the two supports. This result is of course not surprising and has
already been derived in the paragraph before Eq. (61) using a
different approach.
For vanishing weights, we decided to use the following prescription:
We discard, in the ensemble average for
,
the configurations
for which the
function
is not defined either at
or at
.
In order to implement this prescription, we can
explicitly modify the probability distribution p_{y} and exclude "by
hand'' cases where the denominator of Eq. (19) vanishes; for
this purpose, we consider separately the cases where w_{A} or w_{B} vanish.
We define a new probability distribution for
(y_{A}, y_{B}) which
accounts for vanishing weights:

(A.7) 
We recall that
.
In constructing
this probability, first we have explicitly removed the degenerate
situations, then we have renormalized the resulting probability. Note
that the normalization factor in the last case, namely
1 - P_{A} - P_{B} + P_{AB}, comes from the so-called "inclusion-exclusion principle''
(1 - P_{A} - P_{B} + P_{AB} is the probability that both
f_{A} and f_{B} are defined). Using this new probability
distribution in the definition (32) for Y we obtain

(A.8) 
Finally, we need to change the normalization factor in
Eq. (47) in order to account for cases where y_{A} or y_{B} are vanishing. Indeed, the correcting factor
C(w_{A}, w_{B}) has been
obtained by assuming that all objects can populate all the plane with
uniform probability distribution (cf. Eq. (25)); now,
however, a fraction
(P_{A} + P_{B} - P_{AB}) of configurations has been
excluded. Hence we have

(A.9) 
This completes the discussion of vanishing weights.
Appendix B: Moments expansion
In Sect. 5 we have written the moments
expansion for
C(w_{A}, w_{B}). Here we complete the discussion by
providing a proof for that result.
At high densities, y_{A} and y_{B} are basically Gaussian random
variables with average values
and
(we anticipate
here that these averages are given by the density ). Hence, we
can expand them in the definition of
C(w_{A}, w_{B}):
C(w_{A}, w_{B}) 
= 



= 



= 

(B.1) 
where M_{ij} are the "centered'' moments of p_{y}:

(B.2) 
The centered moments can be expressed in terms of the "uncentered''
ones, defined as

(B.3) 
Here
Y^{(i,j)}(0, 0) is the ith partial derivative with respect to s_{A} and
the jth partial derivative with respect to s_{B} of
Y(s_{A}, s_{B}), evaluated at (0,
0). These, in turn, can be expressed as derivatives of Q. For the
first terms we have
Y^{(0,0)}(0,0) 
= 

(B.4) 
Y^{(2,0)}(0,0) 
= 

(B.5) 
Y^{(1,1)}(0,0) 
= 

(B.6) 
Y^{(0,2)}(0,0) 
= 

(B.7) 
Finally, the derivatives of Q can be evaluated as
Q^{(i,j)}(0, 0) = (-1)^{i+j} S_{ij} ,

(B.8) 
where S_{ij}, we recall, is given by Eq. (67). Note that
S_{01} = S_{10} = 1 because of the normalization of w_{A} and w_{B},
and thus, as already anticipated,
.
In
summary, we find
M_{00} 
= 

(B.9) 
M_{11} 
= 

(B.10) 
We stress that, in general, it is not true that
(more complex expressions are encountered for higher order terms; cf.
the last term in Eq. (66)). Finally, we can write the
expansion of
C(w_{A}, w_{B}):

(B.11) 
This is precisely Eq. (68). Using the same technique and a
little more perseverance, we can also obtain higher order terms. In
particular, Table B.1 reports the moments M_{ij} defined in
Eq. (B.2) up to the fourth order. This table, together with
Eq. (B.1), can be used to write an accurate moment expansion
of
C(w_{A}, w_{B}).
Appendix C: Varying weights
In Paper I we have considered a modified version of the estimator
(1) which allows for the use of supplementary weights.
Suppose that we measure a given field
at some
positions
on the sky. Suppose also that we use a
weight u_{n} for each object observed, so that we replace
Eq. (1) with

(C.1) 
For example, if we have at our disposal some error estimate for each object, we might use the weighting scheme
in order to minimize the noise of the estimator
(C.1).
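A minimal numerical sketch of the weighted estimator (C.1) with the inverse-variance scheme u_n = 1/σ_n² mentioned above (the field f(x) = sin x, the error distribution, and all parameters are illustrative assumptions of this example):

```python
import numpy as np

rng = np.random.default_rng(4)
pts = rng.uniform(0.0, 10.0, 200)            # object positions (illustrative)
sig = rng.uniform(0.1, 1.0, 200)             # per-object error estimates
vals = np.sin(pts) + sig * rng.normal(size=200)   # noisy measurements of f
u = 1.0 / sig**2                             # inverse-variance supplementary weights
w = lambda x: np.exp(-x**2 / 2.0)            # Gaussian spatial weight

def smooth(x):
    # Estimator (C.1): spatial weight multiplied by the supplementary weight
    ww = u * w(x - pts)
    return np.sum(ww * vals) / np.sum(ww)

print(smooth(5.0))   # smoothed estimate of the field near x = 5
```

Down-weighting noisy objects in this way reduces the contribution of the measurement-noise term at the cost of a smaller effective number of objects.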
A statistical study of the expectation value of this estimator has
already been carried out in Paper I. Here we proceed further and
study its covariance under the same assumptions as the ones used for
the study of Eq. (1) in the main text. However, since one of
the main reasons to use weights is some knowledge of the variance of
each object, we use a generalized form of Eq. (7):

(C.2) 
Note, in particular, that the variance is assumed to depend on u_{n} (or, equivalently, the weight is assumed to depend on the variance).
Similarly to Paper I, we also assume that, for each object n, the
weight u_{n} is independent of the position
and of the
measured signal ,
and that each u_{n} follows a known
probability distribution p_{u}.
In Paper I we have shown that the average value of
can be calculated using the equations
R_{X}(s_{X}) 


(C.3) 
Y_{X}(s_{X}) 


(C.4) 
B_{X}(v_{X}) 


(C.5) 

= 

(C.6) 
We now evaluate the covariance of the estimator (C.1) using a
technique similar to the one used in Sect. 3. We have

(C.7) 
As usual we consider separately the cases n = m and n ≠ m,
thus
obtaining the two terms T_{1} and T_{2}:
T_{1} 
= 

(C.8) 
T_{2} 
= 

(C.9) 
Let us introduce new variables
(with
)
for the combination of weights, and let us define
similarly to Eq. (26)
.
Then
the probability distributions for v_{Xn} and y_{X} can be evaluated
using the set of equations
p_{v}(v_{A}, v_{B}) 
= 

(C.10) 
p_{y}(y_{A}, y_{B}) 
= 





(C.11) 
Again, it is convenient to consider the Laplace transforms of these
two probability distributions:
V(s_{A}, s_{B}) 
= 

(C.12) 
Y(s_{A}, s_{B}) 
= 

(C.13) 
In the continuous limit we define instead
R(s_{A}, s_{B}) 


(C.14) 
Y(s_{A}, s_{B}) 
= 

(C.15) 
Finally, the equivalent of the correcting factor
C(w_{A}, w_{B}) (cf. Eq. (11)) is, in our case, the quantity

(C.16) 
The quantity B can be used to evaluate T_{1}: In fact, we
have
T_{1} 
= 



= 

(C.17) 
Similarly, for T_{2} we obtain
T_{2} 
= 





(C.18) 
The final evaluation of
then proceeds similarly to
what was done in the main text for the estimator (1).
Appendix D: Properties of the Laplace transform
For the convenience of the reader, we summarize in this appendix some
useful properties of the Laplace transform. Proofs of the results
stated here can be found in any advanced analysis book
(e.g. Arfken 1985). Although in this paper we have been dealing
mainly with Laplace transforms of two-argument functions, we write the
properties below for the case of a function of a single argument for
two main reasons: (i) The generalization to functions of several
arguments is in most cases trivial; (ii) Several properties can be
better understood in the simpler case considered here.
Suppose that a function f(x) of a real argument x is given. Its
Laplace transform is defined as

(D.1) 
Note that we use 0^{-} as the lower integration limit in this definition.
The Laplace transform is a linear operator; hence, if
and
are two real numbers and g(x) is a function of
real argument x, we have
.
The Laplace transform of the derivative of f can be expressed in
terms of the Laplace transform of f. In particular, we have

(D.2) 
This equation can be generalized to higher order derivatives. Calling
f^{(n)} the nth derivative of f, we have

(D.3) 
Surprisingly, this equation also holds for negative n if we consider
f^{(n)} to be the (-n)-fold iterated integral of f; note that in this case
the summation disappears. Hence, for example, we have

(D.4) 
Often, properties of the Laplace transform come in pairs: For every
property there is a similar one where the roles of f and
are
swapped. Here is the "dual'' of property (D.2):

(D.5) 
or, more generally,

(D.6) 
A similar equation holds for "negative'' derivatives, i.e. integrals
of the Laplace transform. In this case, however, it is convenient to
change the integration limits to
.
In summary, we can
write

(D.7) 
Given a positive number a, the Laplace transform of the function f shifted by a is given by

(D.8) 
where
is the Heaviside function defined in
Eq. (40). A dual of this property can also be written:

(D.9) 
Finally, we consider two useful relationships between limiting values
of f and
:



(D.10) 
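These relations can be checked symbolically. The sketch below (the test function e^{-2t} sin 3t is an arbitrary choice) verifies the derivative rule (D.2) and the two limiting relations (D.10) with SymPy:

```python
import sympy as sp

t, s = sp.symbols('t s', positive=True)
f = sp.exp(-2 * t) * sp.sin(3 * t)           # arbitrary test function

F = sp.laplace_transform(f, t, s, noconds=True)

# Derivative rule (D.2): L{f'}(s) = s F(s) - f(0)
lhs = sp.laplace_transform(sp.diff(f, t), t, s, noconds=True)
assert sp.simplify(lhs - (s * F - f.subs(t, 0))) == 0

# Limiting relations (D.10): s F(s) -> f(0+) for s -> oo,
# and s F(s) -> lim_{t -> oo} f(t) for s -> 0 (both limits vanish here)
assert sp.limit(s * F, s, sp.oo) == f.subs(t, 0)
assert sp.limit(s * F, s, 0) == 0
print(F)
```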

Arfken, G. 1985, Mathematical Methods for Physicists (Orlando, Florida: Academic Press)
Chandrasekhar, S. 1989, Stochastic, Statistical, and Hydromagnetic Problems in Physics and Astronomy (Chicago: University of Chicago Press)
Deguchi, S., & Watson, W. D. 1987, Phys. Rev. Lett., 59, 2814
Eadie, W., Drijard, D., James, F., Roos, M., & Sadoulet, B. 1971, Statistical Methods in Experimental Physics (Amsterdam, New York, Oxford: North-Holland Publishing Company)
Lombardi, M., & Schneider, P. 2001, A&A, 373, 359
Lombardi, M., Schneider, P., & Morales-Merlino, C. 2001, A&A, 382, 769
Copyright ESO 2002