A&A 373, 746-756 (2001)
DOI: 10.1051/0004-6361:20010611
R. Schieder - C. Kramer
I. Physikalisches Institut, Universität zu Köln, Zülpicher Straße 77, 50937 Köln, Germany
Received 11 January 2001 / Accepted 26 April 2001
Abstract
Stability tests based on the Allan variance method have become a standard procedure for evaluating the quality of radio-astronomical instrumentation. They are very simple and simulate the situation of detecting weak signals buried in large noise fluctuations. For the special conditions during observations an outline of the basic properties of the Allan variance is given, and some guidelines on how to interpret the results of the measurements are presented. Based on a rather simple mathematical treatment, clear rules for observations in "Position-Switch'', "Beam-'' or "Frequency-Switch'', "On-The-Fly-'' and "Raster-Mapping'' mode are derived. A simple "rule of thumb'' for estimating the optimum timing of the observations is also found. The analysis leads to a conclusive strategy for planning radio-astronomical observations. Particularly for air- and space-borne observatories it is very important to determine how the extremely precious observing time can be used with maximum efficiency. The analysis should help to increase the scientific yield in such cases significantly.
Key words: instrumentation: miscellaneous - methods: data analysis, observational - space vehicles: instruments - techniques: spectroscopic - telescopes
Allan variance measurements have been demonstrated to be a useful tool for characterizing the stability of radio-astronomical equipment such as millimeter or submillimeter receivers or large-bandwidth back-ends (Schieder et al. 1985; Kooi et al. 2000). Particularly for the development of acousto-optical spectrometers (AOS) at the Kölner Observatorium für Sub-Millimeter Astronomy (KOSMA) the method has played a very important role, because it provides clear evidence, by means of a reliable laboratory test procedure, that the spectrometers are well suited for use at an observatory (Tolls et al. 1989). The simple definition of the Allan variance makes it very easy to apply such measurements to the characterization of the stability of other instruments as well; a very elementary case is the qualification of a simple lock-in amplifier, for example.
For a real-time spectrometer with many simultaneously operating frequency channels, as used in radio-astronomy, it is a very important condition that all channels behave identically in a statistical sense. The use of the Allan variance for investigating the performance of the spectrometer is therefore based on the assumption that there are no differences between frequency channels. Evidently this is not always correct. Thus it is always necessary to verify the similarity of all frequency channels of the spectrometer, for example by investigating the baseline noise of measured spectra. Typical problem areas are light-scatter problems in acousto-optical spectrometers (AOS), where speckles may affect individual channels more heavily than others. The same is true for filterbanks, which occasionally have some peculiar channels even in a well maintained back-end system. But in all normal cases of well behaved instrumentation, the Allan variance plot is a most useful method to characterize the instrumentation in use precisely.
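The channel-uniformity check described above can be sketched numerically. The following is a minimal illustration, not the authors' procedure: it flags channels whose baseline rms deviates strongly from the ensemble, using a robust median/MAD reference so that a few bad channels do not bias the comparison.

```python
import numpy as np

def flag_outlier_channels(spectra, nsigma=3.0):
    """Flag frequency channels whose baseline noise deviates from the rest.

    spectra: 2-D array (n_spectra, n_channels) of calibrated baseline data.
    Returns a boolean mask, True for suspect channels.
    """
    # Per-channel rms over the ensemble of baseline spectra.
    rms = spectra.std(axis=0)
    # Robust estimate of the typical rms and its scatter (median / MAD),
    # so that a few bad channels do not bias the reference level.
    med = np.median(rms)
    mad = np.median(np.abs(rms - med)) * 1.4826  # MAD -> sigma for Gaussian noise
    return np.abs(rms - med) > nsigma * mad

rng = np.random.default_rng(0)
data = rng.normal(0.0, 1.0, size=(200, 64))   # 200 baselines, 64 channels
data[:, 17] *= 5.0    # one hypothetical "speckle-affected" channel with excess noise
bad = flag_outlier_channels(data)
```

The threshold of 3 sigma is an arbitrary choice; in practice it would be tuned to the spectrometer at hand.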
In general, observations at an observatory are done with the available instrumentation as is, and it cannot be modified or improved by the observer. Instead, the observer has to find the correct observing parameters in order to use the available hardware in the most economical way. It is the purpose of this paper to develop a strategy for optimizing the observing process. For this the knowledge of the stability parameters is decisive. Once this information is available, from an Allan variance measurement for example, it should be a rather straightforward matter to determine the essential parameters such as the length of integration per position on sky. The following mathematical treatment analyses the commonly used observing methods, i.e. "Position-'', "Beam-'' or "Frequency-Switch'', "On-The-Fly'' (OTF) and "Raster-Mapping'' measurements, based on the information contained in the Allan variance plot. As a result, practical guidelines for the most efficient observing method are found, which can be used at any radio observatory. In particular, all space- or air-borne observatories require a most efficient use of the extremely precious observing time, since any loss can usually not be compensated by a simple increase in observatory time. But the results found in the following should be very useful for ground-based observatories as well.
If a test procedure is defined for use at any time and at any
location, it needs to be as simple and unique as possible. Therefore,
we understand the Allan variance as the ordinary statistical variance
of the difference of two contiguous measurements (see also
Rau & Schieder 1984). One has to consider a signal-function s(t), which is
the instantaneous output signal of a spectrometer channel or of a
continuum detector for example. The output is now integrated for a
time interval T representing an estimate of the mean signal which is
stored as spectrometer data in the computer:
$$x = \frac{1}{T}\int_{t_0}^{t_0+T} s(t)\,{\rm d}t \qquad (1)$$

and similarly for a reference signal r(t):

$$y = \frac{1}{T}\int_{t_0}^{t_0+T} r(t)\,{\rm d}t. \qquad (2)$$
In order to obtain a plausible estimate of the error of the difference we use the standard definition of the variance:
$$\sigma^2(x-y) = \big\langle (x-y)^2\big\rangle - \big\langle x-y\big\rangle^2 \qquad (3)$$
If we now apply Eq. (1), we get:
![]() |
(4) | ||
![]() |
If we have the same statistics for both "s'' and "r'', the mean of the difference vanishes, and we finally obtain the Allan variance

$$\sigma_{\rm A}^2(T) = \frac{1}{2}\,\big\langle (x-y)^2\big\rangle.$$
We have not yet made any particular assumption about the source of the signal- and the reference-data. For our application here, the two data "s'' and "r'' are derived from the same output signal s(t) of one spectrometer channel. The two acquisition periods of length T for the integration of the signal and the reference must therefore occur one after the other in order to avoid any undesirable overlap between the two measurements. For an unequivocal definition of the instrumental Allan variance we assume that all "s'' and "r'' measurements are contiguous without any dead time in between. In real life, when observing, there will always be some unavoidable dead time, since the telescope needs to be moved between the On- and the Off-position, or time is needed for data transfer etc. Any delay will increase the impact of slow drift noise and will therefore change the appearance of the system noise. Such effects will be discussed in the next section.
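The contiguous-pair definition can be sketched in a few lines. This is an illustrative implementation, assuming the conventional 1/2 normalization so that pure white noise yields an Allan variance falling as 1/T:

```python
import numpy as np

def allan_variance(x, T, dt=1.0):
    """Allan variance for integration time T from a contiguously sampled
    signal x (sample spacing dt), following the two-contiguous-bins picture:
    sigma_A^2(T) = 0.5 * < (x_bar_{k+1} - x_bar_k)^2 >.
    """
    n = int(round(T / dt))                 # samples per integration bin
    nbins = len(x) // n
    means = x[:nbins * n].reshape(nbins, n).mean(axis=1)
    d = np.diff(means)                     # contiguous bin differences, no dead time
    return 0.5 * np.mean(d ** 2)

rng = np.random.default_rng(1)
white = rng.normal(0.0, 1.0, 1_000_000)    # unit-variance white noise, dt = 1
v1 = allan_variance(white, 10.0)           # expect ~ 1/10 for white noise
v2 = allan_variance(white, 20.0)           # expect ~ 1/20, i.e. half of v1
```

Scanning T over a logarithmic grid and plotting the result gives the usual Allan variance plot.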
For a given integration time the signal output of one spectrometer channel is described by Eq. (1). We can describe the instantaneous noise signal s(t) before integration by the (in this case not normalized) auto-correlation function R(τ) as a function of the delay time τ:

$$R(\tau) = \big\langle s(t)\,s(t+\tau)\big\rangle. \qquad (5)$$
![]() |
(6) |
![]() |
(7) |
![]() |
(8) | ||
![]() |
![]() |
(9) |
Figure 1: Artificial data set generated by random numbers (left) with white noise of Gaussian distribution (top), drift noise (middle), and combined noise (bottom). Each data point corresponds to a sample integrated for 1 s while the fluctuation bandwidth was set to 600 kHz. The drift noise is calculated by filtering white noise with a sufficiently broad boxcar time-filter.
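A data set like that of Fig. 1 can be generated along the same lines. The filter width and drift amplitude below are arbitrary choices for illustration, not the values used for the figure:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 4000

# White (radiometric) noise: independent Gaussian samples, one per 1-s dump.
white = rng.normal(0.0, 1.0, n)

# Drift noise: white noise smoothed with a broad boxcar time-filter, which
# suppresses the fast fluctuations and leaves only slow wander.
width = 300                                  # boxcar width in samples (assumed)
kernel = np.ones(width) / width
drift = np.convolve(rng.normal(0.0, 1.0, n + width), kernel, mode="valid")[:n]
drift *= 3.0                                 # scale so the drift is clearly visible

# Combined noise, as in the bottom panel of Fig. 1.
combined = white + drift
```

The smoothed series is strongly correlated from sample to sample, whereas the white series is not, which is exactly the distinction the Allan variance plot makes quantitative.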
In this approximation we have now for the Allan variance according to
Eqs. (6), (7), and (9):
![]() |
(10) |
If we assume a simple power law for the drift contribution with a well defined exponent α, and if we consider the additional presence of radiometric noise, or "white noise'', we expect the Allan variance to have the following structure as a function of integration time:

$$\sigma_{\rm A}^2(T) = \frac{a}{T} + b\,T^{\alpha}.$$
Within the white noise part of the Allan plot, i.e. the regime with
the slope of "-1'', the radiometer equation must be valid:
$$\sigma_{\rm A}^2(T) = \frac{\langle s\rangle^2}{B_{\rm F}\,T} \qquad (11)$$
![]() |
(12) |
In most practical cases it is very useful to refer to the particular integration time in the Allan variance plot where the minimum occurs. This minimum marks the turn-over point where the radiometric noise, with a slope of -1 in the logarithmic plot, becomes dominated by the additional and undesired drift noise (see Fig. 1). Above the minimum time the rms of the measurements becomes much larger than anticipated by the radiometer equation alone. Intuitively, the minimum time might appear as an upper limit for the integration on individual positions during radio-astronomical observations, but the Allan variance plot offers much more detailed advice when planning the most efficient observing strategy under the given circumstances. Since any additional noise above the radiometric level is very unfavorable, one has to find the optimum integration time, where the loss due to inevitable dead time during slew of the telescope etc. is as small as possible, and where the impact of drift contributions is nearly negligible at the same time. Finding this best compromise is the goal of the following sections.
By use of the minimum time T_min of the variance we can now rewrite the above equation as:

$$\sigma_{\rm A}^2(T) = \frac{\langle s\rangle^2}{B_{\rm F}\,T}\left[1 + \frac{1}{\alpha}\left(\frac{T}{T_{\rm min}}\right)^{\alpha+1}\right]. \qquad (13)$$
The slope of the drift part in the Allan variance plot is, as seen in Fig. 1, also one of the important parameters for the characterization of the instrument. We can therefore conclude that the minimum time, the fluctuation bandwidth, and the slope at large integration time are the three parameters which fully characterize the instrument in a statistical sense. All three parameters are directly accessible from the Allan variance plot once sufficient data are collected for a reliable evaluation. It is interesting to note that the outcome of an Allan variance test generally looks nearly identical to previous ones as long as the instrumentation used for the test is not altered. This is particularly useful for checking the health of an instrument from time to time. Certainly, there are other methods to describe the noise performance of a radiometer, such as a plot of the noise power spectrum or of the correlation function, but it seems rather natural to use the Allan variance plot, since it is directly related to the normal observing procedure of measuring an "On''- and an "Off-position'' with a radio-telescope.
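Under the two-component model sigma_A^2(T) = a/T + b*T^alpha discussed above, the minimum time follows in closed form by setting the derivative to zero. A small sketch, with arbitrary model parameters, cross-checks the closed form against a brute-force scan:

```python
import numpy as np

def allan_model(T, a, b, alpha):
    # sigma_A^2(T) = a/T (radiometric part) + b*T^alpha (drift part)
    return a / T + b * T ** alpha

def t_min(a, b, alpha):
    # d/dT (a/T + b*T^alpha) = 0  ->  T_min = (a / (alpha*b)) ** (1/(1+alpha))
    return (a / (alpha * b)) ** (1.0 / (1.0 + alpha))

# Arbitrary illustrative parameters, not fitted to any real instrument.
a, b, alpha = 1.0, 1e-3, 1.5
tm = t_min(a, b, alpha)

# Cross-check against a brute-force scan of the model on a fine log grid.
grid = np.logspace(-1, 3, 20000)
tm_numeric = grid[np.argmin(allan_model(grid, a, b, alpha))]
```

In practice one fits a, b, and alpha to a measured Allan variance plot and reads off T_min; the closed form then makes the dependence on the drift slope explicit.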
If the fluctuation bandwidth B_F is changed, the minimum also shifts due to the changing level of white noise, but, despite the change of the leading factor, Eq. (13) is not altered because the time is normalized to the Allan variance minimum time. How the radiometric contribution decreases with increasing fluctuation bandwidth is clear from the radiometer equation. The drift contribution, however, should not change, since it does not depend on the shape of the filter-function of the actual spectrometer channel. The minimum therefore shifts to smaller times with increasing B_F like:
$$T_{\rm min}(B_{\rm F}) = T_{\rm min}(B_0)\left(\frac{B_{\rm F}}{B_0}\right)^{-1/(1+\alpha)}. \qquad (14)$$
Co-adding frequency pixels of a spectrometer output is standard practice in radio-astronomy when dealing with very broad emission lines, e.g. from other galaxies. Thus it is not uncommon to finally discuss spectra with an effective fluctuation bandwidth of the order of 50 MHz obtained by binning several spectrometer channels. A typical minimum time of a complete radiometer system at an observatory is somewhere around 30 s at a resolution of 1 MHz of the spectrometer. According to Eq. (14) one would then expect a shift of the minimum time to values somewhere between 4 and 8 s for the bins. A much larger bandwidth has to be dealt with when measuring continuum signals with large-bandwidth bolometers. A typical effective bandwidth may be of the order of 50 GHz. In this case the minimum of the Allan variance moves to values between 0.1 and 0.8 s, assuming the white noise is still purely radiometric while the drift noise remains as before. It is clear that the integration time used for sampling on each position may be a few seconds in the first case, but has to be less than 100 msec in the second.
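The bandwidth scaling of the minimum time can be checked numerically against the numbers quoted above (30 s at 1 MHz, binned to 50 MHz, and a 50 GHz bolometer case), for the two limiting drift slopes:

```python
def t_min_scaled(t_min_ref, b_ref, b_new, alpha):
    """Shift of the Allan minimum time with fluctuation bandwidth,
    T_min ~ B_F^(-1/(1+alpha)): white noise radiometric, drift unchanged."""
    return t_min_ref * (b_new / b_ref) ** (-1.0 / (1.0 + alpha))

# 30 s minimum at 1 MHz resolution, binned to 50 MHz effective bandwidth:
t_alpha1 = t_min_scaled(30.0, 1e6, 50e6, alpha=1)   # ~4 s
t_alpha2 = t_min_scaled(30.0, 1e6, 50e6, alpha=2)   # ~8 s

# Continuum bolometer with ~50 GHz effective bandwidth:
t_bol1 = t_min_scaled(30.0, 1e6, 50e9, alpha=1)     # ~0.1 s
t_bol2 = t_min_scaled(30.0, 1e6, 50e9, alpha=2)     # ~0.8 s
```

The two drift slopes bracket the "between 4 and 8 s" and "between 0.1 and 0.8 s" ranges stated in the text.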
As mentioned above, the Allan variance plot provides information about what to expect in case there are no gaps in time between the corresponding "signal'' (On) and "reference'' (Off) measurements. This is very close to the standard situation during observing, but now the presence of dead time has to be included in the discussion. When investigating the simple description of the Allan variance as a function of integration time from above, it seems plausible that the plot should also provide all information about the impact of drift noise if there is dead time between the two measurements. How to do this is fairly straightforward, and, in order to keep things short, we present the mathematical treatment only briefly.
Position-Switch measurements with one signal integration (On) per reference measurement (Off) are very common, for example for the observation of single positions in an extended source. In other cases Beam-Switch with a wobbling secondary mirror or Frequency-Switch measurements are applied, since these methods seem more promising for the resulting signal to noise ratio. In terms of the mathematical treatment all these methods are identical; only the typical time scale is different. In practice some dead time needs to be included in the observing procedure, but both the On- and the Off-integration are assumed to be of equal length. Following Eq. (1) we have for the signal- and the reference-measurement:
![]() |
|||
![]() |
![]() |
(15) |
![]() |
= | ![]() |
(16) |
= | ![]() |
Figure 2 shows the shape of Eq. (16) as a function of the relative integration time t for a few values of d. For each d > 0 the function has exactly one fairly broad minimum, and it is plausible that only in this minimum the observation can be done with maximum efficiency. Any other t leads to a higher noise level, i.e. to lower efficiency within a given observing time. This is explained by the facts that with very short integrations a lot of time is wasted while moving the telescope, while at very long integration times the drift noise starts to deteriorate the signal to noise ratio. In Fig. 3 the optimum integration time at the minimum of the variance is shown for both cases α = 1 and 2 as a function of the relative dead time d. The preferred relative integration time t is always significantly smaller than unity, which leads to the important conclusion that the integration time should always be considerably smaller than the Allan variance minimum time. With a realistic drift noise contribution (1 ≤ α ≤ 2) the optimum integration time will be located somewhere between the two solid lines in the plot. For the figure, also those limits for the integration time have been computed where the rms-noise is increased by less than 1% as compared to the optimum. The dotted curves indicate these limits for both α, and it appears that these regions overlap largely. The hatched area in the plot indicates where this overlap-region is found. It means that for any realistic scenario it is always possible to find an integration time with almost perfect noise performance independent of the actual drift characteristics of the system. Consequently, precise knowledge of the drift slope α is not really essential for the optimization procedure.
Figure 2: The development of the rms of Position-Switch measurements as a function of the relative integration time for a fixed drift slope, shown for a few values of the relative dead time d.
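Since the explicit form of Eqs. (15) and (16) is not reproduced here, the qualitative behaviour, a broad rms minimum at an intermediate integration time, can be illustrated by direct simulation. All numbers below (drift amplitude, filter width, dead time) are arbitrary choices, not the paper's values:

```python
import numpy as np

rng = np.random.default_rng(7)
n, W = 2_000_000, 100_000          # samples of synthetic data; boxcar drift width

white = rng.normal(0.0, 1.0, n)
# Boxcar-filtered white noise as slow drift, computed via cumulative sums.
raw = rng.normal(0.0, 1.0, n + W)
c = np.cumsum(raw)
drift = 100.0 * (c[W:] - c[:-W]) / W
signal = white + drift

def onoff_rms(x, t_samp, d_samp):
    """rms of (On - Off) pairs: integrate t_samp samples On, wait d_samp
    samples of dead time, integrate t_samp samples Off, wait again, repeat."""
    period = 2 * (t_samp + d_samp)
    diffs = []
    for k in range(len(x) // period):
        i = k * period
        on = x[i : i + t_samp].mean()
        off = x[i + t_samp + d_samp : i + 2 * t_samp + d_samp].mean()
        diffs.append(on - off)
    return float(np.std(diffs))

dead = 50                                   # dead time between On and Off
r_short = onoff_rms(signal, 20, dead)       # too short: white noise dominates
r_mid   = onoff_rms(signal, 1000, dead)     # near the broad minimum
r_long  = onoff_rms(signal, 50000, dead)    # too long: drift dominates
```

The intermediate integration time gives a clearly lower rms than either extreme, which is the qualitative content of Fig. 2.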
As mentioned before, with a standard low resolution spectrometer one typically finds an Allan variance minimum of a complete radiometer system in the range of 30 s. Chopped measurements, using a wobbling secondary telescope mirror for example, are considered the ideal method for point-like sources to reduce the impact of drift noise on the appearance of the baselines of the spectra. If the chop delay, i.e. the time to move the subreflector between the two positions, takes 100 msec for example, the optimum integration time per position is found near 4 s following Eq. (16). The situation seems to be different for the case d = 0, as it would apply for Frequency-Switch measurements, since the switch between the two nearby frequencies takes negligible time. But, as is visible in Fig. 2, the increase in rms noise is fairly marginal (1%) even for integration times T up to 14% of the Allan minimum time. This means that in all practical cases it is of no use to switch at high speed; on the contrary, the efficiency of the observation might suffer if dead time is involved. Even for spectra at moderately reduced frequency resolution the required integration time does not drop significantly below 1 s. It is therefore important to note that a higher chop frequency is only required for continuum measurements with very large bandwidth.
The ideal, theoretical limit for the observing efficiency is reached,
when effectively all integration time is spent on the On-position and
if there would be no dead time involved. In this case we have:
![]() |
= | ![]() |
(17) |
= | ![]() |
Figure 3: Optimum integration time as a function of On-Off delay for the two extreme drift contributions α = 1 and α = 2 (solid lines). The dotted curves indicate the limits within which the rms increase stays below 1%, and the hatched area marks the overlap of these regions.
Another and possibly more interesting case is the measurement of maps, either by Raster-Mapping or On-The-Fly. In both cases there are N different On-positions per Off-position in one cycle; the only difference is that for Raster-Mapping there is some dead time between the different On-positions which does not appear during OTF observations. It is found in the literature that the Off-integration time should be √N times longer than the On-integration time (Ball 1976). This advice leaves open the question how long the On-integration should last. For the following treatment of this question we assume that we have an On-integration time s, an Off-integration time r, a dead time d_1 between each of the On-measurements, another dead time d_2 to move from the last On- to the Off-position, and a different dead time d_3 to move the telescope back to the first On-position to begin the next cycle. It is plausible that d_3 will not be identical with d_2, since the first and last On-position are not the same, and the time to move between the positions (with different velocity requirements in OTF-mode as well) is definitely different.
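The bookkeeping of one observing cycle described above can be sketched as follows. The labels d1, d2, d3 for the three dead times are our own notation, and the sqrt(N) rule for the Off-integration is the one quoted from Ball (1976):

```python
import math

def cycle_time(N, s, r, d1, d2, d3):
    """Length of one observing cycle: N On-integrations of length s separated
    by dead times d1, a slew d2 to the Off-position, the Off-integration r,
    and a slew d3 back to the first On-position (all in the same time units)."""
    return N * s + (N - 1) * d1 + d2 + r + d3

def off_time(N, s):
    # Off-integration sqrt(N) times the On-integration (Ball 1976).
    return math.sqrt(N) * s

# Hypothetical OTF example: no dead time between Ons (d1 = 0).
s, N = 2.0, 50
tc = cycle_time(N, s, off_time(N, s), d1=0.0, d2=5.0, d3=6.0)
```

For Raster-Mapping the same function applies with d1 > 0; for Position-Switch it reduces to the N = 1 case.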
The delay between one of the On-positions and the Off-position also depends on the number of Ons in between. If we consider the worst case, we have to investigate the On-Off pair with the maximum delay involved, which is the first On-position when putting the Off at the end of the cycle. The delay is then:

$$d = (N-1)\,(s + d_1) + d_2. \qquad (18)$$
We also have to take into account now that the integration time for On
is different than for Off. Hence we write:
![]() |
(19) | ||
![]() |
![]() |
(20) |
The function g(s,r,d) is identical with f(t,d) for s=r=t (see Eq. (15)). The variance found here is valid for one pair of a particular On- and the corresponding Off-measurement.
We now have to identify how the noise develops if one wants to observe a full map within a given total observing time. One observing cycle consists of N identical On-integrations of length s, one Off-integration of length r, and the various dead times in between. Thus we have for the complete cycle time t_c:

$$t_{\rm c} = N\,s + (N-1)\,d_1 + d_2 + r + d_3. \qquad (21)$$
![]() |
(22) |
![]() |
= | ![]() |
(23) |
= | ![]() |
The minimum of the variance is found where all derivatives with respect to s, r, and N become zero. This is the set of variables for which the observing efficiency becomes the best possible under the given circumstances. (It is simple to prove that there is exactly one minimum as long as s, r and N are larger than zero.) Any other set of variables will result in a degradation of the observing efficiency. But, as was mentioned before, the use of the relation r = √N·s leads to results very close to this optimum. Therefore, for all practical purposes it is sufficient to apply only a two-dimensional optimization for the two variables s and N:
![]() |
(24) | ||
![]() |
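The sqrt(N) relation can be recovered numerically from a radiometric-only noise model. This is a sketch under strong assumptions (drift noise deliberately ignored), not the paper's full optimization of Eq. (23):

```python
import numpy as np

def map_noise(s, r, N, t_total, dead=0.0):
    """Radiometric-only noise model for one map point.

    Each On-Off difference has variance ~ 1/s + 1/r; one cycle lasts
    N*s + r + dead, so within t_total the cycle repeats
    K = t_total / (N*s + r + dead) times, reducing the variance by K.
    """
    K = t_total / (N * s + r + dead)
    return (1.0 / s + 1.0 / r) / K

# Fix the On-integration and scan the Off-integration on a fine grid.
N, t_total, s = 25, 10_000.0, 4.0
rs = np.linspace(0.5, 200.0, 100_000)
best_r = rs[np.argmin(map_noise(s, rs, N, t_total))]
# Analytic optimum for dead = 0: r = sqrt(N) * s  (Ball 1976).
```

The grid minimum lands at r = sqrt(25) * 4 = 20, confirming the rule for this simplified model; with nonzero dead time the optimum shifts to r = sqrt(s*(N*s + dead)).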
Usually it is rather difficult to make observations with an arbitrary number of Ons per Off for the given geometry of a particular map. It is therefore much more interesting to derive conclusive estimates for an optimized observation under the assumption of a predefined and fixed N, for both Raster-Mapping and OTF observations. In this case one has to find the minimum with:
![]() |
(25) |
In order to provide some idea about the best choice of the On-observing time s, the optimum integration time in OTF mode is shown in Fig. 4 as a function of the On-Off delay. The delay for the return to the beginning of the cycle is taken into account by a return delay 20% longer than the delay from the last On- to the Off-position. The two solid curves are derived from Eqs. (23) and (20) for the two limiting cases α = 1 and α = 2. The hatched area in the plot represents the region where the increase of the rms stays below 1% as compared to the optimum for both values of α. This means that for all assumed drift slopes one is always safe when choosing an On-integration time within this region. Such an optimized integration time can be described by the purely empirical formula:
![]() |
(26) |
Here d is the sum of all delays in one cycle. The formula is also valid for Raster-Mapping and Position-Switch measurements, and it may be used for relative delays between 0 and 1, for the range of N considered here, and for drift slopes between the two limiting cases.
Figure 4: Optimum On-integration time for OTF measurements with 50 Ons per Off. The hatched area represents the regime where the rms increase stays below 1% for any drift slope between the two limiting cases.
Finally, the overall observing efficiency can also be determined for the measurement of extended maps. The theoretically best possible value of the variance is given by:
![]() |
(27) |
Figure 5: Relative optimum efficiencies of OTF measurements for N=1, 10, and 100 On-positions per Off (see Eq. (27)). For each N the curves for both limiting drift slopes are shown.
How the efficiency develops with N is visible in Fig. 6 for some fixed On-Off delays. Obviously, the gain in efficiency with increasing N above N = 50 is rather marginal. It is therefore questionable whether a significant improvement in observing efficiency is achievable when going from N = 50 to N = 100, for example. Any reduction of the On-Off delay time would be a much more effective measure. On the other hand, the plot also shows how valuable an increase in N can be in case one is considering N = 10 or less.
Figure 6: Relative OTF efficiency as a function of the number of Ons per Off for various relative On-Off delays according to Eqs. (27), (25), and (23). For each delay the curves for both limiting drift slopes are shown.
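The diminishing return with increasing N can be illustrated with the same radiometric-only model used above (drift noise, and the constraint it puts on the On-integration time, is ignored, so the numbers are only qualitative):

```python
import math

def relative_efficiency(N, s, dead):
    """Radiometric-only efficiency sketch: ratio of the ideal noise budget
    (all time integrating On against a noiseless reference) to the achieved
    one, using r = sqrt(N)*s and a total cycle time N*s + r + dead."""
    r = math.sqrt(N) * s
    t_cycle = N * s + r + dead
    achieved = (1.0 / s + 1.0 / r) * t_cycle / N   # variance * time, per map point
    ideal = 1.0                                    # variance * time for pure On
    return ideal / achieved

# Hypothetical numbers: 2 s On-integration, 10 s of dead time per cycle.
e10 = relative_efficiency(10, 2.0, 10.0)
e50 = relative_efficiency(50, 2.0, 10.0)
e100 = relative_efficiency(100, 2.0, 10.0)
```

Even in this crude model the step from N = 10 to N = 50 gains far more than the step from N = 50 to N = 100, in line with the discussion above.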
One of the remaining questions is how long one cycle will last once the optimum On- and Off-integration times have been found. Using Eq. (21) it is now simple to calculate the cycle time as a function of the On-Off delay. In Fig. 7 the cycle time is plotted for the three cases N = 1, 10, and 100. At first sight it appears surprising that the time for a full cycle increases to values several times longer than the Allan variance minimum time in case there is substantial delay. But again, the length of one cycle depends strongly on the number of Ons per Off. Since the On-integration time is rather small at large N, the larger radiometric noise of the On-measurement dominates the noise budget, so that a longer delay with an increased contribution of drift noise becomes acceptable. For a given and fixed N the increase of the cycle time with increasing delay is the consequence of the fact that at larger integration times the loss due to drift noise is less costly than the loss due to the On-Off delay. This effect is also clearly visible in Fig. 2 for the case of Position-Switch measurements.
Figure 7: Cycle time for OTF measurements as a function of On-Off delay. The cycle time comprises N On-integrations, one Off-integration, and the dead times in between. The three cases (N=1, 10, and 100) are calculated from Eqs. (21), (23), and (25). Similar to Fig. 5, the Position-Switch situation is also indicated by the dotted lines. Note that the increase of cycle time is partly due to the time spent during slew from On to Off and back.
The discussion above provides some clear guidelines for an optimized observing program. The first step has to be a reliable measurement of the system Allan variance. The word "system'' includes all components of the observatory which may possibly contribute to the noise, including the atmospheric fluctuations for example. Knowing the applicable dead times, a simple calculation of the optimum integration time can be made by using the "rule of thumb'' given by Eq. (26). As was pointed out before, Position-Switch or Chop measurements should be done in a most economical way by moving the telescope or the chopper only every second time. OTF or Raster-Mapping measurements need a clear understanding of the impact of the number of On-positions chosen for each Off-integration. Here too it might be of some value to reverse the sequence of the integrations on the various positions every second time, in order to reduce some of the loss in time due to the slew of the telescope between the On- and the Off-positions. It should be noted that the measurement of large maps can be handled in different ways. If one wants to achieve a certain signal to noise ratio, it might be advisable to use a larger N with a smaller On-integration time and to repeat the map several times, as accounted for by the parameter K in Eq. (22). In any case, the suggested On- and Off-integration times should not be drastically altered, although the plot in Fig. 4 indicates that there is quite some margin available.
In general it is surprising how close together the curves for the different drift slopes in Figs. 5-7 are found, which is a clear validation of the assumption that it is sufficient to consider only the extreme cases for the drift contributions. Therefore, there is no need to go too deeply into the analysis of the drift part of the noise. It is also good news that the treatment here still preserves some freedom to plan the observation. This might be particularly important when considering the constraints set by the observatory hardware. It is probably not advisable to operate with too short integration intervals, since the data volume might become overwhelming, and the storage capacity of the computers could easily be exceeded. Therefore, the conclusion found before, that there are no real requirements for high speed observing most of the time, is very important.
The discussion above is most useful for observations with space-borne observatories like SWAS (Melnick et al. 2000), ODIN (Hjalmarson 1993) or FIRST (de Graauw et al. 1998). Since a satellite usually cannot be re-oriented in space very rapidly, the impact of dead time becomes vital. The SWAS satellite is not capable of controlling the pointing very accurately during a slew across an extended source, so that the OTF mode is not applicable. Instead, Raster-Mapping is the generally used procedure. On the other hand, since SWAS is a very small satellite, it can be pointed from one position to a second one 3 degrees away in less than 15 s. A 3-degree nod is often required during observations in the Milky Way, since the emission of molecules like CO is fairly extended. Nevertheless, the loss in observing efficiency looks acceptable when considering an Allan variance minimum time of the SWAS receiver/backend system of about 150 s, as found in orbit.
On the Herschel space observatory the situation will change drastically. We can assume that the pointing of the telescope during slew is well defined, so that OTF measurements should be applicable. But, due to the fact that Herschel is going to be a very heavy satellite, a movement by three degrees will take nearly as long as the expected Allan variance minimum time. In consequence, the relative dead times will be close to unity when assuming a system stability similar to that of SWAS. This prohibits Position-Switch measurements with the instrument, because the efficiency would drop to values below 30%, which would certainly be rather disappointing because of the consequences for the extremely precious and limited observing time. Therefore, a very careful analysis for determining the best possible observing strategy is extremely important for such a program.
Rather different circumstances exist at ground-based observatories. The typical dead time for a slew of 3 degrees is only of the order of a few seconds, so the impact of dead time is not as devastating as for space-based observatories. A detailed planning of an observing strategy does not seem to be so easily implemented, particularly if other parameters like varying hardware constraints or human limitations play a significant role as well. Typically, the Allan variance minimum time of most ground-based sub-millimeter observatories is rather small, partly due to the impact of an unstable atmosphere. The advantage of a smaller dead time is therefore partly eaten up by the reduced stability. But still, as should be clear from the discussion above, the actual situation has to be analyzed in detail for every individual case in order to achieve as much scientific return from the observations as possible. For this the analysis presented in this paper should be very useful.
Co-adding a couple of pixels in a measured spectrum in order to improve the signal to noise ratio is general practice when dealing with noisy spectra, but the consequences of this procedure are not quite as trivial as one would like to believe. For the discussion we start again with the normalized first order correlation function as defined in Eq. (4):

$$g_m = \frac{\langle y_n\,y_{n+m}\rangle}{\langle y_n^2\rangle}.$$
The data y_n are here the pixel components of a fully calibrated spectrum as measured with a multi-channel spectrometer. The index "m'' describes by how many pixels the spectrum is shifted before the multiplication of the pixel data is done. The correlation function is symmetric, since g_{-m} = g_m. We assume that all y_n behave identically in a purely statistical sense. Then the values of g_m depend only on the "distance'' between the data given by the parameter "m'', and the expectation values as defined by the brackets become independent of n. We now have to determine the expected statistics of the new co-added data set z_n with:
$$z_n = \frac{1}{K}\sum_{k=0}^{K-1} y_{nK+k}$$

$$\langle z_n^2\rangle = \frac{1}{K^2}\sum_{k,k'=0}^{K-1}\big\langle y_{nK+k}\,y_{nK+k'}\big\rangle = \frac{\langle y_n^2\rangle}{K}\left[1 + 2\sum_{m=1}^{K-1}\left(1-\frac{m}{K}\right)g_m\right].$$
Only the first few values of g_m (m not larger than about 3) should be non-zero for a decent spectrometer, since the overlap of the power response functions between neighbouring pixels should be small. Therefore, in the limiting case of a very large width of the bins (K large), we now get:

$$\langle z_n^2\rangle \approx \frac{\langle y_n^2\rangle}{K}\left[1 + 2\sum_{m\ge 1} g_m\right].$$
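The variance penalty from correlated neighbouring channels can be sketched with the standard expression for the variance of a mean of correlated data; the correlation values g below are hypothetical:

```python
def binned_variance_factor(g, K):
    """Variance of the mean of K co-added channels relative to the naive
    <y^2>/K for uncorrelated data, given channel correlations g (g[0] = 1):
    factor = 1 + 2 * sum_{m=1}^{K-1} (1 - m/K) * g_m.
    """
    total = 1.0
    for m in range(1, K):
        gm = g[m] if m < len(g) else 0.0   # g_m = 0 beyond the known overlap
        total += 2.0 * (1.0 - m / K) * gm
    return total

g = [1.0, 0.3, 0.1]                        # hypothetical neighbour-pixel overlap
f_small = binned_variance_factor(g, 4)     # modest bin: partial penalty
f_large = binned_variance_factor(g, 100)   # wide bin: approaches 1 + 2*(0.3 + 0.1)
```

A factor above unity means the noise of the binned spectrum drops more slowly than the naive 1/sqrt(K) expectation, which is the non-trivial consequence alluded to above.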