A&A 373, 746-756 (2001)
DOI: 10.1051/0004-6361:20010611
R. Schieder - C. Kramer
I. Physikalisches Institut, Universität zu Köln, Zülpicher Straße 77, 50937 Köln, Germany
Received 11 January 2001 / Accepted 26 April 2001
Abstract
Stability tests based on the Allan variance method have become a standard procedure for evaluating the quality of radio-astronomical instrumentation. They are very simple and simulate the situation encountered when detecting weak signals buried in large noise fluctuations. For the special conditions during observations, an outline of the basic properties of the Allan variance is given, together with some guidelines on how to interpret the results of such measurements. Based on a rather simple mathematical treatment, clear rules for observations in "Position-Switch", "Beam-" or "Frequency-Switch", "On-The-Fly" and "Raster-Mapping" mode are derived. In addition, a simple "rule of thumb" for estimating the optimum timing of the observations is found. The analysis leads to a conclusive strategy for planning radio-astronomical observations. Particularly for air- and space-borne observatories it is very important to determine how the extremely precious observing time can be used with maximum efficiency. The analysis should help to increase the scientific yield significantly in such cases.
Key words: instrumentation: miscellaneous - methods: data analysis, observational - space vehicles: instruments - techniques: spectroscopic - telescopes
Allan variance measurements have been demonstrated to be a useful tool for characterizing the stability of radio-astronomical equipment such as millimeter- or submillimeter-receivers and large-bandwidth back-ends (Schieder et al. 1985; Kooi et al. 2000). Particularly for the development of acousto-optical spectrometers (AOS) at the Kölner Observatorium für SubMillimeter Astronomie (KOSMA) the method has played a very important role, because it provides clear evidence, by means of a reliable laboratory test procedure, that the spectrometers are well suited for use at an observatory (Tolls et al. 1989). The simple definition of the Allan variance makes it very easy to apply such measurements to the characterization of the stability of other instruments as well; a very elementary case is the assessment of the quality of a simple lock-in amplifier, for example.
For a real-time spectrometer with many simultaneously operating frequency channels, as used in radio astronomy, it is an important condition that all channels behave identically in a statistical sense. The use of the Allan variance for investigating the performance of the spectrometer is therefore based on the assumption that there are no differences between frequency channels. That this is not always correct is evident. Thus, it is always necessary to verify the similarity of all frequency channels of the spectrometer, for example by investigating the baseline noise of measured spectra. Typical problem areas are light-scatter problems in acousto-optical spectrometers (AOS), where speckles may affect individual channels more heavily than others. The same is true for filterbanks, which occasionally have some peculiar channels even in a well maintained back-end system. But in all normal cases of well-behaved instrumentation, the Allan variance plot is a most useful method to precisely characterize the instrumentation in use.
In general, observations at an observatory are done with the available instrumentation as is; it cannot be modified or improved by the observer. Instead, the observer has to find the correct observing parameters in order to use the available hardware in the most economic way. It is the purpose of this paper to develop a strategy for optimizing the observing process. For this the knowledge of the stability parameters is decisive. Once this information is available, from an Allan variance measurement for example, it should be a rather straightforward matter to determine the essential parameters like the length of integration per position on sky et cetera. The following mathematical treatment analyses the commonly used observing methods, i.e. "Position-", "Beam-" or "Frequency-Switch", "On-The-Fly" (OTF) and "Raster-Mapping" measurements, based on the information contained in the Allan variance plot. As a result, practical guidelines for the most efficient observing method are found, which can be used at any radio observatory. In particular, all space- or air-borne observatories require a most efficient use of the extremely precious observing time, since any loss can usually not be compensated by a simple increase in observatory time. But the results found in the following should be very useful for ground-based observatories as well.
If a test procedure is defined for use at any time and at any location, it needs to be as simple and unique as possible. Therefore, we understand the Allan variance as the ordinary statistical variance of the difference of two contiguous measurements (see also Rau & Schieder 1984). Consider a signal function s(t), which is the instantaneous output of a spectrometer channel or of a continuum detector for example, together with a reference function r(t). Each output is integrated for a time interval T, representing an estimate of the mean signal, which is stored as spectrometer data in the computer:

x(T) = (1/T) ∫_0^T s(t) dt    (1)

y(T) = (1/T) ∫_0^T r(t) dt    (2)
In order to obtain a plausible estimate of the error of the difference we use the standard definition of the variance, where x and y denote the two integrated outputs of Eqs. (1) and (2):

σ^2(T) = <[(x - y) - <x - y>]^2>    (3)

If we now apply Eq. (1), we get:

σ^2(T) = <(x - <x>)^2> + <(y - <y>)^2> - 2 <(x - <x>)(y - <y>)>    (4)
If we have the same statistics for both "s" and "r", then we finally get:
We have not yet made any particular assumption about the source of the signal and reference data. For our application here, the two data "s" and "r" are derived from the same output signal s(t) of one spectrometer channel. The two acquisition periods of length T for the integration of the "s"- and the "r"-data must therefore occur one after the other in order to avoid any undesirable overlap between the two measurements. For an unequivocal definition of the instrumental Allan variance we assume that all "s" and "r" measurements are contiguous, without any dead time in between. In real life, when observing, there will always be some unavoidable dead time, since the telescope needs to be moved between the On- and the Off-position, or time is needed for data transfer etc. Any delay will increase the impact of slow drift noise, and it will therefore result in a different appearance of the system noise. Such effects will be discussed in the next section.
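The contiguous two-sample definition above can be sketched numerically. The following snippet is an illustration, not the authors' code; the factor 1/2 and the normalization by the squared mean are assumed conventions, chosen so that pure radiometric noise yields a relative Allan variance of 1/(B_Fl·T):

```python
import numpy as np

def allan_variance(x, dt, taus):
    """Relative Allan variance of a detector time series x sampled every
    dt seconds: for each integration time T the series is averaged over
    adjacent, gap-free intervals of length T, and half the mean squared
    difference of neighboring averages is taken, normalized by the
    squared mean signal."""
    x = np.asarray(x, dtype=float)
    mean = x.mean()
    result = []
    for tau in taus:
        n = int(round(tau / dt))        # samples per interval of length T
        m = len(x) // n                 # number of contiguous intervals
        if m < 2:
            result.append(np.nan)
            continue
        means = x[:m * n].reshape(m, n).mean(axis=1)
        diffs = np.diff(means)          # contiguous "s"/"r" style pairs
        result.append(0.5 * np.mean(diffs**2) / mean**2)
    return np.array(result)
```

For pure white noise the result falls off as 1/T, i.e. with a slope of -1 in a log-log plot, as discussed below.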
For a given integration time the signal output of one spectrometer channel is described by Eq. (1). We can describe the instantaneous noise signal s(t) before integration by using the (in this case not normalized) auto-correlation function, here written as a function of the delay time τ:
(5)

(6)

(7)

(8)

(9)
Figure 1: Artificial data set generated by random numbers (left) with white noise of Gaussian distribution (top), drift noise (middle), and combined noise (bottom). Each data point corresponds to a sample integrated for 1 s, while the fluctuation bandwidth was set to 600 kHz. The drift noise is calculated by filtering white noise with a sufficiently broad boxcar time-filter. To the right, the (relative) Allan variance plots of all three noise spectra are depicted. The white noise appears with a slope of -1, the drift noise with a slope of approximately +1. The combination of both results in a typical Allan plot with a minimum at some fairly well defined minimum time.
In this approximation we have now for the Allan variance according to
Eqs. (6), (7), and (9):
(10)
If we assume a simple power law for the drift contribution with a well defined exponent α, and if we consider the additional presence of radiometric noise, or "white noise", we expect the Allan variance to have the following structure as a function of integration time:

σ_A^2(T) = a/T + b T^α

with a radiometric coefficient a and a drift coefficient b.
Within the white noise part of the Allan plot, i.e. the regime with the slope of "-1", the radiometer equation must be valid:

ΔT_rms / T_sys = 1 / sqrt(B_Fl T)    (11)

(12)
In most practical cases it is very useful to refer to the particular integration time in the Allan variance plot where the minimum occurs. This minimum marks the turn-over point where the radiometric noise, with a slope of -1 in the logarithmic plot, becomes dominated by the additional and undesired drift noise (see Fig. 1). Above the minimum time the rms of the measurements becomes much larger than anticipated by the radiometer equation alone. Intuitively, the minimum time might appear as an upper limit for the integration on individual positions during radio-astronomical observations, but the Allan variance plot offers much more detailed advice when planning the most efficient observing strategy under the given circumstances. Since any additional noise above the radiometric level is very unfavorable, one has to find the optimum integration time, for which the loss due to inevitable dead time during slew of the telescope etc. is as small as possible, while at the same time the impact of drift contributions is nearly negligible. To find this best compromise is the goal of the following sections.
By use of the minimum time T_A^min of the variance we can now rewrite the above equation as:

σ_A^2(T) = 1/(B_Fl T_A^min) [ T_A^min/T + (1/α) (T/T_A^min)^α ]    (13)
The slope of the drift part in the Allan variance plot is, as seen in Fig. 1, also one of the important parameters for the characterization of the instrument. We can therefore conclude that the minimum time, the fluctuation bandwidth, and the slope at large integration times are the three parameters which fully characterize the instrument in a statistical sense. All three parameters are directly accessible from the Allan variance plot once sufficient data have been collected for a reliable evaluation. It is interesting to note that the outcome of an Allan variance test generally looks nearly identical to previous ones as long as the instrumentation under test is not altered. This is particularly useful for checking the health of an instrument from time to time. Certainly, there are other methods to describe the noise performance of a radiometer, like a plot of the noise power spectrum or of the correlation function, but it seems rather natural to use the Allan variance plot, since it is directly related to the normal observing procedure of measuring an "On"- and an "Off-position" with a radio telescope.
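As an illustration of how these three numbers can be read off a measured curve, the sketch below fits the generic two-component model σ_A^2(T) = a/T + b·T^α discussed above. The function name and the simple fitting strategy are our own assumptions, not a procedure from this paper:

```python
import numpy as np

def characterize(taus, avar):
    """Fit sigma_A^2(T) = a/T + b*T**alpha to an Allan variance curve and
    return the three characterizing numbers: fluctuation bandwidth
    B_Fl = 1/a, drift slope alpha, and the minimum time where the
    derivative of the model vanishes."""
    taus, avar = np.asarray(taus, float), np.asarray(avar, float)
    # white-noise coefficient from the shortest times (slope -1 regime)
    a = np.median(avar[:3] * taus[:3])
    drift = avar - a / taus                 # leftover drift contribution
    mask = drift > 0.1 * avar               # keep drift-dominated points
    alpha, logb = np.polyfit(np.log(taus[mask]), np.log(drift[mask]), 1)
    b = np.exp(logb)
    # d/dT (a/T + b T^alpha) = 0  ->  T_min = (a/(alpha*b))**(1/(alpha+1))
    t_min = (a / (alpha * b)) ** (1.0 / (alpha + 1.0))
    return 1.0 / a, alpha, t_min
```

In practice one would fit measured Allan variance data; here the point is only that the minimum time follows from the balance of the two terms.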
If the fluctuation bandwidth B_Fl is changed, the minimum also shifts due to the changing level of white noise; but, despite the change of the leading factor, Eq. (13) is not altered, due to the normalization of the time with the Allan variance minimum time. How the radiometric contribution decreases with increasing fluctuation bandwidth is clear from the radiometer equation. The drift contribution, however, should not change, since it does not depend on the shape of the filter function of the actual spectrometer channel. The minimum therefore shifts to smaller times with increasing B_Fl like

T_A^min(B_Fl) = T_A^min(B_0) (B_0/B_Fl)^(1/(1+α))    (14)

where B_0 is the fluctuation bandwidth at which the minimum time was originally measured.
Co-adding frequency pixels of a spectrometer output is standard practice in radio astronomy when dealing with very broad emission lines, e.g. from other galaxies. Thus it is not uncommon to finally discuss spectra with an effective fluctuation bandwidth of the order of 50 MHz, obtained by binning several spectrometer channels. A typical minimum time of a complete radiometer system at an observatory is somewhere around 30 s at a resolution of 1 MHz of the spectrometer. According to Eq. (14) one would then expect a shift of the minimum time to values somewhere between 4 and 8 s for the bins. One has to deal with a much larger bandwidth when measuring continuum signals with large-bandwidth bolometers. A typical effective bandwidth may be of the order of 50 GHz. In this case the minimum of the Allan variance moves to values between 0.1 and 0.8 s, assuming that the origin of the white noise is still just radiometric while the drift noise remains as before. It is clear that the integration time used for sampling on each position may be a few seconds in the first case, but has to be less than 100 ms in the second.
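The quoted numbers follow directly from the scaling of Eq. (14); a quick check with B_0 = 1 MHz, a 30 s minimum time, and the two limiting drift slopes reproduces them:

```python
def shifted_minimum(t_min0, b0, b, alpha):
    # Eq. (14): the Allan minimum time scales as B_Fl**(-1/(1+alpha))
    # when only the radiometric (white) part depends on the bandwidth.
    return t_min0 * (b0 / b) ** (1.0 / (1.0 + alpha))

# 30 s minimum at 1 MHz resolution, channels binned to 50 MHz:
print([round(shifted_minimum(30.0, 1e6, 50e6, a), 1) for a in (1.0, 2.0)])
# -> [4.2, 8.1]

# a 50 GHz bolometer bandwidth instead:
print([round(shifted_minimum(30.0, 1e6, 50e9, a), 2) for a in (1.0, 2.0)])
# -> [0.13, 0.81]
```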
As was mentioned above, the Allan variance plot provides information about what to expect in case there are no gaps in time between the corresponding measurements "signal" (On) and "reference" (Off). This is very close to the standard situation during observing, but now the presence of dead time has to be included in the discussion. From the simple description of the Allan variance as a function of integration time given above, it seems plausible that the plot should also provide all information about the impact of drift noise if there is dead time between the two measurements. How to do this is fairly straightforward, and, in order to keep things short, we present the mathematical treatment only briefly.
Position-Switch measurements with one signal integration (On) per reference measurement (Off) are very common, for example for the observation of single positions in an extended source. In other cases Beam-Switch, with a wobbling secondary mirror, or Frequency-Switch measurements are applied, since these methods seem more promising for the resulting signal-to-noise ratio. In terms of the mathematical treatment, all these methods are identical; only the typical time scale is different. In practice some dead time needs to be included in the observing procedure, but both the On- and the Off-integration are assumed to be of equal length. Following Eq. (1) we have for the signal and the reference measurement:
(15)

(16)
Figure 2 shows the shape of Eq. (16) as a function of the relative integration time t for a few values of d. For each d > 0 the function has exactly one fairly broad minimum, and it is plausible that only in this minimum the observation can be done with maximum efficiency. Any other t leads to a higher noise level, i.e. to lower efficiency within a given observing time. This is explained by the fact that with very short integrations a lot of time is wasted while moving the telescope, whereas at very long integration times the drift noise starts to deteriorate the signal-to-noise ratio. In Fig. 3 the optimum integration time at the minimum of the variance is shown for both cases α = 1 and α = 2 as a function of the relative dead time d. The preferred relative integration time t is always significantly smaller than unity, which leads to the important conclusion that the integration time should always be considerably smaller than the Allan variance minimum time. With a realistic drift noise contribution (1 ≤ α ≤ 2) the optimum integration time will be located somewhere between the two solid lines in the plot. For the figure, also those limits for the integration time have been computed where the rms noise is increased by less than 1% as compared to the optimum. The dotted curves indicate these limits for both values of α, and it appears that these regions overlap to a large extent. The hatched area in the plot indicates where this overlap region is found. It means that for any realistic scenario it is always possible to find an integration time with almost perfect noise performance, independent of the actual drift characteristics of the system. Consequently, precise knowledge of the drift slope α is not really essential for the optimization procedure.
Figure 2: The development of the rms of Position-Switch measurements as a function of integration time for a given drift slope in the Allan variance plot (see Eq. (16)). The curves are calculated for several delay times between On- and Off-position (d = 0, ..., 0.25). The dotted curve connects the minima of the curves and represents the optimum integration time for all delays. The values of the delay time d as well as of the integration time are given in units of the Allan variance minimum time.
As was mentioned before, with a standard low-resolution spectrometer one typically finds an Allan variance minimum of a complete radiometer system in the range of 30 s or so. Chopped measurements, using a wobbling secondary telescope mirror for example, are considered the ideal method for point-like sources to reduce the impact of drift noise on the appearance of the baselines of the spectra. If the chop delay, i.e. the time to move the subreflector between the two positions, takes 100 ms for example, the optimum integration time per position is found near 4 s following Eq. (16). The situation seems different for the case d = 0, as it would apply for Frequency-Switch measurements, since the switch between the two nearby frequencies takes negligible time. But, as is visible in Fig. 2, the increase in rms noise is fairly marginal (1%) even for integration times T up to 14% of the Allan variance minimum time. This means that in all practical cases it is of no use to switch at high speed; on the contrary, the efficiency of the observation might even suffer if dead time is involved. Even for spectra at moderately reduced frequency resolution the required integration time does not drop significantly below 1 s. It is therefore important to note that a higher chop frequency is only required for continuum measurements with very large bandwidth.
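Since the exact form of Eq. (16) is not reproduced here, a simplified stand-in illustrates the trade-off: take the radiometric variance per unit total observing time, inflate it by the dead-time overhead and by a drift penalty growing as t^(1+α), with all times in units of the Allan minimum time. The model and the function names are our own assumptions, not the paper's f(t,d):

```python
import numpy as np

def relative_rms2(t, d, alpha):
    # Variance after a fixed total observing time for symmetric On-Off
    # switching: radiometric part ~1/t, time overhead (2t+d)/t, and an
    # assumed drift penalty (1 + t**(1+alpha)/alpha); t and d are in
    # units of the Allan minimum time.  A toy stand-in for Eq. (16).
    return (2.0 * t + d) / t * (1.0 + t ** (1.0 + alpha) / alpha)

def optimum_t(d, alpha):
    grid = np.linspace(1e-3, 1.0, 20000)
    return grid[np.argmin(relative_rms2(grid, d, alpha))]

# 100 ms chop dead time with a 30 s Allan minimum -> d = 0.1/30
for alpha in (1.0, 2.0):
    print(alpha, round(30.0 * optimum_t(0.1 / 30.0, alpha), 1))
```

Even with this crude model the optimum lands at a few seconds, consistent with the value quoted above, and it grows with the dead time d, as in Fig. 3.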
The ideal, theoretical limit of the observing efficiency is reached when effectively all integration time is spent on the On-position and no dead time is involved. In this case we have:

(17)
Figure 3: Optimum integration time as a function of On-Off delay for the two extreme drift contributions with α = 1 and α = 2, as found from Eq. (16). The dotted curves represent the intervals where the rms is increased by 1% at maximum for both values of α. The hatched area defines the regime where the rms increase is less than 1% independent of the actual value of α. In this area the preferred choice of the integration time is found. The values of the delay time d as well as of the integration time are given in units of the Allan variance minimum time.
Another and possibly more interesting case is the situation when measuring maps, either by Raster-Mapping or On-The-Fly. In both cases there are N different On-positions per Off-position in one cycle; the only difference is that for Raster-Mapping there is some dead time between the different On-positions which does not appear during OTF observations. It is found in the literature that the Off-integration time should be sqrt(N) times longer than the On-integration time (Ball 1976). This advice leaves open the question of how long the On-integration should last. For the following treatment of this question we assume that we have an On-integration time s, an Off-integration time r, a dead time between each of the On-measurements, another dead time to move from the last On- to the Off-position, and a different dead time to move the telescope back to the first On-position to begin the next cycle. It is plausible that these last two dead times will not be identical, since the first and the last On-position are not the same, and the time to move between the positions (with different velocity requirements in OTF mode as well) is definitely different.
The delay between one of the On-positions and the Off-position also depends on the number of Ons in between. If we consider the worst case, we have to investigate the On-Off pairs with the maximum delay involved, which is the first On-position when putting the Off at the end of the cycle. The delay is then:

(18)
We also have to take into account that the integration time for On is different from that for Off. Hence we write:

(19)

(20)
The function g(s,r,d) is identical with f(t,d) for s=r=t (see Eq. (15)). The variance found here is valid for one pair of a particular On- and the corresponding Off-measurement.
We now have to identify how the noise develops if one wants to observe a full map within a given total observing time. One observing cycle consists of N identical On-integrations (each of length s), one Off-integration (of length r), and the various dead times in between. Thus we have for the complete cycle time:

(21)

(22)

(23)
The minimum of the variance can be found where all derivatives with respect to s, r, and N become zero. This is the set of variables for which the observing efficiency becomes the best possible under the given circumstances. (It is simple to prove that there is exactly one minimum as long as s, r and N are larger than zero.) Any other set of variables will result in a degradation of the observing efficiency. But, as was mentioned before, the use of the relation r = sqrt(N) s leads to results very close to this optimum. Therefore, for all practical purposes it is sufficient to apply only a two-dimensional optimization for the two variables s and N:

(24)
Usually, it is rather difficult to make observations with an arbitrary number of Ons per Off for the given geometry of a particular map. It is therefore much more interesting to derive conclusive estimates for an optimized observation under the assumption of a predefined and fixed N, for both Raster-Mapping and OTF observations. In this case one has to find the minimum with:

(25)
In order to provide some idea about the best choice of the On-integration time s, the optimum integration time in OTF mode is shown in Fig. 4 as a function of the On-Off delay. The delay for the return to the beginning of the cycle is taken into account by a return dead time 20% longer than the On-Off delay. The two solid curves are derived from Eqs. (23) and (20) for the two limiting cases α = 1 and α = 2. The hatched area in the plot represents the region where the increase of the rms stays below 1% as compared to the optimum for both values of α. This means that for all assumed drift slopes one is always safe when choosing an On-integration time within this region. Such an optimized integration time can be described by the purely empirical formula:

(26)
Here d is the sum of all delays in one cycle. The formula is also valid for Raster-Mapping and Position-Switch measurements, and it may be used for relative delays between 0 and 1 and for drift slopes between α = 1 and α = 2.
Figure 4: Optimum On-integration time for OTF measurements with 50 Ons per Off. The hatched area represents the regime where the rms increase stays below 1% for any α between 1 and 2. The dotted curve in the middle represents the suggested On-integration time using Eq. (26). As is clearly visible, the optimum integration time is typically of the order of a few seconds when assuming an Allan variance minimum time near or above 100 s.
Finally, the overall observing efficiency can also be determined for the measurement of extended maps. The theoretically best possible value of the variance is given by:

(27)
Figure 5: Relative optimum efficiencies of OTF measurements for N = 1, 10, and 100 On-positions per Off (see Eq. (27)). For each N both curves, for α = 1 and α = 2, are plotted. It is obvious that larger N leads to higher efficiency. The dotted curves for N = 1 represent the Position-Switch situation with an On-Off delay every second time only. This is taken into account by an accordingly reduced delay in Eqs. (23) and (27) while N = 1.
How the efficiency develops with N is visible in Fig. 6 for some fixed On-Off delays. Obviously, the gain in efficiency with increasing N above N = 50 is rather marginal. It is therefore questionable whether a significant improvement in observing efficiency is achievable when going from N = 50 to N = 100, for example. Any reduction of the On-Off delay time would be a much more effective measure. On the other hand, the plot also shows how valuable an increase in N can be when one is considering N = 10 or less.
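The diminishing return with N can be made plausible with a small calculation that keeps only the radiometric bookkeeping: if N On-positions share one Off and the Off-integration follows the r = sqrt(N)·s rule, the variance per map point relative to the ideal noise-free-reference case yields an efficiency of N/(sqrt(N)+1)^2. Dead time and drift, neglected here, only lower these values further; this is our own sketch, not Eq. (27):

```python
import math

def mapping_efficiency(n):
    # Per-point variance with a shared Off (r = sqrt(n)*s): the Off noise
    # inflates the variance by (1 + 1/sqrt(n)) and the Off time stretches
    # the cycle by the same factor, so eff = n / (sqrt(n) + 1)**2.
    return n / (math.sqrt(n) + 1) ** 2

for n in (1, 10, 50, 100):
    print(n, round(mapping_efficiency(n), 3))
# -> 0.25, 0.577, 0.768, 0.826: going from 50 to 100 gains little.
```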
Figure 6: Relative OTF efficiency as a function of the number of Ons per Off for various relative On-Off delays, according to Eqs. (27), (25), and (23). For each delay both curves, for α = 1 and α = 2, are plotted.
One of the remaining questions is how long one cycle will last once the optimum On- and Off-integration times have been found. Using Eq. (21) it is now simple to calculate the cycle time as a function of the On-Off delay. In Fig. 7 the cycle time is plotted for the three cases N = 1, 10, and 100. At first sight it appears surprising that the time for a full cycle increases to values several times longer than the Allan variance minimum time in case there is substantial delay. But again, the length of one cycle depends strongly on the number of Ons per Off. Since the On-integration time is rather small at large N, the larger radiometric noise of the On-measurement dominates the noise budget, so that a longer delay with an increased contribution of drift noise becomes acceptable. For a given and fixed N the increase of the cycle time with increasing delay is a consequence of the fact that at larger integration times the loss due to drift noise is less costly than the loss due to the On-Off delay. This effect is also clearly visible in Fig. 2 for the case of Position-Switch measurements.
Figure 7: Cycle time for OTF measurements as a function of On-Off delay. The cycle time comprises N On-integrations, one Off-integration, and the dead times in between. The three cases (N=1, 10, and 100) are calculated from Eqs. (21), (23), and (25). Similar to Fig. 5, the Position-Switch situation is also indicated by the dotted lines. Note that the increase of cycle time is partly due to the time spent during slew from On to Off and back. | |
The discussion above provides some clear guidelines for an optimized observing program. The first step has to be a reliable measurement of the system Allan variance; the word "system" includes all components of the observatory which may possibly contribute to the noise, including atmospheric fluctuations for example. Once the applicable dead times are known, a simple calculation of the optimum integration time can be made by using the "rule of thumb" given by Eq. (26). As was pointed out before, Position-Switch or chop measurements should be done in a most economical way by moving the telescope or the chopper only every second time. OTF or Raster-Mapping measurements need a clear understanding of the impact of the number of On-positions chosen for each Off-integration. Here too it might be of some value to reverse the sequence of the integrations on the various positions every second time in order to reduce some of the loss in time due to the slew of the telescope between the On- and the Off-positions. It should be noted that the measurement of large maps can be handled in different ways. If one wants to achieve a certain signal-to-noise ratio, it might be advisable to use larger N with a smaller On-integration time and to repeat the map several times, as is accounted for by the parameter K in Eq. (22). In any case, the suggested On- and Off-integration times should not be drastically altered, although the plot in Fig. 4 indicates that there is quite some margin available.
In general it is surprising how closely together the curves for the different values of α in Figs. 5-7 lie, which is a clear validation of the assumption that it is sufficient to consider only the extreme cases for the drift contributions. Therefore, there is no need to go too deeply into the analysis of the drift part of the noise. It is also good news that the treatment here still preserves some freedom in planning the observation. This might be particularly important when considering the constraints set by the observatory hardware. It is probably not advisable to operate with too short integration intervals, since the data flood might become overwhelming, and the storage capacity of the computers could easily be exceeded. Therefore, the conclusion found before, that most of the time there are no real requirements for high-speed observing, is very important.
The discussion above is most useful for observations with space-borne observatories like SWAS (Melnick et al. 2000), ODIN (Hjalmarson 1993) or FIRST (de Graauw et al. 1998). Since usually a satellite cannot be re-oriented in space very rapidly, the impact of dead time becomes vital. The SWAS satellite is not capable of controlling the pointing very accurately during a slew across an extended source, so that the OTF mode is not applicable. Instead, Raster-Mapping is the generally used procedure. On the other hand, since SWAS is a very small satellite, it can be pointed from one position to a second at 3 degrees distance within less than 15 s. A 3-degree nod is often required during observations in the Milky Way, since the emission of molecules like CO is fairly extended. Nevertheless, the loss in observing efficiency looks acceptable when considering an Allan variance minimum time of the SWAS receiver/backend system of about 150 s, as found in orbit. On the Herschel space observatory the situation will change drastically. We can assume that the pointing of the telescope during a slew is well defined, so that OTF measurements should be applicable. But, due to the fact that Herschel is going to be a very heavy satellite, a movement by three degrees will last nearly as long as the expected Allan variance minimum time. In consequence, the relative dead times will be close to unity when assuming a system stability similar to that of SWAS. This prohibits Position-Switch measurements with the instrument, because the efficiency would drop to values below 30%, which would certainly be rather disappointing given the consequences for the extremely precious and limited observing time. Therefore, a very careful analysis to determine the best possible observing strategy is extremely important for such a program.
Rather different circumstances exist at ground-based observatories. The typical dead time for a slew of 3 degrees is only of the order of a few seconds, so the impact of dead time is not as devastating as with space-based observatories. A detailed observing strategy does not seem to be so easily implemented, particularly if other parameters like varying hardware constraints or human limitations play a significant role as well. Typically, the Allan variance minimum time of most ground-based sub-millimeter observatories is rather small, partly due to the impact of an unstable atmosphere. Therefore, the advantage of a smaller dead time is partly eaten up by the reduced stability. But still, as should be clear from the discussion above, the actual situation has to be analyzed in detail for every individual case in order to achieve as much scientific return from the observations as possible. For this the analysis presented in this paper could be very helpful.
Co-adding a couple of pixels in a measured spectrum in order to improve the signal-to-noise ratio is general practice when dealing with noisy spectra, but the consequences of this procedure are not quite as trivial as one would like to believe. For the discussion we start again with the definition of the normalized first order correlation function as defined in Eq. (4):

The data y_{n} are the pixel components of a fully calibrated spectrum as measured with a multi-channel spectrometer. The index "m" describes by how many pixels the spectrum is shifted before the multiplication of the pixel data is done.
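The effect under discussion can be demonstrated with synthetic data: white noise smoothed by a short kernel mimics the overlap of neighboring pixel response functions, and binning K = 16 such pixels then reduces the noise by less than the uncorrelated factor sqrt(16) = 4. The normalization of g_m and the kernel used here are illustrative assumptions:

```python
import numpy as np

def corr(y, m):
    # Normalized first-order correlation of the pixel data at lag m
    # (assumed convention: covariance at lag m over the variance).
    dy = y - y.mean()
    return np.mean(dy[:-m] * dy[m:]) / np.mean(dy * dy)

def bin_pixels(y, k):
    # Co-add k neighboring pixels into each output value z_n.
    n = len(y) // k
    return y[:n * k].reshape(n, k).mean(axis=1)

rng = np.random.default_rng(0)
raw = rng.standard_normal(1 << 16)
# neighboring pixels overlap -> smooth white noise with a short kernel
y = np.convolve(raw, [0.25, 0.5, 0.25], mode="same")
z = bin_pixels(y, 16)
print(round(corr(y, 1), 2))          # g_1 is clearly non-zero
print(round(y.std() / z.std(), 2))   # noise gain stays below sqrt(16) = 4
```

The correlation between neighbors thus lowers the effective fluctuation bandwidth of the binned data below K times the single-pixel value.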
The correlation function is symmetric, since g_{-m}=g_{m}. We assume that all y_{n} behave identically in a purely statistical sense. Then the values of g_{m} depend only on the "distance" between the data given by the parameter "m", and the expectation values defined by the brackets become independent of n. We now have to determine the expected statistics of the new co-added data set z_{n}:

Only the first few values of g_{m} (m not larger than about 3) should be non-zero for a decent spectrometer, since the overlap of the power response functions of neighboring pixels should be small. Therefore, in the limiting case of a very large width of the bins (K large), we get: