Radio telescope total power mode: improving observation efficiency

Aims. Radio observing efficiency can be improved by calibrating and reducing the observations in total power mode rather than in frequency, beam, or position-switching modes. Methods. We selected a sample of spectra obtained from the Institut de Radio-Astronomie Millimétrique (IRAM) 30-m telescope and the Green Bank Telescope (GBT) to test the feasibility of the method. Given that modern front-end amplifiers for the GBT and direct Local Oscillator injection for the 30-m telescope provide smooth pass bands that are a few tens of megahertz in width, the spectra from standard observations can be cleaned (baseline removal) separately and then co-added directly when the lines are narrow enough (a few km/s), instead of performing the traditional ON minus OFF data reduction. This technique works for frequency-switched observations as well as for position- and beam-switched observations when the ON and OFF data are saved separately. Results. The method works best when the lines are narrow enough and not too numerous so that a secure baseline removal can be achieved. A signal-to-noise ratio improvement of a factor of √2 is found in most cases, consistent with theoretical expectations. Conclusions. By keeping the traditional observing mode, the fallback solution of the standard reduction technique is still available in cases of suboptimal baseline behavior, sky instability, or wide lines, and to confirm the line intensities. These techniques of total-power-mode reduction can be applied to any radio telescope with stable baselines as long as they record and deliver the ONs and OFFs separately, as is the case for the GBT.


Introduction
For a long time, heterodyne receivers in the millimeter (mm) domain were made of front-end mixers with optical injection of the Local Oscillator (LO), followed by intermediate-frequency amplifiers, with spectra produced by filter banks for which the homogeneity of the gain was not secured from channel to channel. Combined with on-axis telescopes, these systems suffer from standing waves between the secondary mirror and the receiver. LO injection via Martin-Puplett optical diplexers with relatively narrow bandpasses added to the frequency-dependent gain variations of the whole system (a few telescope descriptions can be found in, e.g., Castets et al. 1988, Booth et al. 1989, Schuster et al. 2004). Combining the effects from the frequency response of the mixer, the noisy receivers, the inhomogeneous backend, and the intermediate-frequency amplifiers with their imperfect impedance match to the mixer output resulted in complex bandpass profiles with high amplitudes, making the weak astronomical lines difficult to detect without subtracting a reference spectrum with a similar profile to the observations. This subtraction has to be performed relatively quickly to protect against gain variations and atmospheric absorption fluctuations. Various strategies are employed to get a reference spectrum that is as close as possible to the scientific spectrum. These include the well-known position-switching (the telescope primary mirror is shifted to an emission-free region in the sky), beam-switching (the beam is deviated from the telescope pointing direction by a mirror or simply by wobbling the secondary mirror within a few arcminutes of the pointed direction), and frequency-switching observing modes (where there is no mechanical movement, but the LO reference frequency is slightly changed so that the subtraction does not cancel the line but only the baseline offset).
For the Rayleigh-Jeans temperature scale, the noise fluctuation (root mean square deviation, σ or rms hereafter) of a radio spectrum is given by the radiometer equation:
$$\sigma = \frac{T_{sys}}{\eta \sqrt{\delta\nu \, \tau}} \tag{1}$$
where $T_{sys}$ is the total system noise temperature¹, including the receiver noise and all noise from the sky and ground spillover, $\delta\nu$ is the spectral resolution, and $\tau$ the integration time.
¹ For the GBT, $T_{sys}$ is measured in the $T_a$ scale, while for the IRAM 30-m telescope, it is measured in the $T_a^*$ scale, which includes atmospheric attenuation and forward beam efficiency corrections: $T_a^* = T_a \times e^{tau} / \eta_{ffs}$ (Kutner & Ulich 1981).
A&A proofs: manuscript no. 38976corr. arXiv:2010.00461v1 [astro-ph.IM], 1 Oct 2020.
Here, η is sometimes introduced to take into account other losses such as one- or two-bit autocorrelator conversion losses. When using beam-switching or position-switching modes, a second spectrum of similar characteristics is obtained to be subtracted from the first one. Because the noise fluctuations of both spectra are uncorrelated, the subtraction increases the noise fluctuation temperature by √2. If we consider that τ represents the total (ON+OFF) time, then the noise fluctuation temperature increases by another √2, as only half of the time was spent on each individual spectrum, effectively doubling the final noise fluctuation of the spectrum (and more time is lost in overheads, like the mechanical displacement of mirrors, or the rotation of the whole telescope). If the frequency-switch mode has been used instead, then the OFF spectrum becomes a second ON spectrum and the penalty is only √2, but at the expense of spoiling the baseline, because the frequency dependence of the receiver (and the standing wave patterns) prevents an exact superposition of the baseline structure from the ON1 and ON2 spectra. Improved strategies consist of integrating over longer periods of time for the OFF spectrum and subtracting the same OFF spectrum from several ONs, minimizing the time spent OFF source and the noise fluctuations added in the subtraction. These strategies require very stable receivers and sky conditions to be efficient, but can marginally beat the performance of the frequency-switch mode in terms of noise fluctuations and provide much flatter baselines. However, they become less efficient in the end if spatial smoothing is applied: the OFF is identical for the adjacent ON-OFF pairs, so their average does not diminish the noise fluctuations efficiently, and frequency-switching then becomes preferable.
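As a rough illustration of these penalties, the following Python sketch compares the final rms obtained for the same total observing time in total power, frequency-switching, and position-switching modes. It assumes the radiometer equation in the form σ = T_sys / (η √(δν τ)), uncorrelated noise between phases, and no overheads or baseline degradation; the function and mode names are hypothetical and chosen only for this example.

```python
import math


def radiometer_rms(t_sys, delta_nu_hz, tau_s, eta=1.0):
    """rms (K) of one spectrum integrated for tau_s seconds (radiometer equation)."""
    return t_sys / (eta * math.sqrt(delta_nu_hz * tau_s))


def final_rms(t_sys, delta_nu_hz, total_time_s, mode, eta=1.0):
    """rms of the final spectrum for a given total (ON+OFF) time.

    mode is 'total_power', 'frequency_switch' or 'position_switch'
    (beam switching behaves like position switching in this simple model).
    """
    if mode == "total_power":
        # All the time is spent on a single spectrum, nothing is subtracted.
        return radiometer_rms(t_sys, delta_nu_hz, total_time_s, eta)

    # Each phase only receives half of the total time.
    phase = radiometer_rms(t_sys, delta_nu_hz, total_time_s / 2.0, eta)
    # Subtracting two uncorrelated spectra adds their noise in quadrature.
    difference = math.sqrt(2.0) * phase

    if mode == "position_switch":
        return difference
    if mode == "frequency_switch":
        # Folding averages the two copies of the line present in the difference.
        return difference / math.sqrt(2.0)
    raise ValueError(f"unknown mode: {mode}")


# Illustrative values only: T_sys = 100 K, 48.83 kHz channels, 60 s in total.
for mode in ("total_power", "frequency_switch", "position_switch"):
    print(mode, final_rms(100.0, 48.83e3, 60.0, mode))
```

In this model the position-switched spectrum is twice as noisy as the total power one, and the folded frequency-switched spectrum is √2 times noisier, whatever the values of T_sys, δν, and τ.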
With the advent of high-frequency amplifiers, the abandonment of optical diplexer LO injection, Fourier-transform spectrometers, and the benefit of an off-axis telescope such as the Green Bank Observatory 100-m Telescope (GBT), the quality and stability of the bandpass have considerably improved, meaning that total power mode observations (staring in a fixed direction at a fixed frequency) can now be reconsidered. This would avoid the quadratic addition of noise fluctuations and therefore save the √2 factor discussed above. However, there are caveats in operating in pure total power mode, and we propose that observations be run in the usual way, but with the data reduction performed on individual ON and OFF spectra to exploit the benefit of total power mode when possible while not losing the fallback possibility of standard data reduction. We present the method in Sect. 2 and discuss a few cases in Sect. 3.

Method
The calibration of mm-wave radiotelescopes has been discussed in Kutner & Ulich (1981), and a technical report from the Institut de Radio-Astronomie Millimétrique (IRAM) 30-m telescope presents the details of the calibration procedure² for that telescope, which is representative of mm-wave radiotelescope calibrations. Nevertheless, there are differences between the GBT and the IRAM 30-m telescope. For example, the GBT has amplifiers before the heterodyne mixer and has no image sideband.
² http://www.iram.es/IRAMES/mainWiki/CalibrationPapers?action=AttachFile&do=view&target=kramer_1997_cali_rep.pdf
Briefly, the observations of two internal loads at different temperatures give a linear fit between backend voltages or detector counts and temperatures (in the Rayleigh-Jeans approximation). The slope of the fit is referred to as the gain and is defined by
$$g = \frac{T_{hot} - T_{cold}}{V_{hot} - V_{cold}} \tag{2}$$
where $T_{hot}$ (resp. $T_{cold}$) represents the temperature of the hot (resp. cold) load and $V_{hot}$ (resp. $V_{cold}$) represents the voltage (or counts) of the hot (resp. cold) load measured at the receiver backend. If $T_{rec}$ is the receiver equivalent noise temperature (its value is derived from the gain measurement), any signal from the sky delivers a voltage $V_{sky}$ which can be converted to a Rayleigh-Jeans temperature with
$$T_{sky} = g \times V_{sky} - T_{rec} \tag{3}$$
The sky signal is a composite of ground-emission spillover, atmospheric emission, and cosmic signal attenuated by the atmospheric absorption; its decomposition is beyond the scope of this discussion (see Kutner & Ulich 1981 for more information). If two measurements of the sky are performed along one of the switching procedures described in Sect. 1, we can compute their difference to cancel all unwanted signals and retrieve only the cosmic signal of interest ($T_{sky}$ is hereafter referred to with the more commonly used $T_a$ or antenna temperature, and similarly $V_{sky}$ becomes $V_a$):
$$\Delta T_a = T_{a,ON} - T_{a,OFF} = g \times (V_{a,ON} - V_{a,OFF}) \tag{4}$$
where $\Delta T_a$ represents the cosmic signal we want to observe (yet uncorrected for various losses). If we define
$$T_{sys} = T_{rec} + T_{a,OFF} \tag{5}$$
we get
$$g = \frac{T_{sys}}{V_{a,OFF}} \tag{6}$$
and eq. 4 becomes the more familiar form
$$\Delta T_a = T_{sys} \times \frac{V_{a,ON} - V_{a,OFF}}{V_{a,OFF}} \tag{7}$$
Here, $\Delta T_a$ suffers from various sources of noise which we can analyse:
$$\sigma^2(\Delta T_a) = \sigma^2(g) \times (V_{a,ON} - V_{a,OFF})^2 + g^2 \times \left(\sigma^2(V_{a,ON}) + \sigma^2(V_{a,OFF})\right) \tag{8}$$
where σ is the root mean square error expressed in a similar form as in eq. 1, that is, depending on the inverse square root of the integration time and frequency sampling of the radiometer. Where the ON and OFF signals do not cancel (i.e. inside the observed line), the equation can be expressed in the familiar form,
$$\left(\frac{\sigma(\Delta T_a)}{\Delta T_a}\right)^2 = \left(\frac{\sigma(g)}{g}\right)^2 + \frac{\sigma^2(V_{a,ON}) + \sigma^2(V_{a,OFF})}{(V_{a,ON} - V_{a,OFF})^2} \tag{9}$$
In the rest of the spectrum (which is usually named the baseline), the noise is simply
$$\sigma^2(\Delta T_a) = g^2 \times \left(\sigma^2(V_{a,ON}) + \sigma^2(V_{a,OFF})\right) \tag{10}$$
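The two-load calibration and the two equivalent forms of the switched signal can be checked numerically with a minimal sketch such as the following. It assumes the linear relations described in the text, namely g = (T_hot − T_cold)/(V_hot − V_cold) and T = g × V − T_rec, so that T_sys = g × V_OFF. The function names and the numbers used in the example are illustrative assumptions, not instrument values.

```python
def two_load_calibration(v_hot, v_cold, t_hot, t_cold):
    """Return (gain, t_rec) from the hot and cold load readings.

    gain is in kelvin per backend count. t_rec follows from requiring that
    the hot load reads t_hot on the calibrated scale T = gain * V - t_rec.
    """
    gain = (t_hot - t_cold) / (v_hot - v_cold)
    t_rec = gain * v_hot - t_hot
    return gain, t_rec


def sky_temperature(v_sky, gain, t_rec):
    """Convert a sky reading to a Rayleigh-Jeans temperature."""
    return gain * v_sky - t_rec


def switched_signal(v_on, v_off, gain):
    """ON minus OFF signal written with the gain."""
    return gain * (v_on - v_off)


def switched_signal_tsys(v_on, v_off, gain):
    """Same signal written with T_sys = gain * V_OFF (the 'familiar form')."""
    t_sys = gain * v_off
    return t_sys * (v_on - v_off) / v_off


# Illustrative counts: loads at 290 K and 80 K, OFF and ON sky readings.
gain, t_rec = two_load_calibration(34000.0, 13000.0, 290.0, 80.0)
print(gain, t_rec)                                  # K per count, K
print(sky_temperature(8000.0, gain, t_rec))         # OFF-position temperature
print(switched_signal(8200.0, 8000.0, gain))        # line signal, gain form
print(switched_signal_tsys(8200.0, 8000.0, gain))   # line signal, T_sys form
```

Both forms of the switched signal return the same value, since they differ only in how the gain is written.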
If we suppose that the hot and cold load temperatures are known with enough accuracy to have a negligible contribution to the noise, the gain noise can be expressed as
$$\left(\frac{\sigma(g)}{g}\right)^2 = \frac{\sigma^2(V_{hot}) + \sigma^2(V_{cold})}{(V_{hot} - V_{cold})^2} \tag{11}$$
If instead of observing with a switching mode (SM) one observes in total power mode (TP), the signal is not extracted from the receiver plus sky noise during the observations; its expression is simply eq. 3 and the error budget is (supposing $T_{rec}$ is known with enough precision to neglect its contribution)
$$\sigma^2(T_a) = \sigma^2(g) \times V_a^2 + g^2 \times \sigma^2(V_a) = (g V_a)^2 \left[\left(\frac{\sigma(g)}{g}\right)^2 + \left(\frac{\sigma(V_a)}{V_a}\right)^2\right] \tag{12}$$
Equations 9 and 12 can be compared only if we have manually subtracted the sky plus receiver noise contribution from the TP spectrum (this will be discussed in the following section), in which case $T_a(TP) = \Delta T_a(SM)$ and $V_a(TP) = (V_{a,ON} - V_{a,OFF})(SM)$. When the receiver and the sky transparency are constant inside the bandpass of an observation, the gain and receiver noise temperature can be calculated by integrating over all the channels of the spectrometer, largely decreasing (in practice suppressing) their contribution to the final noise. The noise then simply depends on $\sigma(V_a)$, which is inversely proportional to the square root of the integration time, everything else being equal. Using the same total integration time τ for the TP observations and for the sum of the two phases of the SM observations, we get
$$\sigma(\Delta T_a)(SM) = 2 \times \sigma(T_a)(TP) \tag{13}$$
The SM observations are therefore twice as noisy as the TP mode ones. This is true for the mechanical (position or beam) SMs because only one phase is exposed to the signal, but in the case of the frequency SM, the subsequent folding of the spectrum reduces the noise further by √2 by averaging two independent realizations of the measurement, and the TP mode advantage is only √2. It is not always possible to use a constant gain and $T_{rec}$ throughout the spectrum, and the balance between the gain noise and the observation noise contributions must be addressed when the calibration is performed channel-wise. We have seen (Eqs. 9 and 11) that both sources of noise depend primarily on the voltage or count measurement noise, which depends upon integration time (eq. 1). Since the calibrators are usually observed on a short timescale (1-5 seconds typically), while the sky observations can last 60 seconds or more, the rms of the calibration phases is the highest. In TP observations, the denominator is high ($V_a = V_{rec} + V_{atmosphere} + V_{source}$) and comparable to the load measurements ($V_{load} = V_{rec} + V_{cold}$ or $V_{hot}$). Therefore, the dominant error term comes from the calibration part itself. In SM observations, the observation denominator ($V_{a,ON} - V_{a,OFF}$) is much smaller than $V_{load}$ and, despite a lower rms, this term dominates the gain and calibration errors, which is preferable. Figure 1 shows an example of the problem encountered when separately calibrating the two phases on a channel-wise mode. The final spectrum noise is dominated by the calibration noise and is therefore noisier, which confers no advantage.
Fig. 1. ¹²CO (J:1-0) frequency SM observation obtained at the IRAM 30-m telescope with the EMIR receiver and FTS backend. Top: Standard folded spectrum after subtraction and calibration. Middle: Original two phases calibrated independently. Bottom: Two phases directly averaged after realignment without using noise-averaged gain or subtracting a baseline from the raw data. The baseline shows no ripples but the noise is three times higher than in the upper panel. The vertical axis has the same amplitude for all three boxes (16 K). The color of the rms figure corresponds to the color of the spectra to which it pertains.
To benefit from the √2 improvement, two possibilities are available: (1) use a noise-averaged gain (either constant or smoothed over an intermediate width, as is the case for the IRAM 30-m telescope with MRTCAL), or (2) subtract a baseline from the raw data (backend voltage or counts V) before applying the calibration to minimize the contribution of the gain (calibration) noise σ(g).
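Option (2) can be sketched in a few lines of Python with NumPy. This is only a minimal illustration under stated assumptions, not the reduction code used for this work: the two phases are assumed to be already realigned and averaged into one array of raw counts, the line windows are supplied by hand, and the baseline is fitted in the Chebyshev basis, which spans the same polynomials as an ordinary polynomial of the same degree but behaves better numerically at high degree. The function name `tp_reduce` and its arguments are hypothetical.

```python
import numpy as np
from numpy.polynomial import Chebyshev


def tp_reduce(counts, velocity, line_windows, degree, gain):
    """Flatten raw counts with a masked polynomial fit, then calibrate.

    counts       : raw backend counts per channel (phases realigned and averaged)
    velocity     : channel velocities (or frequencies), same length as counts
    line_windows : list of (low, high) intervals excluded from the fit
    degree       : degree of the baseline polynomial
    gain         : K per count, a scalar or a smooth per-channel array
    Returns the calibrated spectrum and the rms measured outside the windows.
    """
    counts = np.asarray(counts, dtype=float)
    velocity = np.asarray(velocity, dtype=float)

    # Keep only line-free channels for the baseline fit.
    baseline_mask = np.ones(counts.size, dtype=bool)
    for low, high in line_windows:
        baseline_mask &= ~((velocity >= low) & (velocity <= high))

    # Fit and subtract the baseline in raw counts, before any calibration,
    # so that the large continuum level is not multiplied by the gain noise.
    fit = Chebyshev.fit(velocity[baseline_mask], counts[baseline_mask], degree)
    flattened = counts - fit(velocity)

    spectrum = gain * flattened
    rms = spectrum[baseline_mask].std(ddof=1)
    return spectrum, rms
```

The returned rms should then be compared with that of the standard switched reduction of the same data, and the line-integrated intensities of the two reductions should agree.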

The IRAM 30-m telescope
Though the IRAM 30-m telescope is still equipped with front-end mixers and the antenna is on-axis, the replacement of the optical LO injection by line injection in the new EMIR receivers (for Eight MIxer Receivers, Carter et al. 2012) has improved the baseline performance of the observations, and even frequency-switched observations are only subject to relatively limited baseline ripples. We therefore study a few cases with two different backends: the fast Fourier Transform Spectrometer (FTS) in its narrow mode (8 × 37275 channels, 48.83 kHz each) and the VErsatile SPectrometer Array (VESPA), which is an autocorrelator used in a narrow-window, high-resolution mode (20 MHz bandwidth, 10 kHz resolution). Presently, IRAM does not deliver the individual phases (ON/OFF or Freq 1/Freq 2) of the observations but only the final calibrated spectrum (ON − OFF, or Freq 1 − Freq 2 with subsequent folding). We modified the calibration code (Millimeter RadioTelescope CALibration, MRTCAL) to obtain access to the individual raw phases for this work.

Fourier Transform Spectrometer data
Figure 2 clearly shows that the subtraction of a baseline from the single-phase observation over the full bandwidth (upper panel) will never manage to compete with the differential observation done in frequency SM (lower panel). There is therefore no hope of using this total power technique with wide lines, and one must focus on a part of the backend small enough to be able to recover a flat baseline around the line of interest. Figure 3 shows a small section of the bandpass (35 km s⁻¹) centered on the ¹²CO (J:1-0) line. The upper panel shows the average of the two phases expressed in counts (analog-to-digital units, ADU); the cosmic and telluric line emissions are masked by two windows, and a polynomial baseline of 25th order is subtracted from it. The flattened spectrum is shown in the middle panel. The gain to convert the spectrum to the antenna temperature scale is then applied, and the final spectrum is compared to the original spectrum obtained by folding the two phases together. The spectra are identical and their noise is in the √2 ratio (0.214/0.151 = 1.42). The polynomial degree of the baseline necessary to achieve this result is a function of the width of the spectrum onto which the baseline is fitted. Though this might be specific to each set of observations, we have explored the combination of frequency window size and baseline polynomial degree to check their interdependence and evaluate the minimum polynomial degree needed to retrieve a consistent line-integrated intensity and a baseline flat enough to recover the theoretical √2 improvement on the noise (Table 1). It can be noted that for polynomial degrees that are too high compared to the number of channels, the noise drops even lower than expected. This indicates that the high polynomial degree leads to fitting the noise itself and removes part of it.
Consequently, the TP mode data reduction cannot be carried out blindly but must be compared to the standard SM data reduction to check that both the noise and the line-integrated intensity reach their expected values. In the case of no signal, the same procedure should also be applied to obtain the correct noise improvement.
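The size of this over-fitting effect can be estimated from a standard property of linear least squares: when a model with p free parameters is fitted to N channels of independent Gaussian noise, the expected residual sum of squares is σ²(N − p), so the rms measured on the fitted channels is lowered by about √(1 − p/N), with p = degree + 1 for a polynomial. The Python sketch below evaluates this factor and checks it on simulated white noise. The function names, the channel count, and the number of trials are illustrative assumptions, and real baselines with ripples will not follow this idealized case exactly.

```python
import numpy as np
from numpy.polynomial import Chebyshev


def expected_rms_factor(n_channels, degree):
    """Expected ratio (rms after fit) / (true rms) for pure white noise."""
    n_params = degree + 1
    if n_params >= n_channels:
        raise ValueError("the fit needs more channels than free parameters")
    return np.sqrt(1.0 - n_params / n_channels)


def simulated_rms_factor(n_channels, degree, trials=200, seed=0):
    """Same ratio measured on simulated unit-variance Gaussian noise."""
    rng = np.random.default_rng(seed)
    x = np.linspace(-1.0, 1.0, n_channels)
    mean_square = 0.0
    for _ in range(trials):
        noise = rng.standard_normal(n_channels)
        # Fit the noise alone and keep what the polynomial did not absorb.
        residual = noise - Chebyshev.fit(x, noise, degree)(x)
        mean_square += np.mean(residual ** 2)
    return np.sqrt(mean_square / trials)


# Hypothetical case: a 25th-degree baseline fitted over 260 line-free channels.
print(expected_rms_factor(260, 25))
print(simulated_rms_factor(260, 25))
```

A measured improvement noticeably larger than √2 is therefore a hint that the degree is too high for the number of channels available to the fit.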

VESPA data
VESPA can be split into many small windows of 10 to 40 MHz with high frequency resolution, i.e., 3.3 to 40 kHz. There are many other modes in addition to those discussed here, but these are beyond the scope of this study. On such narrow windows, the baseline is relatively smooth but still needs polynomial fitting of high order (18-20) to remove small ripples and retrieve the correct line intensity (Fig. 4). However, the noise gain is less than the expected √2. We traced this discrepancy to a noise reduction greater than √2 when the spectrum is folded. The measured unfolded spectrum noise in the present case is 0.37 K, and we therefore expect a folded spectrum noise of 0.26 K instead of the measured 0.23 K (Fig. 4; for the other polarization, the unfolded noise is 0.33 K, the expected noise 0.23 K, and the measured noise 0.20 K). Adding or subtracting spectra should have the same impact on noise, and the noise we obtain in the TP mode data reduction is indeed close to the expected value (0.37 K → 0.185 K, measured 0.181 K). The standard folding method returns a spectrum with a noise lower than expected because the frequency displacement can be a fractional number of channels: when the folding is performed, each shifted channel is split into two sub-channels that are added to the two adjacent channels. This introduces a noise correlation between the channels which can artificially lower the noise. We note that when half-channel intensities are added, their noise is not changed, and therefore this is similar to smoothing the spectroscopic resolution. The reduction is maximal when the channel displacement is exactly a whole number plus one-half of a channel, and the additional noise reduction can then reach √2. This is not specific to VESPA data and can happen with any spectrometer when using the frequency SM⁸. However, this noise is correlated, and a subsequent smoothing in frequency would not reduce the noise as much as expected.
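The dependence of this artifact on the fractional part of the shift can be made explicit with a small Python sketch. It assumes the simplest possible model, in which a spectrum shifted by a fraction f of a channel is rebuilt by linear splitting, each output channel receiving a weight (1 − f) from one input channel and f from its neighbour, with independent noise in the input channels. The function names are hypothetical, and actual folding routines may resample differently.

```python
import math


def resampling_noise_factor(shift_channels):
    """rms of a linearly resampled spectrum relative to the input rms.

    Each output channel is (1 - f) * a[i] + f * a[i + 1], where f is the
    fractional part of the shift, so the variances add with weights squared.
    """
    f = shift_channels - math.floor(shift_channels)
    return math.sqrt(f ** 2 + (1.0 - f) ** 2)


def adjacent_channel_correlation(shift_channels):
    """Noise correlation coefficient between neighbouring output channels."""
    f = shift_channels - math.floor(shift_channels)
    return f * (1.0 - f) / (f ** 2 + (1.0 - f) ** 2)


for shift in (4.0, 3.25, 3.5):
    print(shift, resampling_noise_factor(shift), adjacent_channel_correlation(shift))
```

In this model a whole-channel shift leaves the noise unchanged and uncorrelated, while a half-channel shift lowers the rms of the resampled spectrum by √2 and correlates adjacent channels at the 50% level, which is why later smoothing gains less than expected.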

The GBT 100-m
The GBT presently uses the VErsatile GBT Astronomical Spectrometer (VEGAS) as a backend, which is an autocorrelator somewhat similar to VESPA in its agility, multi-windowing, and variable resolution capabilities⁹. Contrary to those from the IRAM 30-m telescope, the data are provided raw and uncalibrated to the user. Any user can therefore treat the two phases of the observations separately in order to perform TP mode data reduction. Another autocorrelator, known as the GBT spectrometer, was in use before VEGAS, and the results are not sensitive to the backend itself for similar bandwidth and resolution. We visit the K-band and W-band cases here.

NH₃ (K-band) observations
- The GBT spectrometer
Figure 5 shows the complete 50 MHz on-source spectrum of an NH₃ line for an astrophysical source. Except at the edges, where the bandpass drops rapidly to zero, the band is relatively smooth and flat as for VESPA, and the lines are readily visible in the band.
A polynomial fit of 13th degree is sufficient here to remove the baseline cleanly, as seen in Fig. 6, which proves that in such a case the TP mode can provide baselines as flat as those obtained naturally with mechanical SMs (for narrow lines). The noise is improved by the expected √2 value, and this is also the case for the frequency SM observations of NH₃ with the same spectrometer (Fig. 7).
- VEGAS
VEGAS observations of the NH₃ line are quite similar to those provided by the GBT spectrometer, as expected, but the baseline is even flatter, due to the new seven-pixel K-Band Focal Plane Array receiver (KFPA, Fig. 8), allowing the use of a polynomial of very low degree (4 or 5) to zero the baseline (Fig. 9). The baseline is also flat enough that the channel-wise gain is hardly variable and can be replaced by a constant value, therefore suppressing the calibration noise from eq. 12. In this figure, the phases have indeed been calibrated separately before being flattened. The use of a constant gain to calibrate the data allows the user to calibrate individual phases without subtracting any offset, but conversely the shoulders in the frequency SM data are hidden and their contribution to the noise lowers the average while it should increase it if attenuation is properly taken into account. If we measure the noise in this spectrum to the edges of the plot, which remain inside the original shoulders as displayed in Fig. 8, the standard deviation is 0.165 K instead of 0.173 K with constant gain applied to the whole spectrum, and 0.179 K if calibrated channel-wise.
⁸ Observers should be aware of this artifact and should select their frequency throw to adjust to a whole number of channels, though this might prove difficult when several backends with different resolutions are used in parallel.
⁹ https://www.cv.nrao.edu/ aroshi/VEGAS/skyfretobb.pdf
[Figure caption] Comparison between standard reduction (position and frequency SM, black) with TP mode reduction (red) for NH₃ observations with the old GBT spectrometer in two different sources. The spectrum shown at the top is identical to the one displayed in Fig. 6 but zoomed in on the lines. Noise improvement is close to √2. The difference between both reduction modes is displayed in blue and shifted by -0.5 K.
[Figure caption] ... The frequency axis is narrowed to mask the shoulders at both ends (seen in Fig. 8). The middle box displays the two spectra realigned and averaged (TP mode). Because of the frequency shift, both shoulders are brought back in the frequency range. A fifth degree polynomial is fitted in between them (red line). The bottom box shows the comparison between the standard frequency SM folded spectrum (black) and the TP mode spectrum (red, shoulders have been erased).
Fig. 11. N₂D⁺ (J:1-0) data reduction in TP mode compared to the original frequency SM reduction. The residual in the central part of the spectrum is shown in blue with a negative offset of -0.5 K. The ends have been erased because the negative lines from the frequency SM folding have no counterpart in the TP mode data reduction. Red vertical lines mark the masks used for the TP mode data reduction to compute the baseline, while the black vertical lines mark those used for the frequency SM data reduction. Part of the negative lines are outside the window.

W-band observations
The W-band observations we present here were taken with the old GBT spectrometer. Though the baseline is smooth in appearance (Fig. 10), the baseline subtraction from the ON+OFF spectra realigned on top of each other leaves small ripples, which can be reduced by narrowing the spectral window but cannot be completely erased. However, the line residual is null on average, and the noise improvement is still √2 for a baseline polynomial of 16th order (Fig. 11).

Other methods
When the line is too wide for this kind of baseline fitting, there are still various advantages to treating the ON and the OFF observations separately. For example, several adjacent OFF spectra in space and time (e.g., offsets of a map) can be averaged together to be subtracted from each individual ON spectrum, reconstructing a posteriori the 'one long OFF-several short ONs' observation method (which is not available on all telescopes). The OFF can also be smoothed as strongly as possible, as long as narrow features from the baseline itself are not smoothed out, because that would replace Gaussian noise with systematic error.
Fig. 12. Same NH₃ data from GBT + VEGAS spectrometer as in Fig. 9. The top window shows the first frequency-shifted spectrum with the right-hand side smoothed by a median of 0.4 MHz in width. The middle window shows the second frequency-shifted spectrum smoothed on the left-hand side. The bottom window shows a comparison between the original folded spectrum and the half-smoothed folded spectrum. The baseline is computed between the two vertical dashed lines as in Fig. 9.
For frequency SM spectra, a similar method can be applied except that the two sides of the spectra have to be treated separately because both contain signal. We applied this method to the VEGAS GBT NH₃ data with success (Fig. 12).
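The benefit of averaging or smoothing the reference can be quantified with a short Python sketch. It assumes that every ON and OFF spectrum has the same per-channel rms, that their noise is independent, and that the smoothing is a plain boxcar average over independent channels; a median filter such as the one used for Fig. 12 would reduce the noise somewhat less. The function name and its arguments are hypothetical.

```python
import math


def on_minus_off_rms(sigma_single, n_off=1, smooth_channels=1):
    """rms of ON minus a reference built from n_off averaged OFF spectra.

    sigma_single    : rms of one individual ON or OFF spectrum
    n_off           : number of OFF spectra averaged into the reference
    smooth_channels : boxcar width (in channels) applied to the reference
    """
    # Averaging n_off spectra and w channels divides the OFF variance by n_off * w.
    off_variance = sigma_single ** 2 / (n_off * smooth_channels)
    # The ON and reference noises are independent and add in quadrature.
    return math.sqrt(sigma_single ** 2 + off_variance)


print(on_minus_off_rms(1.0))                               # classical ON - OFF
print(on_minus_off_rms(1.0, n_off=4))                      # four OFFs averaged
print(on_minus_off_rms(1.0, n_off=4, smooth_channels=8))   # and smoothed
```

As the reference becomes less noisy, the rms of the difference tends toward that of the ON spectrum alone, which is the TP mode limit.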

Conclusions
Total power data reduction with GBT data can easily be conducted with the present data-delivery and data-reduction tools (GBTIDL, a GBT customization of the Interactive Data Language) offered at the GBO. The advantage of the present procedure is a gain of √2 in sensitivity at no cost, whilst retaining the ability to check the result by comparing it to the normal differential data reduction, or even to fall back on the standard mode reduction in case of problems (especially if the line is too wide to support a high-degree polynomial fitting without damage). This method is particularly suited for narrow lines in cold clouds that are usually weak and few in number per GHz. In principle, this method could also be used with IRAM 30-m data or any other radiotelescope data, provided the baseline is flat enough and single-phase observations are retrievable.
Though not illustrated in this paper because of a lack of suitable data, in case of wide lines, all OFF observations of different positions can be averaged together (and smoothed if necessary) to reach similar results in the same manner as on-the-fly position SM is proceeding, and frequency SM observations can be half averaged and/or smoothed to reach sensitivities comparable to those of TP mode observations. Finally, OTF data can be advantageously treated in this manner: after the window width and baseline degree have been adjusted to optimize the data reduction, the treatment can be ap-