A multi-scale multi-frequency deconvolution algorithm for synthesis imaging in radio interferometry

U. Rau; T. J. Cornwell

doi:10.1051/0004-6361/201117104

Volume 532 (August 2011)

A&A, 532 (2011) A71

Issue		A&A Volume 532, August 2011


Article Number		A71
Number of page(s)		17
Section		Astronomical instrumentation
DOI		https://doi.org/10.1051/0004-6361/201117104
Published online		25 July 2011

U. Rau¹ and T. J. Cornwell²

¹ National Radio Astronomy Observatory, Socorro, NM, USA
e-mail: rurvashi@aoc.nrao.edu
² Australia Telescope National Facility, CSIRO, Sydney, Australia
e-mail: tim.cornwell@atnf.csiro.au

Received: 19 April 2011
Accepted: 4 June 2011

Abstract

Aims. We describe MS-MFS, a multi-scale multi-frequency deconvolution algorithm for wide-band synthesis-imaging, and present imaging results that illustrate the capabilities of the algorithm and the conditions under which it is feasible and gives accurate results.

Methods. The MS-MFS algorithm models the wide-band sky-brightness distribution as a linear combination of spatial and spectral basis functions, and performs image-reconstruction by combining a linear-least-squares approach with iterative χ² minimization. This method extends and combines the ideas used in the MS-CLEAN and MF-CLEAN algorithms for multi-scale and multi-frequency deconvolution respectively, and can be used in conjunction with existing wide-field imaging algorithms. We also discuss a simpler hybrid of spectral-line and continuum imaging methods and point out situations where it may suffice.

Results. We show via simulations and application to multi-frequency VLA data and wideband EVLA data, that it is possible to reconstruct both spatial and spectral structure of compact and extended emission at the continuum sensitivity level and at the angular resolution allowed by the highest sampled frequency.

Key words: techniques: interferometric / techniques: image processing / methods: numerical / radio continuum: general

© ESO, 2011

1. Introduction

Instruments such as the EVLA (Perley et al. 2009), ASKAP (Deboer et al. 2009) and LOFAR (de Vos et al. 2009) are among a new generation of broad-band radio interferometers, currently being designed and built to provide high-dynamic-range imaging capabilities. The large instantaneous bandwidths offered by new front-end systems increase the raw continuum sensitivity of these instruments and allow us to measure the spectral structure of the incident radiation across large continuous frequency ranges. Until recently, the primary goal of wideband imaging has been to obtain a continuum image that makes use of the increased sensitivity and spatial-frequency coverage offered by combining multi-frequency measurements. So far, effects due to the spectral structure of the sky-brightness distribution have been considered mainly in the context of reducing errors in the continuum image, without paying attention to the accuracy of a spectral reconstruction. But now, the new bandwidths (100%) are large enough to allow the spectral structure to also be reconstructed to produce a meaningful astrophysical measurement. To do so, we need imaging algorithms that model and reconstruct both spatial and spectral structure simultaneously, and that are also sensitive to various effects of combining measurements from a large range of frequencies (namely varying ranges of sampled spatial scales and varying array-element response functions).

The simplest method of wide-band image reconstruction is to image each frequency channel separately and combine the results at the end. However, single-channel imaging is restricted to the narrow-band uv-coverage and sensitivity of the instrument, and source spectra can be studied only at the angular resolution allowed by the lowest frequency in the sampled range. Also, for complicated extended emission, the single-frequency uv-coverage may not be sufficient to produce a consistent solution across frequency. While such imaging may suffice for many science goals, it does not take full advantage of what an instantaneously wide-band instrument provides, namely the sensitivity and spatial-frequency coverage obtained by combining measurements from multiple receiver frequencies during image reconstruction.

Multi-frequency-synthesis (MFS) (Conway et al. 1990) is the technique of combining measurements at multiple discrete receiver frequencies during synthesis imaging. MFS was initially done to increase the aperture-plane coverage of sparse arrays by using narrow-band receivers and switching frequencies during the observations. Wide bandwidth systems (~10%) later presented the problem of bandwidth-smearing, which was eliminated by splitting the wide band into narrow-band channels and mapping them onto their correct spatial-frequencies during imaging. It was assumed that at the receiver sensitivities of the time, the measured sky brightness was constant across the observed bandwidth. The next step was to consider a frequency-dependent sky brightness distribution. Conway et al. (1990) describe a double-deconvolution algorithm based on the instrument’s responses to a series of spectral basis functions, in particular, the first two terms of a Taylor series. A map of the average spectral index is derived from the coefficient maps. Sault & Wieringa (1994) describe the MF-CLEAN algorithm which uses a formulation similar to double-deconvolution but calculates Taylor-coefficients via a least-squares solution. More recently, Likhachev (2005) re-derives the least-squares method used in MF-CLEAN using more than two series coefficients.

So far, these CLEAN-based multi-frequency deconvolution algorithms used point-source (zero-scale) flux components to model the sky emission, a choice not well suited for extended emission. Cornwell (2008) describes the MS-CLEAN algorithm which does matched-filtering using templates constructed from the instrument response to various large scale flux components. Greisen et al. (2009) describe a method similar to MS-CLEAN, and Bhatnagar & Cornwell (2004) describe the ASP-CLEAN algorithm that explicitly fits for the parameters of Gaussian flux components and uses scale size to aid the separation of signal from noise. We show in this paper that with the MF-CLEAN approach, deconvolution errors that occur with a point-source model are enhanced in the spectral-index and spectral-curvature images because of error propagation effects, and that the use of a multi-scale technique can minimize this.

For high dynamic range imaging across wide fields-of-view, direction-dependent instrumental effects need to be accounted for. Bhatnagar et al. (2008) describe an algorithm for the correction of time-variable and wide-field instrumental effects for narrow-band interferometric imaging. For wide-field wide-band imaging, these algorithms must be extended to include the frequency dependence of the instrument.

In this paper, we describe MS-MFS (multi-scale multi-frequency synthesis) as an algorithm that combines variants of the MF-CLEAN and MS-CLEAN approaches to simultaneously reconstruct both spatial and spectral structure of the sky-brightness distribution. Frequency-dependent primary-beam correction is considered as a post-deconvolution correction step¹. In Sect. 5, we show imaging examples using simulations, multi-frequency VLA data and wideband EVLA data, to illustrate the capabilities and limits of the MS-MFS algorithm.

1.1. Wide-band imaging

We begin with a discussion of how well we can reconstruct both spatial and spectral information from an incomplete set of visibilities sampled at multiple observing frequencies. An interferometer samples the visibility function of the sky brightness distribution at a discrete set of spatial-frequencies (called the uv-coverage). The spatial-frequencies sampled at each observing frequency ν are between $u_{\min} = \frac{\nu}{c} b_{\min}$ and $u_{\max} = \frac{\nu}{c} b_{\max}$, where u is used here as a generic label for the uv-distance² and b represents the length of the baseline vector (in units of meters) projected onto the plane perpendicular to the direction of the source. The maximum spatial-frequency measured at each frequency defines the angular resolution of the instrument at that frequency (θ_ν = 1/u_max(ν)). The range of spatial-frequencies between u_min at ν_max and u_max at ν_min represents the region where the visibility function is sampled at all frequencies in the band, and there is sufficient information to reconstruct both spatial and spectral structure. The spatial-frequencies outside this region are sampled only by a fraction of the band and the accuracy of a broad-band reconstruction depends on how well the spectral and spatial structure are constrained by an appropriate choice of flux model.
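The relations above can be illustrated with a short numerical sketch. The baseline lengths and band edges below are hypothetical values chosen only for illustration; they are not taken from any instrument discussed in this paper.

```python
C = 299_792_458.0  # speed of light in m/s


def uv_range(freq_hz, b_min_m, b_max_m):
    """Return (u_min, u_max) in wavelengths for one observing frequency."""
    return freq_hz / C * b_min_m, freq_hz / C * b_max_m


# Hypothetical array and band (illustrative values only)
b_min, b_max = 35.0, 1000.0      # shortest and longest projected baselines, metres
nu_min, nu_max = 1.0e9, 2.0e9    # band edges, Hz

u_min_lo, u_max_lo = uv_range(nu_min, b_min, b_max)
u_min_hi, u_max_hi = uv_range(nu_max, b_min, b_max)

# Spatial-frequencies sampled at every frequency in the band:
# from u_min at nu_max up to u_max at nu_min.
overlap = (u_min_hi, u_max_lo)

# Angular resolution theta = 1 / u_max (radians) at each band edge.
theta_lo = 1.0 / u_max_lo
theta_hi = 1.0 / u_max_hi

print("overlap region (wavelengths):", overlap)
print("resolution at nu_min, nu_max (rad):", theta_lo, theta_hi)
```

For a 2:1 band such as this one, the overlap region is narrower than the uv-range of either band edge, and the resolution at the top of the band is twice as fine as at the bottom.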

Radio interferometric measurements of wideband continuum emission can be described as one or more of the following situations.

1. Flat spectrum sources: for a flat-spectrum source, measurements at multiple frequencies sample the same spatial structure, increasing the signal-to-noise of the measurements in regions of overlapping spatial-frequencies, and providing better overall uv-plane filling. The angular resolution of the instrument is given by u_max at ν_max. Standard deconvolution algorithms applied to measurements combined via MFS will suffice to reconstruct source structure across the full range of spatial scales measured across the band.
2. Unresolved sources with spectral structure: consider a compact, unresolved source (with spectral structure) that is measured as a point source at all frequencies. The visibility function of a point source is flat across the entire spatial-frequency plane. Therefore, even if u_max changes with frequency, the spectrum of the source is adequately sampled by the multi-frequency measurements. Using a flux model in which each source is a δ-function with a specific spectral model (for example, a smooth polynomial), it is possible to reconstruct the spectral structure of the source at the maximum possible angular resolution (given by u_max at ν_max).
3. Resolved sources with spectral structure: for resolved sources with spectral structure, the accuracy of the reconstruction across all spatial scales between u_min at ν_min and u_max at ν_max depends on an appropriate choice of flux model, and the constraints that it provides. For example, a source emitting broad-band synchrotron radiation can be described by a fixed brightness distribution at one frequency with a power-law spectrum associated with each location. Images can be made at the maximum angular resolution (given by u_max at ν_max) with the assumption that different observing frequencies probe the same spatial structure but measure different amplitudes (usually a valid assumption). This constraint is strong enough to correctly reconstruct even moderately resolved sources that are completely unresolved at the low end of the band but resolved at the higher end. On the other hand, a source whose structure itself changes by 100% in amplitude across the band would break the above assumption (band-limited signals). In this case, a complete reconstruction would be possible only in the region of overlapping spatial-frequencies (between u_min at ν_max and u_max at ν_min), unless the flux model includes constraints that bias the solution towards one appropriate for such sources.
4. Spectral structure at large spatial scales: the lower end of the spatial-frequency range presents a different problem. The size of the central hole in the uv-coverage of a typical interferometer increases with frequency, and spectra are not measured adequately at spatial scales corresponding to spatial-frequencies below u_min at ν_max. In the extreme case where most of the visibility function lies within this uv-hole, a flat-spectrum large-scale source (for example) can be indistinguishable from a relatively smaller source with a steep spectrum. Additional constraints in the form of short-spacing spectra may be required for an accurate reconstruction.
5. Frequency dependence of the instrument: array-element responses usually vary with frequency, direction and time. Standard calibration accounts for the frequency and time dependence for the direction in which the instrument is pointing. Away from the pointing-direction, the frequency-dependent shape of the primary-beam is the dominant remaining instrumental effect, and this results in artificial spectral structure in the images. To recover both spatial and spectral structure of the sky brightness across a large field of view, the frequency dependence of the primary beam must be modeled and removed before or during multi-frequency synthesis imaging.

To summarize, just as standard interferometric image reconstruction uses a priori information about the spatial structure of the sky to estimate the visibility function in unmeasured regions of the uv-plane, multi-frequency image reconstruction algorithms need to use a priori information about the spectral structure of the sky brightness. By combining a suitable model with the known frequency-dependence of the spatial-frequency coverage and element response function, it is possible to reconstruct the broad-band sky brightness distribution from incomplete spectral and spatial-frequency sampling.

2. Multi-scale multi-frequency deconvolution

The MS-MFS algorithm described here is based on the iterative image-reconstruction framework described in Rau et al. (2009) and summarized in Appendix A. Sections 2.1 to 2.7 formulate the algorithm and summarize its implementation in the CASA package. Differences between the multi-scale and multi-frequency parts of MS-MFS and the original MF-CLEAN and MS-CLEAN approaches are highlighted in Sects. 3.1 and 3.2.

2.1. Parameterization of spatial structure

An image with multi-scale structure is written as a linear combination of images at different spatial scales (Cornwell 2008).

$$\vec{I}^{\rm m} = \sum_{s=0}^{N_s-1} \vec{I}^{\rm shp}_{s} \star \vec{I}^{\rm sky,\delta}_{s} \tag{1}$$

where $\vec{I}^{\rm m}$ is a multi-scale model image³, and $\vec{I}^{\rm sky,\delta}_{s}$ is a collection of δ-functions that describe the locations and integrated amplitudes of flux components of scale s in the image. N_s is the number of discrete spatial scales used to represent the image and $\vec{I}^{\rm shp}_{s}$ is a tapered truncated parabola of width proportional to s. The symbol ⋆ denotes convolution.
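A minimal sketch of Eq. (1) follows. It makes several assumptions that are not part of the text above: the scale function is an untapered truncated paraboloid normalised to unit sum, scale 0 is treated as a δ-function, convolution is circular (FFT-based), and the image size, scale sizes and component amplitudes are hypothetical.

```python
import numpy as np


def parabola_kernel(npix, scale):
    """Truncated paraboloid 1 - (r/scale)^2, centred at npix//2, unit sum.

    scale == 0 returns a delta function. The taper mentioned in the text
    is omitted in this sketch.
    """
    y, x = np.indices((npix, npix)) - npix // 2
    if scale == 0:
        k = ((x == 0) & (y == 0)).astype(float)
    else:
        r2 = (x**2 + y**2) / float(scale) ** 2
        k = np.clip(1.0 - r2, 0.0, None)
    return k / k.sum()


def convolve(image, kernel):
    """Circular convolution via FFT; the kernel is centred at npix//2."""
    k0 = np.fft.ifftshift(kernel)  # move the kernel centre to pixel (0, 0)
    return np.real(np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(k0)))


npix = 64
scales = [0, 4, 8]  # hypothetical scale sizes in pixels

# One delta-function image per scale: locations and integrated amplitudes.
delta_images = [np.zeros((npix, npix)) for _ in scales]
delta_images[0][20, 20] = 1.0
delta_images[1][32, 40] = 5.0
delta_images[2][45, 25] = 10.0

# Eq. (1): sum over scales of (scale function convolved with delta image).
model = sum(convolve(d, parabola_kernel(npix, s))
            for d, s in zip(delta_images, scales))
```

Because each kernel has unit sum, the amplitude stored in a δ-function image is the integrated flux of the corresponding component in the model image.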

2.2. Parameterization of spectral structure

The spectrum of each flux component is modeled by a polynomial in frequency (a Taylor series expansion about ν₀).

$$\vec{I}^{\rm m}_{\nu} = \sum_{t=0}^{N_t-1} w_{\nu}^{t}\, \vec{I}^{\rm sky}_{t} \quad \mathrm{where} \quad w_{\nu}^{t} = \left(\frac{\nu-\nu_0}{\nu_0}\right)^{t} \tag{2}$$

where $\vec{I}^{\rm sky}_{t}$ represents a multi-scale Taylor coefficient image, and N_t is the order of the Taylor series expansion.

These Taylor coefficients are interpreted by choosing an astrophysically appropriate spectral model and performing a Taylor expansion to derive an expression that each coefficient maps to. One practical choice is a power law with a varying index, represented by a second-order polynomial in $\log(I)$ vs. $\log\left(\frac{\nu}{\nu_0}\right)$ space.

$$\vec{I}^{\rm sky}_{\nu} = \vec{I}^{\rm sky}_{\nu_0} \left(\frac{\nu}{\nu_0}\right)^{\vec{I}^{\rm sky}_{\alpha} + \vec{I}^{\rm sky}_{\beta} \log\left(\frac{\nu}{\nu_0}\right)}. \tag{3}$$

Here, $\vec{I}^{\rm sky}_{\alpha}$ represents an average spectral-index, and $\vec{I}^{\rm sky}_{\beta}$ represents spectral-curvature. The motivation behind this choice of interpretation is the fact that continuum synchrotron emission is usually modeled (and observed) as a power law distribution with frequency. Across the wide frequency ranges that new receivers are now sensitive to, spectral breaks, steepening and turnovers need to be factored into models, and the simplest way to include them and ensure smoothness is spectral curvature⁴.

A Taylor expansion of Eq. (3) yields the following expressions for the first three coefficients from which the spectral index $\vec{I}^{\rm sky}_{\alpha}$ and curvature $\vec{I}^{\rm sky}_{\beta}$ images can be computed algebraically.

$$\vec{I}^{\rm m}_0 = \vec{I}^{\rm sky}_{\nu_0}; \quad \vec{I}^{\rm m}_1 = \vec{I}^{\rm sky}_{\alpha}\, \vec{I}^{\rm sky}_{\nu_0}; \quad \vec{I}^{\rm m}_2 = \left(\frac{\vec{I}^{\rm sky}_{\alpha}\left(\vec{I}^{\rm sky}_{\alpha}-1\right)}{2} + \vec{I}^{\rm sky}_{\beta}\right) \vec{I}^{\rm sky}_{\nu_0}. \tag{4}$$

Note that with this choice of parameterization, we are using a polynomial to model a power-law, and N_t rapidly increases with bandwidth. A power-series expansion about $\vec{I}^{\rm sky}_{\alpha}$ and $\vec{I}^{\rm sky}_{\beta}$ will yield a logarithmic expansion (i.e. I vs. log ν) which requires fewer coefficients to represent the same spectrum⁵.
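The algebra in Eq. (4) can be checked numerically for a single pixel. The sketch below assumes that the logarithm in Eq. (3) is the natural logarithm (the choice for which the coefficients in Eq. (4) follow directly); the function names and test values are illustrative only.

```python
import numpy as np


def taylor_coeffs(i0, alpha, beta):
    """First three Taylor coefficients of Eq. (4) for one pixel."""
    return i0, alpha * i0, (alpha * (alpha - 1.0) / 2.0 + beta) * i0


def index_and_curvature(c0, c1, c2):
    """Invert Eq. (4): spectral index and curvature from the coefficients."""
    alpha = c1 / c0
    beta = c2 / c0 - alpha * (alpha - 1.0) / 2.0
    return alpha, beta


# Hypothetical pixel: 1 Jy at nu0, spectral index -0.7, curvature -0.2
i0, alpha, beta = 1.0, -0.7, -0.2
c0, c1, c2 = taylor_coeffs(i0, alpha, beta)
alpha_rec, beta_rec = index_and_curvature(c0, c1, c2)

# Compare the 3-term polynomial of Eq. (2) with the power law of Eq. (3)
# at a frequency 5% above nu0 (natural log assumed).
delta = 0.05                                  # (nu - nu0) / nu0
x = 1.0 + delta                               # nu / nu0
power_law = i0 * x ** (alpha + beta * np.log(x))
polynomial = c0 + c1 * delta + c2 * delta**2
truncation_error = abs(polynomial - power_law)
```

The truncation error grows with the fractional offset from ν₀, which is the reason N_t must increase with bandwidth when a polynomial is used to represent a power law.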

2.3. Multi-scale multi-frequency model

A wideband model of the sky brightness distribution is constructed from Eqs. (1) and (2). A wideband flux component is a spatial basis function ($\vec{I}^{\rm shp}_{s}$, Gaussian or parabola) whose integrated amplitude follows a Taylor polynomial in frequency. A region of emission in which the spectrum varies with position will be modeled as a sum of these wide-band flux components. The image-reconstruction process simultaneously solves for spatial and spectral coefficients of these flux components.

The image at each frequency can be modeled as a linear combination of Taylor-coefficient images at different spatial scales.

$$\vec{I}^{\rm m}_{\nu} = \sum_{t=0}^{N_t-1} \sum_{s=0}^{N_s-1} w_{\nu}^{t} \left[ \vec{I}^{\rm shp}_{s} \star \vec{I}^{\rm sky}_{s\atop t} \right] \quad \mathrm{where} \quad w_{\nu}^{t} = \left(\frac{\nu-\nu_0}{\nu_0}\right)^{t}. \tag{5}$$

Here, N_s is the number of discrete spatial scales used to represent the image and N_t is the order of the series expansion of the spectrum. $\vec{I}^{\rm sky}_{s\atop t}$ represents a collection of δ-functions that describe the locations and integrated amplitudes of flux components of scale s in the image of the tth series coefficient.

2.4. Measurement equations

The measurement equations⁶ for a sky brightness distribution parameterized by Eq. (5) are

$$\vec{V}^{\rm obs}_{\nu} = [S_{\nu}][F]\,\vec{I}^{\rm m}_{\nu} = \sum_{t=0}^{N_t-1} \sum_{s=0}^{N_s-1} w_{\nu}^{t}\, [S_{\nu}][T_s][F]\, \vec{I}^{\rm sky}_{s\atop t}. \tag{6}$$

$\vec{V}^{\rm obs}_{\nu}$ is an n × 1 vector of visibilities measured at frequency ν. $w_{\nu}^{t}$ are Taylor-weights (shown in Eq. (5)). [S_ν] is an n × m projection operator that represents the spatial-frequency sampling function for frequency ν. The image-domain convolution of model $\vec{I}^{\rm sky}_{s\atop t}$ with $\vec{I}^{\rm shp}_{s}$ is written as a spatial-frequency-domain multiplication, $\vec{I}^{\rm shp}_{s} \star \vec{I}^{\rm sky}_{s\atop t} = [F^{\dagger}][T_s][F]\, \vec{I}^{\rm sky}_{s\atop t}$, where $[T_s]_{m \times m} = {\rm diag}([F]\, \vec{I}^{\rm shp}_{s})$ is a spatial-frequency taper function. All images are vectors of shape m × 1.

Equation (6) can be re-written to include all frequencies by stacking [S_ν] for multiple frequencies. Let N_c be the number of frequencies.

$$\vec{V}^{\rm obs} = \sum_{t=0}^{N_t-1} \sum_{s=0}^{N_s-1} [W^{\rm mfs}_{t}][S][T_s][F]\, \vec{I}^{\rm sky}_{s\atop t} \tag{7}$$

where $\vec{V}^{\rm obs}$ is a vector of nN_c × 1 visibilities and [S] is an nN_c × m sampling matrix representing the multi-frequency uv-coverage of the synthesis array. $[W^{\rm mfs}_{t}]$ is a diagonal nN_c × nN_c matrix of weights, and consists of N_c diagonal blocks, each of size n × n and containing $w_{\nu}^{t}$.

If the summations over t and s are written as a block-matrix dot-product, the full measurement matrix has the shape nN_c × mN_sN_t. When multiplied by the set of N_sN_t model sky vectors each of shape m × 1, it produces nN_c visibilities.
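The shapes quoted above can be verified with a small dense-matrix sketch in one dimension. Everything below is an illustrative assumption rather than the CASA implementation: the sizes, the random sampling pattern, the unnormalised DFT matrix used for [F], and the two scale functions. The block ordering follows Eq. (8) (spatial scale outermost, Taylor term innermost).

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, Nc, Ns, Nt = 16, 6, 3, 2, 3   # pixels, visibilities/channel, channels, scales, terms
nu = np.array([1.0, 1.5, 2.0])
nu0 = 1.5
w = (nu - nu0) / nu0                # Taylor-weight base, one value per channel

F = np.fft.fft(np.eye(m))           # DFT matrix: F @ x equals np.fft.fft(x)

# One n x m sampling (selection) matrix per channel, then stacked: (n*Nc) x m
S_nu = [np.eye(m)[rng.choice(m, size=n, replace=False)] for _ in range(Nc)]
S = np.vstack(S_nu)

# Scale functions, centred on pixel 0 for circular convolution
shp = np.zeros((Ns, m))
shp[0, 0] = 1.0                          # scale 0: delta function
shp[1, [0, 1, -1]] = [0.5, 0.25, 0.25]   # scale 1: small symmetric blob
T = [np.diag(F @ shp[s]) for s in range(Ns)]


def W(t):
    """Diagonal (n*Nc) x (n*Nc) matrix holding w_nu^t in Nc blocks of size n."""
    return np.diag(np.repeat(w ** t, n))


# Full measurement matrix: blocks A[p, q] = W_q S T_p F placed side by side
A = np.hstack([W(q) @ S @ T[p] @ F for p in range(Ns) for q in range(Nt)])

sky = rng.normal(size=(Ns, Nt, m))       # delta-function coefficient images
V = A @ sky.reshape(-1)                  # Eq. (8)

# Direct evaluation of Eqs. (5) and (6), one channel at a time
V_direct = []
for c in range(Nc):
    model = sum(w[c] ** t * np.fft.ifft(np.fft.fft(shp[s]) * np.fft.fft(sky[s, t]))
                for s in range(Ns) for t in range(Nt))
    V_direct.append(S_nu[c] @ F @ model)
V_direct = np.concatenate(V_direct)
```

The matrix has shape nN_c × mN_sN_t, and applying it to the stacked coefficient images reproduces the channel-by-channel visibilities.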

For N_t = 3, N_s = 2 the measurement equations can be written as follows, in block matrix form. The subscript p denotes the pth spatial scale and the subscript q denotes the qth Taylor coefficient of the spectrum polynomial.

$$\left[\begin{array}{cccccc} \left[A_{0\atop 0}\right] & \left[A_{0\atop 1}\right] & \left[A_{0\atop 2}\right] & \left[A_{1\atop 0}\right] & \left[A_{1\atop 1}\right] & \left[A_{1\atop 2}\right] \end{array}\right] \left[\begin{array}{c} \vec{I}^{\rm sky}_{0\atop 0} \\ \vec{I}^{\rm sky}_{0\atop 1} \\ \vec{I}^{\rm sky}_{0\atop 2} \\ \vec{I}^{\rm sky}_{1\atop 0} \\ \vec{I}^{\rm sky}_{1\atop 1} \\ \vec{I}^{\rm sky}_{1\atop 2} \end{array}\right] = \vec{V}^{\rm obs} \tag{8}$$

where $\left[A_{p\atop q}\right] = [W^{\rm mfs}_{q}][S][T_p][F]$ for $p \in \{0, \ldots, N_s-1\}$ and $q \in \{0, \ldots, N_t-1\}$.

2.5. Normal equations

The normal equations for the system described in Eq. (6) can be written in block matrix form, with each block-row (for scale size s, and Taylor term t) given by

$$\sum_{p=0}^{N_s-1} \sum_{q=0}^{N_t-1} \left[H_{s,p\atop t,q}\right] \vec{I}^{\rm sky}_{p\atop q} = \vec{I}^{\rm dirty}_{s\atop t} \quad \forall\, s \in [0, N_s-1],\ t \in [0, N_t-1]. \tag{9}$$

Here, each $\left[H_{s,p\atop t,q}\right]$ is an m × m block of the Hessian matrix, and $\vec{I}^{\rm dirty}_{s\atop t}$ is one of N_sN_t dirty images.

\begin{align}
\left[H_{s,p\atop t,q}\right] &= \left[A_{s\atop t}^{\dagger}\right] [W^{\rm im}] \left[A_{p\atop q}\right] \tag{10} \\
&= [F^{\dagger}\, T_s\, S^{\dagger}\, {W^{\rm mfs}_{t}}^{\dagger}]\, [W^{\rm im}]\, [W^{\rm mfs}_{q}\, S\, T_p\, F] \tag{11} \\
&= [F^{\dagger}\, T_s\, F]\, [F^{\dagger}\, S^{\dagger}\, {W^{\rm mfs}_{t}}^{\dagger}\, W^{\rm im}\, W^{\rm mfs}_{q}\, S\, F]\, [F^{\dagger}\, T_p\, F] \tag{12} \\
&= [F^{\dagger}\, T_s\, F] \left\{ \sum_{\nu} w_{\nu}^{t+q}\, [F^{\dagger}\, S_{\nu}^{\dagger}\, W^{\rm im}_{\nu}\, S_{\nu}\, F] \right\} [F^{\dagger}\, T_p\, F] \tag{13} \\
&= [F^{\dagger}\, T_s\, F] \left\{ \sum_{\nu} w_{\nu}^{t+q}\, [H_{\nu}] \right\} [F^{\dagger}\, T_p\, F]. \tag{14}
\end{align}

[W^im] is a diagonal matrix of data-weights (and imaging-weights) and $[W^{\rm mfs}_{t}]$ is a diagonal matrix containing Taylor-weights $w_{\nu}^{t}$. $[H_{\nu}] = [F^{\dagger}\, S_{\nu}^{\dagger}\, W^{\rm im}_{\nu}\, S_{\nu}\, F]$ is the Hessian matrix formed using only one frequency channel, and is a convolution operator containing a shifted version of the single-frequency point-spread-function $\vec{I}^{\rm psf}_{\nu} = {\rm diag}[F^{\dagger}\, S_{\nu}^{\dagger}\, W^{\rm im}_{\nu}\, S_{\nu}]$ in each row (see Appendix A.2 for details). [F^†T_sF] and [F^†T_pF] are also convolution operators with $\vec{I}^{\rm shp}_{s}$ and $\vec{I}^{\rm shp}_{p}$ as their kernels. The process of convolution is associative and commutative, and therefore, $\left[H_{s,p\atop t,q}\right]$ is also a convolution operator whose kernel is given by

$$\vec{I}^{\rm psf}_{s,p\atop t,q} = \vec{I}^{\rm shp}_{s} \star \left\{ \sum_{\nu} w_{\nu}^{t+q}\, \vec{I}^{\rm psf}_{\nu} \right\} \star \vec{I}^{\rm shp}_{p}. \tag{15}$$

The dirty images on the RHS of Eq. (9) can be written as follows.

\begin{align}
\vec{I}^{\rm dirty}_{s\atop t} &= \left[A_{s\atop t}^{\dagger}\right] [W^{\rm im}]\, \vec{V}^{\rm obs} \tag{16} \\
&= [F^{\dagger}\, T_s\, S^{\dagger}\, {W^{\rm mfs}_{t}}^{\dagger}\, W^{\rm im}]\, \vec{V}^{\rm obs} \tag{17} \\
&= [F^{\dagger}\, T_s\, F]\, [F^{\dagger}\, S^{\dagger}\, {W^{\rm mfs}_{t}}^{\dagger}\, W^{\rm im}]\, \vec{V}^{\rm obs} \tag{18} \\
&= [F^{\dagger}\, T_s\, F] \left\{ \sum_{\nu} w_{\nu}^{t}\, [F^{\dagger}\, S_{\nu}^{\dagger}\, W^{\rm im}_{\nu}]\, \vec{V}^{\rm obs}_{\nu} \right\} \tag{19} \\
&= \vec{I}^{\rm shp}_{s} \star \left\{ \sum_{\nu} w_{\nu}^{t}\, \vec{I}^{\rm dirty}_{\nu} \right\} \tag{20}
\end{align}

where $\vec{I}^{\rm dirty}_{\nu} = [F^{\dagger}\, S_{\nu}^{\dagger}\, W^{\rm im}_{\nu}]\, \vec{V}^{\rm obs}_{\nu}$ is the dirty image formed by direct Fourier inversion of weighted visibilities from one frequency channel.
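The equivalence between the matrix form of the dirty image and its image-domain form (a scale function convolved with a Taylor-weighted sum of single-channel dirty images) can be checked with a small one-dimensional sketch. The assumptions here are mine: a unitary DFT matrix for [F] (so that [F†][F] is the identity), [W^im] set to the identity, a symmetric scale function (so that [T_s] is real), circular convolution, and random numbers standing in for visibilities.

```python
import numpy as np

rng = np.random.default_rng(2)
m, n, Nc = 16, 6, 3                 # pixels, visibilities per channel, channels
nu = np.array([1.0, 1.5, 2.0])
nu0 = 1.5
w = (nu - nu0) / nu0

F = np.fft.fft(np.eye(m)) / np.sqrt(m)   # unitary DFT matrix
Fd = F.conj().T

# Per-channel sampling matrices and stand-in visibilities
S_nu = [np.eye(m)[rng.choice(m, size=n, replace=False)] for _ in range(Nc)]
V_nu = [rng.normal(size=n) + 1j * rng.normal(size=n) for _ in range(Nc)]

# Symmetric scale function centred on pixel 0, so its transform is real
shp = np.zeros(m)
shp[[0, 1, -1]] = [0.5, 0.25, 0.25]
T_s = np.diag(np.fft.fft(shp))

t = 1  # Taylor term for which the dirty image is formed

# Matrix form: A^dagger V with A = W_t S T_s F and W^im = identity
S = np.vstack(S_nu)
W_t = np.diag(np.repeat(w ** t, n))
A = W_t @ S @ T_s @ F
dirty_matrix = A.conj().T @ np.concatenate(V_nu)

# Image-domain form: shp convolved with the Taylor-weighted sum of the
# single-channel dirty images
dirty_nu = [Fd @ S_nu[c].T @ V_nu[c] for c in range(Nc)]
weighted_sum = sum(w[c] ** t * dirty_nu[c] for c in range(Nc))
dirty_image = np.fft.ifft(np.fft.fft(shp) * np.fft.fft(weighted_sum))
```

The two results agree to numerical precision. They are complex here only because the stand-in visibilities are not Hermitian-symmetric.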

When all scales and Taylor terms are combined, the full Hessian matrix contains N_tN_s × N_tN_s blocks, each of size m × m, and the system involves N_t Taylor coefficient images, each of size m × 1, for each of the N_s spatial scales.

The normal equations in block matrix form for the example in Eq. (8) for N_t = 3, N_s = 2 are shown in Eq. (21). The Hessian matrix consists of N_s × N_s = 2 × 2 blocks (the four quadrants of the matrix), each for one pair of spatial scales s,p (the upper indices). Within each quadrant, the N_t × N_t = 3 × 3 matrices correspond to various pairs of t,q (Taylor coefficient indices; the lower indices). This layout shows how the multi-scale and multi-frequency aspects of this imaging problem are combined and illustrates the dependencies between the spatial and spectral basis functions.

$$\left[\begin{array}{cccccc}
\left[H_{0,0\atop 0,0}\right] & \left[H_{0,0\atop 0,1}\right] & \left[H_{0,0\atop 0,2}\right] & \left[H_{0,1\atop 0,0}\right] & \left[H_{0,1\atop 0,1}\right] & \left[H_{0,1\atop 0,2}\right] \\
\left[H_{0,0\atop 1,0}\right] & \left[H_{0,0\atop 1,1}\right] & \left[H_{0,0\atop 1,2}\right] & \left[H_{0,1\atop 1,0}\right] & \left[H_{0,1\atop 1,1}\right] & \left[H_{0,1\atop 1,2}\right] \\
\left[H_{0,0\atop 2,0}\right] & \left[H_{0,0\atop 2,1}\right] & \left[H_{0,0\atop 2,2}\right] & \left[H_{0,1\atop 2,0}\right] & \left[H_{0,1\atop 2,1}\right] & \left[H_{0,1\atop 2,2}\right] \\
\left[H_{1,0\atop 0,0}\right] & \left[H_{1,0\atop 0,1}\right] & \left[H_{1,0\atop 0,2}\right] & \left[H_{1,1\atop 0,0}\right] & \left[H_{1,1\atop 0,1}\right] & \left[H_{1,1\atop 0,2}\right] \\
\left[H_{1,0\atop 1,0}\right] & \left[H_{1,0\atop 1,1}\right] & \left[H_{1,0\atop 1,2}\right] & \left[H_{1,1\atop 1,0}\right] & \left[H_{1,1\atop 1,1}\right] & \left[H_{1,1\atop 1,2}\right] \\
\left[H_{1,0\atop 2,0}\right] & \left[H_{1,0\atop 2,1}\right] & \left[H_{1,0\atop 2,2}\right] & \left[H_{1,1\atop 2,0}\right] & \left[H_{1,1\atop 2,1}\right] & \left[H_{1,1\atop 2,2}\right]
\end{array}\right]
\left[\begin{array}{c} \vec{I}^{\rm sky}_{0\atop 0} \\ \vec{I}^{\rm sky}_{0\atop 1} \\ \vec{I}^{\rm sky}_{0\atop 2} \\ \vec{I}^{\rm sky}_{1\atop 0} \\ \vec{I}^{\rm sky}_{1\atop 1} \\ \vec{I}^{\rm sky}_{1\atop 2} \end{array}\right]
=
\left[\begin{array}{c} \vec{I}^{\rm dirty}_{0\atop 0} \\ \vec{I}^{\rm dirty}_{0\atop 1} \\ \vec{I}^{\rm dirty}_{0\atop 2} \\ \vec{I}^{\rm dirty}_{1\atop 0} \\ \vec{I}^{\rm dirty}_{1\atop 1} \\ \vec{I}^{\rm dirty}_{1\atop 2} \end{array}\right]. \tag{21}$$

This is the system of equations to be solved. The spatial-frequency sampling of a real interferometer is always incomplete ([S] is rank-deficient). Therefore, each Hessian block, and the entire Hessian matrix, is singular, and an exact inverse does not exist⁷. An accurate reconstruction can be obtained only via successive approximation (iterative numerical optimization).

2.6. Principal solution

The principal solution⁸ of the normal equations is an approximate solution that can be computed via diagonal approximations of all Hessian blocks. This solution is then used to pick flux components within the minor-cycle of iterative deconvolution.

Each Hessian block is a convolution operator with a shifted version of a point-spread function $\vec{I}^{\rm psf}_{{s,p}\atop{t,q}}$ in each row (their centers are aligned on the diagonal). A diagonal approximation represents the assumption that the PSFs are δ-functions, or that the amplitudes in the dirty image at the location of a source reflect the true flux of the source. Also, with the assumption of spatially invariant PSFs, all elements on the diagonal within each Hessian block are the same. Therefore, we can reduce each $\left[H_{{s,p}\atop{t,q}}\right]$ to one number. The full Hessian reduces to an N_tN_s × N_tN_s element matrix [H^peak]. The principal solution can be obtained by inverting [H^peak] once, and applying it to the dirty image vectors, one pixel at a time. Such a solution will be correct only at the locations of the centers of isolated flux-components and must be augmented with an iterative optimization approach to ensure accuracy. In the case of perfect sampling (where the Hessian blocks are truly diagonal and PSFs are δ-functions), the principal solution will directly give correct images of Taylor-series coefficients.
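The per-pixel application of the inverse of [H^peak] can be sketched in a few lines of NumPy. This is a minimal illustration and not the CASA implementation: the function name `principal_solution`, the array layout, and the numerical values of `h_peak` are assumptions chosen for the example. It uses the perfect-sampling case described above, in which the principal solution returns the true Taylor coefficients.

```python
import numpy as np

def principal_solution(h_peak, dirty):
    """Apply the inverse of [H^peak] to the dirty-image vectors, pixel by pixel.

    h_peak : (K, K) array with K = N_s * N_t, one number per Hessian block.
    dirty  : (K, ny, nx) array holding the stacked RHS (dirty) images.
    """
    k = h_peak.shape[0]
    flat = dirty.reshape(k, -1)             # each column is one pixel's K-vector
    psol = np.linalg.solve(h_peak, flat)    # equivalent to inv(h_peak) @ flat
    return psol.reshape(dirty.shape)

# Toy case (assumed values): N_s = 1, N_t = 2 and perfect sampling, so the
# dirty images are simply [H^peak] applied to the true coefficients per pixel.
h_peak = np.array([[1.0, 0.1],
                   [0.1, 0.05]])
true_coeffs = np.zeros((2, 4, 4))
true_coeffs[:, 1, 2] = [2.0, -1.4]          # one source: I_0 = 2.0, I_1 = -1.4
dirty = np.einsum("tq,qyx->tyx", h_peak, true_coeffs)
recovered = principal_solution(h_peak, dirty)
```

With an incomplete sampling function the same call would be correct only at the centers of isolated components, which is why it serves as the starting point for the iterations rather than as the final answer.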

2.6.1. Properties of [H^peak]

Some properties of [H^peak] are worth noting, to understand the numerical stability of this approach and its dependence on the choice of spectral and spatial basis functions (sky model), and spectral and spatial-frequency sampling functions (data and instrument).

1.
There are N_sN_t elements on the diagonal of [H^peak]. Each is a measure of the instrument’s sensitivity to a flux component of unit total flux whose shape and spectrum are described by one pair of spatial and spectral basis functions ($\vec{I}^{\rm shp}_s$ and $\left(\frac{\nu-\nu_0}{\nu_0}\right)^t$). Each diagonal element is given by
$$H^{\rm peak}_{{s,s}\atop{t,t}} = {\rm mid}\left\{\vec{I}^{\rm psf}_{{s,s}\atop{t,t}}\right\} = {\rm tr}\left[\sum_{\nu} w_{\nu}^{t+t} \left[T_s S_{\nu}^{\dagger} W^{\rm im}_{\nu} S_{\nu} T_s\right]\right] \quad \forall~ s \in \{0 \ldots N_s-1\},~ t \in \{0 \ldots N_t-1\}. \tag{22}$$
Note that the instrument’s sampling function and data weights are included in this expression.
2.
The off-diagonal elements measure the orthogonality⁹ between the various basis functions, for the given uv-coverage and weighting scheme. They measure the amount of overlap between basis functions in the measurement domain. Smaller values indicate a more orthogonal set of basis functions that the instrument is better able to distinguish between.
3.
The condition number of this matrix (or of blocks within this matrix) will indicate if the chosen set of basis functions and spatial-frequency coverage provide enough constraints to provide a stable solution, and can be used as a metric to choose a suitable basis set. For a simple example, if a 3-term solution is attempted with data from only two distinct frequencies, [H^peak] will be singular. Or, for some choice of multi-frequency uv-coverage, the visibilities measured by the instrument for two different spatial scales may become hard to distinguish. Then, the cross-term element of [H^peak] corresponding to this combination could have a higher value, indicating that the two parameters are highly coupled, and there is insufficient information in the data and sampling pattern to distinguish between the scales. A similar situation can arise to create ambiguity between spatial and spectral structure (an extreme example is multi-frequency measurements from only one baseline).
4.
In general, [H^peak] will be a positive-definite symmetric matrix whose inverse can be easily computed via a Cholesky decomposition¹⁰. The value of N_s is usually < 10, making the inversion of [H^peak] tractable as a one-time operation.
5.
Some further approximations can be made about the structure of [H^peak] to simplify its inversion, and it is important to understand the numerical implications of these trade-offs. One is a block-diagonal approximation of [H^peak] (i.e. using only those blocks of the Hessian in Eq. (21) for which s = p; top-left and bottom-right quadrants). This approximation treats each spatial scale separately and assumes that the scale basis functions are orthogonal. For each scale, cross-terms between Taylor functions are preserved, and a multi-frequency principal solution can be obtained separately for each spatial scale. Note that a set of tapered truncated paraboloids is never orthogonal, but this separate-scale approximation works because of the iterative χ²-minimization process. This approximation makes the Hessian inversion easier, but to preserve accuracy, the update step of the iterative deconvolution still needs to evaluate the full LHS of the normal equations while subtracting out a flux component.
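The two-frequency example in item 3 and the Cholesky remark in item 4 can be checked numerically. The sketch below is an illustration under simplifying assumptions: N_s = 1, every per-frequency PSF has unit peak, and all frequencies carry equal weight, so that each element reduces to a normalized sum of w_ν^(t+q). The function name `h_peak_point_source` and the frequency values are hypothetical.

```python
import numpy as np

def h_peak_point_source(freqs, ref_freq, n_terms):
    """[H^peak] for N_s = 1 and unit-peak PSFs: H[t, q] = mean over nu of w_nu^(t+q)."""
    w = (np.asarray(freqs, dtype=float) - ref_freq) / ref_freq
    powers = np.arange(n_terms)
    exponent = powers[:, None] + powers[None, :]             # t + q
    h = (w[:, None, None] ** exponent[None, :, :]).sum(axis=0)
    return h / len(w)                                        # normalise so H[0, 0] = 1

# A 3-term solution attempted with only two distinct frequencies is rank-deficient.
two_freqs = h_peak_point_source([1.0, 2.0], ref_freq=1.5, n_terms=3)
# Nine frequencies across the same band constrain all three terms.
nine_freqs = h_peak_point_source(np.linspace(1.0, 2.0, 9), ref_freq=1.5, n_terms=3)

print("rank with 2 frequencies:", np.linalg.matrix_rank(two_freqs))
print("condition number with 9 frequencies:", np.linalg.cond(nine_freqs))

# A well-conditioned [H^peak] is symmetric positive-definite, so Cholesky applies.
chol = np.linalg.cholesky(nine_freqs)
```

For a real observation the elements would come from the peaks of the convolution kernels, which also carry the uv-coverage and weighting, so the condition number would generally differ from this idealized case.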

2.7. MS-MFS algorithm

This section describes an iterative process that solves the normal equations (Eq. (9)) and produces a set of N_t Taylor-series coefficient images at N_s different spatial scales. Appendix B lists the algorithm-steps in pseudo-code format, reflecting the implementation of MS-MFS in the CASA¹¹ software package.

Pre-compute Hessian:

each block of the Hessian is a convolution operator, consisting of a shifted version of the same convolution kernel in each row. Therefore, it suffices to compute and store one kernel per Hessian block. Convolution kernels for all distinct blocks in the N_sN_t × N_sN_t Hessian are evaluated via Eq. (15). All kernels are normalized by the sum of weights such that the peak of $\vec{I}^{\rm psf}_{{0,0}\atop{0,0}}$ is unity and the relative weights between Hessian blocks are preserved. A block-diagonal approximation of [H^peak] is made, and a set of N_s matrices, each of shape N_t × N_t and denoted $[H^{\rm peak}_s]$, is constructed. Their inverses are computed and stored in $[{H^{\rm peak}_s}^{-1}]$.

Initialization:

all N_sN_t model images are initialized to zero (or an a priori model to start from).

Major and minor cycles:

iterative image reconstruction in radio interferometry is usually split into major and minor cycles (see Appendix A.3). The major cycles compute the RHS of the normal equations, and the minor cycle inverts the Hessian and gets an estimate of the model. This model is used in the next major cycle to compute new RHS vectors from the residuals, and the process repeats itself until convergence is achieved. Steps 1 and 5 (of the list of steps given below) form one major cycle, and repetitions of Steps 2 through 4 form the minor cycle.

1.
Compute residual images: the RHS vectors (residual or dirty images) $\vec{I}^{\rm dirty}_{{s}\atop{t}} ~\forall~ t \in \{0 \ldots N_t-1\}$ of the normal equations are computed via Eq. (20) by first computing the multi-frequency dirty images and then convolving them by the scale basis functions.
2.
Find a flux component: the principal solution is computed for all pixels, one scale at a time.
$$\vec{I}^{\rm pix,psol}_s = [{H^{\rm peak}_s}^{-1}]\, \vec{I}^{\rm pix,dirty}_s \quad \mathrm{for~each~pixel,~and~scale}~s. \tag{23}$$
Here, $\vec{I}^{\rm pix,psol}_s$ is a list of N_t Taylor-coefficients for the pixel-location pix and scale s. $[H^{\rm peak}_s]$ is the sth block (of size N_t × N_t) in the list of diagonal-blocks of [H^peak], and $\vec{I}^{\rm pix,dirty}_s$ is the N_t × 1 vector constructed from $\vec{I}^{\rm dirty}_{{s}\atop{t}} ~\forall~ t \in \{0 \ldots N_t-1\}$ for one pixel pix. This step is performed on all pixels, separately for all scales s, resulting in N_s sets of N_t Taylor-coefficient images. The most appropriate set of Taylor coefficients must now be chosen. The search is performed across pixels and scales. Many heuristics can be used here. For example, in iteration i, choose the N_t element solution-set with the dominant q = 0 component across all scales and pixel locations. Or, pick the set of components that makes the largest impact on the value of χ². Or, choose the location from the peak in the t = 0 residual image, and compute the principal solution only for that pixel. The result of this step is a set of N_t model images, each containing one δ-function that marks the location of the center of a flux component of shape $\vec{I}^{\rm shp}_{p,(i)}$ (p represents the scale of the chosen component, out of all possible values of s). The amplitudes of these N_t δ-functions are the Taylor coefficients that model the spectrum of the integrated flux of this component. Let these N_t model images from iteration i be denoted as $\left\{\vec{I}^{\rm m}_{{{p}\atop{q}},(i)}\right\};~ q \in \{0 \ldots N_t-1\}$.
3.
Update model images: a set of N_t multi-scale model images is accumulated.
$$\vec{I}^{\rm m}_{q} = \vec{I}^{\rm m}_{q} + g \left( \vec{I}^{\rm m}_{{{p}\atop{q}},(i)} \star \vec{I}^{\rm shp}_{p} \right) \quad \forall~ q \in \{0 \ldots N_t-1\} \tag{24}$$
where g is a loop gain that takes on values between 0 and 1 and controls the step size for each iteration in the χ²-minimization process.
4.
Update RHS: the RHS residual images¹² are updated by evaluating and subtracting out the entire LHS of the normal equations. However, since the chosen flux component corresponds to just one scale, this becomes a summation over only Taylor terms (and not scale; there is only one scale p in the current model).
$$\vec{I}^{\rm res}_{{s}\atop{t}} = \vec{I}^{\rm res}_{{s}\atop{t}} - g \sum_{q=0}^{N_t-1} \left[ \vec{I}^{\rm psf}_{{s,p}\atop{t,q}} \star \vec{I}^{\rm m}_{{{p}\atop{q}},(i)} \right]. \tag{25}$$
Repeat from Step 2 until the minor-cycle flux limit is reached.
5.
Predict: model visibilities are computed from the set of Taylor-coefficient images via Eq. (6). Residual visibilities are computed as $\vec{V}^{\rm res}_{\nu} = \vec{V}^{\rm obs}_{\nu} - \vec{V}^{\rm m}_{\nu}$. Repeat from Step 1 until a global convergence criterion is satisfied. (After the first iteration, $\vec{V}^{\rm res}_{\nu}$ is used as the new $\vec{V}^{\rm obs}_{\nu}$ in Step 1.)
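The minor cycle (Steps 2 to 4) can be illustrated with a one-dimensional toy problem. The sketch below is not the CASA implementation listed in Appendix B. It assumes a point-source model (N_s = 1 with a δ-function scale, so the convolution with the scale function in Eq. (24) is trivial), N_t = 2, Gaussian per-frequency PSFs on a periodic grid, and the third heuristic mentioned in Step 2 (take the peak of the t = 0 residual and compute the principal solution only there). All names and numerical values are hypothetical.

```python
import numpy as np

def shifted(kernel, pix):
    """Circularly shift a kernel whose peak is at index 0 so that it peaks at pix."""
    return np.roll(kernel, pix)

# --- toy setup (all values are illustrative assumptions) ---
n_pix, n_terms = 64, 2
freqs = np.linspace(1.0, 2.0, 5)                  # GHz
w = (freqs - 1.5) / 1.5                           # (nu - nu0) / nu0
x = np.minimum(np.arange(n_pix), n_pix - np.arange(n_pix))   # circular distance from pixel 0
psf_nu = np.array([np.exp(-0.5 * (x * f / 3.0) ** 2) for f in freqs])  # narrower at high nu

# Pre-compute one kernel per Hessian block, psf[t, q] = sum_nu w^(t+q) psf_nu,
# normalised so that psf[0, 0] has unit peak.
psf = np.array([[(w[:, None] ** (t + q) * psf_nu).sum(axis=0) for q in range(n_terms)]
                for t in range(n_terms)]) / len(freqs)
h_peak_inv = np.linalg.inv(psf[:, :, 0])          # all kernel peaks sit at index 0

# Simulated RHS: one point source with Taylor coefficients (1.0, -0.7) at pixel 20.
true_pix, true_coeffs = 20, np.array([1.0, -0.7])
res = np.array([sum(true_coeffs[q] * shifted(psf[t, q], true_pix) for q in range(n_terms))
                for t in range(n_terms)])

model = np.zeros((n_terms, n_pix))
gain, threshold = 0.5, 1e-8
for _ in range(200):
    pix = int(np.argmax(np.abs(res[0])))          # Step 2: peak of the t = 0 residual
    if abs(res[0, pix]) < threshold:
        break
    comp = h_peak_inv @ res[:, pix]               # principal solution at that pixel, Eq. (23)
    model[:, pix] += gain * comp                  # Step 3: accumulate the model, Eq. (24)
    for t in range(n_terms):                      # Step 4: subtract the LHS, Eq. (25)
        res[t] -= gain * sum(comp[q] * shifted(psf[t, q], pix) for q in range(n_terms))
```

A multi-scale version would keep one residual image per (s, t) pair, search over scales as well as pixels, and use the kernels for all scale pairs (s, p) in the update of Step 4.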

Restoration:

after convergence, the model Taylor-coefficient images can be interpreted in different ways.

1.
The most obvious data products are the Taylor-coefficient images themselves, which are directly smoothed by the restoring beam. Residual images are added back in after computing the principal solution from the residuals obtained in the last instance of Step 1, to ensure that any undeconvolved flux has the correct flux values¹³.
2.
For the study of broad-band radio emission, the spectral coefficients can be interpreted in terms of a power law in frequency with varying index (as described in Sect. 2.2). The data products are images of the reference-frequency flux $\vec{I}^{\rm m}_{\nu_0}$, the spectral index $\vec{I}^{\rm m}_{\alpha}$ and the spectral curvature $\vec{I}^{\rm m}_{\beta}$.
$$\vec{I}^{\rm m}_{\nu_0} = \vec{I}^{\rm m}_0 \tag{26}$$
$$\vec{I}^{\rm m}_{\alpha} = \vec{I}^{\rm m}_1 / \vec{I}^{\rm m}_0 \tag{27}$$
$$\vec{I}^{\rm m}_{\beta} = \left[\vec{I}^{\rm m}_2 / \vec{I}^{\rm m}_0\right] - \left[\vec{I}^{\rm m}_{\alpha}\left(\vec{I}^{\rm m}_{\alpha}-1\right)/2\right]. \tag{28}$$
Spectral index and curvature images are calculated only in regions where the values in $\vec{I}^{\rm m}_0$ are above a chosen threshold.
3.
An image cube can be constructed by evaluating the spectral polynomial via Eq. (2) for each frequency. This data product is useful for sources whose emission is not well modeled by a power law, but is a smooth polynomial in frequency. Band-limited signals that taper off smoothly in frequency are one example.
4.
When multiple sources along a given line-of-sight have different spectra, the Taylor-coefficients will represent the combined spectrum. To compute spectral index and curvature maps for foreground sources, a polynomial background-subtraction must be done on the Taylor-coefficient images before Eqs. (26) to (28) are evaluated.
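The conversion in item 2 (Eqs. (26) to (28)), including the threshold on the intensity image, can be sketched as follows. The function names, the NaN convention for masked pixels, and the test values are assumptions made for this illustration. The inverse mapping uses the relations between Taylor coefficients and power-law parameters quoted with the simulation results in Sect. 5.1.

```python
import numpy as np

def taylor_to_spectral(i0, i1, i2, threshold):
    """Eqs. (26)-(28): reference flux, spectral index and curvature from Taylor images.

    Pixels where i0 is at or below `threshold` are set to NaN in alpha and beta.
    """
    mask = i0 > threshold
    safe_i0 = np.where(mask, i0, 1.0)               # avoid dividing by ~0 off-source
    alpha = np.where(mask, i1 / safe_i0, np.nan)
    beta = np.where(mask, i2 / safe_i0 - alpha * (alpha - 1.0) / 2.0, np.nan)
    return i0, alpha, beta

def spectral_to_taylor(i_ref, alpha, beta):
    """Inverse mapping: first three Taylor coefficients of the power-law model."""
    return i_ref, alpha * i_ref, (alpha * (alpha - 1.0) / 2.0 + beta) * i_ref

# Round trip on a 2 x 2 toy image; the pixel with zero flux is masked out.
i_ref = np.array([[2.0, 0.0], [0.5, 1.0]])
alpha_true = np.array([[-0.7, 0.0], [-1.0, 1.0]])
beta_true = np.array([[0.1, 0.0], [0.0, 0.0]])
i0, i1, i2 = spectral_to_taylor(i_ref, alpha_true, beta_true)
_, alpha, beta = taylor_to_spectral(i0, i1, i2, threshold=0.1)
```

In practice the threshold would be tied to the noise in the intensity image, since the ratios amplify errors wherever the intensity is small.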

Primary-beam correction:

for wide-field imaging, the spatial and spectral structure of the primary beam (array-element response function) contributes to the measured signal. If this instrumental effect is not accounted for, the output Taylor-coefficient images approximately represent the product of the sky and the primary beam.
$$\vec{I}^{\rm sky}_{\nu} = \vec{P}_{\nu} \vec{I}^{\rm true}_{\nu} = \vec{P}_{\nu_0} \vec{I}^{\rm true}_{\nu_0} \left(\frac{\nu}{\nu_0}\right)^{\left[\vec{I}^{\rm true}_{\alpha}+\vec{P}_{\alpha}\right] + \left[\vec{I}^{\rm true}_{\beta}+\vec{P}_{\beta}\right] \log\left(\frac{\nu}{\nu_0}\right)}. \tag{29}$$
$\vec{P}_{\nu_0}$ is the primary beam at the reference frequency, and $\vec{P}_{\alpha}$ and $\vec{P}_{\beta}$ are the spectral index and curvature due to the frequency dependence of the primary beam.

A correction for the average primary-beam and its frequency dependence can be done as a post-deconvolution step. The primary-beams are first evaluated or measured as a function of frequency, and the frequency-dependence per pixel modeled by a power-law or a polynomial (preferably the same spectral polynomial used for the image reconstruction). Primary-beam correction can then be done as follows.
$$\vec{I}^{\rm new}_{\nu_0} = \vec{I}^{\rm m}_{\nu_0} / \vec{P}_{\nu_0} \tag{30}$$
$$\vec{I}^{\rm new}_{\alpha} = \vec{I}^{\rm m}_{\alpha} - \vec{P}_{\alpha} \tag{31}$$
$$\vec{I}^{\rm new}_{\beta} = \vec{I}^{\rm m}_{\beta} - \vec{P}_{\beta}. \tag{32}$$
Note that if a polynomial is fit for the frequency-dependence of the primary beam, and $\vec{P}_{\nu_0}$, $\vec{P}_{\alpha}$ and $\vec{P}_{\beta}$ are computed from it, the above operation is numerically identical to doing a polynomial division in terms of two sets of coefficients (for N_t ≤ 3). A brute-force polynomial division using more series coefficients will yield a more accurate solution. Note, however, that such a correction will not be accurate if there are time-dependent variations in the primary-beam, and will require integration with the AW-Projection algorithm discussed in Bhatnagar et al. (2008).
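The stated equivalence between Eqs. (30) to (32) and a polynomial division can be checked for a single pixel. In the sketch below the function names and the coefficient values are illustrative assumptions; `series_divide` computes the first three coefficients of the quotient of two Taylor series, and both routes are compared after conversion to reference flux, spectral index and curvature.

```python
import numpy as np

def to_spectral(c0, c1, c2):
    """Eqs. (26)-(28) applied to three Taylor coefficients."""
    alpha = c1 / c0
    return c0, alpha, c2 / c0 - alpha * (alpha - 1.0) / 2.0

def pb_correct(i_ref, i_alpha, i_beta, p_ref, p_alpha, p_beta):
    """Eqs. (30)-(32): divide out the beam amplitude, subtract its index and curvature."""
    return i_ref / p_ref, i_alpha - p_alpha, i_beta - p_beta

def series_divide(m, p):
    """First three coefficients of the power-series quotient m(w) / p(w)."""
    t0 = m[0] / p[0]
    t1 = (m[1] - t0 * p[1]) / p[0]
    t2 = (m[2] - t0 * p[2] - t1 * p[1]) / p[0]
    return t0, t1, t2

# Illustrative single-pixel values (assumptions, not taken from the paper).
m = (0.8, -0.9, 0.6)       # Taylor coefficients of the observed (sky times beam) spectrum
p = (0.7, -0.5, -0.2)      # Taylor coefficients of the primary-beam spectrum

via_indices = pb_correct(*to_spectral(*m), *to_spectral(*p))
via_division = to_spectral(*series_divide(m, p))
```

The two results agree to rounding error for three terms. Carrying more series coefficients through `series_divide` is the brute-force route mentioned above.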

3. Relation to MF-CLEAN and MS-CLEAN

The MS-MFS algorithm is a combination of the general ideas used in MS-CLEAN and MF-CLEAN, but there are some subtle differences. The next two sections briefly discuss these differences and their numerical implications.

3.1. Relation to MF-CLEAN

The MF-CLEAN algorithm (Sault & Wieringa 1994) models the sky as a collection of point sources with a Taylor-polynomial spectrum. A point-source version of MS-MFS can be derived by setting N_s = 1 and using $\vec{I}^{\rm shp}_0 = \delta$-function in the derivations in Sects. 2.4 and 2.5. The normal equations can be written in block-matrix form (for example, for N_t = 3).
$$
\left[\begin{array}{ccc}
[H_{0,0}] & [H_{0,1}] & [H_{0,2}] \\
[H_{1,0}] & [H_{1,1}] & [H_{1,2}] \\
[H_{2,0}] & [H_{2,1}] & [H_{2,2}]
\end{array}\right]
\left[\begin{array}{c}
\vec{I}^{\rm sky}_{0} \\ \vec{I}^{\rm sky}_{1} \\ \vec{I}^{\rm sky}_{2}
\end{array}\right]
=
\left[\begin{array}{c}
\vec{I}^{\rm dirty}_{0} \\ \vec{I}^{\rm dirty}_{1} \\ \vec{I}^{\rm dirty}_{2}
\end{array}\right]. \tag{33}
$$
Each block [H_t,q] is a convolution operator with $\vec{I}^{\rm psf}_{t,q}$ as its kernel.
$$\vec{I}^{\rm psf}_{t,q} = \sum_{\nu} w_{\nu}^{t+q}\, \vec{I}^{\rm psf}_{\nu} \tag{34}$$
$$\vec{I}^{\rm dirty}_{t} = \sum_{\nu} w_{\nu}^{t}\, \vec{I}^{\rm dirty}_{\nu}. \tag{35}$$
On the other hand, the MF-CLEAN algorithm described in Sault & Wieringa (1994) follows a matched-filtering approach using functions called spectral-PSFs, which are equivalent to the convolution kernels from the first row of Hessian blocks (q = 0) in Eq. (34). In MF-CLEAN, the Hessian elements and RHS vectors are calculated by convolving spectral-PSFs with themselves and the residual images.
$$\vec{I}^{\rm psf}_{t,q} = \left\{ \sum_{\nu} w_{\nu}^{t}\, \vec{I}^{\rm psf}_{\nu} \right\} \star \left\{ \sum_{\nu} w_{\nu}^{q}\, \vec{I}^{\rm psf}_{\nu} \right\} \tag{36}$$
$$\vec{I}^{\rm dirty}_{t} = \left\{ \sum_{\nu} w_{\nu}^{t}\, \vec{I}^{\rm psf}_{\nu} \right\} \star \left\{ \sum_{\nu} \vec{I}^{\rm dirty}_{\nu} \right\}. \tag{37}$$
Formally, this matched-filtering approach is exactly equal to the calculations shown in Eqs. (34) and (35) only under the conditions that there is no overlap on the spatial-frequency plane between measurements from different observing frequencies, and all measurements are weighted equally across the spatial-frequency plane (uniform weighting). In practice, MF-CLEAN incurs errors for arrays with dense spatial-frequency coverage where tracks from different baselines and frequency-channels intersect.
When applied to simulated EVLA data, numerical instabilities limited the fidelity of the final image, especially with extended emission, and this instability was eliminated by changing the computations from Eqs. (36) and (37) to Eqs. (34) and (35). Also, the MF-CLEAN formulation uses a two-term Taylor-polynomial, which can be shown to result in a dynamic-range limit of 10⁴ for a bandwidth ratio of 2:1 and source spectral index of –1.0.
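The condition under which Eqs. (34) and (36) agree can be seen directly in the uv domain, where each PSF is the Fourier transform of the gridded sampling weights and a convolution of PSFs becomes a product. The sketch below uses a hypothetical six-cell, one-dimensional uv grid with unit (uniform) weights and ignores normalization. The function name `kernels_uv` and the sampling patterns are assumptions made for this illustration.

```python
import numpy as np

def kernels_uv(sampling, w, t, q):
    """uv-domain kernels: Eq. (34) directly, and Eq. (36) as a product of spectral PSFs.

    sampling : (n_freq, n_uv) array of gridded weights per frequency.
    A convolution of PSFs in the image domain is a product in the uv domain.
    """
    direct = (w[:, None] ** (t + q) * sampling).sum(axis=0)
    matched = ((w[:, None] ** t * sampling).sum(axis=0)
               * (w[:, None] ** q * sampling).sum(axis=0))
    return direct, matched

w = np.array([-0.3, 0.0, 0.3])                      # (nu - nu0) / nu0 for three channels

# Case 1: each uv cell is sampled by at most one frequency, with unit weight.
disjoint = np.array([[1, 1, 0, 0, 0, 0],
                     [0, 0, 1, 1, 0, 0],
                     [0, 0, 0, 0, 1, 1]], dtype=float)
# Case 2: tracks from different frequencies fall on the same uv cells.
overlapping = np.array([[1, 1, 1, 0, 0, 0],
                        [0, 1, 1, 1, 0, 0],
                        [0, 0, 1, 1, 1, 0]], dtype=float)

d1, m1 = kernels_uv(disjoint, w, t=1, q=1)
d2, m2 = kernels_uv(overlapping, w, t=1, q=1)
```

The two kernels coincide in the disjoint case and differ in the overlapping case, because the product in Eq. (36) mixes weights from different frequencies wherever their coverage intersects.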

3.2. Relation to MS-CLEAN

The MS-CLEAN algorithm (Cornwell 2008; Greisen et al. 2009) models the sky as a combination of multiscale flux components with no spectral structure.

A narrow-band (or flat-spectrum) version of MS-MFS can be derived by setting N_t = 1 in the derivations in Sects. 2.4 and 2.5. The normal equations can be written in block-matrix form (for example, for N_s = 2). The peaks of the convolution kernels from the diagonal blocks of the Hessian are a measure of the sensitivity of the instrument to a particular spatial scale.
$$
\left[\begin{array}{cc}
[H_{0,0}] & [H_{0,1}] \\
[H_{1,0}] & [H_{1,1}]
\end{array}\right]
\left[\begin{array}{c}
\vec{I}^{\rm sky,\delta}_{0} \\ \vec{I}^{\rm sky,\delta}_{1}
\end{array}\right]
=
\left[\begin{array}{c}
\vec{I}^{\rm dirty}_{0} \\ \vec{I}^{\rm dirty}_{1}
\end{array}\right]. \tag{38}
$$
In MS-MFS, a diagonal approximation of this Hessian is used to compute the principal solution. This is equivalent to normalizing the residual images (RHS vectors) by the sum of weights for each spatial scale, before searching for peaks.

In both existing forms of MS-CLEAN, this normalization is replaced by a scale bias, an empirical term that de-emphasises large spatial scales. The scale bias b_s = 1 − 0.6 s/s_max used by Cornwell (2008) (where s_max is the width of the largest scale basis function) is a linear approximation of how the inverse of the area under each scale function changes with scale size¹⁴. The algorithm described by Greisen et al. (2009) uses $b_s \approx 1.0/s^{2x}$ where x ∈ {0.2,0.7}, to approximate a normalization by the area under a Gaussian, for the case when images are smoothed by applying a uv-taper that tends to unity for the zero spatial-frequency. Both of these normalization schemes approximate the effect of using a diagonal approximation of [H^peak].

Once we have this understanding, we can see that the full Hessian [H^peak] (and not just a diagonal approximation) can be inverted to get the normalization exactly right, giving an accurate estimate of the total flux at each scale. This becomes useful for sources that contain overlapping flux components of different spatial scales. However, this solution gives correct values only at the locations of the centers of the flux-components, and introduces large errors in the PSF sidelobes. Therefore, for reasons of stability, a diagonal approximation is still a more appropriate choice (demonstrated on simulated EVLA data).

Another difference lies in the minor-cycle updates. The update steps in MS-MFS and the Cornwell MS-CLEAN evaluate the full LHS of the normal equations to update the smoothed residual images and subtract out flux components within the image domain. This allows each minor cycle iteration to search for flux components across all spatial scales. The update step in the Greisen MS-CLEAN ignores the cross-terms, and performs a full set of minor cycle iterations on one scale at a time. A choice between all these methods will depend on trade-offs between the accuracy within each minor cycle iteration, the computational cost per step, and optimized global convergence patterns to control the total number of iterations.

4. Hybrids of narrow-band and continuum techniques

The preceding sections discussed multi-frequency solutions that used data from all measured frequencies together, to take advantage of the combined spatial-frequency coverage. However, there are some situations where single-channel methods used in combination with multi-frequency-synthesis (and no built-in spectral model) will be able to deliver scientifically useful wide-band reconstructions at the continuum sensitivity.

The basic idea of a hybrid wide-band method is to combine the advantages of single-channel imaging (simplicity and non-dependence on any spectral model) with those of continuum imaging (deconvolution with full continuum sensitivity).

1.
Deconvolve each channel separately up to the single-channel sensitivity limit σ_chan. Only sources brighter than σ_chan will be detected and deconvolved.
2.
Remove the contribution of bright (spectrally varying) sources by subtracting out visibilities predicted from the model image cube. At this stage, the peak residual brightness is at the level of the single-channel noise limit σ_chan.
3.
Perform MFS imaging (flat-spectrum assumption) on the continuum residuals to extract flux that lies between σ_chan and σ_cont. According to Conway et al. (1990) (and as discussed later in Sect. 6), errors due to this flat-spectrum assumption become visible only above a dynamic range of ~1000 (for α = −0.7 and a 2:1 bandwidth ratio). Therefore, as long as the sensitivity improvement between a single channel and the full band is less than ~1000, this second deconvolution stage will incur no errors even if the remaining flux has spectral structure. Because the noise level scales as 1/√N_chan when channels of equal sensitivity are combined, this requirement translates to N_chan < 10⁶, which is usually satisfied¹⁵.
4.
Add model images from both steps, and restore the results. For unresolved sources, it may be appropriate to use a clean-beam fitted for the highest frequency, but in general, to not bias spectral information, all channels should be restored using a clean beam fitted to the PSF at the lowest frequency in the range.

The advantages of this approach are its simplicity, and that it can handle wide-band reconstructions with band-limited signals and spectral-lines. The disadvantages are that the angular resolution of the images and spectral information will be restricted to that given by the lowest frequency (a factor of two worse than what is possible for the 2:1 bandwidth available with the EVLA L-band). Also, the single-frequency spatial-frequency coverage may not be sufficient to unambiguously reconstruct all the spatial structure of interest at all frequencies, which in turn will affect spectra measured from these single-channel images. In some cases, additional constraints can be used. Bong et al. (2006) describe spatio-spectral MEM, an entropy based method in which single-channel imaging is done along with a smoothness constraint applied across frequency.

In general, single-channel methods can be used for wide-band imaging mainly to construct an image of the continuum flux. Only if there is sufficient single-frequency uv-coverage to reconstruct an accurate model of the source structure (for example, fields of isolated point sources), may reliable spectral information also be derived from such an approach.

5. MS-MFS imaging results

This section contains three imaging examples that demonstrate the MS-MFS algorithm’s basic capabilities. Section 6 discusses various sources of error and how they manifest themselves, and Sect. 7 demonstrates the limits to which the algorithm can be reasonably pushed.

5.1. EVLA simulation

Data:

wide-band EVLA observations were simulated for a sky brightness distribution consisting of one point source with a spectral index of −2.0 and two overlapping Gaussians with spectral indices of −1.0 and +1.0. The spectral index across the resulting extended source varies smoothly between −1.0 and +1.0, with a spectral turnover in the region of overlap of the two Gaussians, corresponding to a spectral curvature of approximately +0.5. Data were simulated for the EVLA in C-configuration between 1 and 2 GHz, for an 8-h observation with measurements 5 min apart. The goal of this test was to assess the ability of the MS-MFS algorithm to reconstruct both spatial and spectral information accurately enough for astrophysical use.

Fig. 1

MS-MFS imaging results using simulated EVLA data: these images compare truth images (left column) with the results of two wide-band imaging runs; multi-scale (middle column) and point-source (right column). The top three rows represent the first three Taylor-coefficient images, and the fourth and fifth rows show spectral index and spectral curvature respectively.

Imaging results:

two MS-MFS imaging runs were done and the results compared. Figure 1 illustrates the algorithm’s performance by comparing several truth images describing the true sky brightness (left column) and reconstructions from a multi-scale (middle column) and a point-source (right column) version of MS-MFS. The multi-scale version used a flux model in which N_t = 3 and N_s = 4 with scale sizes defined by widths of 0, 6, 18 and 24 pixels, and the point-source version used N_t = 3 and N_s = 1 with one scale function given by the δ-function (to emulate the MF-CLEAN algorithm). From top to bottom, the rows correspond to the intensity image at the reference frequency $I_0 = I_{\nu_0}$, the first-order Taylor-coefficient $I_1 = I_{\alpha} I_{\nu_0}$, the second-order Taylor-coefficient $I_2 = \left(I_{\alpha}(I_{\alpha}-1)/2 + I_{\beta}\right) I_{\nu_0}$, the spectral index $I_{\alpha} = I_1/I_0$, and the spectral curvature $I_{\beta} = (I_2/I_0) - I_{\alpha}(I_{\alpha}-1)/2$.

1.
With a multi-scale multi-frequency flux model (MS-MFS) the spectral index across the extended source was reconstructed to an accuracy of δα < 0.02 with the errors rising at the edges of the source where the signal-to-noise ratio decreases. The spectral curvature across the extended source was estimated to an accuracy of δβ < 0.05 in the central region with the maximum error of δβ ≈ 0.2 in the regions where the curvature signal goes to zero and the source surface brightness is also minimum (the outer edges of the source).
2.
With a multi-frequency point-source model (MF-CLEAN) the accuracy of the spectral index and curvature maps was limited to δα ≈ 0.2,δβ ≈ 0.6. This is because the use of a point source model will break any extended emission into components the size of the resolution element and this leads to deconvolution errors well above the off-source noise level. Error propagation during the computation of spectral index and curvature as ratios of these noisy Taylor-coefficient images leads to high error levels in the result (see Sect. 6.3). This clearly shows the importance of using a multiscale flux model when there is extended emission.
3.
The Gaussian on the left has α = −1.0 and β = 0.0, and N_t = 3 is not sufficient to model the power-law accurately, leading to a value of β ≈ + 0.1 in the truth image as well as in the reconstruction. However, the Gaussian on the right has α = + 1.0 and β = 0.0 (a straight line), and N_t = 3 is more than sufficient to model it, leading to an accurate value of β ≈ 0 in the truth image and MS-MFS reconstruction. The point-source has α = −2.0, and the error on β is proportionally higher. The use of N_t > 3 reduces these errors.

5.2. Multi-frequency VLA observations of Cygnus-A

Objective:

multi-frequency VLA observations of the bright radio galaxy Cygnus A were used to test the MS-MFS algorithm on real data and to test standard calibration methods on wide-band data. Cygnus A is an extremely bright (1000 Jy) radio galaxy with a pair of bright compact hotspots (α = −0.5) located about 1 arcmin away from each other on either side of a very compact flat-spectrum core, and extended radio lobes associated with the hotspots with broad-band synchrotron emission at multiple spatial scales (α = −0.6 ~ − 1.0) (Carilli & Barthel 1996). The best existing images of Cygnus-A and its spectral structure have been from large amounts of multi-configuration narrow-band VLA data (Carilli et al. 1991) designed to measure the spatial structure as completely as possible at two widely separated frequencies (1.4 and 4.8 GHz).

The goal of this test was to use multi-frequency snapshot observations of Cygnus A to evaluate how well the MS-MFS algorithm is able to reconstruct both spatial and spectral structure from measurements in which the single-frequency uv-coverage is insufficient to accurately reconstruct all the spatial structure at that frequency.

Data were taken in April 2009 when the VLA was transitioning to the EVLA. 15 out of 27 antennas had new wideband EVLA feeds, but the correlator was that of the VLA (narrow-band). Wideband data were taken as a series of narrow-band snapshots spread across 8 h and 9 distinct frequencies across the L-Band (30 min per frequency tuning). Flux calibration at each frequency was done via observations of 3C 286. Phase-only calibration was done using an existing narrow-band image of Cygnus A at 1.4 GHz (Carilli et al. 1991) as a model, and further self-calibration was done with the wide-band flux model derived from the MS-MFS algorithm. The final dataset used for imaging consisted of 9 spectral windows each of a width of about 4 MHz and separated by about 100 MHz.

Fig. 2

Cygnus A – total intensity and spectral-index: this figure shows the MS-MFS total intensity map (top), and spectral-index maps obtained via three methods – MS-MFS (second row), a hybrid single-band method (third row), and from high-resolution full-synthesis narrow-band images at 1.4 and 4.8 GHz (bottom). The MS-MFS spectral index map has the angular-resolution of the total intensity map, and agrees with the values in the high-resolution comparison map (α = −0.5 at the hotspots increasing to α ≈ − 1.0 in the halo). However, the hybrid method resulted in a map with a wider range of values (positive and negative) and is at a much lower angular resolution.

These data were imaged using two methods, the MS-MFS algorithm with N_t = 3 and N_s = 10, and a hybrid of single-channel imaging followed by MFS on the residuals. Note that these observations do not have dense single-frequency uv-coverage, and the purpose of applying the hybrid method was to emphasize the errors that can occur if this method is used inappropriately. Due to the small angular size of Cygnus-A, the effect of the L-band primary-beam was ignored in both runs.

Results:

Figure 2 shows the reconstructed total-intensity images (top), the spectral-index map obtained via MS-MFS (second row) and the spectral-index map constructed from single-subband maps (third row). For comparison, the image at the bottom is a spectral-index map constructed from existing narrow-band images at 1.4 and 4.8 GHz, each constructed from a combination of VLA A, B, C and D configuration data (Carilli et al. 1991).

1.
The total intensity image (top row) has a peak brightness of 77 mJy/beam at the hotspots and a peak brightness of about 400 mJy/beam for the fainter extended parts of the halo. Both methods (MS-MFS and hybrid) gave very similar total-intensity images and residual images. The on-source and off-source residuals for the MS-MFS algorithm are 30 mJy and 25 mJy, and with the hybrid algorithm are 50 mJy and 30 mJy respectively.
2.
The spatial structure seen in the MS-MFS spectral index image (second row) is very similar to that seen in the two-point (1.4–4.8 GHz) spectral index image (bottom). This shows that despite having a comparatively small amount of data (20 VLA B-configuration snapshots at 9 frequencies) the use of an algorithm that models the sky brightness distribution appropriately is able to extract the same astrophysical information as traditional methods applied to large amounts of data (full synthesis runs in multiple VLA configurations at two frequencies). The estimated errors on the spectral index map are < 0.1 for the brighter regions of the source (near the hotspots) and ≥ 0.2 for the fainter parts of the lobes and the core. In contrast, the Hybrid spectral index map (third row) clearly shows errors arising due to non-unique solutions at each separate frequency (due to insufficient narrow-band spatial-frequency coverage) as well as smoothing to the angular resolution at the lowest frequency.
3.
The MS-MFS spectral curvature map (not shown here) contains values corresponding to △ α ≈ 0.4 for the brightest regions on the source. Such a change in α appears high, and we did not have sufficient single-frequency data at other bands to verify these values (the next section contains an example where we could verify the curvature maps with other data). Also, the values of curvature changed between imaging-runs with N_t = 3 and N_t = 4, suggesting over-fitting errors due to the low signal-to-noise ratio of the curvature measurement (which could arise from inaccurate bandpass calibration).

5.3. Multi-frequency observations of M 87

A similar observation of M 87 was done with the goal of measuring the 1–2 GHz spectral index in different parts of the inner core-jet-lobe system and outer halo filaments.

M 87 is a bright (200 Jy) radio galaxy located at the center of the Virgo cluster. The spatial distribution of broad-band synchrotron emission from this source consists of a bright central region (spanning a few arcmin) containing a flat-spectrum core, a jet (with known spectral index of −0.55) and two radio lobes with steeper spectra (−0.5 > α > −0.8) (Rottmann et al. 1996; Owen et al. 2000). This central region is surrounded by a large diffuse radio halo (7 to 14 arcmin) with many bright narrow filaments (≈10′′ × 3′).

Multi-frequency VLA data were taken in a manner similar to the observations of Cygnus-A, with a series of 10 snapshots at 16 different frequencies within the sensitivity range of the EVLA L-band receivers. These data were imaged using MS-MFS with N_t = 3 and N_s = 12. The top row of images in Fig. 3 shows the intensity, spectral index and spectral curvature maps of the bright central region at an angular resolution of 3 arcsec (C+B-configuration).

Fig. 3

M 87 core/jet/lobe – intensity, spectral index and curvature: these images show 3-arcsec resolution maps of the central bright region of M 87 (core+jet and inner lobes). The quantities displayed are the intensity at 1.5 GHz (top left), the spectral index (top middle) and the spectral curvature (top right). The spectral index is near zero at the core, varies between − 0.36 and − 0.6 along the jet and out into the lobes. The spectral curvature is on average 0.5 which translates to △ α = 0.2 across L-band. The plot at the bottom compares the resulting average spectrum (solid line) with that formed by imaging each spectral-window separately (dots). The dashed and dotted lines correspond to fixed values of spectral index (–0.43, –0.53, –0.63).

1.
The peak brightness at the center of the final restored intensity image was 15 Jy with an off-source rms of 1.8 mJy and an on-source rms of between 3 and 10 mJy. The spectral index map¹⁶ of the bright central region (at 3 arcsec resolution) shows a near flat-spectrum core with α_LL = −0.25, a jet with α_LL = −0.52 and lobes with −0.6 > α_LL > −0.7. This bright central region had sufficient (>100) signal-to-noise to be able to detect spectral curvature, and no obvious deconvolution errors. However, the error bars on the spectral curvature are at the same level as the measurement itself, and a reliable estimate can only be obtained as an average over this entire bright region. The average curvature is measured to be β_LL = −0.5, which corresponds to a change in α across L-band of $\triangle \alpha = \beta \frac{\triangle\nu}{\nu_0} \approx -0.2$.
2.
To verify consistency of this spectral-curvature value with the data, each of the 16 spectral-windows was imaged separately, and restored with the same clean-beam. The plot at the bottom of Fig. 3 shows this integrated flux spectrum (log (I) vs. log (ν/ν₀)) as round dots, along with the average spectrum calculated by MS-MFS (curved line), and straight dashed and dotted lines that correspond to constant spectral indices of − 0.43, − 0.53 and − 0.63. These plots show that a change in α of about 0.2 is consistent with the data. Note that the scatter seen in the single-spectral-window data points is at the 1% level of the source flux. This illustrates the accuracy at which bandpass calibration must be done in order to measure a physically plausible spectral-curvature signal across a 2:1 bandwidth.
3.
These numbers were further compared with two-point spectral indices computed between 327 MHz (P-band), 1.4 GHz (L-band), and 4.8 GHz (C-band) from existing images (Owen et al. 2000; Owen F., priv. comm.). Across the bright central region, two-point spectral indices are −0.36 > α_PL > −0.45 and −0.5 > α_LC > −0.7. The measured values from our experiment (−0.4 > α_LL > −0.7 and △ α ≈ 0.2 across L-band) are consistent with these independent calculations.
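The two quantities used in this comparison, a two-point spectral index between widely separated bands and the change in α implied by a curvature β, can be sketched as follows. The function names are hypothetical. The fractional bandwidth of 0.4 used in the example is an assumption, chosen only because it reproduces the quoted △α ≈ −0.2 for β = −0.5; the text does not state the exact △ν that was used.

```python
import math

def two_point_spectral_index(s1, nu1, s2, nu2):
    """Spectral index alpha of the power law S ~ nu**alpha through two points."""
    return math.log(s1 / s2) / math.log(nu1 / nu2)

def delta_alpha(beta, frac_bandwidth):
    """Change in spectral index across the band, using the relation
    delta_alpha = beta * (delta_nu / nu_0) quoted in the text."""
    return beta * frac_bandwidth

# Hypothetical fluxes following S ~ nu**-0.55 at 1.4 and 4.8 GHz
s_l = 1.4 ** -0.55
s_c = 4.8 ** -0.55
print(two_point_spectral_index(s_l, 1.4, s_c, 4.8))   # recovers -0.55 up to rounding

# Assumed fractional bandwidth delta_nu / nu_0 = 0.4
print(delta_alpha(-0.5, 0.4))
```

A two-point index averages over any curvature between the two bands, so agreement with α_LL is expected only at the level of the quoted ranges.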

6. Sources of error

There are various sources of error that can affect the imaging process and leave artifacts both on-source and off-source. As with any image-reconstruction algorithm, signs of these errors must be looked for in the output images before astrophysical interpretation.

6.1. On-source polynomial-fit errors

The errors on the polynomial coefficients and quantities derived from them will depend on the number of measurements of the spectrum, their distribution across a frequency range, the signal-to-noise ratio of the pattern being fitted, and the order of the polynomial used in the fit. These errors will affect regions in the image both on-source and off-source, and the resulting error patterns and their magnitudes will depend on the available uv-coverage, and the choice of reference frequency. Although the physical parameters I_υ₀,I_α and I_β can be obtained from the first three coefficients of a Taylor expansion of a power-law with varying index (Eqs. (26) to (28)), a higher order polynomial may be required during the fitting process to improve the accuracy of the first three coefficients¹⁷.

Fig. 4

Peak residuals and errors for MFS with different values of N_t: these plots show the measured peak residuals (top left) and the errors on I_υ₀ (top right), I_α (bottom left), and I_β (bottom right) when a point-source of flux 1.0 Jy and α = −1.0 was imaged using Taylor polynomials of different orders (N_t = 1−7) and a linear spectral basis. The x-axis of all these plots shows the value of N_t used for the simulation, and the plots for α and β begin from N_t = 2 and N_t = 3 respectively. An example of how to read these plots: for a 2:1 bandwidth ratio, a source with spectral index = –1.0 and N_t = 4, the achievable dynamic range (measured as the ratio of the peak flux to the peak residual near the source) is about 10⁵, the error on the peak flux at the reference frequency is 1 part in 10³, and the absolute errors on α and β are 10⁻² and 10⁻¹ respectively.

Figure 4 summarizes the errors obtained when the order of the polynomial chosen for imaging is not sufficient to model the power-law spectrum of the source. Data for 8 h synthesis runs with EVLA uv-coverages were simulated for 5 different frequency ranges around 2.0 GHz. The sky brightness distribution used for the simulation was one point source whose flux is 1.0 Jy and spectral index is –1.0 with no spectral curvature. The bandwidth ratios¹⁸ for these 5 datasets were 100% (3:1), 66% (2:1), 50% (1.67:1), 25% (1.28:1) and 10% (1.1:1).
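The percentage and ratio forms of the bandwidth quoted for the five datasets are related by simple arithmetic. The sketch below assumes that the percentage is (ν_max − ν_min)/ν_mid with ν_mid the band centre; this definition is an assumption made here because it reproduces the listed pairs, and the function names are hypothetical.

```python
def fractional_bandwidth(ratio):
    """(nu_max - nu_min) / nu_mid for a band with nu_max : nu_min = ratio : 1.

    nu_mid is taken to be the band centre (assumed definition).
    """
    return 2.0 * (ratio - 1.0) / (ratio + 1.0)

def band_edges(nu_mid, ratio):
    """Lower and upper band edges for a band centred on nu_mid."""
    nu_min = 2.0 * nu_mid / (1.0 + ratio)
    return nu_min, ratio * nu_min

for ratio in (3.0, 2.0, 1.67, 1.28, 1.1):
    lo, hi = band_edges(2.0, ratio)
    pct = 100.0 * fractional_bandwidth(ratio)
    print(f"{ratio}:1  ->  {pct:.1f}%  ({lo:.2f} to {hi:.2f} GHz)")
```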

MS-MFS was repeated on all these datasets with N_t = 1 to N_t = 7, and one scale N_s = 1, a δ-function. All these datasets were imaged using a maximum of 10 iterations, a loop-gain of 1.0, natural weighting and a flux threshold of 1.0 µJy. No noise was added to these simulations (in order to isolate and measure numerical errors due to the spectral fits). The top left panel in Fig. 4 shows the peak residuals, measured over the entire 0th order residual image. The other three panels show on-source errors for I_υ₀ (top right), I_α (bottom left) and I_β (bottom right). Errors on I_υ₀, I_α and I_β were computed at the location of the point source by taking differences with the ideal values of I_υ₀ = 1.0, α = −1.0 and β = 0.0.

One noticeable trend from these plots is that with sufficient signal-to-noise in the measurements, all errors decrease exponentially (linearly in log-space) as a function of increasing order of the polynomial, and as a function of decreasing total bandwidth. As expected, for very narrow bandwidths, the use of high-order polynomials increases the error. Also, when the order of the polynomial used is too low, the peak residuals are much smaller than the on-source error incurred on I_υ₀, I_α and I_β. These trends are based on one simple example, and represent the best-case scenario in which all sources can be described as point sources. For extended emission, there can be additional errors due to deconvolution artifacts. In the case of very noisy spectra, or at a late stage of image reconstruction when the signal-to-noise ratio in the residual images is low, errors can arise from attempting to use too many terms in the polynomial fit.
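The trend of decreasing error with increasing polynomial order can be illustrated with a spectrum-only experiment. The sketch below is not the simulation behind Fig. 4: it performs an ordinary least-squares fit of a Taylor polynomial in (ν − ν₀)/ν₀ to a noise-free power law sampled at evenly spaced channels, with no uv-coverage, imaging or deconvolution, so its numbers are not expected to match the figure. The function name, the number of channels and the choice of the band centre as reference frequency are assumptions.

```python
import numpy as np
from numpy.polynomial import polynomial as P

def taylor_fit_errors(n_terms, band_ratio=2.0, alpha=-1.0, nu_mid=2.0, n_chan=64):
    """Fit an (n_terms - 1)-order polynomial to a pure power law and return
    the absolute errors on I_nu0, alpha and beta (None where undefined)."""
    nu_min = 2.0 * nu_mid / (1.0 + band_ratio)
    nu = np.linspace(nu_min, band_ratio * nu_min, n_chan)
    x = (nu - nu_mid) / nu_mid                 # assumed spectral basis variable
    spectrum = (nu / nu_mid) ** alpha          # I_nu0 = 1.0, beta = 0.0
    c = P.polyfit(x, spectrum, n_terms - 1)    # coefficients, lowest order first

    err_i0 = abs(c[0] - 1.0)
    err_alpha = None
    err_beta = None
    if n_terms >= 2:
        alpha_fit = c[1] / c[0]
        err_alpha = abs(alpha_fit - alpha)
    if n_terms >= 3:
        beta_fit = c[2] / c[0] - alpha_fit * (alpha_fit - 1.0) / 2.0
        err_beta = abs(beta_fit)               # true beta is zero
    return err_i0, err_alpha, err_beta

for n_t in range(1, 8):
    print(n_t, taylor_fit_errors(n_t))
```

With channels placed symmetrically about the reference frequency, the even-order and odd-order coefficients decouple, so in this toy setup the error on I_υ₀ improves in steps of two orders rather than at every increment of N_t.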

6.2. Off-source errors and dynamic-range limits

Consider the errors on the continuum image when spectral structure is ignored during MFS imaging (N_t = 1; flat-spectrum assumption). Spectral structure will masquerade as spurious spatial structure, leading to error patterns that resemble the instrument’s response to the first-order term in the Taylor-series expansion. A rough rule of thumb for EVLA uv-coverages is that for a point source with spectral index α = −1.0 measured between 1 and 2 GHz, the peak error obtained if the spectrum is ignored is at a dynamic range of < 10³. In other words, if the dynamic-range allowed by the data (peak brightness/thermal noise) is 800 (for example), there will be no visible artifacts if spectral structure up to α = −1.0 is ignored. These numbers are consistent with error estimates derived in Conway et al. (1990) that predict errors at the level of Iα/X where X = O(500) is a factor that depends on the uv-coverage of the instrument and the choice of reference frequency. Therefore, if the only goal is to obtain a continuum image over a narrow field-of-view, it may be possible to achieve the maximum-possible dynamic range by dividing out an average spectral index (one single number over the entire sky) from the visibilities before imaging them via MFS with a flat-spectrum assumption (Conway et al. 1990).

Fig. 5

Stokes I images of the 3C 286 field, using MS-MFS with N_t = 1 (top left), N_t = 2 (top right), N_t = 3 (bottom left), N_t = 4 (bottom right). The peak flux of 3C 286 is 14 Jy/beam. The axes labels are in units of arc-minutes from the pointing center and all images are shown with the same grey-scales. The residual rms near 3C 286 for these four images are 9 mJy, 1 mJy, 200 µJy and 140 µJy, and the rms measured 1 degree away from 3C 286 are 1 mJy, 200 µJy, 85 µJy and 80 µJy. The thermal-noise limit for this dataset was 70 µJy.

At high dynamic-ranges, off-source errors trace the spectral response functions for higher-order Taylor terms. As long as they are visible above the noise, they can be eliminated with a higher-order polynomial fit. Figure 5 shows a set of four images of the 3C 286 field made from wideband EVLA data taken in October 2010. These data consist of four EVLA snapshots spread across 90 min, and cover a frequency range of 1.02 GHz to 2.1 GHz (800 MHz usable bandwidth after accounting for radio-frequency-interference). 3C 286 is 14.4 Jy/beam at 1.5 GHz, with a spectral index of –0.47. The four panels correspond to MS-MFS imaging runs with N_t = 1 (top left), N_t = 2 (top right), N_t = 3 (bottom left), N_t = 4 (bottom right), all with N_s = 1 (a δ-function). All images are displayed with the same grey-scale levels, and show the levels at which the error patterns appear. The dynamic range (calculated as the peak brightness to the peak residual measured near the brightest peak) ranges from 1.6 × 10³ when spectral structure is ignored (N_t = 1), to 1.1 × 10⁵ with N_t = 4.

6.3. Propagation of multi-scale errors

Deconvolution errors contribute to the on-source error in the Taylor coefficient images, and these errors propagate to the spectral index map which is computed as a ratio of two coefficient images (I_α = I₁/I₀). Errors that result when a point-source flux model is used for extended emission can increase the error bars on the spectral index and curvature by an order of magnitude (as demonstrated by the example shown in Fig. 1). These errors are approximately given by

$\triangle I_{\alpha} = I_{\alpha} \sqrt{ \left( \frac{\triangle I_0}{I_0} \right)^2 + \left( \frac{\triangle I_1}{I_1} \right)^2 }$ (39)

where △ I₀ and △ I₁ are the absolute errors measured in the first two Taylor-coefficient images. For the example shown in Fig. 1, the errors on the coefficient images were measured as the rms of the deviation from the truth images within the region of high signal-to-noise. These errors result in a prediction of △ I_α ≈ 0.05 for the multi-scale version, and △ I_α ≈ 0.7 for the point-source version, which approximately matches the rms of the deviation of the output α image from the truth α image.
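As a worked illustration of Eq. (39), the sketch below propagates fractional errors on the first two Taylor coefficients into an error on the spectral index. The function name and the input values are hypothetical and are not measurements from Fig. 1; the absolute value of I_α is taken so that the returned error is positive for negative spectral indices.

```python
import math

def spectral_index_error(i_alpha, i0, d_i0, i1, d_i1):
    """Error on I_alpha = I1 / I0 from Eq. (39):
    dI_alpha = |I_alpha| * sqrt((dI0 / I0)**2 + (dI1 / I1)**2)
    """
    rel0 = d_i0 / i0
    rel1 = d_i1 / i1
    return abs(i_alpha) * math.sqrt(rel0 ** 2 + rel1 ** 2)

# Hypothetical pixel: I0 = 1.0 Jy, alpha = -1.0 so I1 = -1.0,
# with 3% and 4% errors on the two coefficients.
print(spectral_index_error(-1.0, 1.0, 0.03, -1.0, 0.04))   # about 0.05
```

The expression treats the errors on I₀ and I₁ as independent, which is an approximation since both images come from the same deconvolution.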

Fig. 6

Moderately resolved sources – single-channel images: these figures show the 6 single-channel images generated from simulated EVLA data between 1 and 4 GHz in the EVLA D-configuration. The sky brightness consists of two point sources, each of flux 1.0 Jy at a reference frequency of 2.5 GHz and separated by 18 arcsec. Their spectral indices are +1.0 (top source) and −1.0 (bottom source). The angular resolution at 1 GHz is 60 arcsec, and at 4 GHz is 15 arcsec, and the circles in the lower left corner of each image show the resolution element decreasing in size as frequency increases.

6.4. Frequency dependence of the primary beam

When imaging wide fields-of-view, sources away from the pointing center will be attenuated by the value of the primary beam at each frequency. The EVLA primary-beams across a 2:1 bandwidth contribute an extra spectral-index of –1.4 at the half-power point (measured from simulated beams, as well as a Gaussian approximation of the main lobe of the primary beam (Sault & Wieringa 1994)). Therefore, even if a source has a flat spectrum, this artificial spectral index can result in imaging artifacts at the levels described in Sect. 6 (i.e. a dynamic range limit of ~10³ for a flat spectrum source at the HPBW, due only to the spectral variation of the primary beam). This effect can be corrected as a post-deconvolution step, or by using wide-field imaging algorithms along with MS-MFS. So far, accurate post-deconvolution corrections have been demonstrated out to the 10% level of the highest-frequency primary-beam.
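The size of the artificial spectral index can be checked for the Gaussian approximation mentioned above. The sketch below assumes a Gaussian main lobe whose FWHM scales as 1/ν, for which the logarithmic derivative d ln(PB)/d ln(ν) equals −8 ln 2 (θ/FWHM)², that is −2 ln 2 ≈ −1.39 at the half-power point of the frequency at which it is evaluated. The function names and the beam width used in the example are hypothetical.

```python
import math

def gaussian_pb(theta, fwhm_ref, nu, nu_ref):
    """Gaussian primary-beam gain at offset theta, with FWHM scaling as 1/nu."""
    fwhm = fwhm_ref * nu_ref / nu
    return math.exp(-4.0 * math.log(2.0) * (theta / fwhm) ** 2)

def pb_spectral_index(theta_over_fwhm):
    """Analytic d ln(PB) / d ln(nu) for the Gaussian beam above."""
    return -8.0 * math.log(2.0) * theta_over_fwhm ** 2

def pb_spectral_index_numeric(theta, fwhm_ref, nu_ref, eps=1e-6):
    """Same quantity from a central finite difference in log-frequency."""
    hi = gaussian_pb(theta, fwhm_ref, nu_ref * (1.0 + eps), nu_ref)
    lo = gaussian_pb(theta, fwhm_ref, nu_ref * (1.0 - eps), nu_ref)
    return (math.log(hi) - math.log(lo)) / (math.log(1.0 + eps) - math.log(1.0 - eps))

# Half-power point: theta = FWHM / 2 (a 30-arcmin FWHM at 1.5 GHz is a made-up value)
print(pb_spectral_index(0.5))
print(pb_spectral_index_numeric(15.0, 30.0, 1.5))
```

The value depends only on θ/FWHM, so it grows quadratically with distance from the pointing center.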

Fig. 7

Moderately resolved sources – MS-MFS images: these images show the intensity at 2.5 GHz (left), the spectral index showing a gradient between − 1 and +1 (middle) and the spectral curvature which peaks between the two sources and falls off on either side (right). The angular resolution of these images is 15 arcsec, corresponding to the highest frequency in the data (and Fig. 6).

7. Imaging performance (non-standard conditions)

This section describes a set of simulations that test the limits of MS-MFS for different types of source structure. Three cases are studied: sources unresolved at the lowest sampled frequency and resolved at the upper end, sources whose visibility functions lie mostly within the central unsampled region of the uv-plane near the origin, and band-limited signals.

7.1. Moderately resolved sources

Consider a source with broad-band continuum emission and spatial structure that is unresolved at the low-frequency end of the band and resolved at the high-frequency end. Traditionally, spectral information would be available only at the angular resolution offered by the lowest observed frequency. With MS-MFS, the intensity distribution as well as the spectral index of such emission can be imaged at the angular resolution allowed by the highest frequency in the band. This is because compact emission has a signature all across the spatial-frequency plane and its spectrum is well sampled by the measurements. The highest frequencies constrain the spatial structure, and the flux model (in which a spectrum is associated with each flux component) naturally fits a spectrum at the angular resolution at which the spatial structure is modeled. Such a reconstruction is model-dependent and will be accurate only if the spectrum at the smallest measured scales can really be represented as a polynomial.

EVLA simulation:

wide-band EVLA data were simulated for the D-configuration across a frequency range of 3.0 GHz with 6 frequency channels between 1 and 4 GHz (600 MHz apart). This wide frequency range was chosen to emphasize the difference in angular resolution at the two ends of the band (60 arcsec at 1 GHz, and 15 arcsec at 4.0 GHz). The sky brightness chosen for this test consists of a pair of point sources separated by a distance of 18 arcsec (about one resolution element at the highest frequency), making this a moderately resolved source. These point sources were given different spectral indices ( + 1.0 for the top source and − 1.0 for the bottom one). Figure 6 shows the six single-channel images of this source. At the low frequency end, the source is almost indistinguishable from a single flux component centered at the location of the bottom source whose flux peaks at the low-frequency end. The source structure becomes apparent only in the higher frequencies where the top source (with a positive spectral index) is brighter.
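The numbers in this setup follow from the 1/ν scaling of angular resolution for a fixed array configuration. The sketch below, with hypothetical function names, lists the six channel frequencies and expresses the 18 arcsec source separation in units of the resolution element at each channel, taking the quoted 60 arcsec at 1 GHz as the reference.

```python
def channel_frequencies(nu_min, nu_max, n_chan):
    """Evenly spaced channel frequencies including both band edges."""
    step = (nu_max - nu_min) / (n_chan - 1)
    return [nu_min + k * step for k in range(n_chan)]

def scaled_resolution(res_ref_arcsec, nu_ref, nu):
    """Angular resolution at nu for a fixed array, scaling as 1 / nu."""
    return res_ref_arcsec * nu_ref / nu

separation_arcsec = 18.0
for nu in channel_frequencies(1.0, 4.0, 6):
    res = scaled_resolution(60.0, 1.0, nu)
    print(f"{nu:.1f} GHz: resolution {res:.1f} arcsec, "
          f"separation = {separation_arcsec / res:.2f} beams")
```

The separation is 0.3 of a resolution element at 1 GHz and 1.2 at 4 GHz, which is why the pair is described as moderately resolved.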

MS-MFS imaging results:

these data were imaged using the MS-MFS algorithm with N_t = 3 and N_s = 1 with only one spatial scale (a δ-function), and Fig. 7 shows the results. The intensity distribution, spectral index and curvature of this source were recovered at the angular resolution allowed by the 4.0 GHz samples (15 arcsec). This demonstrates that an appropriate flux model can constrain the solution accurately at the angular resolution given by the highest sampled frequency.

7.2. Emission at large spatial scales

Consider a very large (extended) flat-spectrum source whose visibility function falls mainly within the central hole in the uv-coverage. The size of this central hole increases with observing frequency. The minimum spatial-frequency sampled per channel will measure a decreasing peak flux level as frequency increases. Since the reconstruction below the minimum spatial-frequency involves an extrapolation of the measurements and is un-constrained by the data, these decreasing peak visibility levels can be mistakenly interpreted as a source whose amplitude itself is decreasing with frequency (a less-extended source with a steep spectrum). Usually, a physically-realistic flux model suffices as a constraint, but with the MS-MFS model (polynomial spectra associated with 2D Gaussian-like components), a large flat-spectrum source and a smaller steep-spectrum source are both allowed and considered equally probable. This creates an ambiguity between the reconstructed scale and spectrum that cannot always be resolved directly from the data, and requires additional information (a low-frequency narrow-band image at the reference frequency to constrain the spatial structure, or low-resolution spectral information).

Fig. 8

Very large spatial scales – intensity, spectral index, residuals: these images show the intensity distribution (left), spectral index (middle) and the residuals (right) for two imaging runs. The top row shows results without short-spacing information, and shows a false negative α ≈ −0.8 for the extended emission. The bottom row shows results with short-spacing constraints, in which the extended source has been reconstructed correctly.

EVLA simulation:

wide-band EVLA data were simulated for the D-configuration across a frequency range of 3.0 GHz centred at 2.5 GHz (6 frequency channels located 600 MHz apart between 1.0 and 4.0 GHz). The size of the central hole in the uv-coverage was increased by flagging all baselines shorter than 100 m, and the wide frequency range was chosen to emphasize the difference between the largest spatial scale measured at each frequency (0.3 kλ or 10.3 arcmin at 1.0 GHz, and 1.3 kλ or 2.5 arcmin at 4.0 GHz). The sky brightness chosen for this test consists of one large flat-spectrum (α = 0.0) 2D Gaussian whose FWHM is 2.0 arcmin (corresponding to 1.6 kλ at the reference frequency of 2.5 GHz), and one steep-spectrum point-source (α = −1.0) located on top of this extended source at 30 arcsec away from its peak.
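The spatial frequencies and angular scales quoted for the 100 m baseline cut can be reproduced with a short conversion. The sketch below takes the angular scale as the reciprocal of the spatial frequency in wavelengths, which is an assumed convention that approximately reproduces the quoted values; the function names are hypothetical.

```python
import math

C_M_PER_S = 299_792_458.0   # speed of light

def baseline_in_wavelengths(length_m, freq_hz):
    """Projected baseline length expressed in wavelengths."""
    return length_m * freq_hz / C_M_PER_S

def angular_scale_arcmin(uv_wavelengths):
    """Angular scale 1/u (radians) converted to arcminutes."""
    return math.degrees(1.0 / uv_wavelengths) * 60.0

for freq_ghz in (1.0, 4.0):
    u = baseline_in_wavelengths(100.0, freq_ghz * 1e9)
    print(f"{freq_ghz} GHz: {u / 1e3:.2f} kilo-wavelengths, "
          f"{angular_scale_arcmin(u):.1f} arcmin")
```

This gives about 0.33 kλ and 10.3 arcmin at 1.0 GHz, and about 1.33 kλ and 2.6 arcmin at 4.0 GHz, close to the rounded figures in the text.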

Fig. 9

Very large spatial scales – visibility plots: these plots show the observed (left) and reconstructed (right) visibility functions for a simulation in which a large extended flat-spectrum source is observed with an interferometer with a large central hole in its uv-coverage. The colours/shades in these plots represent 6 frequency channels spread between 1 and 4 GHz. The top row of plots shows that when no short-spacing information was used, these data can be mistakenly fit using a less-extended source with a steep spectrum. The bottom row shows that the inclusion of short-spacing information is sufficient to reconstruct the sky brightness distribution correctly.

MS-MFS imaging results:

these data were imaged using the MS-MFS algorithm with N_t = 3 and N_s = 3 with scale sizes given by [0, 10, 30] pixels. Two imaging runs were performed with these parameters, without and with short-spacing information. Figure 8 shows images of the intensity (left), spectral index (middle) and residuals (right) for these two runs, and Fig. 9 shows the visibility amplitudes present in the simulated data (left column) as well as in the reconstructed model (right column) at each of the 6 frequencies. In both figures, the top and bottom rows correspond to runs without and with short-spacing information respectively and each pair (in Fig. 8) are displayed at the same intensity scale.

1.
The first imaging run used only baselines longer than 100 m to simulate a large central hole in the uv-coverage. The spectrum of the point-source was correctly estimated as − 1.0, but the extended source acquired a false steep-spectrum (α ≈ −0.8). The algorithm was able to reconstruct the correct flux and spectrum of the extended source only if the multi-scale basis functions were carefully chosen to match the known scale size (i.e. a stronger a priori constraint). This shows that without additional constraints, it is not always possible to distinguish a large flat-spectrum source from a slightly less-extended steep-spectrum source.
2.
A second imaging run was performed including additional information in the form of short-spacing constraints, added in by retaining a small number of very short-baseline measurements at each frequency (baselines shorter than 25 m). The visibility plots and imaging results with this dataset show that the short-spacing flux estimates were sufficient to bias the solution towards the correct solution (flat-spectrum extended source).
3.
The image residuals are at the same level in both runs, demonstrating that in the absence of additional short-spacing information, both flux models are equally poorly constrained by the data themselves. One way to avoid this problem altogether (but lose some information) is to flag all spatial-frequencies smaller than u_min at ν_max and not attempt to reconstruct any spatial scales larger than what ν_max allows.
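The origin of this ambiguity can be seen by evaluating the visibility of the flat-spectrum Gaussian at the shortest retained baseline in each channel. The sketch below uses the analytic Fourier transform of a circular Gaussian with the 2.0 arcmin FWHM and 100 m cut from the simulation; the total flux of 1.0 Jy and the function names are assumptions. It is not the MS-MFS reconstruction and does not reproduce the α ≈ −0.8 obtained there; it only shows that the largest measured amplitude falls with frequency even though the source spectrum is flat.

```python
import math

C_M_PER_S = 299_792_458.0

def gaussian_visibility(flux_jy, fwhm_arcmin, uv_wavelengths):
    """Visibility amplitude of a circular Gaussian of given FWHM at uv-distance u."""
    theta = math.radians(fwhm_arcmin / 60.0)
    return flux_jy * math.exp(-(math.pi * theta * uv_wavelengths) ** 2
                              / (4.0 * math.log(2.0)))

def shortest_baseline_amplitudes(freqs_ghz, min_baseline_m=100.0,
                                 flux_jy=1.0, fwhm_arcmin=2.0):
    """Amplitude seen on the shortest baseline in each channel (flat spectrum)."""
    amps = []
    for f in freqs_ghz:
        u_min = min_baseline_m * f * 1e9 / C_M_PER_S
        amps.append(gaussian_visibility(flux_jy, fwhm_arcmin, u_min))
    return amps

def apparent_index(freqs_ghz, amps):
    """Two-point 'spectral index' of the peak measured amplitude across the band."""
    return math.log(amps[-1] / amps[0]) / math.log(freqs_ghz[-1] / freqs_ghz[0])

freqs = [1.0, 1.6, 2.2, 2.8, 3.4, 4.0]
amps = shortest_baseline_amplitudes(freqs)
print([round(a, 3) for a in amps])
print(apparent_index(freqs, amps))   # negative although the true alpha is 0.0
```

Retaining a few baselines shorter than 25 m, as in the second run, samples the Gaussian where its amplitude is nearly the same in every channel, which is the information needed to break the degeneracy.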

7.3. Band-limited signals

Consider a source of radiation where frequency traces different physical structures in the source (as opposed to a fixed structure with each point emitting broad-band radiation). For such sources, emission may be detected in only part of the sampled frequency range and is for all practical purposes a band-limited signal.

Since the MS-MFS algorithm uses a polynomial to model the spectrum of the source (and is not restricted to a power-law spectrum) it should be able to reconstruct such structure as long as it varies smoothly. However, for a band-limited signal, the angular resolution at which the structure can be mapped will be limited to the resolution of the highest frequency at which the signal is detected (and not the highest resolution allowed by the measurements).

EVLA D-configuration data were simulated for the band-limited radiation observed with synchrotron emission from solar prominences where different frequencies probe different depths in the solar atmosphere. The structures are generally arch-like with lower frequencies sampling the top of the loop and higher frequencies sampling the legs. The MS-MFS algorithm was run on these simulated data, using N_t = 5 to fit a 4th-order polynomial to the source spectrum (to accommodate its nearly band-limited nature) and N_s = 3 with scales given by [0, 10, 30] pixels. The run was terminated after 200 iterations.

Figure 10 shows a comparison of the true and reconstructed structure at three different frequencies (1.2 GHz (left), 1.8 GHz (middle) and 2.6 GHz (right)). The top row shows the true structure, and the bottom row is the reconstruction. The 3D structure is mostly recovered, with the largest errors being in the central region where the signal spans the shortest bandwidth. The point source on the right edge of the loop was reconstructed at an angular resolution slightly lower than that of the highest sampled frequency and corresponds to the highest frequency at which this spot is brighter than the background emission. Figure 11 compares the true and reconstructed spectra for three positions in the source (left leg (left), arch of the loop (middle), and right leg (right)). These spectra show the accuracy to which a 4th-order polynomial with MS-MFS was able to reconstruct the structure.

Spectral-lines are an extreme case of a band-limited signal, and the use of MS-MFS imaging is restricted to obtaining a wide-band model of the continuum flux from line-free channels, to be subtracted out of the data before spectral-line strengths are studied.

Fig. 10

Band-limited signals – multi-frequency images: these images show a comparison between the true sky brightness (top row) and the brightness reconstructed using the MS-MFS algorithm (bottom row) at a set of three frequencies (1.0, 1.8, and 2.6 GHz from left to right).

Fig. 11

Band-limited signals – spectra across the source: these plots show the true (down-arrows) and reconstructed (up-arrows) spectra at different locations for the example discussed in this section (shown in Fig. 10). The left column corresponds to the left end of the loop at the location of the leg and shows smooth structure stretching almost all across the band. The middle column corresponds to the middle of the source where the only structure in the line-of-sight is the upper part of the loop (emission at a small fraction of the band). The right column shows spectra for the brightest point on the right end of the loop.

8. Discussion

The introduction of broad-band receivers into radio interferometry has opened up new opportunities for the study of wide-band continuum emission from a vast range of astrophysical objects. With imaging algorithms that account for the frequency dependence of the incoming radiation as well as of the instrument, we can minimize imaging artifacts, achieve continuum sensitivity and reconstruct spatial and spectral structure at the angular-resolution allowed by the highest observed frequency.

The MS-MFS algorithm models the wide-band sky brightness distribution as a collection of multi-scale flux components whose amplitudes follow a Taylor-polynomial in frequency. The data products are a set of Taylor-coefficient images, which can either be interpreted in terms of continuum intensity, spectral index and curvature, or used to evaluate a spectral cube, or serve as a wide-band model for self-calibration or continuum-subtraction. For wide-field imaging, multiple pointing-centers (mosaicing) and the w-term are accounted for during image-formation, and the effect of the primary beam and its frequency-dependence is approximately corrected in a post-deconvolution step.
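As a minimal sketch of how a set of Taylor-coefficient images could be used to evaluate a spectral cube, the snippet below assumes a linear spectral basis of the form ((ν − ν₀)/ν₀)^t. The function name `evaluate_cube`, the array shapes, and the toy coefficient values are illustrative assumptions and are not taken from any released implementation.

```python
import numpy as np

def evaluate_cube(taylor_images, freqs_ghz, ref_freq_ghz):
    """Evaluate I(nu) = sum_t I_t * ((nu - nu0) / nu0)**t for every channel.

    taylor_images : array of shape (N_t, ny, nx), hypothetical coefficient images
    freqs_ghz     : 1-D sequence of channel frequencies
    ref_freq_ghz  : reference frequency nu0
    Returns an array of shape (n_chan, ny, nx).
    """
    coeffs = np.asarray(taylor_images, dtype=float)
    x = (np.asarray(freqs_ghz, dtype=float) - ref_freq_ghz) / ref_freq_ghz
    # Powers of the fractional frequency offset, shape (n_chan, N_t)
    basis = x[:, None] ** np.arange(coeffs.shape[0])[None, :]
    # Contract over the Taylor-term axis
    return np.tensordot(basis, coeffs, axes=([1], [0]))

# Toy example: a single pixel with I_0 = 1.0, I_1 = -0.7, I_2 = 0.3
taylor = np.array([1.0, -0.7, 0.3]).reshape(3, 1, 1)
cube = evaluate_cube(taylor, [1.0, 1.5, 2.0], ref_freq_ghz=1.5)
```

At the reference frequency the evaluated cube reduces to the zeroth-order coefficient image, which is a convenient sanity check for this kind of model.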

For point sources for which standard imaging has a dynamic-range limit of a few thousand, this algorithm achieves dynamic ranges >10⁵ on test observations with the EVLA, and >10⁶ on noise-free simulations. For sources with smooth continuum spectra, it is able to reconstruct spectral information at the angular resolution allowed by the combined multi-frequency uv-coverage. However, if the visibility functions of very large-scale emission lie mostly within the central unsampled region of the uv-plane, the algorithm requires a priori knowledge about the spectral structure at those scales. For high SNR data, error-bars on the spectral-index images range from <0.02 when multi-scale basis functions are chosen appropriately, up to >0.2 in the extreme case of modeling smooth extended emission with a set of δ-functions. In practice, on-source errors of Δα ≈ 0.1 have been achieved. For low SNR extended structure (<10), errors due to overfitting can dominate when a high-order polynomial is used.

There are several directions in which wide-band imaging techniques need to be extended and improved. The imaging errors in MS-MFS are currently dominated by the multi-scale aspect of the algorithm, and methods that do adaptive fits (Bhatnagar & Cornwell 2004) and use more a priori information about large-scale spectra may be more appropriate. Methods that adaptively find the optimal number of parameters to operate with would help in error-control. Implementations of algorithms such as MS-CLEAN and MS-MFS are inefficient in memory-use, and other approaches may be required for large image sizes. For wide-field imaging, the time variation of the antenna primary beams must be taken into account during imaging, and wide-band methods must be combined with algorithms for direction-dependent corrections (Bhatnagar et al. 2008; Smirnov 2011). For full-Stokes wide-band imaging, where a Taylor-polynomial in frequency is not the most appropriate basis function to model Stokes Q, U, V emission, wide-band imaging with other flux models must be tried.

¹

The integration of direction-dependent correction algorithms such as AW-Projection with MS-MFS will be discussed in a subsequent paper.

²

The uv-distance is defined as $\sqrt{u^{2} + v^{2}}$ and is the radial distance of the spatial-frequency measured by the baseline from the origin of the uv-plane, in units of wavelength λ.

³

In this paper, superscripts for vectors and matrices indicate type (model, sky, observed, dirty, residual, etc.), and subscripts in italics indicate enumeration indices (t,q for Taylor-term, s,p for spatial scale, ν for frequency channel). Non-italic subscripts indicate specific values of the enumerated indices (for example, I_ν₀, I₀ or I_α).

⁴

Wideband imaging algorithms described in Conway et al. (1990) and Sault & Wieringa (1994) use a fixed spectral index across the band, and handle slight curvature by performing multiple rounds of imaging after removing the dominant or average α at each stage. They also suggest using higher order polynomials to handle spectral curvature.

⁵

Conway et al. (1990) state that the logarithmic expansion has better convergence properties than the linear expansion when α ≪ 1. An even more compact representation is a polynomial in log I vs. log ν, but it becomes numerically unstable to operate on logarithms and exponentials of pixel amplitudes, especially in the presence of noise.

⁶

Appendix A contains an explanation of the matrix notation used here, and briefly describes standard radio-interferometric image-reconstruction within a least-squares model-fitting framework (measurement equations, normal equations, and iterative χ² minimization).

⁷

Even if [H] were invertible, it is impractical to evaluate and invert the full Hessian (each row of each Hessian block represents an image).

⁸

The principal solution (as defined in Bracewell & Roberts (1954) and used in Cornwell et al. (1999)) is a term specific to radio interferometry and represents the dirty image normalized by the sum of weights. It is the image formed purely from the measured data, with no contribution from the invisible distribution of images (unmeasured spatial-frequencies). It is also an approximate solution of the normal equations [H]I^sky = I^dirty (see Appendix A.2), calculated using a diagonal approximation of the Hessian. Each element on the diagonal is the peak of the PSF, which is also the sum of weights. For isolated sources, the values measured at the peaks of the principal solution (dirty) images are the true sky values. The CLEAN minor-cycle algorithm uses this fact to estimate source fluxes from the peaks of the normalized dirty image.

⁹

The following definition of orthogonality is used here. Two vectors are orthogonal if their inner product is zero. The orthogonality of a pair of scale functions is measured by the integral of the product of their uv-taper functions. To account for uv-coverage, this integral is weighted by the sampling function.

¹⁰

A Cholesky decomposition factors a symmetric positive-definite matrix into a lower triangular matrix and its conjugate transpose. It is used in the solution of systems of equations [A]x = b where [A] is symmetric positive-definite. The normal equations of a linear least-squares problem in which the signal is modeled as a linear combination of basis functions are usually in this form (Press et al. 1988).
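The following is a small illustrative sketch of solving the normal equations of a linear least-squares problem via a Cholesky factorization. The polynomial model, the noise level, and all variable names are assumptions made for the example; they do not correspond to the matrices used in the algorithm itself.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical linear model: 50 samples fitted with 3 polynomial basis functions
x = np.linspace(-1.0, 1.0, 50)
A = np.vander(x, 3, increasing=True)          # columns: 1, x, x^2
true_params = np.array([1.0, -0.5, 0.25])
b = A @ true_params + 1e-3 * rng.standard_normal(x.size)

# Normal equations [A^T A] p = A^T b; the LHS matrix is symmetric positive-definite
H = A.T @ A
rhs = A.T @ b

# Factor H = L L^T, then solve the two triangular systems in turn.
# np.linalg.solve is a general solver; a dedicated triangular solver
# would exploit the structure of L.
L = np.linalg.cholesky(H)
y = np.linalg.solve(L, rhs)                   # L y = rhs
p = np.linalg.solve(L.T, y)                   # L^T p = y
```

The recovered parameters `p` should agree with a direct least-squares fit of the same data.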

¹¹

CASA (Common Astronomy Software Applications) is being developed at the National Radio Astronomy Observatory.

¹²

In the first iteration, the RHS vectors are called the dirty-images. In all subsequent steps, the RHS vectors are formed after subtracting model-visibilities from the data and are called residual-images.

¹³

As pointed out in Sect. 2.6, this will be accurate only for isolated point sources that were left out of the minor cycle.

¹⁴

E.g., when s/s_max = 1.0, the bias term is 1.0 − 0.6 = 0.4, which is approximately equal to the inverse of the area under a Gaussian of unit peak and width, given by $1.0 / \sqrt{2\pi} = 0.398$.

¹⁵

Even if visibilities are measured at a very high frequency resolution, they can be averaged across frequency ranges up to the bandwidth-smearing limit for the desired image field-of-view.

¹⁶

The spectral index between two frequency bands A and B will be denoted as α_AB. For example, the symbol α_PL corresponds to the frequency range between P-band (327 MHz) and L-band (1.4 GHz), and α_LL corresponds to two frequencies within L-band (here, 1.1 and 1.8 GHz). A similar convention will be used for spectral curvature β.

¹⁷

Conway et al. (1990) comment on a bias that occurs with a 2-term Taylor expansion, due to the use of a polynomial of insufficient order to model an exponential.

¹⁸

There are two definitions of bandwidth ratio that are used in radio interferometry. One is the ratio of the highest to the lowest frequency in the band, and is denoted as ν_high:ν_low. Another definition is the ratio of the total bandwidth to the central frequency (ν_high − ν_low)/ν_mid expressed as a percentage. For example, the bandwidth ratio for ν_low = 1.0 GHz, ν_high = 2.0 GHz is 2:1 and 66%.
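The two definitions can be checked with a few lines of arithmetic. In this sketch the helper name `bandwidth_ratios` is hypothetical, and ν_mid is assumed to be the midpoint of the band.

```python
def bandwidth_ratios(nu_low, nu_high):
    """Return (nu_high / nu_low, fractional bandwidth in percent).

    The central frequency is assumed to be the midpoint of the band.
    """
    nu_mid = 0.5 * (nu_low + nu_high)
    ratio = nu_high / nu_low
    fractional_percent = 100.0 * (nu_high - nu_low) / nu_mid
    return ratio, fractional_percent

# Example from the text: a band from 1.0 GHz to 2.0 GHz
ratio, fractional_percent = bandwidth_ratios(1.0, 2.0)
```

For this band the first definition gives 2:1 and the second gives two thirds, i.e. the 66% quoted above.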

¹⁹

The convolution of two vectors a ⋆ b is equivalent to the multiplication of their Fourier transforms. A 1-D convolution operator is constructed from a and applied to b as follows. Let [A] = diag(a). Then, a ⋆ b = [F^† diag([F]a) F]b = [C]b. Here, [F] is the Discrete Fourier Transform (DFT) operator. [C] is a Toeplitz matrix, with each row containing a shifted version of a. Multiplication of [C] with b implements the shift-multiply-add sequence required for the process of convolution.
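This construction can be illustrated numerically. The sketch below assumes circular (periodic) convolution and an un-normalized DFT matrix, so the inverse transform is written as F^†/n; the test vectors are arbitrary.

```python
import numpy as np

a = np.array([1.0, 2.0, 0.0, -1.0])
b = np.array([0.5, 0.0, 3.0, 1.0])
n = a.size

# Explicit DFT operator [F] (un-normalized) and its inverse F^† / n
Fmat = np.fft.fft(np.eye(n))
Finv = Fmat.conj().T / n

# Convolution operator built in the Fourier domain
C = (Finv @ np.diag(Fmat @ a) @ Fmat).real

# The same operator built directly: entry (k, j) holds a[(k - j) mod n],
# so every row is a (reversed) copy of a, shifted by one element per row
C_direct = np.array([[a[(k - j) % n] for j in range(n)] for k in range(n)])

# Applying [C] to b matches the FFT-based circular convolution
conv_matrix = C @ b
conv_fft = np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)).real
```

Because the DFT implies periodic boundaries, the matrix built this way is circulant, which is a special case of a Toeplitz matrix.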

²⁰

ASKAPsoft is the software package being developed at the CSIRO for the ASKAP telescope.

Acknowledgments

The authors would like to thank the National Radio Astronomy Observatory, the New Mexico Institute of Mining and Technology, and the Australia Telescope National Facility for support during the Ph.D. Thesis project that resulted in this algorithm and its implementation within CASA. The authors would like to thank J. A. Eilek in particular, for extremely helpful comments on the presentation of the thesis material that formed the basis of this paper. The authors would also like to thank S. Bhatnagar, K. Golap, R. Nityananda, F. N. Owen, R. J. Sault, and M. A. Voronkov, among others, for useful discussions pertaining to this work and its software implementation. This project used data from the (E)VLA telescope (test observations and project AR664) operated by the National Radio Astronomy Observatory, a facility of the National Science Foundation operated under cooperative agreement by Associated Universities, Inc.

References

Bhatnagar, S., & Cornwell, T. J. 2004, A&A, 426, 747
Bhatnagar, S., Cornwell, T. J., Golap, K., & Uson, J. M. 2008, A&A, 487, 419
Bong, S.-C., Lee, J., Gary, D. E., & Yun, H. S. 2006, ApJ, 636, 1159
Bracewell, R. N., & Roberts, J. A. 1954, Aust. J. Phys., 7, 615
Carilli, C. L., & Barthel, P. D. 1996, A&ARv, 7, 1
Carilli, C. L., Perley, R. A., Dreher, J. W., & Leahy, J. P. 1991, ApJ, 383, 554
Conway, J. E., Cornwell, T. J., & Wilkinson, P. N. 1990, MNRAS, 246, 490
Cornwell, T. J. 2008, IEEE Journal of Selected Topics in Sig. Proc., 2, 793
Cornwell, T., Braun, R., & Briggs, D. S. 1999, in Synthesis Imaging in Radio Astronomy II, ed. G. B. Taylor, C. L. Carilli, & R. A. Perley, ASP Conf. Ser., 180, 151
de Vos, M., Gunst, A. W., & Nijboer, R. 2009, IEEE Proc., 97, 1431
Deboer, D. R., Gough, R. G., Bunton, J. D., et al. 2009, IEEE Proc., 97, 1507
Greisen, E. W., Spekkens, K., & van Moorsel, G. A. 2009, Astron. J., 137, 4718
Likhachev, S. 2005, in Future Directions in High Resolution Astronomy, ed. J. Romney, & M. Reid, ASP Conf. Ser., 340, 608
Owen, F. N., Eilek, J. A., & Kassim, N. E. 2000, ApJ, 543, 611
Perley, R., Napier, P., Jackson, J., et al. 2009, IEEE Proc., 97, 1448
Press, W., Flannery, B., Teukolsky, S., & Vetterling, W. 1988, Numerical Recipes in C (Cambridge: Press Syndicate of the University of Cambridge)
Rau, U. 2010, Ph.D. Thesis, New Mexico Institute of Mining and Technology, Socorro, NM, USA
Rau, U., Bhatnagar, S., Voronkov, M. A., & Cornwell, T. J. 2009, IEEE Proc., 97, 1472
Rottmann, H., Mack, K., Klein, U., & Wielebinski, R. 1996, A&A, 309, L19
Sault, R. J., & Wieringa, M. H. 1994, A&AS, 108, 585
Smirnov, O. M. 2011, A&A, 527, A106

Appendix A: Matrix notation and framework

The matrix notation used in this paper is explained here, in the context of an iterative χ² minimization process used to solve a system of linear equations. The basic idea is as follows. Let Ax = b be the system of equations to be solved (measurement equations). The goal is to find a set of parameters x that minimizes χ² = (Ax − b)^†W(Ax − b). Setting grad(χ²) = 0 to minimize χ² leads to a new system of equations called the normal equations [A^†WA]x = [A^†W]b. The matrix on the LHS is called the Hessian [H] = [A^†WA]. Iterations begin with an initial guess for the parameters x. These parameters are updated in iteration i as x_{i+1} ← x_i + g [H^{-1}]A^†W(b − Ax_i), where g controls the step-size. Iterations continue until a convergence criterion is satisfied. The basic iterative framework used in most imaging and deconvolution algorithms in radio interferometry can be described using this matrix notation (Rau et al. 2009; Rau 2010).
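The update rule can be sketched for a small, well-conditioned toy problem in which the Hessian is directly invertible (which is not the case in imaging, where only approximate inversions are possible). The function name `iterative_chi2`, the random test matrices, and the choice of g = 0.5 are assumptions made for illustration.

```python
import numpy as np

def iterative_chi2(A, W, b, gain=0.5, n_iter=200, tol=1e-12):
    """Iterate x <- x + g * H^{-1} A^† W (b - A x), with H = A^† W A."""
    AH = A.conj().T
    H = AH @ W @ A
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        residual = b - A @ x
        # A^† W (b - A x): proportional to the negative gradient of chi^2
        rhs = AH @ W @ residual
        # Apply H^{-1} by solving a linear system instead of forming the inverse
        step = np.linalg.solve(H, rhs)
        x = x + gain * step
        if np.linalg.norm(step) < tol:        # convergence criterion
            break
    return x

rng = np.random.default_rng(1)
A = rng.standard_normal((20, 3))              # toy measurement matrix
W = np.diag(rng.uniform(0.5, 2.0, size=20))   # diagonal weights
x_true = np.array([2.0, -1.0, 0.5])
b = A @ x_true                                # noise-free measurements
x_est = iterative_chi2(A, W, b)
```

With an exact Hessian inverse, each iteration reduces the remaining error by the factor (1 − g), so the step-size trades convergence speed against robustness when the inversion is only approximate.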

A.1. Measurement equations

The measurement equation of an imaging instrument describes its transfer function (the effect of the measurement process on the input signal). For an ideal interferometer (a perfect spatial-frequency filter, with no instrumental gains), it can be written in matrix notation as follows. Let $I^{\rm sky}_{m \times 1}$ be a pixelated image of the sky and let $V^{\rm obs}_{n \times 1}$ be a vector of n visibilities. Let $[S_{n \times m}]$ be a projection operator that describes the instrument's sampling function (uv-coverage) as a mapping of m discrete spatial frequencies (pixels on a grid) to n visibility samples (usually n > m). Let $[F_{m \times m}]$ be the Fourier transform operator. Then, the measurement equations become $V^{\rm obs}_{n \times 1} = [S_{n \times m}] [F_{m \times m}] I^{\rm sky}_{m \times 1}$.

A.2. Normal equations

The normal equations are the linear system of equations whose solution gives a weighted least-squares estimate of a set of parameters in a model (χ² minimization). For an ideal interferometer, they are given by $[F^{\dagger} S^{\dagger} W S F] I^{\rm sky}_{m \times 1} = [F^{\dagger} S^{\dagger} W] V^{\rm obs}_{n \times 1}$, where $W_{n \times n}$ is a diagonal matrix of weights and $S^{\dagger}$ denotes the mapping of measured visibilities onto a spatial-frequency grid. We can write the normal equations as $[H_{m \times m}] I^{\rm sky}_{m \times 1} = I^{\rm dirty}_{m \times 1}$, where the Hessian (matrix on the LHS) is by construction a convolution operator¹⁹ with a shifted version of the point-spread function $I^{\rm psf}_{m \times 1} = {\rm diag}[F^{\dagger} S^{\dagger} W S]$ in each row. The dirty image on the RHS is produced by direct Fourier inversion of weighted visibilities. The normal equations therefore state that the dirty image $I^{\rm dirty}_{m \times 1}$ is the result of the convolution of $I^{\rm sky}_{m \times 1}$ with $I^{\rm psf}_{m \times 1}$. The solution of these normal equations represents a deconvolution.
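The convolution structure of the Hessian can be verified on a small 1-D example. The sketch below uses explicit matrices and a hypothetical set of sampled grid cells and weights; it interprets the PSF as the inverse transform of the gridded weights (the diagonal of S^†WS) and assumes periodic boundaries.

```python
import numpy as np

m = 16                                    # number of image pixels / grid cells
F = np.fft.fft(np.eye(m))                 # un-normalized DFT operator [F]

# Hypothetical sampling: n = 6 visibilities falling on 5 distinct grid cells
sampled_cells = np.array([0, 1, 3, 3, 15, 13])
n = sampled_cells.size
S = np.zeros((n, m))
S[np.arange(n), sampled_cells] = 1.0      # [S]: grid cell -> visibility sample
W = np.diag([1.0, 2.0, 1.0, 0.5, 2.0, 1.5])

FH, SH = F.conj().T, S.T
H = FH @ SH @ W @ S @ F                   # Hessian [F^† S^† W S F]

# PSF: transform of the gridded weights, i.e. the diagonal of S^† W S
gridded_weights = np.diag(SH @ W @ S)
psf = FH @ gridded_weights

# Entry (k, j) of the Hessian equals psf[(k - j) mod m]
idx = (np.arange(m)[:, None] - np.arange(m)[None, :]) % m
H_from_psf = psf[idx]

# The dirty image equals the circular convolution of the sky with the PSF
sky = np.zeros(m)
sky[4] = 1.0
sky[9] = 0.5
vis = S @ F @ sky
dirty = FH @ SH @ W @ vis
dirty_conv = np.fft.ifft(np.fft.fft(psf) * np.fft.fft(sky))
```

In this example the peak of the PSF, `psf[0]`, equals the sum of the weights, which is the normalization used for the principal solution.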

A.3. Iterative solution

Most existing iterative image reconstruction algorithms in radio interferometry consist of major and minor cycles. Major cycles compute the RHS of the normal equations, and minor cycles perform approximate (implicit) Hessian inversions to calculate updates for the sky model parameters I^m. The first major cycle starts by transforming the observed visibilities into an image as I^dirty = [F^†S^†W]V^obs, and initializing the sky-model I^m. Minor cycle steps do a deconvolution to calculate updates to the model I^m. After several iterations, this updated model is fed into the next major cycle. This and all subsequent major cycles calculate model visibilities from the current model V^m = [SF]I^m, calculate residual visibilities V^res = V^obs − V^m, and transform these residuals into images I^res = [F^†S^†W]V^res. These residual images replace the initial dirty image, and a new set of minor-cycle iterations is done. This process continues until a convergence criterion is reached. Usually, convergence is defined as I^res being noise-like with no signal left to be extracted in the minor-cycle.
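The major/minor cycle structure can be sketched with a 1-D toy problem. This is a single-scale, single-frequency CLEAN-style loop on gridded, noise-free data, not the MS-MFS minor cycle; the uv-coverage, loop gain, iteration counts, and helper names `to_image` and `predict` are all assumptions made for illustration.

```python
import numpy as np

m = 64
F = np.fft.fft(np.eye(m))                 # un-normalized DFT operator [F]
cells = np.arange(-20, 21) % m            # hypothetical gridded uv-coverage
S = np.zeros((cells.size, m))
S[np.arange(cells.size), cells] = 1.0     # sampling operator [S]
w = np.ones(cells.size)                   # diagonal of [W]
sumwt = w.sum()

def to_image(vis):
    """Apply [F^† S^† W] to visibilities, normalized by the sum of weights."""
    return (F.conj().T @ (S.T @ (w * vis))).real / sumwt

def predict(model):
    """Apply [S F] to a model image to obtain model visibilities."""
    return S @ (F @ model)

sky = np.zeros(m)
sky[20] = 1.0
sky[40] = 0.6
vis_obs = predict(sky)

psf = to_image(np.ones(cells.size))       # response to a unit source at pixel 0
model = np.zeros(m)
gain, n_major, n_minor = 0.2, 5, 50
peak_history = []

for major in range(n_major):
    # Major cycle: residual visibilities -> residual image
    residual = to_image(vis_obs - predict(model))
    peak_history.append(np.abs(residual).max())
    # Minor cycle: approximate Hessian inversion by peak-finding
    for _ in range(n_minor):
        k = np.argmax(np.abs(residual))
        flux = gain * residual[k]
        model[k] += flux
        residual -= flux * np.roll(psf, k)   # subtract the shifted, scaled PSF

final_residual = to_image(vis_obs - predict(model))
```

In this idealized setting the minor-cycle residual update is exact; with real data the major cycle is what removes the errors accumulated by the approximate image-domain subtraction.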

Appendix B: MS-MFS as implemented in CASA

The MS-MFS algorithm described in Sect. 2.7 has been implemented and released via the CASA software package (version 3.1 onwards). More recently, it has been implemented in the ASKAPsoft²⁰ package. Algorithm 1 contains a pseudo-code listing.

The main parameters that control the algorithm are

1. ν₀: a reference frequency chosen near the middle of the sampled frequency range, about which the Taylor expansion is performed,
2. N_t: the number of coefficients of the Taylor polynomial to solve for, and
3. N_s and $I^{\rm shp}_{s}$: a set of scale sizes in units of image pixels to use for the multi-scale representation of the image. In order to always allow for the modeling of unresolved sources, the first scale function $I^{\rm shp}_{0}$ is forced to be a δ-function.

The data products are N_t Taylor-coefficient images, a spectral index image, and a curvature image (if N_t > 2). This wide-band image model can be used within a standard self-calibration loop.
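As an illustration of how spectral index and curvature images might be derived from the Taylor-coefficient images, the sketch below assumes a power law with curvature, I_ν = I_ν₀ (ν/ν₀)^(α + β ln(ν/ν₀)), expanded to second order about ν₀ with a linear basis. Under that assumption α = I₁/I₀ and β = I₂/I₀ − α(α − 1)/2. The function name, the masking threshold, and the toy values are hypothetical.

```python
import numpy as np

def spectral_index_and_curvature(I0, I1, I2=None, threshold=0.0):
    """Compute alpha = I1/I0 and beta = I2/I0 - alpha*(alpha-1)/2.

    Pixels where I0 <= threshold are set to NaN, since the ratios are
    meaningless where there is no significant intensity.
    """
    I0 = np.asarray(I0, dtype=float)
    mask = I0 > threshold
    safe_I0 = np.where(mask, I0, 1.0)     # avoid division by zero
    alpha = np.where(mask, np.asarray(I1, dtype=float) / safe_I0, np.nan)
    beta = None
    if I2 is not None:
        ratio2 = np.asarray(I2, dtype=float) / safe_I0
        beta = np.where(mask, ratio2 - alpha * (alpha - 1.0) / 2.0, np.nan)
    return alpha, beta

# Toy example: one pixel with I0 = 2.0, alpha = -0.7, beta = 0.1, one empty pixel
I0 = np.array([2.0, 0.0])
I1 = np.array([2.0 * -0.7, 0.0])
I2 = np.array([2.0 * ((-0.7) * (-1.7) / 2.0 + 0.1), 0.0])
alpha, beta = spectral_index_and_curvature(I0, I1, I2)
```

Masking on the intensity image is one simple way to avoid amplifying noise in regions without emission; the appropriate threshold depends on the noise level of the data.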

All Figures

Fig. 1

MS-MFS imaging results using simulated EVLA data: these images compare truth images (left column) with the results of two wide-band imaging runs; multi-scale (middle column) and point-source (right column). The top three rows represent the first three Taylor-coefficient images, and the fourth and fifth rows show spectral index and spectral curvature respectively.

Fig. 2

Cygnus A – total intensity and spectral-index: this figure shows the MS-MFS total intensity map (top), and spectral-index maps obtained via three methods – MS-MFS (second row), a hybrid single-band method (third row), and from high-resolution full-synthesis narrow-band images at 1.4 and 4.8 GHz (bottom). The MS-MFS spectral index map has the angular-resolution of the total intensity map, and agrees with the values in the high-resolution comparison map (α = −0.5 at the hotspots increasing to α ≈ − 1.0 in the halo). However, the hybrid method resulted in a map with a wider range of values (positive and negative) and is at a much lower angular resolution.


Fig. 3

M 87 core/jet/lobe – intensity, spectral index and curvature: these images show 3-arcsec resolution maps of the central bright region of M 87 (core+jet and inner lobes). The quantities displayed are the intensity at 1.5 GHz (top left), the spectral index (top middle) and the spectral curvature (top right). The spectral index is near zero at the core, and varies between −0.36 and −0.6 along the jet and out into the lobes. The spectral curvature is on average 0.5, which translates to Δα = 0.2 across L-band. The plot at the bottom compares the resulting average spectrum (solid line) with that formed by imaging each spectral-window separately (dots). The dashed and dotted lines correspond to fixed values of spectral index (−0.43, −0.53, −0.63).

Fig. 4

Peak residuals and errors for MFS with different values of N_t: these plots show the measured peak residuals (top left) and the errors on I_ν₀ (top right), I_α (bottom left), and I_β (bottom right) when a point-source of flux 1.0 Jy and α = −1.0 was imaged using Taylor polynomials of different orders (N_t = 1−7) and a linear spectral basis. The x-axis of all these plots shows the value of N_t used for the simulation, and plots for α and β begin from N_t = 2 and N_t = 3 respectively. An example of how to read these plots: for a 2:1 bandwidth ratio, a source with spectral index = −1.0 and N_t = 4, the achievable dynamic range (measured as the ratio of the peak flux to the peak residual near the source) is about 10⁵, the error on the peak flux at the reference frequency is 1 part in 10³, and the absolute errors on α and β are 10^-2 and 10^-1 respectively.

Fig. 5

Stokes I images of the 3C 286 field, using MS-MFS with N_t = 1 (top left), N_t = 2 (top right), N_t = 3 (bottom left), N_t = 4 (bottom right). The peak flux of 3C 286 is 14 Jy/beam. The axes labels are in units of arc-minutes from the pointing center and all images are shown with the same grey-scales. The residual rms near 3C 286 for these four images are 9 mJy, 1 mJy, 200 µJy and 140 µJy, and the rms measured 1 degree away from 3C 286 are 1 mJy, 200 µJy, 85 µJy and 80 µJy. The thermal-noise limit for this dataset was 70 µJy.

Fig. 6

Moderately resolved sources – single-channel images: these figures show the 6 single-channel images generated from simulated EVLA data between 1 and 4 GHz in the EVLA D-configuration. The sky brightness consists of two point sources, each of flux 1.0 Jy at a reference frequency of 2.5 GHz and separated by 18 arcsec. Their spectral indices are +1.0 (top source) and −1.0 (bottom source). The angular resolution is 60 arcsec at 1 GHz and 15 arcsec at 4 GHz, and the circles in the lower left corner of each image show the resolution element decreasing in size as frequency increases.

Fig. 7

Moderately resolved sources – MS-MFS images: these images show the intensity at 2.5 GHz (left), the spectral index showing a gradient between −1 and +1 (middle) and the spectral curvature which peaks between the two sources and falls off on either side (right). The angular resolution of these images is 15 arcsec, corresponding to the highest frequency in the data (and Fig. 6).

Fig. 8

Very large spatial scales – intensity, spectral index, residuals: these images show the intensity distribution (left), spectral index (middle) and the residuals (right) for two imaging runs. The top row shows results without short-spacing information, and shows a false negative α ≈ −0.8 for the extended emission. The bottom row shows results with short-spacing constraints, in which the extended source has been reconstructed correctly.


Fig. 9

Very large spatial scales – visibility plots: these plots show the observed (left) and reconstructed (right) visibility functions for a simulation in which a large extended flat-spectrum source is observed with an interferometer with a large central hole in its uv-coverage. The colours/shades in these plots represent 6 frequency channels spread between 1 and 4 GHz. The top row of plots shows that when no short-spacing information was used, these data can be mistakenly fit using a less-extended source with a steep spectrum. The bottom row shows that the inclusion of short-spacing information is sufficient to reconstruct the sky brightness distribution correctly.




