Image reconstruction in optical interferometry: benchmarking the regularization

S. Renard; E. Thiébaut; F. Malbet

doi:10.1051/0004-6361/201016263

Volume 533 (September 2011)

A&A, 533 (2011) A64

Issue: A&A Volume 533, September 2011
Article Number: A64
Number of page(s): 14
Section: Astronomical instrumentation
DOI: https://doi.org/10.1051/0004-6361/201016263
Published online: 30 August 2011

S. Renard¹, E. Thiébaut²,⋆ and F. Malbet¹

¹ IPAG, CNRS/UMR 5571, Université J. Fourier, BP 53 38041, Grenoble Cedex, France
e-mail: Fabien.Malbet@obs.ujf-grenoble.fr
² Centre de Recherche Astrophysique de Lyon, CNRS/UMR5574, 69561 St-Genis-Laval, France
e-mail: thiebaut@obs.univ-lyon1.fr

Received: 6 December 2010
Accepted: 24 May 2011

Abstract

Context. With the advent of visible and infrared long-baseline interferometers with more than two telescopes, both the size and the completeness of interferometric data sets have significantly increased, allowing images based on models with no a priori assumptions to be reconstructed with an aperture synthesis technique.

Aims. Our main objective is to analyze the multiple parameters of the image reconstruction process with particular attention to the regularization term and the study of their behavior in different situations (types of astrophysical objects, telescope array configurations, level of noise, etc.). The secondary goal is to derive practical rules for the users.

Methods. Using the Multi-aperture image Reconstruction Algorithm (MiRA), we performed multiple systematic tests, analyzing 11 regularization terms commonly used. The tests are made on different astrophysical objects, different (u,v) plane coverages and several signal-to-noise ratios to determine the minimal configuration needed to reconstruct an image. We establish a methodology and we introduce the mean-square errors (MSE) to discuss the results.

Results. From the ~24 000 simulations performed for the benchmarking of image reconstruction with MiRA, we are able to classify the different regularizations in the context of the observations. We find typical values of the regularization weight. A minimal (u,v) coverage is required to reconstruct an acceptable image, whereas no limits are found for the studied values of the signal-to-noise ratio. We also show that super-resolution can be achieved with increasing performance with the (u,v) coverage filling.

Conclusions. Using image reconstruction with a sufficient (u,v) coverage is shown to be reliable. The choice of the main parameters of the reconstruction is tightly constrained. We recommend that efforts to develop interferometric infrastructures should first concentrate on the number of telescopes to combine, and secondly on improving the accuracy and sensitivity of the arrays.

Key words: instrumentation: interferometers / techniques: interferometric / techniques: image processing

⋆ Corresponding author: E. Thiébaut

© ESO, 2011

1. Introduction

Many astrophysical studies require milli-arcsecond (mas) resolution images at optical wavelengths (visible and infrared). Examples include understanding the interplay between accretion and ejection in the inner part of the disks of young stellar objects, the expansion mechanisms in novae just a few hours or days after the explosion, and the nature of dust in active galactic nuclei. Reaching such a high resolution at optical wavelengths requires diffraction-limited images with pupil sizes of the order of tens to hundreds of meters, which can only be achieved by interferometrically combining light from separate apertures. The Very Large Telescope Interferometer (VLTI) and the Center for High Angular Resolution Astronomy (CHARA) array are facilities that provide interferometric measurements that can be used to reconstruct images of stellar surfaces (e.g., Monnier et al. 2007; Haubois et al. 2009; Zhao et al. 2009), binaries (e.g., Zhao et al. 2008; Kraus et al. 2009), circumstellar shells around evolved stars (e.g., Le Bouquin et al. 2009), and the close environment of young stars (Renard et al. 2010; Kraus et al. 2010).

Owing to the sparse (u,v) coverage, the image reconstruction process is ill posed because there are more unknowns (e.g. the pixels of the image) than measurements. The data alone are insufficient to reconstruct an unambiguous image, and additional constraints, the so-called regularizations, are needed to converge to a unique and stable solution. Compared to radio interferometry, the data are much sparser in optical interferometry; hence, we expect the image reconstruction problem to be much more sensitive to the choice and the tuning of the regularization. Since the general study of regularization by Titterington (1985), many different methods have been proposed in the literature to adjust the regularization level. Because image reconstruction for optical interferometry is still in its infancy, it is fundamental to analyze the different types of regularization, to find those that are the most suitable for the different astrophysical problems, and to be able to tune the weight of the regularization. In this context, we carried out systematic tests using the image reconstruction algorithm devoted to optical interferometry data developed by Thiébaut (2008), called the Multi-aperture image Reconstruction Algorithm (MiRA). The analysis of these tests allows us to extract some general conclusions and establish practical rules for the users.

The mathematical principles of the image reconstruction technique are presented in Sect. 2. The parameters of the simulated data are presented together with the characteristics of the images and the strategy in Sect. 3. The results of the simulations are presented and discussed in Sect. 4, with an analysis of the role of the different terms and parameters in the image reconstruction. Finally, our conclusions are summarized in Sect. 5.

2. Principles of image reconstruction from optical interferometric data

We do not intend to provide a formal and precise description of image reconstruction in optical interferometry, but instead sufficient details to ensure that the paper is self-contained. Readers interested in a more detailed description of the method should refer to Thiébaut & Giovannelli (2010).

2.1. Data from optical interferometric observations

The principle of interferometry is to coherently recombine the beams from two or more independent telescopes and to measure the so-called complex visibilities of the resulting fringe patterns. According to the van Cittert-Zernike theorem, for an ideal interferometer, the complex visibility $V_{j_1,j_2}(t)$ of the fringes produced by the interference of telescopes $j_1$ and $j_2$ at time $t$ is proportional to the Fourier transform of the object brightness distribution, $\hat{I}(\boldsymbol{\nu}_{j_1,j_2}(t))$, at the spatial frequency $\boldsymbol{\nu}_{j_1,j_2}(t) = \boldsymbol{B}_{j_1,j_2}^\perp(t)/\lambda$, where $\lambda$ is the wavelength and the so-called baseline $\boldsymbol{B}_{j_1,j_2}^\perp(t)$ is the separation between the two telescopes projected on a plane perpendicular to the line of sight (Lawson 2000; Malbet & Perrin 2007). Since the number of measurements is finite, we simplify the equations by introducing a notation for the $m$th measured complex visibility and the corresponding spatial frequency:

$$V_m \overset{\text{def}}{=} V_{j_{1,m},j_{2,m}}(t_m), \tag{1}$$

$$\boldsymbol{\nu}_m \overset{\text{def}}{=} \boldsymbol{B}_{j_{1,m},j_{2,m}}^{\perp}(t_m)/\lambda, \tag{2}$$

where $j_{1,m}$ and $j_{2,m}$ are the interfering telescopes and $t_m$ is the time of observation.
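The relation between projected baseline, wavelength, and spatial frequency can be illustrated with a few lines of Python. This is only a sketch: the function name `spatial_frequency` and the 100 m by 50 m baseline are hypothetical choices for illustration, not values taken from the simulations.

```python
import numpy as np

MAS_PER_RAD = 180.0 / np.pi * 3600.0 * 1000.0  # milliarcseconds per radian

def spatial_frequency(baseline_m, wavelength_m):
    """Return nu = B_perp / lambda in cycles per radian.

    baseline_m: projected baseline vector(s), shape (..., 2), in meters.
    """
    return np.asarray(baseline_m, dtype=float) / wavelength_m

# Hypothetical projected baseline (100 m, 50 m) observed at 2.2 microns.
nu = spatial_frequency([100.0, 50.0], 2.2e-6)

# Angular period of the corresponding fringe, 1 / |nu|, in milliarcseconds.
fringe_period_mas = MAS_PER_RAD / np.hypot(nu[0], nu[1])
```

Longer baselines or shorter wavelengths give larger spatial frequencies and therefore finer fringe periods, which is why baselines of tens to hundreds of meters are needed to reach mas scales.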

2.2. Description of the image model

The final product of the image reconstruction is an image that can be treated as a grid of square pixels. In this context, the object brightness distribution as a function of the position θ can be approximated using the parametrization

$$I(\boldsymbol{\theta}) = \sum_{n=1}^{N} x_n\, b_n(\boldsymbol{\theta}), \tag{3}$$

where $\boldsymbol{x} = \{x_n\}_{n=1}^{N}$ are the image parameters, e.g. the pixel values of the image, and $\{b_n(\boldsymbol{\theta})\}_{n=1}^{N}$ is the chosen basis of functions, e.g. the response function of each pixel. The image reconstruction then consists of estimating the N parameters x that most closely fit the interferometric data. In this paper, we chose x_n to be proportional to the value of the nth pixel of a sampled image and b_n(θ) = b(θ − θ_n), where b(θ) is the pixel shape and θ_n the position of the nth pixel; thus

$$x_n \overset{\text{def}}{=} \alpha\, I(\boldsymbol{\theta}_n), \tag{4}$$

where α > 0 is a scaling factor such that x is normalized (this is required by the interferometric data format, cf. Pauls et al. 2005). With this model, the exact Fourier transform of the brightness distribution is given by

$$\hat{I}(\boldsymbol{\nu}) = \sum_n x_n\, \hat{b}_n(\boldsymbol{\nu}) = \hat{b}(\boldsymbol{\nu}) \sum_n x_n\, \mathrm{e}^{-\mathrm{i}\,2\pi\,\boldsymbol{\theta}_n\cdot\boldsymbol{\nu}}, \tag{5}$$

where $\hat{b}_n(\boldsymbol{\nu})$ and $\hat{b}(\boldsymbol{\nu})$ are the Fourier transforms of the basis functions. In our case, they correspond to the Fourier transform of the pixel response function, i.e. the pixel shape. Hence, the model of the mth complex visibility is given by

$$\hat{I}_m = \hat{I}(\boldsymbol{\nu}_m) = \sum_n A_{m,n}\, x_n = (\mathbf{A}\cdot\boldsymbol{x})_m, \tag{6}$$

where A is a matrix with the complex coefficients

$$A_{m,n} = \hat{b}_n(\boldsymbol{\nu}_m) = \hat{b}(\boldsymbol{\nu}_m)\, \mathrm{e}^{-\mathrm{i}\,2\pi\,\boldsymbol{\theta}_n\cdot\boldsymbol{\nu}_m}. \tag{7}$$

This matrix multiplication performs a linear transformation that contains the Fourier transform, the pixel shape, and the sparse sampling of the (u,v) plane.
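The linear model of Eqs. (6) and (7) can be sketched directly in NumPy. The helper `model_matrix`, the 4 × 4 grid, the pixel size, and the three spatial frequencies below are all hypothetical; the default of a unit pixel transfer function (point-like pixels) is a simplifying assumption made for this illustration, not the choice made in MiRA.

```python
import numpy as np

def model_matrix(theta, nu, b_hat=None):
    """Build A[m, n] = b_hat(nu_m) * exp(-2i*pi * theta_n . nu_m), as in Eq. (7).

    theta: pixel positions, shape (N, 2), in radians.
    nu:    sampled spatial frequencies, shape (M, 2), in cycles per radian.
    b_hat: callable giving the Fourier transform of the pixel shape at nu;
           None means b_hat = 1 (assumed point-like pixels).
    """
    theta = np.asarray(theta, dtype=float)
    nu = np.asarray(nu, dtype=float)
    phase = -2.0 * np.pi * (nu @ theta.T)      # shape (M, N)
    A = np.exp(1j * phase)
    if b_hat is not None:
        A = b_hat(nu)[:, None] * A
    return A

# Toy example: a centered 4x4 pixel grid and three arbitrary frequencies.
pix = 1e-9                                      # pixel size in radians (hypothetical)
gy, gx = np.mgrid[0:4, 0:4]
theta = pix * np.column_stack([gx.ravel() - 1.5, gy.ravel() - 1.5])
nu = np.array([[0.0, 0.0], [1e8, 0.0], [5e7, -5e7]])

x = np.full(16, 1.0 / 16.0)                     # non-negative, normalized image
A = model_matrix(theta, nu)
vis = A @ x                                     # model visibilities, Eq. (6)
```

Because the image is normalized, the model visibility at zero spatial frequency equals one, and all other amplitudes are bounded by one.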

The main problem in optical interferometry is the small number of telescopes (currently up to four or six), which leads to a sparse sampling of the spatial frequencies, the so-called (u,v) plane (see Fig. 2). Owing to random effects caused by the atmospheric turbulence, the visibility phase cannot be calibrated and the power spectrum and the closure phase are used. This results in a partial loss of the Fourier phase information (Thiébaut & Giovannelli 2010).

2.3. Inverting the problem of interferometric imaging

Because of the sparse (u,v) coverage and the possible lack of other information such as the phase, the reconstruction of an image from the interferometric data alone is an ill-posed inverse problem. Additional a priori constraints are needed to recast it into a problem that has a unique and stable solution. A general prescription is to express the solution as the one that minimizes a penalty function f under some strict constraints (Thiébaut 2005; Thiébaut & Giovannelli 2010):

$$\boldsymbol{x}^{+} = \operatorname*{arg\,min}_{\boldsymbol{x}} f(\boldsymbol{x}) \quad\text{s.t.}\quad \boldsymbol{x} \geqslant 0 \ \text{and}\ \sum\nolimits_n x_n = 1, \tag{8}$$

with

$$f(\boldsymbol{x}) = f_{\rm data}(\boldsymbol{x}) + \mu\, f_{\rm prior}(\boldsymbol{x}), \tag{9}$$

where the so-called likelihood term f_data(x) measures the discrepancy between the model and the available data, while the so-called regularization term f_prior(x) measures the discrepancy with the prior information. In other words, minimizing the likelihood term f_data(x) enforces the fit with the actual data, while minimizing the regularization term f_prior(x) enforces the agreement with the priors. The so-called hyperparameter μ > 0 is used to adjust the relative weight of the constraints set by the measurements and the ones set by the priors. In Eq. (8), the positivity ($\boldsymbol{x} \geqslant 0$) and the normalization ($\sum_n x_n = 1$) of the brightness distribution are also included by default.
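To make the structure of Eqs. (8) and (9) concrete, the following sketch composes a penalty from two terms and checks the strict constraints. The toy `f_data` and `f_prior` are placeholders chosen for illustration only; they are not the terms used by MiRA.

```python
import numpy as np

def penalty(x, mu, f_data, f_prior):
    """Total penalty f(x) = f_data(x) + mu * f_prior(x), as in Eq. (9)."""
    return f_data(x) + mu * f_prior(x)

def is_feasible(x, tol=1e-9):
    """Check the strict constraints of Eq. (8): x >= 0 and sum(x) = 1."""
    x = np.asarray(x, dtype=float)
    return bool(np.all(x >= 0.0) and abs(x.sum() - 1.0) <= tol)

# Placeholder terms: squared distance to a target image and a quadratic prior.
target = np.array([0.7, 0.2, 0.1])

def f_data(x):
    return float(np.sum((x - target) ** 2))

def f_prior(x):
    return float(np.sum(x ** 2))

x = np.array([0.5, 0.3, 0.2])
value = penalty(x, mu=0.1, f_data=f_data, f_prior=f_prior)
```

Raising μ shifts the minimum of f toward images favored by the prior, while lowering it lets the data term dominate.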

For the image reconstruction, we use MiRA (Thiébaut 2008) to find solutions of Eq. (8). MiRA can deal with the various kinds of data provided by an optical interferometer and implements a number of different regularizations (Thiébaut 2008; Thiébaut & Giovannelli 2010).

2.3.1. The likelihood term f_data and the data model

We focus on the choice of the priors and the tuning of their parameters. It is not within the scope of the paper to deal with global optimization issues and the search for a global minimum of the penalty function in Eq. (9). We therefore assume that the available measurements consist of complex visibilities, i.e. amplitude and phase, in order to have a convex likelihood term f_data(x). If the regularization term f_prior(x) is also convex, the global penalty function f(x) will be convex, which ensures that the solution of Eq. (8) is unique. Current optical interferometers only provide closure phases and power-spectrum data (i.e. the phase of the bispectrum and the squared amplitude of the complex visibilities). Our assumption will therefore give somewhat optimistic results, because some Fourier phase information is missing and because the likelihood term f_data(x) is non-convex when dealing with real data. However, as the number of simultaneously interfering telescopes increases, the fraction of missing phases becomes much smaller and they can be reliably derived using self-calibration (Pearson & Readhead 1984) to achieve a situation similar to the case studied in our simulations. Moreover, new interferometers will make use of a phase reference source to directly measure the phase of the complex visibilities (Delplancke et al. 2003).

However, the OI-FITS standard (Pauls et al. 2005) imposes the use of complex visibilities in their polar representation with independent error bars. We therefore simulate each measured complex visibility as

$$\varrho_m = |V_m| + \delta\varrho_m \quad\text{and}\quad \varphi_m = \arg V_m + \delta\varphi_m, \tag{10}$$

where ϱ_m and ϕ_m are the measured amplitude and phase of the mth measurement, V_m the corresponding complex visibility computed from the true object brightness distribution, and δϱ_m and δϕ_m are additive noise terms. In our simulations, the noise terms have independent Gaussian statistics such that

$${\rm Var}(\delta\varrho_m) = \langle\varrho_m\rangle^2\, {\rm Var}(\delta\varphi_m), \tag{11}$$

where ⟨ϱ_m⟩ = |V_m| is the expected value of the amplitude that can be computed from the complex visibility of the true image. This particular choice follows approximately the model of Goodman (1985). In this context, to define the likelihood term we use the local approximation (Meimon et al. 2005)

$$f_{\rm data}(\boldsymbol{x}) = \sum_m \left\{ \left(\frac{r_{/\!/,m}(\boldsymbol{x})}{\sigma_{/\!/,m}}\right)^2 + \left(\frac{r_{\perp,m}(\boldsymbol{x})}{\sigma_{\perp,m}}\right)^2 \right\}, \tag{12}$$

where $r_{/\!/,m}(\boldsymbol{x})$ and $r_{\perp,m}(\boldsymbol{x})$ are the two components of the complex residuals, $r_m(\boldsymbol{x}) = \varrho_m\,\mathrm{e}^{\mathrm{i}\varphi_m} - (\mathbf{A}\cdot\boldsymbol{x})_m$, respectively, along and orthogonally to the measured complex visibility. Given the error bars σ_ϱ,m and σ_ϕ,m of the amplitude and the phase of the complex visibility, the standard deviations in the components of the complex residuals are (Pauls et al. 2005)

$$\sigma_{/\!/,m} = \sigma_{\varrho,m} \quad\text{and}\quad \sigma_{\perp,m} = \varrho_m\,\sigma_{\varphi,m}. \tag{13}$$
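The noise model of Eqs. (10) and (11) and the likelihood of Eqs. (12) and (13) can be sketched as follows. The function names, the three test visibilities, and the phase error of 0.05 rad are hypothetical; the sketch also assumes non-zero visibility amplitudes so that the standard deviations never vanish.

```python
import numpy as np

def simulate_polar_data(vis_true, sigma_phi, rng):
    """Draw noisy amplitudes and phases following Eqs. (10) and (11)."""
    amp_true = np.abs(vis_true)
    sigma_rho = amp_true * sigma_phi       # Var(d_rho) = <rho>^2 Var(d_phi)
    rho = amp_true + sigma_rho * rng.standard_normal(amp_true.shape)
    phi = np.angle(vis_true) + sigma_phi * rng.standard_normal(amp_true.shape)
    return rho, phi, sigma_rho

def f_data(model_vis, rho, phi, sigma_rho, sigma_phi):
    """Likelihood term of Eq. (12) with the standard deviations of Eq. (13)."""
    unit = np.exp(1j * phi)                # unit vector along the measured visibility
    resid = rho * unit - model_vis         # complex residuals r_m
    r_par = np.real(resid * np.conj(unit))   # component along the data
    r_perp = np.imag(resid * np.conj(unit))  # component orthogonal to the data
    s_par = sigma_rho
    s_perp = rho * sigma_phi
    return float(np.sum((r_par / s_par) ** 2 + (r_perp / s_perp) ** 2))

rng = np.random.default_rng(0)
vis_true = np.array([1.0 + 0.0j, 0.5j, -0.3 + 0.4j])   # hypothetical visibilities
sigma_phi = 0.05
rho, phi, sigma_rho = simulate_polar_data(vis_true, sigma_phi, rng)

# A model that matches the data exactly gives a zero likelihood term.
chi2_perfect = f_data(rho * np.exp(1j * phi), rho, phi, sigma_rho, sigma_phi)
# A model off by one standard deviation along each datum contributes 1 per datum.
chi2_shifted = f_data((rho - sigma_rho) * np.exp(1j * phi),
                      rho, phi, sigma_rho, sigma_phi)
```

Splitting the residual into components along and across the measured visibility is what lets independent amplitude and phase error bars be used in a quadratic, and therefore convex, data term.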

2.3.2. The regularization term f_prior

In our simulations, we test 11 different regularization terms that are commonly used in image reconstruction methods and are implemented in MiRA (see Appendix A for detailed expressions).

1. Quadratic smoothness, which most closely describes a smooth image and helps us to avoid unmeasured high frequencies.
2-3. Compactness, which describes compactness in the image plane and hence smoothness in the Fourier plane (Le Besnerais et al. 2008). Two different cases were studied in the simulations, with penalties of the second and third orders with respect to the distance from the center of the field of view (FOV).
4. Total variation (TV), which minimizes the total gradient of the image and helps us to describe uniform areas in the sought image with steep but localized changes (Strong & Chan 2003).
5. ℓ₁ smoothness, which is useful for an extended object with sharp edges since it is linear for strong gradients.
6-8. ℓ_p-norm with p = 1.5, p = 2, and p = 3. For p > 1, the ℓ_p-norm regularization tends to produce a smooth image as it reduces the variance in the pixels.
9-11. Maximum entropy methods (MEM), which aim to identify the least informative image consistent with the data (Gull & Skilling 1984; Narayan & Nityananda 1986). We try three types of entropy: MEM-sqrt, MEM-log, and MEM-prior. The first two tend to reproduce an image with a flux spread across a minimum number of pixels. The last one is minimum when the image is as close as possible to a prior image. This prior image is the Gaussian that most closely reproduces the amplitude visibility data.

The different regularization terms are expected to behave as follows. Together, the positivity and the normalization imposed in all the reconstructions amount to an ℓ₁-norm constraint and favor sparse solutions, i.e. a minimum number of bright pixels to explain the data. As most astrophysical images are smooth and compact, we expect that the regularizations that can describe these images will behave well, i.e. smoothness (quadratic or ℓ₁), compactness, and TV. The ℓ_p-norm regularization with p = 2 (and by extension with p > 1, as the regularization has the same behavior) tends to force to zero the spatial frequencies that have not been measured (according to the Parseval theorem). Since a good regularization should instead interpolate between the measured data, which is closer to reality, the ℓ_p norm is not expected to give good results. Finally, the MEM-prior is expected to yield more reliable results than the other types of entropy because our choice of the a priori image can closely describe a compact object.
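A few of these penalties can be sketched in generic, textbook form. The expressions below are common definitions written for illustration and are not necessarily identical to those of Appendix A or to the MiRA implementation; the function names and the 8 × 8 test images are hypothetical.

```python
import numpy as np

def quadratic_smoothness(img):
    """Sum of squared finite differences between neighboring pixels."""
    dx = np.diff(img, axis=1)
    dy = np.diff(img, axis=0)
    return float(np.sum(dx ** 2) + np.sum(dy ** 2))

def total_variation(img, eps=1e-12):
    """Sum of the gradient magnitudes; eps keeps the square root differentiable."""
    dx = np.diff(img, axis=1)[:-1, :]
    dy = np.diff(img, axis=0)[:, :-1]
    return float(np.sum(np.sqrt(dx ** 2 + dy ** 2 + eps ** 2)))

def lp_norm(img, p):
    """Sum of |x_n|^p over all pixels."""
    return float(np.sum(np.abs(img) ** p))

# Two normalized test images: a flat one and a single bright pixel.
flat = np.full((8, 8), 1.0 / 64.0)
point = np.zeros((8, 8))
point[4, 4] = 1.0

penalties = {
    "smoothness": (quadratic_smoothness(flat), quadratic_smoothness(point)),
    "tv": (total_variation(flat), total_variation(point)),
    "l2": (lp_norm(flat, 2), lp_norm(point, 2)),
}
```

For a normalized image, the ℓ₂ penalty is smallest for the flat image, which illustrates why ℓ_p norms with p > 1 spread the flux and smooth the result.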

3. Description of the simulations

We now describe all the various simulated data that we compute for different objects, (u,v) coverages, and signal-to-noise ratios (SNR), as well as the parameters used for the image reconstructions.

3.1. Simulated data

Our simulated data sets are saved as OI-FITS files (Pauls et al. 2005) and depend on several settings: the object type, the (u,v) coverage, and the SNR. The 90 data files are available on the JMMC website¹.

3.1.1. Astrophysical objects

Fig. 1

Astrophysical objects used in the simulations.

For our simulations, we consider ten astrophysical objects (see Fig. 1) that differ in terms of their morphology and the typical length scales of their structures.

1. LkHα: the model of LkHα describes a compact object with a peak of intensity and a smooth envelope. The model comes from the 2004 International Imaging Beauty Contest organized by Lawson for the IAU (Lawson et al. 2004).
2. A stellar surface: this is a model of the supergiant α Ori that presents some convective cells, producing small scales on a smooth background. This model was produced by Chiavassa for the 2009 Workshop on Interferometry Imaging (WII09) organized by Berger and Malbet (Berger et al., in prep.).
3. A stellar cluster: the model consists of a hundred point sources. The position and the brightness of the sources follow a normal law.
4. Eta Carinae: the image of Eta Carinae presents many different scales and structures, such as the extended gas and the stars. This image was retrieved from the Hubble Space Telescope’s website². The image was processed in two steps: the three color channels were averaged to produce a grayscale image, and the low intensities were cut to produce a null background.
5. A protoplanetary disk: the model represents a Herbig Ae/Be star with a point source (the star) and an extended structure (the disk). This model was computed by J.-P. Berger for the phase A science case of the VLTI-Spectro Imager instrument (Filho et al. 2008).
6. A limb-darkened star: we used the power-law model of Hestroffer (1997) with an exponent α = 0.3. The image has a very smooth core with steep edges.
7. The galaxy system M 51: this image of M 51 consists of as many different scales and structures as Eta Carinae (gas, stars, spiral arms). This image was retrieved from the Hubble Space Telescope’s website and was processed in a way similar to the image of Eta Carinae.
8. The AGN M 87: this AGN has a jet that consists of a narrow structure surrounded by a smooth background due to the gas. The image was retrieved from the Hubble Space Telescope’s website and was processed in the same way as the image of Eta Carinae.
9. A gravitational microlensing image: gravitational microlensing is an astronomical phenomenon due to the gravitational lens effect. When a distant star or quasar becomes sufficiently aligned with a massive compact foreground object, the bending of light due to its gravitational field leads to two distorted unresolved images resulting in an observable magnification. The image shows four very compact structures. This model was developed by J. Surdej for the phase A science case of the VLTI-Spectro Imager instrument (Filho et al. 2008).
10. A rapid rotator: the rapid rotation of a star affects the stellar shape and the local emitted flux. We used the model of Domiciano de Souza et al. (2002), with parameters D = 0.78 and T_p = 35 000 K. The resulting light distribution was projected onto a plane with an inclination of 45°.

Fig. 2

(u,v) coverage. From left to right: rich (245 sampled frequencies), medium (88 sampled frequencies), and poor (31 sampled frequencies).

Since the goal of our tests is to determine the influence of the object’s type on the image reconstruction, all the considered objects have a similar angular size of ~34 mas, which is consistent with the typical size of the FOV of an optical interferometer such as Amber/VLTI. To generalize our results, we expect the most important figure for a given object and instrument to be the number of resolved elements, which is equal to the ratio of the angular size of the object support to the effective resolution of the imaging system. Estimated by this ratio, the complexity of the objects that we have considered is in the range of 200–600 resolved elements.

3.1.2. (u,v) coverage

To study the influence of the instrumental configuration and analyze the capability of the regularization to interpolate the available data and thus fill the voids in the (u,v) plane, we consider several sparse (u,v) coverages (see Fig. 2). To remain general, our different (u,v) coverages were constructed to be uniformly spread. Each (u,v) coverage consists of concentric rings modulated by a sinusoid along the ring, with the phases of the modulations chosen to maximize the distance between the points of two adjacent rings. This also avoids symmetries in the (u,v) coverage. The rings are more densely spaced at short baselines than at the longest ones to ensure a good sampling of low spatial frequencies. In this paper, we consider three (u,v) coverages that differ in the number of samples, hereafter called the rich (245 sampled frequencies), medium (88 sampled frequencies), and poor (31 sampled frequencies) coverages. The chosen (u,v) configurations could be considered very good by the standards of existing data, but the goal of the paper is not to cover all possible cases but to show the general trend.

3.1.3. Signal-to-noise ratio

To investigate the effects of varying the SNR, we use an SNR factor γ in the standard deviations given by

$$\sigma_{\varrho,m} = \gamma\,\langle\varrho_m\rangle \quad\text{and}\quad \sigma_{\varphi,m} = \gamma. \tag{14}$$

With these settings, the error bars become

$$\sigma_{/\!/,m} = \sigma_{\perp,m} = \gamma\,\langle\varrho_m\rangle. \tag{15}$$

To analyze the influence of the SNR on the reconstructed images, three values of γ are tested: high SNR (1%), intermediate SNR (5%), and low SNR (10%). We limit our study to these SNR values to keep the number of results manageable. There is therefore no systematic attempt to search for the limiting SNR, and we realize that our worst case (10%) might be considered moderate noise in real data.
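As a short check, Eq. (15) follows from inserting the settings of Eq. (14) into Eq. (13). The last step assumes that the measured amplitude is close to its expected value, ϱ_m ≈ ⟨ϱ_m⟩; this approximation appears to be implied by the text rather than stated explicitly.

```latex
\begin{align}
\sigma_{/\!/,m} &= \sigma_{\varrho,m} = \gamma\,\langle\varrho_m\rangle, \\
\sigma_{\perp,m} &= \varrho_m\,\sigma_{\varphi,m} = \gamma\,\varrho_m
                  \approx \gamma\,\langle\varrho_m\rangle .
\end{align}
```

For example, with γ = 0.05 and ⟨ϱ_m⟩ = 0.8, both standard deviations are approximately 0.04, so the error distribution of each complex visibility is close to isotropic in the complex plane.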

Fig. 3

Left panel shows a plot of the MSE as a function of the hyperparameter μ. The different colors correspond to different levels of SNR (red high, blue intermediate, green poor). For each curve, the optimal value μ⁺ is labeled by a number (1, 2, and 3). The corresponding images are shown in the bottom row of the right panel. The top row of the right panel shows three reconstructed images with different values of μ, labeled by a letter on the red curve of the left part (A an under-regularized image, B the best image, and C an over-regularized image). This example is made for the galaxy object and the medium (u,v) coverage. The regularization is the MEM-prior one.

Fig. 4

Top row: scatter plots representing the optimal value of the hyperparameter μ⁺ as a function of the MSE of the images. Bottom row: the corresponding histograms of MSE and μ⁺. Left column: the colors and symbols indicate the different classes of objects. Right column: the colors and symbols indicate the different regularization classes.

3.2. Parameters of the synthesized image

The resolution and the FOV of the reconstructed image have to be chosen with care. On the one hand, the pixel size Δθ must be small enough to properly account for the highest spatial frequencies available in the data. Using Shannon’s rule, we find that $\Delta\theta \leqslant \lambda/(2\,B_{\rm max})$, where B_max is the maximum length of the observed baselines and λ the wavelength. In our reconstructions, the pixel size is one third of the limit set by Shannon’s rule. This oversampling allows us to check whether image reconstruction can achieve some level of super-resolution. For instance, this yields Δθ = 0.4 mas at λ = 2.2 μm with B_max = 190 m. On the other hand, the FOV has to be large enough to avoid field aliasing and therefore we take an image three times larger than the object itself. As all our objects have a size of 34 mas and thus approximately fit into 85 × 85 pixels, the reconstructed images have 256 × 256 pixels.
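The numbers quoted above can be reproduced with a few lines of arithmetic. The helper name `shannon_pixel_mas` is hypothetical, and the final rounding from 255 to 256 pixels is an assumption about how the image size was chosen; the text only states the resulting 256 × 256 size.

```python
import math

MAS_PER_RAD = math.degrees(1.0) * 3600.0 * 1000.0   # milliarcseconds per radian

def shannon_pixel_mas(wavelength_m, b_max_m):
    """Largest pixel size allowed by Shannon's rule, lambda / (2 B_max), in mas."""
    return wavelength_m / (2.0 * b_max_m) * MAS_PER_RAD

limit_mas = shannon_pixel_mas(2.2e-6, 190.0)   # Shannon limit, close to 1.2 mas
pixel_mas = limit_mas / 3.0                    # one third of the limit, close to 0.4 mas

object_pixels = round(34.0 / 0.4)              # 34 mas object with 0.4 mas pixels
fov_pixels = 3 * object_pixels                 # FOV three times the object size
```

With these values the object spans 85 pixels and the threefold FOV spans 255 pixels, consistent with the 256 × 256 images used in the reconstructions.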

3.3. Reconstruction strategy

Given the data (determined by the object, the (u,v) coverage, and the noise realization) and the regularization, we conduct a sequence of image reconstructions for different values of the regularization weight μ. Since the problems solved are convex (as explained in Sect. 2.3.1), their solutions do not depend on the starting point. To reduce the calculation time, we therefore try to use a starting point that is as close as possible to the final solution. We begin the sequence of reconstructions with the highest value of μ and use the true image as the starting point for this first reconstruction. We then reduce μ and use the image previously obtained as the starting point. This latter step is repeated until we reach the lowest value of μ. Each reconstruction is an iterative process that is stopped when the norm of the gradient of the penalty function f(x) is below a preset threshold. This threshold is derived from the true image according to

$$\left\| \nabla f(\boldsymbol{x}^{\rm rec}) \right\| \le \eta\, \frac{\left| f_\mu(\boldsymbol{x}^{\rm ref}) \right|}{\left\| \boldsymbol{x}^{\rm ref} \right\|} = 10^{-5}\, \left| f_\mu(\boldsymbol{x}^{\rm ref}) \right|, \tag{16}$$

where η > 0 is a small value and $f_\mu(\boldsymbol{x}^{\rm ref})$ is the penalty computed for the true image and for each value of μ. Since the ℓ₂ norm of the true image is normalized, the factor $\|\boldsymbol{x}^{\rm ref}\|$ drops out of the threshold; we take η = 10⁻⁵.
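The continuation strategy described above, from the highest to the lowest μ with warm starts and a gradient-norm stopping rule, can be sketched on a toy convex problem. Everything here is an illustrative assumption: the random linear operator stands in for the Fourier model, the prior is a simple quadratic, plain gradient descent replaces the optimizer used by MiRA, and the positivity and normalization constraints are omitted.

```python
import numpy as np

def penalty(A, y, mu, x):
    """Toy penalty f_mu(x) = ||A x - y||^2 + mu ||x||^2."""
    return float(np.sum((A @ x - y) ** 2) + mu * np.sum(x ** 2))

def reconstruct(A, y, mu, x0, threshold, max_iter=100_000):
    """Gradient descent stopped when ||grad f|| <= threshold, as in Eq. (16)."""
    lipschitz = 2.0 * (np.linalg.norm(A, 2) ** 2 + mu)   # Lipschitz constant of grad f
    x = x0.copy()
    for it in range(max_iter):
        grad = 2.0 * (A.T @ (A @ x - y)) + 2.0 * mu * x
        if np.linalg.norm(grad) <= threshold:
            return x, it
        x = x - grad / lipschitz
    return x, max_iter

rng = np.random.default_rng(0)
x_ref = rng.random(10)
x_ref /= x_ref.sum()                         # normalized "true" image
A = rng.standard_normal((40, 10))            # toy linear model (not a Fourier operator)
y = A @ x_ref + 0.01 * rng.standard_normal(40)

eta = 1e-5
x_start = x_ref                              # the first run starts from the true image
results = {}
for mu in [10.0, 1.0, 0.1, 0.01]:            # from the highest to the lowest weight
    threshold = eta * abs(penalty(A, y, mu, x_ref)) / np.linalg.norm(x_ref)
    x_rec, n_iter = reconstruct(A, y, mu, x_start, threshold)
    results[mu] = (x_rec, n_iter, threshold)
    x_start = x_rec                          # warm start for the next, smaller mu
```

Because each problem is convex, the warm start changes only the number of iterations, not the solution reached.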

3.4. Image quality criterion: the mean-squared error

To assess the quality of the reconstructed images, we consider the mean-squared error (MSE) of the reconstructed images. In our simulations, we use normalized input images with the same pixel size as for the reconstructed image. Hence, to compare the reconstructed image x^rec and the true image x^ref, we can simply use the MSE defined as

$$\mathsf{MSE} = \frac{1}{N} \sum_n \left( x^{\rm rec}_n - x^{\rm ref}_n \right)^2 = \frac{1}{N} \left\| \boldsymbol{x}^{\rm rec} - \boldsymbol{x}^{\rm ref} \right\|^2, \tag{17}$$

where N is the total number of pixels. We note that, since we use complex visibilities and our priors, apart from the one tied to the FOV, are shift-invariant, the reconstructed image is correctly centered and there is no need to compensate for registration errors.
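A direct implementation of this criterion is short. The function name `mse` and the 2 × 2 test images below are hypothetical and serve only to illustrate Eq. (17).

```python
import numpy as np

def mse(x_rec, x_ref):
    """Mean-squared error between two images on the same pixel grid, Eq. (17)."""
    x_rec = np.asarray(x_rec, dtype=float)
    x_ref = np.asarray(x_ref, dtype=float)
    if x_rec.shape != x_ref.shape:
        raise ValueError("images must share the same shape and pixel size")
    return float(np.mean((x_rec - x_ref) ** 2))

x_ref = np.array([[0.0, 0.5], [0.5, 0.0]])
x_rec = np.array([[0.1, 0.4], [0.5, 0.0]])
error = mse(x_rec, x_ref)   # two pixels off by 0.1 each, averaged over four pixels
```

The comparison is only meaningful when both images are normalized in the same way and share the same sampling, as is the case in the simulations.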

4. Results and discussion

We performed a total of ~24 000 simulations corresponding to the reconstruction of all cases described in Sect. 3 with different values of the hyperparameter μ. We present here the results of these simulations and analyze the consequences for the image reconstruction process in order to draw general conclusions.

We begin by discussing the optimal value of the hyperparameter and determining whether the MSE is a good quality criterion. We then discuss the effects of the following parameters:

- the regularization: what are the good and bad regularizations?
- the limits: what are the minimal (u,v) coverage and SNR value?
- the hyperparameter μ: what is the optimal value? With which parameters does it vary?
- the likelihood term: how does it vary? Can it be used to tune the regularization term instead of the hyperparameter μ?
- the effective resolution: what degree of super-resolution can be achieved? How does it vary?

4.1. Optimal regularization weight μ⁺

We first investigate whether there is an optimal regularization weight μ for a given situation. Therefore, for each object, configuration, SNR level, and regularization, we reconstruct an image for different values of the hyperparameter μ.

The top row of the right panel of Fig. 3 shows the effects of different values of μ, whereas the left panel displays the obtained MSE (see Sect. 3.4) for each value of μ. For too small a value of μ, the under-regularized image (labeled with an A) has plenty of artifacts. In contrast, for too large a μ, the over-regularized image (labeled with a C) is blurred, and many fine features are lost. These two extreme situations correspond to high values of the MSE (A and C), but there is a situation where the MSE reaches a minimum and the image appears to have far fewer artifacts (B). Visual comparison of the A and C images with the one obtained for B confirms that the MSE is a good criterion to correctly set the regularization weight.

We conclude that there is indeed an optimal value of μ, namely the one that gives the smallest MSE: $\mu^{+} = \underset{\mu}{\mathrm{arg\,min}}\, \left\|\boldsymbol{x}^{\mathrm{rec}}_{\mu} - \boldsymbol{x}^{\mathrm{ref}}\right\|^2,$ (18) where $\boldsymbol{x}^{\mathrm{rec}}_{\mu}$ is the image reconstructed with a regularization weight set to μ. As the minimum of the curve is quite flat, the optimal value of μ is not precisely defined and may vary by a factor of two to three with a negligible influence on the result.

This procedure cannot be used in practice because the true image is obviously unknown. Nevertheless, this procedure allows us to define the most accurate image that can be reconstructed given the data and the type of regularization. In the following analysis, the reconstructed images are always obtained with μ = μ⁺.
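The selection of μ⁺ in Eq. (18) amounts to a scan over μ followed by an arg-min on the MSE. The sketch below illustrates this on a toy problem only: the function `reconstruct` is a hypothetical stand-in (a 1-D quadratic-smoothness denoiser with a closed-form solution), not the MiRA algorithm, and the signal, noise level, and μ grid are arbitrary assumptions.

```python
import numpy as np

def reconstruct(y, mu):
    """Toy stand-in for an image reconstruction (NOT MiRA).

    Minimizes ||x - y||^2 + mu * ||D x||^2 for a 1-D signal, where D takes
    first differences; the minimizer solves a linear system.
    """
    n = y.size
    D = np.diff(np.eye(n), axis=0)            # (n-1, n) finite differences
    return np.linalg.solve(np.eye(n) + mu * (D.T @ D), y)

def mse(x_rec, x_ref):
    return np.mean((x_rec - x_ref) ** 2)

rng = np.random.default_rng(0)                # fixed seed for reproducibility
x_ref = np.sin(np.linspace(0.0, 2.0 * np.pi, 64))   # "true" object
y = x_ref + 0.3 * rng.standard_normal(x_ref.size)   # noisy data

mus = np.logspace(-3, 3, 25)                  # hypothetical grid of weights
errors = np.array([mse(reconstruct(y, mu), x_ref) for mu in mus])
best = int(np.argmin(errors))
mu_plus = mus[best]                           # Eq. (18) restricted to the grid
print(mu_plus, errors[best])
```

As in the text, this is only possible because the reference `x_ref` is known; with real data the scan cannot be scored this way.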

4.2. Dependence of μ⁺ on the MSE quality criterion

We wish to determine the type of relationship that links the optimized regularization factor μ⁺ to the MSE in order to detect any trends. The scatter plot in Fig. 4 reports the values of μ⁺ and MSE obtained for each simulation. In the left panel, the different colors and symbols correspond to different objects. In the right panel, the different colors and symbols correspond to different regularizations. In the bottom row, we present our computed histograms of MSE (left) and μ⁺ (right). As in the rest of the paper, the plotted histograms are approximations of the probability density functions (PDF) of our results. These curves were computed from our samples following the optimal method described in Scott (1992).

In the left part of Fig. 4, the different colors representing the objects seem to be aligned vertically. We therefore conclude that the MSE depends mostly on the structure of the observed object.

More precisely, two classes of objects can be distinguished in the distribution of MSE in the left panel of Fig. 4. We therefore grouped together the objects with similar behaviors in the top panels of Fig. 5: (i) the objects with very compact sources, i.e. the star cluster, the protoplanetary disk and the microlensing (curve in blue labeled B); and (ii) the other objects with extended structures (curve in red labeled C). The MSE is systematically higher for objects of the first class.

Following these observations, we try to find a way of renormalizing the MSE as a function of the object. To join the curves together, we define a new MSE criterion, called MSE⁺, by dividing each MSE by the smallest MSE obtained for the same object. As expected, this normalization cancels the different object classes, as seen in the right part of Fig. 5. Two peaks clearly appear on the graph, distinguishing the good reconstructions (left peak) from the bad ones (right peak). An example of each case is shown in the bottom part of Fig. 5. The low quality reconstructions are caused by bad configurations (not enough (u,v) points, low SNR) and/or bad regularizations, as can be seen in the right part of Fig. 6 and as described in the next sections. To be sure that the peaks represent the good and bad reconstructions and do not come from different object types, we verify that each object is represented in each peak, as shown in the left part of Fig. 6.

By visual inspection, we assess that the value of the MSE⁺ leads to a correct ordering of the images reconstructed for a given object when the other settings change (data quality, type of regularization, etc.): the lower the MSE⁺, the higher the quality of the image is. Other attempts at the renormalization of the MSE are explained in Appendix B.
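The MSE⁺ normalization can be sketched as a per-object division by the smallest MSE. In the example below, the function name `mse_plus`, the object labels, and the MSE values are all invented for illustration; they are not results from the simulations.

```python
import numpy as np

def mse_plus(mse_values, object_labels):
    """Divide each MSE by the smallest MSE obtained for the same object."""
    mse_values = np.asarray(mse_values, dtype=float)
    object_labels = np.asarray(object_labels)
    out = np.empty_like(mse_values)
    for label in np.unique(object_labels):
        mask = object_labels == label
        out[mask] = mse_values[mask] / mse_values[mask].min()
    return out

# Invented values: the compact object has systematically higher raw MSE.
mse_values = [2e-6, 8e-6, 4e-6, 1e-8, 5e-8]
labels = ["cluster", "cluster", "cluster", "galaxy", "galaxy"]
result = mse_plus(mse_values, labels)
print(result)
```

By construction the best reconstruction of each object has MSE⁺ = 1, so objects of both classes share a common scale.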

Now that a good quality criterion has been defined and the optimal value of μ has been obtained, we can study the other parameters.

4.3. Limits due to the (u,v) coverage and the SNR

Fig. 5

Upper row: distribution of the MSE (left) and the MSE⁺ (right). The colors and letters represent the two classes of objects: blue/B for the objects with very compact structures, red/C for the others. The total distribution is shown in the black/A curve. Bottom row: example of reconstructed images for the good (left) and bad (right) MSE⁺ peak.

Fig. 6

Distribution of MSE⁺. Left: histograms of MSE⁺ for different objects in different colors; the gray zone corresponds to the total distribution, all objects combined. Right: solid line, all the configurations and regularizations are kept; dashed line, with the sparsest (u,v) coverage removed; dot-dashed line, with the bad regularizations removed; gray zone, with the sparsest (u,v) coverage and bad regularizations removed.

Fig. 7

Left: cumulative distributions of the ranks reached by the different configurations of (u,v) coverage and SNR. Right: the histograms of the MSE⁺ for different configurations of (u,v) coverage and SNR represented by different colors.

Fig. 8

Cumulative distributions of the ranks reached by the regularizations. Left: all objects; Middle: objects with very small structures; Right: other objects.

In this section, we classify the observational configurations ((u,v) coverage and SNR) on the basis of the MSE⁺. For each data set (unique combination of object and regularization), we order the pair [(u,v) coverage, SNR] according to the value of MSE⁺ they reach, giving them a rank from 1 for the best configuration (lowest MSE⁺) to 9 for the worst (highest MSE⁺). In the left panel of Fig. 7, we display the cumulative distributions of the ranks reached by every configuration, determining how many times a given configuration reaches at least the first rank, the second rank, etc. The highest quality configurations are the ones in the upper-left part of the plot.
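The ranking and its cumulative distribution can be sketched as follows. The array layout (one row per data set, one column per configuration), the function names, and the toy MSE⁺ values are assumptions made for this example.

```python
import numpy as np

def ranks_per_dataset(mse_plus):
    """Rank configurations within each data set (row); rank 1 = lowest MSE+."""
    order = np.argsort(mse_plus, axis=1)
    ranks = np.empty_like(order)
    rows = np.arange(mse_plus.shape[0])[:, None]
    ranks[rows, order] = np.arange(1, mse_plus.shape[1] + 1)
    return ranks

def cumulative_rank_counts(ranks):
    """counts[c, r-1] = number of data sets where configuration c reaches
    rank r or better."""
    levels = np.arange(1, ranks.shape[1] + 1)
    return (ranks.T[:, :, None] <= levels).sum(axis=1)

# Toy table: 2 data sets (rows) x 3 configurations (columns), invented values.
table = np.array([[1.0, 3.0, 2.0],
                  [1.5, 1.0, 4.0]])
ranks = ranks_per_dataset(table)
counts = cumulative_rank_counts(ranks)
print(ranks)
print(counts)
```

A configuration whose cumulative curve rises early (large counts at low ranks) corresponds to the upper-left part of the plot described above.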

The poor (u,v) coverage combined with any value of SNR is clearly too sparse to reconstruct good images. In contrast, acceptable reconstruction is possible for any considered SNR when there are enough samples in the (u,v) coverage. We deduce that there is a minimal (u,v) coverage needed to perform image reconstruction, whereas there is no such limit set by the tested SNR. However, when the (u,v) coverage is sufficient, the higher the SNR or the more filled the (u,v) coverage, the higher the quality of the reconstructed image is. The bottom row of the right panel of Fig. 3 shows how the visual quality of the optimal image depends on the SNR. As expected, the higher the SNR, the better the reconstructed image is.

Figure 7 (right) shows the histogram of the MSE⁺ for different configurations of (u,v) coverage and SNR. It indicates that the success of the image reconstruction is more influenced by the amount of data than by the SNR: the MSE⁺ is lower for a rich (u,v) coverage with a poor SNR than for a medium (u,v) coverage with an intermediate SNR. We note that all the tested (u,v) coverages are uniform, but we expect that the amount of data has to be both sufficient and homogeneously spread in the (u,v) plane.

After the removal of the sparsest coverage, the MSE⁺ distribution is shown as the dashed line in Fig. 6. There is still a small bump of bad MSE⁺ caused by bad regularizations, as discussed in the next section.

4.4. Quality of the regularizations

Using the same principles as in the previous section, we classify the regularizations as a function of the MSE⁺ for different realizations. Figure 8 shows the corresponding cumulative distributions. Isotropic TV appears to be the most successful regularization for the two classes of objects. The compactness prior with $w^{\mathrm{prior}}_n = |\boldsymbol{\theta}_n|^2$ is the second highest quality choice for the very compact objects. The worst regularizations are the ℓ_p-norm, the MEM-sqrt, and the MEM-log, as expected and explained in Sect. 2.3.2. Reconstructions for good and bad regularizations are illustrated in Fig. 9.

Fig. 9

Example of reconstructed images of the galaxy image with the medium (u,v) coverage and the intermediate SNR for different regularizations. From left to right, TV, compactness θ², ℓ_p norm with p = 2, MEM-log.

In the MSE⁺ distribution after the elimination of the bad regularizations (dot-dashed line in Fig. 6), there is still much evidence of lower quality MSE⁺. We conclude that the bad MSE⁺ are mainly caused by the sparsest (u,v) coverage. However, both have to be eliminated to obtain a cleaned sample of reconstructed images.

In what follows, we made a selection and only retained the regularizations (quadratic and ℓ₁ smoothness, compactness, isotropic TV and MEM with a Gaussian prior) and the (u,v) coverages (rich and medium) that lead to correct reconstructed images. We kept data for all values of SNR.

4.5. Predetermined value of the hyperparameter μ⁺?

In a fully Bayesian framework, the data noise and the variability of the object are independent. We therefore expect that μf_prior does not depend on the data (Tarantola 2005). As a result, for a given type of object, the optimal value for μ should be the same regardless of the SNR or the (u,v) coverage. However, we are not in a truly Bayesian framework because the regularizations are derived from general considerations that must help us to solve for the degeneracies of the problem and are not really derived from the statistics of the brightness distributions of the observed objects. Since the degeneracies of the problem are mostly due to the sparsity of the (u,v) coverage, this observational parameter may have some influence on the tuning of the regularization weight. Our expectation is thus that, for a given type of object, a quasi optimal value of μ⁺ can be derived from simulations and from simple considerations to scale this parameter if different image reconstruction settings are used.

Fig. 10

Histogram of μ⁺ for the TV regularization. Left: the colors correspond to different objects. Right: the colors correspond to different pair [(u,v) coverage–SNR].

The hyperparameter μ⁺ depends mostly on the regularizations, as seen in the right part of Fig. 4, where the colors, representing the regularizations, appear aligned horizontally and the pre-eminent peaks of the histograms are separated. Moreover, the hyperparameter μ appears to be quasi independent of the SNR and the (u,v) coverage, as expected. As seen in Fig. 10 for the TV regularization, if there is still a variation in the value of μ, it depends on the object morphology but not on the (u,v) plane and the SNR value.

This finding allows us to link each regularization to a mean value of the hyperparameter μ. This mean value gives a useful starting point for the user (see Table 1). The equations to rescale this value in the case of different image settings are given in Appendix C.

Table 1

Table of the mean values of μ for each regularization.

4.6. How do different noise realizations affect the MSE and the optimal μ?

In all simulations, in order not to influence the classification of the results, the same random seed is used to compute the noisy data. Therefore, we test 100 noise realizations for each regularization in the case of the galaxy with a medium (u,v) coverage and an intermediate SNR to study its impact on the curves computed from a single realization. From Fig. 11, we conclude that the MSE is not very different, thus the image reconstruction does not depend on the particular noise realization (as shown for example in Fig. 12). Moreover, the optimal value of μ varies by less than an order of magnitude.

4.7. The effective spatial resolution

To investigate whether super-resolution is achievable and to quantify the amount of trustable super-resolution, we estimate the effective resolution of the reconstructed images. We define the effective resolution as the full width at half maximum (hereafter FWHM) of the Gaussian that yields the smallest value of MSE between the reconstructed image and the true image smoothed by that Gaussian: $\mathrm{FWHM} = \underset{w}{\mathrm{arg\,min}}\, \left\|\boldsymbol{x}^{\mathrm{rec}} - \mathbf{G}(w)\cdot\boldsymbol{x}^{\mathrm{ref}}\right\|^2,$ (19) where G(w) is a linear smoothing filter that convolves its argument by a Gaussian of FWHM equal to w.
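Equation (19) can be sketched as a one-dimensional search over the width w. The example below is an illustration under stated assumptions: the smoothing G(w) is implemented with an FFT (hence periodic boundaries) using the analytic Gaussian transfer function, widths are in pixels rather than in units of the interferometric resolution, and the search is a plain grid scan. The names `gaussian_smooth` and `effective_fwhm` are hypothetical.

```python
import numpy as np

def gaussian_smooth(image, fwhm):
    """Convolve an image with a unit-sum Gaussian of given FWHM (pixels).

    Uses the FFT, so the convolution is periodic (an assumption of this sketch).
    """
    sigma = fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    f1 = np.fft.fftfreq(image.shape[0])[:, None]
    f2 = np.fft.fftfreq(image.shape[1])[None, :]
    transfer = np.exp(-2.0 * (np.pi * sigma) ** 2 * (f1 ** 2 + f2 ** 2))
    return np.real(np.fft.ifft2(np.fft.fft2(image) * transfer))

def effective_fwhm(x_rec, x_ref, candidates):
    """Grid-search version of Eq. (19)."""
    errors = [np.sum((x_rec - gaussian_smooth(x_ref, w)) ** 2)
              for w in candidates]
    return candidates[int(np.argmin(errors))]

# Toy reference: two point sources; the "reconstruction" is the reference
# blurred to a known width, so the search should recover that width.
x_ref = np.zeros((32, 32))
x_ref[10, 12] = 0.6
x_ref[20, 18] = 0.4
candidates = np.linspace(0.5, 8.0, 16)        # trial widths in pixels
x_rec = gaussian_smooth(x_ref, candidates[5])
print(effective_fwhm(x_rec, x_ref, candidates))
```

A continuous minimizer could replace the grid scan; the grid keeps the sketch dependency-free and makes the flatness of the minimum easy to inspect.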

As shown in the left part of Fig. 13, the effective resolution varies with both the amount of data and the SNR: the more extensively the (u,v) parameter space is filled and the higher the SNR, the higher the effective resolution is. As for the MSE (cf. Sect. 4.3), the effective resolution is more influenced by the (u,v) coverage than by the SNR. Super-resolution can be achieved in only the best cases.

The bottom-right part of Fig. 13 shows the expected behavior of the effective resolution: the higher the hyperparameter μ, the lower the effective resolution is. In other words, the more the image is regularized, the fewer the details that are visible.

An interesting question is why this super-resolution exists. Although we do not have a definitive answer, we can speculate that it is due to the role of the positivity in the image reconstruction (Biraud 1969). This is confirmed by our simulations (cf. the upper right part of Fig. 13): without the positivity constraints, the distribution of the FWHMs has a peak higher than when using the positivity constraints.

4.8. Other methods for tuning the regularization?

Fig. 11

Variation in MSE and μ with different noise realizations. Blue: the quartile curves of the realizations (25% as a dashed line, 50% as a solid line, and 75% as a dotted line). Red: the mean optimal μ (cross) and its variance.

Fig. 12

Reconstructed images for the best (left) and the worst (right) noise realization in the TV case.

For fully Gaussian statistics (i.e. all the likelihood and prior penalties are quadratic and there are no constraints such as the normalization and the positivity), the expected values of the total penalty function f(x⁺) and the likelihood term f_data(x⁺) at the optimal solution x⁺ are given by (Tarantola 2005) $\begin{aligned} \mathrm{E}\{f(\boldsymbol{x}^{+})\} &= N_{\mathrm{data}}, \\ \mathrm{E}\{f_{\mathrm{data}}(\boldsymbol{x}^{+})\} &= N_{\mathrm{data}} - \mathrm{edf}, \end{aligned}$ where N_data is the number of data points (for both visibility amplitudes and phases), and edf is the number of equivalent degrees of freedom. Although our statistics are not fully Gaussian, the left panel of Fig. 14 shows that the distribution of f(x⁺) peaks approximately at the value of N_data. The spread of this distribution, however, prevents us from tuning the regularization level according to the criterion that f(x⁺) ≈ N_data.

The value of f_data(x⁺)/N_data is around 0.1, which means that edf/N_data ≈ 90%, i.e. about 90% of the data information is resolved. The image reconstruction is able to estimate almost as many parameters as data points. Since the f_data(x⁺)/N_data ratio has a smaller range (from ~0.3 to ~0.003) than the possible value of the hyperparameter μ (from 100 to 10¹³), it may be easier to tune the balance between the terms in the penalty function by means of the f_data(x⁺)/N_data value instead of μ. However, this criterion may be highly variable with the noise statistics, and a study of the variation in f_data(x⁺)/N_data with different noise statistics should be done before using it as a tuning criterion (Gull 1988; Pichon & Thiebaut 1998).
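The 90% figure follows from the expectation of the likelihood term given above. As a short worked step, and assuming that the observed value of f_data(x⁺) can stand in for its expected value (which is only approximate here, since the statistics are not fully Gaussian):

```latex
\begin{align}
  \frac{\mathrm{edf}}{N_{\mathrm{data}}}
    &= 1 - \frac{\mathrm{E}\{f_{\mathrm{data}}(\boldsymbol{x}^{+})\}}{N_{\mathrm{data}}}
     \approx 1 - \frac{f_{\mathrm{data}}(\boldsymbol{x}^{+})}{N_{\mathrm{data}}}
     \approx 1 - 0.1 = 0.9 .
\end{align}
```

Under the same assumption, the quoted range of f_data(x⁺)/N_data (from ~0.3 to ~0.003) would correspond to edf/N_data between about 0.7 and 0.997.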

Fig. 13

Left: FWHM of the Gaussian computed for the effective resolution in units of the interferometric resolution of the data. Upper right: comparison between the FWHM with (green) and without (magenta) the constraint of positivity. Bottom right: variation in the effective resolution as a function of the hyperparameter μ for three different SNR (red high, blue intermediate, green poor). The regularization used is the TV one.

Fig. 14

Histograms of f/N_data and f_data/N_data. Solid line: all configurations and regularizations are kept. Dashed line: with the sparsest (u,v) coverage removed. Dot-dashed line: with the bad regularizations removed. In gray zone: with the sparsest (u,v) coverage and bad regularizations removed.

Fig. 15

An example of the L-curve in the simulations for the TV regularization with a zoom on the right part.

Another way of determining the value of μ is the so-called L-curve. The L-curve is a log-log plot of the regularization term f_prior versus the likelihood term f_data for a range of the hyperparameter μ. In the L-curve criterion, the regularization parameter μ is such that the corresponding point on the L-curve lies in the corner. This choice is motivated by the corner being the separation between the flat part where the solution is dominated by regularization errors and the vertical part where it is dominated by the perturbation errors (Hansen 2000). The correct behavior of the L-curve is confirmed by the simulations, as seen in Fig. 15 (right): since the curves are plotted as a function of the highest quality image, they should cross in their corner, which is globally the case. We note that the shape of the L-curve seems more complicated as there are at least two corners and not only one (see Fig. 15 left). The L-curve appears to be an appropriate tool for finding the optimal value of μ but a more general study has to be done to confirm its suitability (see comparison with GCV) and its practical implementation in an image reconstruction algorithm.
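The L-curve construction can be sketched on a toy linear problem. Everything below is an assumption made for illustration: the forward operator `A` is a random ill-conditioned matrix rather than an interferometric model, the regularization is a plain quadratic norm, and the corner is located as the point of maximum signed curvature of the log-log curve (a common heuristic, not necessarily the procedure used for Fig. 15).

```python
import numpy as np

rng = np.random.default_rng(1)
n_data, n_pix = 40, 20
# Hypothetical ill-conditioned forward operator and noisy data.
A = rng.standard_normal((n_data, n_pix)) @ np.diag(np.logspace(0, -3, n_pix))
y = A @ rng.standard_normal(n_pix) + 1e-2 * rng.standard_normal(n_data)

mus = np.logspace(-8, 2, 41)                  # uniform grid in log(mu)
f_data = np.empty(mus.size)
f_prior = np.empty(mus.size)
for k, mu in enumerate(mus):
    # Minimizer of ||A x - y||^2 + mu * ||x||^2
    x = np.linalg.solve(A.T @ A + mu * np.eye(n_pix), A.T @ y)
    f_data[k] = np.sum((A @ x - y) ** 2)      # likelihood term
    f_prior[k] = np.sum(x ** 2)               # regularization term

def lcurve_corner(f_data, f_prior):
    """Index of maximum signed curvature of (log f_data, log f_prior)."""
    xi, eta = np.log(f_data), np.log(f_prior)
    d_xi, d_eta = np.gradient(xi), np.gradient(eta)
    dd_xi, dd_eta = np.gradient(d_xi), np.gradient(d_eta)
    denom = (d_xi ** 2 + d_eta ** 2) ** 1.5 + np.finfo(float).tiny
    return int(np.argmax((d_xi * dd_eta - dd_xi * d_eta) / denom))

corner = lcurve_corner(f_data, f_prior)
mu_corner = mus[corner]
print(mu_corner)
```

When the curve has more than one corner, as noted above for the simulations, a global maximum of the curvature picks only one of them, so the curve should still be inspected visually.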

5. Conclusions

Thanks to the use of a flexible algorithm devoted to image reconstruction in optical interferometry, we have performed a detailed study of the regularization terms. This study is the first one to compare such a number of regularizations on an equal footing, i.e. with the same algorithm and using the same data type. Performing these systematic tests has allowed us to discuss the different parameters and terms in the image reconstruction and to extract some practical rules, which are summarized in the following:

1.
A minimal (u,v) coverage is necessary to reconstruct an image. Even if the image quality improves with the SNR, such a limit does not exist for the SNR. In other words, in the image reconstruction technique, increasing the number of telescopes is more interesting than constructing larger ones. The homogeneity of the (u,v) coverage is probably also critical but has not been tested.
2.
Some regularizations are suitable for the optical image reconstruction and others not, regardless of the object being targeted. The holes in the (u,v) plane are the major issue in optical interferometry and the main role of the regularization is the correct interpolation of the missing data, regardless of the object. The highest quality regularization among the tested ones is the isotropic total variation.
3.
The hyperparameter μ does not depend on the (u,v) coverage and the SNR, as theoretically expected, and depends mostly on the type of regularization. An optimal value for each regularization tested in this paper is given in Table 1. This optimal value may vary by a factor of 2–3 without there being any major changes in the images. A slight dependence on both the object structures and pixel size is also discernible, and the equations to rescale the optimal values are given in Appendix C. It would be interesting to implement regularizations independent of the pixel size.
4.
Super-resolution can be achieved in the image reconstruction process and its level rises with the (u,v) coverage filling.
5.
There are various possible ways of tuning the regularization level:
- The visual tuning is enough as the μ value can slightly vary without causing any large changes in the reconstructed image.
- Setting the likelihood term f_data seems to be a more effective way of fixing the balance between the regularization and the likelihood terms. However, the variation in the likelihood term with noise statistics needs to be investigated.
- The L-curve criterion could give correct results.

¹

http://apps.jmmc.fr/oidata/shared/srenard/

²

http://hubblesite.org/

Acknowledgments

This research has made use of Yorick, a free data processing language written by D. Munro (http://yorick.sourceforge.net) and the Service de Calcul Intensif de l’Observatoire de Grenoble.

References

Biraud, Y. 1969, A&A, 1, 124
Candes, E. J., Romberg, J., & Tao, T. 2006, IEEE Trans. Info. Theory, 52, 489
Charbonnier, P., Blanc-Féraud, L., Aubert, G., & Barlaud, M. 1997, IEEE Trans. Image Process., 6, 298
Delplancke, F., Derie, F., Paresce, F., et al. 2003, Ap&SS, 286, 99
Domiciano de Souza, A., Vakili, F., Jankov, S., Janot-Pacheco, E., & Abe, L. 2002, A&A, 393, 345
Filho, M. E., Renard, S., Garcia, P., et al. 2008, in SPIE Conf. Ser., 7013
Goodman, J. W. 1985, Statistical Optics (John Wiley & Sons)
Gull, S. F. 1988, in Maximum Entropy and Bayesian Methods in science and engineering, ed. G. J. Erickson, & C. R. Smith
Gull, S. F., & Skilling, J. 1984, in Indirect Imaging. Measurement and Processing for Indirect Imaging, ed. J. A. Roberts, 267
Hansen, P. C. 2000, in Computational Inverse Problems in Electrocardiology, ed. P. Johnston, Advances in Computational Bioengineering (WIT Press), 119
Haubois, X., Perrin, G., Lacour, S., et al. 2009, A&A, 508, 923
Hestroffer, D. 1997, A&A, 327, 199
Kraus, S., Weigelt, G., Balega, Y. Y., et al. 2009, A&A, 497, 195
Kraus, S., Hofmann, K., Menten, K. M., et al. 2010, Nature, 466, 339
Lawson, P. R., ed. 2000, Principles of Long Baseline Stellar Interferometry
Lawson, P. R., Cotton, W. D., Hummel, C. A., et al. 2004, BAAS, 36, 1605
Le Besnerais, G., Lacour, S., Mugnier, L. M., et al. 2008, IEEE J. Select. Topics Signal Process., 2, 767
Le Bouquin, J., Lacour, S., Renard, S., et al. 2009, A&A, 496, L1
Malbet, F., & Perrin, G. 2007, New Astron. Rev., 51, 563
Meimon, S., Mugnier, L. M., & Le Besnerais, G. 2005, J. Opt. Soc. Am. A, 22, 2348
Monnier, J. D., Zhao, M., Pedretti, E., et al. 2007, Science, 317, 342
Narayan, R., & Nityananda, R. 1986, ARA&A, 24, 127
Pauls, T. A., Young, J. S., Cotton, W. D., & Monnier, J. D. 2005, PASP, 117, 1255
Pearson, T. J., & Readhead, A. C. S. 1984, ARA&A, 22, 97
Pichon, C., & Thiebaut, E. 1998, MNRAS, 301, 419
Renard, S., Malbet, F., Benisty, M., Thiébaut, E., & Berger, J. 2010, A&A, 519, A26
Scott, D. W. 1992, Multivariate density estimation: theory, practice, and visualization (John Wiley & Sons, Inc.)
Strong, D., & Chan, T. 2003, Inverse Problems, 19, S165
Tarantola, A. 2005, Inverse Problem Theory and Methods for Model Parameter Estimation (SIAM)
Thiébaut, E. 2005, in Optics in astrophysics, ed. R. Foy, & F. C. Foy, NATO ASIB Proc., 198, 397
Thiébaut, E. 2008, in SPIE Conf. Ser., 7013
Thiébaut, E., & Giovannelli, J.-F. 2010, IEEE Signal Process. Mag., 27, 97
Titterington, D. M. 1985, A&A, 144, 381
Zhao, M., Gies, D., Monnier, J. D., et al. 2008, ApJ, 684, L95
Zhao, M., Monnier, J. D., Pedretti, E., et al. 2009, ApJ, 701, 209

Appendix A: The regularization expressions

In our simulations, we test 11 different regularization terms commonly used in image reconstruction methods and implemented in MiRA:

Quadratic smoothness: $f_{\mathrm{prior}}(\boldsymbol{x}) = \left\|\boldsymbol{x} - \mathbf{S}\cdot\boldsymbol{x}\right\|^2,$ (A.1) where S is a smoothing operator implemented via finite differences.

Compactness (Le Besnerais et al. 2008): $f_{\mathrm{prior}}(\boldsymbol{x}) = \sum_n w^{\mathrm{prior}}_n\, x_n^2,$ (A.2) which is a separable quadratic regularization. To enforce compactness, the weights $w^{\mathrm{prior}}_n > 0$ have to increase with the distance from the center of the image. Two different cases were studied in the simulations: $w^{\mathrm{prior}}_n = \left\|\boldsymbol{\theta}_n/\Delta\theta\right\|^2$ and $w^{\mathrm{prior}}_n = \left\|\boldsymbol{\theta}_n/\Delta\theta\right\|^3$, where ∥ θ_n/Δθ ∥ is the distance in pixels of the nth pixel from the center of the FOV.

Total variation (Strong & Chan 2003): $f_{\mathrm{prior}}(\boldsymbol{x}) = \sum_{n_1,n_2} \sqrt{\left\|\nabla x_{n_1,n_2}\right\|^2 + \epsilon^2},$ (A.3) where $\left\|\nabla x_{n_1,n_2}\right\|^2 = (x_{n_1+1,n_2} - x_{n_1,n_2})^2 + (x_{n_1,n_2+1} - x_{n_1,n_2})^2$ is the squared magnitude of the spatial gradient in the image, ϵ > 0 is a small number inserted to avoid the discontinuity in zero, and (n₁,n₂) ~ n are the two dimensional indices of the nth pixel. In our reconstructions, ϵ has always been chosen so as to be negligible compared to the gradient of significant structures.

ℓ₂ − ℓ₁ smoothness (Charbonnier et al. 1997): $f_{\mathrm{prior}}(\boldsymbol{x}) = \tau^2 \sum_n \psi\bigl(\left\|(\mathbf{D}\cdot\boldsymbol{x})_n\right\|/\tau\bigr),$ (A.4) where ψ(z) = z − log (1 + z) is an ℓ₂−ℓ₁ norm, τ > 0 is a threshold level, and D is a finite difference operator approximating the qth spatial derivatives of its argument. When ∥ (D·x)_n ∥ is much smaller than the threshold, τ² ψ(∥ (D·x)_n ∥ /τ) ≈ 1/2 ∥ (D·x)_n ∥ ², while, when ∥ (D·x)_n ∥ is much larger than the threshold, τ² ψ(∥ (D·x)_n ∥ /τ) ≈ τ ∥ (D·x)_n ∥ . This regularization attempts to strongly smooth the weak spatial gradients and slightly smooth the strong gradients. In our simulations, we take D to approximate the spatial Laplacian and choose a threshold small enough such that the regularization behaves mostly like a linear smoothness.

ℓ_p-norm: $f_{\mathrm{prior}}(\boldsymbol{x}) = \sum_n \left(x_n^2 + \epsilon^2\right)^{p/2} \approx \sum_n |x_n|^p,$ (A.5) where ϵ > 0 is a small value introduced to avoid the singularity in zero when p ≤ 1. For p > 1, the ℓ_p-norm regularization tends to produce a smooth image as it reduces the variance of the pixels. For p < 1, the ℓ_p-norm regularization tends to promote sparsity in the image. This is interesting mostly for objects consisting of point sources. True sparsity constraints would be obtained for p = 0, although when p < 1 the regularization is no longer convex and the optimization problem becomes extremely difficult to solve as p gets closer to 0. Results in compressed sensing (Candes et al. 2006) have proven that choosing p = 1 yields the most sparse solution, like p = 0, for a large class of problems. However, taking p = 1 yields a non-smooth but convex problem that is much easier to solve than the combinatorial problem resulting from the choice p = 0. With positivity and normalization constraints, the ℓ₁-norm of x is constant. Hence, taking p = 1 is meaningless in our framework and we consider only p = 1.5, p = 2 and p = 3.

Maximum entropy methods (Narayan & Nityananda 1986): $f_{\mathrm{prior}}(\boldsymbol{x}) = - \sum_n h(x_n; \bar{x}_n).$ (A.6) Here the prior is to assume that the image is drawn towards a prior model $\bar{\boldsymbol{x}}$ according to a non-quadratic potential h, called the entropy. We try three entropies: $\begin{aligned} \text{MEM-sqrt:} \quad & h(x;\bar{x}) = \sqrt{x}; \\ \text{MEM-log:} \quad & h(x;\bar{x}) = \log(x); \\ \text{MEM-prior:} \quad & h(x;\bar{x}) = x - \bar{x} - x\,\log\left(x/\bar{x}\right). \end{aligned}$ MEM was first introduced in radioastronomy and is useful for images made of bright point-like sources on a smooth background. In our simulations, we took the prior image $\bar{\boldsymbol{x}}$ to be the isotropic 2D Gaussian that most accurately fits the amplitude visibility data.
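Several of the penalties listed in this appendix can be written in a few lines each. The sketch below is illustrative only and is not the MiRA implementation: the function names, the choice of the FOV center at the geometric center of the array, the zero gradient assumed on the last row and column for TV, and the toy 3×3 images are all assumptions.

```python
import numpy as np

def compactness(x, power=2):
    """Eq. (A.2) with w_n = (distance in pixels from the FOV center)**power."""
    i1, i2 = np.indices(x.shape)
    c1, c2 = (x.shape[0] - 1) / 2.0, (x.shape[1] - 1) / 2.0  # assumed center
    return np.sum(np.hypot(i1 - c1, i2 - c2) ** power * x ** 2)

def total_variation(x, eps=1e-8):
    """Eq. (A.3); forward differences, taken as zero on the last row/column."""
    d1 = np.zeros_like(x)
    d2 = np.zeros_like(x)
    d1[:-1, :] = x[1:, :] - x[:-1, :]
    d2[:, :-1] = x[:, 1:] - x[:, :-1]
    return np.sum(np.sqrt(d1 ** 2 + d2 ** 2 + eps ** 2))

def l2_l1(d, tau):
    """Eq. (A.4) applied to precomputed differences d = D.x."""
    z = np.abs(d) / tau
    return tau ** 2 * np.sum(z - np.log1p(z))

def lp_norm(x, p, eps=1e-12):
    """Eq. (A.5)."""
    return np.sum((x ** 2 + eps ** 2) ** (p / 2.0))

def mem_prior(x, prior):
    """Eq. (A.6) with the MEM-prior entropy; needs strictly positive pixels."""
    return -np.sum(x - prior - x * np.log(x / prior))

# Two normalized, non-negative toy images: a flat one and a point source.
flat = np.full((3, 3), 1.0 / 9.0)
point = np.zeros((3, 3))
point[0, 0] = 1.0

# The l1 norm is the same for both, so p = 1 cannot discriminate between them.
print(lp_norm(flat, 1), lp_norm(point, 1))
# The l2 norm, in contrast, favors the flat image.
print(lp_norm(flat, 2), lp_norm(point, 2))
```

The last two lines illustrate the remark made for the ℓ_p-norm: under positivity and normalization, the ℓ₁-norm equals the total flux and carries no information.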

Appendix B: Renormalization of MSE

Fig. B.1

Attempt to renormalize the MSE. Distribution of the first normalized MSE (left) and of the second normalized MSE (right). The colors and letters represent the two classes of objects: blue/B for the objects with very compact structures, red/C for the others. The total distribution is shown in black/A curve.

We attempted to normalize the MSE with two additional methods:

1.
Since the squared difference between the real image and a smoothed version of the real image is higher for images with sharp or point-like structures, we computed a first normalized MSE as $\mathrm{MSE}_{\mathrm{norm.,1}} = \frac{\sum_n \left(x^{\mathrm{rec}}_n - x^{\mathrm{ref}}_n\right)^2}{\sum_n \left((\mathbf{S}\cdot\boldsymbol{x}^{\mathrm{ref}})_n - x^{\mathrm{ref}}_n\right)^2},$ (B.1) where S is a smoothing operator. The distribution of this normalized MSE is shown in the left panel of Fig. B.1: the distribution is narrower but the two classes remain, despite the normalization.
2.
In the second normalization, the MSE is compared to the norm of the reference image $MS E_{norm ., 2} = \frac{\sum_{n} {(x_{n}^{rec} - {x_{n}^{ref}}^{)}}^{2}}{\sum_{n} {({x_{n}^{ref}}^{)}}^{2}} \cdot$ $\appendix \setcounter{section}{2} \begin{equation} \mathsf{MSE_\mathrm{norm.,2}} = \frac{\sum_n \left(x^\mathsf{rec}_n - x^\mathsf{ref}_n\right)^2}{\sum_n \left(\boldsymbol{x}^\mathsf{ref}_n\right)^2} \cdot \end{equation}$ (B.2)This normalization is more effective than the previous one, as it joins the curves together. However, the distribution is unimodal and does not enable us to distinguish the good and the bad reconstructions. It is thus not a useful normalization.
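The two normalizations of Eqs. (B.1) and (B.2) can be sketched as follows. The smoothing operator S is not specified here, so the 3-point moving average below is purely a stand-in chosen for the example; the function names and the 1D list representation of the images are likewise assumptions.

```python
def smooth(x):
    """Hypothetical smoothing operator S: 3-point moving average
    with replicated edges (an assumption, not the operator of the paper)."""
    n = len(x)
    return [(x[max(i - 1, 0)] + x[i] + x[min(i + 1, n - 1)]) / 3.0
            for i in range(n)]

def squared_distance(a, b):
    """Sum over pixels of the squared difference between two images."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def mse_norm1(rec, ref):
    """Eq. (B.1): error relative to the error of a smoothed reference.
    Undefined (division by zero) if smoothing leaves ref unchanged."""
    return squared_distance(rec, ref) / squared_distance(smooth(ref), ref)

def mse_norm2(rec, ref):
    """Eq. (B.2): error relative to the squared norm of the reference."""
    return squared_distance(rec, ref) / sum(r * r for r in ref)

# Usage: a point-like reference and a blurred reconstruction of it.
ref = [0.0, 0.0, 1.0, 0.0, 0.0]
rec = [0.0, 0.25, 0.5, 0.25, 0.0]
norm1 = mse_norm1(rec, ref)
norm2 = mse_norm2(rec, ref)
```

Both quantities are zero for a perfect reconstruction; they differ only in the denominator used to compensate for the structure of the reference image.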

Appendix C: Scaling the hyperparameter μ

In a Bayesian framework, the prior penalty μ f_prior(x) should depend only on the sought-after brightness distribution I(θ) and on the image parametrization. From these simple principles, we derive in this appendix a method to adapt the value of the hyperparameter μ when image parameters such as the pixel size are modified.

Using the sampled image model in Eq. (4), the normalization of x implies that

$1 = \sum\nolimits_n x_n = \alpha\,\sum\nolimits_n I(\boldsymbol{\theta}_n) \approx \frac{\alpha}{(\Delta\theta)^L}\,\int I(\boldsymbol{\theta}) \, \mathrm{d}^L\boldsymbol{\theta} \, ,$

where we have used the Riemann approximation of integrals, Δθ is the pixel size, and L = 2 is the number of dimensions in θ. Without any loss of generality, we can assume that the sought-after brightness distribution is normalized such that $\int I(\boldsymbol{\theta}) \, \mathrm{d}^L\boldsymbol{\theta} = 1$; hence

$\alpha \approx (\Delta\theta)^L ,$ (C.1)

and Eq. (4) becomes

$x_n \approx (\Delta\theta)^L \, I(\boldsymbol{\theta}_n) .$ (C.2)
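The relation α ≈ (Δθ)^L of Eq. (C.1) can be checked numerically. The sketch below assumes a unit-integral isotropic 2D Gaussian as the brightness distribution and a square grid truncated at six standard deviations; both choices, and the function names, are illustrative assumptions only.

```python
import math

def gaussian(tx, ty, sigma=1.0):
    """Unit-integral isotropic 2D Gaussian, used here as a test I(theta)."""
    return (math.exp(-(tx * tx + ty * ty) / (2.0 * sigma ** 2))
            / (2.0 * math.pi * sigma ** 2))

def alpha_for_pixel_size(dtheta, half_width=6.0):
    """Return alpha such that sum_n alpha * I(theta_n) = 1 on a square grid
    of pixel size dtheta covering [-half_width, +half_width] in each axis."""
    m = int(round(half_width / dtheta))
    total = sum(gaussian(i * dtheta, j * dtheta)
                for i in range(-m, m + 1)
                for j in range(-m, m + 1))
    return 1.0 / total

# Usage: the ratio alpha / dtheta**2 should stay close to 1 (L = 2).
ratios = [alpha_for_pixel_size(d) / d ** 2 for d in (0.5, 0.25)]
```

For a smooth, well-sampled object the ratio is very close to unity; the approximation is expected to degrade when the pixel size becomes comparable to the smallest structures of the object.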

C.1. Separable ℓ_p norm

Substituting Eq. (4) into Eq. (A.5) and then using the Riemann approximation, the prior penalty for a separable ℓ_p norm regularization becomes

$\mu\,f_{\rm prior}(\boldsymbol{x}) = \mu\,\sum_n |x_n|^p \approx \mu\,\alpha^p \, \sum_n |I(\boldsymbol{\theta}_n)|^p \approx \mu\,\frac{\alpha^p}{\Delta\theta^L} \, \int |I(\boldsymbol{\theta})|^p \, \mathrm{d}\boldsymbol{\theta} \, ,$ (C.3)

which shows that for a regularization by a separable ℓ_p norm

$\mu \propto \begin{cases} \Delta\theta^{L} / \alpha^p & \text{in general;} \\ \Delta\theta^{L\,(1 - p)} & \text{with the normalization constraint.} \end{cases}$ (C.4)

Hence, with the normalization constraint, the optimal value of μ should be the same regardless of the pixel size for a regularization given by a separable ℓ₁ norm.
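The scaling of Eq. (C.4) can be illustrated numerically. With the normalization constraint, the penalty Σ|x_n|^p itself varies as Δθ^{L(p−1)}, so halving the pixel size in 2D should divide it by 4 for p = 2 and leave it unchanged for p = 1. The sketch below assumes a Gaussian test object on a truncated square grid; the names are hypothetical.

```python
import math

def normalized_image(dtheta, half_width=6.0, sigma=1.0):
    """Flat list of pixel values sampling a 2D Gaussian, normalized to sum to 1."""
    m = int(round(half_width / dtheta))
    img = [math.exp(-((i * dtheta) ** 2 + (j * dtheta) ** 2) / (2.0 * sigma ** 2))
           for i in range(-m, m + 1)
           for j in range(-m, m + 1)]
    total = sum(img)
    return [v / total for v in img]

def lp_penalty(x, p):
    """Separable l_p penalty: sum_n |x_n|**p."""
    return sum(abs(v) ** p for v in x)

# Usage: compare the penalty at pixel sizes 0.5 and 0.25 (L = 2).
coarse = normalized_image(0.5)
fine = normalized_image(0.25)
ratio_p2 = lp_penalty(coarse, 2) / lp_penalty(fine, 2)  # expected close to 2**2
ratio_p1 = lp_penalty(coarse, 1) / lp_penalty(fine, 1)  # expected close to 1
```

To keep μ f_prior unchanged, μ must compensate for this variation, which gives the exponent L(1 − p) of Eq. (C.4).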

C.2. ℓ_p norm on the gradient

Using 1D notation to simplify the equations, the prior penalty for a regularization by the ℓ_p norm on the gradient is given by

$\mu\,f_{\rm prior}(\boldsymbol{x}) = \mu\,\sum_n |x_{n+1} - x_{n}|^p \approx \mu\,\alpha^p \, \sum_n |I(\theta_{n} + \Delta\theta) - I(\theta_n)|^p \approx \mu\,\alpha^p\,\Delta\theta^p \, \sum_n \left|\partial_{\theta}I(\theta_{n})\right|^p \, ,$

where ∂_θI(θ) denotes the partial derivative of the brightness distribution along the angular direction. In L dimensions and using the Riemann approximation, this gives

$\mu\,f_{\rm prior}(\boldsymbol{x}) \approx \mu\,\alpha^p\,\Delta\theta^{p - L} \, \int \left|\partial_{\boldsymbol{\theta}}I(\boldsymbol{\theta})\right|^p \, \mathrm{d}\boldsymbol{\theta} \, ,$

which shows that

$\mu \propto \begin{cases} \Delta\theta^{L - p} / \alpha^p & \text{in general;} \\ \Delta\theta^{L - p\,(L + 1)} & \text{with the normalization constraint.} \end{cases}$ (C.5)

Applying this result for a regularization by quadratic smoothness in 2D, e.g. Eq. (A.1), we found that, with a normalization constraint, μ ∝ Δθ^-4.
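The exponent −4 for quadratic smoothness in 2D can be checked with a short numerical experiment. The sketch below assumes that the quadratic smoothness penalty is the sum of squared finite differences between neighboring pixels along both axes, and uses a Gaussian test object; these choices and all names are assumptions made for illustration. With a normalized image the penalty itself should vary as Δθ^4, so that μ has to scale as Δθ^-4 to compensate.

```python
import math

def normalized_image_2d(dtheta, half_width=5.0, sigma=1.0):
    """2D list sampling a Gaussian on a square grid, normalized to sum to 1."""
    m = int(round(half_width / dtheta))
    coords = [k * dtheta for k in range(-m, m + 1)]
    img = [[math.exp(-(tx * tx + ty * ty) / (2.0 * sigma ** 2)) for tx in coords]
           for ty in coords]
    total = sum(sum(row) for row in img)
    return [[v / total for v in row] for row in img]

def quadratic_smoothness(x):
    """Sum of squared finite differences along both axes (assumed form)."""
    rows, cols = len(x), len(x[0])
    f = 0.0
    for i in range(rows):
        for j in range(cols):
            if i + 1 < rows:
                f += (x[i + 1][j] - x[i][j]) ** 2
            if j + 1 < cols:
                f += (x[i][j + 1] - x[i][j]) ** 2
    return f

def penalty_exponent(penalty, coarse=0.25):
    """Estimate k in penalty ~ dtheta**k by halving the pixel size once."""
    f_coarse = penalty(normalized_image_2d(coarse))
    f_fine = penalty(normalized_image_2d(coarse / 2.0))
    return math.log(f_coarse / f_fine) / math.log(2.0)

# Usage: the estimate should be close to 4, i.e. mu ~ dtheta**(-4).
k = penalty_exponent(quadratic_smoothness)
```

The estimate is only approximate because the finite difference is a first-order approximation of the derivative; it approaches 4 as the pixel size decreases.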

C.3. Total variation

The preceding result, with p = 1, readily applies to regularization by the total variation, that is

$\mu \propto \begin{cases} \Delta\theta^{L - 1} / \alpha & \text{in general;} \\ \Delta\theta^{-1} & \text{with the normalization constraint.} \end{cases}$ (C.6)

We can also deduce that, if a relaxed version of TV is used, as in Eq. (A.3), the relaxation parameter ϵ must scale as the pixel size Δθ to have the prior penalty approximately insensitive to the pixel size.

We note that, with our particular choice of the threshold τ for the ℓ₂ − ℓ₁ smoothness regularization defined in Eq. (A.4), we expect this regularization to behave mostly like TV.

C.4. Quadratic compactness

The quadratic compactness we used in MiRA is given by Eq. (A.2):

$\mu\,f_{\rm prior}(\boldsymbol{x}) = \mu\,\sum_n \left\|\frac{\boldsymbol{\theta}_n}{\Delta\theta}\right\|^q \, x_n^2 \approx \mu \, \frac{\alpha^2}{\Delta\theta^{q}} \, \sum_n \|\boldsymbol{\theta}_n\|^q \, I(\boldsymbol{\theta}_n)^2 \approx \frac{\mu \, \alpha^2}{\Delta\theta^{q + L}} \, \int \|\boldsymbol{\theta}\|^q \, I(\boldsymbol{\theta})^2 \, \mathrm{d}\boldsymbol{\theta} \, ,$

with q = 2 or 3. From this last approximation, we derive the scaling of μ with the pixel size

$\mu \propto \begin{cases} \Delta\theta^{q + L} / \alpha^2 & \text{in general;} \\ \Delta\theta^{q - L} & \text{with the normalization constraint.} \end{cases}$ (C.7)

Hence, in 2D (L = 2) and for a normalized image, μ does not depend on the pixel size for q = 2, while it scales as Δθ for q = 3.

C.5. Maximum entropy

For maximum entropy methods, we have

$\mu \, \sum_n \sqrt{x_n} \approx \mu \, \frac{\alpha^{1/2}}{\Delta\theta^L} \int I(\boldsymbol{\theta})^{1/2} \, \mathrm{d}\boldsymbol{\theta} \, ,$

and

$\mu \, \sum_n h\!\left(x_n ; \bar{x}_n\right) \approx \mu \, \frac{\alpha}{\Delta\theta^L} \, \int h\!\left( I(\boldsymbol{\theta}); \bar{I}(\boldsymbol{\theta})\right) \, \mathrm{d}\boldsymbol{\theta} \, ,$

with $h(x;\bar{x}) = x - \bar{x} - x\,\log(x/\bar{x})$ as in Eq. (A.9), from which we deduce that

$\mu \propto \begin{cases} \Delta\theta^{L} / \alpha^{1/2} & \text{for MEM-sqrt;} \\ \Delta\theta^{L} / \alpha & \text{for MEM-prior.} \end{cases}$ (C.8)

Hence, for a normalized image, μ does not depend on the pixel size in a MEM-prior regularization.
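The scaling laws of this appendix that are stated for a normalized image, Eqs. (C.4) to (C.8), can be gathered into a small helper that rescales μ when the pixel size changes. This is only a sketch: the function names and the regularization labels are hypothetical, and the relations are proportionalities, so the helper transfers a value of μ found at one pixel size to another and says nothing about the optimal μ itself.

```python
def mu_exponent(regularization, L=2, p=None, q=None):
    """Exponent k such that mu ~ dtheta**k under the normalization constraint.

    Labels are hypothetical; the exponents follow Eqs. (C.4)-(C.8).
    """
    if regularization == "separable_lp":      # Eq. (C.4)
        if p is None:
            raise ValueError("p is required for separable_lp")
        return L * (1 - p)
    if regularization == "gradient_lp":       # Eq. (C.5)
        if p is None:
            raise ValueError("p is required for gradient_lp")
        return L - p * (L + 1)
    if regularization == "tv":                # Eq. (C.6)
        return -1
    if regularization == "compactness":       # Eq. (C.7)
        if q is None:
            raise ValueError("q is required for compactness")
        return q - L
    if regularization == "mem_prior":         # Eq. (C.8)
        return 0
    raise ValueError("no scaling law available for %r" % regularization)

def rescale_mu(mu, old_pixel, new_pixel, regularization, **params):
    """Transfer mu from pixel size old_pixel to pixel size new_pixel."""
    k = mu_exponent(regularization, **params)
    return mu * (new_pixel / old_pixel) ** k

# Usage: halving the pixel size for 2D quadratic smoothness (p = 2).
mu_fine = rescale_mu(1.0, 1.0, 0.5, "gradient_lp", p=2)
```

In this sketch, regularizations for which no scaling under the normalization constraint is stated above (such as MEM-log) raise an error rather than return a guessed exponent.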

All Tables

Table 1

Table of the mean values of μ for each regularization.


All Figures

Fig. 1

Astrophysical objects used in the simulations.

Fig. 2

(u,v) coverage. From left to right: rich (245 sampled frequencies), medium (88 sampled frequencies), and poor (31 sampled frequencies).

Fig. 3

Left panel shows a plot of the MSE as a function of the hyperparameter μ. The different colors correspond to different levels of SNR (red high, blue intermediate, green poor). For each curve, the optimal value μ⁺ is labeled by a number (1, 2, and 3). The corresponding images are shown in the bottom row of the right panel. The top row of the right panel shows three reconstructed images with different values of μ, labeled by a letter on the red curve of the left part (A an under-regularized image, B the best image, and C an over-regularized image). This example is made for the galaxy object and the medium (u,v) coverage. The regularization is the MEM-prior one.

Fig. 4

Top row: scatter plots representing the optimal value of the hyperparameter μ⁺ as a function of the MSE of the images. Bottom row: the corresponding histograms of MSE and μ⁺. Left column: the colors and symbols indicate the different classes of objects. Right column: the colors and symbols indicate the different regularization classes.

Fig. 5

Upper row: distribution of the MSE (left) and the MSE⁺ (right). The colors and letters represent the two classes of objects: blue/B for the objects with very compact structures, red/C for the others. The total distribution is shown by the black/A curve. Bottom row: example of reconstructed images for the good (left) and bad (right) MSE⁺ peak.

Fig. 6

Distribution of MSE⁺. Left: histograms of MSE⁺ for different objects in different colors; the gray zone corresponds to the total distribution, all objects combined. Right: solid line, all the configurations and regularizations are kept; dashed line, with the sparsest (u,v) coverage removed; dot-dashed line, with the bad regularizations removed; gray zone, with the sparsest (u,v) coverage and bad regularizations removed.

Fig. 7

Left: cumulative distributions of the ranks reached by the different configurations of (u,v) coverage and SNR. Right: the histograms of the MSE⁺ for different configurations of (u,v) coverage and SNR represented by different colors.

Fig. 8

Cumulative distributions of the ranks reached by the regularizations. Left: all objects. Middle: objects with very small structures. Right: other objects.

Fig. 9

Example of reconstructed images of the galaxy image with the medium (u,v) coverage and the intermediate SNR for different regularizations. From left to right: TV, compactness θ², ℓ_p norm with p = 2, MEM-log.

Fig. 10

Histogram of μ⁺ for the TV regularization. Left: the colors correspond to different objects. Right: the colors correspond to different [(u,v) coverage, SNR] pairs.

Fig. 11

Variation in MSE and μ with different noise realizations. Blue: the quartile curves of the realizations (25%, dashed line; 50%, solid line; 75%, dotted line). Red: the mean optimal μ (cross) and its variance.

Fig. 12

Reconstructed images for the best (left) and the worst (right) noise realization in the TV case.

Fig. 13

Left: FWHM of the Gaussian computed for the effective resolution in units of the interferometric resolution of the data. Top right: comparison between the FWHM with (green) and without (magenta) the constraint of positivity. Bottom right: variation in the effective resolution as a function of the hyperparameter μ for three different SNR (red high, blue intermediate, green poor). The regularization used is the TV one.

Fig. 14

Histograms of f/N_data and f_data/N_data. Solid line: all configurations and regularizations are kept. Dashed line: with the sparsest (u,v) coverage removed. Dot-dashed line: with the bad regularizations removed. Gray zone: with the sparsest (u,v) coverage and bad regularizations removed.

Fig. 15

An example of the L-curve in the simulations for the TV regularization with a zoom on the right part.



[1] Biraud, Y. 1969, A&A, 1, 124

[2] Candes, E. J., Romberg, J., & Tao, T. 2006, IEEE Trans. Info. Theory, 52, 489

[3] Charbonnier, P., Blanc-Féraud, L., Aubert, G., & Barlaud, M. 1997, IEEE Trans. Image Process., 6, 298

[4] Delplancke, F., Derie, F., Paresce, F., et al. 2003, Ap&SS, 286, 99

[5] Domiciano de Souza, A., Vakili, F., Jankov, S., Janot-Pacheco, E., & Abe, L. 2002, A&A, 393, 345

[6] Filho, M. E., Renard, S., Garcia, P., et al. 2008, in SPIE Conf. Ser., 7013

[7] Goodman, J. W. 1985, Statistical Optics (John Wiley & Sons)

[8] Gull, S. F. 1988, in Maximum Entropy and Bayesian Methods in science and engineering, ed. G. J. Erickson, & C. R. Smith

[9] Gull, S. F., & Skilling, J. 1984, in Indirect Imaging. Measurement and Processing for Indirect Imaging, ed. J. A. Roberts, 267

[10] Hansen, P. C. 2000, in Computational Inverse Problems in Electrocardiology, ed. P. Johnston, Advances in Computational Bioengineering (WIT Press), 119

[11] Haubois, X., Perrin, G., Lacour, S., et al. 2009, A&A, 508, 923

[12] Hestroffer, D. 1997, A&A, 327, 199

[13] Kraus, S., Weigelt, G., Balega, Y. Y., et al. 2009, A&A, 497, 195

[14] Kraus, S., Hofmann, K., Menten, K. M., et al. 2010, Nature, 466, 339

[15] Lawson, P. R., ed. 2000, Principles of Long Baseline Stellar Interferometry

[16] Lawson, P. R., Cotton, W. D., Hummel, C. A., et al. 2004, BAAS, 36, 1605

[17] Le Besnerais, G., Lacour, S., Mugnier, L. M., et al. 2008, IEEE J. Select. Topics Signal Process., 2, 767

[18] Le Bouquin, J., Lacour, S., Renard, S., et al. 2009, A&A, 496, L1

[19] Malbet, F., & Perrin, G. 2007, New Astron. Rev., 51, 563

[20] Meimon, S., Mugnier, L. M., & Le Besnerais, G. 2005, J. Opt. Soc. Am. A, 22, 2348

[21] Monnier, J. D., Zhao, M., Pedretti, E., et al. 2007, Science, 317, 342

[22] Narayan, R., & Nityananda, R. 1986, ARA&A, 24, 127

[23] Pauls, T. A., Young, J. S., Cotton, W. D., & Monnier, J. D. 2005, PASP, 117, 1255

[24] Pearson, T. J., & Readhead, A. C. S. 1984, ARA&A, 22, 97

[25] Pichon, C., & Thiebaut, E. 1998, MNRAS, 301, 419

[26] Renard, S., Malbet, F., Benisty, M., Thiébaut, E., & Berger, J. 2010, A&A, 519, A26

[27] Scott, D. W. 1992, Multivariate density estimation: theory, practice, and visualization (John Wiley & Sons, Inc.)

[28] Strong, D., & Chan, T. 2003, Inverse Problems, 19, S165

[29] Tarantola, A. 2005, Inverse Problem Theory and Methods for Model Parameter Estimation (SIAM)

[30] Thiébaut, E. 2005, in Optics in astrophysics, ed. R. Foy, & F. C. Foy, NATO ASIB Proc., 198, 397

[31] Thiébaut, E. 2008, in SPIE Conf. Ser., 7013

[32] Thiébaut, E., & Giovannelli, J.-F. 2010, IEEE Signal Process. Mag., 27, 97

[33] Titterington, D. M. 1985, A&A, 144, 381

[34] Zhao, M., Gies, D., Monnier, J. D., et al. 2008, ApJ, 684, L95

[35] Zhao, M., Monnier, J. D., Pedretti, E., et al. 2009, ApJ, 701, 209
