Open Access
Issue
A&A
Volume 651, July 2021
Article Number A62
Number of page(s) 24
Section Numerical methods and codes
DOI https://doi.org/10.1051/0004-6361/202038957
Published online 21 July 2021

© O. Flasseur et al. 2021

Licence Creative CommonsOpen Access article, published by EDP Sciences, under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

1. Introduction

Circumstellar disks are at the heart of the planet formation processes. Direct imaging in the near infrared is a unique method for addressing this question because planets can be detected together with the disk environment. Despite the very promising results from direct imaging, a small fraction of the expected circumstellar disks have been unveiled and characterized using high contrast direct imaging in the near infrared (Esposito et al. 2020; Garufi et al. 2020; Langlois et al. 2020). Recent protoplanetary and debris disk studies performed in intensity or in polarimetry have focused on morphological disk characteristics such as the presence of spirals (Benisty et al. 2015; Ren et al. 2018a; Muro-Arena et al. 2020), asymmetries, warps (Dawson et al. 2011; Kluska et al. 2020) and gaps (van Boekel et al. 2017) that could be signposts for the presence of exoplanets. In this context, protoplanetary disks allow unique studies of the exoplanet-disk interactions (Keppler et al. 2018; Haffert et al. 2019; Mesa et al. 2019). In order to understand the physical processes governing these disks, it is essential to reconstruct their surface-brightness or scattering phase function but also to disentangle, in a post-processing step, the disks and the exoplanets.

From an observational point of view, the direct detection of circumstellar disks requires: (i) high angular resolution to resolve the close environment of stars (the typical size of disks is smaller than 1 arcsec) and (ii) high contrast to detect the faint light scattered by disks (in the infrared, an effective contrast better than 105 relative to the host star is typically required). Among the available ground-based observing facilities, the Spectro-Polarimetry High-contrast Exoplanet Research instrument (SPHERE; Beuzit et al. 2019) operating at the Very Large Telescope (VLT) offers such capabilities. It produces angular differential imaging (ADI; Marois et al. 2006) sequences using the pupil-tracking mode of the telescope as a means to enhance the achievable contrast. The presence of strong residual stellar leakages from the coronagraph, in the form of so-called speckles, is the current main limitation of the reachable contrast. A post-processing step is required to combine the images of the ADI sequence and cancel out most of the nuisance component1 while preserving the signal from the off-axis sources2.

There is a large variety of methods dedicated to the post-processing of ADI sequences (see for example Pueyo 2018 for a review). Even though these methods are primarily designed for the detection of unresolved point-like sources, they are also widely used to process observations of disks (Milli et al. 2012) and share a common framework with most of the existing post-processing methods designed for disks. The common principle is to estimate a reference image3 of the nuisance component (see Fig. 1a for a schematic illustration). This reference image is subtracted to each image of the sequence. The resulting residual images are aligned along a common direction to spatially superimpose the signals from the off-axis sources, and are added to each others. More sophisticated methods follow an inverse problems framework. In particular, the recent PACO algorithm (Flasseur et al. 2018a,b,c) jointly learns the average nuisance component and its fluctuations by estimating spatial covariances. Since the nuisance component is highly nonstationary, the underlying statistical model is estimated at a local scale of small patches.

thumbnail Fig. 1.

Schematic representation of the main steps followed by the state-of-the-art post-processing methods for ADI sequences with circumstellar disks. (a) Principle of the classical approach based on the estimation and subtraction of an on-axis PSF. (b) Four main categories of algorithms designed to reduce reconstruction artifacts seen in (a): using a physics-based model of the disk, mitigating artifacts directly from the ADI sequence of interest or using an additional sequence of reference, or formalizing the reconstruction task as an inverse problem. REXPACO falls into the last category: methods based on an inverse problems approach.

Most of these standard ADI post-processing algorithms are not suited to detect and reconstruct the light distribution of circumstellar disks. Their major drawback is the presence, in the output images, of strong artifacts that take the form of partial replicas, the removal of extended smooth components, smearing, and nonuniform attenuations due to the so-called self-subtraction phenomenon. As a direct consequence, both the morphology and the photometry of the disks are strongly corrupted. Very few specific solutions have been developed to tackle these limitations in the ADI processing of circumstellar disks. The existing approaches can be classified into four categories (see Fig. 1b for a schematic diagram of their principles).

A first family of methods introduce a forward-backward loop embedding a physics-based model of the disk (Milli et al. 2012; Esposito et al. 2013; Mazoyer et al. 2020). In a nutshell, the parameters constraining the disk morphology and flux are optimized by minimizing the residuals produced by the selected post-processing algorithm from the ADI sequence free from the current estimation of the disk contribution. This method is however limited as it requires a physics-based model of the disk that is sufficiently simple to be numerically optimized and yet sufficiently flexible to accurately model the actual disk.

A second category of methods attempt to avoid self-subtraction by reducing the disk contribution in the estimation of the reference on-axis PSF. For instance, Pairet et al. (2018) have proposed a modified version of the PCA-based post-processing algorithm that iteratively removes the estimated disk component to improve the estimation of the on-axis PSF. To limit self-subtraction, Ren et al. (2020) have proposed a data-imputation strategy: The areas of the field of view impacted by the disk are considered as missing-data based on prior knowledge of the disk location and replaced by regions free from any disk contribution.

So-called reference differential imaging (RDI; Gerard & Marois 2016; Wahhaj et al. 2020) is a third class of approaches that require an additional ADI sequence of a reference star that hosts no known off-axis sources to estimate the stellar contamination. The quality of the post-processing highly depends both on the selection of the reference star, and on the similarity of the observing conditions (Ruane et al. 2019). Ren et al. (2018b) also use a reference ADI sequence to build a nonnegative decomposition (via nonnegative matrix factorization) of the nuisance component.

Finally, a fourth class of methods is emerging and states the reconstruction of the object of interest as an inverse problem. It consists in jointly estimating the nuisance term, the disk and, possibly, point-like sources given the data and low-complexity priors adapted to each kind of contribution (Pairet et al. 2019, 2021). The method we propose in this paper belongs to this family.

Based on the previous analysis of the astrophysical needs and of the state-of-the-art post-processing methods for circumstellar disk detection and reconstruction, the desirable properties of post-processing algorithms may be listed as follows: (i) detection of disks at high contrasts, (ii) reconstruction of a physically plausible image of the flux distribution with limited processing artifacts, and (iii) ability to disentangle the spatially extended contribution of disks from the spatially localized contribution of point-like sources.

In this paper, we attempt to address these three points by proposing a new post-processing method to extract the light distribution of extended features like circumstellar disks from ADI sequences. To get rid of the nuisance term, we formalize the reconstruction as an inverse problem where all the contributions appear explicitly and where the measured instrumental PSF is taken into account. As a result, the estimation of the object of interest is nearly free from contamination by the stellar leakages and from blurring by the off-axis PSF. The method proposed in this paper, named REXPACO (for Reconstruction of EXtended features by PAtch COvariances), introduces dedicated constraints for each component to help disentangle the object of interest from the nuisance term. For the object of interest, these constraints take the form of specific regularizations favoring the smoothness of extended structures (e.g., disk-like component) or the sparsity of point-like sources. For the nuisance term, we consider the same statistical model as with PACO’s algorithm (Flasseur et al. 2018a,b,c), initially designed for exoplanet detection, but which has also proven very effective for the recovery of extended patterns in holographic microscopy (Flasseur et al. 2019). The power of PACO statistical model resides in its ability to capture the local behavior of the stellar leakages and the noise through the spatial covariances. Another advantage is that this model is directly learned from the ADI sequence and requires no other specific calibration data. Similarly, REXPACO is an unsupervised algorithm: All tuning parameters are automatically estimated from the data. The framework implemented by REXPACO is rather flexible. As an example, REXPACO can be coupled with PACO to better estimate off-axis point-like sources.

The paper is organized as follows. In Sect. 2, we formalize REXPACO algorithm. In Sect. 3, we illustrate the results obtained by REXPACO, both with semisynthetic data and with real observations of known circumstellar disks from the InfraRed Dual Imaging Spectrograph (IRDIS; Dohlen et al. 2008a,b) of the VLT/SPHERE instrument. Section 4 presents results from the combination of REXPACO with the PACO algorithm to unmix the spatially extended contribution of disk from the point-like contribution of exoplanets. Section 5 summarizes our contributions.

2. REXPACO: Reconstruction of EXtended features by PAtch COvariances modeling

This section describes REXPACO, the method we propose to recover the spatial distribution x of the light flux of off-axis sources from an ADI sequence r ∈ ℝNT formed by T images of N pixels each. Table 1 summarizes the main notations used throughout the paper.

Table 1.

Summary of the main notations.

2.1. Inverse problems formulation

REXPACO follows an inverse problems formulation: It computes an estimate x ^ $ {\boldsymbol{\widehat{x}}} $ from the sequence of observations r by fitting a model of the data under given constraints. Since all observations r are collected during a single night, the distribution of interest x can be considered time-invariant and described by a vector x + M $ \boldsymbol{x}\in \mathbb{R}_{+}^M $, where M is the number of pixels of the reconstructed field of view and where ℝ+ is the set of nonnegative real numbers to account that the intensity is necessarily nonnegative everywhere. The ADI data sequence r results from the contribution of two components as expressed by:

r = A x + f , $$ \begin{aligned} {\boldsymbol{r}} = \mathbf{A } \, {\boldsymbol{x}} + {\boldsymbol{f}}, \end{aligned} $$(1)

where A is the linear operator (ℝM → ℝNT) modeling the effects of the off-axis instrumental point spread function (off-axis PSF), that is to say the non-coronagraphic image of the star, and f ∈ ℝNT is a nuisance component accounting for the residual stellar light in the presence of aberrations (that is, the speckles) and for the contribution of noise. Due to the temporal variation of aberrations and to the noise, the nuisance component f is the result of some random process. The quality of the statistical modeling of this process is crucial for the inversion of Eq. (1), and it represents a key element of REXPACO. Figure 2a displays an example of an ADI sequence from the star HR 4796A observed by VLT/SPHERE-IRDIS. HR 4796A is surrounded by a bright inclined debris disk (see Sect. 3.3.1 for a more complete description of its astrophysical properties). We selected this data set to illustrate the rationale of the different ingredients of REXPACO. Part a of Fig. 2 shows that the bright disk of HR 4796A can only be marginally detected by a close visual inspection of a given frame of the sequence. This is due to the very large contrast between the star and its environment: The contribution of the component of interest Ax is typically fainter than the nuisance component f by one to four orders of magnitude. This high contrast sets the gain that must be achieved by the post-processing method in order to recover the component of interest x from the data. Part b of Fig. 2 shows two spatio-temporal slices extracted at two different locations (along the solid line: far from the host star, along the dashed line: closer to the host star) and illustrates the nonstationarity of the nuisance component. Near the star (i.e., at angular separations smaller than about 1.5 arcsec), the nuisance component is dominated by speckles. At larger angular separations, the nuisance component mainly results from a combination of photon noise, thermal background flux and detector readout. Part c of Fig. 2 shows the empirical spatial covariances of the data computed by temporal averaging through the ADI sequence in different regions of interest (ROIs). This figure not only reveals that the covariances of the fluctuations are spatially nonstationary but also that the covariances are non-negligible in a small neighborhood. In a nutshell, the nuisance component f has strong spatial correlations and its distribution is spatially nonstationary and fluctuates over time, making the extraction of the object x all the more difficult. We introduce in Sect. 2.3 a model of the probability density function p(f) of the nuisance component f that is suitable to describe the statistical behavior of these fluctuations.

thumbnail Fig. 2.

Typical ADI sequence from the VLT/SPHERE-IRDIS instrument: (a) Examples of temporal frames; (b) two spatio-temporal slices cut along the solid and dashed lines of (a) emphasizing the spatial variations of the structure of the signal; (c) estimated covariance matrices for four regions of interest (ROIs) at different angular separations s with the host star. Data set: HR 4796A, see Sect. 3.3.1 for observing conditions.

We formalize the reconstruction of the object x from the data r as an inverse problem: We seek the light distribution x ^ $ {\boldsymbol{\widehat{x}}} $ that would produce observations, given the instrumental model A, statistically close to the actual measurements. Image reconstruction methods based on the resolution of inverse problems are generally expressed in the form of an optimization problem. In a Bayesian framework, the solution of this optimization problem is interpreted as the maximum a posteriori estimator (see for example Thiébaut 2006). Introducing an objective function 𝒞 to account for the instrumental effects, the statistical model of the measurements and regularization terms that promote more satisfying solutions (e.g., sparser and smoother solutions) and better reject noise, the problem writes:

x ^ = arg min x 0 C ( r , x , A , Ω , μ ) , $$ \begin{aligned} {\boldsymbol{\widehat{x}}} = \underset{\boldsymbol{x} \ge \mathbf{0}}{\mathrm{arg\,min}} \; \fancyscript {C}(\boldsymbol{r}, {\boldsymbol{x}}, \mathbf{A }, \mathbf{\Omega }, {\boldsymbol{\upmu }}), \end{aligned} $$(2)

where:

C ( r , x , A , Ω , μ ) = D ( r , A x , Ω ) + R ( x , μ ) , $$ \begin{aligned} \fancyscript {C}({\boldsymbol{r}}, {\boldsymbol{x}}, \mathbf{A }, \mathbf{\Omega }, {\boldsymbol{\upmu }}) = \fancyscript {D}({\boldsymbol{r}}, \mathbf{A }\,{\boldsymbol{x}}, \mathbf{\Omega }) + \fancyscript {R}({\boldsymbol{x}}, {\boldsymbol{\upmu }}), \end{aligned} $$(3)

and arg minx ≥ 0 𝒞(r, x, A, Ω, μ) denotes the set of nonnegative values of x minimizing 𝒞(r, x, A, Ω, μ). The first term, 𝒟(r, Ax, Ω), is the data-fidelity term that penalizes the discrepancy between the recorded data r and the modeled contribution Ax of the object of interest x in the data. The data fidelity term also depends on parameters Ω that define the statistical distribution of the nuisance component f (e.g., mean, variances or covariances, see Sect. 2.3). The second term, ℛ(x, μ), is a regularization term enforcing some prior knowledge about the object properties to favor physically more plausible reconstructions and prevent noise amplification. The hyper-parameters μ tune the behavior of the regularization, in particular, its relative importance compared to the data fidelity term. In the following, we detail the ingredients of this general framework. The forward model A of the off-axis PSF is developed in Sect. 2.2. The statistical model of the nuisance component is stated in Sect. 2.3 while the estimation of the parameters Ω of this model is described in Sect. 2.4. Different possible regularizations are considered in Sect. 2.5. Finally, a strategy for the automatic tuning of the regularization hyper-parameters is presented in 3.2.

2.2. Forward model: off-axis point spread function

The parameters x + M $ {\boldsymbol{x}} \in \mathbb{R}_+^M $ represent the distribution of light due to the off-axis sources (e.g., circumstellar disks, exoplanets, brown dwarfs, background stars). During the acquisition of an ADI sequence, the field of view as seen by the detector rotates around the optical axis. After suitable instrument calibration, this apparent rotation is completely defined by the center of rotation and the parallactic angles recorded with the data4. The contribution of x in the t-th data frame rt is given by the linear model Atx where A t : R M R N $ {\mathbf{A}}_{t}{\,{:}~}{\mathbb{R}}^{M}\to{\mathbb{R}}^{N} $ implements the off-axis point spread function for the considered frame: it models the apparent rotation of the field of view, the nonuniform attenuation by the coronagraph and the instrumental blurring. In other words, [Ax]t ≡ Atx for all frames t ∈ ⟦1, T⟧ where A  :  ℝM → ℝN × T is the linear operator introduced in Sect. 2.1 and such that Ax models the contribution of x to all data frames.

Considering the size of the image reconstruction problem, it is essential that our model A of the off-axis PSFs not only be an accurate approximation of the actual instrumental effects but also that A and its adjoint be fast to apply in an iterative reconstruction algorithm (see Appendix A). To obtain a numerically efficient model, we decompose each operator At into different factors: Qt to model the temporally and spatially variant effects due to the rotation of the field of view, Γ to account for the attenuation close to the optical axis due to the coronagraph, and H to implement the instrumental blurring (considered in first approximation as spatially and temporally invariant). Operators Qt can be implemented by interpolation operations while operator H performs bi-dimensional convolutions that can be computed efficiently using fast Fourier transforms (FFTs). Modeling the instrumental blurring as being shift-invariant (a convolution) is accurate in most of the field of view except close to the optical axis. The coronagraph induces a reduction of the light transmission at short angular separations (i.e., under and near the coronagraphic mask) and a deformation of the PSF. We model the variable light transmission by a diagonal operator Γ = diag(γ) with γ a 2D transmission mask. The mask γ has a smooth profile with all entries in the range [0, 1]; from 0 at the center of the coronagraph (i.e., no flux from off-axis sources transmitted) to 1 farther away, and being equal to 0.5 at the inner working angle (IWA) of the coronagraph.

We neglect the deformation of the PSF in our model. This approximation has a limited effect since the area affected by the PSF deformations (close to or under the coronagraphic mask) is dominated by the nuisance component. An alternate way to handle the measurements in the area of the coronagraph would consist of assuming that they suffer an infinite variance (so that they be completely discarded in the subsequent estimations). Both approaches lead to comparable reconstructions (i.e., differences in the reconstructed flux are several orders of magnitude lower than the disk magnitude) in all experimental cases we have considered. This model of the effect of the coronagraph on off-axis point sources as a simple attenuation varying with the angular position is consistent (for angular separations larger than 0.05 arcsec) with the calibrations of the coronagraphs considered in Beuzit et al. (2019) for the SPHERE instrument. For the results presented in Sect. 3, the experimental calibration of this transmission has been performed in the H-Johnson’s spectral band (i.e., at ≃1.6 μm) and extrapolated to the other wavelengths by using the experimental measurement and dedicated simulations, as described in Beuzit et al. (2019).

To summarize, our forward model of the contribution of the object of interest x in the tth frame writes5:

A t x = V H Γ Q t x , $$ \begin{aligned} \mathbf{A }_{t} \, {\boldsymbol{x}} = \mathbf{V } \, \mathbf{H } \, \mathbf{\Gamma } \, \mathbf{Q }_{t} \, {\boldsymbol{x}}, \end{aligned} $$(4)

where V is a simple truncation operator to extract the N pixels corresponding to the actual data field of view from the (larger) M-pixels output of blurring operator H. Due to the apparent rotation of the field of view during the ADI sequence, previously unseen areas fall within the corners of the sensor and, by combining all measurements, a region larger than N pixels can be reconstructed (up to 57% more pixels).

The factorization of the forward operator At can straightforwardly be generalized to the case of a time-varying blur Ht or transmission Γt to account for an evolution of the observing conditions during the ADI sequence or a possible partial decentering of the coronagraph. Figure 3 illustrates the different steps to convert the object of interest x into the model of its contribution into the ADI sequence recorded by the high contrast imaging instrument. In this figure, we adopt the same convention as for A that, when the frame index t of an operator is omitted, the result of applying the operator for all frames is considered.

thumbnail Fig. 3.

Schematic illustration of the forward model Ax = VHΓQx of REXPACO. First, the object x is rotated by the parallactic angles by operator Q. Then, each resulting temporal frame is attenuated by the (time independent) transmission of the coronagraph with Γ, and convolved by the off-axis PSF by application of H. Finally, the field of view seen by the sensor is extracted by V to obtain the object contribution Ax. The recorded ADI sequence is modeled by the sum of Ax and of the nuisance component f, as described by Eq. (1).

2.3. Statistical model of the nuisance component

The nuisance component f in ADI sequences generally presents strong and nonstationary spatial correlations (speckles) that fluctuate over time, especially near the host star where the stellar leakages dominate, see Fig. 2a. Also, the presence of several bad pixels (hot, dead or randomly fluctuating) is a common issue in high contrast imaging where the images are recorded by infrared detectors. Despite a prereduction stage identifying and correcting for defective pixels, some of them displaying large fluctuations only on a few frames remain after this processing (see for example Pavlov et al. 2008; Delorme et al. 2017 for the SPHERE instrument). We thus aim to account for the different causes of fluctuations of the nuisance component through a statistical model.

We recover an image of the objects of interest x based on a co-log-likelihood term 𝒟(r, Ax, Ω) that captures the statistical distribution of the residuals between r and the model Ax. Like in our previous works on the PACO algorithm (Flasseur et al. 2018a,b,c; Flasseur et al. 2020a,b), we model the covariances locally at the scale of small patches of size K (a few tens of pixels in practice). This approximation is justified by the nonstationarity of the spatial correlations and can be seen as a trade-off between considering full-size spatial covariance matrices (of size N × N, and hence too large to be estimated and stored in practice) and completely neglecting the spatial correlations. Our choice amounts to model the nuisance component ft of the tth frame as the realization of independent Gaussian processes for each K-pixel patch, without overlap. Given that model, the probability density function of the nuisance component is given by:

t , p ( f t ) n P det 1 2 ( C n ) × exp [ 1 2 ( P n ( f t m ) ) C n 1 ( P n ( f t m ) ) ] , $$ \begin{aligned}&\forall t,\quad \mathrm{p} ({\boldsymbol{f}}_t)\propto \prod _{n\in \mathbb{P} } \mathrm{det}^{-\frac{1}{2}}(\mathbf{C }_n) \, \nonumber \\&\quad \times \exp \!\left[-{\textstyle \frac{1}{2}} \left(\mathbf{P }_{n}(f_t-{\boldsymbol{m}})\right)^{\top }\mathbf{C }_n^{-1} \left(\mathbf{P }_{n}({\boldsymbol{f}}_t-{\boldsymbol{m}})\right)\right]\!, \end{aligned} $$(5)

with P n : R N R K $ {\mathbf{P}}_{n}{\,{:}~}\mathbb{R}^{N} \to \mathbb{R}^{K} $ the operator extracting patch n from a given temporal frame for any index n in ℙ defining the partition of the data frame into nonoverlapping K-pixel patches, and ℙ is the list of the indices of the pixels at the center of the nonoverlapping patches. The vector m ∈ ℝN denotes the temporal mean 𝔼t(ft) while Cn ∈ ℝK × K denotes the temporal covariance matrix within the patch n.

For m and Cn, we use the same estimators that proved successful with PACO for exoplanet detection: the temporal mean m is estimated by the sample mean m ^ $ {\boldsymbol{\widehat{m}}} $ defined by:

m ^ = 1 T t = 1 T ( r t A t x ) , $$ \begin{aligned} {\boldsymbol{\widehat{m}}} = \frac{1}{T}\sum \limits _{t=1}^{T} \left({{\boldsymbol{r}}_t - \mathbf{A }_t\,{\boldsymbol{x}}}\right)\!, \end{aligned} $$(6)

and the covariance matrices Cn are estimated using a shrinkage estimator (Ledoit & Wolf 2004; Chen et al. 2010):

C ^ n = ( 1 ρ n ) C n + ρ n F n , $$ \begin{aligned} {\widehat{\mathbf{C }}}_n = \left({1 - \widetilde{\rho }_n}\right)\, {\widetilde{\mathbf{C }}}_n + \widetilde{\rho }_n \, \widetilde{\mathbf{F }}_n, \end{aligned} $$(7)

that combines the high-variance and low-bias sample covariance estimator C n $ {{\widetilde{{\textbf{C}}}}}_n $ with the low-variance and high-bias estimator F n $ \widetilde{{\mathbf{F}}}_n $ that neglects all covariances. In other words, F n $ \widetilde{{\mathbf{F}}}_n $ is a diagonal matrix containing the pixel sample variances that are the diagonal entries of the sample covariance of the considered patch:

C n = 1 T t = 1 T u ^ n , t u ^ n , t , $$ \begin{aligned} {\widetilde{\mathbf{C }}}_n = \frac{1}{T} \sum _{t=1}^{T} {\boldsymbol{\widehat{u}}}_{n,t}\,{\boldsymbol{\widehat{u}}}_{n,t}^{\top }, \end{aligned} $$(8)

with:

u ^ n , t = P n ( r t m ^ A t x ) , $$ \begin{aligned} {\boldsymbol{\widehat{u}}}_{n,t} = \mathbf{P }_n\, \left({{\boldsymbol{r}}_t - {\boldsymbol{\widehat{m}}} - \mathbf{A }_t\,{\boldsymbol{x}}}\right), \end{aligned} $$(9)

the residuals in patch n of frame t. With this specific choice for F n $ \widetilde{{\mathbf{F}}}_n $, Eq. (7) rewrites:

C ^ n = W n C n , $$ \begin{aligned} {\widehat{\mathbf{C }}}_n = \widetilde{\mathbf{W }}_n \odot {\widetilde{\mathbf{C }}}_n, \end{aligned} $$(10)

where ⊙ stands for the Hadamard product (i.e., entrywise multiplication), and W n R K × K $ \widetilde{{\mathbf{W}}}_n \in \mathbb{R}^{K \times K} $ is defined for {k, k′} ∈ ⟦1, K2 by:

[ W n ] k , k = { 1 if k = k , 1 ρ n if k k . $$ \begin{aligned} \left[ \widetilde{\mathbf{W }}_n \right]_{k,\,k^\prime } = {\left\{ \begin{array}{ll} 1 &\mathrm{if} \quad k = k^\prime , \\ 1-\widetilde{\rho }_n &\mathrm{if}\quad k \ne k^\prime . \end{array}\right.} \end{aligned} $$(11)

The shrinkage parameter ρ n $ \widetilde{\rho}_n $ balances each estimator to reach a bias-variance trade-off. The extension of the results of Chen et al. (2010) to the diagonal covariance matrix F n $ \widetilde{{\mathbf{F}}}_n $ leads to the following data-driven expression for the shrinkage parameter ρ n $ \widetilde{\rho}_n $ (see Eq. (12) of Flasseur et al. 2018a):

ρ n = tr ( C n 2 ) + tr 2 ( C n ) 2 k = 1 K [ C n ] k , k 2 ( T + 1 ) ( tr ( C n 2 ) k = 1 K [ C n ] k , k 2 ) . $$ \begin{aligned} \widetilde{\rho }_n = \frac{\mathrm{tr}\left({{\widetilde{\mathbf{C }}}_{n}^2}\right) + \mathrm{tr}^2\left({{\widetilde{\mathbf{C }}}_{n}}\right) - 2\sum _{k=1}^K\left[{{\widetilde{\mathbf{C }}}{_{n}}}\right]_{k,k}^2 }{ (T+1)\,\left({ \mathrm{tr}\left({{\widetilde{\mathbf{C }}}_{n}^2}\right) - \sum _{k=1}^K\left[{{\widetilde{\mathbf{C }}}{_{n}}}\right]_{k,k}^2 }\right) } \,. \end{aligned} $$(12)

It can be noted that most above defined quantities – m ^ $ {\boldsymbol{\widehat{m}}} $ in Eq. (6), C ^ n $ {{\widehat{\textbf{C}}}}_n $ in Eq. (7), C n $ {{\widetilde{{\textbf{C}}}}}_n $ in Eq. (8), u ^ n , t $ {\boldsymbol{\widehat{u}}}_{n,t} $ in Eq. (9), W n $ \widetilde{{\mathbf{W}}}_n $ in Eq. (11) and ρ n $ \widetilde{\rho}_n $ in Eq. (12) – depend implicitly on the sought parameters x although, and for the sake of simplicity, this is not always explicitly indicated. In Algorithms 1 and 2 presented in Sect. 2.4, these quantities are labeled with the iteration number as the parameters x change at each iteration. Figure 4 illustrates the extraction of a collection of patches from an ADI sequence for locally learning the statistics of the nuisance component, in particular for the estimation of C n $ {{\widetilde{{\textbf{C}}}}}_n $. The number K of pixels in each patch, and hence the dimension of the local covariance matrices should be large enough to encompass the core of the off-axis PSF and yet not too large compared to the number T of samples available for the estimation. The optimal patch size is estimated using the same criterion (validated empirically) as in PACO and which yields an optimal patch size corresponding roughly to twice the off-axis PSF full width at half maximum (FWHM). This rule typically leads to 50 ≲ K ≲ 100. Figure 2c gives examples of estimated patch covariance matrices C ^ n $ {{\widehat{\textbf{C}}}}_n $ for different angular separations. It shows that the structure of the correlations strongly depends on the distance to the star and that neglecting these correlations would be a crude approximation, especially near the star were the nuisance component is highly correlated and fluctuating. Figure 2c also emphasizes that the covariances are well preserved by the regularization (i.e., ρ n $ \widetilde{\rho}_n $ is small enough for C n $ {{\widetilde{{\textbf{C}}}}}_n $ to represent the largest contribution). We have shown in Flasseur et al. (2018a, 2020b) that such a local model of the covariances approximates well the empirical distribution of the nuisance component. In Flasseur et al. (2020b), a small discrepancy was observed near the star due to the larger fluctuations6 of the nuisance component in this area. The impact of a statistical mis-modeling in this area is further discussed in Sect. 3.3.2.

thumbnail Fig. 4.

Extraction of patch collections for local learning of the statistics of the nuisance component from ADI sequences. The contribution of off-axis sources is shown in blue.

Given the patch-based statistics Ω ^ = { m ^ , { C ^ n } n P } $ {\boldsymbol{\widehat{\Omega}}} = \{ {\boldsymbol{\widehat{m}}}, {\{ {{\widehat{\textbf{C}}}}_n \}_{n \in \mathbb{P}}} \} $ locally accounting for the fluctuations of the nuisance component, the co-log-likelihood rewrites:

D ( r , A x , Ω ^ ) = 1 2 n P ( t = 1 T | | u ^ n , t | | C ^ n 1 2 + T log det C ^ n ) , $$ \begin{aligned} \fancyscript {D}({\boldsymbol{r}}, \mathbf{A }\,{\boldsymbol{x}}, \widehat{\mathbf{\Omega }}) = \frac{1}{2} \sum \limits _{n \in \mathbb{P} } \left({ \sum _{t=1}^{T} \left||{{\boldsymbol{\widehat{u}}}_{n,t}}\right||_{{\widehat{\mathbf{C }}}_n^{-1}}^2 + T\,\log \det {{\widehat{\mathbf{C }}}_n} }\right), \end{aligned} $$(13)

where the summation over n is performed on the list ℙ of nonoverlapping K-pixel patches while the residuals u ^ n , t $ {\boldsymbol{\widehat{u}}}_{n,t} $ in the patch n for the frame t are defined in Eq. (9).

2.4. Unbiased estimation of the mean and spatial covariances

The estimators m ^ $ {\boldsymbol{\widehat{m}}} $, in Eq. (6), and C ^ n $ {{\widehat{\textbf{C}}}}_n $, in Eq. (7), that appear in Eq. (13) both depend on the object flux x which is unknown. To handle this problem, the simplest solution would be to neglect the object contribution as it is typically small compared to the nuisance component f. Under this assumption, Eqs. (6) and (8) rewrite:

Algorithm 1REXPACO reconstruction (Alternating estimation approach).

m ^ [ 1 ] = 1 T t = 1 T r t , $$ \begin{aligned} {\boldsymbol{\widehat{m}}}^{[1]}&= \frac{1}{T}\sum \limits _{t=1}^{T} {\boldsymbol{r}}_t, \end{aligned} $$(14a)

C n [ 1 ] = 1 T t = 1 T ( P n ( r t m ^ [ 1 ] ) ) ( P n ( r t m ^ [ 1 ] ) ) , $$ \begin{aligned} {\widetilde{\mathbf{C }}}_n^{[1]}&= \frac{1}{T} \sum \limits _{t=1}^{T} \left({\mathbf{P }_{n} \left({{\boldsymbol{r}}_t - {\boldsymbol{\widehat{m}}}^{[1]}}\right)}\right) \left({\mathbf{P }_{n} \left({{\boldsymbol{r}}_t - {\boldsymbol{\widehat{m}}}^{[1]}}\right)}\right) ^{\top }, \end{aligned} $$(14b)

and correspond respectively to the sample mean and the sample covariances learned locally from the image patches.

We show in Appendix B that neglecting the light flux of off-axis objects when characterizing the nuisance component leads to biased estimations, a problem common to many ADI post-processing methods and known as the self-subtraction problem (Milli et al. 2012; Pairet et al. 2019). To recover a better estimation of the object flux, it is then necessary to develop an unbiased estimation of the statistics Ω. To do so, two strategies are possible: (i) an alternation between the reconstruction of x ^ $ {\boldsymbol{\widehat{x}}} $ and the update of the statistics Ω ^ $ \widehat{{\mathbf{\Omega}}} $ of the nuisance component f from the residuals r A x ^ $ {\boldsymbol{r}} - {\mathbf{A}}\,{\boldsymbol{\widehat{x}}} $, until convergence, as done in Algorithm 1, or (ii) a joint estimation approach implemented by Algorithm 2. In this paper, we used η = 10−8 in the stopping condition of Algorithm 1 for all the reconstructions. Approach (ii) implemented by Algorithm 2 converges faster and is especially beneficial when reconstructing structures that are close to circular symmetry such as in Figs. 7 and 8. In order to be equivalent to Algorithm 1, Algorithm 2 requires replacing the data-fidelity term 𝒟 defined in Eq. (13) by (the proof is detailed in Appendix C):

Algorithm 2REXPACO reconstruction (Joint estimation approach).

D joint ( r , x ) = T 2 n P log det C ^ n + 1 2 n P tr ( C ^ n 1 ( W n t = 1 T u ^ n , t u ^ n , t ) ) , $$ \begin{aligned} \fancyscript {D}_{\mathrm{joint}}({\boldsymbol{r}}, {\boldsymbol{x}})&= \frac{T}{2} \sum \limits _{n \in \mathbb{P} } \log \det {{\widehat{\mathbf{C }}}_n} \nonumber \\&\quad + \frac{1}{2} \sum \limits _{n \in \mathbb{P} } \mathrm{tr}\left({{\widehat{\mathbf{C }}}_n^{-1}\,\left({ \widetilde{\mathbf{W }}_n \odot \sum _{t=1}^{T} {\boldsymbol{\widehat{u}}}_{n,t}\, \widehat{{\boldsymbol{u}}}_{n,t}^{\top }}\right) }\right), \end{aligned} $$(15)

with u ^ n , t $ {\boldsymbol{\widehat{u}}}_{n,t} $ the residuals in the patch n of frame t defined in Eq. (9). This modified data-fidelity corresponds to the co-log-likelihood under a Gaussian assumption except that shrinkage matrices W n $ \widetilde{{\mathbf{W}}}_n $ are introduced7 so that the shrunk covariances C ^ n $ {{\widehat{\textbf{C}}}}_n $ defined in Eq. (10) are minimizers of 𝒟joint. The mean m ^ $ {\boldsymbol{\widehat{m}}} $ and shrunk covariances C ^ n $ {{\widehat{\textbf{C}}}}_n $ can thus be replaced by their closed-form expressions given in Eqs. (6) and (10) that depend on x and the minimization be performed solely on x. Beyond the faster convergence of Algorithm 2, the two algorithms differ slightly due to the shrinkage factors not being refined throughout the reconstruction in Algorithm 2. We found the impact of that discrepancy to be barely noticeable in the reconstruction results and hence would recommend the general usage of Algorithm 2 for its improved speed.

2.5. Regularization term

After discussing the data-fidelity term 𝒟(r, Ax, Ω) in the previous sections, we focus here on the regularization term ℛ(x, μ) and on the resolution of the inverse problem (2).

Regularization terms are introduced to enforce prior knowledge about the unknown object x and to improve the conditioning of the inversion. Two classical regularizations are considered in the proposed method. The first one promotes smooth objects with sharp edges by favoring the sparsity of their spatial gradients. This is the goal of the so-called 2 − 1 edge-preserving regularization (Charbonnier et al. 1997; Mugnier et al. 2004; Thiébaut 2006) widely used in image processing:

R smooth ( x , ϵ ) = n = 1 N | | Δ n x | | 2 2 + ϵ 2 , $$ \begin{aligned} \fancyscript {R}_{\mathrm{smooth}}({\boldsymbol{x}}, \epsilon ) = \sum \limits _{n=1}^N \sqrt{ || \mathbf{\Delta }_n \, {\boldsymbol{x}}||_2^2+ \epsilon ^2}, \end{aligned} $$(16)

where Δn approximates the spatial gradient at pixel n by finite differences and ϵ > 0 sets the edge-preservation behavior of the regularization ( ϵ = 10 7 $ \epsilon = \sqrt{10^{-7}} $ in all the experiments of this paper): Local differences Δnx that are below ϵ in magnitude are smoothed similarly as a quadratic regularization would, while larger differences are preserved (e.g., border of a disk). This classical regularization is sufficiently flexible to remain adapted to different disk morphologies, as illustrated by Figs. 58 where elliptical disks with sharp edges and spiral disks with smooth edges are considered. Since we aim to reconstruct a well-contrasted object on an dark background, we also consider a second regularization term promoting the sparsity of the object x by penalizing its separable 1-norm:

R 1 ( x ) = n = 1 N | x n | = x 0 n = 1 N x n . $$ \begin{aligned} \fancyscript {R}_{\ell _1}({\boldsymbol{x}}) = \sum \limits _{n=1}^N |x_n| \underset{{\boldsymbol{x}} \ge {\boldsymbol{0}}}{=} \sum \limits _{n=1}^N x_n. \end{aligned} $$(17)

thumbnail Fig. 5.

Reconstructions of a simulated elliptical disk: (a) comparisons between cADI image combinations and reblurred REXPACO reconstructions; (b) resolution improvement achieved by deconvolution with REXPACO. The average reconstructions are computed over ten injections of the simulated disk with various orientations with respect to the background.

This sparsity-promoting regularization term is also well adapted to restore point-like sources. We discuss in Sect. 4 how an alternating estimation strategy can be applied to further improve the separation of overlapping point-like sources and extended structures. Due to the positivity constraint in Eq. (2) that guarantees that reconstructed light fluxes are nonnegative (i.e., x  ≥  0), the 1-norm in Eq. (17) boils down to ∑nxn which is a differentiable expression and smooth optimization techniques are applicable (see Appendix A). The two terms ℛsmooth and ℛ1 are combined by:

R ( x , μ ) = μ smooth R smooth ( x , ϵ ) + μ 1 R 1 ( x ) , $$ \begin{aligned} \fancyscript {R}({\boldsymbol{x}}, {\boldsymbol{\upmu }}) = \upmu _{\mathrm{smooth}}\,\fancyscript {R}_{\mathrm{smooth}}({\boldsymbol{x}}, \epsilon ) + \upmu _{\ell _1}\,\fancyscript {R}_{\ell _1}({\boldsymbol{x}}), \end{aligned} $$(18)

where μ = {μsmooth, μ1} balances their relative weight with respect to the data-fidelity term 𝒟. For a given set of parameters μ, we solve the constrained minimization problem (2) by running the VMLM-B algorithm (Thiébaut 2002). Some technical elements about this minimization technique can be found in Appendix A.

3. Results

3.1. Evaluation on simulated disks

We first evaluate quantitatively, on numerical simulations, the ability of REXPACO to estimate a faithful image of the light flux distribution of off-axis sources. We consider two simulated disks with very different morphologies: (i) a spatially centered elliptical disk with an eccentricity about 0.95 and with sharp edges; (ii) a spiral disk with two arms whose center is shifted by ten pixels from the star center in one of two spatial directions. Contrary to the elliptical disk, the spiral disk has smooth edges. These simulated disks resemble the actual circumstellar disks presented in Sect. 3.3.2 so that our simulations can help to assess the quality of the reconstructions of real circumstellar disks. Each simulated disk is injected into a data set of HIP 80019 with no known off-axis source, at three different contrast levels: αgt ∈ {1 × 10−6, 5 × 10−6, 1 × 10−5}. Reconstructions of the elliptical disk are performed with Algorithm 1 while the spiral disk that requires many more iterations of Algorithm 1 to converge has been processed with Algorithm 2. A total of 60 reconstructions have been performed: for each disk and each level αgt, the simulated disk has been injected in one of ten different orientations with respect to the background in order to evaluate the mean and variance of the reconstructions.

Simulations of the elliptical disk are reported in Figs. 5 and 6. We compare REXPACO reconstructions to the cADI image combination technique. Since cADI does not perform a deconvolution, the comparison is performed at the resolution of the instrument, that is to say the images produced by cADI are shown next to the reblurred reconstructions H x ^ $ {\mathbf{H}}\,{\boldsymbol{\widehat{x}}} $ of REXPACO in Figs. 5a and 6a. Large errors can be noted in cADI images, in particular at small angular separations. The computation of the average reconstruction obtained for the ten different orientations of the disk with respect to the background indicates the presence of systematic errors with cADI: an under-estimation of the light-flux in the region of the disk that is closest to the star (an issue due to limited angular diversity). In contrast, REXPACO reconstructions are close to the ground truth, even for the lower level of contrast αgt = 10−6. The deblurred reconstructions shown in Figs. 5b and 6b are in good agreement with the ground truth. Unsurprisingly, the reconstruction quality is higher when the disk is brighter: more spurious fluctuations are visible in the deblurred reconstruction at αgt = 10−6 than at αgt = 5 × 10−6 or αgt = 10−5. Although some discrepancies can be noted in the deblurred line profiles, the resolution is improved by the deconvolution process.

thumbnail Fig. 6.

Line profiles extracted from the reconstructions at αgt = 10−6 shown in Fig. 5.

Simulations of the spiral disk are reported in Figs. 7 and 8. The comparison with cADI shows a clear improvement of the reconstructions with REXPACO: cADI fails to recover most of the disk. At the lowest contrast αgt = 10−6 and owing to the near circular-symmetry of the structures close to the star, the spiral disk is challenging to disentangle from the nuisance component. REXPACO is however able to recover some parts of the spiral arms although their flux is underestimated. At the highest contrast αgt = 10−5, line profiles of Fig. 8 indicate that REXPACO reconstructions are strongly improved compared to cADI and that some errors remain in particular at the lowest angular separations.

thumbnail Fig. 7.

Reconstructions of a simulated spiral disk: (a) comparisons between cADI image combinations and reblurred REXPACO reconstructions; (b) resolution improvement achieved by deconvolution with REXPACO. The average reconstructions are computed over ten injections of the simulated disk with various orientations with respect to the background.

thumbnail Fig. 8.

Line profiles extracted from the reconstructions at αgt = 10−5 shown in Fig. 7.

The reconstructions are quantitatively evaluated by reporting the normalized root mean square error (N-RMSE, the lower the better) computed over different regions (the disk, the background, and the whole image):

N RMSE ( x gt , x ^ ) = | | x gt x ^ | | 2 | | x gt | | 2 . $$ \begin{aligned} \mathrm{N-RMSE}({\boldsymbol{x}}_{\mathrm{gt}}, {\boldsymbol{\widehat{x}}}) = \frac{|| {\boldsymbol{x}}_{\mathrm{gt}} - {\boldsymbol{\widehat{x}}}||_2}{||{\boldsymbol{x}}_{\mathrm{gt}}||_2}\,. \end{aligned} $$(19)

Table 2 reports N-RMSE values and indicates a clear improvement over cADI: the errors are typically divided by a factor of three to four (the error reduction is more modest for the more challenging reconstructions of the spiral disk at the lowest fluxes).

Table 2.

Quantitative evaluation of the reconstructions on simulated elliptical and spiral disks: N-RMSE, defined in Eq. (19), for the reconstructions shown in Figs. 58.

3.2. Automatic setting of the regularization parameters

The setting of the parameters μ is of uttermost importance since the results of the regularized reconstructions depend both on the characteristics of the data set (e.g., observing conditions, amplitude of the field rotation) and on the properties of the objects to reconstruct (e.g., morphology, contrast). The parameters μ can be tuned manually by trial and error until the reconstruction is qualitatively acceptable, but this approach is not completely satisfactory since it is time-consuming and user-dependent. Several methods have been developed in the literature to automatically tune regularization parameters by minimizing a quantitative criterion (Craven & Wahba 1978; Wahba 1985; Stein 1981). In this work, we consider the Stein’s unbiased risk estimator approach (SURE; Stein 1981), which minimizes an estimate of the mean-square error (MSE) in the data space:

MSE ( μ ) = | | A ( x gt x ^ μ ( r ) ) | | C 1 2 , $$ \begin{aligned} \mathrm{MSE}({\boldsymbol{\upmu }}) = ||\mathbf{A }\left({\boldsymbol{x}}_{\mathrm{gt}} - {\boldsymbol{\widehat{x}}}_{{\boldsymbol{\upmu }}}({\boldsymbol{r}})\right)||_{\mathbf{C }^{-1}}^2, \end{aligned} $$(20)

where xgt is the (unknown) ground truth object, and x ^ μ ( r ) $ {\boldsymbol{\widehat{x}}}_{{\boldsymbol{{\upmu}}}}({\boldsymbol{r}}) $ denotes the reconstructed object x ^ $ {\boldsymbol{\widehat{x}}} $ obtained from the data r when using regularization parameters μ. For any given set of parameters μ, the criterion SURE(μ) provides an unbiased estimation of MSE(μ) that does not require knowledge of the true object xgt (Stein 1981):

SURE ( μ ) = | | r m ^ A x ^ μ ( r ) | | C 1 2 + 2 tr ( A J μ ( r ) ) N T , $$ \begin{aligned} \mathrm{SURE}({\boldsymbol{\upmu }})&= \left||{ {\boldsymbol{r}} - {\boldsymbol{\widehat{m}}} - \mathbf{A }\,\widehat{{\boldsymbol{x}}}_{{\boldsymbol{\upmu }}}({\boldsymbol{r}})}\right||_{\mathbf{C }^{-1}}^2 \nonumber \\&\quad + 2\mathrm{tr}{\left(\mathbf{A } \, \mathbf{J }_{{\boldsymbol{\upmu }}}({\boldsymbol{r}})\right)} - N\,T, \end{aligned} $$(21)

where J μ ( r ) = x ^ μ ( r ) / r $ {\mathbf{J}}_{{\boldsymbol{{\upmu}}}}({\boldsymbol{r}}) = \partial{\boldsymbol{\widehat{x}}}_{{\boldsymbol{{\upmu}}}}({\boldsymbol{r}})/\partial{\boldsymbol{r}} $ is the Jacobian matrix of the mapping r x ^ μ ( r ) $ {\boldsymbol{r}}\to{\boldsymbol{\widehat{x}}}_{{\boldsymbol{{\upmu}}}}({\boldsymbol{r}}) $ with respect to the components of the data r: [ J μ ( r ) ] a , b = [ x ^ μ ( r ) ] a / [ r ] b $ [{\mathbf{J}}_{{\boldsymbol{{\upmu}}}}({\boldsymbol{r}})]_{a,b}=\partial[{\boldsymbol{\widehat{x}}}_{{\boldsymbol{{\upmu}}}}({\boldsymbol{r}})]_a/\partial[{\boldsymbol{r}}]_b $. Since there is no closed-form expression for x ^ μ ( r ) $ {\boldsymbol{\widehat{x}}}_{{\boldsymbol{{\upmu}}}}({\boldsymbol{r}}) $ (it is obtained by an iterative process), it is complex to derive tr(AJμ(r)). An alternative proposed by Girard (1989) considers a Monte Carlo perturbation method to numerically approximate this term by finite-differences:

tr ( A J μ ( r ) ) ξ 1 b A [ x ^ μ ( r + ξ b ) x ^ μ ( r ) ] , $$ \begin{aligned} \mathrm{tr}{\left(\mathbf{A } \, \mathbf{J }_{{\boldsymbol{\upmu }}}({\boldsymbol{r}})\right)} \approx \xi ^{-1}\, {\boldsymbol{b}}^{\top }\, \mathbf{A } \, \left[{\boldsymbol{\widehat{x}}}_{{\boldsymbol{\upmu }}}({\boldsymbol{r}} + \xi {\boldsymbol{b}}) - {\boldsymbol{\widehat{x}}}_{{\boldsymbol{\upmu }}}({\boldsymbol{r}}) \right]\!, \end{aligned} $$(22)

where b ∈ ℝNT is an independent and identically distributed pseudo-random vector of unit variance and ξ is the amplitude of the perturbation. While the precise setting of ξ is not a crucial point of the method, it must be chosen both large enough to prevent errors due to numerical underflows in the computation of the difference x ^ μ ( r + ξ b ) x ^ μ ( r ) $ {\boldsymbol{\widehat{x}}}_{{\boldsymbol{{\upmu}}}}({\boldsymbol{r}} + \xi {\boldsymbol{b}}) - {\boldsymbol{\widehat{x}}}_{{\boldsymbol{{\upmu}}}}({\boldsymbol{r}}) $ and small enough so that the approximation in Eq. (22) remains valid (the effects of the perturbation being nonlinear in the model). In practice, when we evaluate Eq. (22), we set ξ = 0.1 × MAD(r), where the median absolute deviation MAD(r) = median(|r − median(r)|) is a robust estimator of the standard-deviation.

As in our previous derivation of the statistical model of the nuisance component, we approximate the full covariance matrix C that appears in Eq. (21) as block-diagonal, with blocks corresponding to the partition of the image into nonoverlapping patches. The patch covariance model and the perturbation approximation lead to the following risk estimator:

SURE ( μ ) n P t = 1 T | | P n ( r t m ^ A x ^ μ ( r ) ) | | C ^ n 1 2 + ( 2 / ξ ) b A [ x ^ μ ( r + ξ b ) x ^ μ ( r ) ] N T . $$ \begin{aligned} \mathrm{SURE}({\boldsymbol{\upmu }})&\approx \sum \limits _{n \in \mathbb{P} } \sum _{t=1}^{T} \left||{\mathbf{P }_n \left( {\boldsymbol{r}}_t - {\boldsymbol{\widehat{m}}} - \mathbf{A }\,\widehat{{\boldsymbol{x}}}_{{\boldsymbol{\upmu }}}({\boldsymbol{r}}) \right)}\right||_{{\widehat{\mathbf{C }}}_n^{-1}}^2\nonumber \\&\quad + (2/\xi ) \, {\boldsymbol{b}}^{\top }\, \mathbf{A } \, \left[ {\boldsymbol{\widehat{x}}}_{{\boldsymbol{\upmu }}}({\boldsymbol{r}} + \xi {\boldsymbol{b}}) - {\boldsymbol{\widehat{x}}}_{{\boldsymbol{\upmu }}}({\boldsymbol{r}}) \right] - N\,T. \end{aligned} $$(23)

The optimal value μ ^ SURE $ {\boldsymbol{\widehat{{\upmu}}}}^{\mathrm{SURE}} $ of the regularization parameters is obtained by minimizing the quantity (23) with respect to μ.

We have validated on a numerical example the automatic tuning of the regularization parameters μ: The simulated elliptical disk is injected at the level of contrast αgt = 10−5 on the data set of HIP 80019. Our SURE risk estimator given in Eq. (23) is evaluated for different values of the parameters μ. Figure 9 compares this criterion with the true MSE: The two criteria reach a global minimum for similar hyper-parameter values. The SURE criterion is intrinsically non-convex and additional non-convexities arise due to the sensitivity of the evaluation of the term tr(AJμ(r)) by finite differences. Rather than proceeding by local minimization of SURE(μ), it is therefore safer to systematically evaluate the criterion on a 2D grid in order to identify the global minimum.

thumbnail Fig. 9.

Comparison between the SURE criterion accounting for the fluctuations of the nuisance component through a local learning of the covariances (left, see Eq. (23)) and the true MSE (right, see Eq. (20)). The pink circles represent the global minimum of each criterion. Data set: HIP 80019, see Sect. 3.3.1 for observing conditions.

Figure 10 illustrates the impact of the regularization in three cases: an under-regularized setting (i.e., μ → 0+), the automatic setting μ ^ SURE $ {\boldsymbol{\widehat{{\upmu}}}}^{\mathrm{SURE}} $ based on the minimization of the SURE criterion (23), and the best-possible setting μMSE computed using the ground truth by minimizing the MSE criterion (20). The morphology of the reconstructed disk is improved by the regularization: It is smoother and sharper thanks to the edge-preserving term. The background of x ^ $ {\boldsymbol{\widehat{x}}} $ is also less noisy (closer to zero) due to the 1-norm penalization. It can be noted that in this simulation, the reconstructed object x ^ $ {\boldsymbol{\widehat{x}}} $ with μ ^ SURE $ \widehat{{\boldsymbol{{\upmu}}}}^{\mathrm{SURE}} $ is flatter and sharper than the disk reconstructed with the optimal parameters because the optimal parameters lead to a better reconstruction outside of the disk (reconstructed values are very close to zero: the ground truth value) at the expense of a slightly noisier reconstruction of the disk. In Fig. 9, the location of the global minimum (denoted by a circle) corresponds to parameters μ ^ smooth SURE > μ smooth MSE $ \widehat{{\upmu}}_{\mathrm{smooth}}^{\mathrm{SURE}} > {\upmu}_{\mathrm{smooth}}^{\mathrm{MSE}} $ and μ ^ 1 SURE < μ 1 MSE $ \widehat{{\upmu}}_{\ell_1}^{\mathrm{SURE}} < {\upmu}_{\ell_1}^{\mathrm{MSE}} $, which leads to a stronger smoothing of the object with μ ^ SURE $ {\boldsymbol{\widehat{{\upmu}}}}^{\mathrm{SURE}} $ and a better rejection of close-to-zero noise in the background with μ ^ MSE $ {\boldsymbol{\widehat{{\upmu}}}}^{\mathrm{MSE}} $. This qualitative observation is also confirmed by Table 3 which reports RMSE8 scores between the ground truth xgt and the reconstructed flux distribution x ^ $ {\boldsymbol{\widehat{x}}} $ ( RMSE = 1 M | | x gt x ^ | | 2 $ \mathrm{RMSE} = \tfrac{1}{M}{||{\boldsymbol{x}}_{\mathrm{gt}} - {\boldsymbol{\widehat{x}}}||_2} $). Table 3 shows that the quality of the reconstruction obtained with the regularization parameters μ ^ SURE $ {\boldsymbol{\widehat{{\upmu}}}}^{\mathrm{SURE}} $ estimated by minimizing the SURE criterion (22) is very close to the best-possible reconstruction obtained by minimizing the MSE criterion (20).

thumbnail Fig. 10.

Influence of the regularization on the REXPACO reconstructions. xgt (1st column) is the ground truth elliptical disk injected at the level of flux αgt = 1 × 10−5. The flux distribution x ^ $ {\boldsymbol{\widehat{x}}} $ is reconstructed with a slight amount of regularization (μ → 0+; 2nd column), with μMSE derived by minimizing the optimal MSE criterion (3rd column, see Eq. (20)) and its estimate μ ^ SURE $ {\boldsymbol{\widehat{{\upmu}}}}^{\mathrm{SURE}} $ through the SURE criterion (4th column, see Eq. (23)). Bottom panel: zooms on two regions of interest (ROIs) extracted from the reconstructions given in the top panel. Data set: HIP 80019, see Sect. 3.3.1 for observing conditions.

Table 3.

RMSE scores of REXPACO reconstructions of a simulated elliptical disk, for three different regularization levels.

3.3. Evaluation on data sets with circumstellar disks

In this section, we evaluate the ability of REXPACO to reconstruct light flux distributions of actual circumstellar disks from ADI sequences. The results are compared to two existing methods, cADI and PCA (see Sect. 1 for their respective principle), both designed for the detection of point-like sources but used extensively with different tuning parameters to process data with disks (see Milli et al. 2012; Pairet et al. 2019 for studies of the resulting post-processing artifacts). We also compare REXPACO to PACO. The rationale behind this last comparison is to assess the benefit of the reconstruction framework of REXPACO in addition to the common statistical modeling of the spatial covariances shared by the two methods. For cADI and the PCA, we used the SpeCal pipeline (Galicher et al. 2018) which is the post-processing standard of the SPHERE data center (Delorme et al. 2017). The PACO implementation is based on the algorithm described in Flasseur et al. (2018a,b,c). A MATLAB™ implementation of the REXPACO main routines is available online9 as a purpose of illustration. It is based on GLOBALBIOIM10, an opensource software for image reconstruction through an inverse problem framework (Soubies et al. 2019). Before post-processing with the tested algorithms, the data sets are preprocessed and calibrated with the prereduction tools (Pavlov et al. 2008; Maire et al. 2016) of the SPHERE consortium available at the SPHERE Data Center (Delorme et al. 2017). In particular, background, flat-field, bad pixels, parallactic angles, wavelength, true-north and astrometric calibrations are performed during this step.

3.3.1. Data sets description

For the comparisons, we have selected four data sets from the VLT/SPHERE-IRDIS instrument obtained by the observation of stars hosting known circumstellar disks of various morphological structures (e.g., face-on versus edge-on configuration, presence of inner and outer gaps, spirals, etc.). The rationale of such a selection is to test the versatility and the robustness of REXPACO in reconstructing the flux distribution of circumstellar disks with various morphologies. The considered data sets were recorded on the following targets. Table 4 summarizes the main observing conditions of the corresponding ADI sequences.

Table 4.

Observing conditions of ADI sequences from the VLT/SPHERE-IRDIS instrument considered in this paper.

HR 4796A (HD 109573A) is the primary member of a binary system of the TW Hydrae association with an age about 12 Myr (Bell et al. 2015) located at a distance about 72.8 pc (Van Leeuwen 2007). HR 4796A hosts a debris disk in face-on configuration first observed by the Hubble Space Telescope (Schneider et al. 1999). Since then, its morphology has been studied intensively with ground-based facilities (Milli et al. 2017, 2019). The relatively bright disk (contrast between 10−5 and 10−4) exhibits a thin ring structure with a semimajor axis about 77 au, an inclination by about 76°, and a high surface brightness which suggest the presence of exoplanets although no companions have been detected to date.

RY Lupi (HIP 78094) is a T Tauri star of the Scorpius-Centaurus association with an age about 10–20 Myr (Pecaut et al. 2012) located at a distance about 151 pc (Brown et al. 2016). It hosts a transition disk directly observed for the first time in scattered light with the SPHERE instrument (Langlois et al. 2018). The disk is close to an edge-on configuration with an inclination about 70°, and it exhibits asymmetries and warped features near the star due to its interaction with the star magnetosphere. Observations in ADI and PDI have shown that the structure of the disk could be interpreted as spiral arms possibly due to the presence of putative low-mass exoplanets orbiting exterior to the spiral arms (Langlois et al. 2018).

SAO 206462 (HD 135344B) is the secondary member of a binary system of the Upper Centaurus Lupus constellation with an age about 9 Myr (Müller et al. 2011) located at a distance about 157 pc (Brown et al. 2016). SAO 206462 hosts a transition disk near a face-on configuration with an inclination about 11°, first resolved both in thermal emission (Doucet et al. 2006) and in scattered light (Grady et al. 2009). The disk exhibits two spiral arms, asymmetric features and an inner cavity. Its structure is of high interest since it could be due to the presence of low-mass exoplanets inside the arms or the cavity as shown by recent analysis of high contrast and high resolution observations (Maire et al. 2017).

PDS 70 (V1032 Centauri) is a T Tauri star of the Scorpius-Centaurus association with an age about 5 Myr (Müller et al. 2018) located at a distance about 113 pc (Brown et al. 2016). It hosts a protoplanetary disk and two confirmed exoplanets (PDS 70b and PDS 70c) in formation inside the disk. The protoplanetary disk hosted by PDS 70 was first resolved with the Naos Conica instrument (NaCo; Lenzen et al. 2003) at the VLT (Riaud et al. 2006). It exhibits complex structures in the form of several arcs, outer and inner gaps, and potential spiral arms at the north side of the outer disk (Keppler et al. 2018; Mesa et al. 2019). The exoplanet PDS 70b has been detected by direct imaging with the VLT/SPHERE instrument (Keppler et al. 2018) in near infrared, and the exoplanet PDS 70c has been unveiled with the VLT/MUSE instrument in Hα (Haffert et al. 2019). To date, this is the unique system embedding multiple nascent exoplanets. The analysis of the structure of this system is extremely valuable to understand exoplanet formation mechanisms and raises still-open questions like the presence of additional putative exoplanets (Mesa et al. 2019).

3.3.2. Reconstruction of circumstellar disks

We apply REXPACO to the four selected ADI sequences presented in Table 4. For each, the computation takes a few hours (MATLAB™ implementation, processor Intel i9-9880H at 2.3 GHz). Figure 11 displays the results obtained with PACO, PCA and cADI. For the PCA, we have tested several choices for the numbers of modes used in the decomposition and selected the best choice, corresponding to a small number of modes in order to limit the issue of self-subtraction (see Pairet et al. 2019 for a discussion about this effect). The images produced by PACO are obtained with a maximum likelihood estimator of the light flux derived for point-like sources. For REXPACO, the results take the form of reconstructed flux distributions x ^ $ {\boldsymbol{\widehat{x}}} $. We also give the blurred versions H x ^ $ {\mathbf{H}} \, {\boldsymbol{\widehat{x}}} $ to simplify the comparison with methods that do not perform a deconvolution. For the PCA and cADI, the results take the form of residual images obtained after subtraction of the estimated on-axis PSF. These residual images are normalized to the star flux that is encoded in the on-axis PSF provided with the data. Thus, the resulting images can also be interpreted as blurred flux distributions of the off-axis sources.

thumbnail Fig. 11.

Images of the flux distribution reconstructed by REXPACO comparatively to the PACO, PCA and cADI algorithms. For REXPACO, both the deblurred reconstruction x ^ $ {\boldsymbol{\widehat{x}}} $ and its reblurred version H x ^ $ {\mathbf{H}} \, {\boldsymbol{\widehat{x}}} $ are given for comparison with the other considered methods that do not produce deblurred images of the flux distribution. Data sets: HR 4796A, RY Lupi, SAO 206462 and PDS 70, see Sect. 3.3.1 for observing conditions.

Figure 11 shows that the mean levels of flux estimated by PCA and cADI differ significantly from the flux estimated by REXPACO and PACO11: The PCA and cADI are sensitive to several causes of artifacts severely altering both the morphology and the photometry of the disks. In particular, cADI and the PCA suffer from positive and negative replicas and also from the self-subtraction phenomenon. cADI performs much better in the presence of extended features like disks due to its limited attenuation of the disk signal. However, cADI is also sensitive to the ADI artifacts creating morphological distortions like nonphysical discontinuities in the structure of the disks, as discussed in Sect. 1. This is especially the case for the thin and bright ring surrounding HR 4796A. As discussed in Sect. 2.4, the REXPACO reconstruction exhibits an almost continuous elliptical structure, a flux asymmetry on the west side of the ring as predicted by intensity scattering models (Milli et al. 2017), and the scattering structures predicted by radiative transfer models near the disk extremities (Lagrange et al. 2012). The morphological structure of the reconstructed disk is very close the one obtained from polarimetric observations that are almost exempt from artifacts at such separations (Milli et al. 2019). The faithfulness of the REXPACO reconstruction is also supported by the simulations presented in Sect. 2.4 (see Figs. 5 and 6) where we injected a simulated elliptical disk with comparable eccentricity and spatial extent than the HR 4796A disk. In this latter case, no significant artifacts were identified in the reconstruction even though the simulated disk was one order of magnitude fainter than the HR 4796A disk. The REXPACO reconstruction of RY Lupi also exhibits reduced ADI artifacts. As seen on Fig. 11, the REXPACO reconstructions show better the smooth and faint extended disk structures. In addition, the reconstructed flux distribution uncovers new details around the brightest part of the disk such as spirals as well as the bottom and top disk planes.

The comparison with PACO is also interesting. While this latter algorithm demonstrates significantly better performance than the PCA to detect point-like sources (Flasseur et al. 2018a), it performs worse than the PCA and cADI in the presence of extended features since it assumes that the pattern to detect is known and takes the form of a point-like source. Besides, parts of the disks are lost in the PACO reconstructions due to the subtraction of the mean component of the ADI sequence12. This effect is especially problematic for SAO 206462 since the nuisance component is hidden by the disk in almost all temporal frames due to the quasi-circular symmetry of the disk. The combination of the reconstruction framework of REXPACO with its iterative strategy reducing bias of the statistics of the nuisance component leads to a physically more exploitable image of the flux distribution. The quality of the reconstruction is supported by the simulations performed in Sect. 2.4 (see Figs. 7 and 8) where a numerical model of a spiral disk with two arms of comparable spatial extent is injected to the data. These numerical tests have shown that REXPACO is able to reconstruct a faithful image of the flux distribution of such a type of disk for levels of contrast ≳5 × 10−6, that is to say about five to ten times fainter than the mean reconstructed level of flux of the SAO 206462 disk. The bright asymmetry in the southwest side of the disk could be interpreted as a reconstruction artifact possibly due to a local mis-modeling of the nuisance component, but we did not experience such an effect in our simulations for the typical level of flux of SAO 206462. Besides, the REXPACO reconstruction exhibits detailed structures of the spiral arms with a better rejection of the nuisance component near the star than the ADI and RDI post-processings performed in Maire et al. (2017) on the same sequence of observations.

Finally, the case of PDS 70 is particularly interesting due to the complex structure of the disk and to the presence of two confirmed exoplanets at formation stage inside the disk. Concerning the disk structure, the outer ring reconstructed by REXPACO is free from the self-subtraction artifacts that were visible on the cADI, PCA and PACO reductions. The bright asymmetry of the disk in the west side of the outer disk and the possible spiral-shaped features pointed by Müller et al. (2018) in the north side of the disk are also detectable in the REXPACO reconstruction. Concerning the point-like sources, the exoplanet PDS 70b exhibits a bright flux at λ = 2.11 μm (K1 spectral band) as previously reported by Müller et al. (2018), Mesa et al. (2019). Re-detecting the exoplanet PDS 70c is more difficult since it is embedded inside the disk material as reported by Haffert et al. (2019), Mesa et al. (2019). We propose in Sect. 4 a processing strategy to disentangle the contribution of point-like sources from the spatially extended contribution of the circumstellar disk. Besides, a putative point-like feature (PLF) has been recently identified by Mesa et al. (2019) based on the analysis of several near-infrared angular plus spectral differential imaging (ASDI) sequences by various post-processing methods. This PLF can be (marginally) detected from the REXPACO reconstruction while it is indistinguishable in the same ADI sequence processed by PCA (see Fig. 1 of Mesa et al. 2019), as it is also the case for our PCA reduction given in Fig. 11.

Figure 12 shows the importance of the modeling of spatial covariances in REXPACO: Regularized inversions with a statistical model of the nuisance component that neglects all covariances (but still accounts for the nonstationary variances) are given in Fig. 12a, to compare with REXPACO reconstruction shown in Fig. 12b. When spatial covariances are omitted, the reconstructions are worse. This is particularly visible on HR 4796A, in the background and close to the star, and on SAO 206462, close to the star.

thumbnail Fig. 12.

Importance of the modeling of spatial covariances. Comparison between (a) regularized inversions with a model of the nuisance component that accounts for the nonstationary variances but neglect covariances, and (b) REXPACO inversions. Reconstructions are given first with a linear scale, then with a square-root scaling in order to better distinguish the lowest reconstructed values. Data sets: HR 4796A, RY Lupi, SAO 206462 and PDS 70, see Sect. 3.3.1 for observing conditions.

Figure 13 completes this study by showing that the method for automatically setting the regularization hyper-parameters discussed in Sect. 3.2 leads to satisfying reconstructions: The HR 4796A disk is smoother, with sharper edges, and the background is less noisy when automatically selected hyper-parameters μ ^ SURE $ {\boldsymbol{\widehat{{\upmu}}}}^{\mathrm{SURE}} $ are used compared to the case of under-regularization μ → 0+. Conversely, over-regularization ( μ = 20 × μ ^ SURE $ {\boldsymbol{{\upmu}}} = 20\times{\boldsymbol{\widehat{{\upmu}}}}^{\mathrm{SURE}} $) leads to strong artifacts mimicking the self-subtraction effects, especially near the star, due to the large penalization of the 1-norm of the reconstructed flux distribution. The extremities of the disk ring also exhibit staircase-artifacts. These effects are signs of an over-regularization by the edge-preserving term.

thumbnail Fig. 13.

Effect of the setting of the regularization parameters on REXPACO reconstructions. (a) SURE criterion derived from Eq. (23); (b) comparative reconstructions (first row: μ → 0, i.e., under-regularization; second row: μ ^ SURE = { 10 6 , 10 6 } $ {\boldsymbol{\widehat{{\upmu}}}}^{\mathrm{SURE}} = \{ 10^{6}, 10^{6}\} $, i.e., estimated optimal regularization; third row: 20 × μ ^ SURE = { 2 × 10 7 , 2 × 10 7 } $ 20\times {\boldsymbol{\widehat{{\upmu}}}}^{\mathrm{SURE}} = \{ 2\times10^{7}, 2\times10^{7}\} $, i.e., over-regularization). Two regions of interest are also displayed with a saturated colormap near the north side extremity of the disk and near the star. Data set: HR 4796A, see Sect. 3.3.1 for observing conditions.

3.3.3. Discussion of the results

Based on numerical injections of fake disks (with different orientations and typical morphologies), we have shown that REXPACO is capable of reconstructing a faithful image of the flux distribution at contrasts up to 106 for elliptical disks with a large eccentricity (about 0.95). For disks in face-on configuration, our simulations have shown that the contrast achieved by REXPACO drops by a factor between two and ten due to the increased difficulty in disentangling the disk signal from the nuisance component in such a configuration. While these first quantitative results should be complemented and refined by more extensive simulations with diverse disks and observing conditions, they give first clues about the operating range of REXPACO. In addition, REXPACO better preserves both the morphology and the photometry of the disks than the standard cADI method used routinely to process ADI sequences in the presence of circumstellar disks.

We have also applied REXPACO on four ADI sequences recorded by the VLT/SPHERE-IRDIS instrument on stars hosting known circumstellar disks with various morphologies. While there is no ground truth for these objects, we have shown that REXPACO copes with the classical ADI artifacts (e.g., nonphysical distortions and large residual stellar leakages). The images produced by REXPACO allow to recover structures in the circumstellar material that were previously identified only with methods complementing ADI such as reference or polarization differential imaging. We expect that the high contrast capability of REXPACO combined with its ability to reconstruct faithful images of the circumstellar material will be helpful in detecting new circumstellar disks and in refining the astrophysical interpretation of their flux distributions.

4. Unmixing point-like sources and extended sources

For systems formed of both an extended disk and point-like sources such as exoplanets, the image reconstruction can be improved by combining REXPACO algorithm with the exoplanet detection and characterization method PACO. Both algorithms share a common statistical modeling of the background and can thus be combined in a principled way. In our experiments, this approach proved superior to our attempts to reconstruct with REXPACO a composite object x = xext + xpnt where xext accounts for extended (smooth) structures while xpnt accounts for point-like sources and with a regularization designed to favor these specific structures in each component:

R ( x , μ ) = μ smooth R smooth ( x ext , ϵ ) + μ 1 R 1 ( x pnt ) . $$ \begin{aligned} \fancyscript {R}({\boldsymbol{x}}, {\boldsymbol{\upmu }}) = \upmu _{\mathrm{smooth}}\,\fancyscript {R}_{\mathrm{smooth}}({\boldsymbol{x}}_{\rm ext}, \epsilon ) + \upmu _{\ell _1}\,\fancyscript {R}_{\ell _1}({\boldsymbol{x}}_{\rm pnt}). \end{aligned} $$

The two main arguments in favor of combining REXPACO and PACO are the ability of PACO to account for the sub-pixel location of point-like sources (while a sparse component xpnt is restricted to the discretization of the pixel grid) and direct control on the number of point-like sources (while xpnt either displayed many more nonzero entries than the actual number of point sources, or missed the weaker point-like sources).

In a nutshell, the proposed combination of the two algorithms works as follows: (i) PACO and REXPACO are launched independently on a given ADI sequence; (ii) based on the S/N map produced by PACO and on the flux distribution produced by REXPACO, the user identifies manually putative point-like sources; (iii) REXPACO and PACO are then alternatively applied. During step (iii), the astrometry and photometry of the selected point-like sources are refined by PACO, based on the residual data obtained after subtraction of the disk contribution as currently reconstructed by REXPACO. Next, the flux distribution of the disk can be refined by REXPACO on updated residuals obtained after subtraction of the refined point-sources contribution estimated by PACO. Step (iii) is repeated until convergence of the disk and point-like sources components. In practice, we stop the alternating procedure of step (iii) when the relative evolution of both the disk and the point-like sources components is smaller than 10−8 between two consecutive iterations.

This strategy can be particularly valuable for the processing of ADI sequences of stars that host point-like sources near a circumstellar disk. In this paper, we illustrate our combined approach on the PDS 70 system hosting two confirmed exoplanets (PDS 70b and PDS 70c) inside a protoplanetary disk (Keppler et al. 2018; Haffert et al. 2019), see Sect. 3.3.2 for a discussion about the REXPACO reconstruction. Based on the PACO S/N map and REXPACO flux distribution provided by step (i) of the combined method, we manually selected the rough location of the exoplanets PDS 70b and PDS 70c, see Figs. 14a–b. We also considered the point-like feature (PLF) first identified by Mesa et al. (2019)13, see Figs. 14a–b. Figures 14c–d show the contribution of the three selected point-like features (PDS 70b, PDS 70c and the PLF) estimated by PACO and the disk contribution reconstructed by REXPACO after convergence of the proposed alternating scheme. Figures 14c–d show the good separation of point-like sources and of the extended disk structures surrounding them, thus illustrating the efficiency of this approach to disentangle the sparse contribution of the point-like sources from the spatially extended contribution of the disk. It is worth noticing that near the location of PDS 70b, the spatially extended component retrieved by the REXPACO/PACO combination exhibits an extended faint circular structure that could be a hint of a circumplanetary disk as proposed by Christiaens et al. (2019) and Isella et al. (2019). It is not yet possible to conclude on the reality of this structure recovered by REXPACO/PACO, which could also be the result of reconstruction artifacts (e.g., due to a partial unmixing of the sparse and spatially extended components). This open question would require further analysis, for instance based on numerical simulations by means of injection of fake exoplanets and of a physics-based model of the PDS 70 disk to study the influence of the exoplanet-disk interactions on the reconstruction results. Such a study is left for future work. Concerning PDS 70c, the extended feature detected in the disk component near this exoplanet is in good agreement with the submillimeter continuum emission detection (Isella et al. 2019) which confirms the hypothesis that this exoplanet is still in its accretion phase.

thumbnail Fig. 14.

Combining REXPACO with PACO on a data set of PDS 70 at λ = 2.110 μm: (a) S/N map from PACO; (b) flux distribution from REXPACO; (c) contribution of the three selected point-like sources estimated with PACO; (d) contribution of the disk reconstructed with REXPACO.

Figure 15 shows how the photometry and astrometry of the three point sources evolve throughout the alternations between REXPACO and PACO. Accounting for the presence of the disk modifies the estimated photometry of the three selected point-like sources by about 10%. The astrometry of PDS 70b and of the PLF estimated with PACO are not significantly modified by the proposed alternating scheme: These two point-like sources are not located inside a bright arc of the disk so that the disk material introduces only a small bias on PACO estimations. On the other hand, the estimated astrometry of the exoplanet PDS 70c located inside a very bright arm of the disk is modified by about 0.010 arcsec (i.e., about 1 pixel in the two spatial directions, PACO astrometry is evaluated on a subpixel grid with four nodes per pixel). The astrophysical interpretation of these results would require a dedicated study on multiple epochs. If the astro-photometric estimations we derived are confirmed, they could be significant corrective factors of the spectral energy distribution and of the orbit of the exoplanet PDS 70c.

thumbnail Fig. 15.

Combining REXPACO with PACO on a data set of PDS 70 at λ = 2.110 μm: (a) evolution of the estimated photometry; (b) evolution of the estimated astrometry, for the three selected point-like sources.

5. Conclusion

In this paper, we have introduced a new post-processing algorithm, named REXPACO, dedicated to the reconstruction of flux distributions from ADI data sets in the presence of extended features, like circumstellar disks. To the best of our knowledge, REXPACO is one of the first methods specifically designed for this task. It encompasses the statistical modeling of the fluctuations of the nuisance component (i.e., stellar leakages and additional sources of noise) at the scale of small patches of a few tens of pixels. The resulting statistics of the nuisance term are accounted for in a regularized reconstruction framework following an inverse problem approach based on a forward image formation model of the off-axis sources in the ADI sequences. We have shown that suitable spatial regularizations (1 norm and 2 − 1 edge-preserving) reduce residual reconstruction artifacts. Regularization parameters are estimated by minimization of SURE. Our numerical experiments have shown that this strategy leads to a quasi-optimal setting of these parameters. The images of the circumstellar environment produced by our algorithm are deblurred from the instrumental PSF which is accounted for in the forward model of the image formation. These different ingredients make our approach very versatile and capable of recovering both sharp edges and smooth transitions of the circumstellar material. The REXPACO algorithm is fully unsupervised: The patch size, the statistics of the nuisance component and the weights given to the regularizations are estimated in a data-driven fashion. Besides, we have illustrated the benefits of the combination of REXPACO with PACO, two algorithms dedicated to complementary tasks, to analyze ADI sequences in the presence of both circumstellar disks and candidate point-like sources. The combination of the two algorithms is able to disentangle the spatially extended contribution of circumstellar disks from the spatially localized contribution of point-like sources.

While the statistical model of REXPACO is able to capture most of the fluctuations of the nuisance component, slight stellar residuals are still present in the reconstructions. The fidelity of the underlying statistical model to the empirical distribution of the nuisance component is a key point to extract accurately the signal from off-axis sources. Beyond accurate modeling of the nuisance term, the angular motion through the ADI sequence is also a key point to unmix the light coming from off-axis objects such as a circumstellar disk and point-like sources from the stellar light that leaks from the coronagraph. For instance, objects with a circular symmetry cannot be correctly recovered since the temporal diversity brought by ADI is not sufficient in this case. Only additional information such as the spectral measurements obtained with angular plus spectral differential imaging (ASDI) in addition to the temporal diversity of ADI would make the reconstruction possible in those cases. The refinement and adaptation of REXPACO to the two above-mentioned points require specific refinements that will be developed in a future paper.


1

We refer to the term “nuisance component” to designate the contribution of light sources other than the object of interest, as well as different sources of noise.

2

Since the signals from circumstellar disks and point-like sources come from directions other than the optical axis, we use the term “off-axis sources” to refer indifferently to these two types of objects.

3

Since the nuisance component is mainly due to light coming from the optical axis, we refer to the term “on-axis point-spread function” (in short, “on-axis PSF”) to designate the estimated reference image of the nuisance component.

4

In practice for VLT/SPHERE-IRDIS, not all parallactic angles are measured and some of them are interpolated from others. Parallactic angles are also corrected for the estimated overheads of storage in a calibration and preprocessing step (Pavlov et al. 2008). Estimation errors of the rotation center and parallactic angles are not included in our statistical modeling.

5

The forward model could be written differently so that the operations involved in Qt, Γ and H would be applied in a different order, subject to slight adaptations. For example, if the field rotation and translation modeled by operator Qt are applied after the convolution H, the non-isotropic off-axis PSF has to be counter-rotated by the parallactic angles (i.e., the blurring is not the same for all t) to counter-balance the effect of the permutation of Qt and H. The same remark applies for the attenuation Γ if the mask γ is not circular symmetrical.

6

Large fluctuations in the nuisance component can be due to a slight decentering of the coronagraph or to a sudden degradation of the observing conditions. Finer statistical models of the nuisance component could be considered to account for these fluctuations. In particular, multivariate Gaussian scale mixture models (GSM; Wainwright & Simoncelli 2000) have been shown to be very effective in the context of exoplanet detection (Flasseur et al. 2020a,b). Replacing the multi-variate Gaussian model consider in this paper by a GSM model requires specific developments that are left for future work.

7

Shrinkage matrices { W n } n P $ {\{ \widetilde{{\mathbf{W}}}_n \}}_{n\in \mathbb{P}} $ are defined in Eq. (11) and we make the simplifying assumption that they are independent from x so that they are estimated once for all given the data r.

8

Here, we use this metric instead of N-RMSE defined in Eq. (19) since we aim to compare the absolute error made both inside and outside the spatial support of the disk.

11

Fluxes estimated by REXPACO and PACO are approximately comparable. The residual flux difference is due to the fact that PACO is based on a point-like source assumption which can lead to bias in the estimated flux of extended features.

12

PACO implements a dedicated strategy to cope with this issue for point-like sources: The candidate companions above a given threshold are characterized in terms of astrometry and photometry after their detection. This second step prevents a statistical bias in the estimated quantities.

13

Based on estimated photometry, the study of Mesa et al. (2019) concludes that this PLF is most likely a part of the disk. In our present work, the goal is not to discuss the status of the PLF but rather to illustrate the principle of the proposed algorithm.

Acknowledgments

We thank A. Boccaletti (LESIA, Paris, France) who provided the transmission of the SPHERE coronagraphs. We also thank the anonymous referee for her/his careful reading of the manuscript as well as her/his insightful comments and suggestions. This work has made use of the SPHERE Data Center, jointly operated by OSUG/IPAG (Grenoble, France), PYTHEAS/LAM/CESAM (Marseille, France), OCA/Lagrange (Nice, France), Observatoire de Paris/LESIA (Paris, France), and Observatoire de Lyon/CRAL (Lyon, France). This work has been supported by the Région Auvergne-Rhône-Alpes under the project DIAGHOLO, by the French National Programs (PNP and PNPS), and by the Action Spécifique Haute Résolution Angulaire (ASHRA) of CNRS/INSU co-funded by CNES.

References

  1. Bell, C. P., Mamajek, E. E., & Naylor, T. 2015, MNRAS, 454, 593 [NASA ADS] [CrossRef] [Google Scholar]
  2. Benisty, M., Juhasz, A., Boccaletti, A., et al. 2015, A&A, 578, L6 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  3. Beuzit, J.-L., Vigan, A., Mouillet, D., et al. 2019, A&A, 631, A155 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  4. Brown, A. G., Vallenari, A., Prusti, T., et al. 2016, A&A, 595, A2 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  5. Carbillet, M., Bendjoya, P., Abe, L., et al. 2011, Exp. Astron., 30, 39 [NASA ADS] [CrossRef] [Google Scholar]
  6. Charbonnier, P., Blanc-Féraud, L., Aubert, G., & Barlaud, M. 1997, IEEE Trans. Image Proc., 6, 298 [Google Scholar]
  7. Chen, Y., Wiesel, A., Eldar, Y. C., & Hero, A. O. 2010, IEEE Trans. Signal Proc., 58, 5016 [Google Scholar]
  8. Christiaens, V., Cantalloube, F., Casassus, S., et al. 2019, ApJ, 877, L33 [Google Scholar]
  9. Craven, P., & Wahba, G. 1978, Numer. Math., 31, 377 [CrossRef] [Google Scholar]
  10. Dawson, R. I., Murray-Clay, R. A., & Fabrycky, D. C. 2011, ApJ, 743, L17 [Google Scholar]
  11. Delorme, P., Meunier, N., Albert, D., et al. 2017, in SF2A-2017: Proceedings of the Annual meeting of the French Society of Astronomy and Astrophysics, eds. C. Reylé, P. Di Matteo, F. Herpin, et al., 347 [Google Scholar]
  12. Dohlen, K., Langlois, M., Saisse, M., et al. 2008a, in SPIE Astronomical Telescopes + Instrumentation, International Society for Optics and Photonics, 70143L [Google Scholar]
  13. Dohlen, K., Saisse, M., Origne, A., et al. 2008b, SPIE Astron. Telescopes Instrum., 7018 [Google Scholar]
  14. Doucet, C., Pantin, E., Lagage, P., & Dullemond, C. 2006, A&A, 460, 117 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  15. Esposito, T. M., Fitzgerald, M. P., Graham, J. R., & Kalas, P. 2013, ApJ, 780, 25 [Google Scholar]
  16. Esposito, T. M., Kalas, P., Fitzgerald, M. P., et al. 2020, AJ, 160, 24 [Google Scholar]
  17. Flasseur, O., Denis, L., Thiébaut, É., & Langlois, M. 2018a, A&A, 618, A138 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  18. Flasseur, O., Denis, L., Thiébaut, É., & Langlois, M. 2018b, in IEEE International Conference on Image Processing, 2735 [Google Scholar]
  19. Flasseur, O., Denis, L., Thiébaut, É. M., & Langlois, M. 2018c, in SPIE Astronomical Telescopes + Instrumentation, Int. Soc. Opt. Photonics, 10703, 107032R [Google Scholar]
  20. Flasseur, O., Denis, L., Thiébaut, É., Olivier, T., & Fournier, C. 2019, in European Signal Processing Conference, 1 [Google Scholar]
  21. Flasseur, O., Denis, L., Thiébaut, É., & Langlois, M. 2020a, A&A, 637, A9 [CrossRef] [EDP Sciences] [Google Scholar]
  22. Flasseur, O., Denis, L., Thiébaut, É., & Langlois, M. 2020b, A&A, 634, A2 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  23. Galicher, R., Boccaletti, A., Mesa, D., et al. 2018, A&A, 615, A92 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  24. Garufi, A., Avenhaus, H., Pérez, S., et al. 2020, A&A, 633, A82 [CrossRef] [EDP Sciences] [Google Scholar]
  25. Gerard, B. L., & Marois, C. 2016, in SPIE Astronomical Intrumentation + Telescopes, Int. Soc. Opt. Photonics, 9909, 1544 [Google Scholar]
  26. Girard, D. A. 1989, Numer. Math., 56, 1 [Google Scholar]
  27. Grady, C., Schneider, G., Sitko, M., et al. 2009, ApJ, 699, 1822 [Google Scholar]
  28. Haffert, S., Bohn, A., de Boer, J., et al. 2019, Nat. Astron., 3, 749 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  29. Isella, A., Benisty, M., Teague, R., et al. 2019, ApJ, 879, L25 [Google Scholar]
  30. Keppler, M., Benisty, M., Müller, A., et al. 2018, A&A, 617, A44 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  31. Kluska, J., Berger, J. P., Malbet, F., et al. 2020, A&A, 636, A116 [CrossRef] [EDP Sciences] [Google Scholar]
  32. Lagrange, A.-M., Milli, J., Boccaletti, A., et al. 2012, A&A, 546, A38 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  33. Langlois, M., Pohl, A., Lagrange, A.-M., et al. 2018, A&A, 614, A88 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  34. Langlois, M., Gratton, R., Lagrange, A. M., et al. 2020, A&A, [arXiv:2103.03976] [Google Scholar]
  35. Ledoit, O., & Wolf, M. 2004, J. Multivariate Anal., 88, 365 [CrossRef] [Google Scholar]
  36. Lenzen, R., Hartung, M., Brandner, W., et al. 2003, in SPIE Astronomical Telescopes + Instrumentation, Int. Soc. Opt. Photonics, 4841, 944 [Google Scholar]
  37. Maire, A. L., Langlois, M., Dohlen, K., et al. 2016, in Ground-based and Airborne Instrumentation for Astronomy VI, Int. Soc. Opt. Photonics, 9908, 990834 [Google Scholar]
  38. Maire, A.-L., Stolker, T., Messina, S., et al. 2017, A&A, 601, A134 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  39. Marois, C., Lafrenière, D., Doyon, R., Macintosh, B., & Nadeau, D. 2006, ApJ, 641, 556 [Google Scholar]
  40. Mazoyer, J., Arriaga, P., Hom, J., et al. 2020, in Ground-based and Airborne Instrumentation for Astronomy VIII, Int. Soc. Opt. Photonics, 11447, 1144759 [Google Scholar]
  41. Mesa, D., Keppler, M., Cantalloube, F., et al. 2019, A&A, 632, A25 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  42. Milli, J., Mouillet, D., Lagrange, A.-M., et al. 2012, A&A, 545, A111 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  43. Milli, J., Vigan, A., Mouillet, D., et al. 2017, A&A, 599, A108 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  44. Milli, J., Engler, N., Schmid, H., et al. 2019, A&A, 626, A54 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  45. Mugnier, L. M., Fusco, T., & Conan, J.-M. 2004, J. Opt. Soc. Am. A, 21, 1841 [NASA ADS] [CrossRef] [Google Scholar]
  46. Müller, A., van den Ancker, M., Launhardt, R., et al. 2011, A&A, 530, A85 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  47. Müller, A., Keppler, M., Henning, T., et al. 2018, A&A, 617, L2 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  48. Muro-Arena, G., Benisty, M., Ginski, C., et al. 2020, A&A, 635, A121 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  49. Pairet, B., Cantalloube, F., & Jacques, L. 2018, in International Traveling Workshop on Interactions Between Sparse Models and Technology [Google Scholar]
  50. Pairet, B., Jacques, L., & Cantalloube, F. 2019, in Signal Processing with Adaptive Sparse Structured Representations, 1, 1 [Google Scholar]
  51. Pairet, B., Cantalloube, F., & Jacques, L. 2021, MNRAS, 503, 3724 [Google Scholar]
  52. Pavlov, A., Möller-Nilsson, O., Feldt, M., et al. 2008, in SPIE Astronomical Telescopes + Instrumentation, Int. Soc. Opt. Photonics, 7019, 701939 [Google Scholar]
  53. Pecaut, M. J., Mamajek, E. E., & Bubar, E. J. 2012, ApJ, 746, 154 [Google Scholar]
  54. Pueyo, L. 2018, Handbook of Exoplanets, 705 [Google Scholar]
  55. Ren, B., Dong, R., Esposito, T. M., et al. 2018a, ApJ, 857, L9 [Google Scholar]
  56. Ren, B., Pueyo, L., Zhu, G. B., Debes, J., & Duchêne, G. 2018b, ApJ, 852, 104 [CrossRef] [Google Scholar]
  57. Ren, B., Pueyo, L., Chen, C., et al. 2020, ApJ, 892, 74 [Google Scholar]
  58. Riaud, P., Mawet, D., Absil, O., et al. 2006, A&A, 458, 317 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  59. Ruane, G., Ngo, H., Mawet, D., et al. 2019, AJ, 157, 118 [NASA ADS] [CrossRef] [Google Scholar]
  60. Schneider, G., Smith, B. A., Becklin, E., et al. 1999, ApJ, 513, L127 [Google Scholar]
  61. Soubies, E., Soulez, F., Mccann, M. T., et al. 2019, Inverse Prob., 35 [Google Scholar]
  62. Stein, C. M. 1981, Ann. Stat., 1135 [Google Scholar]
  63. Thiébaut, É. 2002, in SPIE Astronomical Telescopes + Instrumentation, Int. Soc. Opt. Photonics, 4847, 174 [Google Scholar]
  64. Thiébaut, É. 2006, Optics in Astrophysics (Springer), 397 [Google Scholar]
  65. van Boekel, R., Henning, T., Menu, J., et al. 2017, ApJ, 837, 132 [NASA ADS] [CrossRef] [Google Scholar]
  66. Van Leeuwen, F. 2007, A&A, 474, 653 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
  67. Wahba, G. 1985, Ann. Stat., 13, 1378 [Google Scholar]
  68. Wahhaj, Z., Milli, J., Romero, C., et al. 2020, A&A, 648, A26 [Google Scholar]
  69. Wainwright, M. J., & Simoncelli, E. P. 2000, Advances in Neural Information Processing Systems, 855 [Google Scholar]

Appendix A: Constrained minimization of the cost function

The restored object x is the minimum of the objective function 𝒞(r, x, A, Ω, μ) defined in Eq. (2) on the nonnegative orthant (i.e., x is everywhere nonnegative). We numerically solve this constrained optimization problem with the VMLM-B algorithm (Thiébaut 2002) as implemented in OPTIMPACKLEGACY14. VMLM-B is a limited memory quasi-Newton method implementing optional bound constraints on the sought parameters. This method has proven to be very effective for solving large-scale (millions or even billions of variables) nonlinear problems with or without bound constraints that are typical of inverse problems for image restoration. The requirements for running VMLM-B are to provide the bounds (if any) and a numerical function to compute the objective function and its gradient for a given set of parameters x.

The gradient of the objective function 𝒞(r, x, A, Ω, μ) with respect to x is given by:

x C ( r , x , A , Ω , μ ) = n P t = 1 T A t P n [ C ^ n ( P n ( A t x r t + m ^ ) ) ] = x D + μ smooth Δ n Δ n x | | Δ n x | | 2 2 + ϵ 2 = x R smooth + μ 1 1 N x R 1 . $$ \begin{aligned}&{\boldsymbol{\nabla }}_{{\boldsymbol{x}}}\,\fancyscript {C}({\boldsymbol{r}}, {\boldsymbol{x}}, \mathbf{A }, \mathbf{\Omega }, {\boldsymbol{\upmu }}) = \nonumber \\&\quad \underbrace{\sum \limits _{n \in \mathbb{P} } \sum \limits _{t=1}^T \mathbf{A }_t^{\top }\, \mathbf{P }_n^{\top }\left[ {\widehat{\mathbf{C }}}_n \left( \mathbf{P }_n \left( \mathbf{A }_t\,{\boldsymbol{x}} - {\boldsymbol{r}}_t + \widehat{{\boldsymbol{m}}} \right) \right) \right]}_{= \boldsymbol{\nabla }_{{\boldsymbol{x}}}\, \fancyscript {D}} \nonumber \\&\quad + \upmu _{\mathrm{smooth}}\, \underbrace{\frac{\mathbf{\Delta }_n^{\top }\, \mathbf{\Delta }_n \, {\boldsymbol{x}}}{\sqrt{||\mathbf{\Delta }_n \, {\boldsymbol{x}}||_2^2 + \epsilon ^2}}}_{=\boldsymbol{\nabla }_{{\boldsymbol{x}}}\,\fancyscript {R}_{\mathrm{smooth}}} + \, \upmu _{\ell _1}\underbrace{\mathbf{1 }_N}_{\boldsymbol{\nabla }_{{\boldsymbol{x}}}\, \fancyscript {R}_{\ell _1}}\,. \end{aligned} $$(A.1)

The computational burden for the gradient is similar to that of the objective function.

Appendix B: Importance of removing off-axis sources when learning the statistical model of the nuisance component

Figure B.1a illustrates on the example of HR 4796A that estimating the statistics Ω of the nuisance component directly from the data with Eqs. (14) leads to self-subtraction artifacts: A portion of the light of off-axis sources gets removed from the measurements because it is erroneously attributed to the nuisance component (the mean m ^ [ 1 ] $ \widehat{{\boldsymbol{m}}}^{[1]} $ defined in Eq. (14a)).

thumbnail Fig. B.1.

Illustration of the influence of the statistics Ω ^ = { m ^ , { C ^ n } n P } $ \widehat{{\boldsymbol{\Omega}}} = \{ {\boldsymbol{\widehat{m}}}, {\{ {{\widehat{\textbf{C}}}}_n \}_{n \in \mathbb{P}}} \} $ of the nuisance component on the REXPACO reconstructions. (a) Ω ^ $ \widehat{{\boldsymbol{\Omega}}} $ is computed with Eqs. (2.4) ignoring the contribution of the object; (b) Ω ^ $ \widehat{{\boldsymbol{\Omega}}} $ is computed with Algorithm 1; (c) difference of the mean component m ^ $ \widehat{{\boldsymbol{m}}} $ estimated with Eq. (14a) and Algorithm 1. For (a) and (b), both the deconvolved flux distribution x ^ $ \widehat{{\boldsymbol{x}}} $ and its convolved version H x ^ $ {\mathbf{H}}\,\widehat{{\boldsymbol{x}}} $ are given. The difference of estimated means m ^ $ \widehat{{\boldsymbol{m}}} $ given in (c) is represented with the same colorbar than the convolved objects H x ^ $ {\mathbf{H}}\,\widehat{{\boldsymbol{x}}} $ given in (a) and (b) to ease comparisons. Data set: HR 4796A, see Sect. 3.3.1 for observing conditions.

Figure B.1b shows that refining the statistics of the nuisance component to compensate for the contribution of the off-axis sources improves the reconstruction: Parts of the disk (like its bright west side) that were previously missed are covered. Figure B.1c displays the difference between the mean component m ^ $ \widehat{{\boldsymbol{m}}} $ estimated with Eq. (14a) and with Algorithm 1. This difference is largest near the star where a significant part of disk contribution is lost in the average of the nuisance component m ^ $ \widehat{{\boldsymbol{m}}} $ when using Eq. (14a). This explains the bias and artifacts observed in Fig. B.1a.

Appendix C: Hierarchical estimation of the statistics of the nuisance component and of the flux distribution

We first recall the expression of the shrinkage estimator Ω ^ = { m ^ , { C ^ n } n P } $ \widehat{{\boldsymbol{\Omega}}} = \{ {\boldsymbol{\widehat{m}}}, {\{ {{\widehat{\textbf{C}}}}_n \}_{n \in \mathbb{P}}} \} $ of the nuisance component given in Eqs. (6)–(10):

{ m ^ ( x ) = 1 T t = 1 T ( r t A t x ) C ^ n ( x ) = W n C n ( x ) , n P $$ \begin{aligned} {\left\{ \begin{array}{ll} \widehat{{\boldsymbol{m}}}({\boldsymbol{x}}) = \frac{1}{T}\sum \limits _{t=1}^T \left({\boldsymbol{r}}_t - \mathbf{A }_t \, {\boldsymbol{x}}\right) \vspace{0.5mm} \\ {\widehat{\mathbf{C }}}_n({\boldsymbol{x}}) = \widetilde{\mathbf{W }}_n \odot {\widetilde{\mathbf{C }}}_n({\boldsymbol{x}}) \,,\forall n\in \mathbb{P} \end{array}\right.} \end{aligned} $$(C.1)

with the empirical covariance matrix defined by:

C n ( x ) = 1 T t = 1 T u ^ n , t ( x ) u ^ n , t ( x ) , $$ \begin{aligned} {\widetilde{\mathbf{C }}}_n({\boldsymbol{x}}) = \frac{1}{T} \sum _{t=1}^{T} \widehat{{\boldsymbol{u}}}_{n,t}({\boldsymbol{x}})\, \widehat{{\boldsymbol{u}}}_{n,t}({\boldsymbol{x}})^{\top }, \end{aligned} $$(C.2)

and the residuals

u ^ n , t ( x ) = P n ( r t m ^ ( x ) A t x ) . $$ \begin{aligned} \widehat{{\boldsymbol{u}}}_{n,t}({\boldsymbol{x}}) = \mathbf{P }_{n}\, \left({{\boldsymbol{r}}_t - \widehat{{\boldsymbol{m}}}({\boldsymbol{x}}) - \mathbf{A }_t \,{{\boldsymbol{x}}}}\right). \end{aligned} $$(C.3)

The shrinkage matrix W n $ \widetilde{{\mathbf{W}}}_n $ is defined by Eq. (11) and is considered fixed (i.e., independent from x). In the following we show that these estimators are minimizers of the modified cost function 𝒟joint defined in Eq. (15):

( m ^ ( x ) , { C ^ n ( x ) } n P ) = arg min m , { C n } D joint ( r , x , m , { C n } n P ) $$ \begin{aligned} \left(\widehat{{\boldsymbol{m}}}({\boldsymbol{x}}),\left\{ {\widehat{\mathbf{C }}}_n({\boldsymbol{x}})\right\} _{n\in \mathbb{P} }\right) = \underset{{\boldsymbol{m}}, \{\mathbf{C }_n\}}{\mathrm{arg\,min}} \fancyscript {D}_{\mathrm{joint}}({\boldsymbol{r}}, {\boldsymbol{x}}, {\boldsymbol{m}}, \{\mathbf{C }_n\}_{n\in \mathbb{P} }) \end{aligned} $$(C.4)

with

D joint ( r , x , m , { C n } n P ) = T 2 n P log det C n + 1 2 n P tr ( C n 1 ( W n t = 1 T u n , t u n , t ) ) , $$ \begin{aligned} \fancyscript {D}_{\mathrm{joint}}({\boldsymbol{r}}, {\boldsymbol{x}}, {\boldsymbol{m}}, \{\mathbf{C }_n\}_{n\in \mathbb{P} })&= \frac{T}{2} \sum \limits _{n \in \mathbb{P} } \log \det {\mathbf{C }_n} \nonumber \\&\quad +\frac{1}{2} \sum \limits _{n \in \mathbb{P} } \mathrm{tr} \Biggl ( \mathbf{C }_n^{-1} \Biggl ( \widetilde{\mathbf{W }}_n \odot \sum _{t=1}^{T} {\boldsymbol{u}}_{n,t}\,{\boldsymbol{u}}_{n,t}^{\top }\Biggr )\Biggr ), \end{aligned} $$(C.5)

and

u n , t = P n ( r t m A t x ) . $$ \begin{aligned} {\boldsymbol{u}}_{n,t} = \mathbf{P }_{n}\, \left({{\boldsymbol{r}}_t - {\boldsymbol{m}} - \mathbf{A }_t \,{{\boldsymbol{x}}}}\right)\,. \end{aligned} $$(C.6)

Proof. For a fixed x, the differentiation of 𝒟joint with respect to Ωn = {m, Cn} leads to:

d D joint | Ω n = T 2 tr ( C n 1 d C n ) + 1 2 tr [ 2 C n 1 ( W n t = 1 T u n , t ( d m ) ) C n 1 d C n C n 1 ( W n t = 1 T u n , t u n , t ) ] . $$ \begin{aligned} \mathrm{d}\fancyscript {D}_{\mathrm{joint}}\Big |_{\mathbf{\Omega }_n}&= \frac{T}{2}\mathrm{tr}\left( \mathbf{C }_n^{-1} \, \mathrm{d}\mathbf{C }_n \right) + \nonumber \\&\quad \frac{1}{2}\mathrm{tr}\Bigg [-2\mathbf{C }_n^{-1}\Bigg ( \widetilde{\mathbf{W }}_n \odot \sum \limits _{t=1}^T {\boldsymbol{u}}_{n,t}\,(\mathrm{d}{\boldsymbol{m}})^{\top }\Bigg ) \nonumber \\&\quad - \mathbf{C }_n^{-1}\,\mathrm{d}\mathbf{C }_n\,\mathbf{C }_n^{-1}\Bigg ( \widetilde{\mathbf{W }}_n \odot \sum \limits _{t=1}^T {\boldsymbol{u}}_{n,t} \, {\boldsymbol{u}}_{n,t}^{\top }\Bigg ) \Bigg ]\,. \end{aligned} $$(C.7)

The necessary first-order optimality condition:

d D joint | Ω n = 0 $$ \begin{aligned} \mathrm{d}\fancyscript {D}_{\mathrm{joint}}\Big |_{\mathbf{\Omega }_n}=0 \end{aligned} $$

leads to:

{ C n 1 ( W n t = 1 T u n , t ) = 0 T I K C n 1 ( W n t = 1 T u n , t u n , t ) = 0 , $$ \begin{aligned} {\left\{ \begin{array}{ll} \mathbf{C }_n^{-1}\Bigg ( \widetilde{\mathbf{W }}_n \odot \sum \limits _{t=1}^T {\boldsymbol{u}}_{n,t} \Bigg ) = \mathbf{0 } \\ T \, \mathbf{I }_{K} - \mathbf{C }_n^{-1}\Bigg ( \widetilde{\mathbf{W }}_n \odot \sum \limits _{t=1}^T {\boldsymbol{u}}_{n,t} \, {\boldsymbol{u}}_{n,t}^{\top }\Bigg ) = \mathbf{0 }, \end{array}\right.} \end{aligned} $$(C.8)

where I K R K × K $ {\mathbf{I}}_{K} \in \mathbb{R}^{K\times K} $ stands for the identity matrix. Since Cn is necessary non-singular, the system of Eq. (C.8) gives:

{ m = 1 T t = 1 T ( r t A t x ) C n = W n 1 T t = 1 T u n , t u n , t , $$ \begin{aligned} {\left\{ \begin{array}{ll} {\boldsymbol{m}} = \frac{1}{T}\sum \limits _{t=1}^T ({\boldsymbol{r}}_t - \mathbf{A }_t \, {\boldsymbol{x}}) \\ \mathbf{C }_n = \widetilde{\mathbf{W }}_n \odot \frac{1}{T} \sum \limits _{t=1}^T {\boldsymbol{u}}_{n,t} \, {\boldsymbol{u}}_{n,t}^{\top }, \end{array}\right.} \end{aligned} $$(C.9)

which corresponds to the general form of the estimators given in Eq. (C.1).

All Tables

Table 1.

Summary of the main notations.

Table 2.

Quantitative evaluation of the reconstructions on simulated elliptical and spiral disks: N-RMSE, defined in Eq. (19), for the reconstructions shown in Figs. 58.

Table 3.

RMSE scores of REXPACO reconstructions of a simulated elliptical disk, for three different regularization levels.

Table 4.

Observing conditions of ADI sequences from the VLT/SPHERE-IRDIS instrument considered in this paper.

All Figures

thumbnail Fig. 1.

Schematic representation of the main steps followed by the state-of-the-art post-processing methods for ADI sequences with circumstellar disks. (a) Principle of the classical approach based on the estimation and subtraction of an on-axis PSF. (b) Four main categories of algorithms designed to reduce reconstruction artifacts seen in (a): using a physics-based model of the disk, mitigating artifacts directly from the ADI sequence of interest or using an additional sequence of reference, or formalizing the reconstruction task as an inverse problem. REXPACO falls into the last category: methods based on an inverse problems approach.

In the text
thumbnail Fig. 2.

Typical ADI sequence from the VLT/SPHERE-IRDIS instrument: (a) Examples of temporal frames; (b) two spatio-temporal slices cut along the solid and dashed lines of (a) emphasizing the spatial variations of the structure of the signal; (c) estimated covariance matrices for four regions of interest (ROIs) at different angular separations s with the host star. Data set: HR 4796A, see Sect. 3.3.1 for observing conditions.

In the text
thumbnail Fig. 3.

Schematic illustration of the forward model Ax = VHΓQx of REXPACO. First, the object x is rotated by the parallactic angles by operator Q. Then, each resulting temporal frame is attenuated by the (time independent) transmission of the coronagraph with Γ, and convolved by the off-axis PSF by application of H. Finally, the field of view seen by the sensor is extracted by V to obtain the object contribution Ax. The recorded ADI sequence is modeled by the sum of Ax and of the nuisance component f, as described by Eq. (1).

In the text
thumbnail Fig. 4.

Extraction of patch collections for local learning of the statistics of the nuisance component from ADI sequences. The contribution of off-axis sources is shown in blue.

In the text
thumbnail Fig. 5.

Reconstructions of a simulated elliptical disk: (a) comparisons between cADI image combinations and reblurred REXPACO reconstructions; (b) resolution improvement achieved by deconvolution with REXPACO. The average reconstructions are computed over ten injections of the simulated disk with various orientations with respect to the background.

In the text
thumbnail Fig. 6.

Line profiles extracted from the reconstructions at αgt = 10−6 shown in Fig. 5.

In the text
thumbnail Fig. 7.

Reconstructions of a simulated spiral disk: (a) comparisons between cADI image combinations and reblurred REXPACO reconstructions; (b) resolution improvement achieved by deconvolution with REXPACO. The average reconstructions are computed over ten injections of the simulated disk with various orientations with respect to the background.

In the text
thumbnail Fig. 8.

Line profiles extracted from the reconstructions at αgt = 10−5 shown in Fig. 7.

In the text
thumbnail Fig. 9.

Comparison between the SURE criterion accounting for the fluctuations of the nuisance component through a local learning of the covariances (left, see Eq. (23)) and the true MSE (right, see Eq. (20)). The pink circles represent the global minimum of each criterion. Data set: HIP 80019, see Sect. 3.3.1 for observing conditions.

In the text
thumbnail Fig. 10.

Influence of the regularization on the REXPACO reconstructions. xgt (1st column) is the ground truth elliptical disk injected at the level of flux αgt = 1 × 10−5. The flux distribution x ^ $ {\boldsymbol{\widehat{x}}} $ is reconstructed with a slight amount of regularization (μ → 0+; 2nd column), with μMSE derived by minimizing the optimal MSE criterion (3rd column, see Eq. (20)) and its estimate μ ^ SURE $ {\boldsymbol{\widehat{{\upmu}}}}^{\mathrm{SURE}} $ through the SURE criterion (4th column, see Eq. (23)). Bottom panel: zooms on two regions of interest (ROIs) extracted from the reconstructions given in the top panel. Data set: HIP 80019, see Sect. 3.3.1 for observing conditions.

In the text
thumbnail Fig. 11.

Images of the flux distribution reconstructed by REXPACO comparatively to the PACO, PCA and cADI algorithms. For REXPACO, both the deblurred reconstruction x ^ $ {\boldsymbol{\widehat{x}}} $ and its reblurred version H x ^ $ {\mathbf{H}} \, {\boldsymbol{\widehat{x}}} $ are given for comparison with the other considered methods that do not produce deblurred images of the flux distribution. Data sets: HR 4796A, RY Lupi, SAO 206462 and PDS 70, see Sect. 3.3.1 for observing conditions.

In the text
thumbnail Fig. 12.

Importance of the modeling of spatial covariances. Comparison between (a) regularized inversions with a model of the nuisance component that accounts for the nonstationary variances but neglect covariances, and (b) REXPACO inversions. Reconstructions are given first with a linear scale, then with a square-root scaling in order to better distinguish the lowest reconstructed values. Data sets: HR 4796A, RY Lupi, SAO 206462 and PDS 70, see Sect. 3.3.1 for observing conditions.

In the text
thumbnail Fig. 13.

Effect of the setting of the regularization parameters on REXPACO reconstructions. (a) SURE criterion derived from Eq. (23); (b) comparative reconstructions (first row: μ → 0, i.e., under-regularization; second row: μ ^ SURE = { 10 6 , 10 6 } $ {\boldsymbol{\widehat{{\upmu}}}}^{\mathrm{SURE}} = \{ 10^{6}, 10^{6}\} $, i.e., estimated optimal regularization; third row: 20 × μ ^ SURE = { 2 × 10 7 , 2 × 10 7 } $ 20\times {\boldsymbol{\widehat{{\upmu}}}}^{\mathrm{SURE}} = \{ 2\times10^{7}, 2\times10^{7}\} $, i.e., over-regularization). Two regions of interest are also displayed with a saturated colormap near the north side extremity of the disk and near the star. Data set: HR 4796A, see Sect. 3.3.1 for observing conditions.

In the text
thumbnail Fig. 14.

Combining REXPACO with PACO on a data set of PDS 70 at λ = 2.110 μm: (a) S/N map from PACO; (b) flux distribution from REXPACO; (c) contribution of the three selected point-like sources estimated with PACO; (d) contribution of the disk reconstructed with REXPACO.

In the text
thumbnail Fig. 15.

Combining REXPACO with PACO on a data set of PDS 70 at λ = 2.110 μm: (a) evolution of the estimated photometry; (b) evolution of the estimated astrometry, for the three selected point-like sources.

In the text
thumbnail Fig. B.1.

Illustration of the influence of the statistics Ω ^ = { m ^ , { C ^ n } n P } $ \widehat{{\boldsymbol{\Omega}}} = \{ {\boldsymbol{\widehat{m}}}, {\{ {{\widehat{\textbf{C}}}}_n \}_{n \in \mathbb{P}}} \} $ of the nuisance component on the REXPACO reconstructions. (a) Ω ^ $ \widehat{{\boldsymbol{\Omega}}} $ is computed with Eqs. (2.4) ignoring the contribution of the object; (b) Ω ^ $ \widehat{{\boldsymbol{\Omega}}} $ is computed with Algorithm 1; (c) difference of the mean component m ^ $ \widehat{{\boldsymbol{m}}} $ estimated with Eq. (14a) and Algorithm 1. For (a) and (b), both the deconvolved flux distribution x ^ $ \widehat{{\boldsymbol{x}}} $ and its convolved version H x ^ $ {\mathbf{H}}\,\widehat{{\boldsymbol{x}}} $ are given. The difference of estimated means m ^ $ \widehat{{\boldsymbol{m}}} $ given in (c) is represented with the same colorbar than the convolved objects H x ^ $ {\mathbf{H}}\,\widehat{{\boldsymbol{x}}} $ given in (a) and (b) to ease comparisons. Data set: HR 4796A, see Sect. 3.3.1 for observing conditions.

In the text

Current usage metrics show cumulative count of Article Views (full-text article views including HTML views, PDF and ePub downloads, according to the available data) and Abstracts Views on Vision4Press platform.

Data correspond to usage on the plateform after 2015. The current usage metrics is available 48-96 hours after online publication and is updated daily on week days.

Initial download of the metrics may take a while.