REXPACO: an algorithm for high contrast reconstruction of the circumstellar environment by angular differential imaging

Aims. The purpose of this paper is to describe a new post-processing algorithm dedicated to the reconstruction of the spatial distribution of light received from off-axis sources, in particular from circumstellar disks. Methods. Built on the recent PACO algorithm dedicated to the detection of point-like sources, the proposed method is based on the local learning of patch covariances capturing the spatial fluctuations of the stellar leakages. From this statistical modeling, we develop a regularized image reconstruction algorithm (REXPACO) following an inverse problem approach based on a forward image formation model of the off-axis sources in the ADI sequences. Results. Injections of fake circumstellar disks in ADI sequences from the VLT/SPHERE-IRDIS instrument show that both the morphology and the photometry of the disks are better preserved by REXPACO compared to standard postprocessing methods like cADI. In particular, the modeling of the spatial covariances proves usefull in reducing typical ADI artifacts and in better disentangling the signal of these sources from the residual stellar contamination. The application to stars hosting circumstellar disks with various morphologies confirms the ability of REXPACO to produce images of the light distribution with reduced artifacts. Finally, we show how REXPACO can be combined with PACO to disentangle the signal of circumstellar disks from the signal of candidate point-like sources. Conclusions. REXPACO is a novel post-processing algorithm producing numerically deblurred images of the circumstellar environment. It exploits the spatial covariances of the stellar leakages and of the noise to efficiently eliminate this nuisance term.


Introduction
Circumstellar disks are at the heart of the planet formation processes. Direct imaging in the near infrared is a unique method for addressing this question because planets can be detected together with the disk environment. Despite the very promising results from direct imaging, a small fraction of the expected circumstellar disks have been unveiled and characterized using high contrast direct imaging in the near infrared (Esposito et al. 2020;Garufi et al. 2020;Langlois et al. 2020). Recent protoplanetary and debris disk studies performed in intensity or in polarimetry have focused on morphological disk characteristics such as the presence of spirals (Benisty et al. 2015;Ren et al. 2018;Muro-Arena et al. 2020), asymmetries, warps (Dawson et al. 2011;Kluska et al. 2020) and gaps (van Boekel et al. 2017) that could be signposts for the presence of exoplanets. In this context, protoplanetary disks allow unique studies of the exoplanet-disk interactions (Keppler et al. 2018;Haffert et al. 2019;Mesa et al. 2019). In order to understand the physical processes governing these disks, it is essential to reconstruct their surface-brightness or scattering phase function but also to disentangle, in a post-processing step, the disks and the exoplanets.
From an observational point of view, the direct detection of circumstellar disks requires: (i) high angular resolution to resolve the close environment of stars (the typical size of disks is smaller than 1 arcsecond) and (ii) high contrast to detect the faint light scattered by disks (in the Four main categories of algorithms designed to reduce reconstruction artifacts seen in (a): using a physics-based model of the disk, mitigating artifacts directly from the ADI sequence of interest or using an additional sequence of reference, or formalizing the reconstruction task as an inverse problem. REXPACO falls into the last category: methods based on an inverse problems approach.
infrared, an effective contrast better than 10 5 relative to the host star is typically required). Among the available ground-based observing facilities, the Spectro-Polarimetry High-contrast Exoplanet Research instrument (SPHERE; Beuzit et al. 2019) operating at the Very Large Telescope (VLT) offers such capabilities. It produces angular differential imaging (ADI; Marois et al. 2006) sequences using the pupil-tracking mode of the telescope as a means to enhance the achievable contrast. The presence of strong residual stellar leakages from the coronagraph, in the form of so-called speckles, is the current main limitation of the reachable contrast. A post-processing step is required to combine the images of the ADI sequence and cancel out most of the nuisance component 1 while preserving the signal from the off-axis sources 2 .
There is a large variety of methods dedicated to the postprocessing of ADI sequences (see for example Pueyo (2018) for a review). Even though these methods are primarily designed for the detection of unresolved point-like sources, they are also widely used to process observations of disks (Milli et al. 2012) and share a common framework with most of the existing post-processing methods designed for disks. The common principle is to estimate a reference image 3 of the nuisance component (see Fig. 1(a) for a schematic illustration). This reference image is subtracted to each image of the sequence. The resulting residual images are aligned along a common direction to spatially superimpose the signals from the off-axis sources, and are added to each others. More sophisticated methods follow an inverse problems framework. In particular, the recent PACO algorithm (Flasseur et al. 2018a,b,c) jointly learns the average nuisance component and its fluctuations by estimating spatial covariances. Since the nuisance component is highly nonstationary, the underlying statistical model is estimated at a local scale of small patches.
Most of these standard ADI post-processing algorithms are not suited to detect and reconstruct the light distribution of circumstellar disks. Their major drawback is the presence, in the output images, of strong artifacts that take the form of partial replicas, the removal of extended smooth components, smearing, and nonuniform attenuations due to the so-called self-subtraction phenomenon. As a direct consequence, both the morphology and the photometry of the disks are strongly corrupted. Very few specific solutions have been developed to tackle these limitations in the ADI processing of circumstellar disks. The existing approaches can be classified into four categories (see Fig. 1(b) for a schematic diagram of their principles). A first family of methods introduce a forward-backward loop embedding a physics-based model of the disk (Milli et al. 2012;Esposito et al. 2013;Mazoyer et al. 2020). In a nutshell, the parameters constraining the disk morphology and flux are optimized by minimizing the residuals produced by the selected post-processing algorithm from the ADI sequence free from the current estimation of the disk contribution. This method is however limited as it requires a physics-based model of the disk that is sufficiently simple to be numerically optimized and yet sufficiently flexible to accurately model the actual disk.
A second category of methods attempt to avoid selfsubtraction by reducing the disk contribution in the estimation of the reference on-axis PSF. For instance, Pairet et al. (2018) have proposed a modified version of the PCA-based post-processing algorithm that iteratively removes the estimated disk component to improve the estimation of the onaxis PSF. To limit self-subtraction, Ren et al. (2020) have proposed a data-imputation strategy: The areas of the field of view impacted by the disk are considered as missing-data based on prior knowledge of the disk location and replaced by regions free from any disk contribution.
So-called reference differential imaging (RDI; Gerard & Marois 2016; Wahhaj et al. 2021) is a third class of approaches that require an additional ADI sequence of a reference star that hosts no known off-axis sources to estimate the stellar contamination. The quality of the postprocessing highly depends both on the selection of the reference star, and on the similarity of the observing conditions (Ruane et al. 2019). Ren et al. (2018) also use a reference ADI sequence to build a nonnegative decomposition (via nonnegative matrix factorization) of the nuisance component.
Finally, a fourth class of methods is emerging and states the reconstruction of the object of interest as an inverse problem. It consists in jointly estimating the nuisance term, the disk and, possibly, point-like sources given the data and low-complexity priors adapted to each kind of contribution (Pairet et al. 2019(Pairet et al. , 2021. The method we propose in this paper belongs to this family. Based on the previous analysis of the astrophysical needs and of the state-of-the-art post-processing methods for circumstellar disk detection and reconstruction, the desirable properties of post-processing algorithms may be listed as follows: (i) detection of disks at high contrasts, (ii) reconstruction of a physically plausible image of the flux distribution with limited processing artifacts, and (iii) ability to disentangle the spatially extended contribution of disks from the spatially localized contribution of point-like sources.
In this paper, we attempt to address these three points by proposing a new post-processing method to extract the light distribution of extended features like circumstellar disks from ADI sequences. To get rid of the nuisance term, we formalize the reconstruction as an inverse problem where all the contributions appear explicitly and where the measured instrumental PSF is taken into account. As a result, the estimation of the object of interest is nearly free from contamination by the stellar leakages and from blurring by the off-axis PSF. The method proposed in this paper, named REXPACO (for Reconstruction of EXtended features by PAtch COvariances), introduces dedicated constraints for each component to help disentangle the object of interest from the nuisance term. For the object of interest, these constraints take the form of specific regularizations favoring the smoothness of extended structures (e.g., disk-like component) or the sparsity of point-like sources. For the nuisance term, we consider the same statistical model as with PACO's algorithm (Flasseur et al. 2018a,b,c), initially designed for exoplanet detection, but which has also proven very effective for the recovery of extended patterns in holographic microscopy (Flasseur et al. 2019). The power of PACO statistical model resides in its ability to capture the local behavior of the stellar leakages and the noise through the spatial covariances. Another advantage is that this model is directly learned from the ADI sequence and requires no other specific calibration data. Similarly, REXPACO is an unsupervised algorithm: All tuning parameters are automatically estimated from the data. The framework implemented by REXPACO is rather flexible. As an example, REXPACO can be coupled with PACO to better estimate off-axis point-like sources. The paper is organized as follows. In Sect. 2, we formalize REXPACO algorithm. In Sect. 3, we illustrate the results obtained by REXPACO, both with semisynthetic data and with real observations of known circumstellar disks from the InfraRed Dual Imaging Spectrograph (IRDIS;Dohlen et al. 2008a,b) of the VLT/SPHERE instrument. Section 4 presents results from the combination of REXPACO with the PACO algorithm to unmix the spatially extended contribution of disk from the point-like contribution of exoplanets. Section 5 summarizes our contributions.

REXPACO: Reconstruction of EXtended features by PAtch COvariances modeling
This section describes REXPACO, the method we propose to recover the spatial distribution x of the light flux of offaxis sources from an ADI sequence r ∈ R N T formed by T images of N pixels each. Table 1 summarizes the main notations used throughout the paper.

Inverse problems formulation
REXPACO follows an inverse problems formulation: It computes an estimate x from the sequence of observations r by fitting a model of the data under given constraints. Since all observations r are collected during a single night, the distribution of interest x can be considered time-invariant and described by a vector x ∈ R M + , where M is the number of pixels of the reconstructed field of view and where R + is the set of nonnegative real numbers to account that the intensity is necessarily nonnegative everywhere. The ADI data sequence r results from the contribution of two components as expressed by: where A is the linear operator (R M → R N T ) modeling the effects of the off-axis instrumental point spread function (off-axis PSF), that is to say the non-coronagraphic image of the star, and f ∈ R N T is a nuisance component accounting for the residual stellar light in the presence of aberrations (that is, the speckles) and for the contribution of noise. Due to the temporal variation of aberrations and to the noise, the nuisance component f is the result of some random process. The quality of the statistical modeling of this process is crucial for the inversion of Eq. (1), and it represents a key element of REXPACO. Figure 2(a) displays an example of an ADI sequence from the star HR 4796A observed by VLT/SPHERE-IRDIS. HR 4796A is surrounded by a bright inclined debris disk (see Sect. 3.3 for a more complete description of its astrophysical properties). We selected this data set to illustrate the rationale of the different ingredients of REXPACO. Part (a) of Fig. 2 shows that the bright disk of HR 4796A can only be marginally detected by a close visual inspection of a given frame of the sequence. This is due to the very large contrast between the star and its environment: The contribution of the component of interest A x is typically fainter than the nuisance component f by one to four orders of magnitude. This high contrast sets the gain that must be achieved by the post-processing method in order to recover the component of interest x from the data. Part (b) of Fig. 2 shows two spatio-temporal slices extracted at two different locations (along the solid line: far from the host star, along the dashed line: closer to the host star) and illustrates the nonstationarity of the nuisance component. Near the star (i.e., at angular separations smaller than about 1.5 arcsecond), the nuisance component is dominated by speckles. At larger angular separations, the nuisance component mainly results from a combination of photon noise, thermal background flux and detector readout. Part (c) of Fig. 2 shows the empirical spatial covariances of the data computed by temporal averaging through the ADI sequence in different regions of interest (ROIs). This figure not only reveals that the covariances of the fluctuations are spatially nonstationary but also that the covariances are non-negligible in a small neighborhood. In a nutshell, the nuisance component f has strong spatial correlations and its distribution is spatially nonstationary and fluctuates over time, making the extraction of the object x all the more difficult. We introduce in Sect. 2.3 a model of the probability density function p(f ) of the nuisance component f that is suitable to describe the statistical behavior of these fluctuations. We formalize the reconstruction of the object x from the data r as an inverse problem: We seek the light distribution x that would produce observations, given the instrumental model A, statistically close to the actual measurements. Image reconstruction methods based on the resolution of inverse problems are generally expressed in the form of an optimization problem. In a Bayesian framework, the solution of this optimization problem is interpreted as the maximum a posteriori estimator (see for example Thiébaut (2006)). Introducing an objective function C to account for the instrumental effects, the statistical model of the measurements and regularization terms that promote more satisfying solutions (e.g., sparser and smoother solutions) and better reject noise, the problem writes: where: and arg min x≥0 C (r, x, A, Ω, µ) denotes the set of nonnegative values of x minimizing C (r, x, A, Ω, µ). The first term, D(r, A x, Ω), is the data-fidelity term that penalizes the discrepancy between the recorded data r and the modeled contribution A x of the object of interest x in the data. The data fidelity term also depends on parameters Ω that define the statistical distribution of the nuisance component f (e.g., mean, variances or covariances, see Sect. 2.3). The second term, R(x, µ), is a regularization term enforcing some prior knowledge about the object properties to favor physically more plausible reconstructions and prevent noise amplification. The hyper-parameters µ tune the behavior of the regularization, in particular, its relative importance compared to the data fidelity term. In the following, we detail the ingredients of this general framework.

Forward model: off-axis point spread function
The parameters x ∈ R M + represent the distribution of light due to the off-axis sources (e.g., circumstellar disks, exo-planets, brown dwarfs, background stars). During the acquisition of an ADI sequence, the field of view as seen by the detector rotates around the optical axis. After suitable instrument calibration, this apparent rotation is completely defined by the center of rotation and the parallactic angles recorded with the data 4 . The contribution of x in the t-th data frame r t is given by the linear model A t x where A t : R M → R N implements the off-axis point spread function for the considered frame: it models the apparent rotation of the field of view, the nonuniform attenuation by the coronagraph and the instrumental blurring. In other words, [A x] t ≡ A t x for all frames t ∈ 1, T where A : R M → R N ×T is the linear operator introduced in Sect. 2.1 and such that A x models the contribution of x to all data frames.
Considering the size of the image reconstruction problem, it is essential that our model A of the off-axis PSFs not only be an accurate approximation of the actual instrumental effects but also that A and its adjoint be fast to apply in an iterative reconstruction algorithm (see Appendix A). To obtain a numerically efficient model, we decompose each operator A t into different factors: Q t to model the temporally and spatially variant effects due to the rotation of the field of view, Γ to account for the attenuation close to the optical axis due to the coronagraph, and H to implement the instrumental blurring (considered in first approximation as spatially and temporally invariant). Operators Q t can be implemented by interpolation operations while operator H performs bi-dimensional convolutions that can be computed efficiently using fast Fourier transforms (FFTs). Modeling the instrumental blurring as being shift-invariant (a convolution) is accurate in most of the field of view except close to the optical axis. The coronagraph induces a reduction of the light transmission at short angular separations (i.e., under and near the coronagraphic mask) and a deformation of the PSF. We model the variable light transmission by a diagonal operator Γ = diag(γ) with γ a 2-D transmission mask. The mask γ has a smooth profile with all entries in the range [0, 1]; from 0 at the center of the coronagraph (i.e., no flux from off-axis sources transmitted) to 1 farther away, and being equal to 0.5 at the inner working angle (IWA) of the coronagraph. We neglect the deformation of the PSF in our model. This approximation has a limited effect since the area affected by the PSF deformations (close to or under the coronagraphic mask) is dominated by the nuisance component. An alternate way to handle the measurements in the area of the coronagraph would consist of assuming that they suffer an infinite variance (so that they be completely discarded in the subsequent estimations). Both approaches lead to comparable reconstructions (i.e., differences in the reconstructed flux are several orders of magnitude lower than the disk magnitude) in all experimental cases we have considered. This model of the effect of the coronagraph on off-axis point sources as a simple attenuation varying with the angular position is consistent (for angular separations larger than 0.05 arcsecond) with the calibrations of the coronagraphs considered in Beuzit et al. (2019) for the SPHERE instrument. For the results presented in Sect. 3, the experimental calibration of this transmission has been performed in the H-Johnson's spectral band (i.e., at 1.6 µm) and extrapolated to the other wavelengths by using the experimental measurement and dedicated simulations, as described in Beuzit et al. (2019).
To summarize, our forward model of the contribution of the object of interest x in the t-th frame writes 5 : where V is a simple truncation operator to extract the N pixels corresponding to the actual data field of view from the (larger) M -pixels output of blurring operator H. Due to the apparent rotation of the field of view during the ADI sequence, previously unseen areas fall within the corners of the sensor and, by combining all measurements, a region larger than N pixels can be reconstructed (up to 57% more pixels). The factorization of the forward operator A t can straightforwardly be generalized to the case of a timevarying blur H t or transmission Γ t to account for an evolution of the observing conditions during the ADI sequence or a possible partial decentering of the coronagraph. Figure 3 illustrates the different steps to convert the object of interest x into the model of its contribution into the ADI sequence recorded by the high contrast imaging instrument. In this figure, we adopt the same convention as for A that, when the frame index t of an operator is omitted, the result of applying the operator for all frames is considered. 5 The forward model could be written differently so that the operations involved in Qt, Γ and H would be applied in a different order, subject to slight adaptations. For example, if the field rotation and translation modeled by operator Qt are applied after the convolution H, the non-isotropic off-axis PSF has to be counter-rotated by the parallactic angles (i.e., the blurring is not the same for all t) to counter-balance the effect of the permutation of Qt and H. The same remark applies for the attenuation Γ if the mask γ is not circular symmetrical.

Statistical model of the nuisance component
The nuisance component f in ADI sequences generally presents strong and nonstationary spatial correlations (speckles) that fluctuate over time, especially near the host star where the stellar leakages dominate, see Fig. 2(a). Also, the presence of several bad pixels (hot, dead or randomly fluctuating) is a common issue in high contrast imaging where the images are recorded by infrared detectors. Despite a prereduction stage identifying and correcting for defective pixels, some of them displaying large fluctuations only on a few frames remain after this processing (see for example Pavlov et al. (2008); Delorme et al. (2017) for the SPHERE instrument). We thus aim to account for the different causes of fluctuations of the nuisance component through a statistical model.
We recover an image of the objects of interest x based on a co-log-likelihood term D(r, A x, Ω) that captures the statistical distribution of the residuals between r and the model A x. Like in our previous works on the PACO algorithm (Flasseur et al. 2018a(Flasseur et al. ,b,c, 2020a, we model the covariances locally at the scale of small patches of size K (a few tens of pixels in practice). This approximation is justified by the nonstationarity of the spatial correlations and can be seen as a trade-off between considering full-size spatial covariance matrices (of size N × N , and hence too large to be estimated and stored in practice) and completely neglecting the spatial correlations. Our choice amounts to model the nuisance component f t of the t-th frame as the realization of independent Gaussian processes for each Kpixel patch, without overlap. Given that model, the probability density function of the nuisance component is given by: with P n : R N → R K the operator extracting patch n from a given temporal frame for any index n in P defining the partition of the data frame into nonoverlapping K-pixel patches, and P is the list of the indices of the pixels at the center of the nonoverlapping patches. The vector m ∈ R N denotes the temporal mean E t (f t ) while C n ∈ R K×K denotes the temporal covariance matrix within the patch n.
For m and C n , we use the same estimators that proved successful with PACO for exoplanet detection: the temporal mean m is estimated by the sample mean m defined by: that combines the high-variance and low-bias sample covariance estimator C n with the low-variance and high-bias estimator F n that neglects all covariances. In other words, F n is a diagonal matrix containing the pixel sample variances that are the diagonal entries of the sample covariance of the considered patch: with: the residuals in patch n of frame t. With this specific choice for F n , Eq. (7) rewrites: where stands for the Hadamard product (i.e., entrywise multiplication), and W n ∈ R K×K is defined for {k, k } ∈ 1, K 2 by: The shrinkage parameter ρ n balances each estimator to reach a bias-variance trade-off. The extension of the results of Chen et al. (2010) to the diagonal covariance matrix F n leads to the following data-driven expression for the shrinkage parameter ρ n (see Eq. (12) of Flasseur et al. (2018a)): It can be noted that most above defined quantitiesm in Eq. (6), C n in Eq. (7), C n in Eq. (8), u n,t in Eq. (9), W n in Eq. (11) and ρ n in Eq. (12) -depend implicitly on the sought parameters x although, and for the sake of simplicity, this is not always explicitly indicated. In Algorithms 1 and 2 presented in Sect. 2.4, these quantities are labeled with the iteration number as the parameters x change at each iteration. Figure 4 illustrates the extraction of a collection of patches from an ADI sequence for locally learning the statistics of the nuisance component, in particular for the estimation of C n . The number K of pixels in each patch, and hence the dimension of the local covariance matrices should be large enough to encompass the core of the off-axis PSF and yet not too large compared to the number T of samples available for the estimation. The optimal patch size is estimated using the same criterion (validated empirically) as in PACO and which yields an optimal patch size corresponding roughly to twice the offaxis PSF full width at half maximum (FWHM). This rule typically leads to 50 K 100. Figure 2(c) gives examples of estimated patch covariance matrices C n for different angular separations. It shows that the structure of the correlations strongly depends on the distance to the star and that neglecting these correlations would be a crude approximation, especially near the star were the nuisance component is highly correlated and fluctuating. Figure 2(c) also emphasizes that the covariances are well preserved by the regularization (i.e., ρ n is small enough for C n to represent the largest contribution). We have shown in Flasseur et al. (2018aFlasseur et al. ( , 2020b that such a local model of the covariances approximates well the empirical distribution of the nuisance component. In Flasseur et al. (2020b), a small discrepancy was observed near the star due to the larger fluctuations 6 of the nuisance component in this area. The impact of a statistical mis-modeling in this area is further discussed in Sect. 3.3.
Given the patch-based statistics Ω = { m, { C n } n∈P } locally accounting for the fluctuations of the nuisance component, the co-log-likelihood rewrites: where the summation over n is performed on the list P of nonoverlapping K-pixel patches while the residuals u n,t in the patch n for the frame t are defined in Eq. (9).

Unbiased estimation of the mean and spatial covariances
The estimators m, in Eq. (6), and C n , in Eq. (7), that appear in Eq. (13) both depend on the object flux x which is unknown. To handle this problem, the simplest solution 6 Large fluctuations in the nuisance component can be due to a slight decentering of the coronagraph or to a sudden degradation of the observing conditions. Finer statistical models of the nuisance component could be considered to account for these fluctuations. In particular, multivariate Gaussian scale mixture models (GSM; Wainwright & Simoncelli 2000) have been shown to be very effective in the context of exoplanet detection (Flasseur et al. 2020a,b). Replacing the multi-variate Gaussian model consider in this paper by a GSM model requires specific developments that are left for future work.
Step 1. Learn statistics of nuisance term.
Eq. (6) for n ∈ P do for t ∈ 1, T do u [i] would be to neglect the object contribution as it is typically small compared to the nuisance component f . Under this assumption, Eqs. (6) and (8) rewrite: and correspond respectively to the sample mean and the sample covariances learned locally from the image patches. We show in Appendix B that neglecting the light flux of off-axis objects when characterizing the nuisance component leads to biased estimations, a problem common to many ADI post-processing methods and known as the selfsubtraction problem (Milli et al. 2012;Pairet et al. 2019). To recover a better estimation of the object flux, it is then necessary to develop an unbiased estimation of the statistics Ω. To do so, two strategies are possible: (i) an alternation between the reconstruction of x and the update of the statistics Ω of the nuisance component f from the residuals r − A x, until convergence, as done in Algorithm 1, or (ii) a joint estimation approach implemented by Algorithm 2. In this paper, we used η = 10 −8 in the stopping condition of Eqs. (2) and (15) Algorithm 1 for all the reconstructions. Approach (ii) implemented by Algorithm 2 converges faster and is especially beneficial when reconstructing structures that are close to circular symmetry such as in Figs. 7 and 8. In order to be equivalent to Algorithm 1, Algorithm 2 requires replacing the data-fidelity term D defined in Eq. (13) by (the proof is detailed in Appendix C): with u n,t the residuals in the patch n of frame t defined in Eq. (9). This modified data-fidelity corresponds to the co-log-likelihood under a Gaussian assumption except that shrinkage matrices W n are introduced 7 so that the shrunk covariances C n defined in Eq. (10) are minimizers of D joint . The mean m and shrunk covariances C n can thus be replaced by their closed-form expressions given in Eqs. (6) and (10) that depend on x and the minimization be performed solely on x. Beyond the faster convergence of Algorithm 2, the two algorithms differ slightly due to the shrinkage factors not being refined throughout the reconstruction in Algorithm 2. We found the impact of that discrepancy to be barely noticeable in the reconstruction results and hence would recommend the general usage of Algorithm 2 for its improved speed.

Regularization term
After discussing the data-fidelity term D(r, A x, Ω) in the previous sections, we focus here on the regularization term R(x, µ) and on the resolution of the inverse problem (2). Regularization terms are introduced to enforce prior knowledge about the unknown object x and to improve the conditioning of the inversion. Two classical regularizations are considered in the proposed method. The first one promotes smooth objects with sharp edges by favoring the sparsity of their spatial gradients. This is the goal of the so-called 2 − 1 edge-preserving regularization (Charbonnier et al. 1997;Mugnier et al. 2004;Thiébaut 2006) widely used in image processing: where ∆ n approximates the spatial gradient at pixel n by finite differences and > 0 sets the edge-preservation behavior of the regularization ( = √ 10 −7 in all the experiments of this paper): Local differences ∆ n x that are below in magnitude are smoothed similarly as a quadratic regularization would, while larger differences are preserved (e.g., border of a disk). This classical regularization is sufficiently flexible to remain adapted to different disk morphologies, as illustrated by Figs. 5 to 8 where elliptical disks with sharp edges and spiral disks with smooth edges are considered. Since we aim to reconstruct a well-contrasted object on an dark background, we also consider a second regularization term promoting the sparsity of the object x by penalizing its separable 1 -norm: This sparsity-promoting regularization term is also well adapted to restore point-like sources. We discuss in Sect. 4 how an alternating estimation strategy can be applied to further improve the separation of overlapping point-like sources and extended structures. Due to the positivity constraint in Eq.
(2) that guarantees that reconstructed light fluxes are nonnegative (i.e., x ≥ 0), the 1 -norm in Eq. (17) boils down to n x n which is a differentiable expression and smooth optimization techniques are applicable (see Appendix A). The two terms R smooth and R 1 are combined by: where µ = {µ smooth , µ 1 } balances their relative weight with respect to the data-fidelity term D. For a given set of parameters µ, we solve the constrained minimization problem (2) by running the VMLM-B algorithm (Thiébaut 2002). Some technical elements about this minimization technique can be found in Appendix A.

Evaluation on simulated disks
We first evaluate quantitatively, on numerical simulations, the ability of REXPACO to estimate a faithful image of the light flux distribution of off-axis sources. We consider two simulated disks with very different morphologies: (i) a spatially centered elliptical disk with an eccentricity about 0.95 and with sharp edges; (ii) a spiral disk with two arms whose center is shifted by ten pixels from the star center in one of two spatial directions. Contrary to the elliptical disk, the spiral disk has smooth edges. These simulated disks resemble the actual circumstellar disks presented in Sect. 3.3 so that our simulations can help to assess the quality of the reconstructions of real circumstellar disks. Each simulated disk is injected into a data set of HIP 80019 with no known off-axis source, at three different contrast levels: α gt ∈ {1 × 10 −6 , 5 × 10 −6 , 1 × 10 −5 }. Reconstructions of the elliptical disk are performed with Algorithm 1 while the spiral disk that requires many more iterations of Algorithm 1 to converge has been processed with Algorithm 2. A total of 60 reconstructions have been performed: for each disk and each level α gt , the simulated disk has been injected in one of ten different orientations with respect to the background in order to evaluate the mean and variance of the reconstructions.
Simulations of the elliptical disk are reported in Figs. 5 and 6. We compare REXPACO reconstructions to the cADI image combination technique. Since cADI does not perform a deconvolution, the comparison is performed at the resolution of the instrument, that is to say the images produced by cADI are shown next to the reblurred reconstructions H x of REXPACO in Figs. 5(a) and 6(a). Large errors can be noted in cADI images, in particular at small angular separations. The computation of the average reconstruction obtained for the ten different orientations of the disk with respect to the background indicates the presence of systematic errors with cADI: an under-estimation of the light-flux in the region of the disk that is closest to the star (an issue due to limited angular diversity). In contrast, REXPACO reconstructions are close to the ground truth, even for the lower level of contrast α gt = 10 −6 . The deblurred reconstructions shown in Figs. 5(b) and 6(b) are in good agreement with the ground truth. Unsurprisingly, the reconstruction quality is higher when the disk is brighter: more spurious fluctuations are visible in the deblurred reconstruction at α gt = 10 −6 than at α gt = 5 × 10 −6 or α gt = 10 −5 . Although some discrepancies can be noted in the deblurred line profiles, the resolution is improved by the deconvolution process.
Simulations of the spiral disk are reported in Figs. 7 and 8. The comparison with cADI shows a clear improvement of the reconstructions with REXPACO: cADI fails to recover most of the disk. At the lowest contrast α gt = 10 −6 and owing to the near circular-symmetry of the structures close to the star, the spiral disk is challenging to disentangle from the nuisance component. REXPACO is however able to recover some parts of the spiral arms although their flux is underestimated. At the highest contrast α gt = 10 −5 , line profiles of Fig. 8 indicate that REXPACO reconstructions are strongly improved compared to cADI and that some errors remain in particular at the lowest angular separations.
The reconstructions are quantitatively evaluated by reporting the normalized root mean square error (N-RMSE, the lower the better) computed over different regions (the disk, the background, and the whole image):  Table 2 reports N-RMSE values and indicates a clear improvement over cADI: the errors are typically divided by a factor of three to four (the error reduction is more modest for the more challenging reconstructions of the spiral disk at the lowest fluxes).

Automatic setting of the regularization parameters
The setting of the parameters µ is of uttermost importance since the results of the regularized reconstructions depend both on the characteristics of the data set (e.g., observing conditions, amplitude of the field rotation) and on the properties of the objects to reconstruct (e.g., morphology, contrast). The parameters µ can be tuned manually by trial and error until the reconstruction is qualitatively acceptable, but this approach is not completely satisfactory since it is time-consuming and user-dependent. Several methods have been developed in the literature to automatically tune regularization parameters by minimizing a quantitative criterion (Craven & Wahba 1978;Wahba et al. 1985;Stein 1981). In this work, we consider the Stein's unbiased risk estimator approach (SURE; Stein 1981), which minimizes an estimate of the mean-square error (MSE) in the data space: where x gt is the (unknown) ground truth object, and x µ (r) denotes the reconstructed object x obtained from the data r when using regularization parameters µ. For any given set of parameters µ, the criterion SURE(µ) provides an unbiased estimation of MSE(µ) that does not require knowledge of the true object x gt (Stein 1981): where J µ (r) = ∂ x µ (r)/∂r is the Jacobian matrix of the mapping r → x µ (r) with respect to the components of the data r: Since there is no closed-form expression for x µ (r) (it is obtained by an iterative process), it is complex to derive tr (A J µ (r)). An alternative proposed by Girard (1989) considers a Monte Carlo perturbation method to numerically approximate this term by finite-differences: where b ∈ R N T is an independent and identically distributed pseudo-random vector of unit variance and ξ is the amplitude of the perturbation. While the precise setting of ξ is not a crucial point of the method, it must be chosen both large enough to prevent errors due to numerical underflows in the computation of the difference x µ (r + ξb) − x µ (r) and small enough so that the approximation in Eq. (22) remains valid (the effects of the perturbation being nonlinear in the model). In practice, when we evaluate Eq. (22), we set ξ = 0.1×MAD(r), where the median absolute deviation MAD(r) = median(|r − median(r)|) is a robust estimator of the standard-deviation. As in our previous derivation of the statistical model of the nuisance component, we approximate the full covariance matrix C that appears in Eq. (21) as block-diagonal, with blocks corresponding to the partition of the image into nonoverlapping patches. The patch covariance model and the perturbation approximation lead to the following risk estimator:  The optimal value µ SURE of the regularization parameters is obtained by minimizing the quantity (23) with respect to µ.
We have validated on a numerical example the automatic tuning of the regularization parameters µ: The simulated elliptical disk is injected at the level of contrast α gt = 10 −5 on the data set of HIP 80019. Our SURE risk estimator given in Eq. (23) is evaluated for different values of the parameters µ. Figure 9 compares this criterion with the true MSE: The two criteria reach a global minimum for similar hyper-parameter values. The SURE criterion is intrinsically non-convex and additional non-convexities arise due to the sensitivity of the evaluation of the term tr (A J µ (r)) by finite differences. Rather than proceeding by local minimization of SURE(µ), it is therefore safer to systematically evaluate the criterion on a 2-D grid in order to identify the global minimum. Table 2. Quantitative evaluation of the reconstructions on simulated elliptical and spiral disks: N-RMSE, defined in Eq. (19), for the reconstructions shown in Figs. 5 to 8 . The N-RMSE is also given on the restrictions D(xgt) and D( x) to the area actually covered by the disks. The best scores are highlighted in bold fonts.

Score
Algorithm α gt = 1 × 10 −6 α gt = 5 × 10 −6 α gt = 1 × 10   Figure 10 illustrates the impact of the regularization in three cases: an under-regularized setting (i.e., µ → 0 + ), the automatic setting µ SURE based on the minimization of the SURE criterion (23), and the best-possible setting µ MSE computed using the ground truth by minimizing the MSE criterion (20). The morphology of the reconstructed disk is improved by the regularization: It is smoother and sharper thanks to the edge-preserving term. The background of x is also less noisy (closer to zero) due to the 1 -norm penalization. It can be noted that in this simulation, the reconstructed object x with µ SURE is flatter and sharper than the disk reconstructed with the optimal parameters because the optimal parameters lead to a better reconstruction out-side of the disk (reconstructed values are very close to zero: the ground truth value) at the expense of a slightly noisier reconstruction of the disk. In Figure 9, the location of the global minimum (denoted by a circle) corresponds to parameters µ SURE smooth > µ MSE smooth and µ SURE , which leads to a stronger smoothing of the object with µ SURE and a better rejection of close-to-zero noise in the background with µ MSE . This qualitative observation is also confirmed by Table 3 which reports RMSE 8 scores between the ground truth x gt and the reconstructed flux distribution x (RMSE = 1 M ||x gt − x|| 2 ). Table 3 shows that the quality of the reconstruction obtained with the regularization parameters µ SURE estimated by minimizing the SURE criterion (22) is very close to the best-possible reconstruction obtained by minimizing the MSE criterion (20).

Evaluation on data sets with circumstellar disks
In this section, we evaluate the ability of REXPACO to reconstruct light flux distributions of actual circumstellar disks from ADI sequences. The results are compared to two existing methods, cADI and PCA (see Sect. 1 for their respective principle), both designed for the detection of point-like sources but used extensively with different tuning parameters to process data with disks (see Milli et al. (2012);Pairet et al. (2019) for studies of the resulting post-processing artifacts). We also compare REXPACO to PACO. The rationale behind this last comparison is to assess the benefit of the reconstruction framework of REXPACO in addition to the common statistical modeling of the spatial covariances shared by the two methods. For cADI and the PCA, we used the SpeCal pipeline (Galicher et al. 2018) which is the post-processing standard of the SPHERE data center (Delorme et al. 2017). The PACO implementation is based on the algorithm described in Flasseur et al. (2018a,b,c). A Matlab™ implementation of the REXPACO main routines is available online 9 as a purpose of illustration. It is based on GlobalBioIm 10 , an opensource software for image recon-

Data sets description
For the comparisons, we have selected four data sets from the VLT/SPHERE-IRDIS instrument obtained by the observation of stars hosting known circumstellar disks of various morphological structures (e.g., face-on versus edge-on configuration, presence of inner and outer gaps, spirals, etc.). The rationale of such a selection is to test the versatility and the robustness of REXPACO in reconstructing the flux distribution of circumstellar disks with various morphologies. The considered data sets were recorded on the following targets. Table 4 summarizes the main observing conditions of the corresponding ADI sequences. HR 4796A (HD 109573A) is the primary member of a binary system of the TW Hydrae association with an age about 12 Myr (Bell et al. 2015) located at a distance about 72.8 pc (Van Leeuwen 2007). HR 4796A hosts a debris disk in face-on configuration first observed by the Hubble Space Telescope (Schneider et al. 1999). Since then, its morphology has been studied intensively with ground-based facilities (Milli et al. 2017(Milli et al. , 2019. The relatively bright disk (contrast between 10 −5 and 10 −4 ) exhibits a thin ring structure with a semimajor axis about 77 au, an inclination by about 76°, and a high surface brightness which suggest the

Notes.
Columns are: target name, ESO survey ID, observation date, number T of available temporal frames, spectral filter λ, total amount of field rotation ∆ par of the field of view, number NDIT of sub-integration exposures, individual exposure time DIT, DIMM coherence time τ 0 , seeing value at the beginning of the observations and the first paper reporting analysis of the same data. All the observations are performed with the apodized Lyot coronagraph (APLC; Carbillet et al. (2011)) of the VLT/SPHERE instrument. ( * ) The ADI sequence of HIP 80019 was made of 176 temporal frames but we have discarded the 60 first frames manually. This sequence contains no detectable off-axis sources based on the application of PACO and REXPACO algorithms. If off-axis sources are present at an undetectable level of contrast, they would introduce a negligible bias in comparison to the levels of contrast considered in our experiments. This ADI sequence is used in Sect. 2 to perform injections of fake disks.

Reconstruction of circumstellar disks
We apply REXPACO to the four selected ADI sequences presented in Table 4. For each, the computation takes a few hours (Matlab™ implementation, processor Intel i9-9880H at 2.3 GHz). Figure 11 displays the results obtained with PACO, PCA and cADI. For the PCA, we have tested several choices for the numbers of modes used in the decomposition and selected the best choice, corresponding to a small number of modes in order to limit the issue of selfsubtraction (see Pairet et al. (2019) for a discussion about this effect). The images produced by PACO are obtained with a maximum likelihood estimator of the light flux derived for point-like sources. For REXPACO, the results take the form of reconstructed flux distributions x. We also give the blurred versions H x to simplify the comparison with methods that do not perform a deconvolution. For the PCA and cADI, the results take the form of residual images obtained after subtraction of the estimated on-axis PSF. These residual images are normalized to the star flux that is encoded in the on-axis PSF provided with the data. Thus, the resulting images can also be interpreted as blurred flux distributions of the off-axis sources. Figure 11 shows that the mean levels of flux estimated by PCA and cADI differ significantly from the flux estimated by REXPACO and PACO 11 : The PCA and cADI are sen-11 Fluxes estimated by REXPACO and PACO are approximately comparable. The residual flux difference is due to the fact that sitive to several causes of artifacts severely altering both the morphology and the photometry of the disks. In particular, cADI and the PCA suffer from positive and negative replicas and also from the self-subtraction phenomenon. cADI performs much better in the presence of extended features like disks due to its limited attenuation of the disk signal. However, cADI is also sensitive to the ADI artifacts creating morphological distortions like nonphysical discontinuities in the structure of the disks, as discussed in Sect. 1. This is especially the case for the thin and bright ring surrounding HR 4796A. As discussed in Sect. 2.4, the REXPACO reconstruction exhibits an almost continuous elliptical structure, a flux asymmetry on the west side of the ring as predicted by intensity scattering models (Milli et al. 2017), and the scattering structures predicted by radiative transfer models near the disk extremities (Lagrange et al. 2012). The morphological structure of the reconstructed disk is very close the one obtained from polarimetric observations that are almost exempt from artifacts at such separations (Milli et al. 2019). The faithfulness of the REXPACO reconstruction is also supported by the simulations presented in Sect. 2.4 (see Figs. 5 and 6) where we injected a simulated elliptical disk with comparable eccentricity and spatial extent than the HR 4796A disk. In this latter case, no significant artifacts were identified in the reconstruction even though the simulated disk was one order of magnitude fainter than the HR 4796A disk. The REXPACO reconstruction of RY Lupi also exhibits reduced ADI artifacts. As seen on Fig. 11, the REXPACO reconstructions show better the smooth and faint extended disk structures. In addition, the reconstructed flux distribution uncovers new details around the brightest part of the disk such as spirals as well as the bottom and top disk planes.
The comparison with PACO is also interesting. While this latter algorithm demonstrates significantly better performance than the PCA to detect point-like sources (Flasseur et al. 2018a), it performs worse than the PCA and cADI in the presence of extended features since it assumes that the pattern to detect is known and takes the form of a point-like source. Besides, parts of the disks are lost in the PACO reconstructions due to the subtraction of the mean  estimated optimal regularization; third row: 20 × µ SURE = {2 × 10 7 , 2 × 10 7 }, i.e. over-regularization). Two regions of interest are also displayed with a saturated colormap near the north side extremity of the disk and near the star. Data set: HR 4796A, see Sect. 3.3 for observing conditions. and 8) where a numerical model of a spiral disk with two arms of comparable spatial extent is injected to the data.
These numerical tests have shown that REXPACO is able to reconstruct a faithful image of the flux distribution of such a type of disk for levels of contrast 5×10 −6 , that is to say about five to ten times fainter than the mean reconstructed level of flux of the SAO 206462 disk. The bright asymmetry in the southwest side of the disk could be interpreted as a reconstruction artifact possibly due to a local mis-modeling of the nuisance component, but we did not experience such an effect in our simulations for the typical level of flux of SAO 206462. Besides, the REXPACO reconstruction exhibits detailed structures of the spiral arms with a better rejection of the nuisance component near the star than the ADI and RDI post-processings performed in Maire et al. (2017) on the same sequence of observations.
Finally, the case of PDS 70 is particularly interesting due to the complex structure of the disk and to the presence of two confirmed exoplanets at formation stage inside the disk. Concerning the disk structure, the outer ring reconstructed by REXPACO is free from the self-subtraction artifacts that were visible on the cADI, PCA and PACO reductions. The bright asymmetry of the disk in the west side of the outer disk and the possible spiral-shaped features pointed by Müller et al. (2018) in the north side of the disk are also detectable in the REXPACO reconstruction. (2019) based on the analysis of several near-infrared angular plus spectral differential imaging (ASDI) sequences by various post-processing methods. This PLF can be (marginally) detected from the REXPACO reconstruction while it is indistinguishable in the same ADI sequence processed by PCA (see Fig. 1 of Mesa et al. (2019)), as it is also the case for our PCA reduction given in Fig. 11. Figure 12 shows the importance of the modeling of spatial covariances in REXPACO: Regularized inversions with a statistical model of the nuisance component that neglects all covariances (but still accounts for the nonstationary variances) are given in Fig. 12(a), to compare with REXPACO reconstruction shown in Fig. 12(b). When spatial covariances are omitted, the reconstructions are worse. This is particularly visible on HR 4796A, in the background and close to the star, and on SAO 206462, close to the star. Figure 13 completes this study by showing that the method for automatically setting the regularization hyperparameters discussed in Sect. 3.2 leads to satisfying reconstructions: The HR 4796A disk is smoother, with sharper edges, and the background is less noisy when automatically selected hyper-parameters µ SURE are used compared to the case of under-regularization µ → 0 + . Conversely, over-regularization (µ = 20 × µ SURE ) leads to strong artifacts mimicking the self-subtraction effects, especially near the star, due to the large penalization of the 1 -norm of the reconstructed flux distribution. The extremities of the disk ring also exhibit staircase-artifacts. These effects are signs of an over-regularization by the edge-preserving term.

Discussion of the results
Based on numerical injections of fake disks (with different orientations and typical morphologies), we have shown that REXPACO is capable of reconstructing a faithful image of the flux distribution at contrasts up to 10 6 for elliptical disks with a large eccentricity (about 0.95). For disks in face-on configuration, our simulations have shown that the contrast achieved by REXPACO drops by a factor between two and ten due to the increased difficulty in disentangling the disk signal from the nuisance component in such a configuration. While these first quantitative results should be complemented and refined by more extensive simulations with diverse disks and observing conditions, they give first clues about the operating range of REXPACO. In addition, REXPACO better preserves both the morphology and the photometry of the disks than the standard cADI method used routinely to process ADI sequences in the presence of circumstellar disks.
We have also applied REXPACO on four ADI sequences recorded by the VLT/SPHERE-IRDIS instrument on stars hosting known circumstellar disks with various morphologies. While there is no ground truth for these objects, we have shown that REXPACO copes with the classical ADI artifacts (e.g., nonphysical distortions and large residual stellar leakages). The images produced by REXPACO allow to recover structures in the circumstellar material that were previously identified only with methods complementing ADI such as reference or polarization differential imaging. We expect that the high contrast capability of REXPACO combined with its ability to reconstruct faithful images of the circumstellar material will be helpful in detecting new circumstellar disks and in refining the astrophysical interpretation of their flux distributions.

Unmixing point-like sources and extended sources
For systems formed of both an extended disk and pointlike sources such as exoplanets, the image reconstruction can be improved by combining REXPACO algorithm with the exoplanet detection and characterization method PACO. Both algorithms share a common statistical modeling of the background and can thus be combined in a principled way.
In our experiments, this approach proved superior to our attempts to reconstruct with REXPACO a composite object x = x ext + x pnt where x ext accounts for extended (smooth) structures while x pnt accounts for point-like sources and with a regularization designed to favor these specific structures in each component: The two main arguments in favor of combining REXPACO and PACO are the ability of PACO to account for the sub-pixel location of point-like sources (while a sparse component x pnt is restricted to the discretization of the pixel grid) and direct control on the number of point-like sources (while x pnt either displayed many more nonzero entries than the actual number of point sources, or missed the weaker pointlike sources). In a nutshell, the proposed combination of the two algorithms works as follows: (i) PACO and REXPACO are launched independently on a given ADI sequence; (ii) based on the S/N map produced by PACO and on the flux distribution produced by REXPACO, the user identifies manually putative point-like sources; (iii) REXPACO and PACO are then alternatively applied. During step (iii), the astrometry and photometry of the selected point-like sources are refined by PACO, based on the residual data obtained after subtraction of the disk contribution as currently reconstructed by REXPACO. Next, the flux distribution of the disk can be refined by REXPACO on updated residuals obtained after subtraction of the refined point-sources contribution estimated by PACO.
Step (iii) is repeated until convergence of the disk and point-like sources components. In practice, we stop the alternating procedure of step (iii) when the relative evolution of both the disk and the point-like sources components is smaller than 10 −8 between two consecutive iterations. This strategy can be particularly valuable for the processing of ADI sequences of stars that host point-like sources near a circumstellar disk. In this paper, we illustrate our combined approach on the PDS 70 system hosting two con- is not yet possible to conclude on the reality of this structure recovered by REXPACO/PACO, which could also be the result of reconstruction artifacts (e.g., due to a partial unmixing of the sparse and spatially extended components). This open question would require further analysis, for instance based on numerical simulations by means of injection of fake exoplanets and of a physics-based model of the PDS 70 disk to study the influence of the exoplanet-disk interactions on the reconstruction results. Such a study is left for future work. Concerning PDS 70c, the extended feature detected in the disk component near this exoplanet is in good agreement with the submillimeter continuum emission detection (Isella et al. 2019) which confirms the hypothesis that this exoplanet is still in its accretion phase. Figure 15 shows how the photometry and astrometry of the three point sources evolve throughout the alternations between REXPACO and PACO. Accounting for the presence of the disk modifies the estimated photometry of the three selected point-like sources by about 10%. The astrometry of PDS 70b and of the PLF estimated with PACO are not significantly modified by the proposed alternating scheme: These two point-like sources are not located inside a bright arc of the disk so that the disk material introduces only a small bias on PACO estimations. On the other hand, the estimated astrometry of the exoplanet PDS 70c located in-13 Based on estimated photometry, the study of Mesa et al. (2019) concludes that this PLF is most likely a part of the disk. In our present work, the goal is not to discuss the status of the PLF but rather to illustrate the principle of the proposed algorithm. side a very bright arm of the disk is modified by about 0.010 arcsec (i.e., about 1 pixel in the two spatial directions, PACO astrometry is evaluated on a subpixel grid with four nodes per pixel). The astrophysical interpretation of these results would require a dedicated study on multiple epochs. If the astro-photometric estimations we derived are confirmed, they could be significant corrective factors of the spectral energy distribution and of the orbit of the exoplanet PDS 70c.

Conclusion
In this paper, we have introduced a new post-processing algorithm, named REXPACO, dedicated to the reconstruction of flux distributions from ADI data sets in the presence of extended features, like circumstellar disks. To the best of our knowledge, REXPACO is one of the first methods specifically designed for this task. It encompasses the statistical modeling of the fluctuations of the nuisance component (i.e., stellar leakages and additional sources of noise) at the scale of small patches of a few tens of pixels. The resulting statistics of the nuisance term are accounted for in a regularized reconstruction framework following an inverse problem approach based on a forward image formation model of the off-axis sources in the ADI sequences. We have shown that suitable spatial regularizations ( 1 norm and 2 − 1 edgepreserving) reduce residual reconstruction artifacts. Regularization parameters are estimated by minimization of SURE. Our numerical experiments have shown that this strategy leads to a quasi-optimal setting of these parameters. The images of the circumstellar environment produced by our algorithm are deblurred from the instrumental PSF which is accounted for in the forward model of the image formation. These different ingredients make our approach very versatile and capable of recovering both sharp edges and smooth transitions of the circumstellar material. The REXPACO algorithm is fully unsupervised: The patch size, the statistics of the nuisance component and the weights given to the regularizations are estimated in a data-driven fashion. Besides, we have illustrated the benefits of the combination of REXPACO with PACO, two algorithms dedicated to complementary tasks, to analyze ADI sequences in the presence of both circumstellar disks and candidate pointlike sources. The combination of the two algorithms is able to disentangle the spatially extended contribution of circumstellar disks from the spatially localized contribution of point-like sources.
While the statistical model of REXPACO is able to capture most of the fluctuations of the nuisance component, slight stellar residuals are still present in the reconstructions. The fidelity of the underlying statistical model to the empirical distribution of the nuisance component is a key point to extract accurately the signal from off-axis sources. Beyond accurate modeling of the nuisance term, the angular motion through the ADI sequence is also a key point to unmix the light coming from off-axis objects such as a circumstellar disk and point-like sources from the stellar light that leaks from the coronagraph. For instance, objects with a circular symmetry cannot be correctly recovered since the temporal diversity brought by ADI is not sufficient in this case. Only additional information such as the spectral measurements obtained with angular plus spectral differential imaging (ASDI) in addition to the temporal diversity of ADI would make the reconstruction possible in those cases. The refinement and adaptation of REXPACO to the two abovementioned points require specific refinements that will be developed in a future paper. Figure B.1(a) illustrates on the example of HR 4796A that estimating the statistics Ω of the nuisance component directly from the data with Eqs. (14) leads to self-subtraction artifacts: A portion of the light of off-axis sources gets removed from the measurements because it is erroneously attributed to the nuisance component (the mean m [1] defined in Eq. (14a)). Figure B.1(b) shows that refining the statistics of the nuisance component to compensate for the contribution of the off-axis sources improves the reconstruction: Parts of the disk (like its bright west side) that were previously missed are covered. Figure B.1(c) displays the difference between the mean component m estimated with Eq. (14a) and with Algorithm 1. This difference is largest near the star where a significant part of disk contribution is lost in the average of the nuisance component m when using Eq. (14a). This explains the bias and artifacts observed in Fig. B.1(a). The shrinkage matrix W n is defined by Eq. (11) and is considered fixed (i.e., independent from x). In the following we show that these estimators are minimizers of the modified cost function D joint defined in Eq. (15) where I K ∈ R K×K stands for the identity matrix. Since C n is necessary non-singular, the system of equations (C.8) gives: C n = W n 1 T T t=1 u n,t u n,t , (C.9) which corresponds to the general form of the estimators given in Eqs. (C.1).