PACO ASDI: an algorithm for exoplanet detection and characterization in direct imaging with integral field spectrographs

Olivier Flasseur; Loïc Denis; Éric Thiébaut; Maud Langlois

doi:10.1051/0004-6361/201937239

Issue		A&A Volume 637, May 2020


Article Number		A9
Number of page(s)		29
Section		Numerical methods and codes
DOI		https://doi.org/10.1051/0004-6361/201937239
Published online		05 May 2020

A&A 637, A9 (2020)

Olivier Flasseur¹, Loïc Denis¹, Éric Thiébaut² and Maud Langlois²

¹ Université de Lyon, UJM-Saint-Etienne, CNRS, Institut d’Optique Graduate School, Laboratoire Hubert Curien UMR 5516, 42023 Saint-Etienne, France
² Université de Lyon, Université Lyon1, ENS de Lyon, CNRS, Centre de Recherche Astrophysique de Lyon UMR 5574, 69230 Saint-Genis-Laval, France

Received: 3 December 2019
Accepted: 6 March 2020

Abstract

Context. Exoplanet detection and characterization by direct imaging both rely on sophisticated instruments (adaptive optics and coronagraph) and adequate data processing methods. Angular and spectral differential imaging (ASDI) combines observations at different times and a range of wavelengths in order to separate the residual signal from the host star and the signal of interest corresponding to off-axis sources.

Aims. Very high contrast detection is only possible with an accurate modeling of those two components, in particular of the background due to stellar leakages of the host star masked out by the coronagraph. Beyond the detection of point-like sources in the field of view, it is also essential to characterize the detection in terms of statistical significance and astrometry and to estimate the source spectrum.

Methods. We extend our recent method PACO, based on local learning of patch covariances, in order to capture the spectral and temporal fluctuations of background structures. From this statistical modeling, we build a detection algorithm and a spectrum estimation method: PACO ASDI. The modeling of spectral correlations proves useful both in reducing detection artifacts and obtaining accurate statistical guarantees (detection thresholds and photometry confidence intervals).

Results. An analysis of several ASDI datasets from the VLT/SPHERE-IFS instrument shows that PACO ASDI produces very clean detection maps, for which setting a detection threshold is statistically reliable. Compared to other algorithms used routinely to exploit the scientific results of SPHERE-IFS, sensitivity is improved and many false detections can be avoided. Spectrally smoothed spectra are also produced by PACO ASDI. The analysis of datasets with injected fake planets validates the recovered spectra and the computed confidence intervals.

Conclusions. PACO ASDI is a high-contrast processing algorithm accounting for the spatio-spectral correlations of the data to produce statistically-grounded detection maps and reliable spectral estimations. Point source detections, photometric and astrometric characterizations are fully automatized.

Key words: techniques: image processing / techniques: high angular resolution / methods: statistical / methods: data analysis

© O. Flasseur et al. 2020

Open Access article, published by EDP Sciences, under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

1. Introduction

While most exoplanet detections (Schneider et al. 2011) have been obtained using indirect methods (Santos 2008), such as transit photometry or radial velocity techniques (Lovis & Fischer 2010), direct imaging (Traub & Oppenheimer 2010) appears to be a method of choice for the detection and characterization of young and massive exoplanets. To date, only a few exoplanets (out of the 4300 known ones) have been successfully detected by direct imaging (Marois et al. 2008; Lagrange et al. 2009; Nielsen et al. 2012; Bailey et al. 2013; Macintosh et al. 2015; Chauvin et al. 2017; Keppler et al. 2018, see also Bowler 2016; Pueyo 2018; Nielsen et al. 2019 for recent reviews on direct imaging) since the emergence of this technique in the early 2000s. This can be explained by the difficulty in detecting the very faint signature of the exoplanets (typical levels of contrast with the host star are between 10⁻⁵ and 10⁻⁶ in the near infrared). The use of a coronagraph to mask out the host star, jointly with an extreme adaptive optics system, is mandatory for reaching these challenging contrasts. Exoplanet hunting facilities dedicated to direct observations such as VLT/SPHERE (Beuzit et al. 2008, 2019), GEMINI/GPI (Macintosh et al. 2014), Subaru/SCExAO (Jovanovic et al. 2015), Keck/VORTEX (Howard et al. 2010), and Magellan/MagAO (Morzinski et al. 2014) are equipped with these cutting-edge optical systems.

Several observation strategies can be used for direct imaging. The most popular observation mode is angular differential imaging (ADI, Marois et al. 2006). It consists of tracking the target star and keeping the telescope pupil stable over time while the field of view rotates. Such an observational sequence produces 3D datasets (2D + time) in which the speckles (structured background resulting from instrumental aberrations) are quasi-static, while the signature of a companion describes an apparent circular motion around its host star. Spectral differential imaging (SDI, Racine et al. 1999) consists of simultaneously recording images in several spectral channels using an integral field spectrograph (IFS). Reduced 3D datacubes (2D + spectral) are obtained by mapping raw observations of the IFS cameras into a multi-spectral cube (Pavlov et al. 2008). In the reduced datacubes, the stellar speckles that are due to diffraction are very similar from one wavelength to the other, apart from their chromatic scaling (Perrin et al. 2003). After compensating for this scaling, speckles are aligned and can be combined to cancel out, thus revealing the presence of off-axis sources whose positions are achromatic. A natural extension of ADI and SDI is to use them simultaneously.

This hybrid observation mode called angular and spectral differential imaging (ASDI) produces 4D datasets (2D + time + spectral), combining the properties of both ADI and SDI. Using ASDI datasets such as the ones obtained with the VLT/SPHERE-IFS instrument brings a spectral diversity (Macintosh et al. 2014; Gerard et al. 2019) compared to simple ADI. The discrimination between the signal from off-axis sources and the background signal due to stellar speckles is thus improved. In addition, ASDI datasets allow both the detection and the characterization of the detected exoplanets (Beuzit et al. 2019). Such a characterization is performed by fitting orbit models and exoplanet atmospheric models to the estimated astrometry and photometry (Vigan et al. 2010). Physical information such as age, effective temperature, composition or surface gravity can be derived for the detected exoplanets (Müller et al. 2018; Cheetham et al. 2019; Mawet et al. 2019; Claudi et al. 2019; Mesa et al. 2019a).

Whatever the observation mode, the recorded images are combined by a post-processing method in order to cancel out as much as possible the stellar speckles which largely dominate the exoplanet signal. Current state-of-the-art detection algorithms applied in direct imaging can be split into two families: algorithms specifically designed to work in SDI mode and algorithms initially designed to work in ADI mode which were later adapted to also work in ASDI mode.

There are few methods specific to SDI. They are mainly based on physical modeling of the stellar point spread function (PSF). The PeX algorithm (Thiébaut et al. 2016; Devaney & Thiébaut 2017) derives a model of the chromatic dependence of the speckles based on diffraction theory. The MEDUSAE method (Ygouf 2012; Cantalloube 2016; Cantalloube et al. 2018) uses an analytic model of the coronagraphic PSF and performs speckle modeling by an inverse problem approach that estimates phase aberrations from the measurements. It also includes an object restoration step via a deconvolution procedure combined with suitable regularization penalties.

There is a much larger variety of ADI processing methods. Several of them are derived from the LOCI algorithm (Lafrenière et al. 2007) in which a stellar PSF image is created by combining images selected in a library of data acquired under experimental conditions similar to those of the observation of interest. The combined images are selected and weighted in order to minimize the residual noise inside annular regions of the images. This combined image is then subtracted from the recorded images to attenuate the speckles and enhance the exoplanet signal. Several adaptations of this general principle have been proposed in the literature, such as the ALOCI (Currie et al. 2012a,b), TLOCI (Marois et al. 2013, 2014), or MLOCI (Wahhaj et al. 2015) algorithms. Among these, the TLOCI algorithm has become one of the gold-standard methods to process ADI datasets. It differs from the standard LOCI algorithm in the construction of the reference stellar PSF. Instead of only minimizing the residual noise, TLOCI also maximizes the exoplanet throughput. Another group of methods models the fluctuations of the stellar speckles (i.e., the on-axis PSF) by a low-dimensional subspace. The exoplanet signal is thus captured on the subspace orthogonal to the subspace that captures fluctuations of the stellar speckles. The data are projected on an orthogonal basis created by principal component analysis (PCA). This is the principle behind the widely used KLIP algorithm (Soummer et al. 2012; Absil et al. 2013), which builds a basis of the subspace capturing the stellar PSF by performing a Karhunen-Loève transform, that is, a PCA¹, of the images from the reference library. To obtain a model of the stellar PSF to subtract in order to attenuate the speckles, the science data are projected onto a predetermined number of modes.
Interestingly, as demonstrated by Savransky (2015), the (A,M,T)LOCI- and KLIP-type methods can be seen as two closely related instances of the general class of algorithms called Blind Source Separation (BSS). The LLSG method (Gonzalez et al. 2016) is also based on subspace approaches since it decomposes the datasets into low-rank, sparse and Gaussian components. Other ADI processing methods are based on a statistical framework. For example, the MOODS method (Smith et al. 2009) performs a joint estimation of the exoplanet amplitude and of the stellar speckles. The ANDROMEDA algorithm (Mugnier et al. 2009; Cantalloube et al. 2015) forms differences in temporal images to suppress stellar speckles and performs the detection of differential off-axis PSFs. A generalized likelihood ratio test is then evaluated to perform the detection. A matched filter approach can also be used on the PCA residuals to perform the detection (Ruffio et al. 2017). All these algorithms based on a statistical framework encompass the estimation of the exoplanet throughput and detection confidences through a maximum likelihood approach under a white and Gaussian hypothesis which is a rather crude assumption for this kind of data. The PACO algorithm (Flasseur et al. 2018a) is also based on a maximum likelihood approach but with a more consistent statistical model self-calibrated on the data and accounting for the spatial covariances of the speckles at the scale of small patches. Recently, some works have also applied deep learning techniques to direct imaging (Gonzalez et al. 2018; Yip et al. 2019).

Most of these algorithms are subject to different limitations. Generally, they are not fully unsupervised so that the tuning of several parameters is often mandatory to reach the best performance of the method. Such tuning is time-consuming and should ideally be repeated for each dataset since it depends on the dataset properties (considered spectral bands, number of temporal and spectral frames, quality of the observations, amount of parallactic rotation, etc.). For both SDI and ASDI processing, the recorded images at wavelength λ are scaled by a factor λ_ref/λ, where λ_ref is a reference wavelength, so that the on-axis PSF and the speckle field are approximately aligned throughout the ASDI stack (reduced chromatic variations). Due to the difference between the scaling factor applied respectively to the shortest and the longest wavelengths, only a central area of the field of view is covered by all rescaled images. Some source detection techniques process only that area common to all wavelengths. This leads to a drastic reduction of the field of view (typically 25% to 50%), which limits the ability to detect sources. In addition, all ADI and ASDI processing methods are subject to the so-called self-subtraction phenomenon. By combining information (either by image subtraction as in the TLOCI type methods, or by modes subtraction as in the PCA type methods) at different times or wavelengths to attenuate the speckle background, the signal of the exoplanets is also attenuated. Consequently, the photometry is not intrinsically preserved so that an additional calibration step via Monte–Carlo injections is mandatory to compensate for the exoplanet self-subtraction. Finally, the main limitation of existing approaches is the lack of control of the probability of false alarms on the detection maps and contrast curves (see Sect. 5 for a discussion). It is common for state-of-the-art methods to produce detection maps with many more false alarms than theoretically expected.
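The field-of-view reduction caused by the spectral rescaling can be illustrated with a short computation. The following is a minimal sketch: the function names and the wavelength values are hypothetical and chosen for illustration only, and the shortest wavelength is assumed to be the reference. It evaluates the magnification λ_ref/λ of each channel and the fraction of the field (linear and in area) that remains covered by all rescaled channels.

```python
import numpy as np

def rescaling_factors(wavelengths, lambda_ref):
    """Magnification lambda_ref / lambda applied to each spectral channel."""
    return lambda_ref / np.asarray(wavelengths, dtype=float)

def common_field_fraction(wavelengths):
    """Linear and area fractions of the field covered by all rescaled channels.

    Assumption: the shortest wavelength is the reference, so the channel at
    the longest wavelength is shrunk the most and sets the common area.
    """
    wavelengths = np.asarray(wavelengths, dtype=float)
    factors = rescaling_factors(wavelengths, wavelengths.min())
    linear = factors.min()
    return linear, linear ** 2

# Hypothetical channels (in micrometers), for illustration only.
channels = np.linspace(1.0, 1.3, 4)
linear, area = common_field_fraction(channels)
print(f"linear fraction: {linear:.3f}, area fraction: {area:.3f}")
```

The wider the spectral range, the smaller the area common to all wavelengths, which is why restricting the processing to that area limits the ability to detect sources.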

Based on an analysis of the limitations of existing algorithms for ASDI data processing and of the needs driven by the new planet finder instruments, the following desirable specifications for exoplanet detection algorithms may be listed as:

– automatic source detection with statistical guarantees (i.e., control over the probability of false alarms),

– characterization of the sources detected: subpixel astrometry and unbiased estimation of the relative spectrum², with reliable confidence intervals,

– ability to process the whole field of view covered by the instrument without artifacts,

– computation of a map reporting the contrast achieved for a source to be detected at a given detection threshold.

In this paper, we attempt to address these different points by deriving an algorithm from a data-driven statistical modeling of ASDI observations. The proposed algorithm, named PACO ASDI, is an extension of our ADI exoplanet detection method PACO introduced in Flasseur et al. (2018a). PACO can be used to process independently each spectral channel of ASDI datasets (Flasseur et al. 2018b). The main methodological adaptations of PACO ASDI for joint ASDI processing are the following: (i) modeling of local covariances based both on temporal and spectral information (see Sect. 2), (ii) adaptation to the time-specific and wavelength-specific magnitude of the background fluctuations (see Sect. 2), (iii) a general approach combining detection maps at different wavelengths that accounts for spectral correlations (see Sect. 3.2) that is also beneficial to combine detection maps from other existing algorithms (see Appendix F), and (iv) estimation of the spectrum of sources, including a parameter-free spectral smoothing (see Sect. 4.2).

Fig. 1.

Scheme of PACO ASDI algorithm.

Figure 1 gives the general scheme of PACO ASDI, to which we refer throughout the paper. It illustrates the four main steps of the algorithm:

– learning of a model of the background that accounts for the patch spatial covariance (step ➀, detailed in Sect. 2),

– single-wavelength detection, by application of detection theory to our statistical model of the background (step ➁, detailed in Sect. 3.1),

– multi-wavelength detection, by combining the single-wavelength detection maps; this is achieved by learning the spectral correlations between single-wavelength detection maps (steps ➂–➅) and introducing a (coarse) prior information on the spectrum of the source (detailed in Sect. 3.2),

– once a source has been detected, its astrometry and photometry are estimated during a characterization step (step ➆, described in Sect. 4), by iteratively refining the source parameters (angular location, spectrum and total flux) and the statistical model of the background (spatial and spectral correlations).

Statistical modeling of the spatial, temporal, and spectral fluctuations of ASDI datasets is a guiding thread throughout the paper for grounding the detection and estimation method and to obtain reliable indications on the probability of false alarms, astrometric, and photometric confidence intervals. Those statistical guarantees are essential to the astronomers to automatize the detection and analysis of ASDI datasets, for the scientific exploitation of the results, and also for characterizing the performance of the instrument (detection limits and photometric accuracy depending on the observation strategy, the observation conditions, and the performance of the adaptive optics + coronagraph).

This paper is organized as follows. We describe in Sect. 2 our statistical modeling of the background fluctuations for ASDI datasets. In Sect. 3, we explain how to obtain single wavelength and combined detection maps at a controlled probability of false alarms. Section 4 details our parameter-free and regularized spectrum estimation procedure applied to the detected sources. In Sect. 5, we illustrate on VLT/SPHERE-IFS datasets the performance of the proposed PACO ASDI algorithm in terms of detection maps, achievable contrast, and spectrum estimation. Finally, Sect. 6 presents the paper’s conclusions.

2. Statistical modeling of background fluctuations in ASDI

After speckle alignment by spectral image magnification, background structures (i.e., stellar speckles) are approximately constant (up to a multiplication by a chromatic factor accounting for the star spectrum) through time and the wavelengths. A closer observation reveals some temporal and spectral fluctuations. These fluctuations are spatially structured. It is essential to model these fluctuations in order to discriminate between an insignificant change of the background and a point source. We describe in this section a statistical model of the background fluctuations. The detection and source characterization algorithm PACO ASDI is grounded on this model.

2.1. Local multivariate Gaussian model

Our modeling of the spatial covariances in ADI datasets with the PACO algorithm led to two conclusions: (i) to account for the nonstationarity of the background, local modeling is necessary; and (ii) given the limited number of samples available at any given location, a trade-off must be found between the size of the covariance matrices and the estimation variance.

To extend the modeling from ADI datasets to the 4D spatio-temporo-spectral datacubes of ASDI, we keep a local Gaussian modeling: parameters of the Gaussians are estimated by analyzing patches extracted at a given location. For the nth pixel³ of the field of view (identified by its 2D angular direction θ_n on the sky with respect to the stellar center), we extract T ⋅ L patches (where T is the number of temporal frames and L is the number of spectral channels), each made of K pixels. The patch size is constant for a given instrument and is fixed with the same empirical rule as the one derived for the PACO algorithm: it should be chosen so that twice the full width at half maximum (FWHM) of the off-axis PSF is encompassed by the patches. Those patches r_{n, ℓ, t} are all centered on the same sky location θ_n but correspond to different frames and spectral channels, leading to the local collection {r_{n, ℓ, t}}_{ℓ∈1 : L, t ∈ 1 : T}, where ℓ indicates the spectral channel and t the frame index. If this collection contains no off-axis point source, we model each patch r_{n, ℓ, t} as a random realization of the K-dimensional Gaussian $\mathcal{N}(\boldsymbol{m}_{n,\ell},\sigma_{n,\ell,t}^2\mathbf{C}_n)$. The mean patch m_{n, ℓ} is the same for all t but is chromatic. The K × K covariance matrix is modeled as a product of two factors: a time and wavelength-dependent scaling $\sigma_{n,\ell,t}^2$ and a spatial covariance matrix C_n that is constant for a given patch collection extracted around pixel n.
This modeling follows the two guidelines: (i) local adaptivity to account for background nonstationarities, in particular, the model is specific to a given spatial location and captures different fluctuation magnitudes for different wavelengths or different temporal frames; and (ii) a limitation of the number of parameters that have to be estimated from the collection of patches by neglecting temporal and spectral correlations. Several variants of this modeling have been evaluated in experiments, not reported here, that led to worse detection performances⁴. The effectiveness of introducing a temporal scaling factor in ADI has been recently demonstrated (Flasseur et al. 2020). Neglecting the spectral correlations of the background may seem a crude approximation. There are indeed some strong correlations but these correlations are difficult to capture at the scale of patches given our limited number of samples. In Sect. 3.2, we describe how to account for spectral correlations at a later stage of the algorithm, with satisfying results.
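The construction of the local patch collection can be sketched as follows. This is an illustrative sketch, not the authors' implementation: the array layout `(L, T, H, W)`, the function names, and the disk radius are assumptions, and the patch is assumed to lie entirely inside the measured area (no border handling).

```python
import numpy as np

def disk_offsets(radius):
    """Integer (dy, dx) offsets of the pixels inside a disk of given radius."""
    span = np.arange(-int(radius), int(radius) + 1)
    dy, dx = np.meshgrid(span, span, indexing="ij")
    inside = dy**2 + dx**2 <= radius**2
    return dy[inside], dx[inside]

def extract_patches(cube, center, radius):
    """Collect the L*T patches centered on one pixel.

    cube   : array of shape (L, T, H, W), assumed already rescaled so that
             speckles are aligned across wavelengths
    center : (row, col) of the pixel of interest
    returns: array of shape (L, T, K), one K-pixel patch per channel and frame
    """
    dy, dx = disk_offsets(radius)
    rows, cols = center[0] + dy, center[1] + dx
    # Advanced indexing on the two spatial axes yields the K patch pixels.
    return cube[:, :, rows, cols]
```

With a disk of radius 2 pixels, each patch contains K = 13 pixels; in practice the radius would follow the FWHM-based rule stated above.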

2.2. Local learning of the parameters

Since a different multivariate Gaussian model is defined for each angular location θ_n, the estimation of the parameters m_{n, ℓ}, $\sigma_{n,\ell,t}^2$, and C_n can be performed independently on each collection {r_{n, ℓ, t}}_{ℓ∈1 : L, t ∈ 1 : T} of 2D patches centered on a given location θ_n. Under our assumptions of negligible temporal and spectral correlations, the co-log-likelihood ℒ_n of the collection can be written:

$$ \begin{aligned} \mathcal{L}_n &= -\log \mathrm{p}\bigl(\{\boldsymbol{r}_{n,\ell,t}\}_{\ell\in 1:L,\,t\in 1:T} \,\big|\, \\ &\quad \{\boldsymbol{m}_{n,\ell}\}_{\ell\in 1:L},\, \{\sigma_{n,\ell,t}^2\}_{\ell\in 1:L,\,t\in 1:T},\, \mathbf{C}_n\bigr) \end{aligned} $$ (1)

$$ \begin{aligned} \Rightarrow \mathcal{L}_n &= \frac{LTK}{2}\log 2\pi + \sum_{\substack{\ell\in 1:L \\ t\in 1:T}} \tfrac{1}{2}\log\det\bigl(\sigma_{n,\ell,t}^2\mathbf{C}_n\bigr) \\ &\quad + \sum_{\substack{\ell\in 1:L \\ t\in 1:T}} \tfrac{1}{2}\bigl(\boldsymbol{r}_{n,\ell,t}-\boldsymbol{m}_{n,\ell}\bigr)^{\top}\bigl(\sigma_{n,\ell,t}^2\mathbf{C}_n\bigr)^{-1}\bigl(\boldsymbol{r}_{n,\ell,t}-\boldsymbol{m}_{n,\ell}\bigr). \end{aligned} $$ (2)
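Equation (2) can be evaluated numerically for one patch collection. The sketch below is illustrative only: the function name and array shapes are assumptions. It uses the identity log det(σ²C) = K log σ² + log det C so that the determinant of C is computed only once.

```python
import numpy as np

def co_log_likelihood(r, m, sigma2, C):
    """Evaluate Eq. (2) for one location n.

    r      : patches, shape (L, T, K)
    m      : mean patch per channel, shape (L, K)
    sigma2 : scaling factors, shape (L, T)
    C      : spatial covariance, shape (K, K)
    """
    L, T, K = r.shape
    d = r - m[:, None, :]                       # centered patches
    _, logdet_C = np.linalg.slogdet(C)
    # log det(sigma^2 C) = K log(sigma^2) + log det(C), summed over (l, t)
    logdet_term = 0.5 * np.sum(K * np.log(sigma2) + logdet_C)
    # Solve C x = d for every patch instead of forming the inverse of C.
    Cinv_d = np.linalg.solve(C, d.reshape(-1, K).T).T.reshape(L, T, K)
    quad = np.einsum("ltk,ltk->lt", d, Cinv_d)  # d^T C^{-1} d per patch
    const = 0.5 * L * T * K * np.log(2.0 * np.pi)
    return const + logdet_term + 0.5 * np.sum(quad / sigma2)
```

Lower values indicate a better fit of the parameters to the collection, since ℒ_n is the negative of the log-likelihood.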

Algorithm 1: Local background statistics estimation.

We show in Appendix A that the maximum likelihood estimates for the Gaussian parameters are the solution to the following system of non-linear equations:

$$ \left\{ \begin{aligned} \widehat{\boldsymbol{m}}_{n,\ell} &= \frac{1}{\sum\limits_{t\in 1:T} 1/\widehat{\sigma}_{n,\ell,t}^2} \cdot \sum\limits_{t\in 1:T} \frac{1}{\widehat{\sigma}_{n,\ell,t}^2}\,\boldsymbol{r}_{n,\ell,t} \\ \widehat{\sigma}_{n,\ell,t}^2 &= \frac{1}{K}\bigl(\boldsymbol{r}_{n,\ell,t}-\widehat{\boldsymbol{m}}_{n,\ell}\bigr)^{\top}\widehat{\mathbf{S}}_n^{-1}\bigl(\boldsymbol{r}_{n,\ell,t}-\widehat{\boldsymbol{m}}_{n,\ell}\bigr) \\ \widehat{\mathbf{S}}_n &= \frac{1}{TL}\sum\limits_{\substack{\ell\in 1:L \\ t\in 1:T}} \frac{1}{\widehat{\sigma}_{n,\ell,t}^2}\bigl(\boldsymbol{r}_{n,\ell,t}-\widehat{\boldsymbol{m}}_{n,\ell}\bigr)\bigl(\boldsymbol{r}_{n,\ell,t}-\widehat{\boldsymbol{m}}_{n,\ell}\bigr)^{\top}, \end{aligned} \right. $$ (3)

where the maximum likelihood estimate of the covariance $\widehat{\mathbf{S}}_n$ is not directly used as an estimate of the covariance $\widehat{\mathbf{C}}_n$, but is replaced by a more reliable estimator with a smaller risk, as described in the following paragraphs.

Algorithm 2: Shrinkage covariance estimator.

We solve System (3) by fixed-point iteration, that is, by alternately updating each unknown until convergence. This leads to Algorithm 1, where we chose the arbitrary normalization $\mathrm{tr}(\widehat{\mathbf{C}}_n)=K$ for matrix $\widehat{\mathbf{C}}_n$ (some form of normalization is necessary to remove the scaling degeneracy in the product $\widehat{\sigma}_{n,\ell,t}^2\widehat{\mathbf{C}}_n$).

For locations θ_n outside of the central region of the field of view, some patches r_{n, ℓ, t} fall outside of the measured area for the largest wavelengths. In that case, the sum for the computation of $\widehat{\mathbf{S}}_n$ in System (3) and in Algorithm 1 is restricted to the wavelengths ℓ for which the patch is measured and the normalization factor 1/TL is corrected to match the actual number of terms in the sum. Given the severe reduction in the number of patches actually used close to the borders of the field of view, it is important to regularize the sample covariance to reduce the estimation variance and to prevent obtaining singular or ill-conditioned matrices. As in Flasseur et al. (2018a), we use a shrinkage estimator, implemented according to Algorithm 2. Because of the weighting by factors $1/\widehat{\sigma}_{n,\ell,t}^2$, some patches have more importance than others and an equivalent number $\widetilde{P}$ of patches is used in the shrinkage formula, step 3 of Algorithm 1, see also Flasseur et al. (2020). The closed form expression of $\widetilde{P}$ is derived in Appendix B. The rationale behind this equivalent number of patches comes from the variance reduction when performing the weighted mean.
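The effect of shrinkage on an ill-conditioned sample covariance can be illustrated with a generic example. The sketch below is not Algorithm 2: it assumes a shrinkage target equal to the diagonal of the sample covariance and takes a user-supplied weight `rho`, whereas the estimator described above sets the amount of shrinkage from the data through the equivalent number of patches. The function name and the value of `rho` are hypothetical.

```python
import numpy as np

def shrink_to_diagonal(S, rho):
    """Convex combination of S and its diagonal, with rho in [0, 1].

    rho = 0 returns the sample covariance unchanged; rho = 1 keeps only
    the sample variances. Diagonal entries are preserved for any rho.
    """
    F = np.diag(np.diag(S))
    return (1.0 - rho) * S + rho * F

# Fewer samples (5) than dimensions (10): the sample covariance is singular.
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 10))
S = X.T @ X / X.shape[0]
C = shrink_to_diagonal(S, rho=0.2)   # rho chosen arbitrarily for illustration
print(np.linalg.matrix_rank(S), np.linalg.eigvalsh(C).min() > 0)
```

The shrunk matrix is positive definite even though the sample covariance is rank-deficient, which is the property needed to invert it near the borders of the field of view.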

Figure 2a depicts the observed intensities in some frames of an ASDI dataset. Fluctuations can be noted both through time and through the wavelengths. In Fig. 2b, maps of the time and wavelength-specific scaling factors $\widehat{\sigma}_{n,\ell,t}^2$ are displayed for 16 pairs (ℓ,t). At a given location n, large values of this scaling factor compared to other frames t or other wavelengths ℓ indicate that the corresponding patches have a moderate or negligible weight when estimating the mean background and the spatial covariance matrix. In the source detection and characterization steps described in the following sections, patches with comparatively larger scaling factors $\widehat{\sigma}_{n,\ell,t}^2$ also play a minor role. The method is thus robust to the presence of outliers⁵ in the data, see Flasseur et al. (2020). A close inspection of the maps in Fig. 2b reveals the presence of outliers: when an outlier affects a patch, the whole patch is discarded. Outliers are thus visible as a disk-shaped area of large $\widehat{\sigma}_{n,\ell,t}^2$ values (corresponding to all spatial locations n whose patch contains the outlier, that is, the disk shape of our patches).

Fig. 2.

Accounting for temporal and spectral fluctuations with time and wavelength-specific scaling factors: a: observed intensities, for some selected frames (4 wavelengths × 4 exposures); b: corresponding spatial distribution of scaling factors.

The convergence of Algorithm 1 is illustrated in Fig. 3. Three different locations in the field of view, depicted by a red dot in the insert, are selected: a small angular separation in Fig. 3a, an intermediate separation in 3b and a large separation in 3c. In each case, 1000 different random draws were used as an initialization. The graphs report the normalized distance to the solution found with a constant initialization after a large number of iterations. Convergence to the same solution is observed experimentally in all cases. An insert also gives the evolution of each scaling factor $\widehat{\sigma}_{n,\ell,t}^2$ with the iterations, until the convergence criterion is reached. A satisfactory convergence is reached in about 10 iterations. At large angular separations, as in 3c, only the shortest wavelengths are available after the speckles are aligned by spectral zooming. The convergence is even faster in this case.

Fig. 3.

Convergence of the scaling factors, starting from many random initializations. In the inserts, the location in the field of view is indicated as well as the evolution of the weights until the convergence criterion is reached.

Figure 4 evaluates the statistical accuracy of our modeling of the background on HR 8799 ASDI dataset. The left column gives the values and empirical distributions of the collection of mean-subtracted patches {r_{n, ℓ, t} − m_n, ℓ}_{ℓ∈1 : L, t ∈ 1 : T}, at a location n near the coronagraph (rows a and b), at a location farther from the coronagraph (rows c and d), and for all patches from the field of view (row e). Since only the modeling of the background is considered here, all patches around and at the location of the 3 known point-like sources were excluded. Simply removing an average background per wavelength is not satisfactory: values are not distributed according to a Gaussian distribution, there are numerous large deviations. The central column of Fig. 4 gives the intensity values and the empirical distributions when only a spatial whitening is applied, using the same spatial covariance matrix for all frames and all wavelengths. Each mean-subtracted patch intensity of the collection ${{\bar{r}}_{n, ℓ, t}}_{ℓ \in 1 : L, t \in 1 : T} = {r_{n, ℓ, t} - {\hat{m}}_{n, ℓ}}_{ℓ \in 1 : L, t \in 1 : T}$ $Mathematical equation: $ \{ \overline{{\boldsymbol{r}}}_{n,\ell,t} \}_{\ell \in 1:L,\,t\in 1:T} = \{ {\boldsymbol{r}}_{n,\ell,t} - \widehat{{\boldsymbol{m}}}_{n,\ell} \}_{\ell\in 1:L,\,t\in 1:T} $$ is multiplied by the K × K Cholesky factor $\hat{N} {^{⊤}}_{n}$ $Mathematical equation: $ \widehat{{\mathbb{N}}}{^{{\top}}}_n $$ of the spatial covariance ${\hat{C}}_{n}$ $Mathematical equation: $ \widehat{\mathbf{C}}_n $$ such as ${\hat{N}}_{n} {\hat{N}}_{n}^{⊤} = {\hat{C}}_{n}^{- 1}$ $Mathematical equation: $ \widehat{{\mathbb{N}}}_n \widehat{{\mathbb{N}}}_n{^{{\top}}}= \widehat{\mathbf{C}}_n^{-1} $$ . 
If the statistical modeling accurately captures the fluctuations of the background, the vectors $\{\widehat{\mathbb{N}}_n^{\top}\overline{\boldsymbol{r}}_{n,\ell,t}\}_{\ell\in 1:L,\,t\in 1:T}$ should be distributed according to 𝒩(0, I) under ℋ₀: the linear filter $\widehat{\mathbb{N}}_n^{\top}$ whitens the vectors $\{\overline{\boldsymbol{r}}_{n,\ell,t}\}_{\ell\in 1:L,\,t\in 1:T}$. The introduction of the whitening step leads to residuals more closely following a standard Gaussian distribution. The right column considers the case of spatial whitening by a covariance matrix scaled by the time- and wavelength-specific factors $\widehat{\sigma}_{n,\ell,t}$. The empirical distributions follow a standard Gaussian more closely, yet the match is not perfect close to the coronagraph. Accounting for the spectral or temporal correlations would probably further improve the statistical modeling of the background. Such modeling, however, seems difficult to carry out given the limited number of samples and is left to further studies. It is shown in the following sections that the proposed modeling already provides consistent results.
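As a minimal numerical sketch of the whitening step described above (not the authors' implementation), the following NumPy snippet draws synthetic Gaussian patches, computes a plain sample covariance as a stand-in for the estimate $\widehat{\mathbf{C}}_n$, and checks that applying the transposed Cholesky factor yields an identity covariance. The patch size `K`, the number of samples, the simulated covariance `C_true`, and all variable names are assumptions made for illustration; the scaling factors $\widehat{\sigma}_{n,\ell,t}$ are ignored here.

```python
import numpy as np

rng = np.random.default_rng(0)
K, n_samples = 5, 20000  # patch size and number of patches (illustrative values)

# Hypothetical ground-truth spatial covariance used only to simulate patches.
A = rng.normal(size=(K, K))
C_true = A @ A.T + K * np.eye(K)
patches = rng.multivariate_normal(np.zeros(K), C_true, size=n_samples)

# Mean subtraction, then sample covariance (a stand-in for C_hat).
centered = patches - patches.mean(axis=0)
C_hat = centered.T @ centered / n_samples

# Whitening factor N with N @ N.T == inv(C_hat):
# lower-triangular Cholesky factor of the inverse covariance.
N = np.linalg.cholesky(np.linalg.inv(C_hat))

# Apply N.T to every patch; patches are stored as rows, so r -> r @ N.
whitened = centered @ N

# The covariance of the whitened patches should be the identity matrix.
cov_whitened = whitened.T @ whitened / n_samples
print(np.round(cov_whitened, 3))
```

On real data, the interesting check is not the covariance (identity by construction) but whether the whitened values look Gaussian, which is what the histograms of Fig. 4 assess.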

Fig. 4.

Distribution of the centered patches: left: without whitening; center: after spatial whitening; right: after spatial whitening and correction by the wavelength and time-specific scaling factors. Rows a and b: location selected at a small angular separation; rows c and d: location at a larger angular separation; row e: empirical distribution computed over the whole field of view. Patches represented in this figure contain no point-source.

3. Detection maps

The statistical model of the background in ASDI datasets introduced in the previous section is essential to derive the detection and characterization method. Backgrounds at all wavelengths are combined to estimate the parameters of this model. We first describe how this multi-wavelength background model can be applied to produce a detection map at a single wavelength. We then discuss the combination of detection maps at several wavelengths.

3.1. Detection at a single wavelength

Let ϕ₀ be the hypothetical location of a point source in some reference frame. If a point source is present at that location, with a flux $\alpha_\ell$ in the ℓth band of the spectrum, then the signal of that source corresponds, at time t and in the nth patch, to $\alpha_\ell\,\boldsymbol{h}_{n,\ell}(\phi_{\ell,t})$, with $\boldsymbol{h}_{n,\ell}(\phi_{\ell,t})$ the zoomed-in off-axis PSF centered at the subpixel location $\phi_{\ell,t}$ of the source at the ℓth wavelength and tth frame. Given the scarcity of sources in the field of view, it is safe to suppose that, within a small patch of a few tens of pixels, only a single source may be present. Detecting a point source at location ϕ₀ then amounts to deciding between two hypotheses:

$$ \begin{cases} \mathcal{H}_0: \{\boldsymbol{r}_{\lfloor\phi_{\ell,t}\rceil,\ell,t}\}_{t\in 1:T} = \{\boldsymbol{f}_{\lfloor\phi_{\ell,t}\rceil,\ell,t}\}_{t\in 1:T} & \text{(background only)} \\ \mathcal{H}_1: \{\boldsymbol{r}_{\lfloor\phi_{\ell,t}\rceil,\ell,t}\}_{t\in 1:T} = \alpha_\ell\,\{\boldsymbol{h}_{\lfloor\phi_{\ell,t}\rceil,\ell}(\phi_{\ell,t})\}_{t\in 1:T} + \{\boldsymbol{f}_{\lfloor\phi_{\ell,t}\rceil,\ell,t}\}_{t\in 1:T} & \text{(background+source)}, \end{cases} $$ (4)

where f is the notation for patches that contain pure background. The collection of patches considered in this hypothesis test corresponds to all patches that would contain the source if it were present: patches centered at pixel locations $\lfloor\phi_{\ell,t}\rceil$ that match the location of the source at time t and wavelength ℓ due to the rotation of the field of view and the zoom applied to align the speckles at all wavelengths (see Fig. 5). Under hypothesis ℋ₀, the collection of patches corresponds to pure background: no source is present at location ϕ₀. Under hypothesis ℋ₁, the patches result from the superimposition of an off-axis PSF and of the background.

Fig. 5.

Evolution of the 2D location $\phi_{\ell,t}$ of a source in a speckle-aligned ASDI dataset: ϕ₀ defines the 2D angular location of a point source in a reference frame; the apparent location of the point source in the tth observation and the ℓth spectral band is indicated by a black disk; the apparent locations of a point source at other observation times and spectral bands are indicated by gray circles. The location $\phi_{\ell,t}$ describes a radial motion with the wavelength and a rotation about the optical axis over time.

Under our statistical model of the background given in Eq. (2), the likelihood of each hypothesis can be compared for a given flux α_ℓ:

$$ \begin{aligned} 2\log \frac{\text{p}(\{\boldsymbol{r}_{\lfloor\phi_{\ell,t}\rceil,\ell,t}\} \,|\, \mathcal{H}_1, \alpha_\ell)}{\text{p}(\{\boldsymbol{r}_{\lfloor\phi_{\ell,t}\rceil,\ell,t}\} \,|\, \mathcal{H}_0)} &= \alpha_\ell \sum_{t=1}^{T} \boldsymbol{u}_{\lfloor\phi_{\ell,t}\rceil,\ell,t}^{\top}\,\widehat{\mathbf{C}}_{\lfloor\phi_{\ell,t}\rceil}^{-1}\,\boldsymbol{h}_{\lfloor\phi_{\ell,t}\rceil,\ell}(\phi_{\ell,t}) \\ \text{with}\quad \boldsymbol{u}_{n,\ell,t} &= \frac{1}{\widehat{\sigma}_{n,\ell,t}^2} \left( \boldsymbol{r}_{n,\ell,t} - \widehat{\boldsymbol{m}}_{n,\ell} - \alpha_\ell\,\boldsymbol{h}_{n,\ell}(\phi_{\ell,t}) \right). \end{aligned} $$ (5)

Since the flux α_ℓ is generally not known beforehand, it has to be estimated from the data. The maximum likelihood estimator, under our model of the background, is:

$$ \widehat{\alpha}_\ell = \frac{\sum_{t=1}^{T} b_{\ell,t} / \widehat{\sigma}_{\lfloor\phi_{\ell,t}\rceil,\ell,t}^2}{\sum_{t=1}^{T} a_{\ell,t} / \widehat{\sigma}_{\lfloor\phi_{\ell,t}\rceil,\ell,t}^2}, $$ (6)

with

$$ a_{\ell,t} = \boldsymbol{h}_{\lfloor\phi_{\ell,t}\rceil,\ell}(\phi_{\ell,t})^{\top}\,\widehat{\mathbf{C}}_{\lfloor\phi_{\ell,t}\rceil}^{-1}\,\boldsymbol{h}_{\lfloor\phi_{\ell,t}\rceil,\ell}(\phi_{\ell,t}) $$ (7)

and

$$ b_{\ell,t} = \boldsymbol{h}_{\lfloor\phi_{\ell,t}\rceil,\ell}(\phi_{\ell,t})^{\top}\,\widehat{\mathbf{C}}_{\lfloor\phi_{\ell,t}\rceil}^{-1} \left( \boldsymbol{r}_{\lfloor\phi_{\ell,t}\rceil,\ell,t} - \widehat{\boldsymbol{m}}_{\lfloor\phi_{\ell,t}\rceil,\ell} \right). $$ (8)

Substituting α_ℓ with its estimate $\widehat{\alpha}_\ell$ in Eq. (5) leads to the generalized likelihood ratio test:

$$ \text{GLRT}_\ell:\quad \frac{\left( \sum_{t=1}^{T} b_{\ell,t} / \widehat{\sigma}_{\lfloor\phi_{\ell,t}\rceil,\ell,t}^2 \right)^2}{\sum_{t=1}^{T} a_{\ell,t} / \widehat{\sigma}_{\lfloor\phi_{\ell,t}\rceil,\ell,t}^2} \underset{\mathcal{H}_0}{\overset{\mathcal{H}_1}{\gtrless}} \eta. $$ (9)

Only positive flux estimates are physically meaningful for point sources. The test can then be improved by discarding locations leading to negative flux estimates $\widehat{\alpha}_\ell$ (see also Flasseur et al. 2018a):

$$ \mathrm{S/N}_\ell:\quad \frac{\sum_{t=1}^{T} b_{\ell,t} / \widehat{\sigma}_{\lfloor\phi_{\ell,t}\rceil,\ell,t}^2}{\sqrt{\sum_{t=1}^{T} a_{\ell,t} / \widehat{\sigma}_{\lfloor\phi_{\ell,t}\rceil,\ell,t}^2}} \underset{\mathcal{H}_0}{\overset{\mathcal{H}_1}{\gtrless}} \tau, $$ (10)

which matches GLRT_ℓ when $\tau=\sqrt{\eta}$ and $\widehat{\alpha}_\ell \geq 0$. As noted by Mugnier et al. (2009), the ratio in Eq. (10) corresponds to a signal-to-noise ratio. It is obtained by linearly transforming the data and accounts for the local, time- and wavelength-specific covariance of the background. The variance of the estimator $\widehat{\alpha}_\ell$, hereafter denoted v_ℓ, is:

$$ \underbrace{\text{Var}[\widehat{\alpha}_\ell]}_{v_\ell} = \left( \sum_{t=1}^{T} a_{\ell,t} / \widehat{\sigma}_{\lfloor\phi_{\ell,t}\rceil,\ell,t}^2 \right)^{-1}. $$ (11)

The signal-to-noise ratio of the flux estimate (S/N_ℓ) therefore corresponds to the ratio $\widehat{\alpha}_\ell/\sqrt{v_\ell}$ and is distributed as a standard normal variate under ℋ₀ (Mugnier et al. 2009).
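The chain from Eqs. (6) to (11) can be illustrated with a short NumPy sketch. It assumes that the PSF patches, the inverse covariances, the mean-subtracted data patches, and the scaling factors along the trajectory of the source have already been extracted into arrays; the function name `flux_and_snr`, the array layout, and the toy data are hypothetical choices made for this example, not the authors' code.

```python
import numpy as np

def flux_and_snr(h, C_inv, r_centered, sigma2):
    """Flux estimate, its variance, and S/N at one wavelength.

    h          : (T, K) off-axis PSF patches h(phi_{l,t}), one per frame
    C_inv      : (T, K, K) inverse spatial covariances at the patch locations
    r_centered : (T, K) observed patches minus the estimated mean m_hat
    sigma2     : (T,) scaling factors sigma_hat^2
    """
    a = np.einsum("tk,tkj,tj->t", h, C_inv, h)           # Eq. (7)
    b = np.einsum("tk,tkj,tj->t", h, C_inv, r_centered)  # Eq. (8)
    num = np.sum(b / sigma2)
    den = np.sum(a / sigma2)
    alpha_hat = num / den      # Eq. (6)
    variance = 1.0 / den       # Eq. (11)
    snr = num / np.sqrt(den)   # Eq. (10)
    return alpha_hat, variance, snr

# Toy data: T frames, K pixels per patch, identity covariances (all hypothetical).
rng = np.random.default_rng(1)
T, K = 6, 9
h = rng.random((T, K))
C_inv = np.tile(np.eye(K), (T, 1, 1))
sigma2 = rng.uniform(0.5, 2.0, size=T)
true_flux = 2.5
r_centered = true_flux * h  # noise-free source, so the estimate is exact

alpha_hat, variance, snr = flux_and_snr(h, C_inv, r_centered, sigma2)
print(alpha_hat, snr, alpha_hat / np.sqrt(variance))
```

The last two printed values coincide, which is the identity S/N_ℓ = $\widehat{\alpha}_\ell/\sqrt{v_\ell}$ stated in the text.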

Figure 6 compares the detection maps S/N_ℓ computed with Eq. (10) and detection maps obtained by PACO on ADI subsets (i.e., by processing the data one wavelength at a time). The major difference between the two approaches is that PACO ASDI combines information from all wavelengths to learn the background model (more specifically, the covariance matrices). Therefore, a more accurate model is obtained and point-sources are better discriminated against the background: the signal-to-noise ratio of the sources is improved at all wavelengths while the fluctuations in the absence of sources are comparable.

Fig. 6.

PACO versus PACO ASDI: impact of learning background structures at a single wavelength (PACO, first row) or jointly from all the wavelengths (PACO ASDI, second row). Single-wavelength detection maps S/N_ℓ are shown for PACO ASDI. The differences between the two rows are related to the estimation of the covariance matrices C and the scaling parameters σ. The combination of those maps leads to a single detection map, not shown here, with improved sensitivity (see text and Fig. 8).

Beyond the improvement of the detection map at a given wavelength, PACO ASDI also benefits from combining detection maps at different wavelengths to better detect sources, as described in the next section.

3.2. Combining multiple detection maps

3.2.1. Combination assuming spectral independence

The detection of point sources can be greatly improved by combining information from different wavelengths. The most straightforward approach is to extend the hypothesis test in Eq. (4) to include the patches at all times and all wavelengths (all locations depicted in Fig. 5 rather than a single row). Under the assumption that spectral channels are independent, the likelihood can be factored as a product over all channels and the fluxes α_ℓ can be estimated separately. Deciding for the presence of a source at location ϕ₀ based on the generalized likelihood ratio amounts to:

$$ \text{GLRT}:\quad \sum_{\ell=1}^{L} \frac{\left( \sum_{t=1}^{T} b_{\ell,t} / \widehat{\sigma}_{\lfloor\phi_{\ell,t}\rceil,\ell,t}^2 \right)^2}{\sum_{t=1}^{T} a_{\ell,t} / \widehat{\sigma}_{\lfloor\phi_{\ell,t}\rceil,\ell,t}^2} = \sum_{\ell=1}^{L} \frac{\widehat{\alpha}_\ell^2}{v_\ell} \underset{\mathcal{H}_0}{\overset{\mathcal{H}_1}{\gtrless}} \eta. $$ (12)

To account for the imposed non-negativity of source fluxes, the test can be slightly modified, see Flasseur et al. (2018b):

$$ \text{GLRT}^+:\quad \sum_{\ell=1}^{L} \frac{[\widehat{\alpha}_\ell]_+^2}{v_\ell} \underset{\mathcal{H}_0}{\overset{\mathcal{H}_1}{\gtrless}} \eta, $$ (13)

where [x]₊ = max(x, 0) is the positive part of x.

Under ℋ₀ and the assumption that S/N_ℓ values are independent from one another (a hypothesis that will be rejected in the following paragraphs), the distribution of GLRT⁺ is given by (see the derivation in Appendix C):

$$ \begin{aligned} \text{p}\bigl(\text{GLRT}^+ \,\big|\, \mathcal{H}_0\bigr) &= \frac{1}{2^L}\,\delta_0(\text{GLRT}^+) \\ &\quad + \sum_{\ell=0}^{L-1} \frac{L!}{2^L\,\ell!\,(L-\ell)!}\,\chi_{L-\ell}^2\bigl(\text{GLRT}^+\bigr), \end{aligned} $$ (14)

where δ₀ is a Dirac mass centered at 0 and $\chi_{L-\ell}^2$ is a chi-square distribution with L − ℓ degrees of freedom.
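To make the mixture of Eq. (14) concrete, the sketch below evaluates GLRT⁺ from Eq. (13) and the probability of false alarm implied by Eq. (14) under the independence assumption (which the paper goes on to reject). The function names `glrt_plus` and `glrt_plus_pfa` are illustrative, and SciPy's `chi2` and `norm` distributions are used here as a convenience; this is not part of the original method.

```python
from math import comb

import numpy as np
from scipy.stats import chi2, norm

def glrt_plus(alpha_hat, v):
    """GLRT+ of Eq. (13): sum over wavelengths of [alpha_hat]_+^2 / v."""
    return float(np.sum(np.maximum(alpha_hat, 0.0) ** 2 / v))

def glrt_plus_pfa(eta, L):
    """P(GLRT+ > eta | H0) for eta > 0, from the mixture of Eq. (14).

    The Dirac mass at 0 never exceeds a positive threshold, so only the
    chi-square components contribute; the component with k = L - l degrees
    of freedom has weight C(L, k) / 2^L.
    """
    return sum(comb(L, k) / 2**L * chi2.sf(eta, k) for k in range(1, L + 1))

# Two hypothetical channels; the negative flux estimate is clipped to zero.
stat = glrt_plus(np.array([1.0, -2.0]), np.array([0.25, 1.0]))
print(stat)

# Sanity check for L = 1: GLRT+ > eta is equivalent to S/N > sqrt(eta).
print(glrt_plus_pfa(9.0, 1), norm.sf(3.0))
```

For L = 1 the two printed probabilities agree, since thresholding GLRT⁺ at η is then the same as thresholding a standard normal S/N at √η.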

Figure 7a displays the GLRT⁺ obtained with Eq. (13) on an ASDI dataset of HR 8799 obtained with SPHERE-IFS. Three point-sources can be detected in this dataset, at the locations marked c, d, and e. To perform an automated detection, it is necessary to set a threshold corresponding to a fixed probability of false alarm. The empirical distribution of GLRT⁺ values, excluding the three regions that contain the point-sources, is shown to the right of the detection map of Fig. 7a. This empirical distribution is compared to the theoretical distribution of GLRT⁺ under ℋ₀, drawn as a dashed line. A strong mismatch is observed. Due to this discrepancy, it is not possible to derive a detection threshold from the model of Eq. (14). The empirical distribution is shifted to the left, as if the number of wavelengths were smaller than L. An effective number of wavelengths could be derived by fitting the parameter L in Eq. (14) to the empirical distribution. This effective number of wavelengths would account for the correlations of S/N values between adjacent wavelengths. To limit the number of false alarms, the detection threshold is typically set in order to reach probabilities as low as 10⁻⁷. A mismodeling of the right tail of the distribution may therefore have a large impact on the value of the threshold. Rather than estimating an effective number of wavelengths to adjust the model of Eq. (14), we model the distribution of the S/N_ℓ values by accounting for the wavelength correlations.

Fig. 7.

Combined detection maps computed on the HR 8799 ASDI dataset: a: GLRT⁺ criterion and its distribution in the absence of sources; b: wGLRT criterion, including a spectral whitening operation, and its distribution. The three sources are excluded for the computation of the empirical distributions.

3.2.2. Accounting for spectral correlations

Signal-to-noise ratio values S/N_ℓ defined in Eq. (10) are Gaussian distributed. However, they are not mutually independent because the background patches, in a given frame, are very similar for adjacent wavelengths. Before combining detection maps, it is necessary to learn the spectral correlations between the maps S/N_ℓ. This learning is performed locally, on a small region of the maps S/N_ℓ. Since there might be a point source within the region, it is necessary to use a robust estimator $\widehat{\boldsymbol{\Sigma}}$ of the spectral covariance (otherwise, spectral whitening would suppress the source). There are several robust estimators for the covariance, see for example the review in Hubert et al. (2008). The minimum covariance determinant (MCD) method identifies a subset of observations of a fixed size whose covariance matrix has the lowest determinant. To identify this subset quickly, we use the algorithm FAST-MCD introduced in Rousseeuw & Driessen (1999). The region over which an estimate $\widehat{\boldsymbol{\Sigma}}$ is computed must be large enough to guarantee that the area of large S/N_ℓ values corresponding to a point source is treated as an outlier. Appendix D discusses how to set the size of this region and how $\widehat{\boldsymbol{\Sigma}}$ can be improved, in a second step, by masking.

The vector x of S/N_ℓ values is a sufficient statistic for the fluxes $\widehat{\alpha}_\ell$ of a point source. The detection of a point source can thus be defined directly on the vector x:

$$ \begin{cases} \mathcal{H}_0: \boldsymbol{x} = \boldsymbol{\epsilon} & \text{(no source)} \\ \mathcal{H}_1: \boldsymbol{x} = \boldsymbol{\epsilon} + \boldsymbol{\beta} & \text{(a point source is present)}, \end{cases} $$ (15)

where β ∈ ℝ^L is the vector of expected S/N values at each of the L wavelengths, $\beta_\ell=\alpha_\ell/\sqrt{v_\ell}$, and ϵ is a random vector accounting for the fluctuations of S/N_ℓ values. According to our model of spectral correlations, ϵ follows the Gaussian distribution 𝒩(0, Σ). Replacing the unknown vector β by its maximum likelihood estimate $\widehat{\boldsymbol{\beta}}=\boldsymbol{x}$ gives an approximation of the likelihood of ℋ₁ and leads to the following GLR test:

$$ \text{wGLRT}:\quad \Vert \widehat{\mathbb{L}}^{\top}\boldsymbol{x} \Vert_2^2 = \boldsymbol{x}^{\top}\widehat{\boldsymbol{\Sigma}}^{-1}\boldsymbol{x} \underset{\mathcal{H}_0}{\overset{\mathcal{H}_1}{\gtrless}} \eta, $$ (16)

where $\widehat{\mathbb{L}}$ is the L × L whitening matrix obtained by Cholesky factorization, i.e., such that $\widehat{\mathbb{L}}\widehat{\mathbb{L}}^{\top}=\widehat{\boldsymbol{\Sigma}}^{-1}$. Similarly to the spatial whitening introduced in Sect. 2.2, the matrix $\widehat{\mathbb{L}}$ is such that $\widehat{\mathbb{L}}^{\top}\boldsymbol{x}$ is distributed according to 𝒩(0, I) under ℋ₀: it whitens vectors of S/N_ℓ values, hence the name of the corresponding detection criterion.

In the absence of spectral correlations (i.e., Σ = I), 𝕃 = I and the wGLRT equals the GLRT defined in Eq. (12). Under ℋ₀, the wGLRT follows a χ² distribution with L degrees of freedom.
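The spectral whitening of Eq. (16) can be sketched in a few lines of NumPy. The spectral covariance below is a hypothetical matrix with correlations decaying between distant channels, chosen only for illustration (the paper estimates it robustly from the data), and the function name `wglrt` and the S/N values are likewise assumptions.

```python
import numpy as np

def wglrt(x, Sigma_hat):
    """wGLRT of Eq. (16) for one vector x of S/N values."""
    # L_hat @ L_hat.T == inv(Sigma_hat)
    L_hat = np.linalg.cholesky(np.linalg.inv(Sigma_hat))
    z = L_hat.T @ x  # spectrally whitened S/N values
    return float(z @ z)

n_channels = 5
idx = np.arange(n_channels)
# Hypothetical spectral covariance: correlation 0.8 between adjacent channels.
Sigma_hat = 0.8 ** np.abs(idx[:, None] - idx[None, :])
x = np.array([0.5, 0.7, 0.4, 0.6, 0.5])  # hypothetical S/N values

print(wglrt(x, Sigma_hat))           # with spectral whitening
print(wglrt(x, np.eye(n_channels)))  # Sigma = I: reduces to the GLRT of Eq. (12)
```

With the identity covariance, the statistic reduces to the sum of squared S/N values, as stated in the text.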

Figure 7b displays the wGLRT detection map and, by masking out the three sources, the distribution of wGLRT under ℋ₀. The comparison with GLRT⁺ shows that the spectral whitening reduces artifacts in the absence of sources (periodic structures observed in Fig. 7a are no longer visible in Fig. 7b). The empirical distribution of wGLRT is much closer to the expected distribution. However, the match is not perfect, which motivates considering another approach to the combination of detection maps.

3.2.3. Improving the detection based on a prior spectrum model

If a coarse model of the spectrum of the point source under study is available prior to the detection, this model can be used to improve the detection performance by giving more weight to spectral bands where larger values are expected. Let γ be the spectrum of a point source. Given that spectrum, the hypothesis test (15) takes the simplified form:

$$ \begin{cases} \mathcal{H}_0: \boldsymbol{x} = \boldsymbol{\epsilon} & \text{(no source)} \\ \mathcal{H}_1: \boldsymbol{x} = \boldsymbol{\epsilon} + \alpha^{\text{int}} \cdot \underbrace{\begin{pmatrix} | \\ \gamma_\ell/\sqrt{v_\ell} \\ | \end{pmatrix}}_{\boldsymbol{\beta}^{\prime}} & \text{(a point source is present)}, \end{cases} $$ (17)

where α^int is the spectrally integrated flux, that is, the flux such that α_ℓ = α^int γ_ℓ, for all ℓ. In contrast to the hypothesis test (15), this new test requires estimating a single scalar parameter: α^int. The maximum likelihood estimator for the integrated flux α^int is:

$$ \widehat{\alpha}^{\text{int}} = \frac{\boldsymbol{x}^{\top}\widehat{\mathbb{L}}\widehat{\mathbb{L}}^{\top}\boldsymbol{\beta}^{\prime}}{\boldsymbol{\beta}^{\prime\top}\widehat{\mathbb{L}}\widehat{\mathbb{L}}^{\top}\boldsymbol{\beta}^{\prime}}, $$ (18)

where β′ is the vector of ℝ^L whose ℓth element is equal to $\gamma_\ell/\sqrt{v_\ell}$ (the expected S/N_ℓ value if α_ℓ were equal to γ_ℓ, i.e., α^int = 1). The variance of the estimator $\widehat{\alpha}^{\text{int}}$ is $(\boldsymbol{\beta}^{\prime\top}\widehat{\mathbb{L}}\widehat{\mathbb{L}}^{\top}\boldsymbol{\beta}^{\prime})^{-1}$. By substituting α^int with its estimate $\widehat{\alpha}^{\text{int}}$, another GLRT is obtained:

$$ \text{wwGLRT}:\quad 2\log \frac{\text{p}(\boldsymbol{x} \,|\, \mathcal{H}_1, \widehat{\alpha}^{\text{int}})}{\text{p}(\boldsymbol{x} \,|\, \mathcal{H}_0)} = \frac{\left( \boldsymbol{x}^{\top}\widehat{\mathbb{L}}\widehat{\mathbb{L}}^{\top}\boldsymbol{\beta}^{\prime} \right)^2}{\boldsymbol{\beta}^{\prime\top}\widehat{\mathbb{L}}\widehat{\mathbb{L}}^{\top}\boldsymbol{\beta}^{\prime}}. $$ (19)

Since only positive integrated fluxes α^int make sense, vectors x such that $\boldsymbol{x}^{\top}\widehat{\mathbb{L}}\widehat{\mathbb{L}}^{\top}\boldsymbol{\beta}^{\prime} < 0$ can be discarded. The square root of the GLRT then leads to the test:

$$ \text{wwS/N}:\quad \frac{\boldsymbol{x}^{\top}\widehat{\mathbb{L}}\widehat{\mathbb{L}}^{\top}\boldsymbol{\beta}^{\prime}}{\sqrt{\boldsymbol{\beta}^{\prime\top}\widehat{\mathbb{L}}\widehat{\mathbb{L}}^{\top}\boldsymbol{\beta}^{\prime}}} = \sum_{\ell=1}^{L} w_\ell \cdot \left[ \widehat{\mathbb{L}}^{\top}\boldsymbol{x} \right]_\ell \underset{\mathcal{H}_0}{\overset{\mathcal{H}_1}{\gtrless}} \tau, $$ (20)

which takes the form of a linear combination of the whitened vector of S/N_ℓ values, with weights w_ℓ defined by $w_\ell=[\widehat{\mathbb{L}}^{\top}\boldsymbol{\beta}^{\prime}]_\ell/\sqrt{\boldsymbol{\beta}^{\prime\top}\widehat{\mathbb{L}}\widehat{\mathbb{L}}^{\top}\boldsymbol{\beta}^{\prime}}$, hence the name wwS/N. As in our previous derivation of S/N_ℓ, wwS/N can be interpreted as a signal-to-noise ratio: $\mathrm{wwS/N}=\widehat{\alpha}^{\text{int}}/\sqrt{\text{Var}[\widehat{\alpha}^{\text{int}}]}$.

Interestingly, wwS/N corresponds to the optimal linear combination of the S/N_ℓ values, in the sense that the probability of detection is maximized for all false alarm rates; in other words, it is a matched filter (see Appendix E).
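A compact NumPy sketch of Eqs. (18) and (20) follows. The spectral covariance, prior spectrum, and variances are hypothetical values chosen for illustration, and the function name `ww_snr` is not from the paper. The S/N vector is built noise-free for an integrated flux of 3, so the estimator should recover that value.

```python
import numpy as np

def ww_snr(x, Sigma_hat, gamma, v):
    """Integrated-flux estimate (Eq. 18) and wwS/N (Eq. 20)."""
    beta = gamma / np.sqrt(v)  # beta' of Eq. (17)
    L_hat = np.linalg.cholesky(np.linalg.inv(Sigma_hat))
    zx = L_hat.T @ x           # whitened S/N vector
    zb = L_hat.T @ beta        # whitened template
    norm2 = zb @ zb            # beta'^T L L^T beta'
    alpha_int = (zx @ zb) / norm2   # Eq. (18)
    weights = zb / np.sqrt(norm2)   # weights w_l of Eq. (20)
    return float(alpha_int), float(weights @ zx), weights

n_channels = 4
idx = np.arange(n_channels)
Sigma_hat = 0.6 ** np.abs(idx[:, None] - idx[None, :])  # hypothetical correlations
gamma = np.array([0.1, 0.2, 0.3, 0.4])                  # hypothetical prior spectrum
v = np.array([1.0, 2.0, 0.5, 1.5])                      # hypothetical variances v_l
x = 3.0 * gamma / np.sqrt(v)                            # noise-free S/N for alpha_int = 3

alpha_int, wwsn, weights = ww_snr(x, Sigma_hat, gamma, v)
print(alpha_int, wwsn, weights)
```

The weights have unit Euclidean norm, which is why wwS/N has unit variance when the whitened S/N values are standard normal under ℋ₀.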

Comparison between wGLRT and wwS/N. Both wGLRT and wwS/N encompass a whitening of the spectral correlations, but wwS/N also includes an additional spectral weighting strategy based on a prior spectrum model. In order to select between these two different detection criteria, several aspects must be considered: (i) does the criterion follow the expected distribution under ℋ₀, that is, can detection thresholds be set for prescribed false alarm rates? (ii) How large is the gain obtained when the spectrum of the source is available? (iii) How does wwS/N degrade if the assumed spectrum differs from the true spectrum?

Under ℋ₀, wGLRT is expected to follow a χ² distribution with L degrees of freedom, while wwS/N is expected to follow a standard Gaussian distribution. The analysis of Fig. 7b led to the conclusion that the fit with a χ² distribution with L degrees of freedom was not accurate. Figure 8 illustrates the empirical distribution of wwS/N for several prior spectra. For all the spectra considered, the empirical distribution closely fits a standard Gaussian. A possible explanation for this better fit of wwS/N with the theoretical distribution under ℋ₀ is that the weights w_ℓ give more importance to spectral channels of good quality (i.e., with a low variance v_ℓ) and that these channels follow our Gaussian model more closely. The good fit of wwS/N under ℋ₀ with the expected distribution makes it possible to reliably set detection thresholds for a prescribed false alarm rate.

Fig. 8.

Combined detection map with spectrum priors: in the absence of sources the empirical distribution matches very closely a Gaussian distribution (red parabola in the log-scale representations).

In order to assess the gain in detection performance brought by the prior knowledge of the spectrum of the source, we compare the contrasts achievable at a 5σ false alarm rate. We derive the theoretical contrast values based on our statistical modeling even if, in practice, a deviation is observed between the theoretical distribution and the empirical distribution of wGLRT. Under ℋ₀, wGLRT is expected to follow a χ² distribution with L degrees of freedom. The probability of false alarm is thus $\text{PFA}=\text{P}(\chi_L^2>\eta)$. The probability of detection of a source of flux α^int γ, PD = P(wGLRT > η|ℋ₁), corresponds to the probability that a noncentral χ² distribution with L degrees of freedom and noncentrality parameter $(\alpha^{\text{int}})^2\,\boldsymbol{\beta}^{\prime\top}\widehat{\mathbb{L}}\widehat{\mathbb{L}}^{\top}\boldsymbol{\beta}^{\prime}$ exceeds the detection threshold η. This probability corresponds to $\text{Q}_{L/2}(\alpha^{\text{int}}(\boldsymbol{\beta}^{\prime\top}\widehat{\mathbb{L}}\widehat{\mathbb{L}}^{\top}\boldsymbol{\beta}^{\prime})^{1/2}, \sqrt{\eta})$, where Q_M(a, b) is the Marcum Q-function (Simon 2007).
Hence, the theoretical 5σ contrast reached by wGLRT can be computed by first solving the equation $\frac{1}{\Gamma(L/2)}\gamma(L/2,\eta/2)=\Phi(5)$ for η (where γ is here the lower incomplete gamma function and Φ is the cumulative distribution function of the standard normal distribution), and then solving $\text{Q}_{L/2}(\alpha^{\text{int}}(\boldsymbol{\beta}^{\prime\top}\widehat{\mathbb{L}}\widehat{\mathbb{L}}^{\top}\boldsymbol{\beta}^{\prime})^{1/2}, \sqrt{\eta})=1/2$ for α^int. For example, when L = 39 (as is the case for SPHERE-IFS), we find η ≈ 100 and $\alpha^{\text{int}}\approx 7.87\,(\boldsymbol{\beta}^{\prime\top}\widehat{\mathbb{L}}\widehat{\mathbb{L}}^{\top}\boldsymbol{\beta}^{\prime})^{-1/2}$.

Since the expectation 𝔼[wwS/N|ℋ₁] is equal to $\alpha^{\text{int}}(\boldsymbol{\beta}^{\prime\top}\widehat{\mathbb{L}}\widehat{\mathbb{L}}^{\top}\boldsymbol{\beta}^{\prime})^{1/2}$, the 5σ contrast reached by wwS/N is readily obtained: $\alpha^{\text{int}}=5\,(\boldsymbol{\beta}^{\prime\top}\widehat{\mathbb{L}}\widehat{\mathbb{L}}^{\top}\boldsymbol{\beta}^{\prime})^{-1/2}$. Including the prior knowledge of the source spectrum, therefore, improves the contrast by a factor 1.57 (wwS/N reaches a theoretical contrast that is 1.57 times better than wGLRT).
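The two-step computation described above can be sketched with SciPy, using the central and noncentral χ² distributions in place of the incomplete gamma and Marcum Q-functions (the noncentral χ² survival function with L degrees of freedom is the quantity the text expresses through Q_{L/2}). The root-finding bracket and the variable names are assumptions of this sketch; it should land close to the η ≈ 100 and 7.87 quoted in the text, although the exact digits depend on numerical precision.

```python
from scipy.optimize import brentq
from scipy.stats import chi2, ncx2, norm

L = 39  # number of spectral channels (SPHERE-IFS value quoted in the text)

# Step 1: threshold eta such that P(chi2_L > eta) equals the one-sided 5-sigma tail.
pfa = norm.sf(5.0)
eta = chi2.isf(pfa, L)

# Step 2: noncentrality parameter giving a probability of detection of 1/2.
# The bracket [1, 500] is an assumption; it contains the root for L = 39.
noncentrality = brentq(lambda lam: ncx2.sf(eta, L, lam) - 0.5, 1.0, 500.0)

# alpha_int * sqrt(beta'^T L L^T beta') required by wGLRT, to compare with 5 for wwS/N.
amp_wglrt = noncentrality ** 0.5
gain = amp_wglrt / 5.0

print(eta, amp_wglrt, gain)
```

The last printed value is the ratio between the wGLRT and wwS/N amplitudes, i.e., the contrast gain discussed in the text.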

Rather than expressing the contrast in terms of the value of the integrated flux α^int required in order to achieve the detection, it can also be expressed as the flux of the source, at a given wavelength, so that the detection using jointly all wavelengths is possible. This wavelength-specific contrast corresponds to the values α_ℓ = α^int γ_ℓ. For example, the achievable contrast using only a single detection map S/N_ℓ is $5\sqrt{v_\ell}$. When the multi-wavelength criterion wwS/N is applied, if $\widehat{\mathbb{L}}=\mathbf{I}$ and if the prior actually matches the true spectrum of the source, then the achievable contrast corresponds to a flux $5\gamma_\ell/\sqrt{\sum_{\ell^{\prime}} \gamma_{\ell^{\prime}}^2/v_{\ell^{\prime}}}$ in the ℓth channel. Therefore, with respect to the single-wavelength map, the contrast is improved by a factor:

$$ \sqrt{1+\sum_{\ell'\neq\ell}^{L}\frac{(\gamma_{\ell'}/\gamma_\ell)^2}{v_{\ell'}/v_\ell}}\,, $$ (21)

which depends on the spectrum γ_ℓ of the source and on the variance v_ℓ of the source flux. This factor is strictly greater than one (i.e., the contrast is strictly improved) provided that there is at least one wavelength ℓ′, different from ℓ, such that the spectrum is nonzero (γ_ℓ′ ≠ 0) and the variance is finite (v_ℓ′ < ∞). Obviously, if these two conditions are not met, either the source emits no light at the additional wavelengths or no meaningful measurement is available, so that the performance cannot be improved compared to using a single-wavelength detection map. In all other cases, there is a gain, that is, the combined detection map leads to a better sensitivity than even the detection map S/N_ℓ at the wavelength providing the best contrast. In particular, if the spectrum is flat (∀ℓ, γ_ℓ = 1/L) and the variances v_ℓ are all equal, the contrast is improved by a factor $\sqrt{L}$, which is to be expected when combining L measurements of identical statistical weight. This factor should, however, be considered as an upper bound that cannot be reached in practice for several reasons: (i) the whitening matrix $\widehat{\mathbb{L}}$ differs from the identity because of the correlations between channels, so that the effective number of (independent) channels is, in fact, smaller than L; (ii) the estimation of the matrix $\widehat{\mathbb{L}}$ is performed in two steps and relies, in the second step, on a thresholding strategy that requires a detection to prevent the attenuation of point-like sources; (iii) neither the spectrum nor the variance is constant with respect to the wavelength; (iv) the true spectrum of the source may differ from the prior spectrum used in wwS/N.
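As a quick illustration of Eq. (21), the sketch below evaluates the gain for two limiting cases discussed above. It assumes an identity whitening matrix and a prior equal to the true spectrum; the function name `multispectral_gain` and the numerical values are hypothetical.

```python
import math

def multispectral_gain(gamma, v, ell):
    """Gain of Eq. (21) for channel `ell` (requires gamma[ell] != 0)."""
    g0, v0 = gamma[ell], v[ell]
    extra = sum((g / g0) ** 2 / (vv / v0)
                for k, (g, vv) in enumerate(zip(gamma, v)) if k != ell)
    return math.sqrt(1.0 + extra)

L = 39
equal_var = [2.0] * L                      # arbitrary common variance

# Flat spectrum and equal variances: the gain is sqrt(L).
flat = [1.0 / L] * L
gain_flat = multispectral_gain(flat, equal_var, 0)

# Source emitting in a single channel: the other channels bring no gain.
single = [1.0] + [0.0] * (L - 1)
gain_single = multispectral_gain(single, equal_var, 0)

print(gain_flat, math.sqrt(L), gain_single)
```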

Impact of a mismatch between the true and assumed spectrum in wwS/N. Finally, the impact of a mismatch between the assumed spectrum in wwS/N and the true spectrum of the source needs to be assessed. This impact can be evaluated by comparing the contrast that is reached when the actual spectrum is used with the contrast reached when an incorrect prior spectrum is used in wwS/N. Let γ_⋆ be the true spectrum and $\boldsymbol{\beta}'_{\star}$ the vector such that $\beta'_{\star,\ell}=\gamma_{\star,\ell}/\sqrt{v_\ell}$ for all ℓ. The achievable contrast under the true spectrum prior is:

$$ 5\,({\boldsymbol{\beta}'_{\star}}^{\top}\widehat{\mathbb{L}}\widehat{\mathbb{L}}^{\top}\boldsymbol{\beta}'_{\star})^{-1/2}. $$ (22)

Under the incorrect prior γ, wwS/N is distributed, under ℋ₁, according to the Gaussian $\mathcal{N}\big(\alpha^{\text{int}}({\boldsymbol{\beta}'_{\star}}^{\top}\widehat{\mathbb{L}}\widehat{\mathbb{L}}^{\top}\boldsymbol{\beta}')({\boldsymbol{\beta}'}^{\top}\widehat{\mathbb{L}}\widehat{\mathbb{L}}^{\top}\boldsymbol{\beta}')^{-1/2},1\big)$. The contrast that is achieved is thus equal to $5\,({\boldsymbol{\beta}'}^{\top}\widehat{\mathbb{L}}\widehat{\mathbb{L}}^{\top}\boldsymbol{\beta}')^{1/2}/({\boldsymbol{\beta}'_{\star}}^{\top}\widehat{\mathbb{L}}\widehat{\mathbb{L}}^{\top}\boldsymbol{\beta}')$. With respect to the ideal case where the true spectrum γ_⋆ is used as a prior, the achievable contrast is degraded by a factor:

$$ \frac{\sqrt{{\boldsymbol{\beta}'}^{\top}\widehat{\mathbb{L}}\widehat{\mathbb{L}}^{\top}\boldsymbol{\beta}'}\,\sqrt{{\boldsymbol{\beta}'_{\star}}^{\top}\widehat{\mathbb{L}}\widehat{\mathbb{L}}^{\top}\boldsymbol{\beta}'_{\star}}}{{\boldsymbol{\beta}'_{\star}}^{\top}\widehat{\mathbb{L}}\widehat{\mathbb{L}}^{\top}\boldsymbol{\beta}'}\,, $$ (23)

which corresponds to the inverse of the normalized correlation between the whitened true spectrum and the whitened assumed spectrum.
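The degradation factor of Eq. (23) can be sketched in a few lines. The example below is only illustrative: the function names, the three-channel spectra, and the lower-triangular matrix `L_tri` are made up, and both the prior and the true spectrum are scaled by $1/\sqrt{v_\ell}$ before whitening, following the definition of $\boldsymbol{\beta}'_{\star}$ given above.

```python
import math

def whiten(L_hat, beta):
    """Return L_hat^T beta for a square matrix given as a list of rows."""
    n = len(beta)
    return [sum(L_hat[i][j] * beta[i] for i in range(n)) for j in range(n)]

def degradation_factor(gamma_prior, gamma_true, v, L_hat):
    """Eq. (23): inverse normalized correlation of the whitened spectra."""
    beta = [g / math.sqrt(vl) for g, vl in zip(gamma_prior, v)]
    beta_star = [g / math.sqrt(vl) for g, vl in zip(gamma_true, v)]
    u, u_star = whiten(L_hat, beta), whiten(L_hat, beta_star)
    dot = sum(a * b for a, b in zip(u, u_star))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in u_star))
    return norm / dot

identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
L_tri = [[1.0, 0.0, 0.0], [0.5, 1.0, 0.0], [0.2, 0.5, 1.0]]   # hypothetical
v = [1.0, 1.0, 1.0]
flat = [1.0 / 3.0] * 3          # assumed prior spectrum
true = [0.5, 0.5, 0.0]          # assumed true spectrum

f_identity = degradation_factor(flat, true, v, identity)
f_forward = degradation_factor(flat, true, v, L_tri)
f_swapped = degradation_factor(true, flat, v, L_tri)
f_matched = degradation_factor(true, true, v, L_tri)
print(f_identity, f_forward, f_swapped, f_matched)
```

With the identity whitening matrix, the flat prior against this true spectrum gives a factor of √1.5; swapping the roles of the two spectra leaves the factor unchanged, and a matched prior gives a factor of 1.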

Table 2 reports the factors by which the achievable contrast is degraded when the prior spectrum differs from the true spectrum. Due to the symmetry in Eq. (23), the roles of the prior spectrum and of the true spectrum can be interchanged. These factors have been computed for 1681 whitening filters computed on a SPHERE-IFS dataset around HR 8799. The mean factor and its standard deviation are reported in the table. When the true spectrum and the prior spectrum are very close (third true spectrum and first prior, in the table), the contrast degradation is negligible (the factor is not significantly greater than 1). Even when the spectra differ significantly (last true spectrum and first or last prior), the contrast degradation remains modest (at most a factor 1.48 in this case) and smaller than that observed when replacing wwS/N by wGLRT (a factor 1.57 was predicted in the previous paragraph).

Table 1.

Reminder of the main notations.

Table 2.

Degradation of the achievable contrast when the prior spectrum differs from the true spectrum.

Selected detection criterion. These comparisons between wGLRT and wwS/N lead to a clear conclusion: wwS/N is to be preferred since (i) its distribution under ℋ₀ is more accurately modeled (so that detection thresholds can be automatically set to reach prescribed false alarm rates); (ii) the detection performance is higher when the spectrum of the source is known in advance; (iii) even if the prior spectrum used in wwS/N differs significantly from the true spectrum of the source, the detection performance of wwS/N is higher than that of wGLRT.

Other algorithms for exoplanet detection in ASDI datasets can produce a detection map (signal-to-noise ratio) per wavelength. We show in Appendix F that our strategy for combining multiple detection maps is also beneficial to those algorithms.

4. Source characterization

So far, we introduced two modelings of the data: (i) the background model introduced in Sect. 2, accounting for spatial covariances as well as wavelength-specific and time-specific scaling factors, and (ii) the model of the spectral covariances of vectors x of S/N_ℓ values. The second model, based on the intermediate detection maps S/N_ℓ, includes both the patch covariances (through the computation of S/N_ℓ values) and the spectral covariances. Rather than performing the astrometric and photometric characterizations of a detected point source based on the co-log-likelihood ℒ_n introduced in Eq. (2), which does not account for spectral correlations, we define the co-log-likelihood 𝒞 on the vectors of S/N_ℓ values:

$$ \begin{aligned} \mathcal{C}(\phi_0,\boldsymbol{\alpha}) &= -\log \text{p}(\boldsymbol{x}\,|\,\phi_0,\boldsymbol{\alpha},\{v_\ell\}_{\ell\in 1:L},\mathbb{L}) \\ &= \tfrac{1}{2}\left\Vert \mathbb{L}^{\top}\begin{pmatrix} | \\ x_\ell-\dfrac{\alpha_\ell}{\sqrt{v_\ell}} \\ | \end{pmatrix}\right\Vert_2^2 + \text{const.}, \end{aligned} $$ (24)

where the vector x of S/N_ℓ values is extracted at the integer location $\lfloor\phi_0\rceil$ of the field of view, and the variance values v_ℓ depend on the level of background fluctuations in the patches extracted from spectral channel ℓ. The constant term depends only on L and on the determinant of the whitening matrix $\mathbb{L}$.

Similarly to the PACO algorithm (Flasseur et al. 2018a), when characterizing a point source found in the detection map, the background statistics are re-estimated jointly with the determination of the subpixel location and flux of the source. This prevents any bias that may occur due to self-subtraction (computation of the mean patches $\widehat{\boldsymbol{m}}_{n,\ell}$ without accounting for the presence of the source). An alternating estimation strategy is carried out by iteratively applying the following steps (see also Fig. 1): (i) Algorithm 1 is applied to the residual patches $\{\boldsymbol{r}_{\lfloor\phi_{\ell,t}\rceil,\ell,t}-\alpha_\ell\,\boldsymbol{h}_{\lfloor\phi_{\ell,t}\rceil,\ell}(\phi_{\ell,t})\}_{\ell\in 1:L,\,t\in 1:T}$, with α initially set to 0, to learn the local background statistics; (ii) S/N_ℓ values x_ℓ and variances v_ℓ are computed for each wavelength; (iii) the spectral covariance Σ under ℋ₀ is estimated based on the vectors of values $x_\ell-\alpha_\ell/\sqrt{v_\ell}$ in a local area centered at $\lfloor\phi_0\rceil$, and the whitening matrix 𝕃 is then derived by Cholesky factorization of the inverse of Σ; (iv) the subpixel location ϕ₀ and the flux values α_ℓ are estimated. All the steps are then repeated to improve the background modeling and progressively separate source and background.

The last step, corresponding to the astrometric and photometric estimations, is detailed in the following two paragraphs.

4.1. Astrometric estimation

The estimation of the location ϕ₀, with subpixel accuracy, can be performed by maximizing one of the combined-wavelengths detection criteria over a grid. In practice, we maximize wwS/N over a refined subpixel grid of locations ϕ₀ with the current flux estimates as a prior spectrum: γ = α/(∑_ℓ α_ℓ). This corresponds to jointly maximizing 𝒞(ϕ₀, 0)−𝒞(ϕ₀, α^intγ) with respect to the location ϕ₀ and the integrated source flux α^int.

Beyond the unbiased estimation of the astrometry and the photometry, it is critical to characterize the variance of these two quantities. In this context, the variances and covariances are predicted at each location of the field of view through the so-called Cramér–Rao lower bounds (CRLBs) of the vector p of parameters (the 2D angular location ϕ₀ and the spectrum α) that characterizes a point source. The CRLB is a good estimate of the covariance of the maximum likelihood estimator when the number of samples is large enough (Kendall et al. 1948). We follow the approach of Flasseur et al. (2018a, 2020) for ADI processing with the PACO algorithm: the Fisher information matrix I^F on the vector p is given by:

$$ \begin{aligned} \left[\mathbf{I}^{\text{F}}_{\lfloor\phi_0\rceil}(\boldsymbol{p})\right]_{i,j} = &\sum_{\ell=1}^{L}\sum_{t=1}^{T}\frac{1}{\widehat{\sigma}^{2}_{\lfloor\phi_0\rceil,\ell,t}}\left(\frac{\partial\left[\alpha_\ell\,\boldsymbol{h}_{\lfloor\phi_0\rceil,\ell}(\phi_0)\right]}{\partial p_i}\right)^{\top} \\ &\cdot\,\widehat{\mathbf{C}}^{-1}_{\lfloor\phi_0\rceil}\,\frac{\partial\left[\alpha_\ell\,\boldsymbol{h}_{\lfloor\phi_0\rceil,\ell}(\phi_0)\right]}{\partial p_j}, \end{aligned} $$ (25)

where the first two components of p represent the two components of the 2D location ϕ₀. The product $\alpha_\ell\,\boldsymbol{h}_{\lfloor\phi_0\rceil,\ell}$ models the signal of a point source of flux α_ℓ in the ℓth spectral channel. As in Flasseur et al. (2018a, 2020), we use a continuous model of the off-axis PSF (for instance, an isotropic Gaussian) to simplify the computation of the spatial derivatives. The standard deviations δ_i for each of the parameters are obtained from the diagonal of the inverse of the Fisher information matrix:

$$ \left[\boldsymbol{\delta}(\boldsymbol{p})\right]_i = \sqrt{\left[\mathbf{I}^{\text{F}}(\boldsymbol{p})^{-1}\right]_{i,i}}\,. $$ (26)
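Note that Eq. (26) takes the diagonal of the inverse of the Fisher matrix, not the inverse of its diagonal, so that couplings between parameters enlarge the predicted standard deviations. The toy sketch below illustrates this for two parameters only; the helper `crlb_std_2x2` and the matrix values are made up for illustration and do not come from a PACO ASDI reduction.

```python
import math

def crlb_std_2x2(fisher):
    """Square roots of the diagonal of the inverse of a 2x2 Fisher matrix."""
    (a, b), (c, d) = fisher
    det = a * d - b * c
    if a <= 0.0 or det <= 0.0:
        raise ValueError("Fisher matrix must be positive definite")
    # The inverse of [[a, b], [c, d]] has diagonal (d / det, a / det).
    return math.sqrt(d / det), math.sqrt(a / det)

fisher = [[4.0, 1.0], [1.0, 2.0]]            # made-up values
std_coupled = crlb_std_2x2(fisher)
std_uncoupled = (1.0 / math.sqrt(4.0), 1.0 / math.sqrt(2.0))
print(std_coupled, std_uncoupled)
```

Ignoring the off-diagonal term would underestimate both standard deviations in this example.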

4.2. Estimation of the source spectrum

At a given source location ϕ₀, estimating the vector of source fluxes α by minimizing 𝒞 leads to the following maximum likelihood estimates:

$$ \widehat{\boldsymbol{\alpha}} = \mathbf{V}^{-1}\boldsymbol{x}\,, $$ (27)

where V is a diagonal matrix with diagonal entries $[\mathbf{V}]_{\ell,\ell}=1/\sqrt{v_\ell}$. This means that, for each wavelength ℓ, the estimated flux $\widehat{\alpha}_\ell$ is:

$$ \widehat{\alpha}_\ell = \sqrt{v_\ell}\,x_\ell\,, $$ (28)

which corresponds to the same flux estimates as obtained when computing the S/N_ℓ values channel by channel with Eq. (6) (i.e., accounting for the spectral correlations does not lead to a different estimator because 𝕃 is non-singular). The estimator covariance, on the other hand, reflects that flux variations are correlated:

$$ \left[\text{Cov}[\widehat{\boldsymbol{\alpha}}]\right]_{i,j}=\sqrt{v_i\,v_j}\,\left[\boldsymbol{\Sigma}\right]_{i,j}. $$ (29)
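Equations (28) and (29) translate directly into code. The sketch below uses hypothetical S/N_ℓ values, flux variances, and spectral covariance for three channels; the function names are illustrative.

```python
import math

def flux_estimates(x, v):
    """Eq. (28): per-channel flux estimates sqrt(v_l) * x_l."""
    return [math.sqrt(vl) * xl for xl, vl in zip(x, v)]

def flux_covariance(v, sigma):
    """Eq. (29): Cov[alpha]_{i,j} = sqrt(v_i v_j) * Sigma_{i,j}."""
    n = len(v)
    return [[math.sqrt(v[i] * v[j]) * sigma[i][j] for j in range(n)]
            for i in range(n)]

x = [6.0, 8.0, 5.0]                   # hypothetical S/N_l values
v = [4.0e-12, 1.0e-12, 9.0e-12]       # hypothetical flux variances
sigma = [[1.0, 0.5, 0.2],
         [0.5, 1.0, 0.5],
         [0.2, 0.5, 1.0]]             # hypothetical spectral covariance of x

alpha = flux_estimates(x, v)
cov = flux_covariance(v, sigma)
error_bars = [math.sqrt(cov[i][i]) for i in range(len(v))]
print(alpha, error_bars)
```

The off-diagonal entries of `cov` carry the correlation between the flux errors of neighboring channels, which matters when the estimated spectrum is later fitted with an atmospheric model.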

When using instruments with many contiguous spectral bands, a spectral smoothness can also be enforced by favoring fluxes with small variations from one spectral band to the other, as captured by the following regularization term:

$$ \mathcal{R}(\boldsymbol{\alpha}) = \tfrac{1}{2}\Vert\mathbf{D}\boldsymbol{\alpha}\Vert_2^2\,, $$ (30)

with D the matrix of the finite differences.

If a large library of spectra is available, an a priori covariance Γ of the spectrum can be learned, providing a richer modeling than the simple smoothness prior. In the definition of ℛ, the matrix D is then replaced by D = Γ^−1/2, that is, $\mathcal{R}(\boldsymbol{\alpha}) = -\log \text{p}(\boldsymbol{\alpha})+\text{const} = \tfrac{1}{2}(\boldsymbol{\alpha}-\overline{\boldsymbol{\alpha}})^{\top}\boldsymbol{\Gamma}^{-1}(\boldsymbol{\alpha}-\overline{\boldsymbol{\alpha}}) = \tfrac{1}{2}\Vert\boldsymbol{\Gamma}^{-1/2}(\boldsymbol{\alpha}-\overline{\boldsymbol{\alpha}})\Vert_2^2$, where the prior distribution p(α) is a multivariate Gaussian with mean vector $\overline{\boldsymbol{\alpha}}$ and covariance matrix Γ; see, for example, Tarantola (2005).

When a regularization term is considered, the estimation of the fluxes α corresponds to a maximum a posteriori (MAP) estimate:

$$ \widehat{\boldsymbol{\alpha}}^{\text{(MAP)}} = \underset{\boldsymbol{\alpha}}{\arg\min}\;\mathcal{C}(\phi_0,\boldsymbol{\alpha})+\mu\,\mathcal{R}(\boldsymbol{\alpha})\,, $$ (31)

where μ is a hyperparameter that controls the amount of smoothing introduced by the regularization term. Since both 𝒞 and ℛ are quadratic in α, the MAP estimate can be obtained in closed form and corresponds to the following linear transform of the vector x of S/N_ℓ values:

$$ \widehat{\boldsymbol{\alpha}}^{\text{(MAP)}} = \underbrace{\left(\mathbf{V}\widehat{\boldsymbol{\Sigma}}^{-1}\mathbf{V}+\mu\,\mathbf{D}^{\top}\mathbf{D}\right)^{-1}\mathbf{V}\widehat{\mathbb{L}}}_{\mathbf{M}(\mu)}\,\widehat{\mathbb{S}}\,, $$ (32)

where $\widehat{\mathbb{S}}=\widehat{\mathbb{L}}^{\top}\boldsymbol{x}$ is the whitened⁶ vector of spectral S/N_ℓ values, and M(μ) is the matrix defining the linear estimator $\widehat{\boldsymbol{\alpha}}^{\text{(MAP)}}$, parameterized by the smoothing parameter μ. It can be noted that removing the regularization term (i.e., taking μ = 0) in Eq. (32) yields the same estimator as the one in Eq. (27).
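For completeness, the closed form of Eq. (32) follows from setting the gradient of the objective of Eq. (31) to zero. The short derivation below is a reader's sketch: it assumes $\widehat{\boldsymbol{\Sigma}}^{-1}=\widehat{\mathbb{L}}\widehat{\mathbb{L}}^{\top}$ (the Cholesky factorization mentioned in Sect. 4), the smoothness regularization of Eq. (30), and Eq. (24) written in matrix form with V the diagonal matrix of entries $1/\sqrt{v_\ell}$.

```latex
\begin{aligned}
\mathcal{C}(\phi_0,\boldsymbol{\alpha}) + \mu\,\mathcal{R}(\boldsymbol{\alpha})
  &= \tfrac{1}{2}\,(\boldsymbol{x}-\mathbf{V}\boldsymbol{\alpha})^{\top}
     \widehat{\mathbb{L}}\widehat{\mathbb{L}}^{\top}
     (\boldsymbol{x}-\mathbf{V}\boldsymbol{\alpha})
   + \tfrac{\mu}{2}\,\boldsymbol{\alpha}^{\top}\mathbf{D}^{\top}\mathbf{D}\,\boldsymbol{\alpha}
   + \text{const.},\\
\nabla_{\boldsymbol{\alpha}}\big[\mathcal{C}+\mu\,\mathcal{R}\big] = \boldsymbol{0}
  &\;\Longleftrightarrow\;
  \big(\mathbf{V}\widehat{\boldsymbol{\Sigma}}^{-1}\mathbf{V}
     + \mu\,\mathbf{D}^{\top}\mathbf{D}\big)\,\boldsymbol{\alpha}
   = \mathbf{V}\widehat{\mathbb{L}}\,\widehat{\mathbb{L}}^{\top}\boldsymbol{x}
   = \mathbf{V}\widehat{\mathbb{L}}\,\widehat{\mathbb{S}},\\
\mu = 0
  &\;\Longrightarrow\;
  \widehat{\boldsymbol{\alpha}}
   = \mathbf{V}^{-1}\widehat{\boldsymbol{\Sigma}}\,\mathbf{V}^{-1}\,
     \mathbf{V}\,\widehat{\boldsymbol{\Sigma}}^{-1}\boldsymbol{x}
   = \mathbf{V}^{-1}\boldsymbol{x}.
\end{aligned}
```

The last line recovers Eq. (27), and the second line shows that only a linear system of size L has to be solved for each value of μ.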

Setting the value of the hyperparameter μ requires some adaptation to both the spectrum smoothness and the integrated flux. In order to obtain a detection and characterization method that is fully automatic, we investigated several strategies to select the value of μ automatically: (i) the generalized maximum likelihood (GML), also known as the evidence method (Wahba 1985; MacKay 1992), which first marginalizes the joint distribution p(x, α|μ, ϕ₀) with respect to the unknown spectrum α, then maximizes the so-called generalized likelihood p(x|μ, ϕ₀) with respect to μ; (ii) the generalized cross-validation (GCV; Craven & Wahba 1978; Golub et al. 1979), which approximates the error obtained by a leave-one-out validation strategy and is agnostic of the noise variance; (iii) Stein’s unbiased risk estimator (SURE; Stein 1981). Estimating μ with SURE led to the best overall performance in our simulations, with a slight improvement over GML and a clear gain with respect to GCV; see Appendix G.

After spectral whitening, the vector of whitened S/N_ℓ values is distributed according to a centered Gaussian distribution. The unbiased risk estimate provided by SURE is then (Thompson et al. 1991):

$$ \widehat{\text{risk}} = \tfrac{1}{L}\Vert(\mathbf{I}-\mathbf{A}(\mu))\widehat{\mathbb{S}}\Vert_2^2+\tfrac{2}{L}\,\text{tr}(\mathbf{A}(\mu))-1\,, $$ (33)

where $\mathbf{A}(\mu)=\widehat{\mathbb{L}}^{\top}\mathbf{V}\mathbf{M}(\mu)$. Therefore, the parameter $\widehat{\mu}^{\text{(SURE)}}$ that minimizes the risk⁷ estimate is given by:

$$ \widehat{\mu}^{\text{(SURE)}} = \underset{\mu}{\arg\min}\;\Vert(\mathbf{I}-\mathbf{A}(\mu))\widehat{\mathbb{S}}\Vert_2^2+2\,\text{tr}(\mathbf{A}(\mu)). $$ (34)

Estimating $\widehat{\mu}^{\text{(SURE)}}$ is a one-dimensional minimization that can be performed, for instance, by a golden section search (Brent 1973). Once $\widehat{\mu}^{\text{(SURE)}}$ is obtained, the vector of fluxes is computed with the linear estimator of Eq. (32): $\widehat{\boldsymbol{\alpha}}^{\text{(MAP)}}=\mathbf{M}(\widehat{\mu}^{\text{(SURE)}})\,\widehat{\mathbb{S}}$.
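The one-dimensional search for μ can be sketched as follows. This is a deliberately simplified toy case rather than the authors' implementation: it assumes unit variances (V = I), no spectral correlation (identity whitening matrix), and D = I, so that A(μ) reduces to I/(1 + μ) and the minimizer of Eq. (34) is known in closed form, μ = L/(‖Ŝ‖² − L). The golden section routine is hand-written for illustration, the search is carried out on log μ, and the S/N values are hypothetical.

```python
import math

INV_PHI = (math.sqrt(5.0) - 1.0) / 2.0   # inverse golden ratio, about 0.618

def golden_section_min(f, lo, hi, tol=1e-10):
    """Minimizer of a unimodal function f on [lo, hi]."""
    c = hi - INV_PHI * (hi - lo)
    d = lo + INV_PHI * (hi - lo)
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc < fd:                      # minimum lies in [lo, d]
            hi, d, fd = d, c, fc
            c = hi - INV_PHI * (hi - lo)
            fc = f(c)
        else:                            # minimum lies in [c, hi]
            lo, c, fc = c, d, fd
            d = lo + INV_PHI * (hi - lo)
            fd = f(d)
    return 0.5 * (lo + hi)

def sure_objective(mu, s):
    """Objective of Eq. (34) in the toy case A(mu) = I / (1 + mu)."""
    shrink = 1.0 / (1.0 + mu)
    residual = (1.0 - shrink) ** 2 * sum(si * si for si in s)
    return residual + 2.0 * len(s) * shrink

s = [3.0, 4.0, 5.0, 4.0, 3.0, 2.0]       # hypothetical whitened S/N values
log_mu = golden_section_min(lambda t: sure_objective(math.exp(t), s),
                            math.log(1e-6), math.log(1e3))
mu_hat = math.exp(log_mu)
mu_closed_form = len(s) / (sum(si * si for si in s) - len(s))
print(mu_hat, mu_closed_form)
```

In the general case, evaluating the objective requires forming A(μ) from Eq. (32) at each trial value of μ; the search itself is unchanged.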

5. Results

5.1. Datasets description

In this section, we assess the performance of the proposed PACO ASDI algorithm both in terms of source detection and spectrum estimation. The results are compared to two standard algorithms: TLOCI and KLIP. These two algorithms are currently used for the exploitation of the SPHERE science data (Lagrange et al. 2019a; Mesa et al. 2019b; Gratton et al. 2019; Gibbs et al. 2019; Maire et al. 2019). In the following, they are applied both in ADI and ASDI mode. We used the SpeCal (Galicher et al. 2018) implementation of TLOCI-A(S)DI and KLIP-A(S)DI algorithms which is the current reference and well documented post-processing standard of the SPHERE consortium. The TLOCI implementation is based on the algorithm as described in Galicher et al. (2011), Marois et al. (2014). In particular, it implements a frame selection strategy for the estimation of the stellar PSF to mitigate the self-subtraction of the putative sources. A flat spectral template was considered in our experiments. The KLIP implementation is based on the algorithm as described in Soummer et al. (2012). No frame selection strategy was implemented to minimize the self-subtraction of the putative sources when conducting the PCA. There are several variants of these implementations, especially for the KLIP-ASDI method. In this paper, we used only the publicly available variant of the SPHERE consortium. For all algorithms from the SpeCal software, the S/N maps are computed after annular normalization of the residual images (obtained after subtraction of the estimated on-axis PSF) by the empirical standard deviation of the noise. Besides, with SpeCal, the self-subtraction phenomenon is calibrated before the reduction using massive injections of fake exoplanets to derive the radial throughput of the algorithms as a function of the angular separation. The estimated exoplanet spectra are then compensated for this calibrated correction, see Delorme et al. (2017) for detailed procedures. 
In ADI mode, each spectral channel is processed independently. A combined detection map is obtained by a simple summation of the detection maps from the different spectral channels. The spectrum is estimated by a photometric estimation per wavelength using the ADI PSF model. We considered the ADI mode because it is common practice to process each spectral channel of an ASDI series independently, since the exoplanet signatures are usually present in the redder channels.

For the comparisons, we selected four datasets from the SPHERE-IFS instrument obtained under various observing conditions, leading to different degrees of difficulty for the detection and estimation tasks. The four datasets are obtained from the SPHERE-IFS raw data using the pre-reduction and handling pipeline of the SPHERE consortium (Pavlov et al. 2008). Background, flat-field, bad pixels, registration, true-North, wavelength and astrometric calibrations are also performed during this step; see Pavlov et al. (2008), Zurlo et al. (2014), and Maire et al. (2016) for the detailed procedures. These processing steps are followed by additional steps implemented at the SPHERE Data Center (Delorme et al. 2017) to refine the wavelength calibration, reduce the cross-talk, and mitigate the effect of bad pixels. We did not apply other post-processing such as high-pass filtering. Three of these calibrated datasets are dedicated to the evaluation of the detection performance on fields reported in the literature to contain point sources. The fourth one is used for the evaluation of the spectrum recovery performance, for which we numerically injected fake point sources at low fluxes. The four datasets considered were recorded around the following reference targets:

HR 8799 (HIP 114189), which is an A5V-type star located in the Pegasus constellation. It hosts four confirmed exoplanets, of which only three (HR 8799 c, d, and e) are within the SPHERE-IFS field of view (the last one, HR 8799 b, is visible with the larger field of SPHERE-IRDIS). All of them were discovered (Marois et al. 2008), confirmed (Marois et al. 2010), and widely studied (Bowler et al. 2010; Currie et al. 2011; Marley et al. 2012) by direct imaging. In the following, this dataset is used as a baseline to compare the overall performance of the detection algorithms on sources at a standard level of contrast (between 10⁻⁶ and 10⁻⁵) in the considered spectral band.

β Pictoris (HIP 27321), which is an A6V-type star located in the Pictor constellation. It hosts two known exoplanets (β Pictoris b and c) as well as a protoplanetary disk made of gas and dust. β Pictoris b was discovered (Lagrange et al. 2009) and confirmed (Lagrange et al. 2010) by direct imaging. β Pictoris c was discovered more recently by the radial velocity method (Lagrange et al. 2019b). In the following, we do not consider the presence of β Pictoris c⁸. New datasets around this star made it possible to better constrain the orbit of β Pictoris b and to refine its photometry (Lagrange et al. 2019a). In the following, we use a dataset at a challenging epoch (before the conjunction reported in Lagrange et al. 2019a) since the angular separation of the exoplanet is smaller than 0.15 arcsec, which corresponds to nine pixels away from the coronagraph mask.

HD 131399 (HIP 72940), which is an A1V-type star located in the Centaurus constellation. It forms a triple system with two other stars (HD 131399 B and C) located about 349 au from the brightest star HD 131399 A (De Zeeuw et al. 1999; Dommanget & Nys 2002; Pecaut & Mamajek 2013). This system also hosts a faint point source (HD 131399 Ab) discovered by direct imaging (Wagner et al. 2016) and at first supposed to be a bound exoplanet. However, further joint analysis of GEMINI/GPI and VLT/SPHERE datasets to refine the astrometry and the spectrum estimation of the candidate companion led to the conclusion that HD 131399 Ab is more likely to be a background brown dwarf (Nielsen et al. 2017). In the following, this dataset is used to compare the behavior of detection algorithms in the case of a faint point source falling close to the limit of the instrument’s field of view.

HD 172555 (HIP 92024), which is an A5V-type star located in the Pavo constellation (Schütz et al. 2005; Lisse et al. 2009). The analysis of directly imaged (A)(S)DI series was conducted in several surveys but, to the best of our knowledge, no point source was ever detected (Nielsen et al. 2008; Nielsen & Close 2010). We use this dataset to assess the spectrum estimation of numerically injected fake point sources.

Table 3 summarizes the main observation parameters of these datasets. It can be noted that the observing conditions were not particularly good for three of these datasets (seeing values between 1.20 and 1.43).

Table 3.

Log information for the considered SPHERE-IFS datasets.

Besides, a MATLAB™ implementation of the PACO ASDI routines for computing spectrally combined wwS/N maps is available online⁹.

5.2. Detection performance

5.2.1. Detection results

In the following, the detection criterion for PACO ASDI is based on the wwS/N map since it offers interesting properties in terms of detection sensitivity and controlled false alarm rate when thresholding at 5σ (see Sect. 3.2.3). In our comparison to the state-of-the-art algorithms, we use the final signal-to-noise map (denoted “combined S/N”) provided by the different pipelines. The “combined S/N” map is generally obtained by a weighted mean of the signal-to-noise ratio computed in each channel. This combination is generally followed by a post-processing step via so-called “unsharp filtering” (high-pass filtering) to improve the visual quality of the combined S/N map by attenuating some spurious background artifacts. In PACO ASDI, the spatial whitening and spectral whitening operations can be seen as data-driven and locally adaptive filters. No additional filtering is required to enhance the detection maps produced by PACO or PACO ASDI.

Figures 9 and 10 show the combined detection maps around HR 8799, β Pictoris, and HD 131399 obtained with TLOCI-A(S)DI, KLIP-A(S)DI, and PACO ASDI for two sets of color bars. The detection is performed by thresholding the maps at τ = 5 (corresponding to a PFA = 2.87 × 10⁻⁷), as classically done in direct imaging.
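The false alarm probability quoted for τ = 5 is the one-sided tail of the standard normal distribution, which holds under the assumption that the detection criterion follows 𝒩(0, 1) in the absence of a source. A minimal check with the Python standard library (the function name is illustrative):

```python
import math

def false_alarm_probability(tau):
    """One-sided tail probability P(N(0, 1) > tau)."""
    return 0.5 * math.erfc(tau / math.sqrt(2.0))

pfa = false_alarm_probability(5.0)
print(f"PFA at tau = 5: {pfa:.3e}")
```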

Fig. 9.

Detection maps (wwS/N for PACO ASDI and combined S/N for the other algorithms) around HR 8799, β Pictoris and HD 131399 obtained with TLOCI-ADI, KLIP-ADI (50 modes), TLOCI-ASDI, KLIP-ASDI (10, 50, 100 and 150 modes) and the proposed PACO ASDI algorithm. The color bars are common to all methods and are set between −5 and the highest true detection peak provided by PACO ASDI (except for HR 8799, for which the color bar is set between −5 and 42.7, corresponding to the wwS/N value of HR 8799 e with PACO ASDI). The detection threshold is set at τ = 5 and the values above this threshold are classified as true detections (yellow circles) and false detections (red squares and polygons). Missed detections are indicated by pink triangles. The value of the largest false alarm is also indicated in red on each map. Black arrows point at areas with constant values.

Fig. 10.

Same as Fig. 9, except that the color bars are adapted to each method and are set between −5 and the detection peak associated with one of the real sources to be detected (respectively HR 8799 e, which is the closest to the host star, β Pictoris b, and HD 131399 Ab).

State-of-the-art algorithms lead to strong radial background artifacts, likely caused by a mismodeling of the spatial and spectral correlations, since they take the form of the typical structures observed in the GLRT⁺ maps obtained with PACO ASDI (see Fig. 7a) when the spectral correlations are neglected. In addition, the detection seems very difficult in the regions close to the borders of the instrument’s field of view due to strong artifacts, in particular with TLOCI-ADI and KLIP-ADI, possibly due to a mismodeling of the aberrant data occurring on the borders. For example, distinguishing the signature of the exoplanet HR 8799 c (top-left corner of the field of view) from the artifacts seems almost impossible with TLOCI-ADI and KLIP-ADI. In ASDI mode, the quality of the detection maps is generally not significantly better in these areas due to artifacts (in particular with TLOCI-ASDI) or regions with zeros or saturated values obtained with KLIP-ASDI (indicated by arrows in the figures), possibly caused by the absence of explicit modeling of areas with missing data for the longest wavelengths. All these sources of artifacts severely limit the workable field of view in which an exoplanet can actually be detected. The portion of the field of view impacted by missing or aberrant data increases with the parallactic rotation and with the ratio λ_max/λ_min. It reaches more than 20% for TLOCI-ASDI on the β Pictoris and HR 8799 datasets. PACO ASDI provides stationary detection maps on the whole field of view (including in the vicinity of the host star and close to the borders of the field of view) so that a unique detection threshold can be set. The stationarity is explained both by the local statistical modeling of the spatio-spectral correlations of the background and by the explicit consideration of the missing or aberrant data, which are flagged as outliers.

Regarding the detection accuracy of the algorithms, only PACO ASDI ensures a statistically-grounded control of the PFA in the sense that no false alarm is generally observed at the 5σ threshold. Unlike PACO ASDI, all other algorithms considered lead to several false detections in the field of view (many more than expected at 5σ). In practice, astronomers are familiar with the “false alarms issue” and generally differentiate candidate companions from the false alarms by visual inspection of the multi-spectral data and of the reduction results.
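To make "many more than expected at 5σ" concrete, the following minimal sketch computes the expected number of false alarms in a detection map at a given threshold. It assumes that background values follow a standard normal distribution and are independent; the number of tests `N_TESTS` is a hypothetical value chosen for illustration, not a figure taken from this paper.

```python
import math
from typing import Tuple


def gaussian_tail(tau: float) -> float:
    """One-sided tail probability P(X > tau) of a standard normal variable."""
    return 0.5 * math.erfc(tau / math.sqrt(2.0))


def false_alarm_budget(tau: float, n_tests: int) -> Tuple[float, float]:
    """Expected number of false alarms and probability of at least one,
    assuming n_tests independent standard-normal background values."""
    p = gaussian_tail(tau)
    expected = n_tests * p
    # log1p / expm1 keep numerical precision when p is tiny.
    p_at_least_one = -math.expm1(n_tests * math.log1p(-p))
    return expected, p_at_least_one


# Hypothetical number of independent tests in one detection map (assumption).
N_TESTS = 100_000

expected, p_any = false_alarm_budget(5.0, N_TESTS)
print(f"tau = 5: expected false alarms = {expected:.4f}, "
      f"P(at least one) = {p_any:.4f}")
```

Under these assumptions, a well-calibrated map thresholded at τ = 5 should contain far fewer than one false alarm on average, so several peaks above 5 in a source-free region indicate that the map statistics depart from the assumed distribution.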

Regarding the detection sensitivity of the tested algorithms, only PACO ASDI leads to detection peaks significantly higher than the conventional detection threshold (τ = 5) for all the known point sources, even when they are close to the host star (as β Pictoris b) or close to the borders of the field of view (as HD 131399 Ab). For example, for HR 8799, only one source can be unambiguously detected at 5σ with state-of-the-art algorithms on the dataset we have reduced. The other sources can be distinguished by a visual inspection of the detection maps but require lowering the detection threshold to about 3.5 for the exoplanet HR 8799 e and to 4.3 for the exoplanet HR 8799 d. The best results with existing methods are obtained with KLIP-ADI, leading to the detection of HR 8799 c (top-left corner of the field of view) at a combined S/N equal to 66.9. On the same dataset, the three known exoplanets are detected with PACO ASDI and the wwS/N reaches 188.2 for HR 8799 c. For β Pictoris and HD 131399, only PACO ASDI can detect the known sources without any false alarm in the field of view. Lowering the detection threshold with TLOCI-A(S)DI and KLIP-A(S)DI leads to several false alarms in the field of view, thus preventing automatic detection of the sources. Moreover, our experiments tend to show that in certain cases it is not so easy to distinguish true detections from false alarms via visual inspection: on the considered β Pictoris and HD 131399 datasets processed with the TLOCI-A(S)DI and KLIP-A(S)DI algorithms, it seems difficult to visually discriminate β Pictoris b (combined S/N ∈ [1.9;3.3]) and HD 131399 Ab (combined S/N ∈ [0.2;2.0]) from false alarms since the shapes of the detection peaks (blobs spatially correlated on a few pixels) are sometimes quite similar. The case of HD 131399 is interesting since the dataset we consider in this paper was already processed by other authors. For example, Nielsen et al. (2017) report the detection of HD 131399 Ab at a combined S/N between 4.0 and 6.3 with the cADI algorithm (Marois et al. 2006; Lagrange et al. 2010). The detection values reported with the cADI method are higher than the values obtained in our experiments with the TLOCI-A(S)DI and KLIP-A(S)DI algorithms but they remain significantly lower than the value obtained with PACO ASDI (wwS/N = 10.3). The difference between TLOCI and KLIP on the one hand and cADI on the other could be explained by the low “aggressivity” of the cADI method (i.e., low self-subtraction and reduced amount of artifacts due to the ADI strategy), which seems to be particularly well suited to this target located very near the borders of the field of view. However, while the cADI detection map presented in Fig. 2 (top-left corner) of Nielsen et al. (2017) allows HD 131399 Ab to be identified, several point-like features are also falsely detected with a comparable or higher level of S/N, especially near the coronagraph.

5.2.2. Achievable contrast

In this section, we compare the minimal contrast required to achieve a detection with PACO ASDI to the contrasts of TLOCI and KLIP. As is classically done in the literature, we derive the so-called “5σ contrast curves” representing the minimum contrast at which a source is still detected with a probability of detection PD = 0.5 when the detection threshold is set to obtain a probability of false alarms PFA = 2.87 × 10⁻⁷. This achievable contrast can be computed for the single-wavelength detection maps. As detailed in Sect. 3.2.3, the 5σ contrast in channel ℓ is $5\sqrt{v_{\ell}}$ (the minimum contrast α_ℓ of the source in the spectral channel ℓ so that P(S/N_ℓ > τ) = 0.5). As detailed in Sect. 3.2.3, the combination of the S/N_ℓ maps improves the achievable contrast. When the combined detection map wwS/N is used, the achievable contrast is given by Eq. (22) if the prior spectrum perfectly matches the source spectrum. In practice, this theoretical lower bound is not reached, for the reasons discussed in Sect. 3.2.3 and, as with PACO, because at the detection stage the S/N_ℓ values are underestimated, the background statistics being estimated in the presence of the source. From our experience, values of the contrast achieved for single-wavelength detections are typically reached in practice with S/N_ℓ and can thus be used as a safe value of the achievable contrast.
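The relation between the threshold τ = 5, the probability of false alarms, and the contrast $5\sqrt{v_{\ell}}$ can be checked numerically. The sketch below is only illustrative: it assumes that S/N_ℓ is Gaussian with unit variance, with zero mean in the absence of a source and mean $\alpha_\ell/\sqrt{v_\ell}$ in its presence, where $v_\ell$ is interpreted as the variance of the estimated flux. The function names and the numerical value of `v_channel` are hypothetical.

```python
import math

TAU = 5.0  # conventional detection threshold


def false_alarm_probability(tau: float = TAU) -> float:
    """P(S/N > tau) at a source-free location, assuming S/N ~ N(0, 1)."""
    return 0.5 * math.erfc(tau / math.sqrt(2.0))


def detection_probability(alpha: float, variance: float, tau: float = TAU) -> float:
    """P(S/N > tau) for a source of contrast alpha,
    assuming S/N ~ N(alpha / sqrt(variance), 1)."""
    mean_snr = alpha / math.sqrt(variance)
    return 0.5 * math.erfc((tau - mean_snr) / math.sqrt(2.0))


def achievable_contrast(variance: float, tau: float = TAU) -> float:
    """Contrast detected with probability 0.5 at threshold tau: tau * sqrt(variance)."""
    return tau * math.sqrt(variance)


# Hypothetical variance of the estimated flux in one channel (illustrative only).
v_channel = 4.0e-12
alpha_min = achievable_contrast(v_channel)

print(f"PFA at tau = 5: {false_alarm_probability():.3e}")
print(f"5-sigma contrast: {alpha_min:.2e}")
print(f"PD at that contrast: {detection_probability(alpha_min, v_channel):.3f}")
print(f"PD at twice that contrast: {detection_probability(2 * alpha_min, v_channel):.6f}")
```

A source exactly at the 5σ contrast is detected half of the time, which is why the contrast curve marks a detection limit and does not guarantee detection.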

Figure 11a gives the achievable 5σ contrast curves obtained with TLOCI-A(S)DI, KLIP-A(S)DI, and PACO ASDI on HR 8799 and β Pictoris. For PACO ASDI, both the S/N_ℓ contrast and the combined wwS/N contrast are represented. Considering the S/N_ℓ contrast curves of PACO ASDI, a clear gain is observed at small angular separations (≤0.7 arcsec) compared with the state-of-the-art algorithms. At larger separations, this gain is maintained except for KLIP-ASDI, which can reach better contrasts. However, as already observed and discussed in our previous paper (Flasseur et al. 2018a) on ADI series, contrast curves produced by state-of-the-art algorithms are often optimistic both in terms of PD and PFA. These observations also hold here in ASDI mode: all detection maps of state-of-the-art algorithms present many more false alarms than would be expected at 5σ. According to Fig. 11, far from the β Pictoris star, the best achievable contrast is reached with KLIP-ASDI (150 modes) and PACO ASDI, which converge towards the same detection limit. However, Figs. 9 and 10 show that KLIP-ASDI produces several false alarms in the field of view and a lower signal-to-noise ratio for the exoplanet compared to PACO ASDI. With KLIP-ASDI, the largest value of a false alarm is significantly higher than 5.0, while it should be very unlikely to have a background value larger than 4.0. This illustrates that the 5σ contrast of state-of-the-art algorithms does not correspond to the expected level of PFA and can thus only be used to perform relative comparisons. We also give the contrast reached by PACO ASDI when single-wavelength detection maps are combined. A theoretical gain (see Eq. (21)) slightly less than one order of magnitude is expected due to the combination, according to Fig. 11. When comparing the S/N_ℓ values of the point sources in the single-detection maps of HR 8799 shown in Fig. 6b (values in the range 1.5–5.5) to the values in the combined detection maps of Figs. 9 and 10 (wwS/N = 42.7), the improvement is in relatively good agreement with the contrast curves of Fig. 11a.

Fig. 11.

Achievable 5σ contrast on HR 8799 and β Pictoris. a: contrast curves obtained with TLOCI-A(S)DI, KLIP-A(S)DI, and PACO ASDI. All curves correspond to the mean contrast over the spectral channels. For PACO ASDI, the solid red line is for the spectral mean S/N_ℓ contrast while the dashed pink line is for the spectral mean wwS/N contrast. The wwS/N contrast is the theoretical lower bound given by Eq. (22) when several spectral channels are combined. Contrast curves as provided by KLIP and TLOCI do not correspond to a 5σ false alarm rate, contrary to the contrast curves of PACO ASDI. The achievable contrasts are thus significantly over-optimistic for KLIP and TLOCI, see discussion in the text (Sect. 5.2.2). b: examples of 2D S/N_ℓ contrast maps obtained with PACO ASDI for four spectral channels: ℓ₁ = 0.9575 μm, ℓ₁₃ = 1.1589 μm, ℓ₂₅ = 1.3915 μm and ℓ₃₇ = 1.6054 μm. The superimposed white circles represent the locations of the known exoplanets.

As for PACO (Flasseur et al. 2018a), the achievable contrast of PACO ASDI can be computed at every point of the field of view. Figure 11b gives examples of 2D contrast maps obtained with PACO ASDI for some selected spectral channels. They show that the detection is more favorable in certain spectral channels than in others. For example, the achievable contrast in spectral channel ℓ₂₅ = 1.3915 μm is about two times worse than the one obtained in spectral channel ℓ₁₃ = 1.1589 μm. This can be explained by the presence of large spectral variations of the intensity fluctuations or additional noise probably caused by the low atmospheric transmission. Interestingly, these maps indicate that the achievable contrast varies significantly along an annulus of fixed angular separation. This is particularly the case near the host star since the residual central halo is not isotropic. Thus, the 2D contrast information can be useful to derive an accurate estimation of the achievable contrast given the angular location of a detected source.
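The azimuthal variation of the achievable contrast can be illustrated with a toy example. The sketch below builds a synthetic, anisotropic variance map (purely illustrative, not SPHERE data), converts it into a 5σ contrast map using the single-channel relation $5\sqrt{v_\ell}$, and compares the values found along one annulus. The variance model, the map size, and all function names are assumptions made for this illustration.

```python
import math
from statistics import median


def contrast_map(variance_map, tau=5.0):
    """Pixel-wise achievable contrast tau * sqrt(variance)."""
    return [[tau * math.sqrt(v) for v in row] for row in variance_map]


def annulus_values(image, center, radius, half_width=0.5):
    """Values of the pixels whose distance to the center is within
    half_width of the requested radius (all lengths in pixels)."""
    cy, cx = center
    values = []
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            if abs(math.hypot(y - cy, x - cx) - radius) <= half_width:
                values.append(value)
    return values


SIZE = 41
CENTER = (20, 20)


def toy_variance(y, x):
    """Synthetic variance: decreases with separation and depends on the
    azimuth, mimicking a non-isotropic residual halo (assumption)."""
    dy, dx = y - CENTER[0], x - CENTER[1]
    r = max(math.hypot(dy, dx), 1.0)
    theta = math.atan2(dy, dx)
    return 1e-10 * (1.0 + 0.5 * math.cos(2.0 * theta)) / r ** 2


variance = [[toy_variance(y, x) for x in range(SIZE)] for y in range(SIZE)]
contrast = contrast_map(variance)
ring = annulus_values(contrast, CENTER, radius=10.0)

print(f"pixels in annulus: {len(ring)}")
print(f"min / median / max contrast: "
      f"{min(ring):.2e} / {median(ring):.2e} / {max(ring):.2e}")
```

In such a situation, a single azimuthally averaged contrast value would be pessimistic for some position angles and optimistic for others, which is the motivation for reading the achievable contrast at the location of the detected source.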

5.3. Spectrum estimation performance

In this section, we evaluate the performance of the spectrum extraction of point sources. As mentioned in Sect. 1, the recovered spectra are not corrected for the stellar spectrum in this paper.

We first use numerical injections of fake point sources for the quantitative characterization of PACO ASDI. For this purpose, we use the dataset around HD 172555 (see Sect. 5.1) with no detectable source. Figure 12 gives the wwS/N map obtained with PACO ASDI and the combined signal-to-noise ratio maps obtained with other algorithms on this dataset, before injecting the fake point sources. While state-of-the-art algorithms reveal several areas above the detection threshold at τ = 5, experts did not identify consistent detection peaks upon closer inspection. In addition, PACO ASDI does not identify any significant detection peak at 5σ. We inject 12 fake point sources in the field of view with a variety of mean contrasts and true spectra. Table 4 gives astrometric and photometric information about the fake point sources. Sources #1 to #6 have a mean flux lower than or equal to 5 × 10⁻⁵ while sources #7 to #12 have a mean flux lower than or equal to 8.5 × 10⁻⁶. Figure 13 presents the wwS/N maps obtained with PACO ASDI around HD 172555 with fake point sources #1 to #6 (left) and #7 to #12 (right) simultaneously injected. Figure 14 gives contrast curves at 5σ obtained on the considered dataset with PACO ASDI compared with TLOCI-ADI. Contrast curves indicate that sources #1 to #6, #8, #10, and #12 can be detected at 5σ by PACO ASDI while the sources #7, #9, and #11 are too faint to be detected from the S/N_ℓ maps at the considered angular separations, see Fig. 13.
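The comparison between the mean contrast of an injected source and the contrast curve at its angular separation can be sketched as follows. The contrast curve samples and the two sources below are hypothetical values chosen for illustration; they are not the values of Fig. 14 or Table 4, and the piecewise-linear interpolation is an assumption of this sketch.

```python
from bisect import bisect_left


def interpolate_contrast(separations, contrasts, sep):
    """Piecewise-linear interpolation of a contrast curve, clamped at both ends.
    separations must be sorted in increasing order (arcsec)."""
    if sep <= separations[0]:
        return contrasts[0]
    if sep >= separations[-1]:
        return contrasts[-1]
    i = bisect_left(separations, sep)
    s0, s1 = separations[i - 1], separations[i]
    c0, c1 = contrasts[i - 1], contrasts[i]
    return c0 + (c1 - c0) * (sep - s0) / (s1 - s0)


def is_detectable(source_contrast, source_sep, separations, contrasts):
    """True when the source contrast reaches the curve at its separation."""
    limit = interpolate_contrast(separations, contrasts, source_sep)
    return source_contrast >= limit


# Hypothetical 5-sigma contrast curve (separation in arcsec, contrast).
CURVE_SEP = [0.1, 0.2, 0.4, 0.8]
CURVE_CONTRAST = [1e-4, 2e-5, 5e-6, 2e-6]

# Hypothetical injected sources: name -> (separation in arcsec, mean contrast).
FAKE_SOURCES = {"A": (0.2, 5e-5), "B": (0.3, 8e-6)}

for name, (sep, contrast) in FAKE_SOURCES.items():
    verdict = is_detectable(contrast, sep, CURVE_SEP, CURVE_CONTRAST)
    print(f"source {name}: separation {sep} arcsec, contrast {contrast:.1e}, "
          f"detectable at 5 sigma: {verdict}")
```

Such a check uses only the mean contrast, so a source below the curve on average may still exceed the limit in a few spectral channels.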

Fig. 12.

Detection maps (wwS/N for PACO ASDI and combined S/N for the other algorithms) around HD 172555 in the absence of fake point sources.

Fig. 13.

wwS/N maps obtained with PACO ASDI around HD 172555 with injected fake point sources #1 to #6 (left) and #7 to #12 (right).

Fig. 14.

Contrast curves at 5σ obtained on HD 172555 with PACO ASDI compared with TLOCI-ADI. The mean contrast of the fake faint point sources #1 to #12 is marked by orange points.

Table 4.

Angular separation, minimum, maximum, and mean contrast of the 12 fake point sources injected in the fourth dataset.

Figure 15 shows the spectrum estimation for the nine detectable fake sources obtained by PACO ASDI, TLOCI-ADI, and TLOCI-ASDI. These results show that the spectrum estimations provided by PACO ASDI are in good agreement with the ground truth since no systematic photometric bias can be observed. We note only one significant discrepancy between the spectrum estimated by PACO ASDI and the ground truth, occurring for source #8 between 1.07 μm and 1.22 μm. This discrepancy can be explained by the very faint source contrast in this spectral band (lower than 5 × 10⁻⁶), the proximity to the host star (angular separation equal to 0.187 arcsec), and the presence of a “dark” speckle near the injection leading to a negative estimated source flux (thresholded at 0) at the corresponding wavelengths. The predicted spectrum confidence intervals also seem consistent with the empirical standard deviation of the estimates. The spectrum estimates obtained by PACO ASDI are qualitatively much better than those obtained with TLOCI.

Fig. 15.

Estimated spectrum of the detectable fake faint point sources (#1 to #6 plus #8, #10, and #12) obtained with TLOCI-ADI (blue), TLOCI-ASDI (orange), and PACO ASDI (green). The spectrum ground truths of the different fake faint point sources are marked by black lines. The given 1σ confidence intervals are those predicted by the considered algorithms. A zoom around the ground truth is added for sources #10 and #12.

To complete this statistical study, we consider three sources from the 12 previous ones (sources #5, #10, and #12) for which we perform 30 Monte–Carlo injections/spectrum estimations over a circular annulus (i.e., at constant angular separations). Figure 16 gives the 30 estimated spectra obtained with TLOCI-A(S)DI, KLIP-ADI, and PACO ASDI and the mean estimations. The confidence intervals provided by the different algorithms can be compared to the empirical 1σ confidence intervals. Table 5 complements this figure with statistical results in terms of photometric bias, agreement between the predicted and the empirical confidence intervals, and mean square error (MSE). These results show that PACO ASDI is photometrically unbiased in the sense that the photometric bias is negligible (about ±1% of the mean contrast of the sources) without resorting to Monte–Carlo methods to estimate and compensate for the potential source self-subtraction phenomenon, as is common practice with most of the state-of-the-art algorithms. In comparison, this relative photometric bias reaches in most of the cases 4% of the mean contrast of the sources with other methods (except for source #12 with TLOCI-ADI). The results of state-of-the-art methods are generally not better in ASDI than in ADI mode. This could be explained by the stronger source self-subtraction that occurs when several spectral channels are processed jointly. This observation could illustrate why experts also tend to apply ADI detection and/or characterization algorithms on ASDI datasets by processing each spectral channel independently (Nielsen et al. 2017; Perrot et al. 2019; Gibbs et al. 2019; Maire et al. 2020, see also Maire et al. 2014; Rameau et al. 2015 for the challenge of ASDI processing). The empirical confidence intervals are also smaller with PACO ASDI. The RMSE of the estimated spectra is reduced by a factor of at least two by PACO ASDI with respect to other algorithms (on average, by a factor of 3.6 for the three sources analyzed in Table 5). Moreover, the confidence intervals provided by the method are in good agreement with the observed standard deviations of the Monte–Carlo simulation: the ratio between the predicted standard deviation and the empirical standard deviation is between 0.94 and 1.21 for PACO ASDI, which is closer to one than for other methods (i.e., PACO ASDI confidence intervals are more reliable).
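The three figures of merit used in this Monte–Carlo validation (relative photometric bias, ratio between predicted and empirical standard deviations, and RMSE) can be computed as in the minimal sketch below for a single spectral channel. The estimates are drawn from a synthetic Gaussian generator with a fixed seed; the true contrast, the noise level, and the function name are assumptions made for illustration and do not reproduce the values of Table 5.

```python
import math
import random
from statistics import mean, stdev


def monte_carlo_summary(estimates, truth, predicted_sigma):
    """Summary statistics of repeated flux estimates in one spectral channel.

    estimates: flux estimates from the Monte-Carlo injections,
    truth: injected (ground truth) flux,
    predicted_sigma: standard deviation predicted by the algorithm.
    """
    bias = mean(estimates) - truth
    empirical_sigma = stdev(estimates)
    rmse = math.sqrt(mean([(e - truth) ** 2 for e in estimates]))
    return {
        "relative_bias": bias / truth,
        "sigma_ratio": predicted_sigma / empirical_sigma,
        "rmse": rmse,
    }


# Synthetic experiment (assumption): 30 unbiased Gaussian estimates.
rng = random.Random(0)
TRUE_CONTRAST = 1.0e-5
NOISE_SIGMA = 1.0e-6
estimates = [rng.gauss(TRUE_CONTRAST, NOISE_SIGMA) for _ in range(30)]

summary = monte_carlo_summary(estimates, TRUE_CONTRAST, predicted_sigma=NOISE_SIGMA)
for key, value in summary.items():
    print(f"{key}: {value:.3g}")
```

With only 30 draws, the empirical standard deviation itself fluctuates by roughly ten percent, so a ratio between predicted and empirical standard deviations that departs slightly from one is expected even for perfectly calibrated confidence intervals.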

Fig. 16.

Monte–Carlo estimated spectrum for fake point sources #5, #10 and #12 obtained with TLOCI-ADI, KLIP-ADI (5 modes), TLOCI-ASDI and PACO ASDI. For each of the three considered sources, the 30 Monte–Carlo spectrum estimations are shown as gray lines. Red and blue lines compare the 1σ predicted confidence intervals to the empirical ones centered on the mean estimated spectra over the 30 Monte–Carlo estimations. The spectrum ground truth of the considered sources is in black.

Table 5.

Monte–Carlo validation of spectrum estimation methods.

We also illustrate astrometric and spectrum estimations on the real point sources in the first three datasets. Table 6 presents the estimated astrometry and Fig. 17 gives the estimated spectra with PACO ASDI of the exoplanets HR 8799 c-d-e, β Pictoris b, and the background source HD 131399 Ab. The datasets are processed in a totally automatic fashion as described in Flasseur et al. (2018c) for ADI datasets. A joint refinement of the astrometry and photometry estimations of the source with the highest wwS/N is performed. Its estimated flux contribution is then subtracted from the data and the wwS/N map is updated using a conventional cleaning approach (see Fig. 17 for cleaned wwS/N examples). This procedure is repeated until no source at a significant level of signal-to-noise ratio is detected in the wwS/N map. The estimated spectra are quite smooth and consistent from one spectral channel to the next, with reliable confidence intervals. The spectrum extraction is especially challenging for HD 131399 Ab since it is located near the borders of the SPHERE-IFS field of view (see Figs. 9 and 10), and the observing conditions of the considered dataset were not particularly good (seeing about 1.30, see Table 3). Nielsen et al. (2017) use photometric and astrometric estimations from VLT/SPHERE and GEMINI/GPI to show that HD 131399 Ab is more likely a background brown dwarf. This result was a revision of the previous status of HD 131399 Ab, first considered as an exoplanet just after its discovery and confirmation on the basis of (noisy) extracted astrometry and photometry (Wagner et al. 2016). The difficulty of ascertaining that HD 131399 Ab was an exoplanet based on the first observations illustrates the importance of algorithms that can provide reliable astrometric and spectrum estimations. We expect PACO ASDI to help refine the estimated orbit and the spectral characterization of future candidate companions.
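The iterative detect-and-subtract loop described above can be sketched on a toy detection map. This is a deliberately simplified illustration and not the PACO ASDI implementation: the sources are modeled as Gaussian blobs directly in the detection map, and the contribution of each detected source is subtracted from the map itself, whereas the procedure described in the text subtracts the estimated flux from the data and recomputes the wwS/N map. The blob model, its width, and all function names are assumptions.

```python
import math


def gaussian_blob(shape, center, peak, sigma=1.5):
    """Toy signature of a point source in a detection map (assumed Gaussian)."""
    height, width = shape
    cy, cx = center
    return [[peak * math.exp(-((y - cy) ** 2 + (x - cx) ** 2) / (2.0 * sigma ** 2))
             for x in range(width)] for y in range(height)]


def argmax_2d(image):
    """Coordinates (row, column) of the largest value of a 2D list."""
    best = (0, 0)
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            if value > image[best[0]][best[1]]:
                best = (y, x)
    return best


def clean_detections(snr_map, tau=5.0, max_sources=10):
    """Repeatedly pick the highest peak above tau and remove its signature."""
    residual = [row[:] for row in snr_map]
    shape = (len(residual), len(residual[0]))
    detections = []
    for _ in range(max_sources):
        y, x = argmax_2d(residual)
        peak = residual[y][x]
        if peak < tau:
            break  # no remaining source at a significant level
        detections.append((y, x, peak))
        model = gaussian_blob(shape, (y, x), peak)
        residual = [[r - m for r, m in zip(res_row, mod_row)]
                    for res_row, mod_row in zip(residual, model)]
    return detections, residual


# Noise-free toy map with two hypothetical sources.
SHAPE = (32, 32)
snr_map = [[0.0] * SHAPE[1] for _ in range(SHAPE[0])]
for center, peak in [((8, 10), 40.0), ((20, 22), 12.0)]:
    blob = gaussian_blob(SHAPE, center, peak)
    snr_map = [[a + b for a, b in zip(row_a, row_b)]
               for row_a, row_b in zip(snr_map, blob)]

detections, residual = clean_detections(snr_map)
for y, x, peak in detections:
    print(f"detected source at (row={y}, col={x}) with peak {peak:.1f}")
print(f"largest residual value: {max(max(row) for row in residual):.2e}")
```

Processing the brightest source first limits the risk that its signature masks or biases fainter neighbors, which is the usual rationale for this kind of greedy cleaning.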

Fig. 17.

Estimated spectra using PACO ASDI of the known real faint point sources of the considered datasets (top: HR 8799 c–d–e, middle: β Pictoris b, bottom: HD 131399 Ab). The inserts show a residual wwS/N map after “cleaning” the contribution of the detected sources. Synthetic subpixel views (4 nodes per pixel) show with false colors the aggregated flux of the detected sources along the different spectral channels (blue for ℓ₁ = 0.9575 μm and red for ℓ_L = 1.6357 μm, with L = 39).

Table 6.

Astrometry (separation and true-north aligned angle) estimated with PACO ASDI for the known real faint point sources of the datasets used.

6. Conclusion

ASDI observations provide very rich data for the detection and characterization of point sources such as exoplanets. Data processing algorithms form an important component in high contrast imaging. Despite extensive efforts in the design of coronagraphic instruments, the separation of the signal of interest (off-axis sources) from the background signal of the on-axis star has to be performed under adverse conditions: large temporal and spectral fluctuations of the intensity of the background, strong correlations, and contamination by outliers. Data processing algorithms must be designed to be robust to all these characteristics of the noise. We have shown in this paper that data-driven statistical modeling paves the way to reliable source detection and characterization methods.

PACO ASDI, the data processing algorithm introduced in this paper, produces detection maps with improved sensitivity compared to existing methods. In particular, the detection maps are obtained through a whitening and weighting strategy accounting for the spectral correlations. We show that this general principle is also beneficial to combine detection maps from other existing detection methods. An important additional feature of PACO ASDI is the control of the probability of false alarms: detection maps can be reliably thresholded. Using the conventional 5σ threshold generally produces no false alarm in an IFS field. This contrasts with detection maps obtained with other methods for which many false alarms are observed, in particular at very small angular separations and close to the borders of the field of view. Full exploitation of the field of view seems to be a feature of PACO ASDI that is shared by few other methods.

The elaborate statistical model of PACO ASDI, accounting for spatial, spectral, and temporal fluctuations, is also used to characterize the astrometry and photometry of the detected point sources. By refining the model of the background jointly with the estimation of the source spectrum, the bias due to source self-subtraction is prevented. The spectrum is estimated using a parameter-free spectral regularization. Our numerical experiments show reduced estimation errors compared to standard methods.

Beyond the direct analysis of ASDI datasets, PACO ASDI also provides information on the achievable contrast, photometric and astrometric accuracies that are reached for given instrumental and observational conditions. The impact of different observation scenarios (spectral coverage, parallactic rotation) can then be assessed using a data-driven model whose prediction accuracy has been validated on real data.

¹ For discrete data, the Karhunen-Loève transform and the PCA are identical procedures.

² In this paper, the term relative spectrum (or more simply spectrum) refers to the estimated spectrum of the detected sources before correction for the stellar spectrum.

³ We note that a reminder of the main notations used throughout the paper is given in Table 1.

⁴ In particular, the choice of a temporal versus spectral mean has been studied. A temporal mean was found better suited to model the background of SPHERE-IFS data. It is straightforward to replace the temporal mean by a spectral mean if needed to model the fluctuations of other instruments.

⁵ Outliers are artifacts taking the form of unexpected fluctuations. These artifacts can be spatially localized (e.g., defective pixels) or can impact a larger part of the field of view when, e.g., a sudden degradation of the adaptive optics correction occurs.

⁶ We recall that $\widehat{\mathbb{L}}\widehat{\mathbb{L}}^{\top} = \widehat{\boldsymbol{\Sigma}}^{-1}$.

⁷ The risk is defined by: $\text{risk} = \mathbb{E}\left[ \| \widehat{\boldsymbol{\alpha}} - \boldsymbol{\alpha} \|^2 \right]$.

⁸ β Pictoris c has a very small angular separation with its host star (0.10-0.15 arcsec at its maximal elongation). For comparison, the inner working angle of the coronagraphs of SPHERE is 0.125 arcsec. Besides, β Pictoris c has a very low contrast given its separation (about 1 × 10⁻⁴ in the H Johnson’s band). For these two reasons, it could be detected by direct imaging only when it is at its maximal elongation, given the current instrumental and processing capabilities, see Lagrange et al. (2019b).

⁹ http://doi.org/10.5281/zenodo.3679426

Acknowledgments

We thank A. Boccaletti (LESIA, Paris, France) who provided the transmission of the SPHERE coronagraphs. We also thank the anonymous Referee for his careful reading of the manuscript as well as his insightful comments and suggestions. This work has made use of the SPHERE Data Centre, jointly operated by OSUG/IPAG (Grenoble, France), PYTHEAS/LAM/CESAM (Marseille, France), OCA/Lagrange (Nice, France), Observatoire de Paris/LESIA (Paris, France), and Observatoire de Lyon/CRAL (Lyon, France).

References

Absil, O., Milli, J., Mawet, D., et al. 2013, A&A, 559, L12
Bailey, V., Meshkat, T., Reiter, M., et al. 2013, ApJ, 780, L4
Beuzit, J.-L., Feldt, M., Dohlen, K., et al. 2008, Proc. SPIE, 7014, 701418
Beuzit, J.-L., Vigan, A., Mouillet, D., et al. 2019, A&A, 631, A155
Bowler, B. P. 2016, PASP, 128, 102001
Bowler, B. P., Liu, M. C., Dupuy, T. J., & Cushing, M. C. 2010, ApJ, 723, 850
Brent, R. P. 1973, Algorithms for Minimization without Derivatives (Englewood Cliffs, NJ: Prentice-Hall)
Cantalloube, F. 2016, Ph.D. Thesis, Grenoble Alpes
Cantalloube, F., Mouillet, D., Mugnier, L., et al. 2015, A&A, 582, A89
Cantalloube, F., Ygouf, M., Mugnier, L., et al. 2018, ArXiv e-prints [arXiv:1812.04312]
Chauvin, G., Desidera, S., Lagrange, A.-M., et al. 2017, A&A, 605, L9
Cheetham, A., Samland, M., Brems, S., et al. 2019, A&A, 622, A80
Claudi, R., Maire, A.-L., Mesa, D., et al. 2019, A&A, 622, A96
Craven, P., & Wahba, G. 1978, Numer. Math., 31, 377
Currie, T., Burrows, A., Itoh, Y., et al. 2011, ApJ, 729, 128
Currie, T., Debes, J., Rodigas, T. J., et al. 2012a, ApJ, 760, L32
Currie, T., Fukagawa, M., Thalmann, C., Matsumura, S., & Plavchan, P. 2012b, ApJ, 755, L34
De Zeeuw, P., Hoogerwerf, R. V., de Bruijne, J. H., Brown, A., & Blaauw, A. 1999, AJ, 117, 354
Delorme, P., Meunier, N., Albert, D., et al. 2017, SF2A-2017: Proceedings, 347
Devaney, N., & Thiébaut, É. 2017, MNRAS, 472, 3734
Dommanget, J., & Nys, O. 2002, VizieR Online Data Catalog: I/274
Flasseur, O., Denis, L., Thiébaut, É., & Langlois, M. 2018a, A&A, 618, A138
Flasseur, O., Denis, L., Thiébaut, É., & Langlois, M. 2018b, in International Society for Optics and Photonics, Adapt. Opt. Syst. VI, 10703, 10703R
Flasseur, O., Denis, L., Thiébaut, É., & Langlois, M. 2018c, in 2018 25th IEEE International Conference on Image Processing (ICIP), 2735
Flasseur, O., Denis, L., Thiébaut, É., & Langlois, M. 2020, A&A, 634, A2
Galicher, R., Marois, C., Macintosh, B., Barman, T., & Konopacky, Q. 2011, ApJ, 739, L41
Galicher, R., Boccaletti, A., Mesa, D., et al. 2018, A&A, 615, A92
Gerard, B. L., Marois, C., Currie, T., et al. 2019, AJ, 158, 36
Gibbs, A., Wagner, K., Apai, D., et al. 2019, AJ, 157, 39
Golub, G. H., Heath, M., & Wahba, G. 1979, Technometrics, 21, 215
Gonzalez, C. G., Absil, O., Absil, P.-A., et al. 2016, A&A, 589, A54
Gonzalez, C., Absil, O., & Van Droogenbroeck, M. 2018, A&A, 613, A71
Gratton, R., Ligi, R., Sissa, E., et al. 2019, A&A, 623, A140
Howard, A. W., Johnson, J. A., Marcy, G. W., et al. 2010, ApJ, 721, 1467
Hubert, M., Rousseeuw, P. J., & Van Aelst, S. 2008, Stat. Sci., 92
Jovanovic, N., Martinache, F., Guyon, O., et al. 2015, PASP, 127, 890
Kendall, M. G., Stuart, A., & Ord, J. K. 1948, The Advanced Theory of Statistics (JSTOR), 1
Keppler, M., Benisty, M., Müller, A., et al. 2018, A&A, 617, A44
Lafrenière, D., Marois, C., Doyon, R., Nadeau, D., & Artigau, E. 2007, ApJ, 660, 770
Lagrange, A.-M., Gratadour, D., Chauvin, G., et al. 2009, A&A, 493, L21
Lagrange, A.-M., Bonnefoy, M., Chauvin, G., et al. 2010, Science, 329, 57
Lagrange, A.-M., Boccaletti, A., Langlois, M., et al. 2019a, A&A, 621, L8
Lagrange, A.-M., Meunier, N., Rubini, P., et al. 2019b, Nat. Astron., 1
Lisse, C. M., Chen, C., Wyatt, M., et al. 2009, ApJ, 701, 2019
Lovis, C., & Fischer, D. 2010, Exoplanets, 27
Macintosh, B., Graham, J., Barman, T., et al. 2015, Science, 350, 64
Macintosh, B., Graham, J. R., Ingraham, P., et al. 2014, Proc. Nat. Acad. Sci., 111, 12661
MacKay, D. J. 1992, Neural Comput., 4, 415
Maire, A.-L., Boccaletti, A., Rameau, J., et al. 2014, A&A, 566, A126
Maire, A.-L., Langlois, M., Dohlen, K., et al. 2016, in International Society for Optics and Photonics, Ground-based Airborne Instrum. Astron. VI, 9908, 990834
Maire, A.-L., Rodet, L., Cantalloube, F., et al. 2019, A&A, 624, A118
Maire, A.-L., Baudino, J.-L., Desidera, S., et al. 2020, A&A, 633, L2
Marley, M. S., Saumon, D., Cushing, M., et al. 2012, ApJ, 754, 135
Marois, C., Lafrenière, D., Doyon, R., Macintosh, B., & Nadeau, D. 2006, ApJ, 641, 556
Marois, C., Macintosh, B., Barman, T., et al. 2008, Science, 322, 1348
Marois, C., Zuckerman, B., Konopacky, Q. M., Macintosh, B., & Barman, T. 2010, Nature, 468, 1080
Marois, C., Correia, C., Véran, J.-P., & Currie, T. 2013, Proc. Int. Astron. Union, 8, 48
Marois, C., Correia, C., Galicher, R., et al. 2014, in International Society for Optics and Photonics, Adapt. Opt. Syst. IV, 9148, 91480U
Mawet, D., Hirsch, L., Lee, E. J., et al. 2019, AJ, 157, 33
Mesa, D., Keppler, M., Cantalloube, F., et al. 2019a, A&A, 632, A25
Mesa, D., Bonnefoy, M., Gratton, R., et al. 2019b, A&A, 624, A4
Morzinski, K. M., Close, L. M., Males, J. R., et al. 2014, in International Society for Optics and Photonics, Adapt. Opt. Syst. IV, 9148, 914804
Mugnier, L. M., Cornia, A., Sauvage, J.-F., et al. 2009, J. Opt. Soc. Am. A, 26, 1326
Müller, A., Keppler, M., Henning, T., et al. 2018, A&A, 617, L2
Nielsen, E. L., & Close, L. M. 2010, ApJ, 717, 878
Nielsen, E. L., Close, L. M., Biller, B. A., Masciadri, E., & Lenzen, R. 2008, ApJ, 674, 466
Nielsen, E. L., Liu, M. C., Wahhaj, Z., et al. 2012, ApJ, 750, 53
Nielsen, E. L., De Rosa, R. J., Rameau, J., et al. 2017, AJ, 154, 218
Nielsen, E. L., De Rosa, R. J., Macintosh, B., et al. 2019, AJ, 158, 13
Pavlov, A., Möller-Nilsson, O., Feldt, M., et al. 2008, in International Society for Optics and Photonics, Adv. Softw. Control Astron. II, 7019, 701939
Pecaut, M. J., & Mamajek, E. E. 2013, ApJS, 208, 9
Perrin, M. D., Sivaramakrishnan, A., Makidon, R. B., Oppenheimer, B. R., & Graham, J. R. 2003, ApJ, 596, 702
Perrot, C., Thebault, P., Lagrange, A.-M., et al. 2019, A&A, 626, A95
Pueyo, L. 2018, Handbook of Exoplanets, 705
Racine, R., Walker, G. A., Nadeau, D., Doyon, R., & Marois, C. 1999, PASP, 111, 587
Rameau, J., Chauvin, G., Lagrange, A.-M., et al. 2015, A&A, 581, A80
Rousseeuw, P. J., & Driessen, K. V. 1999, Technometrics, 41, 212
Ruffio, J.-B., Macintosh, B., Wang, J. J., et al. 2017, ApJ, 842, 14
Santos, N. C. 2008, New Astron. Rev., 52, 154
Savransky, D. 2015, in International Society for Optics and Photonics, Tech. Instrum. Detection Exoplanets VII, 9605, 96050R
Schneider, J., Dedieu, C., Le Sidaner, P., Savalle, R., & Zolotukhin, I. 2011, A&A, 532, A79
Schütz, O., Meeus, G., & Sterzik, M. 2005, A&A, 431, 175
Simon, M. K. 2007, Probability Distributions Involving Gaussian Random Variables: A Handbook for Engineers and Scientists (Springer Science & Business Media)
Smith, I., Ferrari, A., & Carbillet, M. 2009, IEEE Trans. Signal Process., 57, 904
Soummer, R., Pueyo, L., & Larkin, J. 2012, ApJ, 755, L28
Stein, C. M. 1981, Ann. Stat., 1135
Tarantola, A. 2005, Inverse Problem Theory and Methods for Model Parameter Estimation (SIAM), 89
Thiébaut, É., Devaney, N., Langlois, M., & Hanley, K. 2016, in International Society for Optics and Photonics, SPIE Astron. Telescopes+ Instrum., 99091R
Thompson, A. M., Brown, J. C., Kay, J. W., & Titterington, D. M. 1991, IEEE Trans. Pattern Anal. Mach. Intell., 326
Traub, W. A., & Oppenheimer, B. R. 2010, Exoplanets, 111
Vigan, A., Moutou, C., Langlois, M., et al. 2010, MNRAS, 407, 71
Wagner, K., Apai, D., Kasper, M., et al. 2016, Science, 353, 673
Wahba, G., et al. 1985, Ann. Stat., 13, 1378
Wahhaj, Z., Cieza, L. A., Mawet, D., et al. 2015, A&A, 581, A24
Ygouf, M. 2012, Ph.D. Thesis, Grenoble
Yip, K. H., Nikolaou, N., Coronica, P., et al. 2019, ArXiv e-prints [arXiv:1904.06155]
Zurlo, A., Vigan, A., Mesa, D., et al. 2014, A&A, 572, A85

Appendix A: Estimation of the local statistics of the background in ASDI

In this appendix, we derive the maximum likelihood estimator of the Gaussian parameters. The expression of this estimator is obtained by minimizing the co-log-likelihood ℒ_n defined in Eq. (2). The first-order optimality condition ∇ℒ_n = 0 leads to equations defining each parameter.

The condition $\left.\frac{\partial \mathscr{L}_n}{\partial \boldsymbol{m}_{n,\ell}}\right|_{\boldsymbol{m}_{n,\ell}=\widehat{\boldsymbol{m}}_{n,\ell}}=0$ gives:

$$ \mathbf{C}_n^{-1}\sum_{t\in 1:T}\frac{1}{\sigma_{n,\ell,t}^2}\bigl(\widehat{\boldsymbol{m}}_{n,\ell}-\boldsymbol{r}_{n,\ell,t}\bigr)=\boldsymbol{0}. $$

Since $\mathbf{C}_n^{-1}$ is necessarily non-singular, we obtain the expression of the wavelength-specific average patch:

$$ \widehat{\boldsymbol{m}}_{n,\ell}=\frac{1}{\sum_{t\in 1:T}1/\sigma_{n,\ell,t}^2}\cdot\sum_{t\in 1:T}\frac{1}{\sigma_{n,\ell,t}^2}\,\boldsymbol{r}_{n,\ell,t}\,, \tag{A.1} $$

which corresponds to a weighted average of the spatial patches, computed over time indices 1 to T, with weights $1/\sigma_{n,\ell,t}^2$ that reduce the impact of frames and spectral channels displaying a large variance $\sigma_{n,\ell,t}^2$.

The condition $\left.\frac{\partial \mathscr{L}_n}{\partial \sigma_{n,\ell,t}^2}\right|_{\sigma_{n,\ell,t}^2=\widehat{\sigma}_{n,\ell,t}^2}=0$ leads to:

$$ \frac{K}{2\widehat{\sigma}_{n,\ell,t}^2}-\frac{1}{2\widehat{\sigma}_{n,\ell,t}^4}\,\bar{\boldsymbol{r}}_{n,\ell,t}^{\top}\mathbf{C}_n^{-1}\bar{\boldsymbol{r}}_{n,\ell,t}=0\,, $$

with $\bar{\boldsymbol{r}}_{n,\ell,t}=\boldsymbol{r}_{n,\ell,t}-\boldsymbol{m}_{n,\ell}$ the residual patches. This gives:

$$ \widehat{\sigma}_{n,\ell,t}^2=\frac{1}{K}\,\bar{\boldsymbol{r}}_{n,\ell,t}^{\top}\mathbf{C}_n^{-1}\bar{\boldsymbol{r}}_{n,\ell,t}. \tag{A.2} $$

The time and wavelength-specific scaling factors σ_{n, ℓ, t} are thus obtained by computing the variance of each spatially whitened patch.

Finally, the condition $\left.\boldsymbol{\nabla}_{\mathbf{C}_n}\mathscr{L}_n\right|_{\mathbf{C}_n=\widehat{\mathbf{S}}_n}=\boldsymbol{0}$ gives:

$$ \frac{TL}{2}\,\widehat{\mathbf{S}}_n^{-1}-\widehat{\mathbf{S}}_n^{-1}\biggl(\sum_{\substack{\ell\in 1:L\\ t\in 1:T}}\frac{1}{2\sigma_{n,\ell,t}^2}\,\bar{\boldsymbol{r}}_{n,\ell,t}\bar{\boldsymbol{r}}_{n,\ell,t}^{\top}\biggr)\widehat{\mathbf{S}}_n^{-1}=\boldsymbol{0}\,, $$

leading to:

$$ \widehat{\mathbf{S}}_n=\frac{1}{TL}\sum_{\substack{\ell\in 1:L\\ t\in 1:T}}\frac{1}{\sigma_{n,\ell,t}^2}\,\bar{\boldsymbol{r}}_{n,\ell,t}\bar{\boldsymbol{r}}_{n,\ell,t}^{\top}\,, \tag{A.3} $$

which is the sample covariance matrix of the spatial patches, each rescaled by the corresponding time and wavelength-specific factor.
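Because Eqs. (A.1) to (A.3) depend on one another, they can be evaluated in alternation. The following NumPy sketch illustrates one possible fixed-point loop for a single patch location n. It is only an illustration under stated assumptions: the function name, the array layout `(L, T, K)`, the fixed number of iterations, and the small `ridge` term added for numerical safety are hypothetical choices and are not taken from the paper.

```python
import numpy as np

def estimate_background_statistics(r, n_iter=20, ridge=1e-8):
    """Alternate Eqs. (A.1)-(A.3) for one patch location.

    r : array of shape (L, T, K) holding the K-pixel patches for
        L spectral channels and T exposures (assumed layout).
    Returns m (L, K), sigma2 (L, T) and S (K, K).
    """
    L, T, K = r.shape
    sigma2 = np.ones((L, T))  # start from equal scaling factors (assumption)
    for _ in range(n_iter):
        w = 1.0 / sigma2
        # Eq. (A.1): weighted temporal mean, one per wavelength
        m = (w[:, :, None] * r).sum(axis=1) / w.sum(axis=1, keepdims=True)
        rbar = r - m[:, None, :]
        # Eq. (A.3): covariance of the rescaled residual patches
        S = np.einsum("lt,lti,ltj->ij", w, rbar, rbar) / (T * L)
        # Hypothetical safeguard (not in the paper): tiny diagonal loading
        S += ridge * np.trace(S) / K * np.eye(K)
        # Eq. (A.2): variance of each spatially whitened residual patch
        white = np.linalg.solve(S, rbar.reshape(-1, K).T).T.reshape(L, T, K)
        sigma2 = np.einsum("lti,lti->lt", rbar, white) / K
    return m, sigma2, S
```

With equal scaling factors, the first pass of Eq. (A.1) reduces to the plain temporal mean of the patches, which gives a quick sanity check of the implementation.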

Appendix B: Derivation of the equivalent number of patches

The equivalent number of patches $\widetilde{P}$ corresponds to the number of samples if all weights are equal and is smaller when some weights differ.

Let $\{r_t\}_{t\in 1:T}$ be a collection of T independent and identically distributed random variables. The weighted mean $\widehat{m}=\sum_{t=1}^T w^\prime_t\,r_t$, where $w^\prime_t\geq 0$ are normalized weights ($w^\prime_t=w_t/\sum_{t=1}^T w_t$), is an unbiased estimator of $\mathbb{E}[r]$ with a variance $\text{Var}[\widehat{m}]=\sum_{t=1}^T \text{Var}[w^\prime_t\,r_t]$ (by independence of the $r_t$). Since $\text{Var}[w^\prime_t\,r_t]={w^\prime_t}^2\,\text{Var}[r]$, this leads to $\text{Var}[\widehat{m}]=\text{Var}[r]/\widetilde{P}$, with

$$ \widetilde{P}=1\bigg/\sum_{t=1}^T {w^\prime_t}^2=\left(\sum_{t=1}^T w_t\right)^2\bigg/\left(\sum_{t=1}^T w_t^2\right)\,, \tag{B.1} $$

the effective number of samples.

If all weights are equal, $\widetilde{P}=T$: the effective number of samples is equal to the total number of samples. If all weights but one are zero, $\widetilde{P}=1$. In our case, the weights $w_t$ correspond to $1/\widehat{\sigma}_{n,\ell,t}^2$, which leads to the formula to compute $\widetilde{P}$ in Algorithm 1, step 3. In practice, the samples $\{r_t\}_{t\in 1:T}$ are not identically distributed (their variances differ), but $\widetilde{P}$ still indicates whether the mean is reliable.
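As a small numerical illustration of Eq. (B.1), the sketch below evaluates the equivalent number of patches from a list of unnormalized weights. The function names are hypothetical, and the second helper simply assumes that the weights are the inverse variances $1/\widehat{\sigma}_{n,\ell,t}^2$ as stated above.

```python
def equivalent_number_of_patches(weights):
    """Eq. (B.1): (sum of w)^2 / (sum of w^2) for non-negative weights."""
    w = [float(x) for x in weights]
    s1 = sum(w)
    s2 = sum(x * x for x in w)
    if s2 == 0.0:
        raise ValueError("at least one weight must be non-zero")
    return s1 * s1 / s2

def equivalent_number_from_variances(variances):
    """Same quantity with weights w_t = 1 / sigma_t^2."""
    return equivalent_number_of_patches(1.0 / v for v in variances)
```

Four equal weights give 4, a single non-zero weight gives 1, and any other weight pattern falls between these two bounds.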

Appendix C: Derivation of the distribution of GLRT⁺

The GLRT⁺ is defined as the sum $\sum_{\ell=1}^L \frac{[\widehat{\alpha}_\ell]_+^2}{v_\ell}=\sum_{\ell=1}^L s_\ell$ with $s_\ell=[\widehat{\alpha}_\ell]_+^2/v_\ell$. In the absence of source and under the simplifying assumption of an absence of spectral correlation of the backgrounds (i.e., under ℋ₀ and within our Gaussian model with statistically independent channels), the terms $s_\ell$ are independent and identically distributed. Due to the thresholding of negative values, the distribution of each $s_\ell$ corresponds to a mixture of a χ² random variable and a Dirac mass at 0:

$$ \mathrm{p}\bigl(s_\ell\,\big|\,\mathcal{H}_0\bigr)=\frac{1}{2}\,\delta_0(s_\ell)+\frac{1}{2}\,\chi_1^2(s_\ell)\,, \tag{C.1} $$

where the Dirac mass δ₀ centered at 0 accounts for the probability 1/2 that S/N_ℓ is negative, and the chi-square distribution with one degree of freedom $\chi_1^2$ corresponds to the distribution of the square of a standard Gaussian variable. By independence of the $s_\ell$, the distribution of their sum, that is to say GLRT⁺, is given by the convolution product:

$$ \begin{aligned} \mathrm{p}\bigl(\mathrm{GLRT}^+\,\big|\,\mathcal{H}_0\bigr)&=\underbrace{\Bigl(\mathrm{p}\bigl(s_1\,\big|\,\mathcal{H}_0\bigr)*\cdots*\mathrm{p}\bigl(s_L\,\big|\,\mathcal{H}_0\bigr)\Bigr)}_{L\ \text{times}}(\mathrm{GLRT}^+)\\ &=\frac{1}{2^L}\,\delta_0(\mathrm{GLRT}^+)+\sum_{\ell=0}^{L-1}\frac{L!}{2^L\,\ell!\,(L-\ell)!}\,\chi_{L-\ell}^2\bigl(\mathrm{GLRT}^+\bigr)\,, \end{aligned} \tag{C.2} $$

by application of the binomial expansion and the property $\chi_a^2*\chi_b^2=\chi_{a+b}^2$ (the sum of two independent χ² random variables with respective degrees of freedom a and b is a χ²-distributed random variable with a + b degrees of freedom).
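Integrating Eq. (C.2) above a threshold τ ≥ 0 gives the probability of false alarm of the GLRT⁺ test as a binomial mixture of χ² tail probabilities, the Dirac mass contributing nothing above zero. The sketch below evaluates this tail with the Python standard library only. The function names are hypothetical, and the χ² survival function is built from the standard recurrence of the regularized upper incomplete gamma function, Q(a + 1, y) = Q(a, y) + yᵃ e⁻ʸ / Γ(a + 1), rather than from any routine used by the authors.

```python
import math

def chi2_sf(x, k):
    """P(X > x) for a chi-square variable X with k >= 1 degrees of freedom."""
    if x <= 0.0:
        return 1.0
    y = 0.5 * x
    # Closed-form starting points: Q(1/2, y) for odd k, Q(1, y) for even k
    if k % 2 == 1:
        a, q = 0.5, math.erfc(math.sqrt(y))
    else:
        a, q = 1.0, math.exp(-y)
    # Recurrence Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1)
    while a < 0.5 * k:
        q += math.exp(a * math.log(y) - y - math.lgamma(a + 1.0))
        a += 1.0
    return q

def glrt_plus_false_alarm(tau, n_channels):
    """P(GLRT+ > tau | H0) from the mixture of Eq. (C.2)."""
    if tau < 0.0:
        return 1.0  # GLRT+ is non-negative by construction
    L = n_channels
    tail = sum(math.comb(L, k) * chi2_sf(tau, k) for k in range(1, L + 1))
    return tail / 2 ** L
```

For τ = 0 the function returns 1 − 2⁻ᴸ, the probability that at least one channel has a positive estimate, which matches the weight of the Dirac mass in Eq. (C.2).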

Appendix D: Robust estimation of the spectral correlations

We described in Sect. 3.2.2 how modeling the covariance between the detection maps S/N_ℓ at each wavelength could be used in order to combine the spectral information. In this appendix, we discuss the estimation process of the spectral covariances Σ.

When estimating covariance matrices Σ, we face two difficulties: (i) the estimation must be local in order to capture the nonstationarities of those correlations and (ii) the estimation cannot be performed in the presence of sources (since the spectral correlations would then also depend on the spectrum of the source and on the values of v_ℓ).

As described in Sect. 3.2.2, the presence of sources can be circumvented by applying a robust estimator like the MCD to a region large enough so that the portion corresponding to a point source is always marginal. Figures D.1a–f display the combined detection maps wwS/N obtained on HR 8799 when assuming the spectrum plotted in (g). The size of the region over which the robust estimation of Σ is performed is given both in terms of pixels and by a disk at the same scale as the detection map. If the spectral covariances are learned on a region that is too small, as in Fig. D.1a, the detection map is flattened even at the location of the sources. By increasing the size of the region, the robust estimator of the spectral covariance correctly captures the correlations in the absence of sources. However, when the region gets too large, the whitening operation slightly lacks locality.

Fig. D.1.

Influence of the size A (given in pixels and displayed as a disk at the scale of the field of view) of the region over which the spectral covariances $\widehat{\boldsymbol{\Sigma}}$ are estimated. The S/N_ℓ maps are combined assuming the spectrum shown in (g), for whitening matrices $\widehat{\mathbb{L}}$ obtained from each estimate of the spectral covariance. From a to f: combined maps wwS/N and the empirical distribution of wwS/N under ℋ₀ are shown side by side.

It can be observed in Fig. D.1 that the empirical distribution in the absence of source correctly matches the expected standard Gaussian model. From a detection map like Fig. D.1e, it is then possible to detect point sources by thresholding at τ = 5, and a binary mask can be obtained in order to mask the point sources. In a second step, the spectral covariance matrices Σ can be re-estimated on much smaller windows by excluding all pixels that fall in the binary mask. Figure D.2 compares the detection maps obtained (a) without spectral whitening, (b) with spectral whitening performed after estimating the spectral covariances over a large area with a robust estimator, and (c) with spectral whitening performed on small areas by computing the sample covariance after exclusion of the pixels around the point sources. This last strategy can be applied to small areas (A = 300 pixels) and thus better eliminates spurious structures in the background. However, it requires a two-step processing: first the computation of the whitened detection map (Fig. D.2b) with the robust covariance estimator, then the formation of the exclusion map, the re-estimation of the covariance matrices, and the computation of a new detection map.

Fig. D.2.

Comparison of three spectral whitening strategies: a: no whitening; b: spectral whitening based on a robust estimate of the covariance computed over a large area (A = 5000 pixels, shown at the bottom); c: spectral whitening based on a local estimate of the covariance computed over a smaller area (A = 300 pixels, shown at the bottom) by masking out regions of high wwS/N given by method (b).
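The second step of the two-step strategy described in this appendix can be sketched as follows: the whitened detection map from the first pass is thresholded at τ = 5, and the spectral covariance is re-estimated in a small window from the remaining pixels. This is a minimal NumPy illustration; the function names, the `(L, N)` layout of the S/N values, and the centering of the samples are assumptions, and the mask is not dilated around the detections as a real pipeline probably would.

```python
import numpy as np

def detection_mask(wwsnr_map, tau=5.0):
    """Boolean mask of the pixels whose first-pass wwS/N exceeds tau."""
    return np.asarray(wwsnr_map) > tau

def masked_spectral_covariance(snr, mask):
    """Sample covariance of the S/N values between spectral channels.

    snr  : array (L, N), S/N of the L channels at the N pixels of a window
    mask : boolean array (N,), True for pixels excluded as likely sources
    """
    kept = snr[:, ~mask]
    # Centering with the empirical mean is an assumption of this sketch
    centered = kept - kept.mean(axis=1, keepdims=True)
    return centered @ centered.T / (kept.shape[1] - 1)
```

The number of unmasked pixels must comfortably exceed the number of channels L for the resulting matrix to be invertible, which is one reason why the window cannot be made arbitrarily small.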

Appendix E: Optimality of the detection criterion wwS/N

In this part, we show that the optimal linear combination of S/N_ℓ values corresponds to the wwS/N test that we derived from the GLRT. The general form of a test based on a linear combination of (whitened) S/N_ℓ values takes the form:

$$ \mathrm{wwS/N}:\quad \sum_{\ell=1}^L w_\ell\cdot\bigl[\widehat{\mathbb{L}}^{\top}\boldsymbol{x}\bigr]_\ell\ \underset{\mathcal{H}_0}{\overset{\mathcal{H}_1}{\gtrless}}\ \tau\,, \tag{E.1} $$

where w_ℓ are weights whose value is to be determined.

Under hypothesis ℋ₁, the value of wwS/N is to be maximized, while the variance of wwS/N under ℋ₀ remains equal to one, so that under ℋ₀ wwS/N is a standard Gaussian variate and a detection threshold can be straightforwardly set.

Since the vector $\widehat{\mathbb{L}}^{\top}\boldsymbol{x}$ follows 𝒩(0, I) under ℋ₀, the variance of wwS/N under ℋ₀ equals $\sum_{\ell=1}^L w_\ell^2$. The constraint that wwS/N has unit variance leads to the condition $\sum_{\ell=1}^L w_\ell^2=\|\boldsymbol{w}\|_2^2=1$.

Under ℋ₁, the expected value of wwS/N is

$$ \mathbb{E}_{\mathcal{H}_1}\bigl[\mathrm{wwS/N}\bigr]=\sum_{\ell=1}^L w_\ell\cdot\bigl[\widehat{\mathbb{L}}^{\top}\mathbb{E}_{\mathcal{H}_1}[\boldsymbol{x}]\bigr]_\ell\,, \tag{E.2} $$

where $[\mathbb{E}_{\mathcal{H}_1}[\boldsymbol{x}]]_\ell=\alpha^{\text{int}}\beta^\prime_\ell$. Equation (E.2) is a scalar product between the vector of weights $w_\ell$ and the whitened expected S/N_ℓ values. Given the normalization constraint $\|\boldsymbol{w}\|=1$, Eq. (E.2) is maximized for a vector of weights that has unit Euclidean norm and is collinear to $\widehat{\mathbb{L}}^{\top}\mathbb{E}_{\mathcal{H}_1}[\boldsymbol{x}]$. This leads to the following definition of optimal weights:

$$ \boldsymbol{w}=\frac{\widehat{\mathbb{L}}^{\top}\boldsymbol{\beta}^{\prime}}{\bigl\|\widehat{\mathbb{L}}^{\top}\boldsymbol{\beta}^{\prime}\bigr\|}\,, \tag{E.3} $$

which corresponds to the values of the weights obtained in Eq. (20).
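The argument of this appendix can be checked numerically. The NumPy sketch below builds a whitening matrix, forms the weights of Eq. (E.3), and evaluates the statistic of Eq. (E.1). It rests on assumptions: the whitening matrix is taken as the lower Cholesky factor of the inverse spectral covariance (one choice for which the whitened vector has identity covariance), the vector `beta_prime` is supplied by the caller, and all function names are hypothetical.

```python
import numpy as np

def whitening_matrix(sigma):
    """Lower-triangular W with W @ W.T = inv(sigma), so Cov(W.T @ x) = I."""
    return np.linalg.cholesky(np.linalg.inv(sigma))

def optimal_weights(W, beta_prime):
    """Eq. (E.3): unit-norm weights collinear to the whitened spectrum."""
    v = W.T @ beta_prime
    return v / np.linalg.norm(v)

def wwsnr(x, sigma, beta_prime):
    """Eq. (E.1) statistic for S/N values x of shape (L,) or (L, N)."""
    W = whitening_matrix(sigma)
    return optimal_weights(W, beta_prime) @ (W.T @ x)
```

By the Cauchy-Schwarz inequality, no other unit-norm weight vector yields a larger scalar product with the whitened expected S/N values, which is the optimality property stated above.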

Appendix F: Combination of S/N maps with spectral whitening

Our approach to combine detection maps computed at different wavelengths is general and can also apply to the output of other algorithms, as illustrated in Fig. F.1. We processed the signal-to-noise ratio maps produced by the TLOCI and KLIP algorithms. In F.1a, we show the combined signal-to-noise ratio obtained by simple averaging; in F.1b and F.1c, we apply a spectral whitening and the prior spectrum of Fig. D.1g according to the definition of wwS/N (we have set v_ℓ = 1 for both TLOCI and KLIP). In F.1b, the whitening matrix $\widehat{\mathbb{L}}$ is computed from the robust covariance estimator applied on a large area; in F.1c, the two-step approach with masking of the point sources in the second step is applied. The detection maps are clearly improved by our spectral whitening scheme. Compared to PACO ASDI, the combined detection maps display a lower value for the 3 sources of the field of view as well as some border artifacts, which indicates that modeling the nonstationary spatial covariance also plays an important role in PACO ASDI.

Fig. F.1.

Combination of S/N maps with our spectral whitening strategy: a: simple spectral averaging of TLOCI and KLIP S/N maps, b–c: combination with spectral whitening and the spectrum shown in Fig. D.1g.

Appendix G: Automatic setting of the smoothing parameter μ for spectrum estimation

As detailed in Sect. 4.2, it is useful to enforce a spectral smoothness during the spectrum estimation of a detected point source. The expected gain is a reduction of the MSE of the estimation, in particular when the source contrast is very weak. In this appendix, we numerically compare three well-known regularization strategies of the estimated spectra: GCV (Craven & Wahba 1978; Golub et al. 1979), GML (Wahba 1985; MacKay 1992) and SURE (Stein 1981) (see Sect. 4.2 for a short description). For this purpose, we perform 30 Monte–Carlo injections and spectrum estimations on 4 sources (#2, #5, #10, #12, see Sect. 5.3). Figure G.1 compares the GCV, GML, and SURE regularization strategies in terms of MSE and of agreement between the predicted and the empirical 1σ confidence intervals. A comparison is also given with the results obtained when the regularization hyperparameter μ is set in an “oracle” mode, that is, by selecting the μ that minimizes the MSE between the spectrum estimate and the spectrum ground truth. These experiments illustrate that the GML and SURE approaches lead to very similar results, with a slight improvement brought by SURE with respect to GML. In comparison, GCV leads to significantly worse results. This can be explained by the fact that GCV is generally used when the noise variance is unknown. Regularizing the spectrum estimation with GML or SURE is beneficial since it reduces the MSE. As expected, the gain brought by the regularization is larger when the contrast of the source is weak and for sources located near the host star, that is, when the estimated spectrum is very noisy. For example, the MSE is reduced by about 37% for source #2 (the brightest of the four considered), while it is reduced by up to 67% for source #10, which is the faintest one. In addition, the results obtained with the automatic setting of the hyperparameter μ by GML or SURE are not very far from the optimal results achieved by the oracle. Finally, as shown by Fig. G.1 (bottom), the estimated confidence intervals are in good agreement with the empirical ones (a factor between 0.94 and 1.21 is observed in our experiments) when the regularization is performed with the SURE approach.

Fig. G.1.

Comparison of the GCV, GML, and SURE regularization strategies on 30 Monte–Carlo injections / spectrum estimations for sources #2, #5, #10, and #12. Top: gain in terms of MSE reduction compared with the absence of spectral regularization (values higher than one indicate a decrease of the MSE). Bottom: comparison between the 1σ confidence intervals predicted by PACO ASDI and the empirical ones (values higher than one indicate that predicted confidence intervals are smaller than the empirical ones so that the algorithm estimation is too optimistic).

All Tables

Table 1. Reminder of the main notations.

Table 2. Degradation of the achievable contrast when the prior spectrum differs from the true spectrum.

Table 3. Log information for the considered SPHERE-IFS datasets.

Table 4. Angular separation, minimum, maximum, and mean contrast of the 12 fake point sources injected in the fourth dataset.

Table 5. Monte–Carlo validation of spectrum estimation methods.

Table 6. Estimated astrometry (separation and true-north aligned angle) of the real faint point sources known in the used datasets with PACO ASDI.

All Figures

Fig. 1. Scheme of PACO ASDI algorithm.

Fig. 2. Accounting for temporal and spectral fluctuations with time and wavelength-specific scaling factors: a: observed intensities, for some selected frames (4 wavelengths × 4 exposures); b: corresponding spatial distribution of scaling factors.

Fig. 3. Convergence of the scaling factors, starting from many random initializations. In the inserts, the location in the field of view is indicated as well as the evolution of the weights until the convergence criterion is reached.

Fig. 7. Combined detection maps computed on HR 8799 ASDI dataset: a: GLRT⁺ criterion and its distribution in the absence of sources; b: wGLRT criterion, including a spectral whitening operation, and its distribution. The three sources are excluded for the computation of the empirical distributions.

Fig. 8. Combined detection map with spectrum priors: in the absence of sources the empirical distribution matches very closely a Gaussian distribution (red parabola in the log-scale representations).

Fig. 9.

Fig. 10. Same caption as Fig. 9. The color bars are adapted to each method and are set between −5 and the detection peak associated with one of the real sources to be detected (respectively HR 8799 e which is the closest to the host star, β Pictoris b, and HD 131399 Ab).

Fig. 11.

Fig. 12. Detection maps (wwS/N for PACO ASDI and combined S/N for the other algorithms) around HD 172555 in the absence of fake point sources.

Fig. 13. wwS/N maps obtained with PACO ASDI around HD 172555 with injected fake point sources #1 to #6 (left) and #7 to #12 (right).

Fig. 14. Contrast curves at 5σ obtained on HD 172555 with PACO ASDI compared with TLOCI-ADI. The mean contrast of the fake faint point sources #1 to #12 is marked by orange points.

Fig. F.1. Combination of S/N maps with our spectral whitening strategy: a: simple spectral averaging of TLOCI and KLIP S/N maps, b–c: combination with spectral whitening and the spectrum shown in Fig. D.1g.

Fig. G.1.



[R71] Pecaut, M. J., & Mamajek, E. E. 2013, ApJS, 208, 9 [Google Scholar]

[R72] Perrin, M. D., Sivaramakrishnan, A., Makidon, R. B., Oppenheimer, B. R., & Graham, J. R. 2003, ApJ, 596, 702 [NASA ADS] [CrossRef] [Google Scholar]

[R73] Perrot, C., Thebault, P., Lagrange, A.-M., et al. 2019, A&A, 626, A95 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[R74] Pueyo, L. 2018, Handbook of Exoplanets, 705 [Google Scholar]

[R75] Racine, R., Walker, G. A., Nadeau, D., Doyon, R., & Marois, C. 1999, PASP, 111, 587 [NASA ADS] [CrossRef] [Google Scholar]

[R76] Rameau, J., Chauvin, G., Lagrange, A.-M., et al. 2015, A&A, 581, A80 [NASA ADS] [EDP Sciences] [Google Scholar]

[R77] Rousseeuw, P. J., & Driessen, K. V. 1999, Technometrics, 41, 212 [CrossRef] [Google Scholar]

[R78] Ruffio, J.-B., Macintosh, B., Wang, J. J., et al. 2017, ApJ, 842, 14 [NASA ADS] [CrossRef] [Google Scholar]

[R79] Santos, N. C. 2008, New Astron. Rev., 52, 154 [NASA ADS] [CrossRef] [Google Scholar]

[R80] Savransky, D. 2015, in International Society for Optics and Photonics, Tech. Instrum. Detection Exoplanets VII, 9605, 96050R [Google Scholar]

[R81] Schneider, J., Dedieu, C., Le Sidaner, P., Savalle, R., & Zolotukhin, I. 2011, A&A, 532, A79 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[R82] Schütz, O., Meeus, G., & Sterzik, M. 2005, A&A, 431, 175 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[R83] Simon, M. K. 2007, Probability Distributions Involving Gaussian Random Variables: A Handbook for Engineers and Scientists (Springer Science & Business Media) [Google Scholar]

[R84] Smith, I., Ferrari, A., & Carbillet, M. 2009, IEEE Trans. Signal Process., 57, 904 [NASA ADS] [CrossRef] [Google Scholar]

[R85] Soummer, R., Pueyo, L., & Larkin, J. 2012, ApJ, 755, L28 [NASA ADS] [CrossRef] [Google Scholar]

[R86] Stein, C. M. 1981, Ann. Stat., 1135 [Google Scholar]

[R87] Tarantola, A. 2005, Inverse Problem Theory and Methods for Model Parameter Estimation (SIAM), 89 [CrossRef] [Google Scholar]

[R88] Thiébaut, É., Devaney, N., Langlois, M., & Hanley, K. 2016, in International Society for Optics and Photonics, SPIE Astron. Telescopes+ Instrum., 99091R [Google Scholar]

[R89] Thompson, A. M., Brown, J. C., Kay, J. W., & Titterington, D. M. 1991, IEEE Trans. Pattern Anal. Mach. Intell., 326 [Google Scholar]

[R90] Traub, W. A., & Oppenheimer, B. R. 2010, Exoplanets, 111 [Google Scholar]

[R91] Vigan, A., Moutou, C., Langlois, M., et al. 2010, MNRAS, 407, 71 [NASA ADS] [CrossRef] [Google Scholar]

[R92] Wagner, K., Apai, D., Kasper, M., et al. 2016, Science, 353, 673 [NASA ADS] [CrossRef] [Google Scholar]

[R93] Wahba, G., et al. 1985, Ann. Stat., 13, 1378 [CrossRef] [Google Scholar]

[R94] Wahhaj, Z., Cieza, L. A., Mawet, D., et al. 2015, A&A, 581, A24 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[R95] Ygouf, M. 2012, Ph.D. Thesis, Grenoble [Google Scholar]

[R96] Yip, K. H., Nikolaou, N., Coronica, P., et al. 2019, ArXiv e-prints [arXiv:1904.06155] [Google Scholar]

[R97] Zurlo, A., Vigan, A., Mesa, D., et al. 2014, A&A, 572, A85 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

Appendix C: Derivation of the distribution of GLRT⁺