Calibration of quasi-static aberrations in exoplanet direct-imaging instruments with a Zernike phase-mask sensor. III. On-sky validation in VLT/SPHERE

Second-generation exoplanet imagers using extreme adaptive optics and coronagraphy have demonstrated their great potential for studying close circumstellar environments and for detecting new companions and helping to understand their physical properties. However, at very small angular separation, their performance in contrast is limited by several factors: diffraction by the complex telescope pupil not perfectly canceled by the coronagraph, residual dynamic wavefront errors, chromatic wavefront errors, and wavefront errors resulting from noncommon path aberrations (NCPAs). In a previous work, we demonstrated the use of a Zernike wavefront sensor called ZELDA for sensing NCPAs in VLT/SPHERE and their compensation. In the present work, we move to the next step with the on-sky validation of NCPA compensation with ZELDA. We start by reproducing previous results on the internal source and show that the amount of aberration integrated between 1 and 15 cycles/pupil is decreased by a factor of five, which translates into a gain in raw contrast of between 2 and 3 below 300 mas. On sky, we demonstrate that NCPA compensation works in closed loop, leading to an attenuation of the amount of aberration by a factor of approximately two. However, we identify a loss of sensitivity for the sensor that is only partly explained by the difference in Strehl ratio between the internal and on-sky measurements. Coronagraphic imaging on sky is improved in raw contrast by a factor of 2.5 at most in the ExAO-corrected region. We use coronagraphic image reconstruction based on a detailed model of the instrument to demonstrate that both internal and on-sky raw contrasts can be precisely explained, and we establish that the observed performance after NCPA compensation is no longer limited by an improper compensation for aberration but by the current apodized-pupil Lyot coronagraph design. [abridged]


Introduction
With the advent of the second-generation, high-contrast instruments on the ground in 2013-2014, unprecedented performance has been obtained on the images of nearby stars with contrasts down to 10 −6 at separations beyond 300 mas in the near-infrared (NIR) band, leading to the observation of various circumstellar disks (e.g., de Boer et al. 2016;Lagrange et al. 2016;Currie et al. 2017;Feldt et al. 2017;Goebel et al. 2018;Esposito et al. 2018) and the discovery of new young gas giant planets (e.g., Macintosh et al. 2015;Chauvin et al. 2017a;Keppler et al. 2018). These results begin to shed light on the architecture of planetary systems, planet formation and evolution, and atmospheric properties of young giant planets and brown dwarfs. In terms of planet demography, NIR surveys of hundreds of nearby stars have so far shown that giant gaseous planets on orbits wider than 10 AU remain rare, mostly due to inefficient formation of such companions at large distances (e.g., Bowler 2016;Uyama et al. 2017;Vigan et al. 2017;Chauvin et al. 2017b;Nielsen et al. 2019).
Imaging planets on shorter orbits from the ground requires observations at closer separations and deeper contrasts than what is currently done. A first step towards this challenging objective consists in thoroughly understanding the performance and limitations of the latest exoplanet imagers, such as Gemini/GPI, Subaru/SCExAO, and VLT/SPHERE (Macintosh et al. 2014;Jovanovic et al. 2015;Beuzit et al. 2019). These advanced instruments use a common recipe based on extreme adaptive optics (ExAO) to correct for the effects of the atmospheric turbulence on the incoming wavefront, coronagraphy to remove starlight due to the telescope diffraction effects on the star image, and postprocessing methods to reduce the residual scattered light in the coronagraphic image and detect the signal of a companion or a disk.
Even in good observing conditions, the 10 −6 contrast in coronagraphic images can be quickly degraded by many effects, such as residual jitter of a few milliarseconds (mas), extremely low-order aberrations (tip, tilt and defocus), the low-wind effect (Sauvage et al. 2016a), or the wind-driven halo (Cantalloube et al. 2018). Over the past few years, various strategies have been investigated to mitigate these issues (e.g., Baudoz et al. 2010;Singh et al. 2015;Huby et al. 2015Huby et al. , 2017Lamb et al. 2017;Milli et al. 2018;N'Diaye et al. 2018;Wilby et al. 2018). In the absence of or after compensation for these effects, there are still other hurdles, the most infamous being "speckles", which are residual scattered light in the coronagraphic images that prevent the observation of the close stellar environment. These artifacts first come from the wavefront errors due to the atmospheric turbulence residuals after ExAO correction. Different upgrades are under investigation to push the limits of the AO performance further, such as new or additional deformable mirrors, high-sensitivity wavefront sensors, control loops with frequency faster than 1 kHz, and predictive control algorithms (e.g., Fusco et al. 2016;Chilcote et al. 2018;Guyon et al. 2018).
Other speckles originate from the wavefront errors that are invisible to the ExAO system, the so-called noncommon path aberrations (NCPAs). Anticipated at the time of design of the exoplanet imagers (e.g., Fusco et al. 2006), these wavefront errors are due to the differential optical path between the ExAO visible wavefront sensing and the NIR science camera arms. These aberrations slowly evolve with time, producing speckles with a quasi-static behavior that makes their calibration challenging for exoplanet imaging (Martinez et al. 2012(Martinez et al. , 2013Milli et al. 2016). In addition to post-processing techniques, this issue is currently being addressed thanks to the exploration of several solutions including online software (e.g., Savransky et al. 2012;Martinache et al. 2014;Bottom et al. 2017) and hardware (e.g., Galicher et al. 2010;Paul et al. 2013Paul et al. , 2014Martinache 2013;Martinache et al. 2016;Wilby et al. 2017). Among these solutions, we proposed the use of a Zernike wavefront sensor to calibrate NCPAs and enable deeper contrast at closer angular separation.
The ZELDA sensor is based on phase-contrast techniques, originally proposed by Zernike (1934), to measure NCPAs in high-contrast imaging instruments with nanometric accuracy. This sensor uses a focal plane phase mask to produce interference between a reference wave created by the mask and the phase errors present in the system. As a result, this sensor converts the aberrations in the entrance pupil into intensity variations in the exit pupil. This phase-to-intensity conversion depends on the mask characteristics, that is, the diameter and the depth that is related to the sensor phase delay. In N' Diaye et al. (2013; hereafter Paper I), we established the formalism for this approach, showing its ability to measure static aberration with sub-nanometric accuracy and deriving theoretical contrast gains after compensation for exoplanet observation. This approach has also been studied in the context of phase discontinuities such as segmented apertures (Janin-Potiron et al. 2017) or low-wind effects (Sauvage et al. 2016b).
In N'Diaye et al. (2016a; hereafter Paper II), we reported the validation of our method with the implementation of a prototype called ZELDA (Zernike sensor for Extremely Low-level Differential Aberration) on VLT/SPHERE. The first results on the internal source demonstrated a significant contrast gain in the coronagraphic image after compensation for NCPAs using the deformable mirror of the instrument. Following our successful experiment, we now move to the next step with on-sky NCPA measurement and compensation.
Noncommon path aberration compensation can be addressed in two different ways: (i) full compensation on sky or (ii) NCPA calibration on the internal source and application of the correction on sky. Although the second strategy is obviously more efficient and avoids dealing with problems such as integration times and variable observing conditions, it also requires a complete understanding of the behavior of the instrument when switching between the internal calibration source and a star. It also relies on the presence of a calibration source sufficiently close to the entrance of the instrument, which is the case in VLT/SPHERE, but even so the light will not see the telescope mirrors as there is rarely (if ever) an appropriate light source in the telescope itself. Although theoretically applicable, this strategy was deemed too risky for our first on-sky tests and we adopted the first approach instead.
To carry out this demonstration, ESO awarded us three technical half-nights on 2018-04-01, 2018-04-02, and 2018-04-03 1 . The observing conditions were unfortunately quite unstable on April 1 and 3, and very poor on April 2. In this paper, we report on the on-sky tests of NCPA calibration with ZELDA on VLT/SPHERE with the acquired data on the first and third half-nights. We first detail our implemented improvements in the wavefront error calibration and compensation with our Zernike sensor in the instrument compared with Paper II. The on-sky results are then presented with a comprehensive analysis to validate our approach. We finally quantify the impact of NCPA correction on coronagraphic data and discuss prospects for further exoplanet direct imaging and spectroscopy observations.

Improvements in the NCPA calibration
The first attempt presented in Paper II was a working demonstration but had a few shortcomings that made it difficult to implement in a robust and repeatable way.
The first one was the low-pass filtering of the spatial frequencies contained in the ZELDA optical path difference (OPD) maps. This filtering is a mandatory step because ZELDA measures wavefront errors with spatial frequencies up to 192 c/p (the pupil image has 384 pixels in diameter on the detector), while the SPHERE high-order deformable mirror (HODM) only corrects for the spatial frequencies up to 20 c/p. In Paper II, we implemented a simple Hann filtering in the Fourier space with a radius of 25 c/p. This approach proved very efficient at removing the high-spatial frequencies of the OPD maps, but the filter radius was arbitrary and disconnected from the 990 Karhunen-Loève (KL) modes that are used by SAXO, the SPHERE ExAO system, in its control architecture (Petit et al. 2008). As a result, the filtered OPD maps may still contain spatial frequencies that cannot be controlled, and therefore cannot be corrected for by the system.
To improve the spatial filtering we implemented a new approach based on the control matrices of SAXO. The unfiltered OPD maps are first directly converted into HODM actuator voltages before being projected onto the KL modes of the system. At this level, this projection has already removed all the spatial frequencies that cannot be seen and controlled by SAXO. On top of that, a user-defined number of additional modes can be removed by setting their value to zero in the projection; this is done to restrict the NCPA correction only to the lowest spatial frequencies. These filtered modes are then converted back to voltages and finally converted into HOWFS reference slope offsets. This procedure has the powerful advantage that it takes into account all the specificity of the system, and in particular defects like dead actuators, the knowledge of which is incorporated into the control architecture of SAXO.
The second shortcoming was the centering of the source point spread function (PSF) on the ZELDA mask. We first remind that the tip-tilt control in SPHERE is handled in two stages (Sauvage et al. 2016a): first with a fast (800 Hz bandwidth) tip-tilt mirror that provides an rms residual jitter of 2 mas in the visible in nominal conditions, and then with the slow (1 Hz) differential tip-tilt sensor (DTTS) that is implemented in the NIR arm to compensate for the differential chromatic tip-tilt between the visible and the NIR . The goal of the DTTS loop is to provide a fine (<0.5 mas) stabilization of the PSF on the coronagraph mask, or in this case on the ZELDA mask. The centering of the PSF on our sensor spot was previously handled by hand, that is, by manually changing the reference slopes of the DTTS until the residual tip-tilt in the ZELDA signal, estimated by eye only, is deemed negligible. To improve this situation, the tip-tilt in now estimated from the OPD maps by projecting them on the tip (Z1) and tilt (Z2) Zernike polynomials computed for the SPHERE pupil (taking into account the 14% central obscuration and the HODM dead actuators). The tip-tilt is then converted into an offset on the DTTS reference slopes, which are updated accordingly to center the PSF on the mask. Before proceeding to the correction for the higher-order NCPA as described previously, the tip and tilt terms are subtracted from the OPD maps.
Finally, we also implemented several improvements to the ZELDA analysis code. Based on the IDL code that was originally developed for Paper I, we developed a new Python-based generic code called pyZELDA that is dedicated to the analysis of data acquired with Zernike WFS (Vigan & N'Diaye 2018). This code is public and freely available under the MIT license 2 . It is easily extensible to include new instruments or laboratory test beds.
All these improvements make NCPA calibration and compensation with ZELDA much more robust. As a result, NCPA calibration has been transposed into a calibration template that is executed in the SPHERE daily calibration plan to monitor the long-term evolution of NCPAs. The stability of the aberrations in SPHERE has begun to be explored in the laboratory (Martinez et al. 2013) and in Paranal (Milli et al. 2016) based on focal-plane coronagraphic images, but these measurements do not constitute a direct measurement of the phase errors. The short-and long-term stability of NCPAs that are based on the ZELDA phase measurements will be explored in a future study.

On-sky calibration and compensation
There are two complementary approaches to analyze the NCPA compensation in high-contrast imaging: the first one is to look at the wavefront measurements provided by the NCPA WFS (be it ZELDA or a different one), and the second is to estimate the raw contrast gain in coronagraphic images. While the most important parameter in the end is the final gain in raw contrast, the first approach is essential to understanding the instrument. In this section we use this first approach to investigate the behavior and limitations of ZELDA in the presence of ExAO-filtered atmospheric residuals.

Description of the tests
Various tests were performed with ZELDA, both on the internal point source of SPHERE located in the instrument calibration unit (Wildi et al. 2009(Wildi et al. , 2010, and on bright single stars that were selected in advance to ensure optimal ExAO performance (Sauvage et al. 2016a). Indeed, the primary goal of these tests was to validate ZELDA in optimal conditions rather than explore the sensor flux limit. The two stars that were used are α Crt (R = 3.27, H = 1.76) and 1 Pup (R = 3.29, H = 0.93). The different tests that are relevant for our analysis are summarized in Table 1. Unless otherwise stated, all tests are started with the HOWFS reference slopes calibrated during the SPHERE daily calibration sequence.
The internal source and on-sky tests are handled in mostly the same way, except for the initial setup of the light source. On the internal source, the calibration unit is set up to provide a NIR point source, while on sky the full target acquisition procedure is used to point the star, engage the telescope active optics and guiding, and setup the instrument. In either case, the system ends up with all AO loops closed, providing a diffractionlimited PSF. A specific setup is then manually handled in the instrument: ZELDA mask in the coronagraph wheel, and pupil imaging mode with N_FeII filter (λ = 1642 nm, ∆λ = 24 nm) in the IRDIS camera (see Paper II for more details). At this level, the PSF falls close enough to the center of the ZELDA mask to provide a measurable signal. The detector integration time (DIT) is then adjusted so that the average flux in the pupil is approximately at half the linearity range of the Hawaii-2RG detector.
The ZELDA analysis requires two calibration pupil images (Paper II). The first one is an instrumental background, which is obtained by opening the AO loops, closing the entrance shutter of the instrument, acquiring an image with IRDIS, and reverting back to the original instrument state. The second is a clear pupil image obtained by moving the ZELDA mask out of the PSF with a small rotation of the wheel that holds the mask, acquiring an image with IRDIS and again reverting back to the original state. Once these two calibrations have been acquired, the ZELDA mask is no longer moved and the NCPA calibration and compensation can start.
The NCPA compensation is implemented as an iterative process for reasons that become clear in Sect. 3.3. At each iteration, a pupil image is acquired with IRDIS and analyzed with pyZELDA to produce an OPD map calibrated in nanometers. In this analysis, the pupil features (central obscuration, spiders) are taken into account to compute the wavefront reconstruction. The obscured parts are, in the end, masked numerically because the ZELDA signal can only be computed in the illuminated parts of the pupil. Moreover, because the reconstruction is performed independently for each individual point in the pupil, the dead actuators of the SPHERE HODM are not specifically taken into account in the reconstruction. However, they produce wavefront errors that are significantly beyond the linear range of the ZELDA sensor, creating a meaningless signal. They are therefore masked numerically in our reconstruction. Finally, the OPD map is spatially filtered (see Sect. 2) and the reference slopes of the HOWFS are updated. For the filtering, we conservatively kept 700 modes (out of 990), which corresponds to a low-pass filter with a cutoff at ∼15 c/p. Higher values are possible but the higher modes are noisier and can potentially lead to instabilities. In most cases three or four iterations were performed. More iterations were attempted but beyond four we noticed some instabilities in the AO loop induced by the presence of dead HODM actuators on the right edge of the pupil.

Results
Our results are presented in Fig. 1 for 2018-04-01. The results for 2018-04-03 are almost identical and are provided in Appendix A. For each NCPA compensation loop we present the OPD maps, calibrated in nanometers, at each iteration, and the standard deviation σ( f ) decomposed in terms of spatial frequency f . This last term is derived from the classical two-dimensional power spectral density (2D PSD) expressed in (nm/(c/p)) 2 : with ν and η being the radial and azimuthal coordinates of the spatial frequencies in the 2D PSD. The σ( f ) values are discrete and expressed in nm rms. An important property of the PSD is that its integral over all spatial frequencies is equal to the variance of the original OPD map. However, the PSD expressed in (nm/(c/p)) 2 can sometimes be difficult to physically interpret from the instrumental point of view, where we usually think more in terms of standard deviation of the wavefront errors within a given range of spatial frequencies.
The σ( f ) addresses that issue: each of the points represents the standard deviation of the measured OPD maps in bins of size 1 c/p. The standard deviation over a wider range of spatial frequencies is obtained by quadratically summing the individual bins within that range. On the internal source, the starting point is generally dominated by a strong astigmatism and residual tip-tilt due to the fact that the PSF is not precisely centered on the mask at the beginning. The first iteration already brings a major improvement at all spatial frequencies up to ∼15 c/p (because of the mode filtering; see Sect. 2), but there are still some residual low frequencies, again dominated by tip, tilt, and astigmatism. Subsequent iterations help to decrease the low frequencies even more down to a floor at ∼3 nm rms in all c/p bins. After four iterations, the total amount of aberration in 1-15 c/p decreases from 50-55 nm rms down to 9-12 nm rms, corresponding to a gain in wavefront error of a factor of approximately five. High-spatial frequencies beyond 20 c/p are mostly unaffected by the NCPA compensation, as expected.
Two important questions prior to on-sky tests pertained to whether or not ZELDA would provide a measurable signal in the presence of ExAO residual phase errors, and to whether or not this signal would be accurate enough to enable a direct correction for the NCPA. The results in the right panel of Fig. 1 clearly show a positive answer to both questions. On sky, the OPD map at iteration #0 is similar to the one on the internal source, with a noticeable astigmatism. Interestingly, the level of aberration measured on sky at iteration #0 is lower than on the internal source: 35-40 nm rms versus 50-55 nm rms in 1-15 c/p, respectively. This is partly due to the fact that on sky the tip-tilt at the beginning was compensated manually when centering the PSF on the ZELDA mask, and is also partly due to a loss in sensitivity of the sensor on sky; this latter is explored in Sect. 3.3. In terms of convergence of the NCPA loop, it seems that only two iterations are required to reach a plateau in the low spatial frequencies, resulting in a final value of 16 nm rms in 0-15 c/p.
Overall we conclude that ZELDA on-sky measurements and NCPA compensation in closed loop are possible. However, there seems to be differences in the values measured between the internal source and on sky, which makes the absolute values of the σ( f ) on sky difficult to interpret.

ZELDA sensitivity with atmospheric residuals
We investigated the apparent discrepancy between the internal and the on-sky measurement with a dedicated test performed on 2018-04-03. In this test, we first performed a ZELDA on-sky measurement on 1 Pup (test ZT05), then switched to the internal source without changing anything else in the instrument setup and finally made a new ZELDA measurement (test ZT06). Due to some time lost on switching to the internal source and on setting up the new ZELDA images, the internal measurement was only acquired ∼30 min after the on-sky one.
The results are presented in Fig. 2. In this plot the tip, tilt, and defocus have been subtracted from the internal and on-sky OPD maps. For the tip-tilt, the DTTS ensures a repositioning accuracy of the order of 0.5 mas rms (Sauvage et al. 2016a) in a given setup. In these data, we measure differential tip and tilt of 1.43 mas and 4.15 mas, respectively. This is significantly larger than the specification, however here we are repositioning the PSF with the DTTS in two different setups: on sky with the full VLT pupil and on the internal source with a circular, nonobstructed pupil. The observed difference is therefore plausible, although more systematic tests would be required to confirm that the repositioning accuracy is slightly decreased between the two configurations. For the defocus, the observed difference of 14 nm rms is less easily explained. Apart from the presence of central obscuration and spiders, the other main difference between the internal source and the sky measurements is the illumination of the pupil, which is  perfectly uniform on sky but Gaussian on the single-mode fiber of the internal source. One possibility is that the difference in illumination, in the presence of phase aberrations, propagates as a slight defocus in the Shack-Hartmann (SH) reconstruction. We tested this hypothesis using a simplified SH simulation using OOMAO (Conan & Correia 2014) but could not reproduce the observed defocus. At this stage the origin of this defocus is not precisely understood and further tests would be needed.
Once the tip, tilt, and focus are removed, the difference in sensitivity of ZELDA on sky appears very clearly in the σ( f ) plot, with a ratio of a factor β = 0.64 on average between the two over spatial frequencies up to 15 c/p. The fact that the OPD maps are spatially almost identical but the σ( f ) values are at a decreased level indicates a change in sensitivity of ZELDA on sky, likely due to the difference in Strehl ratio between the two configurations. Beyond 15 c/p, the ratio goes up to ∼0.8, possibly indicating a different regime at different spatial frequencies.
From here on we use a fixed value for β but keep in mind that the value could in fact vary as a function of the spatial frequency. We see further evidence of this possibility in Sect. 4.3.
In an attempt to explain the loss of sensitivity, we performed a numerical simulation of a ZELDA NCPA reconstruction based on realistic data acquired with the instrument. The simulation is designed to model measurements taken either on the internal source or on sky. For the on-sky simulation, we use reconstructed ExAO residuals computed from SAXO real-time telemetry data over 30 s (41 400 phase screens; see Appendix B). The average residual wavefront error of the generated phase screens is 95.9 ± 12.4 nm rms, corresponding to a Strehl of 87% ± 3% at 1.6 µm, which is a good match to the observing conditions encountered during our tests in Paranal and the typical SAXO performance. The input static NCPA pattern is based on the ZELDA measurement performed in test ZT01 on the internal source and scaled to exactly 55 nm rms between 1 and 15 c/p for the purpose of the simulation. This value corresponds to the one typically observed in SPHERE after the daily calibrations.
The results of the simulation are presented in Fig. 3. In this figure, the residual tip, tilt, and defocus have been removed for a fair comparison to Fig. 2. We first note that the internal reconstruction does not match exactly the input NCPA, with a difference of a factor ∼0.8. This means that the internal reconstruction typically underestimates the NCPA by ∼20%. We also clearly identify a difference in sensitivity between the internal source and on sky in this simulation and we measure a factor β = 0.86 of attenuation between the reconstructions simulated on sky and on the internal source for spatial frequencies up to 20 c/p. This is higher than the factor 0.64 that was identified in the real data in Fig. 2, and the difference in sensitivity appears much flatter than in the data, with a factor that remains the same beyond 20 c/p. We also note a marginal variation of the sensitivity as a function of spatial frequency.
The differences between the internal source and on-sky results can only be attributed to the presence of two factors: the telescope optics and the atmospheric turbulence. Our numerical simulations already take into account the presence of the telescope optics by including the pupil illumination from pupil data measurements, and its effect is estimated to be negligible. The atmospheric turbulence residuals induce wavefront errors and possible cross-talk terms with the existing NCPA, which impact the Strehl ratio on sky. We therefore conclude that the value of the β factor is primarily driven by the Strehl ratio.
The fact that the β factor is not exactly identical in the simulation and on sky can likely be attributed to the variable observing conditions since the SAXO telemetry data were acquired almost 30 min after the ZELDA data. As a consistency check, we measured the Strehl ratio of an off-axis PSF acquired within ∼5 min of the SPARTA telemetry data to be 79%. This value agrees reasonably well with the 82% Strehl ratio expected for the ∼96 nm rms of reconstructed residuals combined with ∼60 nm rms of quasi-static NCPA. Multiplying the reconstructed ExAO residuals by a correction factor, we also determined that a Strehl ratio of ∼70% is necessary to match the observed β = 0.64 ratio of the on-sky and internal measurements. In this test, NCPA was first measured on sky, and subsequently the instrument was switched to internal source without changing anything other than the sourceselection mirror before another NCPA measurement was acquired. The tip, tilt, and focus have been subtracted (see Sect. 3.3), which explains why the amount of integrated aberration is smaller than in Fig. 1). attenuation factor. In conclusion we consider that the loss in sensitivity on sky is well understood and is directly related to the Strehl ratio of the observations. New ZELDA measurements obtained in parallel with SPARTA telemetry should hopefully enable a conclusion on this matter in the future.

Influence of exposure time
To make sure that our on-sky NCPA reconstruction with ZELDA is not limited by the residual atmospheric turbulence downstream of the ExAO system in typical observing conditions (0.7 seeing, 5 ms τ 0 , bright star) and AO regime (1.3 kHz loop speed), we also explored the effect of integration time for the onsky NCPA reconstruction in numerical simulations. We used the same static NCPA pattern as in Fig. 3 and simulated reconstructions assuming exposure times from 0.001 s to 10 s. The results are showed in Fig. 4. In this plot we represent the σ( f ) values of the reconstructed NCPA for four different values of f as a function of integration time as well as a reconstruction on pure atmospheric residuals, that is when setting the amount of NCPA to zero, to show the impact of these residuals on the reconstruction.
At very small integration times, the static NCPA and the turbulent parts are both an issue and contribute equally in the reconstructed wavefront. Beyond 0.1 s, the curves drop rapidly, which means that the atmospheric residuals, that is dynamic contributions to the speckle field, become negligible to the standard deviation of the wavefront and hence in this regime it is critical to   specifically address the NCPA terms as they dominate. We know from Fig. 1 that the level of σ( f ) will drop down to ∼4 nm rms in the 1-15 c/p range once the NCPA is compensated for. Setting up a DIT longer than ∼0.1 s is therefore required to force the dynamic terms to fall below this threshold value and measure the NCPA accurately. Finally, we can safely conclude that we can perform NCPA compensation with ZELDA on bright stars with 8-10 m telescopes equipped with ExAO using short exposure times.

Description of the tests
Beyond the pure on-sky NCPA compensation, the goal of our tests was to demonstrate a quantitative contrast gain in raw coronagraphic images once the compensation is applied. For this purpose, we acquired coronagraphic data following the ZT01, ZT02, ZT03, and ZT04 tests presented in Table 1. The complete data logs for these tests are provided in Table 2. We used the SPHERE apodized-pupil Lyot coronagraph (APLC; Soummer 2005;Carbillet et al. 2011;Guerri et al. 2011) and all data were acquired with the IRDIS instrument (Dohlen et al. 2008) in dual-band imaging (DBI; Vigan et al. 2010) mode. To present the results we focus on the images in the H2 filter of the H23 filter pair, at a wavelength of 1.593 µm and with a width of 52 nm. For each coronagraphic test, we acquired a set of coronagraphic images on-axis, followed by an off-axis PSF taken with a neutral density filter, and finally an instrumental background obtained by either switching the lamp off (internal source) or applying a 30 telescope offset (on sky). The DITs were manually adjusted to avoid any saturation of the detector. The centering of the PSF on the coronagraph mask was manually performed so as to minimize any visible residual tip-tilt.
On the internal source, we acquired two short-exposure coronagraphic images (∼1 min): one with the HOWFS reference slopes corresponding to iteration #0 in the NCPA compensation loop (referred to as "before compensation" hereafter), and one with reference slopes corresponding to iteration #3 ("after compensation" hereafter). A similar procedure was applied on sky but the before and after compensation images were interleaved over several minutes to better sample the variable observing conditions. On 2018-04-03 all the DIT were saved independently, while on 2018-04-01 a setup error ended up saving only the co-added image of 10 NDIT. On 2018-04-03 we also saved SAXO real-time telemetry data in parallel with most of the coronagraphic data acquisition to reconstruct the ExAO residual wavefront errors a posteriori.

Results
The results on 2018-04-01 on the internal source and on sky are presented in Fig. 5. On the internal source, there is a visible gain from the NCPA correction in the images. The gain is particularly visible on the symmetry of the residuals at the edge of the focal-plane mask, as well as on the vertical and horizontal structures created by the HODM actuators on the edge of the control region. These structures are not corrected up to the AO cutoff frequency because only 700 KL modes are kept in the NCPA compensation (see Sect. 3.1). The gain is visible on the contrast curves, with the largest gain being of a factor of six at ∼600 mas, which corresponds to the location of the structures mentioned above. At very small separation (100-300 mas), the gain is limited to a factor of two or three.
On sky the coronagraphic images are dominated by the ExAO residuals in the control region, with a significant contribution from the aliasing term (Cantalloube et al. 2019) that creates the bright horizontal and vertical lobes starting at the cutoff frequency and decreasing towards the center leading visually to a dark cross in the AO-controlled region. In these observations, the spatial filter of the Shack-Hartmann wavefront sensor was set to its MEDIUM position (1.5 λ/D), which provides a moderate attenuation of the aliasing. The SMALL position (1.3 λ/D), which provides a much stronger attenuation of the aliasing, could not be used given the observing conditions on that date. The pattern of quasi-static speckles that was visible on the internal source is still visible in most of the on-sky image. We observe a small contrast gain from 100 mas up to 700 mas after NCPA compensation, but this gain is limited to a factor of 2.5 at most. Similarly to the internal source data, the highest gain is observed at a separation of 600 mas.

Analysis of the limitations with coronagraphic image reconstruction
To understand the observed results on the internal source and on sky we performed coronagraphic image reconstruction based on realistic inputs fed into a model of the instrument. The details of the reconstruction are provided in Appendix C. The comparison between the data and the reconstruction is presented in Fig. 6. On the internal source, the reconstruction provides an excellent match to the data at low spatial frequencies

On-sky
Data / before compensation Data / after compensation Simu / before compensation Simu / after compensation APLC + amplitude errors + VLT pupil Fig. 6. Comparison of the coronagraphic profiles acquired on 2018-04-01 on the internal source (left; test CT01) and on sky (right; test CT02) with coronagraphic image reconstruction based on realistic inputs (see Appendix C). The theoretical performance of the APLC in the presence of amplitude errors is also plotted (dashed green curve). For the on-sky data, the image reconstruction includes ExAO residual wavefront errors based on SAXO real-time telemetry (see Appendix B). The blue shadowed region shows the region masked by the focal-plane mask of the APLC.
(≤650 mas), provided that a defocus term of −40 nm rms is added in the model. This term originates from the fact that the ZELDA mask and the APLC focal-plane masks are mounted in two different positions of the same filter wheel. There is therefore no reason for the best centering and focus to be exactly identical between the two. To find the best value, we generated a family of reconstructions with an increasing defocus term from −100 to +100 nm rms in steps of 10 nm rms and compared the resulting contrast curves with the data. The best fit was obtained for ∼0 nm of defocus for the data before NCPA compensation and −40 nm rms for the data after NCPA compensation. Physically, the ∼0 nm defocus before compensation is expected because it corresponds to the HO-WFS reference slopes optimized for the usual coronagraph during the daily calibrations. After compensation, the focus is, on the contrary, optimized for the ZELDA mask, hence the need to add an appropriate defocus term.
For the image reconstruction, we also multiply the ZELDA OPD map with a factor to compensate for the loss of sensitivity of the ZELDA reconstruction. As demonstrated in the simulation in Fig. 3, the value of this loss factor is of the order of 0.80 on the internal source, so we use 1/0.8 as the correction value in the reconstruction of the internal source data.
The data and reconstruction before and after NCPA compensation match very well at separations below 700 mas. Around the AO cutoff frequency the simulation predicts a level slightly higher than what is actually observed, which could be related to a variation of the sensitivity factor as a function of spatial frequency (see Sect. 3.3). After compensation, close to the focalplane mask, two bumps at ∼100 mas and ∼175 mas are clearly identifiable in both curves. We also compare these results with the contrast curve simulated for the APLC with a circular nonobstructed pupil and in the presence of amplitude errors. At the smallest separations, the raw contrast after compensation of the NCPA on the internal source is very close to the theoretical limit of the APLC when taking into account the amplitude aberrations. This means that on the internal source we are likely limited by the coronagraph design at separations <200 mas, and not by our capacity to calibrate and compensate for NCPA, although there seems to remain a very small amount of low-order NCPA. Beyond 200 mas, the profiles of the theoretical APLC and the data depart from each other, which means that the midfrequencies (∼5-15 c/p) are not completely compensated for. Additional work is still needed to understand why the NCPA compensation does not allow us to further decrease the midspatial frequency errors. One likely possibility is the presence of several dead actuators in the SPHERE HODM, and in particular a column of them located just on the outside on the right of the pupil. While the dead actuators are masked in Lyot stop plane to decrease their impact on coronagraphic imaging, their presence necessarily has an effect on what can be corrected. The column of bad actuators shows an extended impact on a significant portion of the pupil, which can be seen on the OPD maps at iterations #2-4 in Fig. 1. Unfortunately their impact is difficult to model accurately because the phase error induced by the dead actuators is far beyond the linearity range of ZELDA.
On sky, the reconstruction also matches the data extremely well below ∼700 mas. However, at the level of the AO cutoff there is a sharp increase in the simulated curve that is not observed in the data. In this reconstruction, the OPD map measured with ZELDA is compensated assuming a sensitivity loss factor of β = 0.64 as determined from actual measurements in Sect. 3.3. Nevertheless, the β factor represents the difference between the on-sky measurement and the internal source measurement, but we see in Sect. 3.3 that the internal measurement is also attenuated by a factor 0.8 with respect to the real NCPA value. As a result, the on-sky ZELDA measurement is compensated by a factor 1/0.64 × 1/0.8 = 1.95 in the reconstruction. While this factor seems appropriate for the low-spatial frequencies, the value does not seem appropriate at the level of the AO cutoff frequency. This again points towards a value that would vary as a function of spatial frequency. For this first study, provided that the predictions of our simulations appear correct at small separation, we keep a fixed value for the attenuation factor. However, further study would be required to fully understand the behavior of the sensitivity loss factor.
Other than the atmospheric residuals, the main change on sky is the VLT pupil. The impact of this pupil is significant on the theoretical APLC raw contrast, with the apparition of a bright peak at 150 mas as well as secondary peaks in the 200-400 mas range. In the 100-400 mas range, the data after NCPA compensation are obviously limited by these peaks. The relatively small raw contrast gain after NCPA correction is therefore not induced by an improper NCPA compensation on sky, but simply by the absolute performance of the SPHERE APLC design (Carbillet et al. 2011;Guerri et al. 2011). Even before NCPA compensation, the data are also visibly limited by the coronagraph, although to a lesser extent, probably through an effect similar to pinned speckles (Bloemhof et al. 2001;Sivaramakrishnan et al. 2002;Perrin et al. 2003). In conclusion, we can verify that the NCPA compensation with ZELDA functions on sky, but we cannot evaluate its ultimate performance due to the current design of the SPHERE APLC.

Discussion
Second-generation exoplanet imagers have already demonstrated great potential for studying close circumstellar environments, and for detecting new companions and helping to understand their physical properties. However, the current statistics derived from observations (e.g., Nielsen et al. 2019) and the most recent planet population models (e.g., Mordasini et al. 2017;Forgan et al. 2018) respectively show and predict that giant planets are scarce in the regime of mass and separation probed by current instruments. In other words, we need to probe closer to the stars and at deeper contrasts.
One of the current limitations in VLT/SPHERE is the control and compensation for the NCPA. To address this issue, we proposed to use the ZELDA wavefront sensor (Paper I; Paper II) to measure these aberrations down to the level of the coronagraphic focal-plane mask and compensate for them with the HODM. The results on the internal light source in the present work clearly demonstrate that a major gain in the amount of aberration between 1 and 15 c/p can be obtained, with a decrease from ∼50 nm rms down to less than 10 nm rms. This compensation for NCPA naturally translates into a gain in raw contrast in the coronagraphic images, although our simulations based on coronagraphic image reconstruction show that the performance is not limited by the final amount of NCPA after compensation but by the design of the SPHERE APLC. We cannot therefore conclude easily on the ultimate contrast that could be reached thanks to NCPA compensation with ZELDA.
On sky, we demonstrated that ZELDA measurements are possible and that they enable closed-loop NCPA compensation in a fashion identical to that performed on the internal source. However, our data show an important loss in sensitivity of ZELDA, which cannot be explained solely by a loss in Strehl ratio. The sensitivity loss does not represent a limitation as long as the NCPA compensation is performed in closed loop. The compensation will potentially require more iterations to reach the same level of correction, even though in practice (Fig. 1) it seems that a plateau is reached in only a few iterations. In any case, we establish that the presence of ExAO residuals does not limit the accuracy of the ZELDA measurement for integration times beyond a few tenths of a second. A system implementing a continuous NCPA compensation loop would therefore have to run at a very minimum frequency of 10 Hz.
In terms of coronagraphic performance after NCPA compensation, we show that raw contrast gain is of the order of a factor of 2-2.5 at most within 100-700 mas. While this gain is modest,

Simulation On-sky
Simu -before compensation Simu -after compensation New APLC design + amplitude errors + VLT pupil Fig. 7. Expected raw-contrast performance on sky with a new apodizer design for the SPHERE APLC, before and after compensation of the NCPA. The design has been optimized to maximize the contrast in the 80-820 mas region (N'Diaye et al. 2016b). The theoretical performance of the APLC in the presence of amplitude errors is also plotted (dashed green curve). The coronagraphic image reconstruction includes ExAO residual wavefront errors reconstructed from SAXO real-time telemetry (see Appendix B). The blue shadowed region shows the region masked by the focal-plane mask of the APLC. our coronagraphic image reconstructions clearly demonstrate that the SAXO AO system and APLC are performing close to their theoretical limits, in particular when looking at very small angular separation (<400 mas). This comes as a surprise because we anticipated that on sky the main limitation would be dominated by the ExAO residuals close to the optical axis. Our NCPA correction approach based on ZELDA would however offer greater gains on other imagers that are not as well calibrated, or for a coronagraph design with increased theoretical contrast at short separation.
Indeed, this result has strong implications in the context of discussions around upgrades for the SPHERE instrument (Lovis et al. 2017;Mouillet et al. 2018;Beuzit et al. 2019). One of the main goals of such an upgrade would be to improve the sensitivity of the instrument at very small angular separation. With the present work we show that even without a heavy AO upgrade, a low-complexity coronagraph upgrade combined to a proper NCPA compensation scheme would potentially bring a significant contrast gain below 400 mas, i.e., in the region of interest for new planetary companions. To illustrate this, we designed an APLC based on the current focal-plane mask and Lyot stop but with a new apodizer that maximizes the contrast in the 80-820 mas region while having the same transmission as the current APO1 apodizer (N'Diaye et al. 2016b). The outer limit of 820 mas has been chosen to cover the size of the AO control region in H-band (825 mas at 1.6 µm). We injected the resulting apodizer in our coronagraphic image model and the results are shown in Fig. 7. Although the raw contrast appears still somewhat limited by the coronagraph at 150 mas, the raw contrast is now a factor of more than two lower than with the current APLC design. Quite importantly, the coronagraph is also no longer a visible limitation in the 200-400 mas range. Without implying that this specific design is the one that should be implemented in a SPHERE upgrade, it demonstrates that a simple change of apodizer could immediately bring a quantitative gain in contrast.
A better coronagraph design would however require a real NCPA control scheme that is fully part of the operational model of the SPHERE instrument. More generally, the implementation of a ZELDA-based NCPA compensation scheme in any exoplanet imager could follow two different scenarios.
The first one is to have the ZELDA device and the coronagraphic mask in the same filter wheel and to switch between wavefront sensing and science imaging. This scenario has mainly been explored in SPHERE and could be easily transposable to other existing instruments in which a ZELDA component can be added in the coronagraphic mask focal plane. In this scenario, one can calibrate the NCPA either (i) in daytime or during the telescope pointing, which is the most efficient in terms of telescope time, or (ii) directly on-sky on the science target before the science observations. However, this strategy relies on an intrinsic stability of the NCPA as a function of time and/or instrumental configuration. In the case of SPHERE, these stability aspects will be treated in a forthcoming paper. The constraints are even stronger in (ii), since this calibration requires some knowledge of the integration times and frame rates to use to reach sufficient accuracy in the determination of the NCPA. Another important aspect is the chromaticity of the NCPA and its correction. This aspect has begun to be addressed in Paper I and Paper II but a more thorough study is required to conclude on the effectiveness of the correction over a wide spectral band. This is highly relevant for exoplanet imagers with spectroscopic capabilities to study, for example, exoplanet atmospheres over a wide spectral window. During our latest observing run, suitable data have been acquired with SPHERE to the spectral behavior of the NCPA, which will be the subject of future work.
The second and ideal scenario would be a measurement and compensation done regularly in parallel with the science observations, for example performing ZELDA wavefront sensing in a spectral window located outside of the science wavelengths of interest. In the case of SPHERE, this approach is already in use by the DTTS, which picks up a small fraction of the light in H-band just before the coronagraph to measure and compensate for the residual pointing errors of the PSF. The DTTS could in principle be replaced by a ZELDA-based system to sense not only tip-tilt errors but also errors of a higher order.
The use of ZELDA for NCPA compensation is a promising approach for SPHERE, and could be implemented easily in other existing exoplanet imagers. Its application is already envisioned in upcoming facilities on the ground and in space to compensate for NCPA but also for low-order wavefront-sensing aspects. In particular, the approach analyzed here has been adopted as the baseline for NCPA compensation of ELT/HARMONI in its high-contrast imaging mode (Carlotti et al. 2018) and for the low-order wavefront sensor of WFIRST/CGI (Zhao 2014). The maturity of ZELDA gained in these high-visibility instruments will pave the way for its use in space observatories with exo-Earth imaging capabilities such as LUVOIR or Habex.

Appendix A: Results obtained on 2018-04-03
This section presents the results on 2018-04-03 for the NCPA compensation loop (Fig. A.1), the impact on coronagraphic images (Fig. A.2), and the corresponding simulations (Fig. A.3). The results on that date are almost identical to the ones that are presented for 2018-04-01 in the main text.

On-sky
Data / before compensation Data / after compensation Simu / before compensation Simu / after compensation APLC + amplitude errors + VLT pupil

Appendix B: ExAO residual phase-screen reconstruction
SAXO enables real-time telemetry to be recorded from the SPARTA real-time computer (Suárez Valles et al. 2012). This recording is however not systematic because of the large amount of generated data (typically ∼4 GB min −1 ). The main recorded information is the individual slopes for each of the 1240 subapertures of the high-order Shack-Hartmann wavefront sensor. In this section we outline how we used these slopes in combination with the SAXO matrices to reconstruct residual phase screens that are representative of the observing conditions encountered during our tests on sky. All the HOWFS residual slopes are directly read from the recorded telemetry and stored in a matrix S of dimension 2N subap × N ts , where N ts and N subap denote the number of saved time steps and the number of HOWFS sub-apertures. The factor two comes from the storage of the individual x and y slopes. For the reconstruction, we use the following matrices, which are all saved automatically in parallel with the HOWFS slopes when recording real-time telemetry: -IFM HODM : the HODM influence matrix, calibrated from Zygo measurements in the laboratory during the SAXO integration. This matrix is used to go from the HODM voltages to optical path difference in nanometers. It has a dimension of N act ×240 2 , where 240 represents the diameter of the pupil in pixels in all SAXO calibrations; -S2M: the slope-to-mode matrix that is used to go from orthogonal slope-subspace to KL-mode-space (Petit et al. 2008). Its dimension is 2N subap × N mode . -M2V: the mode-to-voltage matrix that is used to go from KLmode-space to HODM voltage-space. It has a dimension of N mode × N act ; -IM ITTM : the image tip-tilt mirror (ITTM) interaction matrix that is used to go from the ITTM voltage-space to the parallel slope-subspace. Its dimension is 2 × 2N subap ; p : the parallel projection matrix that is used to go from the HOWFS slope-space to the parallel slope-subspace (Petit et al. 2008). Its dimension is 2N subap × 2; p ⊥ : the orthogonal projection matrix that is used to go from the HOWFS slope-space to the orthogonal slope-subspace.
The orthogonal subspace is defined as the slopes subspace complementary to the parallel slope subspace (Petit et al. 2008). This matrix has a dimension of 2N subap × 2N subap ; with N act being the number of HODM actuators and N mode the number of KL-modes controlled by SAXO. In these matrices, the voltages are normalized to ±1, which means they represent the actuators stroke.
The SPHERE control architecture includes a specific separation strategy between ITTM and HODM control, first to ensure strict decoupling between the two control loops, and second to allow dedicated control laws to be used for each control loop. In that framework, the HOWFS slopes space is split into a parallel subspace, corresponding to the slopes subspace addressed by the ITTM actuation. This is a two-dimensional subspace. Its counterpart is 2N subap × 2 slopes subspace, and is therefore referred to as orthogonal subspace. The ITTM is controlled from the parallel subspace, while the HODM is controlled from the orthogonal subspace. Consequently, the slopes are first decomposed into their parallel and orthogonal values S and S ⊥ using the appropriate projection matrices: The tip-tilt and higher orders are then reconstructed separately.

B.1. Reconstruction of the tip and tilt
For the tip-tilt reconstruction, some computation is required to obtain the control matrix of the ITTM that is used to go from the HOWFS parallel slope subspace to the ITTM voltage space. This is achieved with where Z 1 and Z 2 represent the 2D Zernike polynomials corresponding to the tip and tilt that are normalized to 1 nm rms and computed over a pupil of 240 pixels in size, D tel is the telescope diameter (8 m), and γ = 2.6 V −1 = 1.26 × 10 −5 rad V −1 is the conversion factor between IM voltages and angular motion on sky. The tip and tilt values are interleaved in the V ITTM matrix, which is why only the even-index values are used for the tip and the odd-index values are used for the tilt. At the end of this procedure, OPD tip and OPD tilt are matrices of size 240 2 × N ts . An example of the tip and tilt reconstruction is provided in Fig. B.1.

B.2. Reconstruction of the higher orders
For the reconstruction of the higher orders, we can directly use the available matrices to reconstruct the OPD maps: At the end of this procedure, OPD HO is a matrix of 240 2 × N ts in size. An example of the reconstruction higher orders is provided in Fig. B.1.

B.3. Final reconstruction
The reconstruction of the OPD maps representing the residuals from SAXO is finally achieved by summing the tip-tilt and higher-order contributions: which is a matrix of dimension 240 2 × N ts . However, this does not represent a completely usable residual OPD map because the 40 × 40 sub-apertures Shack-Hartmann wavefront sensor cannot sense aberrations beyond 20 c/p. The AO fitting and aliasing errors are therefore not included in the reconstruction. To circumvent these missing contributions, we used a Fourier simulation code based on the modeled power-spectral densities (PSDs) of the AO residuals that are tuned for the SAXO AO system. This code was originally used during the design phase of SAXO . This approach presents the advantage of enabling the decomposition of the full AO-filtered PSD into its individual terms, namely the fitting error, the servo-lag error, the aliasing error, the noise error, and the differential refraction.
In the case of our reconstruction, we included the stellar magnitude, zenith distance, and azimuth as input parameters. For the seeing and C 2 n we used the ESO Paranal ambient database to access the data from the MASS 3 (Kornilov et al. 2007), and we estimated that more than 55% of the turbulence was contained in a ground layer at altitude z = 0 km, 35% in a layer at z = 4 km, and the final 10% in a high-altitude layer at z = 16 km. The wind speed and direction of the ground layer were set in the simulation using the values that are also reported in the ambient database.
The Fourier code was then used to generate a PSD that only includes the fitting error and the aliasing error. For the aliasing, an attenuation factor of 0.5 was used to take into account the fact that SAXO used a spatially filtered Shack-Hartmann wavefront sensor and that the spatial filter was in the MEDIUM position, which only provides a partial attenuation of the aliasing. Using the PSD, we finally generate a series of N ts random OPD maps (OPD Fourier ) that are added to the OPD maps reconstructed from the SAXO real-time telemetry: OPD ExAO = OPD SAXO + OPD Fourier . (B.11) One realization of OPD Fourier is provided in Fig. B.1, with the final OPD map OPD ExAO resulting from the sum of the tip and tilt, the high orders, and the Fourier-simulated aliasing and fitting errors. These reconstructed OPD maps can finally be converted into residual phase screens and injected into numerical simulations to generate images with realistic ExAO residuals (see Appendix C).

OPD ExAO
Entrance pupil at a given λ within the spectral bandwidth ∆λ centered at the wavelength λ 0 . Our simplified model relies on the APLC setup that involves four successive planes (A, B, C, D) in which A defines the entrance pupil P , which combines the telescope pupil shape P 0 , the apodization Φ, the amplitude errors a, and phase errors ϕ, B sets the location of the focal plane mask, C includes the Lyot stop L to filter out the diffraction due to the coronagraph mask, and D represents the final image plane. The optical layout of the coronagraph is such that the complex amplitudes of the electric field in two successive planes are related by a Fourier transform. We callf the Fourier transform of the function f . Our coronagraph image reconstruction is performed with the following inputs: -ExAO residual wavefront errors OPD ExAO reconstructed from SAXO real-time telemetry (see Appendix B) with a number of maps N = 41 400, corresponding to a 30 s integration time sequence with a closed-loop frequency of 1380 Hz; -instrument pupil P 0 . This is modeled by a clear unobstructed aperture on internal source and a VLT pupil on sky; -amplitude errors map a, measured from pupil images in SPHERE; -NCPA OPD map measured with ZELDA OPD ZELDA and taking into account the loss in sensitivity (see Sect. 3.3) due to the ZELDA sensitivity factor β; -differential defocus term OPD Z4 of −40 nm rms between the ZELDA mask and the APLC focal-plane mask, estimated from the data (see Sect. 4.3); -transmission of the APO1 apodizer Φ; -model of the circular hard-edge focal-plane mask of diameter m and transmission 1 − M with M(ξ) = 1 for ξ < m/2 and 0 otherwise; -OPD map of the apodizer OPD Φ , obtained from Zygo measurements during the integration of the instrument. The map used in the simulation is a reconstruction of the measurement with 16 Zernike modes; -transmission of the STOP_ALC2 Lyot stop L. These different elements are displayed in Fig. C.1. As our images were taken in H2 filter on VLT/SPHERE, we use λ 0 = 1.593 µm and ∆λ = 0.052 µm in our simulations.

C.2. Image reconstruction
We perform our coronagraphic image simulations with the new Python-based object-oriented toolkit called Coronagraphs that enables one to model and optimize Lyot-style coronagraphs and Shaped-pupildevices(N'Diaye&Flamary,inprep.).Thecodewill soon be made public and freely available under the MIT license 4 .
For the reconstruction on the internal source, we do not include the ExAO residual atmospheric phase screens and we only simulate a single coronagraphic image. On the contrary, for the on-sky image reconstruction we simulate coronagraphic images with the same number as the ExAO residual phase screens and all the images are averaged in intensity to simulate a long exposure. An example of such a reconstruction for the data acquired on 2018-03-03 is provided in Fig. C.2. The reconstruction is based on a ZELDA OPD map and pupil amplitude image, which were acquired a few minutes before the coronagraphic data (the ZT04 and CT04 tests are summarized in Tables 1 and 2), and on the SAXO telemetry that was acquired in parallel with the coronagraphic data. The figure shows both the complete data and reconstruction as well as their high-passfiltered versions where the low-spatial frequency ExAO residuals have been removed with a simple median filter.
The complete data and reconstruction show strong similarities in the quasi-static speckles structures. However, the data are dominated by aliasing effects in the ExAO-correction region. While the coronagraphic image reconstruction does include an aliasing term that has been added to the reconstruction of the ExAO residual phase screens (see Appendix B), its modeling does not fully agree with reality. We tried different values for the aliasing attenuation but could never reproduce the same level as in the data. Further work is needed to understand this discrepancy and model the aliasing effect properly in the reconstruction. 4 https://github.com/astromam/Coronagraphs To bypass this limitation, we also show a spatially filtered version of the data and the reconstruction in Fig. C.2. Many structures originating from the quasi-static aberrations are well matched between the data and the model, in particular at the level and beyond the ExAO cutoff. Inside the ExAO-corrected region, there is good correlation of the main speckles but in general the agreement is worse than outside of this region. This means that some aberrations are not properly taken into account in our model. In this very preliminary study we have not yet explored the origin of these discrepancies in full detail, but we can already mention two possible contributors that are not taken into account. The first one is the post-coronagraphic aberrations. Although their contribution is expected to have little effect on coronagraphic images (e.g., Cavarroc et al. 2006), they cannot be disregarded completely. Unfortunately, ZELDA is only sensitive to the aberrations upstream of the coronagraph, so we currently have no estimation of the downstream aberrations. The second is Fresnel propagation effects inside the instrument, which are going to convert some amplitude aberrations into phase aberrations and vice versa. Our current model based on the classical four-plane APLC setup does not yet take such effects into account. In any case, modeling the Fresnel propagation effects in SPHERE would be extremely challenging without detailed knowledge of the phase and amplitude aberrations that are introduced by each optical component.
Despite these small differences, the agreement is statistically excellent, as demonstrated in the contrast curves that are presented in Figs. 6 and A.3. This is very promising for future work that could use prior knowledge on the instrument to remove part of the ExAO-related structures (e.g., the wind-driven halo, Cantalloube et al. 2018) or quasi-static speckles to improve data analysis and interpretation of resolved structures like circumstellar disks.