Issue 
A&A
Volume 668, December 2022



Article Number  A36  
Number of page(s)  7  
Section  Astronomical instrumentation  
DOI  https://doi.org/10.1051/0004-6361/202143001  
Published online  01 December 2022 
A deep learning approach for focal-plane wavefront sensing using vortex phase diversity
^{1} Montefiore Institute of Electrical Engineering and Computer Science, University of Liège, Liège, Belgium
^{2} Space sciences, Technologies and Astrophysics Research (STAR) Institute, University of Liège, Liège, Belgium
email: maxime.quesnel@uliege.be
Received: 24 December 2021
Accepted: 30 September 2022
Context. The performance of high-contrast imaging instruments is limited by wavefront errors, in particular by non-common path aberrations (NCPAs). Focal-plane wavefront sensing (FPWFS) is appropriate to handle NCPAs because it measures the aberration where it matters the most, that is to say at the science focal plane. Phase retrieval from focal-plane images results, nonetheless, in a sign ambiguity for even modes of the pupil-plane phase.
Aims. The phase diversity methods currently used to solve the sign ambiguity tend to reduce the science duty cycle, that is, the fraction of observing time dedicated to science. In this work, we explore how we can combine the phase diversity provided by a vortex coronagraph with modern deep learning techniques to perform efficient FPWFS without losing observing time.
Methods. We applied the state-of-the-art convolutional neural network EfficientNet-B4 to infer phase aberrations from simulated focal-plane images. The two cases of scalar and vector vortex coronagraphs (SVC and VVC) were considered using a single post-coronagraphic point spread function (PSF) or two PSFs obtained by splitting the circular polarization states, respectively.
Results. The sign ambiguity has been properly lifted in both cases even at low signal-to-noise ratios (S/Ns). Using either the SVC or the VVC, we have reached a very similar performance compared to using phase diversity with a defocused PSF, except for high levels of aberrations where the SVC slightly underperforms compared to the other approaches. The models finally show great robustness when trained on data with a wide range of wavefront errors and noise levels.
Conclusions. The proposed FPWFS technique provides a 100% science duty cycle for instruments using a vortex coronagraph and does not require any additional hardware in the case of the SVC.
Key words: techniques: high angular resolution / techniques: image processing
© M. Quesnel et al. 2022
Open Access article, published by EDP Sciences, under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
This article is published in open access under the Subscribe-to-Open model. Subscribe to A&A to support open access publication.
1 Introduction
Because of the small angular separation and high contrast between planetary companions and their parent star, exoplanet imaging is particularly challenging. Although these constraints can be addressed with specific instruments such as coronagraphs, residual wavefront aberrations still represent an inherent obstacle for detecting the majority of exoplanets. To a large extent, these residuals originate from non-common path aberrations (NCPAs) between the scientific and wavefront sensing arms. Focal-plane wavefront sensing (FPWFS) is an approach that has the advantage of taking NCPAs into account by probing their signature in the focal-plane images (Jovanovic et al. 2018) while offering high sensitivity.
Estimating phase aberrations from the sole scientific images is not trivial since the relationship between focal-plane intensities and the pupil-plane phase is nonlinear and degenerate (Guyon 2018). Numerical methods have been developed for FPWFS, such as iterative algorithms (Fienup 1982), with the most standard one being the Gerchberg-Saxton algorithm (Gerchberg 1972). More recent techniques have been proposed for various applications (see Jovanovic et al. 2018, for a review), including the use of deep learning techniques for FPWFS (Paine & Fienup 2018; Andersen et al. 2019, 2020; Orban de Xivry et al. 2021). All of these approaches have to deal with one important hindrance: for a centrosymmetric pupil, two different phase distributions in the input pupil plane can produce the same point spread function (PSF). This ambiguity, also called the twin-image problem (e.g., Guizar-Sicairos & Fienup 2012), is typically solved with phase diversity using, for instance, an additional defocused PSF (Gonsalves 1982), or an asymmetric pupil mask (Martinache 2013). This, however, reduces the science duty cycle because some observing time, and/or part of the science beam, has to be dedicated to wavefront measurements exclusively.
Based on the properties of the vector vortex coronagraph (VVC, Mawet et al. 2005), a Nijboer-Zernike phase retrieval approach tailored to the post-VVC PSF was formulated in Riaud et al. (2012a,b). They proposed to split the two circular polarization states to exploit the phase diversity introduced by the two opposite topological charges associated with the VVC. A similar approach was more recently used by Bos et al. (2019) in the case of the grating-vector apodizing phase plate, although it also required an asymmetric pupil to lift the sign ambiguity fully.
Here, we revisit the problem of phase retrieval behind a vortex coronagraph using deep learning techniques. Unlike an analytical approach, which could show limitations regarding its formulation, deep learning models can be trained regardless of the instruments and observing conditions. First, in Sect. 2, we argue that a scalar vortex coronagraph (SVC) has the potential to yield comparable residual phase errors to the dual-polarization VVC implementation, using a single post-coronagraphic PSF instead of two. In Sect. 3, we present our deep learning approach, based on convolutional neural networks (CNNs), which have the advantage of being flexible and easy to implement, and they have already been shown to be capable of reaching fundamental noise limits in our previous works (Quesnel et al. 2020; Orban de Xivry et al. 2021). Finally, in Sect. 4, we provide quantitative results on simulated data. We compare the performance of our vortex phase diversity method to a classical approach, and assess the robustness of the models, notably in the presence of representative atmospheric turbulence residuals.
2 Vortex phase diversity
2.1 Vortex coronagraphs
The vortex coronagraph (VC), introduced by Mawet et al. (2005), is a transparent focal-plane mask that diffracts on-axis light outside of the pupil area. A Lyot stop placed in a downstream pupil plane allows this diffracted light to be blocked, enabling high-contrast observations. Because of wavefront aberrations, some incoming light from the star is, however, not blocked. Indeed, the VC only removes the Airy disk, and speckles still appear in the focal plane.
There are two different types of vortex coronagraphs: vectorial (VVC) and scalar (SVC). The VVC applies a geometrical phase ramp to the incoming wavefront with a transmission t = exp(±j l_{p} θ), where l_{p} is the topological charge and θ is the azimuthal coordinate. Conjugated phase ramps are applied to each circular polarization state, producing a different signature in the focal plane for each (Riaud et al. 2012a). Here we focus on a topological charge l_{p} = 2, which is the most commonly used design so far (Mawet et al. 2009; Absil et al. 2016), but the following developments would also hold for any even topological charge. Unlike the VVC, the SVC uses longitudinal phase delays (Ruane et al. 2019; Desai et al. 2021), and thereby applies the same phase ramp (e.g., with +l_{p}) to both polarization states. The focal-plane signature behind an SVC corresponds to the one obtained with a single polarization state using the VVC.
2.2 Sign ambiguity and phase diversity
In FPWFS, the Fourier relationship between the PSF and the pupil-plane phase causes a sign ambiguity for Zernike modes of an even radial order (e.g., defocus, astigmatism):
$$ \left|F\left(E_{even}(x)\right)\right|^{2} = \left|F\left(E^{*}_{even}(x)\right)\right|^{2} \qquad (1) $$
where E_{even}(x) = exp(−jϕ_{even}(x)) is the pupil-plane electric field with phase aberrations ϕ_{even} (containing even modes only), E^{*}_{even} is its conjugate, and F(·) is the Fourier transform operator. This sign ambiguity is a strong limitation for FPWFS using a single in-focus image. A fair number of FPWFS methods have been developed to solve the twin-image problem. The most standard one is to use an additional known defocus together with the in-focus image. An illustration of this ambiguity can be found in Fig. 1, where we generated two phase maps with opposite signs for their even Zernike modes. After propagation through a VVC, the in-focus PSFs are the same in both cases (Figs. 1d and f), showcasing the twin-image problem. The out-of-focus PSFs, however, are different (Figs. 1e and g) because the added defocus has the same sign in both cases, which allows the ambiguity to be lifted.
Now, if the two orthogonal circular polarization states are split downstream of the VVC to separate the conjugated phase ramps (−l_{p} and +l_{p}), or if the case of the SVC is considered, the in-focus PSFs are not identical anymore (Figs. 1h and j, or Figs. 1i and k). The resulting PSFs are actually switched between the two circular polarization states. This indicates that the sign ambiguity can potentially be lifted when using either the two PSFs obtained from the separate circular polarization states, or the single PSF behind the SVC independently of the polarization state. This illustrates the fact that the VC provides an azimuthal phase diversity, which can be used instead of the radial phase diversity provided by an additional defocus (Riaud et al. 2012a). In the case of the SVC, the sign ambiguity would then be lifted similarly to using only an out-of-focus PSF in classical phase diversity (e.g., Lamb et al. 2021).
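The behavior described above can be checked with a minimal numerical sketch. It is not the HEEPS/PROPER propagation used in this work: it assumes a monochromatic, FFT-based model of a charge-2 vortex followed by a Lyot stop, a 128 × 128 grid, a pupil radius of 16 pixels, a Lyot stop with an outer radius reduced to 98%, and an even aberration made of simple (non-annular) defocus and astigmatism terms. All function names are hypothetical.

```python
import numpy as np

def centered_grid(n):
    """Pixel coordinates centred on index n//2 (the FFT origin after shifting)."""
    y, x = np.indices((n, n)) - n // 2
    return x, y

def ft(a):
    """Centred 2-D Fourier transform."""
    return np.fft.fftshift(np.fft.fft2(np.fft.ifftshift(a)))

def ift(a):
    """Centred 2-D inverse Fourier transform."""
    return np.fft.fftshift(np.fft.ifft2(np.fft.ifftshift(a)))

def annulus(x, y, r_out, r_in):
    """Binary annular aperture (1 inside, 0 outside)."""
    r = np.hypot(x, y)
    return ((r <= r_out) & (r >= r_in)).astype(float)

def vortex_mask(x, y, charge):
    """Focal-plane phase ramp exp(j * charge * theta)."""
    mask = np.exp(1j * charge * np.arctan2(y, x))
    # Zero the wrap-around row and column so that the mask stays exactly
    # centrosymmetric on the periodic FFT grid (numerical convenience only).
    mask[0, :] = 0
    mask[:, 0] = 0
    return mask

def psf(phase, pupil, lyot, mask=None):
    """Pupil -> (optional) focal-plane mask -> Lyot stop -> detector intensity."""
    focal = ft(pupil * np.exp(1j * phase))
    if mask is not None:
        focal = focal * mask
    return np.abs(ft(ift(focal) * lyot)) ** 2

n, radius = 128, 16.0
x, y = centered_grid(n)
pupil = annulus(x, y, radius, 0.3 * radius)
lyot = annulus(x, y, 0.98 * radius, 0.3 * radius)  # assumed Lyot stop geometry

# Even aberration (radians): defocus plus astigmatism, unchanged by (x, y) -> (-x, -y).
rho2 = (x**2 + y**2) / radius**2
phase = 0.4 * (2 * rho2 - 1) + 0.4 * (x**2 - y**2) / radius**2

plus, minus = vortex_mask(x, y, +2), vortex_mask(x, y, -2)
tol = dict(rtol=1e-6, atol=1e-9 * psf(phase, pupil, lyot).max())

# Without a vortex, phi and -phi give the same image (twin-image problem).
same_without_vortex = np.allclose(psf(phase, pupil, lyot), psf(-phase, pupil, lyot), **tol)
# Behind a single phase ramp (+l_p), the two images differ.
same_with_vortex = np.allclose(psf(phase, pupil, lyot, plus), psf(-phase, pupil, lyot, plus), **tol)
# The image of -phi behind +l_p matches the image of phi behind -l_p.
swapped = np.allclose(psf(-phase, pupil, lyot, plus), psf(phase, pupil, lyot, minus), **tol)
print(same_without_vortex, same_with_vortex, swapped)
```

Under these assumptions, the first and third comparisons should hold while the second should not, which mirrors the switch between polarization states seen in Figs. 1h to k. A purely rotationally symmetric aberration (e.g., defocus alone) would give identical images for +l_{p} and −l_{p} in this toy model, which is why an astigmatism term is included.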
Fig. 1 Comparison of simulated PSFs for two conjugated phase maps ϕ (left) and ϕ′ (right): for ϕ′, we set opposite Zernike coefficients to those of ϕ only for the even modes, with a total of 18 modes starting from defocus. (a) Entrance annular pupil. (b, c) The conjugated phase maps. (d, e) In-focus and out-of-focus PSFs obtained from propagating (b) with both polarization states together. (f, g) The same as (d, e) but using (c) for propagation instead. (h, i) In-focus PSFs obtained from (b) with −l_{p} and +l_{p} used separately. (j, k) The same as (h, i) but using (c) for propagation instead. 
3 Deep learning approach
3.1 Data generation
In our simulations, we considered an annular entrance pupil with a diameter of 8 m and a central obstruction of 30%. An observed bandwidth of 0.2 µm was defined around 2.2 µm (K band), by simulating a total of five wavelengths. A pixel scale of 0.25 λ/D/pix was set with a detector containing 64 × 64 pixels, giving a field of view of 16 λ/D. The most relevant simulation parameters are listed in Table 1.
We generated the phase aberrations using annular Zernike polynomials, which make up an orthonormal basis on the input pupil:
$$ \phi = \sum_{i=1}^{N_{modes}} c_{i} Z_{i} \qquad (2) $$
where ϕ is the complete phase map, Z_{i} are the Zernike polynomials, c_{i} are the corresponding coefficients, and N_{modes} is the number of modes considered.
The generated datasets are composed of 18 or 88 Zernike modes, up to the fifth and 12th radial orders, respectively, excluding the piston, tip, and tilt modes. The set of Zernike coefficients for each sample was first randomly generated within the range [−1, 1] before each coefficient was divided by its corresponding radial order to approximate a 1/f^{2} power spectral density profile, typically encountered with good quality optics (Dohlen et al. 2011). Low and high aberration levels, represented by wavefront error (WFE) distributions centered at 70 and 350 nm root mean square (RMS), respectively, are considered by normalizing the Zernike coefficients accordingly. An example of such a distribution can be seen in Orban de Xivry et al. (2021). For classical phase diversity, the additional defocus was set to λ/5, that is, 440 nm RMS. In our case, this amount of diversity is close to the optimal value in terms of phase retrieval performance. The defocus was added in the entrance pupil plane, as if done by the deformable mirror of an adaptive optics system, which means that the resulting defocused PSFs contain more flux than the in-focus PSFs as the coronagraphic performance of the VC is degraded.
To increase the representativeness of our simulations and to test the robustness of our approach, we added atmospheric turbulence residuals to the phase maps. A state-of-the-art extreme adaptive optics (AO) system was simulated using the COMPASS library (Ferreira et al. 2018), assuming a loop frequency of 3.5 kHz, a 2-frame delay, a 50 × 50 deformable mirror (i.e., 2040 modes/valid actuators), and a pyramid sensor with 5 λ/D of modulation (without noise). This yielded a Strehl ratio of about 98% at 2.2 µm, corresponding to a WFE of about 50 nm RMS. We sampled the AO residuals at 10 Hz and we used a sequence of ten consecutive phase screens by summing up the corresponding PSFs. We therefore simulated a 1-s exposure in the presence of a given amount of static NCPAs. The results with data containing these AO residuals are shown in Sect. 4.3.
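The coefficient generation described above can be sketched as follows. This is an illustrative sketch rather than the code used for the paper: it assumes Noll ordering starting at defocus (j = 4), and it normalizes each draw to a single target RMS value, whereas the datasets follow a WFE distribution centered on that value. The function names are hypothetical.

```python
import math
import numpy as np

def noll_radial_order(j):
    """Radial order n of the Zernike mode with Noll index j (j = 1 is piston)."""
    return (math.isqrt(8 * (j - 1) + 1) - 1) // 2

def draw_zernike_coefficients(n_modes, rms_wfe_nm, rng):
    """Draw one set of coefficients (nm) for modes starting at defocus (Noll j = 4)."""
    noll = np.arange(4, 4 + n_modes)
    orders = np.array([noll_radial_order(int(j)) for j in noll])
    # Uniform draw in [-1, 1], then division by the radial order to favour low orders.
    coeffs = rng.uniform(-1.0, 1.0, n_modes) / orders
    # For an orthonormal basis, the RMS WFE is the quadrature sum of the coefficients.
    coeffs *= rms_wfe_nm / np.sqrt(np.sum(coeffs**2))
    return coeffs, orders

rng = np.random.default_rng(0)
coeffs, orders = draw_zernike_coefficients(18, 70.0, rng)
print(orders.min(), orders.max(), np.sqrt(np.sum(coeffs**2)))
```

With this indexing, 18 modes end at the fifth radial order and 88 modes end at the 12th, consistent with the mode counts quoted above.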
To simulate a PSF obtained behind a VVC, we performed two propagations, one with +l_{p} and the other with −l_{p}, to consider each circular polarization state. The downstream Lyot stop blocked 2% of the outer pupil area (but the central obstruction was not oversized). The resulting PSFs were then either summed up to reproduce the non-polarized case, or they were kept separate to consider the dual-polarization case. To simulate the SVC, only one such PSF was taken. The optical propagation was handled by the HEEPS package^{1} (Carlomagno et al. 2020), which makes use of PROPER (Krist 2007). Examples of generated phase maps and PSFs can be found in Fig. 1. We then added photon noise to our PSFs, so that the signal-to-noise ratio (S/N) was defined as S/N = \sqrt{N_{ph}}, where N_{ph} is the number of photons. A square-root stretching operation was applied to the PSFs to help the CNN identify the speckle patterns. Finally, we normalized the PSFs with a min-max scaling to obtain flux in the range [0,1], which ensured the CNN was fed with same-scale quantities.
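The noise and preprocessing steps listed above can be sketched as below. The sketch assumes pure photon (Poisson) noise with S/N = sqrt(N_ph), and it assumes that all N_ph photons fall on the detector frame; the flux suppression by the coronagraph and the splitting between channels are not modeled. The function name and the Gaussian test image are hypothetical.

```python
import numpy as np

def add_noise_and_preprocess(psf, snr, rng):
    """Photon noise, square-root stretch, then min-max scaling to [0, 1]."""
    n_photons = snr**2                           # S/N = sqrt(N_ph)
    expected = psf / psf.sum() * n_photons       # expected counts per pixel
    noisy = rng.poisson(expected).astype(float)  # photon (Poisson) noise
    stretched = np.sqrt(noisy)                   # enhances faint speckles
    return (stretched - stretched.min()) / (stretched.max() - stretched.min())

# Toy 64 x 64 image standing in for a simulated PSF.
rng = np.random.default_rng(1)
yy, xx = np.indices((64, 64)) - 32
toy_psf = np.exp(-(xx**2 + yy**2) / (2 * 3.0**2))
image = add_noise_and_preprocess(toy_psf, snr=100.0, rng=rng)
print(image.shape, image.min(), image.max())
```

The min-max step divides by the dynamic range of the frame, so a frame with no detected photons at all would need to be handled separately.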
Table 1 Data generation parameters.
3.2 Model architecture
We built deep neural network models whose goal is to infer the Zernike coefficients of the phase aberrations ϕ from a given PSF I, that is, to approximate a nonlinear function f such that ϕ ≈ f(I). CNNs have proven to be very well suited for image analysis, with numerous applications for both classification and regression tasks. CNN-based architectures have been developing very quickly in recent years, with performance still improving greatly. We have therefore used a state-of-the-art deep CNN called EfficientNet (Tan & Le 2019). This type of architecture stands out from other ones by using a new scaling technique: all dimensions of the CNN (depth, width, and resolution) are scaled by the same compound coefficient Φ, and the parameters are inferred from the original model or baseline EfficientNet-B0 (Φ = 0). There are thus different models available, and we chose to use EfficientNet-B4, for which we have obtained the best trade-off between model performance and runtime. EfficientNet-B4 has a total of 1.9 × 10^{7} parameters and 4.2 × 10^{9} FLOPS. It has about the same number of parameters as the ResNet-50 architecture, which was used in Quesnel et al. (2020) and Orban de Xivry et al. (2021).
3.3 Model training
For a given training run, a dataset composed of 10^{5} PSFs (or PSF pairs for the cases with two input channels) was randomly split into training (90%) and validation (10%) sets. Each sample also contains the true NCPA phase maps as labels, while the AO phase screens are never given. Batches composed of 64 data samples were then consecutively fed to the neural network. We define the loss function as the root-mean-square error (RMSE) of the phase residuals. Weight updates based on the loss were handled by the Adam optimizer (Kingma & Ba 2017). To improve the performance, we set a penalty on the loss (“weight decay”) of 10^{−7} for the low aberration regime and 10^{−6} for the higher aberration regime. We also set an initial learning rate of 10^{−3} which was decreased by a factor of two as soon as the validation loss reached a plateau over 15 epochs. This results in sudden loss drops, allowing the performance to be greatly improved. Pretrained models on ImageNet were used to initialize the weights. The training of the model was stopped if no improvement of the validation loss was observed over 25 epochs. This results in training procedures lasting between 50 and 250 epochs.
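The learning-rate and stopping rules described above can be illustrated with a small, framework-free sketch. It encodes one plausible reading of the text (a plateau means no new best validation loss for 15 epochs since the last improvement or rate cut; training stops after 25 epochs without a new best) and is not the training code itself. The function name and its parameters are hypothetical.

```python
def run_schedule(val_losses, lr=1e-3, factor=0.5, lr_patience=15, stop_patience=25):
    """Return the learning rate used at each epoch and the epoch at which training stops."""
    best = float("inf")
    since_best = 0    # epochs since the last new best loss (early stopping)
    since_change = 0  # epochs since the last new best loss or learning-rate cut
    lrs = []
    for epoch, loss in enumerate(val_losses):
        lrs.append(lr)
        if loss < best:
            best, since_best, since_change = loss, 0, 0
        else:
            since_best += 1
            since_change += 1
        if since_best >= stop_patience:
            return lrs, epoch          # early stopping
        if since_change >= lr_patience:
            lr *= factor               # halve the learning rate on a plateau
            since_change = 0
    return lrs, len(val_losses) - 1

# A validation loss that never improves after the first epoch.
lrs, stop_epoch = run_schedule([1.0] * 60)
print(stop_epoch, lrs[15], lrs[16])
```

In a PyTorch setting, comparable logic is available through the `ReduceLROnPlateau` scheduler combined with a manual early-stopping counter, although the exact plateau criterion may differ from the one assumed here.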
4 Results
We compare the capacity of different configurations to lift the sign ambiguity as well as their performance. The designation of these configurations, together with some of their parameters, can be found in Table 2: we consider the cases of the VVC with or without classical phase diversity (“VVC [in, out]-focus” and “VVC in-focus,” respectively), which are compared to the new approaches presented in this paper (“VVC dual-polar” and “SVC”). The non-coronagraphic case (“no vortex [in, out]-focus”) is evaluated as well. We also investigate the possibility of working with differential PSFs obtained by subtracting the separate circular polarization states (“VVC dual-polar; diff PSFs”). In the last part of this section, we add atmospheric turbulence residuals and we assess the robustness of the models regarding variations in the S/N levels, input wavefront errors, and Zernike polynomial orders. All models are evaluated using 1000 test samples.
Fig. 2 RMSE per Zernike mode, following the Noll convention, starting from the defocus mode. Four cases were compared (see Table 2 for notations), using a single in-focus post-VVC PSF without splitting the polarization states (cyan), two post-VVC PSFs with additional defocus (dark blue), the two post-VVC PSFs associated with each polarization state (red), and a single PSF after the SVC (orange). The RMSE of the input phase maps is represented in black and the even modes are indicated by the green areas. Left: input WFE of 70 nm distributed over 18 modes. Right: input WFE of 350 nm distributed over 88 modes. In both examples, the S/N in the entrance pupil plane is equal to 100. 
Table 2 Configurations considered for phase retrieval.
4.1 Phase sign determination
To determine whether the models predict the correct sign, we looked at the performance per Zernike mode. The metric used is the RMSE per mode:
$$ \mathrm{RMSE} = \sqrt{\frac{1}{N_{test}} \sum_{k=1}^{N_{test}} \left(\hat{c}_{k} - c_{k}\right)^{2}} \qquad (3) $$
where N_{test} is the number of test samples, while \hat{c} and c are the estimated and true Zernike coefficients, respectively.
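As a minimal sketch of this metric, assuming the estimated and true coefficients are stored as arrays of shape (N_test, N_modes), the RMSE per mode is a mean over the sample axis. The function name and the toy numbers are hypothetical.

```python
import numpy as np

def rmse_per_mode(c_est, c_true):
    """RMSE over test samples for each mode; inputs have shape (N_test, N_modes)."""
    diff = np.asarray(c_est, dtype=float) - np.asarray(c_true, dtype=float)
    return np.sqrt(np.mean(diff**2, axis=0))

# Two test samples, two modes. A model that always predicts zero
# leaves an RMSE equal to the RMS of the true coefficients of each mode.
c_true = np.array([[1.0, 2.0], [-1.0, 0.0]])
c_est = np.zeros_like(c_true)
rmse = rmse_per_mode(c_est, c_true)
print(rmse)
```

The toy case is the situation expected for modes affected by the sign ambiguity: predicting zero yields an RMSE at the level of the input aberration, that is, no correction.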
In Fig. 2, we compare the performance per mode between four cases for two different aberration contents. A network using only in-focus PSFs in the non-polarized case with the VVC yields no correction for even Zernike modes, because the model tends to predict zero for the coefficients facing the ambiguity (due to the l_{2}-norm training loss). For odd modes, the model is able to provide some correction, even though its quality is limited by the loss function, which does not discriminate between even and odd modes. Adding defocused PSFs as input solves the problem as expected (Quesnel et al. 2020). In the dual-polarization case, a network using either one or both circular polarization states separately as input (SVC and VVC, respectively) also yields good performance for even modes as well as for odd modes. This indicates that the sign ambiguity is properly lifted with these two approaches.
It is noteworthy that the performance marginally depends on the Zernike mode: the error tends to increase for larger azimuthal orders at a given radial order. Our interpretation is that since the phase information is of higher spatial frequency and located closer to the edge of the pupil in these cases, it is more difficult for the CNN model to identify those features.
4.2 Performance compared to classical phase diversity
We now compare our method to the classical phase diversity approach in terms of overall phase retrieval performance. The RMS WFE on the phase residuals is used as a metric and it is defined for each test sample as:
$$ \mathrm{RMS\ WFE} = \sqrt{\frac{1}{N_{pix}} \sum_{i=1}^{N_{pix}} \left(\hat{\phi}_{i} - \phi_{i}\right)^{2}} \qquad (4) $$
where N_{pix} is the number of pixels, while \hat{\phi} and ϕ are the estimated and true pupil phases, respectively.
In our simulations, we consider the fact that the vortex coronagraphs block out most of the starlight, and that for a given stellar magnitude, the resulting flux in the detector plane is reduced. The flux is also equally split between each PSF for all the cases with two channels, while for the configurations with a single one, the PSF receives the total remaining flux behind the vortex mask. The performance of the trained models at different S/N levels defined in the entrance pupil plane is shown in Fig. 3. In our case, S/Ns between 10^{1} and 3 × 10^{3} correspond to stars of apparent magnitudes in the range from 18.6 to 6.2^{2}. For a median input WFE of 70 nm with 18 modes (Fig. 3, left), the simulated performance is almost identical for the classical, SVC and VVC dual-polarization approaches, even though the additional defocus increases the overall S/N at the focal plane for the classical method. For a median input WFE of 350 nm with 88 modes (Fig. 3, right), the phase residuals are distinctly higher for all the configurations, and a plateau is reached for S/Ns above 1000. We can especially notice that the sole PSF behind the SVC somewhat limits the performance in this case. Our main hypothesis for this discrepancy is that, in a high aberration regime, the effects of the nonlinear nature of the problem are greater. The extra information given by having two input channels is therefore favorable and makes the models easier to train. In general, it is more difficult to train on datasets containing strong aberrations, and this can typically be improved by using more data (e.g., 5 × 10^{5} samples, see Orban de Xivry et al. 2021), more complex architectures (e.g., EfficientNet-B6), and/or stronger weight decay.
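The correspondence between the S/N range and the magnitude range quoted above can be verified with a short calculation. The sketch assumes pure photon noise, so that S/N = sqrt(N_ph), and a fixed exposure time, collecting area, and bandpass, so that N_ph scales linearly with the stellar flux; the function name is hypothetical.

```python
import math

def magnitude_from_snr(snr, snr_ref, mag_ref):
    """Apparent magnitude giving `snr`, if `snr_ref` is reached at `mag_ref`."""
    # Photon noise: S/N = sqrt(N_ph), so the flux ratio is (snr / snr_ref)**2
    # and m - m_ref = -2.5 * log10(flux ratio) = -5 * log10(snr / snr_ref).
    return mag_ref - 5.0 * math.log10(snr / snr_ref)

# Taking S/N = 3e3 at magnitude 6.2 as the reference point.
faint_mag = magnitude_from_snr(10.0, 3e3, 6.2)
print(round(faint_mag, 1))
```

A factor of 300 in S/N corresponds to 5 log10(300) ≈ 12.4 magnitudes, which matches the difference between the two quoted magnitudes.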
We also consider the possible presence of planetary companions in the detected images. This additional, off-axis source of light is largely unaffected by the vortex phase ramp and therefore adds the same signature in both circular polarization states. This additional light source may bias the phase retrieval process, and lead to unwanted planetary signal subtraction. A possible workaround is to subtract one polarization image from the other, in an attempt to remove the signature of any off-axis light source. We thus assessed the phase retrieval capabilities using the difference between both polarization states after the VVC. The results are shown in Fig. 3 and are compared with the other configurations. We only obtain a marginal increase in the error at high S/Ns, especially in the low aberration regime, which can be explained by the loss of information produced by subtracting one PSF from the other.
The performance of the various configurations is finally compared to the theoretical limit in Fig. 3. This limit is discussed in Orban de Xivry et al. (2021) for non-coronagraphic imaging. For both the non-coronagraphic and vortex imaging cases, the residual errors reach the fundamental limit in the low aberration regime (Fig. 3, left). In a higher aberration regime, the performance does not reach the fundamental limit, and the gap increases toward higher S/Ns (Fig. 3, right). This can be improved with more robust training as explained above. One can note that the residual errors are constrained by the WFE distribution in the data toward lower S/Ns, while the theoretical limit is independent of the input WFE distribution and continues to increase for lower S/Ns, thus yielding residual WFE below the limit.
Fig. 3 Performance in terms of RMS WFE on the phase residuals at different S/N levels. Each point corresponds to a model trained and evaluated on the indicated S/N (six S/Ns are considered, and slight horizontal shifts were applied to be able to discern each point). The same colors as in Fig. 2 are used, with the addition of the performance with classical imaging (green), using differential post-VVC PSFs (violet), as well as the theoretical limit (black dashed line). The median values are represented and the error bars correspond to the 2–98th percentiles. The S/Ns indicated are the ones at the entrance pupil plane, and the flux suppression induced by the vortex mask is taken into account. Left: input WFE of 70 nm distributed over 18 modes. Right: input WFE of 350 nm distributed over 88 modes. 
4.3 Model robustness
To test how the method handles more realistic ground-based observations, we added atmospheric turbulence residuals in addition to the NCPAs, as described in Sect. 3.1. This represents an additional source of noise since the AO residuals are not included in the labels for training. Examples of input PSFs at the different flux levels can be found in Fig. 5. The performance now starts to reach a plateau of a few nm RMS in the low aberration regime at high S/Ns (Fig. 4, left), due to the presence of these atmospheric turbulence residuals. In the high NCPA regime (Fig. 4, right), the AO residuals however become negligible and the performance is almost identical to the case without turbulence (Fig. 3, right).
We finally study the robustness of the models regarding a variation in the data during evaluation. First, we may encounter different flux levels than those considered during training. In Fig. 6, we illustrate how models in the VVC dual-polar configuration trained on data containing 70 nm RMS behave in such conditions. Whether the training S/N is low or high, models only show good robustness to other flux levels within a limited range, outside of which the performance is strongly degraded. If a more robust model is required, it is also possible to train with various flux levels. We investigated this by using a training dataset covering the entire test S/N range, without increasing its size. The median performance is much more consistent at every S/N, although the variation in the residual error between samples is greater, and a small degradation can naturally be seen compared to using identical training and testing S/N (as shown in Fig. 4).
We also study the change in performance when evaluating the model outside the input WFE training range. Figure 7 shows the robustness of models trained on the two aberration regimes studied in this paper. Data containing more aberrations rapidly deteriorate the reconstruction. The models perform better when evaluated at lower aberration levels, but they have limited performance when trained in the high aberration regime. To overcome these limitations, we trained two models over the entire test WFE range for each of the Zernike mode contents considered in the paper. Such models show excellent robustness, with minimal degradation compared to models with identical training and testing WFE distributions. This suggests that these models could be robustly used in closed-loop operations, even with the aberration level decreasing with time. Regarding the varying spatial power spectral density of the wavefront, the residuals are generally constant along the Zernike modes, as seen in Fig. 2. When giving the reconstructed PSFs as input to the same trained model, we have observed that most residual RMS WFEs stay below 10 nm for a model trained on 70 nm RMS as input and an S/N of 1000. A thorough analysis of a closed-loop application will be the subject of future work when testing the algorithm in the lab or on-sky.
Observations can also be expected to contain higher-order NCPAs (in addition to the changing atmospheric residuals) than those considered during training. For a model trained on 18 modes at 70 nm RMS (S/N = 1000), we added 70 higher-order Zernike modes in the test data. In Fig. 8, we observe only a moderate degradation for the 18 modes when increasing the wavefront error contained in these additional modes, because the central PSF signature is mostly preserved.
Fig. 4 Phase prediction errors at different S/N levels, presented the same way as in Fig. 3, but this time also including atmospheric turbulence residuals in the PSFs during both training and testing. 
Fig. 5 Examples of PSFs at different S/N levels (defined in the entrance pupil plane) for +l_{p}. The resulting S/N in the detector plane is reduced due to the extinction factor introduced by the coronagraph and by the beam splitting between the two polarization channels. The level of NCPA is equal to 70 nm RMS distributed over 18 modes (top) and 350 nm RMS over 88 modes (bottom). AO residuals are also present: each PSF is the result of combining ten PSFs, with each containing a different AO residual phase screen. 
Fig. 6 Performance with altered S/N levels during evaluation for three models trained on data with a median RMS WFE of 70 nm over 18 modes, with an S/N of 30 (purple), 1000 (blue), and with S/Ns uniformly distributed over the entire S/N range (green). Each point is obtained from a test batch composed of 1000 samples (the median value together with the 2–98th percentiles are shown). 
Fig. 7 Performance with different input WFE levels defined during evaluation for models trained on data with a median RMS WFE of 70 nm over 18 modes (blue), and 350 nm RMS over 88 modes (red). Models were also trained on data following a uniform distribution covering the whole input WFE range, using both spatial frequency regimes (cyan and orange). The S/N is 1000 and each training dataset contains 10^{5} samples. 
5 Conclusions
In this paper, we have investigated a new way to perform focalplane wavefront sensing using vortex coronagraphs. Based on a deep learning approach and considering simulated data, we have leveraged the modulation introduced by the vortex coronagraph (either scalar, or vectorial after splitting the circular polarization states) to lift the sign ambiguity and perform FPWFS for various S/Ns, input WFEs, and spatial frequency contents. The dualpolarization method with the VVC offers a very similar performance to the classical phase diversity method using additional defocused PSFs, even though the level of light is largely reduced after filtering by the VVC. For instance, considering a star of magnitude 6.2 observed at a wavelength of 2200 nm, we obtain a residual of 0.73 nm RMS from an input WFE of 70 nm RMS. In the case of the SVC, which provides a single focalplane image, a loss in performance is only observed for high aberration levels. For bright stars, and with higher order and higher levels of aberrations, the CNN training is generally challenging, and the performance reaches a plateau of approximately 20 nm RMS. In such circumstances, more training data, larger and deeper CNN architectures, and regularization techniques could further improve the phase retrieval accuracy. Atmospheric turbulence residuals that are expected in groundbased data only produce minor degradation in performance in a low NCPA regime, and they should not be a concern in practice. We have also shown that models trained on data containing particularly wide WFE and S/N distributions provide very good robustness.
Potential applications of the proposed method could rely on including a polarizing beam splitter downstream of the VVC to collect both circular polarization states separately, either on a single or on two distinct sensors. Since our simulationbased FPWFS experiments work well even with a single image obtained behind an SVC, it appears that this flavor of vortex coronagraph offers an interesting alternative, notably because it would work without any additional optical components.
Deep learning models offer a flexible framework and fast inference speeds, which are appreciable features for onsky applications. The requirement on speed is, however, not very stringent as we expect the lifetime of NCPAs that produce quasistatic speckles to be on the order of minutes. But onsky applications will naturally come with their own challenges and discrepancies unpredicted by simulations. To account for the difference between simulations and real data, transfer learning techniques can be used to efficiently finetune the models before observations. Finally, it is difficult to obtain reliable and very precise NCPA labels for model training. Employing unsupervised learning techniques, for example autoencoderbased architectures, is another interesting approach that we are considering for future developments.
Fig. 8 Robustness on higher-order aberrations. Top: performance per Zernike mode on test data following the distribution of the training data (red), adding, to the test data, 70 modes containing 35 nm RMS (purple) and 70 nm RMS (blue) of NCPAs. Bottom: example of post-VVC PSFs for each case (+l_{p}).
Acknowledgements
This research made use of PyTorch (Paszke et al. 2019) and the following implementation of EfficientNet: https://github.com/lukemelas/EfficientNet-PyTorch. The HEEPS (Carlomagno et al. 2020) and PROPER (Krist 2007) open-source optical propagation Python packages were used for data generation. This project has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement no. 819155), and from the Wallonia-Brussels Federation (grant for Concerted Research Actions).
References
Absil, O., Mawet, D., Karlsson, M., et al. 2016, in SPIE Conf. Ser., 9908, 99080Q
Andersen, T., Owner-Petersen, M., & Enmark, A. 2019, Opt. Lett., 44, 4618
Andersen, T., Owner-Petersen, M., & Enmark, A. 2020, J. Astron. Telescopes Instrum. Syst., 6, 034002
Bos, S. P., Doelman, D. S., Lozi, J., et al. 2019, A&A, 632, A48
Carlomagno, B., Delacroix, C., Absil, O., et al. 2020, J. Astron. Telescopes Instrum. Syst., 6, 035005
Desai, N., Llop-Sayson, J., Jovanovic, N., et al. 2021, in Techniques and Instrumentation for Detection of Exoplanets X, eds. S. B. Shaklan, & G. J. Ruane, 11823, International Society for Optics and Photonics (SPIE), 238
Dohlen, K., Wildi, F. P., Puget, P., Mouillet, D., & Beuzit, J.-L. 2011, in Second International Conference on Adaptive Optics for Extremely Large Telescopes, online at http://ao4elt2.lesia.obspm.fr, 75
Ferreira, F., Gratadour, D., Sevin, A., & Doucet, N. 2018, 2018 International Conference on High Performance Computing & Simulation (HPCS), 180
Fienup, J. 1982, Appl. Opt., 21, 2758
Gerchberg, R. W. 1972, Optik, 35, 237
Gonsalves, R. A. 1982, Opt. Eng., 21, 829
Guizar-Sicairos, M., & Fienup, J. R. 2012, J. Opt. Soc. Am. A, 29, 2367
Guyon, O. 2018, ARA&A, 56, 315
Jovanovic, N., Absil, O., Baudoz, P., et al. 2018, SPIE Conf. Ser., 10703, 107031U
Kingma, D. P., & Ba, J. 2017, Adam: A Method for Stochastic Optimization
Krist, J. E. 2007, in Optical Modeling and Performance Predictions III, ed. M. A. Kahan, 6675, International Society for Optics and Photonics (SPIE), 250
Lamb, M. P., Correia, C., Sivanandam, S., Swanson, R., & Zavyalova, P. 2021, MNRAS, 505, 3347
Martinache, F. 2013, PASP, 125, 422
Mawet, D., Riaud, P., Absil, O., & Surdej, J. 2005, ApJ, 633, 1191
Mawet, D., Serabyn, E., Liewer, K., et al. 2009, ApJ, 709
Orban de Xivry, G., Quesnel, M., Vanberg, P. O., Absil, O., & Louppe, G. 2021, MNRAS, 505, 5702
Paine, S. W., & Fienup, J. R. 2018, Opt. Lett., 43, 1235
Paszke, A., Gross, S., Massa, F., et al. 2019, in Advances in Neural Information Processing Systems, eds. H. Wallach, H. Larochelle, A. Beygelzimer, et al., 32 (Curran Associates, Inc.), 8026
Quesnel, M., Orban de Xivry, G., Louppe, G., & Absil, O. 2020, SPIE Conf. Ser., 11448, 114481G
Riaud, P., Mawet, D., & Magette, A. 2012a, A&A, 545, A151
Riaud, P., Mawet, D., & Magette, A. 2012b, A&A, 545, A150
Ruane, G., Mawet, D., Riggs, A. E., & Serabyn, E. 2019, in Techniques and Instrumentation for Detection of Exoplanets IX, ed. S. B. Shaklan, 11117, International Society for Optics and Photonics (SPIE), 454
Tan, M., & Le, Q. 2019, in Proceedings of Machine Learning Research, 97, Proceedings of the 36th International Conference on Machine Learning, eds. K. Chaudhuri, & R. Salakhutdinov (PMLR), 6105