Variation on a Zernike wavefront sensor theme: optimal use of photons

The Zernike wavefront sensor (ZWFS) is a concept belonging to the wide class of Fourier-filtering wavefront sensors (FFWFS). The ZWFS is known for its extremely high sensitivity and its low dynamic range, which makes it a unique sensor for second-stage adaptive optics (AO) systems or for the calibration of quasi-static aberrations. This sensor is composed of a focal plane mask made of a phase shifting dot fully described by two parameters: its diameter and depth. In this letter, we aim to improve the performance of this sensor by changing the diameter of its phase shifting dot. We begin with a general theoretical framework providing an analytical description of the FFWFS properties, then we predict the expected ZWFS sensitivity for different configurations of dot diameters and depths. The analytical predictions are then validated with end-to-end simulations. From this, we propose a variation of the classical ZWFS shape which exhibits extremely appealing properties. We show that the ZWFS sensitivity can be optimized by modifying the dot diameter and can even reach the optimal theoretical limit, at the cost of a reduced sensitivity to low spatial frequencies. As an example, we show that a ZWFS with a 2 λ/D dot diameter (where λ is the sensing wavelength and D the telescope diameter), hereafter called Z2WFS, exhibits a sensitivity twice as high as that of the classical 1.06 λ/D ZWFS for all the phase spatial components except for tip-tilt modes. Furthermore, this gain in sensitivity does not impact the dynamic range of the sensor, and the Z2WFS exhibits a dynamic range similar to that of the classical 1.06 λ/D ZWFS. This study opens the path to the design of diameter-optimized ZWFS.


Introduction
The role of a wavefront sensor (WFS) is to encode the phase information at the entrance of an optical system into intensities on a detector. For ground-based astronomy, WFS are mostly used for Active or Adaptive Optics (AO), in conjunction with a wavefront control strategy, in order to compensate for optical aberrations induced by the atmosphere or the telescope itself. In the context of astronomy, one of the main drivers for a WFS design is its sensitivity or, in other words, its ability to provide an accurate measurement in the presence of noise. Sensitivity is therefore a useful metric to assess WFS performance in terms of photon noise, which is related to key quantities in the AO field: loop speed and sky coverage. Existing WFS can be separated into two main categories: focal plane WFS, for which the measurements are done in a focal plane (like the Shack-Hartmann wavefront sensor), and pupil plane WFS, for which the measurements are done in a pupil plane. Among this latter category, the Fourier Filtering WFS (FFWFS) represents a wide class of sensors of particular interest thanks to their superior sensitivity. From a general point of view, a FFWFS consists of a phase mask located in an intermediate focal plane which performs an optical Fourier filtering. As such, the Zernike phase mask (Zernike & Stratton 1934; Bloemhof & Wallace 2003; Dohlen et al. 2006; Wallace et al. 2011) forms a FFWFS, hereafter called Zernike WFS (ZWFS). In this case, the filtering element is a phase shifting dot which is, for a given substrate, fully described by two parameters: its diameter and its depth (or phase shift). In a classical implementation, the ZWFS phase dot has a diameter of 1.06 λ/D (where λ is the sensing wavelength and D the telescope diameter) and a phase shift of π/2. This ZWFS is known to be one of the most sensitive WFS (Guyon 2005).
Its drawback is its limited dynamic range; it has therefore mostly been implemented as a second-stage WFS, or as a quasi-static aberration calibration sensor, as on VLT/SPHERE (N'Diaye et al. 2016; Vigan et al. 2019). In this paper, we show that the classical implementation of the ZWFS with a phase dot diameter of 1.06 λ/D is actually not optimal: by using a larger dot diameter, the sensitivity of the ZWFS can be significantly improved, at the expense of the sensitivity to lower spatial frequencies, and can even reach a performance close to the theoretical limit. To this end, Section 2 starts from a theoretical study of the ZWFS based on a general convolutional formalism for FFWFS (Fauvarque et al. 2016). This analytical work shows that the sensitivity of the ZWFS can be improved by increasing its dot diameter. We then confirm the theoretical results with end-to-end simulations in Section 3, and we show that a gain in sensitivity by a factor of two can be reached without impacting the dynamic range of the sensor. Conclusions are given in Section 4.

Definition of a FFWFS sensitivity
Following the formalism introduced by Fauvarque et al. (2016), the raw intensities recorded by a FFWFS are processed with a return-to-reference operation. It simply consists in subtracting from the recorded intensity map I(φ) the one corresponding to the reference phase, I_0 (usually a flat wavefront). We also assume that all the intensities are normalized by the number of photons. The resulting quantity is called the tared intensities:

$$\Delta I(\phi) = I(\phi) - I_0 . \quad (1)$$

One of the most important performance criteria for a sensor is its behaviour in terms of noise propagation. This criterion is encoded in a quantity called sensitivity, which depends on the energy in the columns of the Interaction Matrix (IM; Rigaut & Gendron 1992). For a given WFS, each column of the IM is built as the linear response of the sensor to a given mode φ_i, which is usually obtained experimentally through a "push-pull" method:

$$\delta I(\phi_i) = \frac{I(\epsilon\phi_i) - I(-\epsilon\phi_i)}{2\epsilon} , \quad (2)$$

where ε is the amplitude of the mode. The sensitivity s for a given mode φ_i is then defined through the Euclidean norm of this linear response. For a given uniform noise distribution σ_n, the noise propagation coefficient σ_WFS for a mode φ_i is in turn directly related to the sensitivity s(φ_i).

For ground-based astronomy, where WFS are usually implemented within an AO loop, the sensitivity is a critical metric as it describes how the system performs in the presence of noise. Optimising the WFS sensitivity has always been one of the main motivations in the conception of new WFS.
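As an illustration of the push-pull measurement described above, the following minimal sketch estimates the linear response of a sensor to a mode and takes its Euclidean norm. The names `push_pull_response`, `sensitivity` and `toy_sensor` are hypothetical, the toy sensor is not a physical WFS model, and dividing by the norm of the mode is a normalisation assumed here for convenience rather than the convention of this letter.

```python
import numpy as np

def push_pull_response(sensor, mode, eps=1e-3):
    """Linear response of `sensor` to `mode`, estimated by a push-pull measurement."""
    return (sensor(eps * mode) - sensor(-eps * mode)) / (2.0 * eps)

def sensitivity(sensor, mode, eps=1e-3):
    """Euclidean norm of the push-pull response, divided by the norm of the mode.

    The division by ||mode|| is an assumption of this sketch."""
    response = push_pull_response(sensor, mode, eps)
    return np.linalg.norm(response) / np.linalg.norm(mode)

def toy_sensor(phase, gain=0.5):
    """Hypothetical sensor: reference intensity + linear term + quadratic term."""
    return 1.0 + gain * phase + 3.0 * phase**2

# Fourier mode with 6 cycles across a 1D aperture of unit length.
x = np.linspace(0.0, 1.0, 256, endpoint=False)
mode = np.cos(2 * np.pi * 6 * x)

s = sensitivity(toy_sensor, mode)
print(f"sensitivity of the toy sensor: {s:.3f}")
```

Because the push and the pull are subtracted, the reference intensity and the quadratic term cancel, so only the linear gain of the toy sensor contributes to the result.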
Finally, note that we choose to visualize the FFWFS sensitivity as a two-dimensional map over the spatial frequencies of the wavefront. This consists in calculating the sensitivity with respect to the Fourier modes φ_i (close to what was done in Jensen-Clem et al. (2012)), which are simply defined by the sum of a cosine and a sine carrying a given spatial frequency f. The following quantity then encodes the sensitivity:

A convolutional approach to compute FFWFS sensitivity
The FFWFS sensitivity can be computed based on a convolutional model, as described in Fauvarque et al. (2019). This model assumes that the sensor can be fully characterized by an impulse response IR that links the entrance phase to the measured tared intensities:

$$\Delta I(\phi) = \mathrm{IR} \star \phi , \quad (6)$$

where ⋆ stands for the classical convolution product. A convenient aspect of the convolutional approach is that one can compute the transfer function TF of a FFWFS. FFWFS can be described by two parameters: their phase mask m, and their weighting function ω, which describes the energy distribution in the focal plane during one acquisition time of the sensor. We specify that this function is normalized to 1 in order to ensure energy conservation. Assuming that ω is a real function and that ω and m are both centro-symmetric, which is the case for most of the known FFWFS, TF is expressed through the following simple formula:

$$\mathrm{TF} = 2\,\mathrm{Im}\left(\bar{m} \star m\omega\right) , \quad (7)$$

where Im is the imaginary part and the bar is the complex conjugate operator. From the knowledge of a FFWFS transfer function, it is then possible to compute the sensitivity with respect to spatial frequencies thanks to the formula given in Fauvarque et al. (2019) (Equation 8), which links the sensitivity to the TF through a convolution product with the Point Spread Function (PSF) of the system. The energy of the PSF corresponds to the incoming flux and is normalized to 1.

At this point, it is important to note that the sensitivity is bounded. Since the mask transmission |m| cannot be greater than one (|m| ≤ 1), Equation 7 implies that TF|_f ≤ 2. Hence, given Equation 8, we conclude that, in the frame of our normalizations, the sensitivity cannot be greater than 2:

$$\forall f,\quad s(f) \le 2 . \quad (9)$$

This is an important result as it defines the theoretical limit for a FFWFS sensitivity.

Application to the ZWFS
The convolutional formalism introduced in the previous section is now applied to the ZWFS in order to find a simple formula for its sensitivity as a function of the mask parameters. As previously mentioned, the ZWFS mask is defined by two free parameters:
• The depth (phase shift) of the dot δ. For the classical ZWFS: δ = π/2,
• Its diameter p. For the classical ZWFS: p = 1.06 λ/D.
As described in N'Diaye et al. (2013), the dot diameter value was chosen in order to get an equivalent flux inside and outside the focal plane dot. This configuration with p = 1.06 λ/D also yields a uniform reference intensity distribution. The purpose of this section is to demonstrate that this choice of p = 1.06 λ/D is actually not optimal, and that the sensitivity can be improved with a larger dot.

Article number, page 2 of 8 V. Chambouleyron et al.: Variation on a Zernike wavefront sensor theme: optimal use of photons
For the sake of clarity, we carry out this study in one dimension. As such, the spatial frequency vector f becomes the scalar frequency f. We further assume that the weighting function ω, which corresponds to the PSF, can be described as a top-hat function of diameter a. This simplified geometry is summarized in Figure 1-(a). From this simplified geometry, we then compute TF|_f and plot this quantity in Figure 1-(b). Note that the full derivation can be found in Appendix B.

Fig. 1. (a): Simplified 1D framework for convolutional derivations: in black, the dot, with a diameter p and a phase shift δ. In green, the PSF, approximated by a normalized top-hat function with a diameter a. (b): Transfer function of the ZWFS for two cases: in red, the dot is smaller than the PSF; in blue, the dot is larger than the PSF. The optimal case appears for p = a and not p = a/2 as is done for the classical ZWFS.
From Figure 1-(b), one can distinguish two cases:
• p ≥ a, i.e. the dot diameter is larger than the PSF characteristic size. Frequencies above (p + a)/2 reach the sensitivity 2 sin(δ). Frequencies below (p − a)/2 are at 0.
• p < a, i.e. the dot diameter is smaller than the PSF size. Frequencies above p have a value of 2p/a × sin(δ), which is smaller than the theoretical limit of 2.
From this simplified model, one can first conclude that a phase shift of δ = π/2 maximizes the sensitivity, as expected. More surprisingly, it shows that for this phase shift, a dot diameter of p = a provides the optimal sensitivity value for almost all the modes. At this point, it is important to remember that the value of TF|_f directly sets the sensor sensitivity through Equation 8. It is therefore possible to design a ZWFS that reaches the theoretical sensitivity limit. As a comparison, the classical ZWFS configuration (N'Diaye et al. 2013) uses p = a/2, which leads to a sub-optimal TF value of 2p/a = 1 for frequencies above p.
Although this simplified study uses strong assumptions, it shows that the ZWFS can be further optimized compared to its classical form. In the next section, we demonstrate that these simplified results are actually accurate and enable us to build the most sensitive sensor ever proposed.
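The plateau values quoted above can be checked numerically. The sketch below assumes that, for a pure phase mask m = exp(iΔ), the 1D transfer function takes the form TF(f) = 2 ∫ sin(Δ(u) − Δ(f − u)) ω(u) du with a top-hat ω of diameter a; this form is an assumption consistent with the cases listed in this section, and the helper names `dot_phase` and `zwfs_tf_1d` are hypothetical.

```python
import numpy as np

def dot_phase(x, p, delta):
    """Phase of the 1D Zernike mask: delta inside the dot of diameter p, 0 outside."""
    return np.where(np.abs(x) <= p / 2, delta, 0.0)

def zwfs_tf_1d(f, p, a, delta=np.pi / 2, n=20001):
    """Assumed 1D transfer function: 2 * integral of sin(D(u) - D(f - u)) * w(u) du."""
    u = np.linspace(-a / 2, a / 2, n)          # support of the top-hat weighting function
    w = 1.0 / a                                # top-hat normalised to unit integral
    integrand = np.sin(dot_phase(u, p, delta) - dot_phase(f - u, p, delta)) * w
    # Trapezoidal rule written out explicitly.
    return 2.0 * np.sum((integrand[1:] + integrand[:-1]) * np.diff(u)) / 2.0

a = 1.0
tf_optimal = zwfs_tf_1d(f=1.5, p=a, a=a)        # dot as large as the PSF
tf_classical = zwfs_tf_1d(f=1.5, p=a / 2, a=a)  # classical choice p = a/2
tf_low_freq = zwfs_tf_1d(f=0.5, p=3.0, a=a)     # frequency below (p - a)/2

print(f"p = a   : TF = {tf_optimal:.3f}")
print(f"p = a/2 : TF = {tf_classical:.3f}")
print(f"p = 3a, low frequency: TF = {tf_low_freq:.3f}")
```

Under these assumptions, the three values illustrate the plateau at 2 sin(δ) for p = a, the reduced plateau 2p/a × sin(δ) for p = a/2, and the vanishing response for frequencies lying well inside a large dot.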

Towards an optimal ZWFS
Following the results from the convolutional approach, the goal of this section is to make use of numerical simulations to confirm the sensitivity of the ZWFS with respect to its dot diameter, and eventually to propose an optimal configuration. For that, we consider different configurations for a dot diameter ranging from 0 to 5 λ/D. The phase shift is set to δ = π/2 for the rest of this letter.
Impact of the dot diameter on the ZWFS sensitivity
As a first step, we want to illustrate the impact of the dot diameter on the sensitivity for a spatial frequency located outside of the dot (horizontal part of the curves in Figure 1-(b)). For that purpose, we arbitrarily choose a spatial frequency of 6 cycles over the pupil (left inset of Figure 2), which is far enough from the largest dot considered (a diameter of 5 λ/D, i.e. a radius of 2.5 λ/D). This configuration is illustrated in the right inset of Figure 2. The sensitivity results for this spatial frequency are shown in Figure 3. As predicted by the convolutional approach, Figure 3 shows that the sensitivity for a high spatial frequency increases with the dot diameter. This behaviour has been discussed briefly in previous literature (Ruane et al. 2020) without further analysis. It is explained here thanks to the convolutional model. It is also interesting to notice that the sensitivity growth closely follows the PSF encircled energy within the dot diameter, confirming the analytical results presented in Figure 1. For a classical ZWFS, with a dot diameter of 1.06 λ/D, the sensitivity is actually far from being optimal.

Fig. 3. Sensitivity evolution for a frequency outside of the dot, while increasing the dot diameter. The sensitivity is in strong accordance with the proportion of the PSF energy located inside the dot, as predicted by the convolutional approach.
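A minimal end-to-end sketch of this experiment is given below. It is not the simulation code used for the figures: the grid sizes `N_PUPIL` and `PAD`, the function names and the sensitivity normalisation (norm of the push-pull response divided by the norm of the mode, with a unit-amplitude pupil) are all assumptions. It propagates a 6 cycles per pupil cosine through a π/2 Zernike mask with two dot diameters and compares the resulting sensitivity with the PSF energy enclosed in the dot.

```python
import numpy as np

N_PUPIL = 64   # pupil diameter in pixels (assumed sampling)
PAD = 8        # zero-padding factor: 1 lambda/D spans PAD focal-plane pixels

def make_pupil(n=N_PUPIL):
    """Return the x coordinate (in units of D) and a full circular aperture."""
    c = (np.arange(n) - (n - 1) / 2) / n
    x, y = np.meshgrid(c, c)
    return x, (np.hypot(x, y) <= 0.5).astype(float)

def dot_region(dot_diameter, n=N_PUPIL, pad=PAD):
    """Boolean focal-plane map of the dot (diameter in lambda/D), in FFT ordering."""
    k = np.fft.fftfreq(n * pad) * n            # focal-plane coordinate in lambda/D
    kx, ky = np.meshgrid(k, k)
    return np.hypot(kx, ky) <= dot_diameter / 2

def zwfs_intensity(phase, pupil, dot, depth=np.pi / 2, pad=PAD):
    """Pupil-plane intensity after filtering by a Zernike phase dot."""
    n = pupil.shape[0]
    field = np.zeros((n * pad, n * pad), dtype=complex)
    field[:n, :n] = pupil * np.exp(1j * phase)
    mask = np.where(dot, np.exp(1j * depth), 1.0)
    return np.abs(np.fft.ifft2(np.fft.fft2(field) * mask)) ** 2

def sensitivity(mode, pupil, dot, eps=1e-2):
    """Norm of the push-pull response divided by the norm of the mode (assumed)."""
    push = zwfs_intensity(eps * mode, pupil, dot)
    pull = zwfs_intensity(-eps * mode, pupil, dot)
    return np.linalg.norm((push - pull) / (2 * eps)) / np.linalg.norm(mode)

def encircled_energy(pupil, dot, pad=PAD):
    """Fraction of the PSF energy falling inside the dot."""
    n = pupil.shape[0]
    field = np.zeros((n * pad, n * pad))
    field[:n, :n] = pupil
    psf = np.abs(np.fft.fft2(field)) ** 2
    return psf[dot].sum() / psf.sum()

x, pupil = make_pupil()
mode = np.cos(2 * np.pi * 6 * x) * pupil       # 6 cycles across the pupil

results = {}
for p in (1.06, 2.0):
    dot = dot_region(p)
    results[p] = (sensitivity(mode, pupil, dot), encircled_energy(pupil, dot))
    print(f"p = {p} lambda/D: sensitivity = {results[p][0]:.2f}, "
          f"encircled energy = {results[p][1]:.2f}")
```

With this normalisation the sensitivity stays below the bound of 2, and the larger dot yields the higher value for this high spatial frequency, in line with the trend described for Figure 3.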
However, one cannot simply increase the dot diameter indefinitely, because the sensitivity to frequencies lying inside the dot would drop to zero (Figure 1). We illustrate this effect in Figure 4, where we plot the sensitivity curves over a wide range of spatial frequencies for the three dot diameters p = 1.06, 2 and 5 λ/D. For p = 5 λ/D, the sensitivity to high spatial frequencies (those larger than 5 cycles per pupil) almost reaches the theoretical limit of 2; however, the sensitivity becomes close to 0 for low spatial frequencies (those smaller than 3 cycles per pupil). There is therefore a trade-off between enhanced sensitivity and unseen modes. In the following, we propose to choose the configuration with p = 2 λ/D, and we call this specific configuration the Z2WFS.

As a remark, we also emphasize that the reference intensities, i.e. the intensity distribution for a flat wavefront, change with the value of p. This is illustrated in Figure 5 for the same three values of p. The classical ZWFS shows a flat reference illumination, while that of the Z2WFS is less uniform. This spatial distribution could raise practical issues in terms of detector dynamics or for complex pupil shapes, for instance with a central obscuration or spiders. These potential practical implementation issues are beyond the scope of this paper. In this letter, we only assume a full aperture pupil with monochromatic light for the sake of clarity. Note that this is not a fundamental limitation: the formalism and results developed here remain valid with a central obscuration in the pupil.

Comparison with other FFWFS
In this section, we compare the Z2WFS with other well-known FFWFS: the classical ZWFS with p = 1.06 λ/D, the non-modulated PyWFS (Ragazzoni 1996), the modulated PyWFS (here with a modulation radius of 3 λ/D) and a flattened pyramid (FPyWFS), proposed by Fauvarque et al. (2015), with a pupil overlapping rate of 75%. The spatial frequency basis, i.e. the Fourier basis, is chosen for this comparison. Results are given in Figure 6. First, we retrieve well-known results, for instance the gain of around a factor 2 in sensitivity between the classical ZWFS and the PyWFS. We can also highlight the behaviour of the FPyWFS, which shows an oscillating sensitivity with peaks for some specific frequencies, as described in Fauvarque et al. (2015). (The explanation of the PyWFS class behaviour through the convolutional approach is also given in Appendix C.) The Z2WFS is clearly the most sensitive sensor, except for extremely low frequencies. As expected from Figure 3, its sensitivity is twice that of the classical ZWFS for almost all modes, and it is four times more sensitive than the non-modulated PyWFS. The behaviour at low spatial frequencies deserves some further analysis: we plot in Figure 7 the sensitivity with respect to the tip-tilt and focus modes (which are the lowest frequency Zernike modes) for the ZWFS class with a dot diameter ranging from 0 to 5 λ/D. For the tip-tilt modes, the Z2WFS is half as sensitive as the classical ZWFS, but the Z2WFS provides better results for the focus. Even if the Z2WFS is less sensitive to tip-tilt than the classical ZWFS, it is important to note that it remains as sensitive as the non-modulated PyWFS, whose sensitivity is around 0.4. As a remark, it is interesting to see that the sensitivity curve for the tip-tilt follows the PSF shape: when the edge of the dot lies on a dark area of the PSF, the sensitivity drops to 0.
By taking a Z1.5WFS (p = 1.5 λ/D), one could have a better sensitivity for all the frequencies compared to the classical ZWFS, but a lower overall gain compared to the Z2WFS.
To conclude this section, we demonstrated that a Z2WFS significantly improves the sensitivity and approaches the ideal FFWFS behaviour. When compared to the classical ZWFS, the gain in sensitivity for all modes, except the tip-tilt, is a factor of 2. In the next section, we investigate whether this gain in sensitivity comes at a cost in dynamic range.

Fig. 7. Evolution of the low order Zernike mode sensitivities with respect to the dot diameter. We can see that the Z2WFS sensitivity is lower for the tip-tilt modes, but higher for all the other ones. The classical ZWFS is not even optimized for the tip-tilt modes.

Dynamic Range
To complete our study, we now compare the dynamic range of the Z2WFS with that of the classical ZWFS. A drastic loss of dynamic range while increasing the dot diameter could indeed prevent any practical use of the Z2WFS. To quantify the dynamic range with respect to a given mode φ_i, we evaluate its capture range C_{φ_i}. To calculate it, we look for the lowest amplitude value a_0 (in absolute value) such that: We then define the capture range as C_{φ_i} = 2a_0, where the factor 2 allows one to take into account negative and positive amplitudes in the capture range calculation. The capture range can be larger than the pure linearity regime. However, we decided to use this definition for two reasons. Firstly, it defines the amplitude below which we are ensured that a closed loop system will eventually converge. Indeed, even if the measurement is not linear anymore, there is still a one-to-one correspondence with the input signal. Secondly, the ZWFS measurements are often processed through non-linear reconstructors (N'Diaye et al. 2013; Steeves et al. 2020) that can be applied as they are to the Z2WFS or to other variations of the ZWFS.
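The following sketch illustrates one possible way to evaluate a capture range numerically. The criterion defining a_0 is not reproduced here; as an assumption, a_0 is taken as the first amplitude at which a scalar response curve stops increasing, i.e. where the one-to-one correspondence with the input amplitude is lost. The function `capture_range` and the sinusoidal toy response are hypothetical and do not model a ZWFS.

```python
import numpy as np

def capture_range(response, amplitudes):
    """Return C = 2 * a0, with a0 the first amplitude where `response` stops increasing.

    `response` is a callable returning a scalar measurement for an amplitude a >= 0.
    The monotonicity criterion is an assumption of this sketch."""
    values = np.array([response(a) for a in amplitudes])
    turning = np.nonzero(np.diff(values) <= 0)[0]
    # If the response never turns over on the sampled range, use its upper end.
    a0 = amplitudes[turning[0]] if turning.size else amplitudes[-1]
    return 2.0 * a0

amps = np.linspace(0.0, np.pi, 2001)
c_toy = capture_range(np.sin, amps)   # toy response, saturating at a = pi/2
print(f"capture range of the toy response: {c_toy:.3f} rad")
```

In practice, the response would be replaced by a measurement computed from the simulated sensor for a mode of amplitude a, and the amplitude sampling would set the resolution of the estimate.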
Capture range values for cosine phase modes at frequencies ranging from 0 to 5 cycles per pupil diameter are given in Figure 8 for the Zernike and Pyramid wavefront sensors. The PyWFS class has a better capture range over all spatial frequencies, consistent with the fact that sensitivity and dynamic range are competing properties. This graph also confirms the great benefit in terms of dynamic range provided by the modulation of the PyWFS. More importantly, we see that the Z2WFS exhibits the same capture range as the classical ZWFS for high frequencies, and an even higher one for the lowest frequencies, where the Z2WFS sensitivity goes below that of the classical ZWFS. The Z2WFS is therefore more sensitive than the classical ZWFS while exhibiting the same capture range.

Conclusion
In this paper, we provided a physical description of the sensitivity behaviour for the ZWFS class, and in particular we studied the sensitivity evolution for different dot diameters. We showed that it is possible to significantly improve the current sensitivity of the ZWFS, at the expense of the lower spatial frequencies, simply by increasing the dot diameter. The resulting sensitivity can even almost reach the fundamental limit of FFWFS. We further studied the specific case of a dot diameter of 2 λ/D, called Z2WFS, which exhibits an average gain in sensitivity by a factor of two (with a loss of sensitivity compared to the classical ZWFS only for the tip-tilt modes), without loss of dynamic range with respect to the classical ZWFS. This new sensor then becomes the most sensitive WFS available for ground-based astronomy. It still exhibits the low dynamic range of the ZWFS but, as for the PyWFS, modulation schemes can be imagined. For instance, one way to increase its linearity range is to dynamically change the dot diameter during one integration time of the sensor camera. Further studies will now investigate its practical implementation and the impact of chromaticity on wavefront sensing.

Appendix A: convolutional approach: general framework
In the infinite pupil approximation, and assuming that the weighting function ω (energy distribution at the focal plane) is a real centro-symmetric function and that the focal-plane mask function m is also centro-symmetric, the transfer function of a FFWFS may be written as:

$$\mathrm{TF} = 2\,\mathrm{Im}\left(\bar{m} \star m\omega\right) . \quad (A.1)$$

In this formalism, we consider only pure phase masks for the Fourier filter function m: since the global context is the search for WFS sensitivity, we did not consider amplitude masks, which would waste photons. The masks can therefore be written as m = e^{i∆}. Moreover, we continue to carry out these mathematical developments in one dimension. The previous equation (A.1) becomes:

$$\mathrm{TF}|_f = 2 \int \sin\left(\Delta|_u - \Delta|_{f-u}\right)\, \omega|_u \, \mathrm{d}u , \quad (A.2)$$

where u is expressed in units of λ/D. We choose to approximate the weighting function as a rectangular and normalized function with a diameter a:

$$\omega|_u = \frac{1}{a}\, \mathbb{1}_{[-a/2,\, a/2]}(u) . \quad (A.3)$$

The normalization ensures energy conservation. Furthermore, a may be seen as the characteristic size of the energy distribution at the focal plane. It is therefore the typical size of the modulation when studying the Pyramid WFS class, and it corresponds to the PSF characteristic length when the modulation is inactive. In other words, the weighting function allows us to take into account the finite size of the pupil in spite of the "infinite pupil approximation" needed in the convolutional approach.

Appendix B: Zernike WFS class
For the ZWFS, we consider the following mask: a centered dot with a diameter p and a depth δ. Thus, the phase of the filtering mask equals:

$$\Delta|_u = \begin{cases} \delta & \text{if } |u| \le p/2 \\ 0 & \text{otherwise.} \end{cases}$$

[Figure: ZWFS mask in one dimension with a uniform modulation representing the PSF (in blue); the centered mask ∆|_u (in black); the shifted mask ∆|_{f−u} (in red).]
The problem being symmetric, the derivation of equation (A.2) is only done for positive frequencies f ≥ 0. We can distinguish two cases: whether the size of the dot is bigger than the PSF, i.e. p ≥ a or not, i.e. p < a.
Case 1. The dot is larger than the PSF: p ≥ a.
Case 1.1: f ≥ (p + a)/2. We have ∆|_u = δ and ∆|_{f−u} = 0, which gives the plateau TF|_f = 2 sin(δ) shown in Figure 1.
In that case, ∆|_u − ∆|_{f−u} depends on f. We get:
Case 2. The dot is smaller than the PSF: p < a.
Case 2.1: f ≥ p. We have again ∆|_u = δ and ∆|_{f−u} = 0. Consequently, the transfer function equals:

$$\mathrm{TF}|_f = \frac{2p}{a}\,\sin(\delta) .$$

The plot corresponding to these results is given earlier in the letter, in Figure 1.
A&A proofs: manuscript no. aanda

Notably, it appears that half the characteristic length of the weighting function, a/2, i.e. the modulation radius when this device is active, plays a major role in the integration and may be seen as a cutoff frequency. Equation (C.4) becomes: Finally, we get the transfer function of the Pyramid class depending on its two optical parameters, namely the Pyramid apex angle and the weighting function characteristic size.
We may now use this formula to explain the sensitivity of the Pyramid WFS class with respect to the spatial frequencies.
Firstly, we are interested in the influence of the apex angle on the sensitivity. In other words, we study the difference between classical and flattened pyramids. To do so, we assume that the modulation is inactive. Consequently, the parameter a is related to the PSF size. Considering a pupil diameter D, a sensing wavelength λ and an imaging focal length f_oc, we have: To distinguish between flattened and classical pyramids, we just have to identify the limit apex angle θ_limit that allows the pupil images to be totally separated. It can actually be linked with the pupil diameter and the focal length of the imaging system via the following formula: Consequently, if θ is below θ_limit, there is an overlap of the pupil images and the Pyramid is therefore a flattened one, whereas a θ above θ_limit implies a complete separation of the pupil images and thus a classical pyramid. If we use the a and α variables, these two cases may be summarized in the following way:

aα < π : Flattened Pyramid (C.10)
aα ≥ π : Classical Pyramid (C.11)

Such a distinction explains why the sensitivity of the flattened pyramid may be larger in absolute value than that of the classical pyramid. As a matter of fact, the sinc function in equation (C.7) may be significant for some frequencies when aα is small, i.e. for flattened pyramids. Keeping in mind that the sensitivity is linked with the TF via the convolution product of Equation (8), this shows why the FPyWFS sensitivity may reach 2 whereas the classical pyramid attains at best 1. Moreover, a small α implies large oscillations with respect to the spatial frequencies (see green curve of Figure C.2). This is consistent with the observed sensitivity, which indeed oscillates (dotted green curve in Figure 6). By contrast, if aα is large, i.e. if the pupil images are completely separated, the sinc function may be neglected and the transfer function can be summarized as: This function may be seen as the transfer function of the classical pyramid.
We notice that it oscillates more rapidly than that of the flattened pyramid (yellow and purple curves of Figure C.2). However, these oscillations disappear when we look at the corresponding sensitivity curves (Figure 6). Such a peculiarity may be explained by Equation (8): to get the sensitivity curve, the transfer function is convolved with the PSF, which is in this case larger than the oscillation period. As a result, the transfer function is smoothed and the sensitivity follows the TF envelope.
Concerning this envelope, we may observe two regimes. The first one goes from the null spatial frequency to the modulation radius a/2; it is linear with f. The second one corresponds to spatial frequencies above the modulation radius; it is constant and equal to 1. We identify here the typical behaviour of the classical pyramid (modulated or not) with its two regimes, slope sensor and phase sensor, separated by a cutoff frequency corresponding to the modulation radius (Vérinaud 2004).
The convolutional approach therefore shows its capability to describe the sensitivity of pyramid sensors. It indeed yields a single formula which explains both the enhanced and oscillating sensitivity of the FPyWFS and the dual slope/phase sensor behaviour of the classical modulated PyWFS.
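The two regimes of the classical pyramid envelope described above can be written as a simple piecewise function. The sketch below is only an illustration of that description: the function name `pyramid_envelope` is hypothetical, and the envelope is assumed to grow linearly from 0 at the null frequency to 1 at the modulation radius a/2.

```python
import numpy as np

def pyramid_envelope(f, a):
    """Assumed TF envelope of the classical pyramid.

    Linear in |f| up to the modulation radius a/2 (slope sensor regime),
    then constant and equal to 1 (phase sensor regime)."""
    return np.minimum(np.abs(f) / (a / 2), 1.0)

# Example: modulation radius of 3 lambda/D, i.e. a = 6 lambda/D.
freqs = np.array([0.0, 1.5, 3.0, 5.0])
envelope = pyramid_envelope(freqs, a=6.0)
for f, e in zip(freqs, envelope):
    print(f"f = {f:.1f} cycles per pupil: envelope = {e:.2f}")
```

Increasing a moves the cutoff to higher frequencies, which lowers the envelope for all frequencies below the modulation radius.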