A&A
Volume 636, April 2020
Article Number A81
Number of page(s) 9
Section Astronomical instrumentation
DOI https://doi.org/10.1051/0004-6361/201937076
Published online 21 April 2020

© ESO 2020

1. Introduction

In the search for new exoplanets and Earth analogs, dedicated high-contrast imaging (HCI) systems angularly separate host stars from their surroundings, revealing circumstellar disks and exoplanets. The combination of extreme adaptive optics (XAO) to provide high spatial resolution, coronagraphs to suppress the host star’s light, and data reduction techniques to remove residual effects allows HCI systems to reach post-processed contrasts of 10⁻⁶ at spatial separations of 200 milliarcseconds (Zurlo et al. 2016). VLT/SPHERE is an HCI system that has discovered two confirmed planets: HIP 65426b (Chauvin et al. 2017) and PDS 70b (Keppler et al. 2018). It has also discovered a vast array of debris, protoplanetary, and circumstellar disks (e.g., Avenhaus et al. 2018; Sissa et al. 2018). Operating with a tip/tilt deformable mirror (TTDM), a 41-by-41 high-order deformable mirror (HODM), and a Shack-Hartmann wavefront sensor (SHWFS) sampling at 1380 Hz, the XAO system of VLT/SPHERE, SAXO, delivers Strehl ratios greater than 90% in the H band (Beuzit et al. 2019).

A major challenge with VLT/SPHERE (and, to varying degrees, other HCI systems) is the presence of the wind-driven halo (WDH), which dominates the wavefront error at small angular separations. The WDH is a manifestation of the servo-lag error and appears as a butterfly pattern in the coronagraphic/science images (see Cantalloube et al. 2018, for details on the WDH). The servo-lag error is due to the finite time between the measurement of the incoming wavefront aberration (caused by atmospheric turbulence) and the subsequent applied correction. The resulting wavefront error stems from the outdated disturbance information and the closed-loop stability constraints. Owing to this, the halo is aligned with the dominant wind direction and severely limits the contrast at small angular separations, even after post-processing. The servo-lag error prevents VLT/SPHERE from achieving its optimal performance when coherence times are below 5 ms (Milli et al. 2017).

The XAO system, SAXO, has a temporal delay of approximately 2.2 SHWFS camera frames. The HODM is controlled using an integrator with modal gain optimisation (Petit et al. 2014). Within this framework, one solution to minimise the delay in SAXO itself is to run everything faster. However, this solution poses a number of hardware challenges, requiring a new HODM that can run at the desired speed, a wavefront sensor camera with fast readout, and a more powerful real-time computer. An alternative solution is to upgrade the controller with a control scheme that predicts the evolution of the wavefront error over the time delay. In this paper, we look at the potential of prediction to improve the performance of SAXO, especially when the servo-lag error is the dominant residual wavefront error source (i.e., small coherence times).

Many different groups have worked on predictive control as a means of improving the performance of an adaptive optics (AO) system by minimising the servo-lag error. We highlight a few results from the last 15 years. Prediction, within the context of optimal control, is an ingredient in finding the optimal controller. Linear quadratic Gaussian (LQG) control has been explored by Petit et al. (2008) for general AO systems to perform vibration filtering with the Kalman filter. On-sky demonstrations of the LQG controller for tip-tilt/vibrational control are provided in Sivo et al. (2014). Laboratory work including higher order modes for atmospheric turbulence compensation (Le Roux et al. 2004) has demonstrated a reduction in the temporal error, showing predictive capabilities. The H2 optimal controller (closely related to LQG) has been tested on-sky (Doelman et al. 2011), showing a reduction in the temporal error for tip-tilt control. For multi-conjugate AO systems, a temporal aspect to the phase reconstruction for each layer has been implemented; the spatial-angular predictor is formed by exploiting the frozen flow hypothesis and making use of a minimum mean square error estimator with analytical expressions for the stochastic process (Jackson et al. 2015).

Within the HCI community, there have been many efforts to incorporate prediction into the AO control algorithm. Building on Poyneer & Macintosh (2006), predictive Fourier control, proposed in Poyneer et al. (2007), makes use of Fourier decomposition and the closed-loop power spectral density (PSD) to find components due to frozen flow. Using Kalman filtering, a predictive control law is determined, resulting in a reduction of the servo-lag error. Other efforts in HCI have focused on splitting the prediction step from the controller. This is done by first estimating the pseudo open-loop phase (slopes, or modes), applying a prediction filter, and then controlling the HODM using the predicted phases as input into the controller. This approach allows for a system architecture that can turn prediction on and off without affecting the control loop. A similar structure has been implemented using the CACAO real-time computer (Guyon et al. 2018). Empirical orthogonal functions (EOF) is a data-driven predictor that aims to minimise the phase variance. Implemented on SCExAO (an HCI instrument at the Subaru telescope using the CACAO real-time computer), it has been demonstrated (Guyon et al. 2018) that EOF improves the standard deviation of the point-spread function over a set of images. However, the improvement is less than expected from initial simulations (Guyon & Males 2017). Similar methods minimise the same cost function as EOF but with a different evaluation of the necessary covariance functions (see Sect. 3.1), as reported in van Kooten et al. (2019) and Jensen-Clem et al. (2019). Current a posteriori tests, using AO telemetry (Jensen-Clem et al. 2019), show an average factor of 2.6 improvement in contrast at separations from 0 to 10 λ/D. This approach to prediction will be implemented at the Keck telescope. One benefit of separating the prediction and control steps is that the behaviour of the input disturbance (atmosphere-induced phase fluctuations) can be studied for a given system and telescope site, and tests can be performed with AO telemetry data. We take this approach in this work, building on our earlier work (van Kooten et al. 2019) and looking solely at the predictability of the pseudo open-loop slopes under various atmospheric conditions. Previous work on prediction in the AO community has focused on one or two on-sky cases, successfully demonstrating the feasibility of predictive control. We look at how a linear minimum mean square error (LMMSE) predictor performs a posteriori on VLT/SPHERE AO telemetry data under a large set of observing conditions (such as guide star magnitude, coherence time, and seeing).

We organise this paper as follows: we introduce and summarise our SAXO data set in Sect. 2. In Sect. 3 we outline our methodology, including the structure of our predictor (Sect. 3.1). We elaborate on how we apply the predictor to the VLT/SPHERE SAXO telemetry data and present our results in Sect. 4, discussing them in Sect. 5. We look at how the predictor performs under different conditions as well as at the stationarity of the turbulence. The implications of the results are discussed in Sect. 5.4, and we conclude in Sect. 6, including future research directions.

2. SAXO data

The SAXO system has the option to save the full (or partial) XAO telemetry, including HODM positions, SHWFS slopes, SHWFS intensities, and the interaction matrix, at the discretion of the instrument user. In this paper, we make use of 27 SAXO data sets taken between 2016 and 2019. By limiting ourselves to these years we also have estimations of the atmospheric conditions (seeing, coherence time, and turbulence velocity) from the MASS-DIMM instrument located approximately 100 m away from UT4. The conditions under which our data were acquired are summarised in Fig. 1, where we plot the kernel density functions estimated from the data. We note that the data set is biased toward shorter coherence times (τ), with a median coherence time of 2.5 ms; from Milli et al. (2017) and Cantalloube et al. (2018) we expect the WDH (and thereby the servo-lag error) to be dominant when the coherence time drops below 5 ms. For completeness, the data set also contains a few cases with longer coherence times. The turbulence velocity is the velocity of the characteristic turbulent layer as determined by the MASS-DIMM instrument and is associated with a characteristic altitude determined from the atmospheric profile. Therefore, it does not necessarily indicate the speed of the jet stream layer but provides a tracer atmospheric velocity. For most of the data we have bright guide stars, with a mean magnitude of 5 mag in the r band (the wavefront sensor bandwidth), resulting in high signal-to-noise ratios (S/N) for the SHWFS in all cases. The seeing has a Gaussian-like distribution with a mean of 1.3 arcsec. A full summary of the entire data set (including time of observation) is provided in Table A.1. Each of the 27 data sets differs in length, ranging from 10 s to 60 s, allowing us to probe different conditions while still having the opportunity to observe the behaviour of turbulence on timescales of a minute.

Fig. 1.

Kernel density functions for the seeing, coherence time, turbulence velocity, and guide star magnitude (r band) showing the conditions under which the VLT/SPHERE telemetry was taken. The first three panels are measurements closest to the time of the observation output by the MASS-DIMM (accessed via ESO Paranal query form). The corresponding targets were found at the VLT/SPHERE ESO archive and their r-band magnitudes were found in the VizieR catalog. The kernel density functions are a nonparametric estimation of the probability functions.
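The kernel density estimates shown in Fig. 1 can be reproduced with standard tools; the following is a minimal sketch (not the authors' code), assuming the per-data-set coherence times of Table A.1 are available as a simple array. The values below are placeholders, not the actual measurements.

```python
# Sketch: nonparametric density estimate of one observing parameter,
# analogous to the kernel density functions of Fig. 1.
import numpy as np
from scipy.stats import gaussian_kde
import matplotlib.pyplot as plt

# Hypothetical coherence times in ms (placeholders for the Table A.1 values).
coherence_time_ms = np.array([1.8, 2.1, 2.5, 2.5, 3.0, 3.4, 4.8, 6.5])

kde = gaussian_kde(coherence_time_ms)      # Gaussian kernel density estimator
grid = np.linspace(0.0, 10.0, 200)         # evaluation grid [ms]
plt.plot(grid, kde(grid))
plt.xlabel("coherence time [ms]")
plt.ylabel("estimated probability density")
plt.show()
```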

To study the influence of prediction on our data sets, we estimated the pseudo open-loop phases, thereby applying prediction to the zonal two-dimensional grid of phase values and not the modes. We performed the open-loop estimation using only the HODM commands, because SAXO saves the full HODM voltages, not just the updates (see Appendix B). We converted to phase using the laboratory-measured HODM influence functions. As a result, we neglected the spatial frequencies beyond the spatial bandwidth of the HODM and therefore underestimated the open-loop phase at higher spatial frequencies. We took this approach after first considering the more traditional method of unravelling the pseudo open-loop phase from the SHWFS measurements and knowledge of the controller state. Without full knowledge of the modal gains and controller at each time step, as in our case, that method provides an inaccurate estimation.

In Fig. 2, we plot the resulting PSDs for the final pseudo open-loop phase from the VLT/SPHERE telemetry (SPHERE full) and for the closed-loop phase of the VLT/SPHERE residuals, and we identify some key features. The first is the presence of peaks around 40 Hz and 60 Hz in both the open- and closed-loop PSDs; performing a modal analysis, we find that these peaks appear for Zernike modes 7−11 (Noll indexing; Noll 1976) with varying amplitudes. The second is the increase in power at high temporal frequencies for the closed-loop PSD (i.e., the so-called waterbed effect, a result of Bode’s sensitivity integral).

Fig. 2.

Power spectral densities, estimated using the Welch method, for all the data sets; both for the full VLT/SPHERE estimated pseudo open-loop phases and for the reconstructed VLT/SPHERE residual phases.
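For reference, a minimal sketch of the Welch PSD estimate used for Fig. 2 is shown below. It assumes the pseudo open-loop phase of a single phase point is available as a 1-D time series sampled at the SHWFS rate; the array here is a random placeholder, not the SAXO telemetry.

```python
# Sketch: Welch estimate of the temporal PSD of one phase point.
import numpy as np
from scipy.signal import welch

fs = 1380.0                                  # SHWFS sampling frequency [Hz]
phase_ts = np.random.randn(int(40 * fs))     # placeholder: 40 s of phase values [um]

freqs, psd = welch(phase_ts, fs=fs, nperseg=4096)  # averaged periodograms
# psd has units of um^2/Hz; features such as the 40-60 Hz peaks would appear here.
```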

3. Methodology

Although we are ultimately interested in the improvement in contrast using prediction, we limit ourselves in this work to studying the minimisation of the servo-lag error. From Guyon (2005), Kasper (2012), and Cantalloube et al. (2018), we see that minimising the lag results in an improvement in contrast but ultimately also depends on the coronagraph of choice and how it interacts with the residual phase at small angular separations. We also do not have focal plane images taken at the same time as the data sets, making a clear claim of improvement unreliable. The metric we adopt is the temporally and spatially averaged wavefront phase variance.

3.1. LMMSE prediction

For our predictor we chose a data-driven method called the LMMSE predictor. This approach provides a flexible framework that allows us to implement the predictor in three different ways: batch, recursive, and with an exponential forgetting factor (van Kooten et al. 2019).

A single point i of a phase screen at time t is given by $y_i(t)$, while $\boldsymbol{u}(t)$ is a $P^2 \times 1$ column vector containing a collection of $P^2$ phase values on a discrete spatial grid at time t. We assume that the future value of a given phase point, $\hat{y}_i$, at the discrete time index t + d, is a linear combination of the most recent phase values at time t. The predictor coefficients are denoted as $\boldsymbol{a}_i$. The cost function of our predictor, with $\langle \cdot \rangle_t$ the time-average operator, is then

$$ \min_{\boldsymbol{a}_i} \big\langle \| y_i(t+d) - \boldsymbol{a}_i^{T} \boldsymbol{w}(t) \|^{2} \big\rangle_t , $$ (1)

where $\boldsymbol{w}(t)$ includes the set of the Q most recent measurements,

$$ \boldsymbol{w}(t) = \begin{pmatrix} \boldsymbol{u}(t)^{T} & \boldsymbol{u}(t-1)^{T} & \boldsymbol{u}(t-2)^{T} & \ldots & \boldsymbol{u}(t-Q)^{T} \end{pmatrix}^{T} . $$ (2)

We allow for both spatial and temporal regressors, gathered into $\boldsymbol{w}(t)$ – a $P^2 Q \times 1$ vector. We denote predictors of various orders by indicating the spatial order P (spatially limiting ourselves to a box of order P, symmetric around the phase point of interest, resulting in $P^2$ spatial regressors) followed by the temporal order Q; for example, an “s5t2” predictor has P = 5 and Q = 2.
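A minimal sketch of how the regressor vector of an "sPtQ" predictor can be assembled for one phase point is given below. The array and function names are illustrative only (not from the SAXO pipeline), `phase` is assumed to be a (T, N, N) array of pseudo open-loop screens, and the Q most recent frames are stacked so that the regressor has length P²Q, as stated in the text; the point is assumed to lie far enough from the aperture edge for the spatial box to fit.

```python
# Sketch: build w(t) for phase point (i, j) of an sPtQ predictor.
import numpy as np

def build_regressor(phase, t, i, j, P, Q):
    """Return w(t): a P x P box around (i, j) for each of the Q most recent frames."""
    h = P // 2
    frames = [phase[t - q, i - h:i + h + 1, j - h:j + h + 1].ravel()
              for q in range(Q)]             # u(t), u(t-1), ..., u(t-Q+1)
    return np.concatenate(frames)            # shape (P*P*Q,)
```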

Solving Eq. (1) for our zero-mean stochastic process, the solution can be written in terms of the inverse of the auto-covariance matrix and cross-covariance vector (Haykin 2002)

$$ \boldsymbol{a}_i = \mathbf{C}_{\boldsymbol{w}\boldsymbol{w}}^{+} \, \boldsymbol{c}_{\boldsymbol{w} y_i} , $$ (3)

where + denotes a pseudo-inverse; $\mathbf{C}_{\boldsymbol{w}\boldsymbol{w}}$ is the auto-covariance matrix of $\boldsymbol{w}$, the vector containing the regressors; and $\boldsymbol{c}_{\boldsymbol{w} y_i}$ is the vector containing the cross-covariance between the true phase value $y_i$ and $\boldsymbol{w}$.
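A minimal sketch of evaluating Eq. (3) when the covariances are estimated from a training set is shown below. Here `W` is a hypothetical (T × P²Q) matrix of stacked regressor vectors and `y` the matching vector of future phase values y_i(t + d); the names are illustrative, not the authors' code.

```python
# Sketch: batch evaluation of Eq. (3) for a single phase point.
import numpy as np

def batch_lmmse(W, y):
    """Return a_i = C_ww^+ c_wy estimated from training data (W, y)."""
    T = W.shape[0]
    C_ww = W.T @ W / T                       # auto-covariance of the regressors
    c_wy = W.T @ y / T                       # cross-covariance with the target
    return np.linalg.pinv(C_ww) @ c_wy       # pseudo-inverse as in Eq. (3)

# The prediction d frames ahead is then the inner product a_i @ w(t).
```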

We can estimate the covariances in Eq. (3) directly from a training set, forming a fixed batch solution (as in the sketch above). Alternatively, we can form a recursive solution making use of the Sherman-Morrison formula (a special case of the Woodbury matrix inversion lemma). In Eqs. (4) through (6) we insert an exponential forgetting factor λ into the update, forming our final LMMSE implementation; the plain recursive form is recovered by setting λ = 1, such that all previous data are weighted equally:

$$ \boldsymbol{c}_{\boldsymbol{w} y_i}(t-d) = \lambda\, \boldsymbol{c}_{\boldsymbol{w} y_i}(t-d-1) + \boldsymbol{w}(t-d)\, y_i(t-d) $$ (4)

$$ \mathbf{C}_{\boldsymbol{w}\boldsymbol{w}}^{+}(t-d) = \lambda^{-1}\, \mathbf{C}_{\boldsymbol{w}\boldsymbol{w}}^{+}(t-d-1) - \boldsymbol{k}(t-d) $$ (5)

with

$$ \boldsymbol{k}(t-d) = \frac{\lambda^{-2}\, \mathbf{C}_{\boldsymbol{w}\boldsymbol{w}}^{+}(t-d-1)\, \boldsymbol{w}(t-d)\, \boldsymbol{w}^{T}(t-d)\, \mathbf{C}_{\boldsymbol{w}\boldsymbol{w}}^{+}(t-d-1)}{1 + \lambda^{-1}\, \boldsymbol{w}^{T}(t-d)\, \mathbf{C}_{\boldsymbol{w}\boldsymbol{w}}^{+}(t-d-1)\, \boldsymbol{w}(t-d)} \cdot $$ (6)

By updating Eqs. (4) and (5), the coefficients can be found at each time step using Eq. (3). The recursive solution goes on-line immediately, with the initial auto-covariance set to a diagonal matrix with large values (as is done with recursive least-squares methods) and the cross-covariance vector set to ones. By adjusting the forgetting factor, we can weight old data less than the most recent measurements, therefore allowing the tracking of slowly varying signals.
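The following is a minimal sketch of one update of Eqs. (4)−(6) for a single phase point; `C_inv` stands for the running pseudo-inverse C_ww^+, `c_wy` for the cross-covariance vector, and `lam` for the forgetting factor (lam = 1 recovers the plain recursive form). Variable names are illustrative only.

```python
# Sketch: one recursive LMMSE update with an exponential forgetting factor.
import numpy as np

def recursive_update(C_inv, c_wy, w, y, lam=0.998):
    """Update C_ww^+ and c_wy with the new regressor w and target y (Eqs. 4-6)."""
    c_wy = lam * c_wy + w * y                                        # Eq. (4)
    Cw = C_inv @ w                                                   # C_ww^+ w (C_ww^+ is symmetric)
    k = (lam ** -2) * np.outer(Cw, Cw) / (1.0 + (w @ Cw) / lam)      # Eq. (6)
    C_inv = C_inv / lam - k                                          # Eq. (5)
    return C_inv, c_wy

# The coefficients then follow from Eq. (3): a_i = C_inv @ c_wy.
```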

3.2. Comparison with EOF

Methods such as LMMSE, EOF (see Guyon & Males 2017), and similar techniques (see Jensen-Clem et al. 2019) all minimise the same cost function, but the evaluation of Eq. (3) is different in each case; we note that the cost function is slightly different when including the exponential forgetting factor. In EOF, the solution is estimated with the inverse of the auto-covariance determined using a singular value decomposition that is re-estimated on minute timescales. The amount of data used to estimate the prediction filter, the numerical robustness, the noise properties of the system, and the atmospheric turbulence above the telescope all contribute to the performance of these algorithms and the final computational load. Therefore one implementation might be more suited to specific conditions than another, but the three methods can, in ideal conditions, result in the same performance.

3.3. Applying prediction to SAXO telemetry

From the estimated open-loop phases (which result in 240-by-240 phase screens reconstructed from the HODM), we bin the data to 60-by-60 phase screens for computational memory purposes. We then perform prediction on the estimated open-loop phases assuming a two-frame delay. For each phase point, we estimate a unique set of prediction coefficients using the equations outlined in Sect. 3.1. Our aim is to focus on the prediction capabilities, ignoring the control aspect by assuming a perfect system – no wavefront sensor noise and an HODM that can perfectly correct all predicted spatial frequencies – and only including the delay. We note that this results in no fitting error, no spatial bandwidth limitations, and no temporal bandwidth limitations on the achievable performance.
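The binning mentioned above can be done by simple block averaging; a minimal sketch (with a hypothetical array shape, not the SAXO data format) is:

```python
# Sketch: bin (T, 240, 240) phase screens down to (T, 60, 60) by 4 x 4 block averaging.
import numpy as np

def bin_screens(screens, factor=4):
    T, N, _ = screens.shape
    m = N // factor
    return screens.reshape(T, m, factor, m, factor).mean(axis=(2, 4))
```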

We ran batch, recursive, and forgetting (with λ = 0.998, as this value gives the best performance assuming λ ≠ 1) LMMSE predictors for each prediction order. We then subtracted the predicted phase from the pseudo open-loop phase two frames later, resulting in the predictor residual phase. From these residuals we calculated the spatio-temporally averaged phase variance over the final 5 s of each data set, by which point all the different predictors have converged. We started with the performance of an s1t1 predictor. The s1t1 is a zero-order predictor because it only makes use of the most recent measurement for a given phase point, making it analogous to an optimised integrator (with a gain close to unity) in our simulations; we refer to the s1t1 as the ideal VLT/SPHERE performance.
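A minimal sketch of this residual and metric computation is given below, assuming hypothetical (T, 60, 60) arrays for the pseudo open-loop screens and the corresponding predictions (made d frames earlier); it is not the authors' code and ignores any aperture masking.

```python
# Sketch: predictor residual and spatio-temporally averaged phase variance.
import numpy as np

def averaged_phase_variance(open_loop, predicted, fs=1380, d=2, seconds=5.0):
    """Spatial phase variance per frame, averaged over the final few seconds [um^2]."""
    residual = open_loop[d:] - predicted[:-d]     # prediction made at t, truth at t + d
    tail = residual[-int(seconds * fs):]          # last 5 s, after the predictors converge
    return tail.var(axis=(1, 2)).mean()
```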

We performed simulations testing a variety of predictors with different spatial and temporal orders including s1t3, s1t10, s3t1, and s3t3. We looked at how the three different implementations of the LMMSE for each prediction order behave (see Fig. 3). In Sect. 4 we present the results for the recursive s1t10 predictor, which performs the best out of all the different orders.

Fig. 3.

Top panel: time series showing the estimated pseudo open-loop phase in black, compared with the VLT/SPHERE residual in blue for a randomly chosen SAXO telemetry data set. Other panels (all with the same y-scale): various predictor residual phase variances (green and purple lines) compared to the VLT/SPHERE residuals for the same data. The averaged phase variance, in μm², for the last 5 s is indicated in the top right corner of each panel. The forgetting s1t10 (abbreviated “for.”) does not perform significantly better than the recursive s1t10 (abbreviated “rec.”); likewise, the s3t3 does not perform better than the s1t10.

4. Results

We find that prediction provides a reduction in averaged phase variance when compared to the VLT/SPHERE SAXO residuals. An example of how prediction behaves spatially and temporally for a slice across the telescope aperture is shown in the bottom panel of Fig. 4. We see a reduction in the phase compared to the VLT/SPHERE residuals and a more uniform solution in time and space. In Fig. 5 we summarise the results of running prediction on all of our data sets. As mentioned above, the averaged phase variance is found using the last 5 s of the data; the data sets all vary in length. Since we see a reduction in the residual phase variance in all cases, we expect an improvement in performance for the XAO system independent of the guide star magnitude, the seeing, and the coherence time. We also observe a reduction in the spread of the residual phase variance, with prediction providing a more uniform performance across conditions; see the kernel density function plots in the top panel of Fig. 5.

Fig. 4.

Vertical slice across the telescope aperture (y-axis) showing the wavefront phase in μm (indicated by the colour-bars), plotted as a function of time for the pseudo open-loop phase (top), the VLT/SPHERE residual phase (middle), and s1t10 predictor residual phase (bottom) for the same night as Fig. 3. Comparing the bottom two panels (colour map is the same in both), we can see the prediction residuals have a flatter and more uniform appearance compared to the real VLT/SPHERE residuals.

Fig. 5.

Top: kernel density function estimation for the averaged phase variances plotted in the bottom panel, showing the change in the spread of the values. Bottom: pseudo open-loop averaged wavefront phase variance compared to residual averaged wavefront phase variance for VLT/SPHERE, a batch s1t1 (i.e., an idealised integrator for VLT/SPHERE), and a recursive s1t10 predictor. The points move from right to left, indicating that an s1t1 does better than VLT/SPHERE and a higher-order predictor does even better than the s1t1.

In Fig. 6 we plot the ratio of the VLT/SPHERE residual phase variance to the recursive s1t10 predictor residual phase variance, defining this as the “ratio of improvement”, as a function of coherence time. In the same figure, we add the ratio of improvement of the same recursive s1t10 predictor with respect to an idealised VLT/SPHERE integrator (batch s1t1), again against coherence time. We calculate the average ratio of improvement to be 5.1 and 2.0, respectively. Figure 6 also indicates the relative seeing conditions through the size of the markers; smaller markers indicate better seeing conditions.

Fig. 6.

Ratio of improvement, found by taking the ratio of an idealised integrator on VLT/SPHERE to a recursive spatial-temporal predictor (s1t10) phase variance as calculated from the last 5 s of data, as a function of coherence time. The size of the markers indicates the MASS-DIMM seeing conditions at the time of observation. The average ratio of improvement is 5.1 when comparing the prediction to the real VLT/SPHERE residuals. When looking at the idealised VLT/SPHERE, we find an average ratio of improvement of 2.0 in wavefront variance reduction.

We evaluate several predictors, varying both the spatial and temporal orders. We find that there is no gain in performance from adding spatial regressors; temporal-only predictors perform equally well. These results are summarised in Table 1.

Table 1.

Averaged phase variance for the pseudo open-loop, VLT/SPHERE residuals, s1t3 residuals, s3t3 residuals, and the s3t1 residuals.

We see minimal evidence of nonstationary behaviour of the optical turbulence. First, looking at the coherence times in Table A.1, we do not see a significant change in coherence time for data taken on the same night, showing that on 100 s timescales the statistics of the optical turbulence do not vary significantly. Second, on shorter timescales, we do not see evidence of nonstationary turbulence. In Fig. 3, we plot the batch, recursive, and forgetting LMMSE for an s1t10 predictor. From the average residual phase variances, we see that the batch and recursive implementations perform the same over the full 40 s period. We do see a slight improvement for the forgetting LMMSE, implying a slightly time-variant behaviour of the pseudo open-loop phase, but nothing significant.

5. Discussion

5.1. Comparison of the prediction residuals to the VLT/SPHERE residuals

We should note the difficulties in performing a direct comparison between the predictor residuals and the real VLT/SPHERE residuals. There are a few challenges, the first being a difference in delay. In our estimation of the open-loop phase we choose to round the frame delay to a whole frame; therefore our predictor sees a delay of 2 frames (or 1.45 ms), while the real system delay, encoded in the real VLT/SPHERE residuals, is 2.2 frames (1.59 ms). We therefore expect the VLT/SPHERE residuals to have a larger phase variance than they would if the true delay were 2 frames. An alternative to rounding the delay to whole frames is to interpolate, a step that would also have introduced an error. Second, we make use of the HODM commands, which are saved as the total voltage on the HODM, not the update to the HODM. The SHWFS, however, is used to determine the real VLT/SPHERE residuals and therefore sees higher order spatial frequencies, potentially increasing those residuals relative to our HODM-based estimate. Perhaps the most substantial contribution to the final performance of VLT/SPHERE is that it is limited by operational parameters: the controller needs to be stable and robust on-sky, potentially resulting in a loss of performance compared to our idealised situation. Therefore, in Figs. 5 and 6 we plot the idealised VLT/SPHERE (batch s1t1) as well as the VLT/SPHERE residuals. The true gain from prediction will lie between the idealised VLT/SPHERE and the real VLT/SPHERE residuals, and the ratio of improvement will fall between 5.1 and 2.0.

5.2. Performance under different conditions

Our data set and analysis are unique, showing that prediction yields an improvement under almost all conditions, even for long coherence times, with no loss in performance observed. However, the data set does not show any clear correlations between predictor performance and observing conditions (see Figs. 6 and 7). We briefly discuss the behaviour for various observing parameters, including coherence time, turbulence velocity, seeing, and finally guide star magnitude.

Fig. 7.

Ratio of improvement compared to seeing (left), turbulence velocity (middle), and guide star magnitude in r band (right) during the time of observation.

Looking closely at Fig. 6 and the behaviour for various coherence times, we do not find any correlation between the coherence time and performance for the true VLT/SPHERE residuals. However, when looking at the idealised VLT/SPHERE behaviour we see, at smaller coherence times, an exponential-like gain in the ratio of improvement, whose asymptote is at the Nyquist sampling time of two wavefront sensor frames (2/fWFS = 2/1380 Hz ≈ 1.45 ms). This behaviour for the idealised case is as expected, with larger improvements for shorter coherence times. We then look at the relation between the ratio of improvement and the turbulence velocity (middle plot of Fig. 7). We expect behaviour similar to that of the coherence time, as the two are related. We note that we also see no correlation with the ground-layer wind speeds measured by the nearby meteorological tower. Studying the relation between the seeing and the ratio of improvement (left plot of Fig. 7), we notice an asymptotic behaviour where the ratio improves for better seeing conditions. We do not see any dependence between performance and S/N (or guide star magnitude; right plot of Fig. 7). For the VLT/SPHERE residuals, the lack of correlation between performance and S/N is as expected from the laboratory and on-sky validation of SAXO by Fusco et al. (2016) at these guide star magnitudes. We also note that for long coherence times VLT/SPHERE is often looking at fainter targets, since the conditions are ideal; this is the case for our data as well (see Table A.1). In summary, we see a relation between the true VLT/SPHERE residuals and the seeing, while for the idealised VLT/SPHERE case we see a relation between the improvement and the coherence time, as expected. The lack of correlation between the ratio of improvement and the other observing parameters could be a result of the different locations of VLT/SPHERE and the MASS-DIMM (which measures the observing parameters), and of the fact that values determined over minute-long averages do not reflect the exact conditions at the time of observation. Alternatively, the behaviour of the SAXO controller is limited by internal system requirements (such as vibration rejection) and not by the observing conditions, resulting in a lack of correlation between observing conditions and gain in performance.

Studying Fig. 6, we can see that the ratio of improvement when using the idealised VLT/SPHERE as a benchmark behaves very differently from the true VLT/SPHERE case. For the direct comparison to VLT/SPHERE, we do not see any correlation between the ratio of improvement and the coherence time. For the idealised case, however, we see that at longer coherence times we have no improvement, since the SAXO XAO system can already perform well.

5.3. Time-invariant turbulence statistics

Observing the behaviour of the three implementations of the LMMSE (batch, recursive, and exponential forgetting), we can comment on the stationarity of the turbulence. The LMMSE finds the optimal prediction coefficients determined from the training set; the batch is trained on the first 5 s, while the recursive trains continuously. Comparing the residual phase variances for the last 5 s in each case, we do not see a significant difference between the batch and recursive solutions for any data set, indicating that the statistics of the turbulence have not changed over the measurement period. Studying the recursive solution, we look at the behaviour of the prediction coefficients in time across the aperture. We do not see any notable changes over the entire period once the solution has converged. We do see more fluctuations in the prediction coefficients for phase points located at the edges of the aperture for predictors using spatial information, such as the s3t3. Conversely, the exponential forgetting LMMSE does show a slight improvement; however, this could be due to noise in the system that the LMMSE predictor can and does remove.

5.4. Implications of results

For all conditions we see an increase in performance with prediction when compared to the VLT/SPHERE residuals. Under the idealised assumption there are a few cases where the ratio of improvement is one, indicating no gain but also, notably, no loss in performance.

A more notable result is found in the kernel density functions plotted in the top panel of Fig. 5. The spread of the averaged phase variance is smaller for the s1t10 predictor, indicating a more uniform performance in phase variance reduction across different observing conditions. From an observational point of view, having a more stable correction under different conditions is desirable, especially for surveys in which observers target similar objects and can perform reference star differential imaging from a library.

The ratio of improvement we find is 5.1, but this is probably an overestimate of the improvement we could achieve on-sky. In previous prediction work, Guyon & Males (2017) show an improvement of a factor of 7 in root-mean-square (rms) residual wavefront error, while offline telemetry tests by Jensen-Clem et al. (2019) show a factor of 2.5 reduction in rms wavefront error; we note that both of these works refer to systems located on top of Mauna Kea. Although an exact direct comparison using these values is impossible, we note that we show a more modest predictive improvement compared to these studies.

When studying the prediction order we do not see a large gain from including spatial information. This is due to the large sub-aperture size and the high rate of temporal sampling, which mean that the turbulence only moves across a sub-aperture after many frames; for example, assuming a wind speed of 10 m s−1 and a 0.2 m sub-aperture, it takes approximately 28 frames before a cell of turbulence moves to the next sub-aperture. The turbulence is still dynamic, but we sense the average of the variations that fall within a wavefront sensor sub-aperture, while the temporal sampling is much finer. We expect a temporal-only predictor to be best suited for HCI; by removing the spatial information, we are no longer sensitive to wind direction (a spatial solution requires a symmetric choice of regressors), which also reduces the computational size of the prediction problem.
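Explicitly, the crossing-time estimate quoted above follows from the numbers in the text (sub-aperture size d_sub = 0.2 m, wind speed v = 10 m s−1, and SHWFS frame rate f_WFS = 1380 Hz):

$$ N_{\rm frames} = \frac{d_{\rm sub}}{v}\, f_{\rm WFS} = \frac{0.2\ \mathrm{m}}{10\ \mathrm{m\,s^{-1}}} \times 1380\ \mathrm{Hz} \approx 28 . $$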

We do not see evidence of time-variant turbulence, meaning that our predictor does not need to track changes in turbulence behaviour on timescales shorter than 1−2 min of observations. We see a slight increase in performance from the exponential forgetting factor LMMSE solution, but the difference is very slight and not substantial enough to suggest this as the best choice. From a computational point of view, the batch LMMSE is the best option, and resetting it every 1 to 2 min (or as needed based on longer telemetry data) using 5 s of training data would be the best implementation.

6. Conclusions and future work

We find a reduction in the phase variance in comparison to the VLT/SPHERE residuals and determine the ratio of improvement to be 5.1 for SAXO telemetry data. When prediction is compared to an idealised VLT/SPHERE system, we find an improvement ratio of 2.0. In all cases, no matter what the observing conditions, prediction performs well with no loss in performance. Most importantly, we note that under all the 27 various observing conditions studied, we see a reliable and overall more consistent improvement of the system performance. The data set, in combination with our predictors, reveals that the optical turbulence as seen by the telescope is time-invariant and that the temporal regressors have a larger impact on the performance of the predictor than spatial regressors. We recommend a batch (updating every few minutes as necessary) temporal-only predictor for the VLT/SPHERE to reduce the servo-lag error. In future work we will seek to investigate the effects of prediction on contrast for different types of coronagraphs and to determine whether predictive control can be used to directly optimise the raw contrast.

Acknowledgments

The authors would like to thank Markus Kasper and Julien Milli for providing us with the VLT/SPHERE SAXO data. The authors would also like to thank Leiden University, NOVA, METIS consortium, and TNO for funding this research.

References

  1. Avenhaus, H., Quanz, S. P., Garufi, A., et al. 2018, ApJ, 863, 44
  2. Beuzit, J.-L., Vigan, A., Mouillet, D., et al. 2019, A&A, 631, A155
  3. Cantalloube, F., Por, E. H., Dohlen, K., et al. 2018, A&A, 620, L10
  4. Chauvin, G., Desidera, S., Lagrange, A.-M., et al. 2017, A&A, 605, L9
  5. Doelman, N., Fraanje, R., & den Breeje, R. 2011, 2nd Conference on Adaptive Optics for Extremely Large Telescopes, 1
  6. Fusco, T., Sauvage, J. F., Mouillet, D., et al. 2016, Int. Soc. Opt. Photon., 9909, 273
  7. Guyon, O. 2005, ApJ, 629, 592
  8. Guyon, O., & Males, J. 2017, ArXiv e-prints [arXiv:1707.00570]
  9. Guyon, O., Sevin, A., Gratadour, D., et al. 2018, Int. Soc. Opt. Photon., 10703, 469
  10. Haykin, S. 2002, Adaptive Filter Theory, 4th edn. (Upper Saddle River, NJ: Prentice Hall)
  11. Jackson, K., Correia, C., Lardière, O., Andersen, D., & Bradley, C. 2015, Opt. Lett., 40, 143
  12. Jensen-Clem, R., Bond, C. Z., Cetre, S., et al. 2019, Int. Soc. Opt. Photon., 11117, 275
  13. Kasper, M. 2012, Proc. SPIE, 8447, 84470B
  14. Keppler, M., Benisty, M., Müller, A., et al. 2018, A&A, 617, A44
  15. Le Roux, B., Ragazzoni, R., Arcidiacono, C., et al. 2004, Proc. SPIE, 5490, 1336
  16. Milli, J., Mouillet, D., Fusco, T., et al. 2017, Performance of the Extreme-AO Instrument VLT/SPHERE and Dependence on the Atmospheric Conditions, AO4ELT5
  17. Noll, R. J. 1976, J. Opt. Soc. Am., 66, 207
  18. Petit, C., Conan, J.-M., Kulcsár, C., Raynaud, H.-F., & Fusco, T. 2008, Opt. Exp., 16, 87
  19. Petit, C., Sauvage, J. F., Fusco, T., et al. 2014, Int. Soc. Opt. Photon., 9148, 214
  20. Poyneer, L. A., & Macintosh, B. A. 2006, Opt. Exp., 14, 7499
  21. Poyneer, L. A., Macintosh, B. A., & Véran, J.-P. 2007, J. Opt. Soc. Am. A, 24, 2645
  22. Sissa, E., Olofsson, J., Vigan, A., et al. 2018, A&A, 613, L6
  23. Sivo, G., Kulcsár, C., Conan, J.-M., et al. 2014, Opt. Exp., 22, 23565
  24. van Kooten, M., Doelman, N., & Kenworthy, M. 2019, J. Opt. Soc. Am. A, 36, 731
  25. Zurlo, A., Vigan, A., Galicher, R., et al. 2016, A&A, 587, A57

Appendix A: Overview of SAXO data

In this appendix we provide a full summary of our data set. The XAO telemetry data is stored on the SAXO server and a log of when AO data was taken can be found on the ESO science archive under the VLT/SPHERE instrument. The 2019 data was kindly provided to us by Markus Kasper while the other data sets were accessed by Julien Milli.

Table A.1.

Summary of the VLT/SPHERE data used in this work as well as the MASS-DIMM atmospheric conditions as recorded closest to the measurement time.

Appendix B: Estimation of open-loop phase

In this appendix, we explain how the SAXO VLT/SPHERE telemetry data are used to estimate the pseudo open-loop phase, providing an estimation of the open-loop phase of the wavefront at the pupil plane due to atmospheric turbulence. Usually the pseudo open-loop phases are estimated using wavefront sensor data and the controller state (deformable mirror updates, gain, and interaction matrix). Specifically, by summing the measured wavefront sensor phase and the previous deformable mirror updates, an estimation of the atmospheric phase can be made. We make use of an alternative method based on the HODM, for which we have the full voltages applied to the mirror. Spatially, the HODM has sampling comparable to the SHWFS (41-by-41 actuators versus 40-by-40 sub-apertures), and therefore using the HODM to estimate the open-loop phase does not restrict the spatial bandwidth. Similarly, the temporal bandwidth of the system is determined by the SHWFS frame rate, while the temporal bandwidth of the HODM is much higher. By using the HODM we therefore do not lose any spatial or temporal bandwidth.

From Fig. 2, we can see that SAXO provides a good correction for our data sets, as expected. We assume the residual errors are negligible owing to the high expected Strehl ratio for our conditions (80−90% in the H band, see Fusco et al. 2016). From Fig. 2, we also see that the closed-loop PSDs estimated from the wavefront sensor residuals are relatively flat, indicating a good correction. We can therefore take the HODM surface as representative of the full atmospheric phase. In an open-loop system, the full atmospheric phase is measured by the wavefront sensor and then used to determine the deformable mirror commands; the surface of the deformable mirror thus represents the estimated open-loop phase of the wavefront in the pupil plane. It can be expressed in the form of a lifted vector as

$$ y(t) = \mathrm{DM}_{\rm surface}(t) , $$ (B.1)

where $\mathrm{DM}_{\rm surface}(t)$ is a $41^2 \times 1$ vector.

Therefore, since we know the full voltages applied to the HODM, we can estimate the pseudo open-loop phase from the a posteriori data.
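A minimal sketch of Eq. (B.1) in practice is given below: the pseudo open-loop phase is taken as the HODM surface, reconstructed from the saved actuator voltages and the laboratory-measured influence functions. The arrays `voltages` (T × 41²) and `influence` (41² × Npix) are hypothetical stand-ins for the SAXO telemetry and calibration products, and the voltage-to-surface scaling is assumed to be folded into `influence`.

```python
# Sketch: estimate the open-loop phase maps from the saved HODM voltages.
import numpy as np

def open_loop_phase(voltages, influence):
    """Return (T, Npix) phase maps: each row is the lifted DM-surface vector y(t)."""
    return voltages @ influence        # sum of actuator influence functions per frame
```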
