A&A 405, 11771189 (2003)
DOI: 10.1051/00046361:20030653
Efficient firstorder performance estimation for highorder adaptive optics systems
R. C. Flicker
Lund Observatory, Box 43, 22100 Lund, Sweden
Received 3 February 2003 / Accepted 25 April 2003
Abstract
It is shown how firstorder performance estimation of highorder adaptive optics (AO) systems may be efficiently implemented in a hybrid numerical simulation by the use of 1) sparse matrix techniques for wavefront reconstruction, 2) undersampled pupilplane turbulenceinduced aberrations, and 3) analytical models that compensate  in the limit of infinite exposure time  for the errors introduced by undersampling. A sparse preconditioned conjugate gradient (PCG) method is applied for wavefront reconstruction, and it is seen that acceptable AO performance may be achieved at a relative error tolerance of 0.01, at which the computational cost of the sparse PCG scales approximately as
O(n^{1.2}), where n is the number of actuators in the system. Estimations of adaptive optics performance for extremely highorder systems are presented, including multiconjugate and laserguidestarbased systems. The scaling laws for AO performance with telescope diameter D and turbulence outer scale L_{0} coupled with the use of laser guide stars are also investigated. It is shown that a single or a small number of laser guide stars (LGS) may still provide a useful level of compensation to telescopes with diameters in the range
30100 m, if L_{0} is on the order of or smaller than D. The deviations from Kolmogorov theory are also investigated for LGS AO. To the best of the author's knowledge, results presented for a n=65 282 case represent the largest multiconjugate adaptive optics system simulated in full to date.
Key words: instrumentation  adaptive optics  methods  numerical
The technology of adaptive optics (AO) is attaining an increasingly more prominent role in groundbased optical astronomy as the next generation of telescopes and instrumentation is being planned. It is clear (Ragazzoni et al. 2000; Snel & Ardeberg 2002; Hawarden et al. 2002; Gilmozzi 2000) that a neardiffractionlimited mode of imaging will be required of future groundbased telescopes with aperture diameters ranging upward from tens of meters to 100 m, in order to enable such science as will be unique to that large a telescope and thereby justify the staggering costs involved. AO systems, as generically outlined in Fig. 1, are thus moving toward becoming an integral part of a telescope, and may rightfully claim to be one of the important "raisons d'être'' for future extremely large telescopes (ELTs). It can be problematic, however, to study and design adaptive optics systems when the required order of the system, as represented henceforth by the total number of actuators n, becomes very large. The computational cost for a conventional wavefront reconstruction algorithm, a simple matrixvector multiplication, is of the order O(n^{2}) floating point operations and the cost for the matrix inversion required to prepare the estimator scales as O(n^{3}). Since n scales linearly with the surface area of the telescope aperture, the computational costs for these two actions will scale as O(D^{4}) and O(D^{6}), where D is the diameter of the telescope. Hence it is clear that, at some point, a practical problem will arise both in simulation and realtime control of the system as D grows large. These scaling laws are illustrated in numbers in Table 1.

Figure 1:
Generic adaptive optics schematic, introducing the chief components that must be incorporated into a firstorder simulation of the system: the reference beacon ("guide star''), atmospheric turbulence, the telescope, wavefront sensor, deformable mirrors and a realtime control computer. 
Current AO systems in use for astronomical purposes range from orders n of a few tens to
1000 actuators, where simulations are easily manageable. One may estimate that ELTs with diameters in the 30100 m range will require adaptive optics systems of orders n= 10^{3}10^{6}, depending on what degree of correction is desired and whether multiconjugation and tomography  multiconjugated adaptive optics (MCAO)  are to be employed. Apart from the realtime computational load becoming overwhelming toward the high end, previous simulation techniques have been unable to efficiently estimate longexposure performance with sufficient accuracy for a large range of telescope diameters. Modeling efforts of adaptive optics systems may be loosely grouped into two categories: 1) analytical models predicting the longexposure performance of the system as a function of system and observation parameters (Ellerbroek 1994; OwnerPetersen & Goncharov 2002; Tokovinin et al. 2001; Rigaut et al. 1998), and 2) MonteCarlotype simulations that mimic the spatial and temporal behavior of the system by applying spatial algorithmic actions each discrete time step exactly as they would be applied in a live run of the system (Flicker et al. 2000; Le Louarn 2002; Ellerbroek 2002; Fusco et al. 2001). In the first case, spatialtemporal analytical models of the system's various components (cf. Fig. 1) are integrated into the analytical formalism of a performance estimator, which generally amounts to a collection of integrals that may be evaluated numerically for a given set of system and observational parameters. Such models can be made very fast and powerful when focusing on a particular aspect of the system, by introducing approximations in areas that are tangential or unimportant to the specific subject of the study. Analytical performance estimators become increasingly complicated, however, when many aspects of the system are to be modeled simultaneously to augment the generality and accuracy of the results. Apart from the computational issues arising thus for very highorder systems, it is generally difficult to find analytical descriptions of closed loop spatialtemporal statistics that are both accurate and computationally tractable. For these reasons, fullfledged analytical models of closed loop AO systems have not yet been applied to extremely highorder systems. By contrast, the MonteCarlotype simulation is only required to model open loop turbulence, and closed loop properties will be emergent upon closing the loop in the numerical simulation. In a Monte Carlo simulation, longexposure results are obtained only after allowing the system to run in a closed feedback loop for many cycles until a statistically significant average is obtained. In either approach, however, it has been the case that extremely highorder systems face computational bottlenecks in the dealing with very large matrices that require storage beyond the capabilities of ordinary computers  and even granted storage, handling of and performing computations with such unwieldy data objects becomes timeconsuming to the point of scientific analysis being rendered infeasible.
It has been recognized that the spatial interaction between wavefront sensors (WFS) and deformable mirrors (DM) in adaptive optics systems may in a linearized model be described by a very sparse linear system. This means that the influence of a single actuator on the DM is registered only by a small number of WFS elements (subapertures). Sparse matrix methods that exploit the sparseness of the system to economize on computations have been applied to the inverse problem of wavefront reconstruction by, notably, Cochran (1986) in early investigations and Ellerbroek (2002) (E02) and Gilles et al. (2002) (GVE02) more recently. The adaptive optics performance estimation model to be investigated in this paper differs from the latter two on a number of important points. Whereas E02 uses a Cholesky factorization solver, the simulation of the current study employs a sparse preconditioned conjugate gradient (PCG) method for efficient wavefront reconstruction. This is also the approach of GVE02, although they apply it within a multigrid framework to increase efficiency further. The PCG has a couple of advantages over the Cholesky solver: 1) numbering, which may be important to optimize efficiency of a Cholesky solver (Cochran 1986), does not matter in the PCG; 2) the PCG allows realtime tuning of the regularization strength (see Sect. 3) to optimize performance, and 3) speed may be traded against accuracy by varying the PCG relative error tolerance, by which it may be pushed to outperform even an optimally ordered Cholesky solver. But whereas GVE02 is an open loop simulation focusing on the wavefront reconstruction performance, the simulation of this paper goes on to incorporate this algorithm into a complete closed loop AO simulation model similar to that of E02. The engine of the simulation is a MonteCarlotype simulation, but unlike E02 the current simulation is divided into two parts pertaining to the low and high spatial frequency content of the atmospheric turbulence, which are treated fundamentally differently. Only the turbulence that lies within the domain of spatial frequency attenuation of the AO system is processed through the Monte Carlo simulation, which greatly reduces computing demands. The effect of the unprocessed high spatial frequency content are computed analytically in the limit of infinite exposure time, and depending upon their nature these corrections are added to the simulation either in the loop or in postprocessing the final result. This renders the simulation very fast and economic, and permits efficient parameter studies of extremely highorder adaptive optics systems. The simulation allows for artificial reference beacons  laser guide stars (LGS), multiple reference beacons for atmospheric tomography and multiple deformable mirrors for multiconjugation. MCAO on telescopes as large as 100 m thus becomes possible with relatively modest computing resources.
Section 2 introduces the linearized description of wavefront sensing and reconstruction which has become one of the most commonly used formalisms for astronomical adaptive optics systems, and is often used as the canonical starting point for modeling. The sparse matrix methods to be applied to the wavefront reconstruction algorithm are described in the first part of Sect. 3, and the second part describes the details of the Monte Carlo simulation. In Sect. 4 some sample numerical results are presented for a number of elementary adaptive optics configurations, including the use of laser reference beacons and multiconjugation.
One may without much loss of generality adopt a wavefront sensor model that partitions the telescope pupil plane into a uniform grid of subapertures, within which individual wavefront measurements are made on the part of the wavefront incident on the subaperture. To start off as generally as possible, the response s_{i} from the subaperture i centered on the pupil plane
coordinate
to a static phase aberration
present in the telescope pupil plane may be described by a nonlinear functional

(1) 
where
is the aperture transmission function of subaperture i and
encodes all the steps of modulation, image formation, detector response and signal processing. Although an authentic description, this is a few orders too general and not very useful for practical analysis. It is usually the case, however, and indeed a desirable feature of closed loop adaptive optics systems for successful calibration and implementation, that the response becomes essentially linear with
within a limited range of operation. Within this linear range, one may study much simplified heuristic WFS models based on what the resultant measurement signal actually represents. One such representation which will be appropriate in the following is

(2) 
where
is a linear differential operator. This Fredholm equation of the first kind covers the two most common WFS types  the gradient sensors (e.g. ShackHartmann, lateral shearing interferometer) and the curvature sensors (BeckersRoddier)  by setting, respectively,
and
.
The closed loop WFS signal consists of the turbulenceinduced open loop phase aberration
minus the phase correction
applied by the adaptive optics. Under certain conditions one may assume that
is a linear superposition of the DM influence functions g_{j} according to

(3) 
where c_{j} is the jth actuator command signal. The validity of this assumption is limited, as some DMs (e.g. electrostatic membrane and bimorph mirrors) may have influence functions extending nonlinearly over many actuators, and even well localized influence functions have a tendency to deviate slightly from perfect linearity. It has been found, however, that within the linear range of the WFS (3) is usually a sufficiently good approximation which leads not only to accurate simulation results but also sufficiently accurate wavefront reconstruction in real AO systems. Introducing a discrete description of
over a total of N computational mesh points,
,
one may write the WFS interaction equation as
with the matrix elements defined by
B_{ij} 
= 

(7) 
G_{ij} 
= 

(8) 
The closed loop WFS response to optical phase aberrations introduced by the atmosphere and by the deformable mirror may thus be written on vector form as

(9) 
where s is the melement column vector of WFS measurements, c the nelement vector of DM actuator commands, G the
DM interaction matrix (m rows, n columns), B the
phase interaction matrix and n a melement vector of additive noise on the measurement signal. G and B are Jacobian type matrices,
and
,
encoding information on the geometry of WFS subapertures and DM influence functions. Whereas B is of primarily academic interest, G is a calibration matrix that defines the fundamental interaction between the deformable mirror and the wavefront sensor. It may be obtained columnwise by poking the DM actuators one by one and measure the response in the sensor (hence called the "pokematrix'' by some authors).
To spatially control an adaptive optics system, the inverse problem posed by the linearized interaction Eq. (9) must be solved for the command signal vector c. Requiring the reconstruction to be a linear mapping, the solution may be stated on the general form

(10) 
where F is a spatialtemporal filter. Defining the open loop measurement
signal
,
one finds upon combining (10) and (9)

(11) 
From here it is not clear how to proceed in the general case to derive the filter F that is closed loop optimal. Ellerbroek (1994) shows that an optimal estimator may be derived by separating spatial and temporal variables by the splitting F=fE, where f is a scalar temporal filter and E the spatial reconstruction matrix, and then imposing the constraint EG=I to linearize the spatial component of (11). For reasons of sparseness and computational efficiency (see Sect. 3), only open loop wavefront reconstruction shall be considered here. Open loop estimators may be sought by solving the simplified interaction equation

(12) 
for the command signal vector c. To date, a commonly used technique is to ignore the noise component and invert the interaction matrix G by singular value decomposition (SVD) and filtering, whereupon wavefront reconstruction may be conducted by the simple matrixvector multiplication
.
The computational costs associated with this type of wavefront reconstruction are listed for an assortment of AO and MCAO configurations in Table 1. The SVD estimator is a leastsquares solution which minimizes the merit function
.
There are several reasons to look beyond the SVD estimator, however, and consider adding a few layers of sophistication:
 1.
 For very highorder systems it might be necessary to filter many modes which, even though poorly sensed, may constitute a nonnegligible fraction of the modal content of the turbulenceinduced aberrations. Indiscriminate filtering of singular modes will gradually reduce the modal efficiency of the wavefront reconstruction.
 2.
 The leastsquares estimator minimizing
is a purely geometric reconstructor that does not take into account potentially useful a priori knowledge about statistics of the noise in the WFS or turbulence in the atmosphere, which could help to condition the overdetermined system and improve performance and robustness.
 3.
 As explicit matrix inversion, a full SVD is an O(n^{3}) computation that takes no advantage of sparseness. For extremely highorder systems, a computational problem will arise in realtime control as well as in simulations.
One may solve the inverse problem posed by (12) by a Bayesian approach that delivers a leastsquares estimator weighted by noise statistics and regularized by open loop turbulence statistics. This estimator is derived in Appendix A, and the regularized inverse problem is given by the expression (A.10)

(13) 
where
is the WFS noise covariance matrix and
is the actuator command signal covariance matrix. By inverting the lefthand matrix one obtains the standard Wiener filter, sometimes referred to as the
maximum a posteriori (MAP) estimator (Fusco et al. 1999). Assuming Kolmogorov turbulence statistics, one may estimate C_{c}^{1} by projecting the turbulence onto a Zernike modal basis, for which the covariances were worked out by Noll (1976). Introducing the pupilplane phase bases H and Z of DM influence functions and Zernike modes, a turbulenceinduced phase aberratio
may be described in either base as
,
where a is the vector of Zernike coefficients. Forming the inverse composite mapping
it is clear that C_{c} in general can only be found up to leastsquares fit, but its inverse has the exact representation
C_{c}^{1}=H^{T}ZC_{a}^{1}Z^{T}H,

(14) 
which is accurate up to the numerical orthonormality of the Zernike modes in Z, and
.
It may be noted for future reference that the DM interaction matrix may be represented by the composite mapping G=AH.

Figure 2:
Upper left  Fried geometry of WFS subapertures and DM actuators; upper right  the natural numbering of the actuators. Lower left  heuristic model for a Fried geometry interaction; numbers indicate the x and ycomponent of the gradient registered in respective subaperture as a result of poking the central actuator. Lower right  numerical weights for the discrete definition C of the Laplacian curvature operator, for an internal, edge and corner point of the mesh. 
Crucial to the numerical simulation to be presented here is the use of sparse matrix methods for wavefront reconstruction, which in turn relies on the matrices involved actually being sparse. So far nothing has been said about the morphology of the interaction matrix G or the covariance matrices C_{n} and C_{c}, and what about them might potentially invite the use of sparse techniques. The particular form of these matrices depends strongly upon the type of wavefront sensor and deformable mirror being used, and their relative geometry as embodied in the interaction matrix. For the highorder AO applications we wish to investigate, it is reasonable to make the following assumptions:
 The influence functions
are localized, extending no further than the immediate nearest neighbor.
 A Fried or Hudgin geometry of actuators and subapertures in perfect alignment (no DMtoWFS misregistration).
 Identical influence functions,
,
with perfect x and y symmetry.
The first assertion comes from the difficulty of building bimorph or electrostatic membrane DMs of very high order. This makes it plausible that piezostack, electrostatic PMN (leadmagnesiumniobate) or MOEM (microoptoelectromechanical) type DMs will be employed for highorder systems, whose actuators have a well localized influence and produce thus a sparse WFS response. For manufacturing reasons, it is also reasonable to assume a square geometry of DM actuator and WFS subaperture patterns such as e.g. the Fried configuration drawn in the upper part of Fig. 2  hexagonal patterns are a possible alternative with some attractive features, but they would not affect the sparse wavefront reconstruction to be presented here dramatically, only complicate the numerical analysis considerably. In both the Fried and Hudgin configurations the interactuator and subaperture spacings are matched, with the actuators positioned either at the corners of the subapertures (Fried) or along the sides (Hudgin) bisecting the subaperture. As a final assumption, a ShackHartmann type wavefront sensor shall be adopted, which measures a twocomponent gradient in each subaperture. It is clear from these assumptions that the interaction matrix G will have no more than eight nonzero entries per column (i.e. per actuator), making it and its covariance G^{T}G very sparse. The drawing on the lower left of Fig. 2 illustrates the idealized heuristic interaction model that was adopted based on these assumptions: any given actuator on the DM will produce x and ygradients of equal amplitude but alternating sign in the subapertures that lie within the actuator's region of influence.
To quantify the sparseness of a matrix one may define the degree of sparseness
as the number of zeros divided by the total number of elements  the sparser a matrix, e.g. the closer to one
becomes, the greater the gain of sparse methods over explicit ones. Assuming the noise in different subapertures of the WFS to be uncorrelated, C_{n} will be diagonal with the individual subaperture variances
on the diagonal, making C_{n} a maximally sparse fullrank matrix. If in addition the subaperture noise variances are all the same
(as follows upon assuming identical detector pixels and ignoring the effects of partial illumination at the edge of the aperture), then
,
where I is the identity matrix. The regularization term
,
however, is a full matrix and some approximation must be employed to render it sparse. It was demonstrated by Ellerbroek (1986,2002) that approximating the Kolmogorov turbulence power spectrum by
,
where
is the radial component of the spatial frequency, yields a sufficiently sparse approximation to
of the form

(15) 
where C is the discrete matrix representation of the Laplace curvature operator
and
a scalar parameter. By Taylor expansion, a secondorder accurate finitedifference approximation to the twodimensional Laplacian is given by



(16) 
where the paired indices (k,l) refer to the discrete x and ycoordinates (x_{k},y_{l}) in the pupil plane. Hence, C will have no more than five nonzero elements per column, which makes it and C^{T}C very sparse matrices as soon as the order of the system reaches nontrivial proportions. It was also verified by Ellerbroek (2002) that simple boundary conditions such as mirroring or protruding from the center produce sufficiently good results that one need not look into more sophisticated extrapolations at the edge of the pupil (the drawing on the lower right of Fig. 2 illustrates the boundary conditions applied here for edge and corner points). This and the fact that the turbulence approximation
works admirably are not so surprising, since the role of the regularization term C_{c}^{1} was never to perfectly tell the system exactly how to reconstruct turbulence from WFS signals. Its main job is merely to support the inversion by smoothing out singularities and, in the presence of noise, to provide some additional advice on what to do, in a statistical sense, with noisy measurements.
Multiplying out the noise variance
,
the approximate wavefront reconstruction equation with all sparse matrices is thus finally obtained as

(17) 
where
,
and r_{0} is the Fried parameter governing the turbulence strength. The resultant system matrix
is shown in Fig. 3 for two systems with eight (n=69) and 32 (n=877) subapertures across the telescope aperture. For the n=877 actuator system A attains a sparseness of %, which already suggests the use of a sparse algorithm to economize on computations, even though conventional reconstruction would not be a computational problem at this level. A useful feature of implicit wavefront reconstruction schemes is that small adjustments may be made to the constituent matrices in order to optimize the reconstruction in realtime. Equation (17) provides a clear example: as was mentioned in the introduction, the influence of the regularization term is controlled by a single parameter ,
which may be adjusted on the fly to match the current atmospheric seeing and WFS noise conditions  in explicit wavefront reconstruction one would have to recompute the reconstructor E by SVD or matrix inversion each time an adjustment was made (with a Cholesky solver one would need to compute the factorization again).

Figure 3:
Modulus of G^{T}G+C^{T}C for n=69 and n=877. 

Figure 4:
Required computing time
as a function of the order
of the system for explicit (pluses) matrixvector multiplication and sparse (diamonds) PCG wavefront reconstruction algorithms. Slopes extrapolated from the last two data points are indicated next to each data set, where the two values for the PCG, 1.56 and 1.18, correspond to relative error tolerances of 10^{6} and 10^{2}. 
Conjugate gradient (CG) methods provide a quite general means to solve linear systems of the form (17) by iteration. Given an initial guess for the solution vector c, the CG algorithm generates a succession of orthogonal search directions that produce gradually improved estimates ,
and the recurrence may be terminated when sufficient accuracy is achieved. For the numerical simulation of this study, the preconditioned conjugate gradient (PCG) method described in Press et al. (1992) was employed. The PCG algorithm deviates from the CG only by both sides of (17) being premultiplied by the inverse of a "preconditioning matrix''
,
which was taken to be simply the diagonal part of the system matrix A. The PCG algorithm turns out to be particularly suitable for sparse systems, as the matrix A is only referred to by its multiplication with a vector, which can be very efficiently implemented for a properly stored sparse matrix. The sparse storage scheme adopted for this study, the rowwise format described in e.g. Pissanetzky (1984), is in fact constructed such that the efficiency of the matrixvector multiplication is optimized with respect to sparseness. A representative benchmarking of the sparse PCG applied to the current task of AO wavefront reconstruction is shown in Fig. 4. Error tolerances of 10^{6} and 10^{2} were benchmarked, where the latter leads to the PCG computational cost scaling almost linearly (exponent 1.18) with n. As is shown in Sect. 4, other error sources present in the adaptive optics system relax the requirements on precision in the reconstruction algorithm, such that a remarkably large intrinsic error
(e.g. 10^{2}) may be tolerated without significant loss of AO performance.
The numerical Monte Carlo simulation is of the type described in, e.g., Flicker et al. (2000), as originally devised by Rigaut (1999) for MCAO feasibility studies at the Gemini Observatory. The designation "Monte Carlo'' is here used in its loosest sense, that the statistically averaged result is obtained after processing a large number of random inputs. Its main components are sketched in Fig. 5, which reflects the system architecture as depicted in Fig. 1. The main differences in the current study from previous applications of the algorithm lies within the propagation module, the WFS module and the wavefront reconstruction algorithm that was described in the previous Section. The atmosphere is modeled by a finite number of infinitesimally thin layers of turbulence producing pure phase aberrations ,
that are stacked vertically at varying altitudes above the telescope. The phase aberrations obey von Kármán statistics
,
with an outer scale
ranging between 25100 m in the various simulations presented in Sect. 4. The propagation operator
would in the general case be a Fresnel propagator that propagates a plane wave over the atmospheric aberrations
and the adaptive phase corrections
into a residual complex aberrated wave
in the telescope pupil plane. Under assumptions of weak and not extremely highaltitude turbulence, however, one may forego a wave optical description and model propagation geometrically as the linear addition of phase perturbations. This neglects the effects of scintillation, which are small at nearinfrared wavelengths, 12.2 m, but become important at visible wavelengths and when high a degree of correction is the goal (Flicker 2001; Sasiela 1994).

Figure 5:
Flow diagram of the hybrid numerical simulation. 
The WFS module is of a fundamentally different function in this simulation than previous ones. Rather than building a wave optical model for a subaperture of the ShackHartmann sensor, the WFS signal s is here obtained from the residual phase
directly by a sparse matrixvector multiplication,
.
This is made possible by undersampling the turbulenceinduced phase aberrations in the pupil plane  undersampling in this context meaning neglecting spatial frequencies above the cutoff frequency of the DM, which is set by the interactuator distance d. This allows the representation in the telescope pupil plane of a DM actuator and a WFS subaperture by one single phaseelement (pixel). Hence the mapping
from DM commands to wavefronts becomes the identity, whereupon G=AH=A, according to the discussion directly following Eq. (14). Equating phase and actuator commands, it follows from the interaction Eq. (12) that both are mapped into WFS signals by the interaction matrix G. Hence the WFS model
,
where the subscript zero denotes the pupilplane interaction, in order to distinguish it from conjugateplane interactions that enter into G when looking to MCAO configurations. Undersampling this coarsely is a rather crude approximation that will not produce very accurate results unless compensated for. Two obvious errors thus unaccounted for are the fitting of DM influence functions to phase aberrations and the spatial aliasing of high spatial frequencies in the WFS. Presently, only simple analytical models are applied to account for the fitting error phase variance
in the limit of long exposure time, and the contribution to the WFS measurement signal from spatial aliasing (see next section). The remainder of the simulation sequence sketched in Fig. 5 is straightforward. Noise and aliasing vectors n and r are added to the obtained WFS signal

(18) 
and the sum is premultiplied by G^{T} to prepare the vector b that is input into the sparse PCG wavefront reconstruction algorithm. The PCG delivers the required actuator adjustment c, which is added to the current shape of the DM by a standard integrating controller over a scalar gain g. The operator
shifts the turbulence screens according to wind velocity vectors before evaluating performance in terms of the fielddependent pupilplane residual phase variance
.
It merits comment that, in this hybrid model, aliasing enters the interaction on exactly the same basis as noise, although with different spatial statistics.
It remains to be investigated how firstorder system errors such as servolag and anisoplanatism are affected by the truncation of high spatial frequencies, but it is verified in Sect. 4 that these effects are minor. The DM fitting error will here be approximated as the total phase variance deriving from spatial frequencies higher than the cutoff frequency
of the DM,
.
Integrating the Kolmogorov power spectrum given by Noll (1976) outside the controlled frequency domain gives

(19) 
where
is the radial component of the spatial frequency vector
and the region
is the complement of the centered square with side
.
Evaluating the integral in cylindrical coordinates
and observing the radial symmetry gives
where the azimuthal integral was numerically integrated to 0.663, and the cosine derives from the boundary
of
in the interval
described in polar coordinates as
.
This 5/3' s power law
may be applied with various coefficients a that better approximate the real fitting of different types of influence functions. Coefficients for the most common types are listed in e.g. Hardy (1998), but in this paper the value 0.414 as derived in (22) will be used throughout. This result applies less well in the case of multiconjugation, but it shall be used nevertheless as a rough approximation  the method in Rigaut et al. (2000) could plausibly lead to a better estimation in this case.
The WFS spatial aliasing error is produced by some of the high spatial frequency content of the atmospheric turbulence being folded into the interval
by the discrete sampling of the wavefront. Since those high frequencies that produce aliasing remain largely unattenuated by the AO system, open loop statistics may be employed to represent aliasing to first order. Approximating the ShackHartmann WFS by an infinite array of gradient subapertures of linear dimension d, the aliasing measurement signal
as a function of the pupil plane coordinate x may be expressed as

(23) 
where asterisk (*) denotes convolution,
is the square windowing function defining one subaperture,
is the highfrequency component of the phase and the replicating function (the shah) was defined as

(24) 
As the shah is its own Fourier transform (Bracewell 1994), the spatial Fourier transform of the measurement becomes
where the aliased frequency vector k was defined as

(27) 
and
.
The phase spectrum may be written
where
is the phase shift of the complex spectrum
.
Whereas the modulus
inherits the Gaussian statistics of ,
is uniformly distributed and independent of
.
Hence the expectation value of
may be split into the separate expectations
,
where the statistics of
lead to the condition

(28) 
Substituting the von Kármán spectrum from (Noll 1976; Winker 1991) for
,
the power spectral density (PSD)
of
evaluates to



(29) 
where
.
Rather than applying an analytical reconstructor as in (Rigaut et al. 1998) for computing the aliasing phase variance, the PSD (29) will here be used to generate the aliasing vector r indicated in (18) by Fourier transform of
,
where
is a complexvalued white noise. By doing so, the WFS aliasing is propagated through the PCG wavefront reconstructor to produce a resultant effect on the DM commands that observe the proper propagation properties.

Figure 6:
Residual phase variance
and number of PCG iterations versus the PCG relative error tolerance, for the
ShackHartmann test configuration in conventional AO (dotted lines) and MCAO (dashed lines) mode. In the top field, a fieldweighted average (pluses), an unweighted average (diamonds) and the onaxis value (asterisks) are plotted for comparison. 
Table 2:
Discrete 4layer model for atmospheric turbulence  h is the layer altitude,
is the fractional turbulence strength per layer, and v is the wind speed.
Layer 
h (m) 

v (m s^{1}) 
1...... 
0 
0.50 
15 
2...... 
3000 
0.25 
25 
3...... 
6000 
0.15 
35 
4...... 
12 000 
0.10 
25 
4 Model implementation
Simulation results for a number of different AO and MCAO configurations are presented at the end of this section. Of particular interest are the results concerning extremely highorder systems, and the investigations into scaling laws for AO performance with telescope diameter D and with turbulence outer scale L_{0}. The simulations all employ a generic fourlayer atmospheric model, with parameters as summarized in Table 2. The altitude h refers to the level above the telescope site, and the fractional turbulence strength f_{l} per layer l is defined as
,
where the c_{l} are the coefficients of the C_{n}^{2} decomposition
.
In terms of the
turbulence moment
,
these coefficients are given by
.
The wind speeds v_{l} were given random directional vectors from a uniform distribution. It should be noted that, although this toy atmospheric model does not derive from real measurements, it was contrived to be conservative with respect to reported atmospheric conditions at most good astronomical observing sites, without for that reason being overly pessimistic. Where nothing else is stated, other observation and simulation parameters assumed the values listed in Table 3. Only a 2DM case was considered for the MCAO simulations, and the guide star magnitudes were all set to 10 in order to suppress the effects of photon noise (CCD readout noise was not included in the simulation)  the main effects investigated for this study were the errors associated with LGS (cone effect), MCAO (tomographic error) and the outer scale of turbulence L_{0}.
In order to optimize the efficiency of the PCG wavefront reconstruction algorithm, its required accuracy in an AO simulation was metered versus its relative error tolerance. Since the error tolerance determines when to terminate the recursion, it directly governs the computational cost for wavefront reconstruction. The somewhat surprising results are shown in Fig. 6, where the upper part plots the residual pupil plane phase variance
versus PCG error tolerance, and the lower part plots the average number of required PCG iterations. Both a conventional AO (dotted lines) and a MCAO (dashed lines) system of the same order (the
ShackHartmann used for the comparison in Fig. 7) were tested for comparison. It is seen that over a range of 10^{7} increase in the error tolerance, over which the PCG iteration count drops by roughly a factor of 100, there is very little loss of AO performance  superimposed upon a very mild and gradual increase in
there is an irregular jitter of comparable or even larger magnitude. With turbulence variations sufficiently averaged
(100 independent atmospheric realizations with 2000 Monte Carlo cycles averaged within each), this jitter can only be numerical, pertaining to the inner machinations of the PCG. The investigation of AO performance versus PCG error tolerance was discontinued at a value of 0.2, as at this level the computational cost of the PCG reconstruction in the case of the conventional AO system was already down to a single PCG iteration upon the initial guess. The simulation results presented in this section were produced at a PCG relative error tolerance of 0.01, where the scaling with n for large systems is approximately
O(n^{1.2}) as shown in Fig. 4.

Figure 7:
Comparison between simulation results (phase variance  solid; Strehl ratio  dotted) for the hybrid simulation of the current study (diamonds) and a more accurate wave optical simulation with wellsampled turbulence (pluses). Dashed curves show for comparison the hybrid simulation result without fitting and aliasing compensation. 
The efficient simulation model was corroborated for realism and accuracy versus a fullfledged wave optical Monte Carlo simulation of similar type, where all the firstorder errors (fitting, aliasing, servolag, anisoplanatism and noise) are modeled with high accuracy. For the comparison, a small test configuration (
ShackHartmann WFS, n=137) on an 8m telescope was simulated, and the comparison is shown in Fig. 7. Looking to the curves of residual phase variance
(solid lines) versus angle from the guide star, it is observed that the efficient simulation is afflicted with a bias of approximately 0.04 rad^{2}. The majority of this offset is due to the analytical fitting and aliasing compensation, which in their approximation do not exactly reproduce the conditions in the more accurate simulation. There is an additional variation across the field, and at the edge of the field the discrepancy is close to 0.1 rad^{2}, but looking to the Strehl curves (dotted lines) this discrepancy translates to less than 5% relative deviation over the range of values plotted (35 arcsec), with a field average of 3%. These numbers are quite tolerable for a firstorder performance estimation model, and it may be concluded that undersampling the frequency content of the atmospheric turbulence to first order has no or only a negligible impact on the errors associated with anisoplanatism, servolag and noise. While small in the present context, these discrepancies may not be completely insignificant for a tight error budget as far as the design of a real system is concerned. For that purpose one would need to look into secondorder corrections to improve upon the accuracy of the model, if the current hybrid algorithm is to be used. Presently, no higherorder correction schemes have been investigated. Keeping in mind that fitting and aliasing are given by analytical approximations and that scintillation remains unaccounted for, simulation results for a number of relevant AO and MCAO configurations are presented in the following.
4.3 Laser beacons and multiconjugation
Concerning laser guide stars and multiconjugated systems, a few special considerations must be taken in order to ensure that these are efficiently implemented without losing accuracy in the simulation.
 1.
 For the case of laser guide star (LGS) adaptive optics, the propagator
must include a scaling procedure in order to account for the cone effect that arises when using a reference beacon at a finite altitude. The horizontal scaling of the atmospheric turbulence is accomplished by rebinning the phase screens (which are periodic) onto a finer mesh and resample at new coordinates by cubic spline interpolation. Tip and tilt modes were filtered from the measurement signal s rather than being projected out of the controlled subspace of the interaction matrix G, which would render it nonsparse, and a NGS employed for tiptilt sensing.
 2.
 For the case of multiconjugated adaptive optics (MCAO), the interpolation described above would be subject to aliasing effects at the edges of the DMs if applied thereto, and should be avoided if possible. Accuracy in the wavefront sensing is thus retained by having the guide star positions on the sky observe integer pixel shifts on the conjugated DMs. The integrated DM correction in the direction of any guide star may then be obtained by adding the DM shapes conformally without interpolation.
 3.
 For the case of LGS MCAO, the pixel scales at the conjugated DMs were reduced by the geometric factor 1h_{k}/H, for a LGS at the range H and a DM at the range h_{k}. By having the LGS positions on the sky correspond to integer pixel shifts on the conjugated DMs with this reduced scale, the DM phase may be summed as in the NGS MCAO case and only the turbulence interpolated. Differential tilt is in this case obtained from three tiptilt NGS, and compensated using a separate control loop (Ellerbroek & Rigaut 2001).

Figure 8:
Onaxis residual phase variance
versus telescope diameter D for LGS AO, for a range of outer scales L_{0}. Low and highorder AO systems with d=0.8 m (solid) and d=0.5 m (dashed) are both shown for comparison. 
As the diameter of the telescope aperture D approaches the turbulence outer scale L_{0}, loworder modes responsible for a large fraction of the turbulent energy such as tip and tilt, defocus, astigmatism etc. become deweighted relative to higher order modes within the aperture. Although L_{0} is a poorly known quantity at most sites and times, its value seems to be of critical importance to the design and performance of ELTs. Predictions based in Kolmogorov theory do no longer apply, but it has not been investigated extensively how the characters of angular and focal anisoplanatism scale to ELTs, especially when it comes to tomography and MCAO. Figures 8 and 9 plot the residual phase variance
for two conventional LGS AO systems with d=0.8 m and d=0.5 m as the telescope diameter was varied through
D=[10,30,50,100] m. Since performance is measured only on the axis of the LGS, angular anisoplanatism is not present, and for the relatively benign observation scenario chosen (nearinfrared, bright guide stars  see Table 2) the noise error is negligible and the servolag error small. The dominant error sources in Figs. 8 and 9 are therefore DM fitting and LGS focal anisoplanatism, and for small D (10 m) the cone effect is mild enough compared to the fitting error that no variation with L_{0} is observed. According to Kolmogorov theory, the fitting error does not vary with telescope diameter when aperture filtering is not taken into account (and only weakly when it is), and the induced phase variance due to the cone effect should scale as D^{5/3} (Fried & Belsher 1994). The increase in
in Fig. 8 therefore chiefly reflects the cone effect, but in part also the much smaller variations of the servolag, aliasing and reconstruction errors that arise with the departure from Kolmogorov statistics when the telescope diameter approaches the outer scale. As seen from the loglog presentation, however, the residual phase variance for the various cases still scales very much like a power law with a varying exponent. By assuming a power law relation of the form
and fitting straight lines to the logarithmized data in Fig. 9, one may obtain the exponent .
The exponents for the current data are listed in Table 4, and the deviations from the 5/3's law predicted by Kolmogorov theory are seen to be significant.

Figure 9:
Same data as in Fig. 8 replotted on logarithmic axes to show the differing slopes. 
This is partially due to the finite outer scale, but as may be inferred from the two columns of the table, also dependent upon the interactuator density d. Not simply does the AO residual phase variance depend upon the outer scale, but more precisely it depends upon the resultant spatial statistics as the turbulence is projected onto the influence functions of the DM (or equivalently, the controlled modes of the wavefront estimator). The smaller the interactuator distance d, the higher the spatial cutoff frequency of the deformable mirror and the closer to the real turbulence spectrum the system operates. That
decreases when d increases can be understood in terms of the AO system addressing less of the high spatial frequency content of the atmosphere as the order of the system goes down. With a finite outer scale, the deviations from Kolmogorov statistics then become increasingly more prominent among the low spatial frequencies that are compensated by the system. It may be inferred by numerical evaluation of turbulence integrals (Sasiela 1994)
Table 5:
Strehl ratio simulation results for 2DM MCAO system with 5 NGSs or 5 LGSs and three different turbulence outer scales L_{0}. For the case d=0.5 m (lefthand side of table), the square field of view defined by the guide stars measured 40 arcsec in the NGS case and
36 arcsec in the LGS case. For the d=0.8 m case, the field was 67 and 60 arcsec for the NGS and LGS simulations.

that include a finite outer scale, that the contribution to
due to focal
anisoplanatism will be reduced at large D and small L_{0} as compared to Kolmogorov theory. Where this effect becomes significant, the scaling with D will start to deviate from a simple power law. There is an indication to this effect by the presence of an inflection point in some of the curves. The current sparse data does not allow a detailed analysis of this phenomenon, however, and a discussion of thereof is in any case beyond the scope of the present study^{}.
A series of MCAO cases with two DMs, conjugated to zero and 10 km, and either five NGSs or five LGSs in a Xlike geometry with a base of 40'' were simulated subject to the same variations in L_{0} and d as the single LGS simulation. The tomography algorithm employed was the linear MAP estimator described in Sect. 2.2, which is derived in Appendix A. The results are given in Table 5, in the form of the fieldaveraged Strehl ratio
(left column) and the relative deviation over the field
(right column).
The Strehl ratios were computed from the phase variance in the extended Maréchal approximation,
,
which means that entries below 0.2 are somewhat underestimated. For MCAO, no analytical predictions for the tomographic error in Kolmogorov theory have been produced, leaving little by which to compare these results. It may be conjectured, however, that the instance of tomography should help to mitigate the cone effect even further upon the aid lent by the finite outer scale. Comparing with the values in Fig. 8, it is seen that the sensitivity to L_{0} is stronger in the MCAO case, plausibly due to the larger field over which MCAO attempts to compensate turbulence. The gain toward low L_{0} is however also larger for LGS MCAO than LGS AO, indicating a very nonlinear coupling between L_{0}, D and the instance of tomography and multiconjugation  which is why results for ELTs are very difficult to infer with any confidence from simulations at smaller scales. It is seen that, contrary to some claims in the literature based on very conservative estimations or extrapolations, a small number of LGSs may actually provide a useful level of compensation on telescopes as large as 3050 m, if not close to diffractionlimited. For higher LGS performance the LGS density must be increased in order to reduce the tomographic error due to focal anisoplanatism, which means a larger number of LGSs if a given field of view is to be maintained. Tilt anisoplanatism in LGSbased MCAO systems will reduce performance somewhat with respect to NGS MCAO, as discussed in Ellerbroek & Rigaut (2001).
Regarding concerns occasionally voiced that LGS tomography, tilt anisoplanatism aside, would be implicitly less accurate than NGS tomography, no such effect has been ever observed by the author of this paper. It has been verified in a designed test simulation that, if the turbulence was confined to the union volume spanned by the cones (i.e. no unsampled turbulence), LGS and NGS tomography become exactly equivalent. The poorer performance of LGS tomography in a real scenario is solely due to the loss of information that a focused sensor beam must contend with, and minimizing this information loss by increasing the LGS density will gradually improve LGS MCAO performance. It remains to be proven however, that an ELT in the range 30100 m will be able to deliver neardiffractionlimited performance under wind buffeting and mechanical flexure, which may be a larger obstacle to overcome than adaptive optics performance.
Firstorder performance estimation for adaptive optics systems may be accomplished efficiently with a hybrid numerical simulation that combines the accuracy and realism provided by a MonteCarlotype simulation with analytical compensation tools where the former was compromised in order to increase efficiency. Systems with 10^{5} degrees of freedom, including multiconjugate and laserguidestarbased systems, may be efficiently investigated thus using a current standard personal computer. Results were presented for a number of AO and MCAO configurations, demonstrating that the turbulence outer scale L_{0} has a significant impact on performance when the telescope diameter D approaches L_{0}. LGS MCAO systems are seen to be more sensitive to L_{0} than conventional LGS AO systems, and the gain also larger in MCAO as L_{0} becomes smaller. To the best of the author's knowledge, the results for the n=65 282 case represent the largest MCAO system simulated in full to date, although the simulation approaches presented in Ellerbroek (2002) and Gilles et al. (2002) are of comparable capabilities and efficiencies. Future extensions of the simulation will proceed to refine the fitting error estimation and include scintillation effects.
Acknowledgements
The author would like to thank H.G. Ludwig for many valuable comments on the manuscript and generous support with additional computing resources.
The open loop interaction equation may be written

(A.1) 
from which we wish to estimate the optimal command vector
that maximizes the
a posteriori conditional probability
:

(A.2) 
A standard procedure to find
that shall be used here is to solve
for c. From Bayes' theorem of conditional probabilities we have that

(A.3) 
where
since this is the measured data and
is the socalled Bayesian prior, which contains the a priori information on the statistics of
that will regularize the solution. Assuming the wavefront sensor subaperture noise to be of Gaussian statistics with zero mean and standard deviations
,
it is seen by
rearranging (A.1) that
shares the statistics of n. Assuming further the noise to be uncorrelated between subapertures,
,
the likelihood term
evaluates to

(A.4) 

(A.5) 
where
is the noise covariance matrix. For mathematical stringency, the term
specifies the scale of discretization, as the probability would always be zero if the s_{i}'s took on continuous values. Invoking the central limit theorem to bear on refractive index fluctuations in the atmosphere, the open loop statistics of
may likewise be assumed to be Gaussian. In analogy with (A.5) we may thus write down the prior term directly as

(A.6) 
where
is the actuator command signal covariance matrix. This gives the a posteriori probability



(A.7) 
Maximizing the probability is equivalent to minimizing the negative of its logarithm, so a local minimum is given where for all vectors u

(A.8) 
It is seen that
is quadratic in c, so taking the derivative will result in a linear equation that should be easy to solve. Evaluating the limit gives

(A.9) 
and a nontrivial solution for all u requires that

(A.10) 
from which one may write
and identify the MAP reconstructor as
E_{*}=(G^{T}C_{n}^{1}G+C_{c}^{1})^{1}G^{T}C_{n}^{1}.

Bracewell, R. N. 1994, The Fourier Transform and its Applications, 2nd ed.
(WCB/McGrawHill Inc.)

Cochran, G. M. 1986, Sparse matrix techniques in wavefront reconstruction,
Tech. Rep. No. TR668 (Anaheim, CA92825, USA: The Optical Sciences Company)

Ellerbroek, B. L. 1986, Comparison of least squares and minimum variance
wavefront reconstruction for atmospheric turbulence compensation in the
presence of noise, Tech. Rep. No. TR721R
(Anaheim CA92825 USA: The Optical Sciences Company)

Ellerbroek, B. L. 1994, J. Opt. Soc. Am. A, 11, 783
NASA ADS

Ellerbroek, B. L. 2002, J. Opt. Soc. Am. A, 19, 1803
NASA ADS

Ellerbroek, B. L., & Rigaut, F. 2001, J. Opt. Soc. Am. A, 18, 2539
NASA ADS

Flicker, R., Rigaut, F., & Ellerbroek, B. 2000, in 4007,
Adaptive Optical Systems Technology, Proc. SPIE, 1032

Flicker, R. C. 2001, Opt. Lett., 26, 1743
NASA ADS

Fried, D. L., & Belsher, J. F. 1994, J. Opt. Soc. Am. A, 11, 277
NASA ADS

Fusco, T., Conan, J., Rousset, G., Mugnier, L. M., & Michau, V.
2001, J. Opt. Soc. Am., 18, 2527
NASA ADS

Fusco, T., Conan, J.M., Michau, V., Mugnier, L. M., & Rousset, G.
1999, Opt. Lett., 24, 1472
NASA ADS

Gilles, L., Vogel, C. R., & Ellerbroek, B. L. 2002, J. Opt. Soc. Am. A, 19,
1817
NASA ADS

Gilmozzi, R. 2000, in Discoveries and Research
Prospects from 8 to 10MeterClass Telescopes, ed. J. Bergeron, Proc. SPIE, 4005, 2

Hardy, J. W. 1998, Adaptive optics in astronomy (Oxford University Press), 176

Hawarden, T. G., Dravins, D., Gilmore, G. F., et al. 2002, in
Future Giant Telescopes, ed. J. R. P. Angel, & R. Gilmozzi, Proc. SPIE, 4840, in
press

Le Louarn, M. 2002, MNRAS, 334, 865
NASA ADS

Noll, R. J. 1976, J. Opt. Soc. Am., 66, 207
NASA ADS

OwnerPetersen, M., & Goncharov, A. 2002, J. Opt. Soc. Am. A, 19, 537
NASA ADS

Pissanetzky, S. 1984, Sparse matrix technology (London: Academic Press Inc.)

Press, W. H., Teukolsky, S. A., Vetterling, W. T., & Flannery, B. P. 1992,
Numerical recipes in C: the art of scientific computing, 2nd ed.
(Cambridge University Press)

Ragazzoni, R., Farinato, J., & Marchetti, E. 2000, in Adaptive Optical Systems Technology, ed. P. Wizinowich, Proc. SPIE, 4007, 1076

Rigaut, F. 1999, private communication, Gemini Observatory, HiloHI, USA

Rigaut, F., Salmon, D., Arsenault, R., et al. 1998, Publ. Astr. Soc. Pac.,
110, 152
NASA ADS

Rigaut, F. J., Ellerbroek, B. L., & Flicker, R. 2000, in
Adaptive Optical Systems Technology, ed. P. L. Wizinowich,
Proc. SPIE, 4007, 1022

Rigaut, F. J., Veran, J., & Lai, O. 1998, in
Adaptive Optical System Technologies, ed. D. Bonaccini, & R. K. Tyson,
Proc. SPIE, 3353, 1038

Sasiela, R. J. 1994, Electromagnetic Wave propagation in Turbulence:
Evaluation and Application of Mellin Transforms (SpringerVerlag)

Snel, R. C., & Ardeberg, A. L. 2002, in Future
Research Direction and Visions for Astronomy, ed. A. M. Dressler, Proc. SPIE, 4835, 60

Tokovinin, A., Le Louarn, M., Viard, E., Hubin, N., & Conan, R.
2001, A&A, 378, 710
NASA ADS

Winker, D. M. 1991, J. Opt. Soc. Am. A, 8, 1568
NASA ADS
Copyright ESO 2003