Improved GRAVITY astrometric accuracy from modeling of optical aberrations

The GRAVITY instrument on the ESO VLTI pioneers the field of high-precision near-infrared interferometry by providing astrometry at the $10 - 100\,\mu$as level. Measurements at such high precision crucially depend on the control of systematic effects. Here, we investigate how aberrations introduced by small optical imperfections along the path from the telescope to the detector affect the astrometry. We develop an analytical model that describes the impact of such aberrations on the measurement of complex visibilities. Our formalism accounts for pupil-plane and focal-plane aberrations, as well as for the interplay between static and turbulent aberrations, and successfully reproduces calibration measurements of a binary star. The Galactic Center observations with GRAVITY in 2017 and 2018, when both Sgr A* and the star S2 were targeted in a single fiber pointing, are affected by these aberrations at a level of less than 0.5 mas. Removal of these effects brings the measurement in harmony with the dual beam observations of 2019 and 2020, which are not affected by these aberrations. This also resolves the small systematic discrepancies between the derived distance $R_0$ to the Galactic Center reported previously.


Introduction
The distance to the Galactic Center (GC), R 0 , can be measured directly from stellar orbits around Sgr A*, the radio source associated with the GC massive black hole (MBH) (see e.g. Genzel et al. 2010 and Bland-Hawthorn & Gerhard 2016 for a recent overview of alternative methods). To this end, the star's proper motion, given in angle per unit time, is compared to its radial velocity, obtained in absolute length per units time from spectroscopic observations. The GC distance then follows directly as a scaling parameter between the two measurements. Most suited to measure R 0 is S2, a massive young main sequence B-star on a 16-year orbit with semi-major axis a » 125 mas and apparent K-band magnitude m k » 14 (Ghez et al. 2003;Eisenhauer et al. 2005;Martins et al. 2008;Gillessen et al. 2009aGillessen et al. , 2017Habibi et al. 2017). During its pericenter passage in 2018, S2 was closely monitored in astrometry and spectroscopy (Gravity Collaboration et al. 2018;Do et al. 2019). In particular, the GRAV-ITY instrument (Gravity Collaboration et al. 2017) directly measured the distance between S2 and Sgr A* during the fly-by at high angular resolution of around 30 µas. The combination of ultra-high astrometric precision from near-infrared interferometry and the spectroscopic precision of À 10 km/s allowed to determine the GC distance at the unprecedented precision of ă 1% (Gravity Collaboration et al. 2019).
Operating in the K-band, GRAVITY combines the light from either the four Unit Telescopes (UTs) or Auxiliary Telescopes (AT) of the ESO Very Large Telescope Interferometer (VLTI). Fringe tracking on a bright reference object allows for minutelong integration times on the fainter science target and for the measurement of differential complex visibilities. The instrument's extremely high angular resolution of » 3 mas results in very accurate astrometry with error bars between 10 µas and 100 µas (Gravity Collaboration et al. 2017). However, the latest R 0 measurement in Gravity Collaboration et al. (2020) indicates a possible systematic difference with earlier determinations (Gravity Collaboration et al. 2018. While the shift is small, of O p1%q only, it is nevertheless significant due to the high precision of the measurement. The difference in the measured GC distance coincides with a change in the observing mode. GRAVITY observes the Galactic Center with two different methods, depending on the separation between Sgr A* and S2. Close to pericenter passage, i.e. in 2017 and 2018, the sources are detected simultaneously in a single Article number, page 1 of 17 arXiv:2101.12098v1 [astro-ph.GA] 28 Jan 2021 A&A proofs: manuscript no. gravity-phasemaps fiber pointing in the so-called single-beam mode. In later epochs, their separation exceeds the fiber's field of view (FOV), and S2 and Sgr A* are targeted individually. This is referred to as dualbeam mode.
In single-beam mode, it is not possible to align the two sources with the fiber center. Hence, to further improve the GRAVITY astrometry, we conducted an analysis of how optical aberrations affect the visibility measurement across the full field of view. A similar concept of field-dependent errors already exist in radio interferometry, where it is known as direction dependent effects (DDEs) (see e.g. Bhatnagar et al. (2008); Smirnov (2011); Smirnov & Tasse (2015); Tasse et al. (2018) and references there in). The DDEs can arise either at the instrument level from the antenna beam pattern or at the atmospheric level such as from the ionosphere. In particular for the latest generation of interferometers (e.g. VLA, Meerkat, LOFAR) with a wide FOV and a large fractional bandwidth DDEs cannot be neglected. However, to our knowledge there is no equivalent discussion in the context of optical/near-IR interferometry.
Indeed, our analysis shows that small optical imperfections in the beam combiner induce field-dependent phase errors that reflect in the inferred binary separation. We developed an analytical model to describe this effect, and verified it by application to a dedicated test-case observation. Applied to the GC observations, the model induces a shift in the S2 relative position of order 0.1´0.2 mas in 2018 and " 0.5 mas in 2017 in both right ascension (R.A.) and declination (Dec.). Despite being small, the change is non-negligible at the high astrometric accuracy achieved by GRAVITY. We can show that the corrected 2017 and 2018 data is in harmony with the dual-beam observations of 2019 and 2020. Further, when retroactively applying the correction to the data sets used in Gravity Collaboration et al. (2018Collaboration et al. ( , 2019, the ensuing GC distance is fully consistent with the latest result (Gravity Collaboration et al. 2020).
We introduce the analytical model in Sec. 2 and compare it to calibration measurements in Sec. 3. Verification from the binary test-case and the improved S2 position are presented in Sec. 4, while we discuss the implications for the GC distance in Sec. 5. Finally, we conclude in Sec. 6.

Formal description of static aberrations and their impact on visibility measurements
Static aberrations along the instrument's optical path affect the measured visibilities by introducing a complex, field-dependent factor for each telescope. We express this gain in its polar representation and decompose it into a phase map φ i pαq and an amplitude map A i pαq. Here, the index i labels the telescope and α denotes positions in the image plane. Phase and amplitude maps lead to a modification of the observed complex visibilities V obs from the well-known van Cittert-Zernike theorem (c.f. Eq. 23). As we demonstrate in the following, they are given by where b i, j is the baseline vector between the two telescopes and O pαq denotes the intensity distribution of the observed astronomical object.
In this section, we show how the phase-and amplitudemaps follow from optical aberrations. To this end, we start from the overlap integral, which determines the electromagnetic field from a single telescope arriving at the beam combiner. Subsequently, we propagate the effect of static aberrations from the overlap integral to the measured complex visibility to arrive at a rigorous derivation of Eq. (1). Finally, we account for the superposition of static and turbulent aberrations, to obtain a formalism which is applicable in realistic observation scenarios.

Static, field-dependent aberrations at fiber injection
Single mode fibers transport the light collected by each telescope E tel to the beam combiner instrument. The overlap integral between light and the fiber mode E fib then determines the transmitted electric field (Neumann 1988), Here, we assume a normalized fiber mode ş dξ |E fib pξq| 2 " 1 and express image-plane positions by two-dimensional vectors, ξ and β. Following the description of Perrin & Woillez (2019), the overlap integral is converted to the pupil plane by the Parseval-Plancharel theorem, where F´1 denotes the inverse Fourier transform, i.e. transformation from the image to the pupil plane, and E obj the light emitted by the astronomical object. The latter is connected to F´1 pE tel q by multiplication with the pupil function P puq, corresponding to a convolution in the image plane. In the most simple case of a single point source located at α 0 , the light is described by a pure phase F´1rE ps obj s " exp p´2πi u¨α 0 q. The pupil-and image-plane coordinates, ξ and u respectively, are Fourier-conjugate to each other and chosen to be dimensionless. That is, any length scale in the pupil plane is given by λu where λ refers to the wavelength and u " |u|. For discussion, we convert the dimensionless image plane coordinates ξ to the corresponding angular separation in UT observations. In an aberration-free scenario, the pupil function of a spherical telescope with diameter 2r tel and central obscuration 2r cent simply is Optical aberrations multiply the pupil function by a positiondependent, complex phase, and we here consider the case of purely static aberrations. These are characterized by an optical path difference (OPD) d pup puq in the pupil plane that can be expanded in terms of Zernike polynomials Z m n , We adopt the convention that Z m n is dimensionless and the coefficient A m n corresponds to the term's root mean square over the unit circle. Defining the turbulence-free complex fiber mode apodised by the pupil function as the overlap integral reads The overlap integral obviously depends on the fiber profile which, for a perfectly aligned ideal single-mode fiber, is GRAVITY was designed for optimal fiber injection (Pfuhl et al. 2014), which is obtained for σ fib " 2r tel ? 2 ln 2{ pπ q (Wallner et al. 2002). Here, the parameter is of order unity and describes the pupil shape. From comparison between model predictions and the calibration measurements in Sec. 3.2, we find that pupil-plane distortions alone are not sufficient to describe the observed aberration pattern. We also need to account for optical errors in the focal plane. Misalignment of the optical fiber, as well as higher order aberrations at fiber injection, introduce a complex phase to Eq. (8) and can distort the amplitude of the fiber profile.
To illustrate the effect of focal plane aberrations, we first consider the three types of misalignment depicted in Fig. 1: (A) Lateral misplacement of the fiber by pδx, δyq, which in the pupil plane produces a phase slope ξ fib " pδx{ f, δy{ f q, with f being the focal length. (B) Fiber tilt by an angle ϕ fib " pϕ 1 , ϕ 2 q with respect to the optical axis of the system which shifts the backpropagated fiber mode by u fib " ϕ¨f {λ. And (C), a defocus or axial fiber misplacement by δz that introduces an additional phase curvature exp " π iδzλ{ f 2 u 2 ‰ . Taking all three effects into account, the generalized fiber profile, projected to the pupil, is (Wallner et al. 2002) By rearranging the phase term in the pupil plane, one can decompose it into a piston, tip-tilt and defocus d piston fib puq "´λˆδ zλ f 2 u fib`ξfib˙¨ufib´δ The phase terms in Eqs. (10) to (12) thus affects the overlap integral in the same way as the lowest-order aberrations in d pup puq.
For the coordinate shift of the Gaussian profile, on the other hand, there is no such correspondence, and it alters the way in which the optical fiber scans the pupil-plane aberrations. During GRAVITY observations, the misplacement term (A) depends on the performance of the fiber tracker but also on the uncertainty of the source position. In particular for exoplanet observations, the latter can be sizable. Fiber tilt (B) is controlled by the GRAVITY pupil tracker, and the adaptive optics calibration is one example that impacts the defocus (C).
While lateral misplacement (A) and defocus (C) describe the misplacement of a point-like fiber entrance, fiber tilt (B) accounts for the alignment of the fiber's surface. This surface can exhibit irregularities beyond a simple tilt, which lead to a position-dependent OPD in the focal plane, d foc pxq, as illustrated in Fig. 1. Generally, aberrations from optical elements not conjugated to the pupil are field-dependent and known as Seidel aberrations. In this context, d foc pxq arising in the focal plane constitutes an extreme example. Still, it is possible to decompose the focal plane distortions into a series of Zernike polynomials, in analogy to Eq. (5). In this representation, axial fiber offset (C) and fiber tilt (B) simply correspond to the lowestorder coefficients, and higher-order terms amount to a generalization of Wallner et al. (2002). Again, the phase terms introduced in F´1 "Ẽf b ‰ by higher order aberrations are degenerate with d pup puq, but the amplitude distortions need to be modeled explicitly by themselves.
Finally, for a single point source, located at α 0 in the image plane, the overlap integral averaged over a time scale much longer than the source's coherence time x...y obj is xη ps y obj 9 ż du e´2 πi u¨α 0 Π e puq " F rΠ e s pα 0 q .
Evaluation of the Fourier transform as function of α 0 results in a two-dimensional complex map. We show several examples of such maps in Fig. 2, assuming different Zernike coefficients to determine d pup puq. The perfect Airy pattern, obtained in the limit of zero aberrations, exhibits zero phase in the central part and a phase jump by 180˝at |α| » 1.22 λ{ p2r tel q. Anti-symmetric terms, such as tilt, coma and trefoil (not shown), only alter the location and shape of the phase jump, while defocus (not shown), astigmatism and higher order terms produce smooth phase gradients. For a general choice of d pup puq and in the absence of focalplane aberrations, there is a saddle point where the phase maps average to zero, but significant phase shifts are encountered at larger radii. Focal-plane aberrations break the radial symmetry of the fiber profile. Still, if the perturbations are small enough, the phase maps show a saddle point, but its value differs from zero and its location may be shifted. In any case, the transmitted amplitude is deformed and/or misplaced from the perfect Airy case. Pupil-plane aberrations typically widen the amplitude, while image-plane aberrations have the opposite effect. They lead to a widening of the fiber in the pupil plane and correspondingly to a narrower image-plane profile. The exact scaling relation for the position of the Airy ring remains true only approximately in the presence of higher-order aberrations such that maps at two different wavelengths, λ 1 and λ 2 , can be related by xη ps y obj pα 0 , λ 1 q » xη ps y objˆα 0 λ 2 λ 1 , λ 2˙.

Effect on visibility measurements and astrometry
The overlap integral defines the electromagnetic wave transmitted to the beam combiner from each of the four telescopes. After pairwise beam combination, the complex visibilities are obtained from the inference pattern I i, j , where i and j denote the telescopes involved in the measurement and I is the intensity. The complex pupil function enters each of these terms. Focusing on the single-telescope component first, we find from Eq. (7) where the b-operator denotes auto-correlation, and O pαq "ˇE obj pαqˇˇ2 is the brightness distribution of the observed astronomical object which obeys Similarly, the inference term is given by where b i, j is the baseline vector. All optical aberrations discussed previously are encoded in the back-projected apodized pupil, which is a complex fielddependent function. Expressing the pupil function in its polar representation, we refer to A i as the telescope-dependent "amplitude map" and to φ i as the "phase map". Note that these quantities are closely related to the photometric and the interferometric lobes, L i pαq " A 2 i pαq and respectively. From the measured inference pattern, the complex visibilities are obtained as By contrast, in an ideal, aberration-free setting, the van-Cittert-Zernike theorem relates the complex visibilities to the object's brightness distribution Comparison of Eq. (22) and Eq. (23) readily suggests that static aberrations at fiber injection distort both the measured visibility phases and amplitudes. We thus need to adapt the interferometric equation accordingly. To make this effect even more explicit, we first consider the case of a single, unresolved object at position α 0 , In the aberration-free case, the phase and amplitude maps of either telescope are given by the perfect Airy pattern shown in the very left panel of Fig. 2, and φ i{ j pα 0 q equals zero or 2π. The presence of static aberrations introduces a phase shift by φ i pα 0 q´φ j pα 0 q. For an interferometric binary with positions α 1 , α 2 and flux ratio f bin the measured visibility becomes Finally, for a generic extended object with an intensity distribution O pαq the van-Cittert-Zernike theorem generalizes to the expression stated at the beginning of this section, in Eq. (1) Single point sources typically are observed at the fiber center, where fiber injection is highest and the phase distortions are close to zero. In situations where a very precise alignment is not possible, like for example in exoplanet observations, the visibilities can pick up some small contribution from the phase maps. For binaries with a separation comparable to the fiber width, a configuration in which the phase and amplitude maps are irrelevant cannot be obtained in principle. In this case, the effect of static aberrations needs to be modeled and corrected for in the data analysis.

Interplay with turbulent aberrations
To this point, we have not considered the effect of time varying phase aberrations. These are introduced by atmospheric turbulence or time-varying imperfections in the optical system such as tip-tilt jitter from the adaptive optics. Their effect is to multiply the static pupil function by another, time dependent phase P e " Π e e iφ turb pu,tq .
To see how time-dependent aberrations affect the visibility measurement, we briefly recap the arguments of Perrin & Woillez (2019). Assuming that the detector integration time by far exceeds the coherence time of phase fluctuations, the long-time average x...y turb over the telescope lobes is where D φ puq is the structure function of the turbulent phase (Roddier 1981), which saturates to 2σ φ on large scales. Two assumptions underlie these expressions, first that the fluctuations are stationary and second that the baseline between the telescopes is long enough for the respective apertures to become uncorrelated. As in Perrin & Woillez (2019), we assume both to be fulfilled. In the case of GRAVITY observations, atmospheric phase variations across the telescope apertures are corrected by the adaptive optics system and the turbulent aberrations are dominated by tip-tilt jitter. Thus, the turbulent phase is where the two directions of t i ptq are independent and follow a Gaussian distribution with zero mean and variance σ 2 t . The structure function then becomes D t puq " p2πσ t uq 2 , and the photometric lobe is given by where f denotes convolution. In case of the interferometric lobe, we further assume that the jitter is uncorrelated between telescopes which yields These turbulent lobes replace the static expressions of the previous sections in the prediction of the observed visibility, i.e. in Eq.
(1), Eq. (24) and Eq. (25). The tip-tilt jitter acts like a Gaussian convolution kernel on the static maps, which is applied to the amplitude map squared in case of the photometric lobe but to the full complex map in the case of the interferometric lobe.

Measurement and characterization of aberrations for the GRAVITY beam combiner
GRAVITY observes the Galactic Center in its so-called dualfield mode, which requires the presence of a bright reference target (IRS 16C) within 2" of the actual science targets, Sgr A* and S2. The field at each telescope is split, and reference and science source are separately injected into the fringe tracking (FT) and science channel (SC) fibers. Short detector integration times on the FT allow for the optical path delay to be constantly adjusted for atmospheric turbulence in order to maintain a high fringe contrast. The science channel then measures a differential visibility phase with respect to the fringe tracker on each baseline. Phase and amplitude maps are inherently single-field effects in the sense that they individually affect the fringe tracker and the science channel for each telescope separately. Based on the optical layout of the fiber coupler (Pfuhl et al. 2014), there is no reason to expect equal aberrations on the SC and FT. However, the fringe tracking object is a bright, unresolved source which is actively tracked by the fiber center in closed loop, such that the phase distortions introduced from static aberrations are small. Moreover, any possible phase distortion from the fringe tracker cancels in the analysis of closure phases or induces a global shift without affecting the binary separation in the analysis of visibility phases. However, a description of the SC phase and amplitude maps is essential to robustly measure a binary separation in the science channel.
Here we report on measurements with the GRAVITY Calibration Unit (Blind et al. 2014) and on our subsequent analysis to extract SC phase and amplitude maps. We then fit the staticaberration model from Sec. 2.1 to those maps in order to demonstrate its validity and to obtain compressed representation of the aberrations in form of a small number of Zernike coefficients.

Phase map measurements with the calibration unit
The GRAVITY Calibration Unit, which we use for the measurement of static aberrations, is directly attached to the beam combiner and creates the light of an artificial science and fringe tracker star. By modulating the voltage on GRAVITY's positioning mirror, the position of that star relative to the fiber can be changed. We scan the FOV out to " 70 mas in a pattern of inand out-spiral, which is applied simultaneously to the FT and SC on one single telescope at a time, see Fig. 3.
In normal observation mode, GRAVITY controls the differential OPD between science channel and fringe tracker by its laser metrology and the common path to the telescopes by fringe tracking. During the phase map calibration measurement, however, fringe tracking is not possible because the fringes are lost at the margins of the scanning region. Instead, the common path from the telescope to the instrument drifts in time. Thus the determination of the aberration pattern from the absolute FT and SC phase requires a drift correction. On the FT, the short detector integration time with maximum sampling frequency of 1 kHz allows one to resolve fast modulation of the source position and the full FOV can be scanned within " 15 s. Over this short time span, the drift is well described by a constant velocity, which we fit and subtract from the data. On the SC, in contrast, the minimum detector integration time is 0.13 s and a full scan of the FOV takes 2´3 minutes, too long to model the drift by a simple polynomial fit. Instead, we obtain the science channel aberrations via a detour and first analyze the differential, drift-free SC-FT phase. The pure science channel aberrations then follow from knowledge of the absolute fringe tracker phase.
The data are reduced by the standard GRAVITY pipeline and we obtain the correlated flux in six FT spectral channels (ranging from 1.99´2.38 µm) and in medium resolution for the SC (233 wavelength bins in the range 1.97´2.48 µm). With the chosen setup, where the source position is varied on only one of the two beams forming a baseline, the measured correlated flux is given by Thus, the measurement directly scans the phase and amplitude maps on the modulated channel. Potential offsets in the accompanying non-modulated beam, φ j p0q ‰ 0, can only cause a global phase shift, which we fit and remove in the subsequent analysis. Finally, we consider the amplitude maps normalized to their maximum value, such that A j p0q has no impact on our result.
In summary we apply the following analysis steps to obtain the FT and the differential SC-FT phase and amplitude maps.
1. We fit and subtract a linear time drift from the phases measured in each spectral channel and on each baseline. 2. Phases and amplitudes are binned on a spatial grid with resolution 1 mas and averaged over all periods of in-and outspiral available. 3. The image plane coordinates do not align perfectly with the amplitude maximum, i.e. the source position for which the coupling to the fiber is most efficient. We correct for this effect by fitting a Gaussian profile and shifting the coordinate origin to its maximum. 4. Interpolation over the gridded data gives one phase and amplitude map per spectral channel and baseline. 5. All spectral channels are combined into a single map at reference wavelength λ 0 " 2.2 µm, by applying the approximate coordinate scaling from Eq. (14). Here, we verified that the individual maps are consistent over the full spectral range. Cross-validation with simulated maps shows that the error introduced by the approximate scaling relation is small, apart from the very margins of the map. It further cancels between channels above and below λ 0 to a very good degree. 6. From consideration of all baselines, three maps are available for each telescope. We again verify their consistency and average them into a single phase and amplitude map.
This method results in a FT and a differential SC-FT map for each telescope. Subtracting the former from the latter, we finally arrive at the desired SC phase map, which is shown in Fig. 4. The amplitude map on the SC, on the other hand, is measured directly. The Calibration Unit measurement was performed twice with a four month break, in late-2019 and early-2020, and we use the data to construct two independent sets of maps. These agree very well in the qualitative features and structures displayed. On the quantitative level the maps display moderate differences of the order of " 10˝, which are smaller at the center and increase towards the map's margins.

Representation in the pupil plane
Analyzing the Calibration Unit measurement as described in the previous subsection, we obtain the phase and amplitude maps on a grid discretizing the image plane. We use this result to infer the underlying pupil-plane and fiber aberrations, d pup puq and d foc puq, in their Zernike representation. To this end, we developed a simulation tool that creates complex maps of image-plane distortions from a set of Zernike coefficients according to Eq. (5), Eq. (6) and Eq. (13).
For the fit we consider the two Calibration Unit measurements from 2019 and 2020 separately and combine the phase and amplitude maps for each telescope into a complex map. We then minimize the square absolute difference to the model prediction summed over all pixels with respect to the input coefficients. Due to the nature of the approximate coordinate scaling (step 5 of the analysis pipeline), at a map's edge only the smallest wavelengths contribute. We limit the radius to which the data is considered in the fit to α maxˆλlow {λ high . With α max being the size of the full map and λ low and λ high the wavelength of the lowest and highest channel, respectively. This choice ensures equal participation of all channels in the fit.
The optical layout of observations with the Calibration Unit has some important differences with the on-sky situation, for which the phase maps will be applied later. Namely, the lack of a central obscuration and an enlarged outer stop r GCU " 9.6 m{2 alter the shape of the pupil defined in Eq. (4). As a consequence, the Calibration Unit pupil illuminates image-plane aberrations out to a slightly larger radius. We choose to normalize the Zernike polynomials by r tel " 8.0{2 m, i.e. the telescope area covered by the secondary mirror, to optimize our parameterization for the on-sky case. Image plane distortions, on the other hand, are normalized over the image-plane fiber width at λ 0 ,σ fib " λ 0 {`4r tel ? ln 2˘, i.e.
Of the different types of maps constructed, the fringe tracker provides the cleanest system and thus gives an important benchmark point for the agreement between model and data. We thus use the FT-maps to determine the order n max to which Zernike polynomials in the pupil-and focal-plane aberrations are considered. Successively increasing the fit order, we find that pupilplane aberrations with n max " 6 and focal-plane aberrations with n max " 2 provide satisfactory model consistency, while still allowing for manageable convergence times. Increasing the Zernike order in the pupil plane is especially important to reduce phase residuals at larger radii, while the central part of the maps can also be described by polynomials of lower order. Fits without focal-plane aberrations manage to reproduce the phase structure to a satisfactory degree, but show poor consistency between the phase and the amplitude data. Finally, an additional parameter   6. Phase residuals of the fit to the differential SC-FT map measured on 03/03/20 for all four GRAVITY beams. Only the data within the dashed circle is considered in the fit; at larger radii the cancellation of wavelength-dependent scaling errors is not guaranteed. accounts for the overall amplitude scaling between measured and predicted maps, such that each fit constrains at least 34 degrees of freedom. The phase RMS achieved for the fringe tracker fits is of order " 1˝for all beams and data sets; extrapolation of the fit result to the full map radius yields an RMS of a few degrees.
In principle, it is possible to directly fit the SC maps by the same procedure employed for the FT. However, by further re-fining the analysis we can remove additional systematic effects from the SC maps. Creating the maps, we corrected for misalignment of the image plane coordinates with the amplitude maximum (step 3 in the analysis pipeline). This shift, however, is not guaranteed to be identical on SC and FT, and as a result there can be a small offset between the FT phase entering the differential SC-FT measurement. To describe this effect, we fit a differential map, predicted from two sets of Zernike coefficients, to the SC-FT maps. The latter of this two sets of parameters is largely fixed to the previously obtained FT coefficients, and only the tiptilt terms are allowed to vary. The SC parameters, on the other hand, are all free, such that the fit eventually determines the desired SC maps and the offset between the two channels.
From the best-fit coefficients of the differential SC-FT fit, which we summarize in App. A, we reconstruct a complex SC map. Its phase is displayed in Fig. 5. As expected, the structure agrees very well with the maps obtained by direct evaluation of the Calibration Unit measurement in Fig. 4. Residuals between measured and fitted SC-FT map, shown in Fig. 6, are low over the full radius considered for the fit. We obtain a best-fit RMS of 1˝´2˝for most beams and data sets and two slightly worse results with RMS " 3˝and " 5˝. Going to larger radii, the disagreement between fit and data starts to increase. This can be caused either by wavelength-dependent errors or by higherorder aberrations, beyond those considered for the fit. Indeed, in optimizing n max , we noted that every increase improved the extrapolation to large separations. However, at such large offaxis distances, fiber damping becomes very significant, resulting in a poor signal-to-noise ratio. Thus, we consider the Zernike decomposition up to 6th order sufficient for our applications.

Application to GRAVITY observations
Static, field-dependent aberrations affect the visibility measurement whenever the size of an observed object is comparable to the fiber FOV. Here, we apply the formalism developed in Sec. 2 alongside the characterization of aberrations from Sec. 3 to observations of two different binary systems. First, as a proof of concept, we consider a test-case binary observed with the Auxiliary Telescopes (ATs), where the system's position in the FOV was systematically varied and thus screened over the phase and amplitude maps. Second, we apply the aberration-correction to GC observations with the UTs from 2017 and 2018. During those epochs, close to pericenter passage, S2 and Sgr A* where observed simultaneously in a single fiber pointing.
The data considered in either analysis consists of visibility amplitudes, squared visibilities and closure phases with a relative weighting of (1:1:2). To infer the sources' separation, we fit a binary model based on Eq. (25), which we extend to account for the effect of finite spectral resolution and for a homogeneous background with flux ratio f bkg relative to the first binary component, Phase distortions enter this expression via the OPD correction d i, j " pφ i´φ j qˆλ{2π. Further, the point-source visibility averaged over a spectral channel is The spectral bandpass P pλq is given by a top hat function. The source positions α 1 and α 2 , the flux ratios f bin and f bkg as well as the spectral index of the central component (ν 1 ) and the background flux (ν bkg ) are free fit parameters, while the companion's spectral slope is fixed to ν 2 " 3. Finally,Ã i{ j ,φ i{ j andL i{ j in Eq. (34) refer to the phase maps, amplitude maps and the photometric lobes as they are encountered in on-sky observations. Those have two important differences with the Calibration Unit measurement. Firstly, while the pupil-plane representation of the aberrations is the same for both settings, the presence of a central obscuration and the smaller outer stop affects the realization of the maps in the image plane. This is conveniently captured by using the Zernike coefficients found in Sec. 3.2 to create a new set of maps with adjusted pupil configuration. Secondly, the maps are subject to turbulent smoothing according to Eqs. (30) and (30).

Verification for a binary test-case
The test-case observations, carried out with the ATs in astrometric configuration, targeted HIP 41426, a binary with Kband magnitude m K » 5.393 at R.A. " 8:26:57.75 h, Dec. " 52:42:17.8 (Cutri et al. 2003). The system has an approximate separation of 200 mas. Its position relative to the GRAVITY fiber was kept fixed for three of the four telescopes and varied in 24 steps between˘400 mas on AT2. At each offset, ten frames with a 6 s integration time were taken. The setup is illustrated in Fig. 7, which shows both binary components relative to the fiber profile on all four telescopes. The shift was applied along the x-axis in the frame of the GRAVITY pupil, whose rotation with respect to the field results in a diagonal movement on the sky. We use the Zernike coefficients obtained for the SC in Sec. 3.2 to produce phase and amplitude maps tailored to observations with the ATs. In this case, the pupil, c.f. Eq. (4), is defined by r tel " 1.82 m{2 and r cent " 0.14 m{2. After beam collimation, ATs and UTs illuminate the same section on the GRAVITY mirrors, such that the pupil-plane phase screen can simply be scaled to the AT radius, i.e. r tel " 1.82 m{2 also applies in the Zernike decomposition of Eq. (5). To authenticate the impact of correct aberration modeling, we compare our results to a second, no-map analysis. In this latter scenario, we set all phase maps to zero and all amplitude maps to one, i.e.φ i{ j " 0, A i{ j " 1.
For too large fiber offsets, the signal-to-noise ratio on AT2 is poor due to large fiber damping and we consequently discard these data. The remaining pointings are shown in Fig. 7, and the corresponding separation, measured from a binary fit to the data according to Eq. (34), is given in Fig. 8  ing the correct aberration model in the analysis clearly shifts the result and reduces the scatter. Even more importantly, however, the separation found in the no-map analysis systematically depends on the fiber position; it is largest for positive fiber-offsets and smallest for offsets in the negative direction. With application of the aberration-correction, this systematic is largely removed.
We consider the binary test-case observations primarily as a proof of concept and therefore forgo a full analysis of the measurement's systematic error as carried out for the GC. Such uncertainties arise from the accuracy to which the phase maps can be determined and from the uncertainty of the atmospheric smoothing kernel. Further, there can be minor differences in the phase and amplitude maps between AT und UT observations, and our treatment is optimized to the UT scenario.
As the shift in its central value indicates, the binary separation is large enough that even at perfect fiber pointing at least one source lies in a region of the FOV where aberration-induced phase errors are significant. Accurate astrometry thus is not a question of precise fiber alignment but is only possible with a consistent treatment of the pupil-plane distortions in the analysis.

The separation between S2 and Sgr A*
Having verified our approach to correct for aberration-induced systematic errors, we also apply it to Galactic Center observations with GRAVITY. During 2017 and 2018, i.e. close to pericenter passage, S2 and Sgr A* where observed simultaneously in a single fiber pointing. In particular during 2017, when the off-axis distance of S2 was larger, the aberration correction improves the inferred binary separation. In 2019, in contrast, the Sgr A*-S2 separation exceeds the single telescope beam size of about 60 mas, and GRAVITY observes both sources separately in so called dual-beam mode. Their separation is then obtained by calibrating Sgr A* with S2 and fitting a point source model to its visibilities (see Gravity Collaboration et al. (2020) for details). In this configuration, each source can be well aligned with the fiber center, such that field-dependent aberrations do not impact the measurement.
To derive the aberration-induced shift of the S2 position, we examine a subset of the GRAVITY data used in Gravity Collaboration et al. (2019). In particular, we apply stricter quality cuts and demand a high signal-to-noise ratio. Phase and amplitude maps are generated from the coefficients obtained in Sec. 3.2 by accounting for the specific geometry of UT-observations, i.e. r tel " 8.0 m{2 and r cent " 0.96 m{2. The residual turbulent tiptilt is between 10 mas and 15 mas per axis (Perrin & Woillez 2019). In total, we consider four different realizations of the aberration maps which are given by the independent analysis of the two calibration measurements in 2019 and 2020 each convolved with the minimum and maximum smoothing assumption. A representative example for the phase maps applied in the GC analysis is shown in Fig. 9 in relation to the orbit of S2.
Our main result, the difference in S2 position with and without aberration-corrections averaged per month, is shown in Fig. 10. As expected, the correction is largest in early-2017 and smallest around peri-center passage in May 2018. Further, the mean corrections per epoch obtained with the four different realizations of the aberration maps are consistent over the full observational period.
As the orbit of S2 smoothly scans over the phase and amplitude maps (see Fig. 9), we also expect a smooth variation in the position-correction. Indeed, the time-dependence in Fig. 10 is well described by a second-order polynomial fit ∆R.A. "`´0.44 τ 2`0 .11 τ`0.04˘mas , ∆Dec. "`0.41 τ 2´0 .47 τ´0.06˘mas , where τ " t{years´2018.4 refers to the shifted observation date in years. In addition to the mean correction per epoch, Fig. 10 also shows the individual file-by-file results as gray dots. These give some insight into the uncertainty of the aberration-correction. When we fit the orbit of S2, any such uncertainty must to be propagated as source of systematic error. We construct a upper and a lower estimate of the correction, containing 67% of the files per epoch. This is shown in Fig. 10 as a gray band.
Apart from the systematic error, we also need to account for the statistical uncertainty of the S2 position. That is, as the phase and amplitude error changes when the S2 position is varied within its errorbars, we need to propagate this effect to the final correction. To this end, we take the position error of the original, un-corrected data point from which we draw 100 realizations and shift the aberration maps by it. We then derive the correction from each realization independently and use their scatter to estimate the statistical error of the S2 position correction. The resulting mean statistical uncertainty per epoch is small, between 10 µas and 30 µas, but we nevertheless also account for it in the orbit fitting.
A further check is to ask the question, what correction makes the 2017 and 2018 GRAVITY positions optimally match to the rest of the S2 data. To this end, we included a scaling factor f corr in the correction we apply, such that f corr " 1 is our best correction and f corr " 0 is no correction. This parameter we can then include in the orbit fit (see Sec. 5.1). The best fit yields f corr " 0.99˘0.06, i.e. identical to the correction we have derived purely from calibration data. This gives an independent confirmation of our concept and the resulting aberration correction: Our correction yields the most consistent S2 orbit.
The aberration correction presented here constitutes a further refinement of the analysis in Gravity Collaboration et al. (2020). There, we applied the measured aberration maps as shown in Fig. (4) directly, rather than the fitted decomposition in terms of pupil-plane Zernike polynomials. To account for the widening of the maps, which occurs when projecting from the enlarged stop on the Calibration Unit to the telescope pupil, in addition to the effect of turbulence, we applied a smoothing kernel of σ t " p19˘5q mas. The resulting best-estimate for the correction is depicted in Fig. 10 as dashed line. Both methods give consistent results, affirming the robustness of the approach. The only sizable deviation is in 2017.2, when S2 was observed at a separation comparable to the maximum radius for which we obtained the calibration measurement (see Fig. 9). This case shows the strength of the Zernike decomposition, which allows for a well-defined extrapolation.

Determination of the S2 orbit
In the following we evaluate the effect of the aberration correction on the S2 orbit. The data used is similar to Gravity Collaboration et al. (2020) and described in detail in Appendix B. We employ the same fitting procedure as in Gravity Collaboration et al. (2020), using a 13-parameter, Post-Newtonian orbit model. Six of those parameters describe the Kepler orbit (a, e, i, ω, Ω, t peri ), and another six describe the reference frame relative to the AO spectroscopy and assumed Local Standard of Rest (LSR) correction, (x 0 , y 0 , R 0 , 9 x 0 , 9 y 0 , 9 z 0 ). Here, R 0 is the distance to the GC, the prime focus of this work, and M ‚ the central mass. The best-fit parameters are given in Tab. 2.
For determining the systematic uncertainty, we follow the approach in Gravity Collaboration et al. (2019) of varying our assumptions and tracing the associated changes in R 0 . Compared to our earlier work, we also include the uncertainty due to the aberration correction, as given by the gray band in Fig. 10. The individual contributions are given in Tab. 1. It turns out that the aberration correction is the dominant contributor to the systematic error. The total systematic uncertainty is 33 pc when adding the contributions quadratically.
Our best estimate of the Galactic Center distance thus is

Comparison to previous results
Our measurements into agreement as shown in Fig. 11 and Tab. 3. We further note the following: -In contrast to Gravity Collaboration et al. (2020), we also apply a correction for the 2018 data, where S2 and Sgr A* were close to each other and close to the field center. Yet, the small aberration corrections lead to a small upward correction of R 0 of around 30 pc, comparable to the systematic error. -The orbit is particularly sensitive to the pericenter data. This leads to the effect that the statistical uncertainty decreases strongly with time, while the systematic uncertainty even increases slightly during this time frame, since varying the assumptions then leads to stronger variations in the fit result.

Comparison with further S2-based results
We estimate that the accuracy of our VLT-based result is at the 40 pc level. However, it deviates significantly from the Keckbased value reported in Do et al. (2019), with the difference being at the 300 pc level. Since both works use the orbit of S2 around Sgr A* for the determination of R 0 , it is important to investigate where the discrepancy is arising, and we address this in App. C. Overall, we conclude that the combination of a difference in the radial velocity data and a modest offset of the Keck coordinate system in the declination direction might explain the discrepancy. Both effects contribute roughly 50%. About 20% of the radial velocity difference can be attributed to the Doppler formula in StarKit used implicitly by Do et al. (2019). The remaining 80% are unexplained and could be in either the Keck or the VLT data.
The origin of the coordinate system offset is unclear as well. Trying to explain the offset with a shift of the VLT coordinate systems is much harder than imposing a shift of the Keck one due to the high precision of the GRAVITY data.

Conclusions
GRAVITY delivers high-resolution astrometry which, in combination with spectroscopic data, allows for a very precise determination of the Galactic Center distance. The values inferred from different epochs (Gravity Collaboration et al. 2018, 2020 show a small discrepancy at the 1% level, which nevertheless is significant due to the high precision of the measurement. We were able to relate this shift to optical aberrations introduced in the instrument, which lead to a field-dependent distortion of the visibility phase. Their effect is the stronger, the further off-axis an object lies within the FOV. In particular Galactic   a single fiber pointing but at a separation comparable to the FOV. In earlier and later epochs, in contrast, we employed the so-called dual-beam method and targeted each source individually. In this case, as for most other GRAVITY science observable, each source can be well centered and aberration corrections become irrelevant. The dual-beam observation mode was also assumed to derive the astrometric error budget in Lacour et al. (2014), which did not include the effect of phase maps for this precise reason.
The full analytical description which we developed here allows us to propagate the effect of optical aberrations at fiber injection to the measured visibilities. Fitting this model to dedicated calibration measurements confirms its validity and enables us to account for the effect in the data analysis. We further verify the approach with dedicated test-case observations.
The formalism which we developed is applicable beyond GRAVITY to any optical/near-IR interferometer where aberrations are introduced in the pupil or the focal plane. There have been several cases in the literature with more than one object lying in the interferometer's FOV, for example some Keck (Colavita et al. 2013), CHARA (ten Brummelaar et al. 2005 or NPOI (Armstrong et al. 1998) results on binary stars. How severely aberrations affect an observation, however, depends not only on their strength for a particular instrument but also on the off-axis distance considered and on the statistical noise in the measurement. In the example of GRAVITY on the UTs, the mean phase error introduced at 20 mas separation is 4´5 degrees per telescope and increases to 14´20 degrees at 50 mas. While a binary test case as presented in Sec. 4.1 can serve as a general strategy to diagnose whether aberration-induced systematics are an issue, dedicated calibration measurement are required for their correction in the analysis for each individual instrument.
With the results from the GRAVITY Calibration Unit measurements and our refined analysis scheme, we are able to further improve the separation between S2 and Sgr A* in 2017 and 2018, introducing shifts up to 0.5 mas caused by the phase aberrations. In Fig. 12, we show a detailed view of the S2 orbit in 2017, where we have also included two dual-beam measurements that do not suffer from phase aberrations. Indeed, the improved data agrees very well with these positions.
Of all orbital parameters, the distance to the Galactic Center R 0 is most strongly affected by the change in the S2 position. This can be easily understood if one views R 0 as the scaling factor between angular and proper velocity. As such, the fielddependent phase errors discussed in this work fully explain the shift between earlier R 0 measurements with GRAVITY data. Applying the analysis scheme developed here lifts any such discrepancies (see Sec. 5.2). In particular Fig. 11 demonstrates that belatedly corrected data sets of earlier publications give fully consistent results whose accuracy increases with time.  -We corrected the radial velocity of the epoch 2018.1277, which was 13 km/s too high in the previous data set. -Further, we are able to add one interferometric position measurement of S2 from early March 2020. Like in 2019, the separation between S2 and Sgr A* exceeds the fiber field of view, and hence a dual-beam measurement needed to be employed.
Our data set consists of 128 AO-based astrometric points, 58 GRAVITY-based astrometric points and 97 radial velocities, of which the first three before 2003 are from Do et al. (2019).

Appendix B.1: Dual-beam measurement in 2020
Due to the limited observability of the GC in early March and expecting observations in the following months, we did not attempt to observe Sgr A* in March 2020, but only pointed to S2 and to our usual calibrator star R2, with the aim of testing the stability of the GRAVITY astrometry. Pointings to Sgr A* were planned for later in the year. They had to be canceled due to the pandemic-related closure of the VLT(I) from mid-March on. To still determine the S2 -Sgr A* separation vector from this observation, we need to proceed in two steps and first measure the S2 -R2 distance, then we reference R2 to Sgr A*. The distance between S2 and R2 is measured with the dualbeam method (Sec. 4.2), where we calibrate the S2 files with R2. In addition to the 2020 measurement, this separation is also available for 56 epochs in the years 2017, 2018 and 2019. It can be measured very precisely due to the brightness of the two stars. Since the S2 -Sgr A* vectors have already been determined in Gravity Collaboration et al. (2020), we can also refer R2 to Sgr A* in those earlier epochs. We then fit a simple quadratic function for the time evolution of the R2 coordinates relative to Sgr A* and extrapolate it to March 2020. Given the large number of data and the small time range to extrapolate for, the extra uncertainty introduced is well below the 100 µas level.
We derive the S2 position in 2020 from the four scientifically usable exposures as their mean. We assign an error of 150 µas to each coordinate for this data point, reflecting both the smaller number of files compared to what we typically had available in 2019 and the extra uncertainty due to the additional step of referencing via R2. The new data point falls well onto the expected orbit, but its error bar is too large to have a significant impact on the fitted parameters.
For fitting the VLT data set, we use the same approach as in Gravity Collaboration et al. (2020): For the GRAVITY data, we assume that the astrometry directly refers the S2 positions to the mass center, as we directly measure the separation vector between the two objects interferometrically. For the NACO (AOimaging based) data, we allow for a coordinate system offset, on which we set priors following the work from Plewa et al. (2015), and we include the NACO flare positions as an additional constraint for locating the mass. This fit yields R 0 " 8274.9˘9.3 pc a " 124.982˘0.034 mas i " 134.685˘0.029Ω " 227.175˘0.029˝, where a is the semi-major axis, i the inclination and Ω the position angle of ascending node of the S2 orbit, and the errors are the statistical fit uncertainties. The VLT astrometry is dominated by the GRAVITY points, as illustrated by dropping all AO data points, which results in R 0 " 8276˘10 pc.
Fitting the Keck data set with the same 13-parameter model as used for Eq. C.1 yields  Do et al. (2019) lies between the two numbers we get by re-fitting their data. In the following, we will use for simplicity, and for equal treatment of the data, the value and approach as in Eq. C.2. We have thus a difference of ∆R 0 " 340˘45 pc.

Appendix C.3: Comparing, combining & adjusting the astrometry
Already, Gillessen et al. (2009a) noticed that a simple attempt to compare the astrometric data sets by plotting them on top of each other fails. One needs to allow for an offset and a drift between the two coordinate systems (i.e. four parameters ∆x, ∆y, ∆v x , ∆v y ). This yields thus a 17-parameter fit. Comparing the best-fitting parameters in Eq. C.1 and Eq. C.2 shows that they differ in Ω significantly. This parameter is fully degenerate with the angular orientation (called β here) of the coordinate system. Hence, the difference in Ω suggests that the two astrometric data sets are rotated with respect to each other. Therefore we extend the combination scheme by an additional, fifth parameter, ∆β, resulting in a 18-parameter fit. With this we fitted both data sets simultaneously, omitting the 41 VLT radial velocities from the Keck data set, whist dropping also the three Keck ones in the VLT data set. This fit matches the two StarKit package actually applies a Doppler formula which includes the longitudinal, relativistic correction: λ 1 " λ 0 b 1`v r {c 1´v r {c . In this form, the Doppler formula ignores the (significant) tangential motion v t of S2. In order to apply a relativistic correction one needs to use the full Doppler formula 1`z " 1`v r {c ? 1´pv 2 r`v 2 t q{c 2 (Lindegren & Dravins 2003). For this correction, however, the spectroscopic information is not sufficient. One cannot, in general, Doppler-correct a spectrum in a relativistic way without knowing the other motion component. Further, even if one would apply the full correction, one would in the following of course not be able to fit for the relativistic redshift anymore. The difference between the two formulae is small at velocities much smaller than the speed of light, but becomes important close to peri-center, when S2 reaches a velocity of nearly 8000 km{s. Still, it amounts to « 25 km{s at most and thus is smaller than the observed difference in Fig. C.1. This difference is also visible in Fig. 1 of Do et al. (2019): The plotted model spectra are slightly more redshifted than what the underlying data suggest. Changing the Keck radial velocities accordingly yields a fit with R 0 " 7972˘44 pc, i.e. accounting for 37 pc of the 159 pc.
Further checks did not yield any clues why there remains a significant difference in the radial velocities. We note: -We checked whether the time stamps are assigned consistently between the two data sets, and did not find a difference. -Fig. C.1 bottom right shows that both data sets clearly show the redshift peak around pericenter.

Appendix C.5: Discrepancy in the astrometry
Comparing the fits in Eq. C.1 and Eq. C.2 shows that they not only differ in R 0 , but also in the size of the semi-major axis a. We find ∆a{a " 1.28˘0.22 %. The same is not true for the semi-minor axis though, ∆b{b is consistent with 0. Interestingly, the projected ellipses as given by the astrometric data in the plane of sky agree in both semi-major and semi-minor axes to within 0.17%. Hence, the inclinations i need to differ, which Eq. C.1 and Eq. C.2 confirm. We find in accordance with the above 1ś inpi VLT q{ sinpi Keck q « 1.3%. The inclination of the ellipse determines where the projected center of mass is located. Given the orientation of the S2 orbit and the disagreement in a but not in b hints towards an offset of the center of mass in the declination direction. Indeed, we can show that introducing an offset to either y or v y (the mass position and velocity in declination) can explain the remaining discrepancy. Starting from the fit of the transformed Keck data set, we fix v y to its best fit value of´0.15 mas/yr. All other parameters are left free again for a subsequent fit. Additionally using the VLTI velocities in this fit instead of the Keck ones yields: This fit yields thus from the Keck astrometry the same value for R 0 as the VLT fit. Also note, that indeed semi-major axis a and inclination i have moved to the VLT values by forcing v y to have an offset. Since the mass position is parametrized with a time origin at T=2000.0, the best fit y also changes, from´0.972 mas to 1.234 mas. The systematic uncertainty on y and v y estimated by Do et al. (2019) are 1.16 mas and 0.066 mas/yr respectively. Comparison of the astrometric residual after forcing an offset in declination such that the fit to Keck data set matches the VLT one (left) and such that the fit to VLT data set matches the Keck one (right). Lighter blue corresponds to AO data from the VLT data set, darker blue to the GRAVITY data.
Hence, the difference one needs to enforce is within « 2σ of the systematic uncertainty, and the residuals in fig. C.2 (left) appear to be acceptable. Essentially the same can be achieved by forcing an offset to y and leaving v y free instead.
Can one can turn the argument around and apply a similar offset to the VLT data in order to lower the VLT-based value of R 0 ? In a first attempt we applied the same offset to the VLT AO data. However, even an offset 10 x larger (i.e. 1.2 mas/yr), changes R 0 only by « 30 pc. This is not surprising, since the VLT astrometry is completely dominated by the GRAVITY data. Thus, we instead tried varying v y and y for the GRAVITY data, giving up the assumption that the GRAVITY source directly is the mass center. Also, we exchanged the VLT radial velocities for the Keck ones. We find that we need to change v y by´1 .4 mas/yr in order to get a distance similar to the Keck value: The fit achieves the lower R 0 by tilting the orbit similar to the fit from Eq. C.2. The enforced change of v y is unrealistically large (12ˆlarger than what was needed for the Keck data), Also, the GRAVITY data show very strong and systematic residuals of up to 0.5 mas ( fig. C.2 right), and the reduced χ 2 of the fit increased from 1.50 to 2.63.