Polarization-dependent beam shifts upon metallic reflection in high-contrast imagers and telescopes

(Abridged) Context. To directly image rocky exoplanets in reflected (polarized) light, future space- and ground-based high-contrast imagers and telescopes aim to reach extreme contrasts at close separations from the star. However, the achievable contrast will be limited by reflection-induced polarization aberrations. While polarization aberrations can be modeled numerically, such computations provide little insight into the full range of effects, their origin and characteristics, and possible ways to mitigate them. Aims. We aim to understand polarization aberrations produced by reflection off flat metallic mirrors at the fundamental level. Methods. We used polarization ray tracing to numerically compute polarization aberrations and interpret the results in terms of the polarization-dependent spatial and angular Goos-Hänchen and Imbert-Federov shifts of the beam of light as described with closed-form mathematical expressions in the physics literature. Results. We find that all four beam shifts are fully reproduced by polarization ray tracing and study the origin, characteristics, sizes, and directions of the shifts. Of the four beam shifts, only the spatial Goos-Hänchen and Imbert-Federov shifts are relevant for high-contrast imagers and telescopes because these shifts are visible in the focal plane and create a polarization structure in the PSF that reduces the performance of coronagraphs and the polarimetric speckle suppression close to the star. Conclusions. The beam shifts in an optical system can be mitigated by keeping the f-numbers large and angles of incidence small. Most importantly, mirror coatings should not be optimized for maximum reflectivity, but should be designed to have a retardance close to 180°. The insights from our study can be applied to improve the performance of current and future high-contrast imagers, especially those in space and on the ELTs.


Introduction
To directly image rocky exoplanets in (polarized) reflected visible and near-infrared light, future space telescopes and extremely large ground-based telescopes and instruments aim to reach extreme planet-to-star contrast ratios at diffraction-limited angular separations from the star. Even though the optical systems of these high-contrast imagers will minimize scalar aberrations, the coronagraphic performance and achievable contrast will still be limited by polarization aberrations (e.g., Chipman 1989; McGuire & Chipman 1990; Sanchez Almeida & Martinez Pillet 1992; McGuire & Chipman 1994a,b; Breckinridge et al. 2015). Polarization aberrations are minute, polarization-dependent variations of the amplitude and phase of the electromagnetic field across a beam of light that result in a polarization structure in the point-spread function (PSF). Polarization aberrations are predominantly caused by reflection off oblique and/or curved metallic mirrors and originate directly from the Fresnel reflection coefficients. The first-order polarization aberrations, that is, the sub-wavelength, polarization-dependent shifts of the beam of light, most negatively affect the achievable contrast. Because polarization aberrations are different for orthogonal polarization components of unpolarized light, adaptive optics cannot fully correct these aberrations (Breckinridge et al. 2015).
Recently, it has become clear that high-angular-resolution polarimeters are also affected by polarization aberrations. The polarization aberrations of the Gemini South telescope appear to be limiting the polarimetric contrast achieved by the Gemini Planet Imager at the smallest angular separations from the star (Millar-Blanchaer et al. 2022). Moreover, the polarimetric speckle suppression of the high-contrast imaging polarimeter SPHERE-ZIMPOL at the Very Large Telescope, which is specifically designed to search for the reflected, polarized visible light of giant exoplanets, is limited by reflection-induced, polarization-dependent beam shifts (Schmid et al. 2018). Such shifts also affect interferometric polarization measurements with the SPeckle Polarimeter at the Sternberg Astronomical Institute 2.5-m telescope (Safonov et al. 2019). The beam shifts become apparent for these instruments due to the unprecedented polarimetric sensitivity and spatial resolution they achieve.
arXiv:2308.10940v2 [astro-ph.IM] 28 Sep 2023. A&A proofs: manuscript no. beam_shifts_fundamentals
The polarization aberrations of an astronomical telescope and instrument can be numerically computed with polarization ray tracing (Breckinridge et al. 2015). First, the paths of the rays of light are traced through the optical system using geometrical optics, but instead of the intensity, the electric field components of the rays are computed upon each reflection or transmission (e.g., Waluschka 1989; Chipman 1989; Yun et al. 2011a,b). Each point in the exit pupil is then associated with a Jones matrix. In this way, the Jones pupil, which maps the changes in the electric fields between the entrance and exit pupils of the system, is calculated (Totzeck et al. 2005). Finally, the intensity in the focal plane (i.e., the PSF) is computed in the Fraunhofer approximation through spatial Fourier transforms over the Jones pupil. Several studies have used polarization ray tracing to model the polarization aberrations of proposed and future high-contrast imagers and telescopes, such as the Roman Space Telescope (Krist et al. 2017), HabEx (Breckinridge et al. 2018), LUVOIR (Sabatke et al. 2018; Will & Fienup 2019), PICTURE-C (Mendillo et al. 2019), and the three extremely large telescopes (Anche et al. 2018, 2023). However, these numerical computations give little insight into the full range of aberrations, their origin and characteristics, and the relative importance of amplitude and phase effects. Breckinridge et al. (2015) use polarization ray tracing to analyze a three-mirror system consisting of a Cassegrain telescope followed by a flat fold mirror, and find two beam-shift effects that both originate from the oblique reflection off the flat mirror.
The authors find phase gradients (i.e., wavefront tilts) in the Jones pupil that have opposite directions for the linearly polarized components parallel and perpendicular to the plane of incidence of the fold mirror. In the focal plane, these gradients cause the orthogonally polarized components of the PSF to shift in opposite directions, thereby broadening the resulting PSF in intensity. Furthermore, the authors find PSF components that couple the light from one orthogonal polarization into the other. These PSF components, which they call ghost PSFs, have two peaks, one on either side of the plane of incidence.
Sub-wavelength, polarization-dependent shifts of a beam of light induced by reflection off a flat metallic mirror are also extensively described in the physics literature (for overviews, see Aiello & Woerdman 2008; Götte & Dennis 2012; Bliokh & Aiello 2013). These shifts are referred to as the Goos-Hänchen (GH) and Imbert-Federov (IF) shifts and occur in the directions parallel and perpendicular to the plane of incidence, respectively. Both shifts are further divided into a spatial and an angular shift. The spatial shifts are displacements of the entire beam of light upon reflection, and the angular shifts refer to angular deviations of the beam upon reflection. As such, the four shifts are considered first-order corrections to the laws of geometrical optics due to diffraction within a beam of light of finite width; the Fresnel equations only apply to infinitely extended interfaces, and a correct description of light reflected off an interface must therefore take into account the finite beam size. The GH and IF shifts are derived from first principles through full diffraction calculations and are described using closed-form mathematical expressions specifying the centroid of the intensity of a reflected Gaussian beam (e.g., Aiello & Woerdman 2007). All four shifts have been experimentally validated for metallic reflections (Merano et al. 2007; Aiello et al. 2009; Hermosa et al. 2011). Schmid et al. (2018) show in their analysis of the beam shifts of SPHERE-ZIMPOL that the spatial GH shift is likely the same as the shift arising from phase gradients in the Jones pupil as described by Breckinridge et al. (2015).
In this paper, we aim to understand polarization aberrations produced by reflection off flat metallic mirrors at the fundamental level and seek to unify the two views of the beam shifts from polarization ray tracing and full diffraction calculations in the physics literature. To this end, we determine the beam shifts from the polarization ray tracing of the reflection of a beam of light with a uniform (or top-hat) intensity profile (as applies to astronomical telescopes and instruments), and compare the resulting shifts to the spatial and angular GH and IF shifts as predicted by the closed-form expressions derived for Gaussian beams. We investigate whether the GH and IF shifts are reproduced by polarization ray tracing or whether they are additional effects that we need to take into account for astronomical instruments. In addition, we study the origin and characteristics of the shifts and determine how the size and direction of the shifts depend on the beam intensity profile, incident polarization state, angle of incidence, mirror material, and wavelength. Finally, we examine how these shifts affect the performance of high-contrast imagers and how we can mitigate them in (future) diffraction-limited astronomical telescopes and instruments.
The outline of this paper is as follows. In Sect. 2 we describe the conventions and definitions of the mathematics used throughout the paper. Subsequently, in Sect. 3, we outline the polarization ray tracing of the reflection of a beam of light off a flat metallic mirror and the determination of the beam shifts. In Sect. 4 we then explain the origin and characteristics of the spatial and angular GH and IF shifts and their relation to shifts found using polarization ray tracing. We also show the dependence of the size and direction of the shifts on the incident polarization state and angle of incidence. In Sect. 5 we investigate the polarization structure in the PSF induced by the beam shifts and the effect of the beam shifts on polarimetric measurements. In the same section we also examine the size of the beam shifts for various mirror materials and wavelengths, and discuss and refine the approaches to mitigate the beam shifts. Finally, we show a table summarizing the properties of the four beam shifts at the end of Sect. 5 and present conclusions in Sect. 6.

Conventions and definitions
In this section, we outline the conventions and definitions used throughout this paper. In the literature, the mathematical definitions underlying the descriptions of polarization aberrations and beam shifts are often incomplete and not consistent among different studies. This can lead to errors in the physical interpretation, for example with the handedness of the circular polarization or the direction of the beam shifts. We therefore describe our definitions extensively and have carefully checked our equations for consistency. As such, this paper provides a complete reference for the correct computation of the polarization aberrations and beam shifts. To enable easy comparison of our results with those from the physics literature, we use the same definitions as Aiello & Woerdman (2007), Merano et al. (2007), Aiello & Woerdman (2008), Aiello et al. (2009), and Hermosa et al. (2011). For the description of the polarization of light, these definitions are consistent with the definitions adopted by the International Astronomical Union (see e.g., Hamaker & Bregman 1996). We present the mathematics to describe light and its polarization in Sect. 2.1 and discuss metallic reflection in Sect. 2.2.

Polarization of light
We shall consider a monochromatic, polarized light wave propagating in the positive z-direction of a Cartesian reference frame (or basis) xyz as shown in Fig. 1. The transverse electric field components of this light wave in the vertical x- and horizontal y-directions can then be described as follows (see e.g., Born & Wolf 2013):

$$E_x(z, t) = \mathrm{Re}\left\{A_x \exp\left[i(kz - \omega t + \varphi_x)\right]\right\}, \tag{1}$$
$$E_y(z, t) = \mathrm{Re}\left\{A_y \exp\left[i(kz - \omega t + \varphi_y)\right]\right\}, \tag{2}$$

where $t$ is time, $\omega > 0$ is the angular frequency, $k = 2\pi/\lambda$ is the wave number with $\lambda$ the wavelength, $A_x$ and $A_y$ are the amplitudes, $\varphi_x$ and $\varphi_y$ are the initial phases, Re[...] denotes the real part, and $i$ is the imaginary unit. On the right side of Eqs. (1) and (2), the factor $\exp[i(kz - \omega t)]$ only describes the propagation of the light wave. The polarization of the wave can therefore be described by a Jones vector $\mathbf{E}$:

$$\mathbf{E} = \begin{bmatrix} E_x \\ E_y \end{bmatrix} = \begin{bmatrix} A_x e^{i\varphi_x} \\ A_y e^{i\varphi_y} \end{bmatrix}, \tag{3}$$

where $E_x$ and $E_y$ are the complex electric field components.

Fig. 1. Definition of the three reference frames (or bases) and the Stokes parameters to describe the electric field components and polarization of an electromagnetic wave. The light propagates along the z-axis out of the paper toward the reader. In the xyz-basis, the x-axis (y-axis) is oriented in the vertical (horizontal) direction. In the daz-basis, the d-axis (a-axis) is oriented in the diagonal (antidiagonal) direction, at 45° counterclockwise (clockwise) from the x-axis. In the rlz-basis, r and l represent the right-handed and left-handed circularly polarized components. For each reference frame, the basis Jones vectors, expressed in the xyz-basis, are indicated. The Stokes parameters are shown in orange with the plus sign (minus sign) indicating that the Stokes parameter is positive (negative) in that direction. The angle of linear polarization χ is defined positive for a counterclockwise rotation from the x-axis.

As an alternative way to describe the polarization, we can define a set of Stokes parameters (see Fig. 1):

$$I = E_x E_x^* + E_y E_y^* = A_x^2 + A_y^2 = I_x + I_y, \tag{4}$$
$$Q = E_x E_x^* - E_y E_y^* = A_x^2 - A_y^2 = I_x - I_y, \tag{5}$$
$$U = E_x E_y^* + E_y E_x^* = 2 A_x A_y \cos\delta = I_d - I_a, \tag{6}$$
$$V = i\left(E_x E_y^* - E_y E_x^*\right) = 2 A_x A_y \sin\delta = I_r - I_l, \tag{7}$$

where the asterisk denotes the complex conjugate, $\delta = \varphi_y - \varphi_x$ is the phase difference between the y- and x-components of the electric field, and $I_x$ and $I_y$ are the intensities of the x- and y-components of the electric field. The variables $I_d$ and $I_a$ are the intensities of the d- and a-components in the basis of the diagonal and antidiagonal polarizations, daz, and $I_r$ and $I_l$ are the intensities of the r- and l-components in the basis of the right-handed and left-handed circular polarizations, rlz (see Fig. 1). Stokes I is the total intensity, positive (negative) Stokes Q describes linear polarization in the vertical x-direction (horizontal y-direction), positive (negative) Stokes U describes linear polarization in the diagonal (antidiagonal) direction, 45° counterclockwise (clockwise) from the x-direction, and positive (negative) Stokes V describes right-handed (left-handed) circular polarization. Whereas the xyz-basis is the natural basis of Stokes Q, the daz- and rlz-bases are the natural bases of Stokes U and V, respectively. Because we normalize the total intensity, that is, we set I = 1 in Eq. (4), Q, U, and V have values between −1 and 1. We note that Eqs. (4)-(7) are strictly speaking only valid for 100% polarized, monochromatic light. However, for quasi-monochromatic light, whether 100% polarized, partially polarized, or unpolarized, we simply need to take the time averages over the terms in the equations. From Eqs. (4) and (5), with I = 1, we can derive expressions for the intensities of the x- and y-components of the electric field:

$$I_x = \frac{1 + Q}{2}, \tag{8}$$
$$I_y = \frac{1 - Q}{2}. \tag{9}$$

Although these two equations are simple, they are important, and we use them in all closed-form expressions for the beam shifts in Sect. 4.

Finally, we assemble the Stokes parameters in a Stokes vector $\mathbf{S}$:

$$\mathbf{S} = \begin{bmatrix} I & Q & U & V \end{bmatrix}^\mathrm{T}, \tag{10}$$

and define the degree of linear polarization P (which for I = 1 is equal to the linearly polarized intensity) and angle of linear polarization χ (see Fig. 1) as follows:

$$P = \sqrt{Q^2 + U^2}, \tag{11}$$
$$\chi = \frac{1}{2} \arctan\left(\frac{U}{Q}\right). \tag{12}$$
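The relation between a Jones vector and the Stokes parameters described in this section can be checked numerically. The following is a minimal sketch, assuming the standard definitions $I = |E_x|^2 + |E_y|^2$, $Q = |E_x|^2 - |E_y|^2$, $U = E_x E_y^* + E_y E_x^*$, and $V = i(E_x E_y^* - E_y E_x^*)$, which are consistent with the sign conventions stated above (positive V for right-handed circular polarization, $\delta = \varphi_y - \varphi_x$). The function names are illustrative and do not come from the paper.

```python
import numpy as np

def stokes_from_jones(e_x, e_y):
    """Stokes vector [I, Q, U, V] of a 100% polarized wave with complex
    field components e_x and e_y (assumed definitions, see lead-in)."""
    i_tot = (e_x * np.conj(e_x) + e_y * np.conj(e_y)).real
    q = (e_x * np.conj(e_x) - e_y * np.conj(e_y)).real
    u = (e_x * np.conj(e_y) + e_y * np.conj(e_x)).real
    v = (1j * (e_x * np.conj(e_y) - e_y * np.conj(e_x))).real
    return np.array([i_tot, q, u, v])

def linear_polarization(stokes):
    """Degree and angle [rad] of linear polarization of a Stokes vector."""
    i_tot, q, u, _ = stokes
    degree = np.hypot(q, u) / i_tot
    angle = 0.5 * np.arctan2(u, q)  # arctan2 keeps the correct quadrant
    return degree, angle

# Example states, normalized to I = 1.
vertical = stokes_from_jones(1.0, 0.0)                        # Q = +1
diagonal = stokes_from_jones(1 / np.sqrt(2), 1 / np.sqrt(2))  # U = +1
right_circular = stokes_from_jones(1 / np.sqrt(2), 1j / np.sqrt(2))  # delta = +90 deg
```

With these conventions, a y-component that leads the x-component by 90° yields V = +1, and the intensities of the x- and y-components follow as (1 + Q)/2 and (1 − Q)/2.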

Metallic reflection
Using this mathematically consistent description of light and its polarization, we can describe the reflection of light using the Fresnel equations in the geometric polarization ray-tracing approximation. We shall consider the central ray of a beam of light incident on a flat metallic mirror as shown in Fig. 2. Describing this ray as a plane electromagnetic wave, we decompose the incident electric field into the p- and s-polarized components that are parallel and perpendicular to the plane of incidence, respectively. For this central ray, the p- and s-directions correspond to the x- and y-directions, respectively. Assuming the refractive index of the incident medium (air) to be equal to 1, we compute the complex Fresnel reflection coefficients $r_p$ and $r_s$ as follows (see e.g., Born & Wolf 2013):

$$r_p = \frac{\hat{n}^2 \cos\theta - \sqrt{\hat{n}^2 - \sin^2\theta}}{\hat{n}^2 \cos\theta + \sqrt{\hat{n}^2 - \sin^2\theta}} = R_p \exp(i\phi_p), \tag{13}$$
$$r_s = \frac{\cos\theta - \sqrt{\hat{n}^2 - \sin^2\theta}}{\cos\theta + \sqrt{\hat{n}^2 - \sin^2\theta}} = R_s \exp(i\phi_s), \tag{14}$$

where $\theta$ is the central angle of incidence (see Fig. 2) and $\hat{n} = n + i\kappa$ is the complex refractive index of the mirror material, with $n$ and $\kappa$ the real and imaginary parts, respectively. The amplitudes $R_{p/s} = |r_{p/s}|$ specify the ratios of the amplitudes of the reflected and incident electric fields, while the phases $\phi_{p/s} = \arg(r_{p/s})$ describe the phase shifts between the reflected and incident electric fields. Two important quantities related to the reflection coefficients are the diattenuation and the retardance, which can be considered to be the zeroth-order polarization aberrations. The diattenuation $\epsilon$ is defined as follows:

$$\epsilon = \frac{R_s^2 - R_p^2}{R_s^2 + R_p^2}, \tag{15}$$

which ideally equals 0. When unpolarized light is incident on the mirror, a nonzero value of the diattenuation quantifies the amount of linearly polarized light that is created, that is, the instrumental polarization. The retardance $\Delta$ is defined as follows:

$$\Delta = \phi_s - \phi_p, \tag{16}$$

which ideally equals 180°.
The latter value comes from the requirement that the electromagnetic wave before and after reflection is described by a right-handed triplet in terms of the electric field, the magnetic field, and the wave vector. For values other than 180°, retardance results in the conversion of incident linearly polarized light into circularly polarized light and vice versa, that is, it produces polarimetric crosstalk. The physics of the beam shifts as described in Sect. 4 depends on the diattenuation and retardance as well as on the gradients of the amplitude and phase of the reflection coefficients with the angle of incidence. Figure 3 shows the amplitude and phase of the reflection coefficients as a function of the angle of incidence for gold with $\hat{n} = 0.188 + 5.39i$ at a wavelength of 820 nm, corresponding to the configuration studied in Sects. 3-5. From Fig. 3 (left) it follows that the diattenuation, which is roughly the difference between the curves of $R_s$ and $R_p$ (see Eq. (15)), is zero at θ = 0°, increases with increasing angle of incidence until it reaches a maximum around θ = 80°, and then decreases again to zero at θ = 90°. In Fig. 3 (right) we see that the retardance, which is the difference between the curves of $\phi_s$ and $\phi_p$ (see Eq. (16)), is 180° at θ = 0° and remains close to this value for small values of θ. For large θ, the retardance decreases rapidly to 0° at θ = 90°. Figure 3 (left and right) also shows the gradients in amplitude and phase at θ = 45° (similar to the phase gradients shown by Breckinridge et al. 2015). Whereas the amplitude gradient $\partial R_s/\partial\theta$ is always positive for θ > 0°, $\partial R_p/\partial\theta$ is initially negative, then becomes zero, and finally is positive for very large angles of incidence. Lastly, for θ > 0° the phase gradients $\partial\phi_s/\partial\theta$ and $\partial\phi_p/\partial\theta$ are negative and positive, respectively, and monotonically decrease and increase with increasing angle of incidence.
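The behavior of the diattenuation and retardance described above can be reproduced with a few lines of code. The sketch below assumes the Born & Wolf form of the Fresnel coefficients for incidence from a medium with refractive index 1, a time dependence exp(−iωt) so that absorbing media have a positive imaginary index, and the definitions ε = (R_s² − R_p²)/(R_s² + R_p²) and Δ = ϕ_s − ϕ_p. The function names and the constant `N_GOLD` are illustrative; only the value 0.188 + 5.39i at 820 nm is taken from the text.

```python
import numpy as np

N_GOLD = 0.188 + 5.39j  # complex refractive index of gold at 820 nm (from the text)

def fresnel_metal(theta, n_hat):
    """Complex reflection coefficients (r_p, r_s) for incidence from air.

    theta is the angle of incidence in radians. The principal branch of the
    square root is used, which is appropriate for n_hat = n + i*kappa with
    n, kappa > 0 (assumed convention).
    """
    cos_t = np.cos(theta)
    root = np.sqrt(n_hat**2 - np.sin(theta)**2 + 0j)
    r_p = (n_hat**2 * cos_t - root) / (n_hat**2 * cos_t + root)
    r_s = (cos_t - root) / (cos_t + root)
    return r_p, r_s

def diattenuation(r_p, r_s):
    """(R_s^2 - R_p^2) / (R_s^2 + R_p^2)."""
    return (abs(r_s)**2 - abs(r_p)**2) / (abs(r_s)**2 + abs(r_p)**2)

def retardance_deg(r_p, r_s):
    """Phase of r_s relative to r_p in degrees, wrapped to [0, 360)."""
    return np.degrees(np.angle(r_s / r_p)) % 360.0

r_p0, r_s0 = fresnel_metal(0.0, N_GOLD)
r_p45, r_s45 = fresnel_metal(np.radians(45.0), N_GOLD)
```

At normal incidence this gives zero diattenuation and a retardance of 180°; at 45° the retardance drops somewhat below 180° and the diattenuation becomes slightly positive, in line with the trends described for Fig. 3.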

Beam shifts from polarization ray tracing
In this section, we describe the polarization ray tracing of a beam of light that reflects off a (flat) metallic mirror, following the methodology outlined in Breckinridge et al. (2015), and the determination of the beam shifts that result. In Sect. 4 we compare the resulting shifts for various incident polarization states and angles of incidence to the predicted spatial and angular GH and IF shifts as derived for Gaussian beams. We determine the centroid shifts of both the focal-plane intensity (i.e., the PSF) and the intensity in the exit-pupil plane because these planes are where the spatial shifts (shifts of the complete beam) and angular shifts (angular deviations as measured from the focus) should be visible. To enable a direct comparison of our results with the experimental measurements of the GH and IF shifts by Merano et al. (2007), Aiello et al. (2009), and Hermosa et al. (2011), we consider a (practically) identical configuration to the one used in those studies: a converging, monochromatic beam of light with an f-number of 61.3 that reflects off a flat gold mirror at a wavelength of 820 nm and with a focal distance of 11.9 cm. Our configuration differs in that the beam of light is not Gaussian but has a uniform (or top-hat) intensity profile across the entrance pupil as is the case for astronomical telescopes and instruments.

As the first step in our analysis, we compute the Jones pupil that describes the electric-field response in the exit pupil upon reflection. We only describe this computation briefly here (for detailed descriptions see e.g., Waluschka 1989; Götte & Dennis 2012). We use the definitions as shown in Fig. 2 and decompose the beam of light into a set of rays that each can be described by a plane electromagnetic wave. For each ray, we compute the angle of incidence and, using Eqs. (13) and (14), the corresponding Fresnel reflection coefficients in the local p- and s-directions.
Subsequently, we calculate the orientation of the local plane of incidence for each ray. Finally, we compute the Jones pupil as the set of Jones matrices describing the reflection of each ray, taking into account the orientation of the local plane of incidence and the change of sign of the x-coordinate of the ray upon reflection. The resulting Jones pupil $J_{xyz}$, which is expressed in the xyz-basis, can be written as follows:

$$J_{xyz} = \begin{bmatrix} J_{xx} & J_{xy} \\ J_{yx} & J_{yy} \end{bmatrix} = \begin{bmatrix} R_{xx} e^{i\phi_{xx}} & R_{xy} e^{i\phi_{xy}} \\ R_{yx} e^{i\phi_{yx}} & R_{yy} e^{i\phi_{yy}} \end{bmatrix}, \tag{17}$$

where $J_{xx}$ to $J_{yy}$ are the complex Jones-pupil elements describing the contribution of the x- or y-polarized components of the incident electric field (in the entrance pupil) to the x- or y-polarized components of the reflected electric field (in the exit pupil). The amplitudes and phases of the Jones-pupil elements, which define the ratios of the amplitudes and the phase shifts of the reflected and incident electric fields, are denoted $R_{xx}$ to $R_{yy}$ and $\phi_{xx}$ to $\phi_{yy}$, respectively. The Jones pupil $J_{xyz}$ for reflection with an angle of incidence of 45° is shown in Fig. 4 (top). The Jones pupil is a crucial ingredient for our understanding of the beam shifts in Sect. 4. In that context, it is useful to also express the Jones pupil in the basis of the diagonal and antidiagonal polarizations, daz, and the basis of the right-handed and left-handed circular polarizations, rlz, as defined in Fig. 1. The Jones pupils in the daz- and rlz-bases, $J_{daz}$ and $J_{rlz}$, are defined as follows:

$$J_{daz} = T_{daz} J_{xyz} T_{daz}^{-1} = \begin{bmatrix} R_{dd} e^{i\phi_{dd}} & R_{da} e^{i\phi_{da}} \\ R_{ad} e^{i\phi_{ad}} & R_{aa} e^{i\phi_{aa}} \end{bmatrix}, \tag{18}$$
$$J_{rlz} = T_{rlz} J_{xyz} T_{rlz}^{-1} = \begin{bmatrix} R_{rr} e^{i\phi_{rr}} & R_{rl} e^{i\phi_{rl}} \\ R_{lr} e^{i\phi_{lr}} & R_{ll} e^{i\phi_{ll}} \end{bmatrix}, \tag{19}$$

where $R_{dd}$ to $R_{ll}$ and $\phi_{dd}$ to $\phi_{ll}$ are the amplitudes and phases of the Jones-pupil elements and −1 denotes the inverse of a matrix. The matrices $T_{daz}$ and $T_{rlz}$ describe the transformations from the xyz-basis to the daz- and rlz-bases, respectively, and are given by:

$$T_{daz} = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}, \tag{20}$$
$$T_{rlz} = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & -i \\ 1 & i \end{bmatrix}. \tag{21}$$

The Jones pupils $J_{daz}$ and $J_{rlz}$ for reflection with an angle of incidence of 45° are shown in Fig. 4 (center) and Fig. 4 (bottom), respectively.

As the next step, we compute the amplitude-response matrix (ARM) specifying the electric-field response in the focal plane (expressed in the xyz-basis).
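The ARM is computed as follows:

$$\mathrm{ARM} = \begin{bmatrix} \mathcal{F}(J_{xx}) & \mathcal{F}(J_{xy}) \\ \mathcal{F}(J_{yx}) & \mathcal{F}(J_{yy}) \end{bmatrix} = \begin{bmatrix} R'_{xx} e^{i\phi'_{xx}} & R'_{xy} e^{i\phi'_{xy}} \\ R'_{yx} e^{i\phi'_{yx}} & R'_{yy} e^{i\phi'_{yy}} \end{bmatrix}, \tag{22}$$

where $\mathcal{F}(\ldots)$ denotes the spatial Fourier transform over a Jones-pupil element and $R'_{xx}$ to $R'_{yy}$ and $\phi'_{xx}$ to $\phi'_{yy}$ denote the amplitudes and phases, respectively, of the ARM elements. By using the spatial Fourier transform for the computation of the ARM we assume that the Fraunhofer approximation to diffraction applies, which is the case for beams with absolute f-numbers larger than ∼5 (see e.g., McGuire & Chipman 1990). The ARM for reflection with an angle of incidence of 45° is shown in Fig. A.1.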
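The step from Jones pupil to ARM can be illustrated with a discrete Fourier transform. The sketch below is not the pipeline used in the paper: it builds a toy Jones pupil with a uniform (top-hat) circular aperture, puts a linear phase gradient (a wavefront tilt) on one element only, and shows that the corresponding ARM element is displaced in the focal plane. The grid size, zero-padding factor, tilt, and all function names are assumptions chosen for illustration.

```python
import numpy as np

def circular_pupil(n):
    """Uniform circular aperture on an n x n grid with the origin at pixel n // 2."""
    coords = np.arange(n) - n // 2
    xx, yy = np.meshgrid(coords, coords, indexing="ij")
    aperture = (xx**2 + yy**2 <= (n // 2 - 1)**2).astype(float)
    return aperture, xx, yy

def amplitude_response(jones_pupil, pad=4):
    """Fourier transform each element of a (2, 2, n, n) Jones pupil.

    The pupil is zero-padded by a factor `pad` to sample the focal plane more
    finely; the (i)fftshift calls keep the origin at pixel size // 2.
    """
    n = jones_pupil.shape[-1]
    size = pad * n
    start = size // 2 - n // 2
    padded = np.zeros((2, 2, size, size), dtype=complex)
    padded[:, :, start:start + n, start:start + n] = jones_pupil
    shifted = np.fft.ifftshift(padded, axes=(-2, -1))
    return np.fft.fftshift(np.fft.fft2(shifted, axes=(-2, -1)), axes=(-2, -1))

n, pad, tilt = 32, 4, 3  # tilt = focal-plane shift in padded pixels
aperture, xx, _ = circular_pupil(n)
jones = np.zeros((2, 2, n, n), dtype=complex)
jones[0, 0] = aperture * np.exp(2j * np.pi * tilt * xx / (pad * n))  # phase gradient
jones[1, 1] = -aperture                                              # flat phase
arm = amplitude_response(jones, pad)

peak_xx = np.unravel_index(np.argmax(np.abs(arm[0, 0])), arm[0, 0].shape)
peak_yy = np.unravel_index(np.argmax(np.abs(arm[1, 1])), arm[1, 1].shape)
```

The element with the flat phase peaks at the central pixel, whereas the tilted element peaks `tilt` pixels away along the first axis. This is the Fourier shift theorem at work, and it is the same mechanism by which opposite phase gradients for orthogonal polarizations displace the corresponding PSF components in opposite directions.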
Next, we calculate the point-spread matrix (PSM), which is the Mueller-matrix representation of the PSF and describes the intensity response in the focal plane for any incident Stokes vector, whether 100% polarized, partially polarized, or unpolarized. The PSM is calculated as follows:

$$\mathrm{PSM} = C \left(\mathrm{ARM} \otimes \mathrm{ARM}^*\right) C^{-1}, \tag{23}$$

where ⊗ denotes the Kronecker product, the asterisk indicates the element-wise complex conjugate, and the matrix $C$ is given by (see e.g., Espinosa-Luna et al. 2008):

$$C = \begin{bmatrix} 1 & 0 & 0 & 1 \\ 1 & 0 & 0 & -1 \\ 0 & 1 & 1 & 0 \\ 0 & i & -i & 0 \end{bmatrix}. \tag{24}$$

The PSM can be written as follows:

$$\mathrm{PSM} = \begin{bmatrix} I \rightarrow I & Q \rightarrow I & U \rightarrow I & V \rightarrow I \\ I \rightarrow Q & Q \rightarrow Q & U \rightarrow Q & V \rightarrow Q \\ I \rightarrow U & Q \rightarrow U & U \rightarrow U & V \rightarrow U \\ I \rightarrow V & Q \rightarrow V & U \rightarrow V & V \rightarrow V \end{bmatrix}, \tag{25}$$

where each element A→B describes the contribution of the incident Stokes parameter A to the resulting Stokes parameter B. The PSM for reflection with an angle of incidence of 45° is shown in Fig. 5. We note that the same PSM can also be obtained by computing the ARM (Eq. (22)) from the Jones pupil expressed in the daz- or rlz-bases and replacing the matrix $C$ in Eqs. (23) and (24) with the appropriate matrix corresponding to those bases.

As the final step, we determine the beam shifts in the exit pupil and the focal plane. To this end, we define an incident Jones vector or Stokes vector with a uniform intensity profile and polarization state. For the determination of the shift in the exit pupil, we right-multiply the Jones pupil by the incident Jones vector to obtain the Jones vector in the pupil plane. Subsequently, we compute the intensity distribution in the pupil plane as the sum of squares of the amplitudes of the latter Jones vector. Finally, we calculate the beam shift as the offset of the centroid of the intensity distribution with respect to the beam position in the absence of diffraction and aberrations. To determine the beam shift in the focal plane, we compute the Stokes vector after reflection by right-multiplying the PSM by the incident Stokes vector. We then retrieve the intensity image from the first element of the resulting Stokes vector and determine the shift as the offset of the centroid with respect to the beam position in the absence of diffraction and aberrations.
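The last two steps, converting the ARM into a PSM and measuring the centroid shift in the focal plane, can be sketched as follows. The matrix `C` below is an assumption: it is the standard matrix that maps the coherency vector (E_x E_x*, E_x E_y*, E_y E_x*, E_y E_y*) to the Stokes parameters with V = i(E_x E_y* − E_y E_x*), and may differ in form from the matrix used in the paper. The toy ARM consists of two Gaussian PSF components displaced in opposite directions, standing in for the x- and y-polarized components; it is not derived from a real Jones pupil.

```python
import numpy as np

# Assumed matrix mapping the coherency vector (xx*, xy*, yx*, yy*) to (I, Q, U, V).
C = np.array([[1, 0, 0, 1],
              [1, 0, 0, -1],
              [0, 1, 1, 0],
              [0, 1j, -1j, 0]], dtype=complex)
C_INV = np.linalg.inv(C)

def point_spread_matrix(arm):
    """Convert a (2, 2, n, m) ARM into a (4, 4, n, m) PSM, pixel by pixel."""
    # Kronecker product ARM (x) ARM* evaluated at every focal-plane pixel.
    kron = np.einsum("ijxy,klxy->ikjlxy", arm, np.conj(arm))
    kron = kron.reshape(4, 4, *arm.shape[2:])
    return np.einsum("ab,bcxy,cd->adxy", C, kron, C_INV).real

def centroid(intensity, x, y):
    """Intensity-weighted mean position."""
    total = intensity.sum()
    return (intensity * x).sum() / total, (intensity * y).sum() / total

def focal_plane_shift(psm, stokes_in, x, y):
    """Centroid of the intensity image for a given incident Stokes vector."""
    stokes_out = np.einsum("abxy,b->axy", psm, stokes_in)
    return centroid(stokes_out[0], x, y)

# Toy ARM: x- and y-polarized components displaced to x = +1 and x = -1.
coords = np.linspace(-8.0, 8.0, 33)
x, y = np.meshgrid(coords, coords, indexing="ij")
arm = np.zeros((2, 2, 33, 33), dtype=complex)
arm[0, 0] = np.exp(-((x - 1.0)**2 + y**2) / 2)
arm[1, 1] = -np.exp(-((x + 1.0)**2 + y**2) / 2)

psm = point_spread_matrix(arm)
shift_unpolarized = focal_plane_shift(psm, np.array([1.0, 0.0, 0.0, 0.0]), x, y)
shift_q_plus = focal_plane_shift(psm, np.array([1.0, 1.0, 0.0, 0.0]), x, y)
shift_q_minus = focal_plane_shift(psm, np.array([1.0, -1.0, 0.0, 0.0]), x, y)
```

In this toy case the centroid moves to x = +1 for Q = +1, to x = −1 for Q = −1, and stays at the origin for unpolarized light, because the two equally bright components are weighted equally. The broadened unpolarized PSF and the polarization-dependent centroid are the qualitative signatures discussed in the text.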

Explanation of beam shifts and comparison to polarization ray tracing
In this section, we explain the spatial and angular GH and IF shifts and compare them to the shifts found using polarization ray tracing. We analytically describe the four shifts using the closed-form expressions from Aiello & Woerdman (2008). These expressions are derived (see Aiello & Woerdman 2007) by decomposing an incident, uniformly polarized Gaussian beam of light into the angular spectrum of plane waves (e.g., Born & Wolf 2013) and computing the effect of the reflection on each wave. Because the plane waves are infinitely extended, the Fresnel equations can be applied without making any approximations. The decomposition into plane waves is equivalent to a Fourier transform of the electric field at the mirror interface. The resulting reflected plane waves are then integrated over, and the shift is calculated as the shift of the centroid of the intensity of the beam. The expressions depend on the Fresnel reflection coefficients at the central angle of incidence and the complex electric-field components of the incident beam. We have rewritten the expressions in terms of the more familiar Stokes parameters to make the expressions easier to understand and enable the computation of the shifts for any incident polarization state.

Fig. 4. Jones pupil expressed in the xyz- (top), daz- (center), and rlz-bases (bottom) at a wavelength of 820 nm for a converging beam of light with an f-number of 61.3 that reflects off gold at an angle of incidence of 45°. The panels in the first and second (third and fourth) columns show the amplitude (phase) of the Jones-pupil elements. The positive x- and y-directions are upward and to the left, respectively. The values of the color maps are different among the panels. The red, orange, blue, and green borders around the panels indicate the gradients that are visible and the specific beam shifts that these gradients cause (see the legend above the top panels). The panels of $R_{da}$, $R_{ad}$, $\phi_{da}$, and $\phi_{ad}$ have two colored borders because these panels show a combination of two gradients. To reveal the gradient in the panels of $\phi_{xy}$ and $\phi_{yx}$, π has been added to the phase in the left and right halves of the pupil, respectively.
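The three representations of the Jones pupil shown in Fig. 4 are related by changes of polarization basis of the form T J T⁻¹. As a minimal sketch, the code below applies such a change of basis to a single Jones matrix rather than to a full pupil map. The transformation matrices are assumptions derived from the conventions of Fig. 1 (d-axis at 45° counterclockwise from x, right-handed circular polarization with the y-component leading by 90°), and the sign change of the x-coordinate upon reflection is not modeled. The idealized mirror, with equal amplitudes and a retardance of exactly 180°, is likewise only an illustration.

```python
import numpy as np

# Assumed transformations from the xyz-basis to the daz- and rlz-bases.
T_DAZ = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
T_RLZ = np.array([[1, -1j], [1, 1j]], dtype=complex) / np.sqrt(2)

def change_basis(jones_xyz, transform):
    """Express a 2 x 2 Jones matrix in a new polarization basis: T J T^-1."""
    return transform @ jones_xyz @ np.linalg.inv(transform)

# Idealized mirror: r_p = 1 and r_s = -1 (retardance of exactly 180 deg).
ideal_mirror = np.diag([1.0, -1.0]).astype(complex)
jones_daz = change_basis(ideal_mirror, T_DAZ)
jones_rlz = change_basis(ideal_mirror, T_RLZ)

# A mirror whose retardance deviates from 180 deg by 15 deg.
real_mirror = np.diag([1.0, np.exp(1j * np.radians(165.0))])
leak_rr = abs(change_basis(real_mirror, T_RLZ)[0, 0])
```

For the idealized mirror both transformed matrices are purely off-diagonal, meaning that diagonal and antidiagonal (and right- and left-handed) polarizations are exchanged on reflection. When the retardance deviates from 180°, the diagonal elements become nonzero, which is one way to see why a retardance close to 180° is desirable.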
For each of the four shifts, which generally occur simultaneously, we explain the origin and characteristics, and analytically compute the size and direction as a function of the angle of incidence for different incident polarization states. We consider 100% linearly polarized light with angles of linear polarization χ ranging from 0° to 180° in steps of 22.5°, 100% right-handed and left-handed circularly polarized light (i.e., V = 1 and V = −1, respectively), and unpolarized light. For these same polarization states, we numerically compute the shifts from the polarization ray tracing as outlined in Sect. 3 and compare the results to the analytical computations. We also explain the shifts using the Jones pupil and the PSM. We discuss the spatial and angular GH shifts in Sects. 4.1 and 4.2 and the spatial and angular IF shifts in Sects. 4.3 and 4.4. For easy reference, an overview of the properties of the four beam shifts is shown in Table 1 of Sect. 5.5.

Spatial Goos-Hänchen shift
The spatial GH shift, X sGH, is a displacement of the entire beam of light upon reflection and occurs in the plane of incidence (e.g., Goos & Hänchen 1947; Merano et al. 2007; Aiello & Woerdman 2008; Aiello et al. 2009; Götte & Dennis 2012; Bliokh & Aiello 2013). Figure 6 (top) shows a schematic with the definition of the spatial GH shift. The shift is independent of the divergence angle of the incident beam (i.e., the f-number) and does not depend on whether the reflection occurs in the focus or the converging or diverging parts of the beam. From the perspective of the plane-wave decomposition, the spatial GH shift can be understood from a 2D picture of the beam of light, looking from a direction perpendicular to the plane of incidence (i.e., the side view as shown in Fig. 6, top). Each plane wave of the beam has a different angle of incidence and therefore acquires a correspondingly different phase shift upon reflection. This results in a gradient in phase over the range of angles of incidence (see Fig. 3, right). Integrating over all reflected plane waves, this then results in a shift of the entire beam parallel to the plane of incidence. The integration is equivalent to an inverse Fourier transform, which explains how a phase gradient is equivalent to a shift of the entire beam on the mirror.

Fig. 6. Schematic showing the definitions of the spatial and angular GH shifts, X sGH and Θ aGH (top), and the spatial and angular IF shifts, Y sIF and Θ aIF (bottom), for an (initially converging) beam of light incident on a metallic mirror. Darker colors within the reflected beam indicate a higher relative intensity. The orientation of the xyz reference frame before and after reflection is indicated. Positive spatial GH and IF shifts are directed in the positive x- and y-directions, respectively, after reflection (the spatial GH shift is shown in the negative direction). The angular GH and IF shifts are positive for a right-handed rotation around the y-axis and a left-handed rotation around the x-axis, respectively. For clarity, the size of the shifts is strongly exaggerated.

The spatial GH shift can be computed as follows: where R p and R s (from Eqs. (13) and (14)) and the phase gradients ∂ϕ p /∂θ and ∂ϕ s /∂θ (see Fig. 3, right) are computed at the central angle of incidence of the beam, and I x and I y are the intensities of the components of the light polarized in the x- and y-directions, respectively. These intensities only depend on the incident Stokes Q and follow from Eqs. (8) and (9). The factor R p² I x + R s² I y in Eq. (26) is the intensity of the reflected beam and recurs in the expressions for all shifts. The spatial GH shift is produced by the phase gradients, whereas R p and R s can be considered to be small corrections. Indeed, if we set either I x or I y equal to zero in Eq. (26), we obtain: which shows that the spatial GH shift consists of two components: X sGH,x for the light polarized in the x-direction and X sGH,y for the light polarized in the y-direction. The total spatial GH shift as computed from Eq. (26) can then be understood as the intensity-weighted average of these two shifts.

Figure 7 shows the spatial GH shift as a function of the angle of incidence for different incident polarization states as computed from Eq. (26). The figure also shows the shifts in the focal plane (data points) as obtained from the numerical computations using the polarization ray tracing as outlined in Sect. 3. The close agreement between the analytical and numerical results shows that the spatial GH shift is reproduced very closely by the polarization ray tracing and that Eq. (26) is not only valid for Gaussian beams, but is also accurate for beams with a uniform intensity profile. Small deviations between the analytical and numerical results are only visible for very large angles of incidence (θ ≳ 80°). These deviations are higher-order effects due to the beam intensity profile deviating from a Gaussian profile. Indeed, when performing the polarization ray tracing for a Gaussian beam, the data points agree exactly with the analytical curves for all angles of incidence.

Figure 7 shows that, although the size of the spatial GH shift is generally less than the wavelength, the shift can be larger than a wavelength for large angles of incidence and certain incident polarization states. At normal incidence, the shift is always zero. The spatial GH shift is largest for light polarized in the x-direction (i.e., for χ = 0° and χ = 180°, or Q = 1) and increases with increasing angle of incidence. Because the shift for light polarized in the x-direction is directly proportional to ∂ϕ p /∂θ (see Eq. (27)), this behavior can be understood from the increasing gradient seen in Fig. 3 (right).
For incident light polarized in the y-direction (i.e., for χ = 90° or Q = −1), the shift is much smaller and in the opposite direction, which also agrees with ∂ϕ s /∂θ being smaller than and opposite to ∂ϕ p /∂θ in Fig. 3 (right). In the case of light with Q = 0 (e.g., for unpolarized light or 100% polarized light with χ = 45°, χ = 135°, V = 1, or V = −1), the intensities of the light polarized in the x- and y-directions are equal and the resulting shift is the intensity-weighted average of the shifts of the x- and y-polarizations. Finally, for light with intermediate values of Q (e.g., partially polarized light), the resulting shift lies between the three aforementioned shifts.
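The intensity-weighted averaging described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the relations I x = (1 + Q)/2 and I y = (1 − Q)/2 are our reading of Eqs. (8) and (9) for a beam of unit intensity, and the component shifts and reflection amplitudes below are hypothetical numbers, not values from this work.

```python
def polarized_intensities(q):
    """Split a unit-intensity beam into its x- and y-polarized parts.

    Assumption: I_x = (1 + Q) / 2 and I_y = (1 - Q) / 2, so that
    I_x + I_y = 1 and Q = I_x - I_y.
    """
    return (1.0 + q) / 2.0, (1.0 - q) / 2.0


def weighted_shift(shift_x, shift_y, r_p, r_s, q):
    """Intensity-weighted average of the shifts of the two polarizations.

    The weights R_p^2 I_x and R_s^2 I_y are the reflected intensities of
    the x- and y-polarized components; their sum is the intensity of the
    reflected beam.
    """
    i_x, i_y = polarized_intensities(q)
    w_x = r_p ** 2 * i_x
    w_y = r_s ** 2 * i_y
    return (w_x * shift_x + w_y * shift_y) / (w_x + w_y)


# Hypothetical component shifts (nm) and reflection amplitudes.
SHIFT_X, SHIFT_Y = 40.0, -10.0
R_P, R_S = 0.983, 0.991

for q in (1.0, 0.5, 0.0, -1.0):
    print(q, weighted_shift(SHIFT_X, SHIFT_Y, R_P, R_S, q))
```

For Q = 1 and Q = −1 the function returns the pure x- and y-component shifts, and every other value of Q gives a result between those two, which is the ordering of the curves described for Fig. 7.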
As can be seen from Fig. 4 (top), which shows the Jones pupil expressed in the xyz-basis, the spatial GH shift produces gradients in the phase of all Jones-pupil elements (blue borders). These phase gradients represent wavefront tilts in the exit pupil and as such result in shifts of the centroid of the PSF in the focal plane. This confirms the claim by Schmid et al. (2018) that the spatial GH shift is the shift that arises from the phase gradient in the x-direction in the Jones pupil as described by Breckinridge et al. (2015). However, we note that Fig. 27 of Schmid et al. (2018) suggests that the spatial GH shift is caused by both a shift on the mirror and a directional change of the beam due to a wavefront tilt induced upon reflection. This depiction is inaccurate: The spatial GH shift is a shift of the entire beam that occurs on the mirror surface, which, in the Fraunhofer approximation, can be described as a wavefront tilt in the exit pupil.
From the Jones pupil, it may seem that the spatial GH shift depends on the f-number, but this is not the case. Although a two times smaller f-number gives a two times larger phase gradient in the pupil plane, the focal distance is also two times smaller, resulting in the same shift in the focal plane. Similarly, for a diverging beam (i.e., a beam with a negative f-number) the phase gradients have the opposite sign but then the focal plane is virtual and located in front of the mirror (i.e., the focal distance is negative), again yielding the same shift. A more mathematical approach to showing the independence of the shift from the f-number is presented in Schmid et al. (2018). We note that the size of the shift (which scales with λ, see Eq. (26)) relative to the size of the PSF (which scales with λ|F|, with F the f-number) does depend on the f-number and is proportional to 1/|F|. This means that a more strongly converging or diverging beam results in a larger shift relative to the PSF.
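The 1/|F| scaling of the relative shift can be made concrete with a short calculation. The sketch below simply divides a shift by the PSF scale λ|F|; the function name is ours, and the example numbers (a 15 nm shift at 820 nm with an f-number of 61.3) are the values quoted elsewhere in this work for unpolarized light at an angle of incidence of 45°.

```python
def shift_relative_to_psf(shift, wavelength, f_number):
    """Shift expressed in units of the PSF scale lambda * |F|.

    `shift` and `wavelength` must use the same length unit.
    """
    return shift / (wavelength * abs(f_number))


# Example: 15 nm shift, 820 nm wavelength, f-number 61.3.
reference = shift_relative_to_psf(15.0, 820.0, 61.3)

# Halving the f-number leaves the absolute shift unchanged but doubles
# its size relative to the PSF.
faster_beam = shift_relative_to_psf(15.0, 820.0, 61.3 / 2.0)
print(reference, faster_beam / reference)
```

Because only |F| enters, a diverging beam with the same absolute f-number gives the same relative shift.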
Finally, we show that the spatial GH shift is visible in the PSM as well (see Fig. 5). As described in Sect. 3, the focal-plane shifts are determined from the intensity image constructed by right-multiplying the PSM by the incident Stokes vector. In other words, the shifts are determined from the image constructed as a linear combination of the PSM elements in the top row, weighted with the incident Stokes parameters. Whereas the (I → I)-, (U → I)-, and (V → I)-elements have their centroids shifted in the x-direction by the same small amount, the (Q → I)-element exhibits a much larger shift in this direction. For incident unpolarized light (Q = U = V = 0), the shift we find is that of the (I → I)-element. On the other hand, for incident light with Q nonzero, a scaled version of the (Q → I)-element, which shows a relatively large shift, is added to or subtracted from the (I → I)-element. This results in a larger, smaller, or opposite shift compared to that of the (I → I)-element, in agreement with the curves in Fig. 7. Finally, for incident light with nonzero U and/or V, scaled versions of the (U → I)- and (V → I)-elements are added to or subtracted from the (I → I)-element. However, in this case the resulting shift is the same as that for incident unpolarized light because the centroid shifts of the (U → I)- and (V → I)-elements are equal to that of the (I → I)-element.

Angular Goos-Hänchen shift
The angular GH shift, Θ aGH, is an angular deviation of the beam of light upon reflection and, similar to the spatial GH shift, occurs in the plane of incidence (e.g., Aiello & Woerdman 2008; Aiello et al. 2009; Götte & Dennis 2012; Bliokh & Aiello 2013). The definition of the angular GH shift is shown in Fig. 6 (top). Similar to the spatial GH shift, the angular GH shift can be understood from a 2D picture of the beam of light. Each ray in the incident beam hits the mirror at a different angle of incidence and therefore experiences a different reflection coefficient. Over the range of angles of incidence this results in a gradient in the amplitude across the reflected beam (see Fig. 3, left), which translates into a shift of the centroid in intensity. Contrary to the spatial GH shift, the size of the angular shift depends on the divergence angle, and thus the f-number, of the incident beam. This is because a more strongly converging or diverging beam covers a larger range of angles of incidence and therefore yields a larger gradient. The angular GH shift is truly a deflection of the beam centroid as described by an angle, which is the same whether the reflection occurs in the focus or the converging or diverging part of the beam (see Fig. 6, top). The resulting physical displacement of the beam centroid vanishes in the focus and increases with distance from the focus. That the physical displacement of the beam centroid is zero in the focus can easily be understood in the Fraunhofer approximation: The amplitude gradient in the exit pupil will lead to a point-symmetric change in the PSF, which cannot change the centroid of the intensity distribution.
The angular GH shift can be computed as follows: where, similar to the spatial GH shift, I x and I y are functions of Stokes Q (see Eqs. (8) and (9)), and R p, R s, and the amplitude gradients ∂R p /∂θ and ∂R s /∂θ (see Fig. 3, left) are evaluated at the central angle of incidence. The divergence angle of the beam, α, is given by: with F the f-number of the beam. Contrary to the spatial GH shift, the angular GH shift only depends on the amplitude of the reflection coefficients, and not on the phase. The angular GH shift is produced by the amplitude gradients, whereas R p and R s only have a small effect. The structure of Eq. (28) is quite similar to that of Eq. (26), which describes the spatial GH shift. Indeed, when setting I x = 0 or I y = 0 in Eq. (28), we see that the angular GH shift also consists of two components for the light polarized in the x- and y-directions: Equation (28) therefore constitutes the intensity-weighted average of these two shifts. Finally, the physical displacement of the beam centroid at a distance z f from the focus of the beam is given by: where z f > 0 in the diverging part of the beam and z f < 0 in the converging part. We can compute the physical displacement of the centroid of the intensity in the pupil plane by inserting z f = −f in Eq. (31), where f is the focal distance (f > 0 in a converging beam and f < 0 in a diverging beam).

Figure 8 shows the angular GH shift as a function of the angle of incidence for different incident polarization states as computed from Eq. (28). The figure also shows the shifts as obtained from the exit pupil (data points) using the polarization ray tracing as explained in Sect. 3. We have computed these numerical shifts by dividing the physical displacements of the centroid in the pupil plane by the negative value of the focal distance (see Eq. (31)).
Contrary to the analytically computed shifts, we have computed the numerical shifts only for 100% polarized light (i.e., not for unpolarized light) because the Jones calculus used cannot describe unpolarized or partially polarized light. Similar to the spatial GH shift, the analytical and numerical results in Fig. 8 agree closely and small deviations are only visible for very large angles of incidence. These deviations are due to the angular GH shift depending on the precise beam intensity profile and vanish when performing the polarization ray tracing for a Gaussian beam. Figure 8 indicates that the angular GH shift is on the order of microradians for the particular configuration studied. For normal incidence, the shift is zero. The largest shifts are found for light polarized in the x-direction (i.e., for χ = 0° and χ = 180°, or Q = 1), whereas the shifts of the light polarized in the y-direction (i.e., for χ = 90° or Q = −1) are much smaller. The curves can be understood from the amplitude gradients governing the angular GH shift as shown in Fig. 3 (left): Whereas ∂R s /∂θ increases monotonically with increasing angle of incidence, ∂R p /∂θ is initially negative, reaches a value of zero, and then attains large positive values. The curves in Fig. 8 follow a similar pattern as those of the spatial GH shift (see Fig. 7), with the shifts for incident light that is not 100% x- or y-polarized being an intensity-weighted average of the shifts of the x- and y-polarizations.
As shown in the R xx- and R yy-elements of Fig. 4 (top; red borders), the amplitude gradients associated with the angular GH shift are visible in the Jones pupil expressed in the xyz-basis. In the antidiagonal elements R xy and R yx these amplitude gradients also exist, but they are overshadowed by the left-right symmetric structure visible in those elements. For a diverging rather than converging beam, the amplitude gradients have opposite signs (see also Fig. 6, top). Because a diverging beam implies a negative focal distance, that is, the focal plane is virtual and located in front of the mirror, the signs of the angular shifts themselves do not change (see Eq. (31)). Finally, the angular GH shift is not visible in the PSM (Fig. 5) because it is zero in the focus.
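The relation between an angular shift and the physical displacement of the centroid can be illustrated with a small sketch. It assumes the small-angle linear form, displacement = z f × Θ, which is what the procedure described for Fig. 8 implies (the numerical angular shifts are obtained by dividing the pupil-plane displacement by −f). The function names and the numbers are hypothetical.

```python
def centroid_displacement(angular_shift, z_f):
    """Physical centroid displacement at distance z_f from the focus.

    Assumes the small-angle relation displacement = z_f * angular_shift,
    with z_f > 0 in the diverging and z_f < 0 in the converging part.
    """
    return z_f * angular_shift


def angular_shift_from_pupil(displacement_in_pupil, focal_distance):
    """Recover the angular shift from the displacement in the pupil plane,
    which lies at z_f = -focal_distance."""
    return displacement_in_pupil / (-focal_distance)


# Hypothetical example: a 2 microradian shift and a focal distance of 1 m.
THETA = 2.0e-6      # rad
FOCAL = 1.0         # m, positive for a converging beam

in_pupil = centroid_displacement(THETA, -FOCAL)
print(in_pupil)                                   # displacement in the pupil (m)
print(centroid_displacement(THETA, 0.0))          # vanishes in the focus
print(angular_shift_from_pupil(in_pupil, FOCAL))  # recovers THETA
```

The same two functions apply to a diverging beam by passing a negative focal distance; the recovered angular shift keeps its sign, in line with the statement above.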

Spatial Imbert-Federov shift
The spatial IF shift, Y sIF, is a displacement of the entire beam of light upon reflection and occurs in the direction perpendicular to the plane of incidence (e.g., Federov 1955; Imbert 1972; Bliokh & Bliokh 2006; Aiello & Woerdman 2008; Hermosa et al. 2011; Götte & Dennis 2012; Bliokh & Aiello 2013). A schematic with the definition of the spatial IF shift is shown in Fig. 6 (bottom). Similar to the spatial GH shift, the spatial IF shift is independent of the f-number of the beam and the position within the beam where the reflection occurs. To understand the spatial IF shift from a plane-wave decomposition, it is necessary to consider the full 3D picture (e.g., Aiello & Woerdman 2008; Bliokh & Aiello 2013). Each plane wave in the incident beam has a different (3D) propagation direction. Therefore, not only the angles of incidence (and thus the reflection coefficients) are different among the waves, but also the orientations of the local planes of incidence. These rotations of the planes of incidence induce different geometric (Berry) phases among the circularly polarized components of the waves. This results in a gradient of the geometric phases in the direction perpendicular to the plane of incidence, with the gradient having opposite sign for the right-handed and left-handed circular polarizations. Accounting for the reflection coefficients of each wave as well as the geometric phases within the reflected beam, the reflected beam is found to be shifted in the direction perpendicular to the plane of incidence when integrating over all waves.
The spatial IF shift is more easily understood in terms of conservation of total angular momentum (e.g., Bliokh & Bliokh 2006; Bliokh & Aiello 2013). Disregarding vortex beams, the total angular momentum of a beam of light consists of the spin angular momentum (SAM) and the external orbital angular momentum. In the quantum-mechanical description of light, photons carry one of two spin states that correspond to right-handed and left-handed circular polarization. The SAM of a beam of light is a vector quantity pointing in the direction of propagation that is proportional to the difference between the number of right-handed and left-handed photons, that is, it is proportional to Stokes V. The external orbital angular momentum is given by the cross product of the radius vector of the beam centroid with respect to some origin and the linear momentum of the beam, with the latter pointing in the direction of propagation. Upon reflection, the total angular momentum in the direction normal to the surface of the mirror is conserved. As a result, any change in the SAM of the beam, that is, in the circular polarization, must be compensated for by a shift of the beam in the direction perpendicular to the plane of incidence. This shift is the spatial IF shift, which is therefore considered to be a spin-orbit interaction of light.
The spatial IF shift can be calculated as follows: where R p, R s, and the retardance ∆ (see Eq. (16) and Fig. 3) are evaluated at the central angle of incidence θ, and cot θ is the transverse gradient of the induced geometric phase. Although the spatial IF shift has a weak dependence on Stokes Q through I x and I y (see Eqs. (8) and (9)), the shift depends primarily on the incident Stokes U and V. So, whereas the GH shift consists of two separate shifts for light polarized in the x- and y-directions, the spatial IF shift comprises separate and opposite shifts for the diagonally and antidiagonally polarized components (because U = I d − I a, see Eq. (6)) as well as for the right-handed and left-handed circularly polarized components (because V = I r − I l, see Eq. (7)). For metallic reflections, the spatial IF shift results primarily from the retardance, whereas R p and R s can be considered to be small corrections. Indeed, we can simplify Eq. (32) by assuming that the incident beam is totally reflected. Setting R p = R s = 1 and inserting I x + I y = 1 (see Eq. (4)), we obtain: In this equation, the factor [V(1 + cos ∆) + U sin ∆] is proportional to the change of the SAM upon reflection, with V(1) proportional to the incident SAM and −(V cos ∆ + U sin ∆), which gives Stokes V after reflection, proportional to the SAM of the reflected beam. The spatial IF shift thus depends on the crosstalk from U to V (U sin ∆) and the crosstalk from V to U or even the crosstalk creating a change of handedness of the circular polarization (V cos ∆).

Figure 9 shows the spatial IF shift as a function of the angle of incidence for different incident polarization states as computed from Eq. (32). Also shown are the shifts in the focal plane (data points) as numerically determined using polarization ray tracing (see Sect. 3), which agree closely with the analytical computations. The small deviations among the results vanish when performing the polarization ray tracing with a Gaussian beam.
Figure 9 illustrates that the spatial IF shift is (somewhat) smaller than the spatial GH shift and is always smaller than the wavelength. At normal incidence, where ∆ = 180° (see Fig. 3), the spatial IF shift is zero. For nonzero angles of incidence, where ∆ ≠ 180°, changes in the SAM occur for incident U- or V-polarized light, thus leading to spatial IF shifts. The spatial IF shifts are in opposite directions for opposite signs of U (e.g., for χ = 45° and χ = 135°) and V (for right-handed and left-handed circular polarization). The shifts initially become larger with increasing angle of incidence (because ∆ monotonically decreases), but then become smaller again for (very) large angles of incidence as cot θ → 0 when θ → 90°, resulting in no shift at θ = 90°. The spatial IF shift for U (χ = 45° and χ = 135°) reaches larger values than that of V, with the maximum of U occurring at a smaller angle of incidence than the maximum of V. The maxima of the curves are lower for partially polarized light or light with both Q and U nonzero (e.g., χ = 22.5°, χ = 67.5°, χ = 112.5°, or χ = 157.5°). Although light with χ = 22.5° and χ = 67.5° (and similarly with χ = 112.5° and χ = 157.5°) has the same value of U, small differences in the size of the shifts occur due to the dependence on Q via I x and I y. The curves of incident light with both U and V nonzero (not shown in Fig. 9) are combinations of the curves for the individual Stokes parameters. Finally, for unpolarized light or light polarized in the x- or y-direction (i.e., Q-polarized light), the spatial IF shift is always zero because the incident beam overall carries no SAM and no SAM can be created upon reflection.
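The dependence of the spatial IF shift on the retardance and the incident polarization can be explored with a short sketch of the total-reflection limit discussed above. It evaluates only the factor [V(1 + cos ∆) + U sin ∆] and its product with cot θ; the overall prefactor and sign convention of the shift are deliberately left out, and the function names are ours.

```python
import math


def sam_change_factor(u, v, retardance):
    """Factor V (1 + cos D) + U sin D, proportional to the change of the
    spin angular momentum upon reflection (total-reflection limit).

    `retardance` is in radians; `u` and `v` are the incident normalized
    Stokes parameters.
    """
    return v * (1.0 + math.cos(retardance)) + u * math.sin(retardance)


def if_shift_shape(u, v, retardance, theta):
    """Dimensionless shape of the spatial IF shift: cot(theta) times the
    SAM-change factor. Prefactor and overall sign are omitted on purpose."""
    return sam_change_factor(u, v, retardance) / math.tan(theta)


# A retardance of 180 degrees nulls the factor for any incident U and V.
print(sam_change_factor(1.0, 0.0, math.pi), sam_change_factor(0.0, 1.0, math.pi))

# Opposite signs of U (or V) give opposite shifts; pure Q gives none.
delta = math.radians(160.0)   # hypothetical retardance
theta = math.radians(45.0)
print(if_shift_shape(1.0, 0.0, delta, theta), if_shift_shape(-1.0, 0.0, delta, theta))
print(if_shift_shape(0.0, 0.0, delta, theta))
```

The sketch makes explicit why a retardance close to 180° suppresses the spatial IF shift: both the U and the V terms vanish there, independent of the angle of incidence.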
Similar to the spatial GH shift, the spatial IF shift is expected to create gradients in phase in the Jones pupil. However, in the Jones pupil expressed in the xyz-basis (see Fig. 4, top), phase gradients in the y-direction are not visible. This is because the spatial IF shift primarily depends on Stokes U and V (see Eq. (33)), and therefore results from the complex linear combination of all four Jones-pupil elements in this basis. Nevertheless, a hint of a gradient in the y-direction is visible in the R xy-, R yx-, ϕ xy-, and ϕ yx-elements when considering that a phase difference of π between the left and right sides of the pupil implies that the reflection coefficients on either side have opposite signs. Actual phase gradients in the y-direction naturally appear in the Jones pupils expressed in the bases of Stokes U and V, that is, in the Jones pupils expressed in the daz- and rlz-bases (see Fig. 4, center and bottom). The gradients are visible in the ϕ da-, ϕ ad-, ϕ rr-, and ϕ ll-elements (green borders). The Jones pupils also show the phase gradient in the x-direction produced by the spatial GH shift (blue borders), with the ϕ da- and ϕ ad-elements exhibiting a combination of gradients in the x- and y-directions. In Fig. 4 (center and bottom), the amplitude gradient in the x-direction due to the angular GH shift is visible as well (red borders). Lastly, we note that although the spatial IF shift does not depend on the f-number, its size relative to the PSF scales as 1/|F|, analogous to the spatial GH shift (see Sect. 4.1).
Finally, we show how the spatial IF shift is visible in the PSM (see Fig. 5). As explained in Sect. 4.1, the focal-plane shifts are determined from the image created as a linear combination of the PSM elements in the top row, weighted with the incident Stokes parameters. Because the (I → I)- and (Q → I)-elements are symmetric with respect to the x-axis (i.e., they are left-right symmetric in Fig. 5), no shift results for unpolarized light or light that is polarized in the x- or y-direction. The (U → I)- and (V → I)-elements on the other hand are asymmetric, with positive and negative signals on opposite sides of the x-axis. For incident light with nonzero U and/or V, scaled versions of these elements are added to or subtracted from the (I → I)-element, producing a PSF with the centroid shifted in the y-direction. We note that the relative intensity of the (U → I)-element is larger than that of the (V → I)-element, in agreement with the spatial IF shift being larger for U than for V at an angle of incidence of 45° (see Fig. 9).

Angular Imbert-Federov shift
The angular IF shift, Θ aIF, is an angular deviation of the beam of light upon reflection directed away from the plane of incidence (e.g., Bliokh & Bliokh 2007; Aiello & Woerdman 2008; Hermosa et al. 2011; Götte & Dennis 2012; Bliokh & Aiello 2013). The definition of the angular IF shift is shown in Fig. 6 (bottom). The angular IF shift is related to the conservation of linear momentum in the direction perpendicular to the plane of incidence, and, similar to the spatial IF shift, results from the differences in induced geometric phase across the beam. Similar to the angular GH shift, the size of the angular IF shift depends on the f-number of the incident beam and is the same whether the beam is reflected in the focus or in the converging or diverging parts of the beam. The physical displacement of the centroid of the beam is zero in the focus and increases with distance from the focus.
The angular IF shift can be calculated as follows: where R p and R s are computed at the central angle of incidence, and the divergence angle α is given by Eq. (29). Similar to the angular GH shift, the angular IF shift does not depend on the phases of the reflection coefficients, but only on the amplitudes. Although the angular IF shift has small Q-dependent corrections through I x and I y (see Eqs. (8) and (9)), the shift depends primarily on the incident Stokes U. The angular IF shift consists of separate and opposite shifts for the diagonally and antidiagonally polarized components (because U = I d − I a, see Eq. (6)) and results primarily from the diattenuation. Indeed, if Q = 0, that is, I x = I y = 1/2, Eq. (34) reduces to: with ϵ the diattenuation from Eq. (15). Finally, the physical displacement of the centroid of the beam is given by: with z f the distance from the focus, similar to Eq. (31).

Figure 10 shows the angular IF shift as a function of the angle of incidence for different incident polarization states as computed from Eq. (34). The shifts as obtained from the exit pupil (data points) using polarization ray tracing (see Sect. 3) are also shown. These numerical shifts are computed using Eq. (36) and are only calculated for 100% polarized light, similarly to the angular GH shifts (see Sect. 4.2). The analytical and numerical results agree closely, with the small deviations vanishing when performing the polarization ray tracing for a Gaussian beam. Figure 10 shows that the angular IF shift is on the order of less than a microradian for the particular configuration considered. For incident light with U nonzero, angular IF shifts occur that are in the opposite direction for opposite signs of U. The shifts are zero for angles of incidence of 0° and 90°. The shape of the curves is related to the diattenuation (roughly the difference between R s and R p in Fig. 3), which initially increases with increasing angle of incidence, reaches a maximum, and then decreases again to zero at θ = 90°. For incident light with U = 0 (i.e., χ = 0°, χ = 90°, χ = 180°, V = 1, V = −1, or unpolarized light), the shift is zero for any angle of incidence.
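The behavior of the diattenuation and retardance with angle of incidence can be reproduced qualitatively from the Fresnel equations. The sketch below is illustrative only and rests on several assumptions: the complex refractive index is a rough, hypothetical value for gold in the near-infrared, the diattenuation is taken as (R s² − R p²)/(R s² + R p²), and the retardance is reported as the absolute phase difference between the s- and p-coefficients in a sign convention where r p = −r s at normal incidence. These definitions may differ in sign or form from Eqs. (13) to (16).

```python
import cmath
import math

# Hypothetical complex refractive index (n + ik), chosen for illustration.
N_METAL = 0.16 + 5.0j


def fresnel(theta, n=N_METAL):
    """Complex Fresnel coefficients (r_p, r_s) for reflection from
    vacuum off a medium with refractive index n; theta in radians."""
    cos_t = math.cos(theta)
    root = cmath.sqrt(n * n - math.sin(theta) ** 2)
    r_s = (cos_t - root) / (cos_t + root)
    r_p = (n * n * cos_t - root) / (n * n * cos_t + root)
    return r_p, r_s


def diattenuation(theta, n=N_METAL):
    """Assumed definition: (R_s^2 - R_p^2) / (R_s^2 + R_p^2)."""
    r_p, r_s = fresnel(theta, n)
    i_p, i_s = abs(r_p) ** 2, abs(r_s) ** 2
    return (i_s - i_p) / (i_s + i_p)


def retardance_deg(theta, n=N_METAL):
    """Absolute phase difference between r_s and r_p, in degrees."""
    r_p, r_s = fresnel(theta, n)
    return abs(math.degrees(cmath.phase(r_s / r_p)))


for angle in (0, 15, 30, 45, 60, 75, 90):
    theta = math.radians(angle)
    print(angle, diattenuation(theta), retardance_deg(theta))
```

With these assumptions the diattenuation vanishes at 0° and 90° and is positive in between, while the retardance starts at 180° and decreases with increasing angle of incidence, which is the qualitative behavior the text relies on.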
Finally, the amplitude gradients in the y-direction associated with the angular IF shift are visible in the R da- and R ad-elements of the Jones pupil expressed in the daz-basis (see Fig. 4, center). The gradients of these elements are a combination of gradients in the y-direction and the x-direction, with the latter due to the angular GH shift (red borders). Because the angular IF shift is zero in the focus, it is not visible in the PSM.

Fig. 10. Angular IF shift as a function of the angle of incidence at a wavelength of 820 nm for a beam of light with an f-number of 61.3 that reflects off gold as obtained from the closed-form expression of Eq. (34) (curves) and polarization ray tracing (data points). The shift is shown for an incident beam that is completely unpolarized, 100% linearly polarized with various angles of linear polarization χ, and 100% right-handed (V = 1) or left-handed (V = −1) circularly polarized. The shifts for χ = 67.5° and χ = 157.5° are not shown, but are very close to the shifts for χ = 22.5° and χ = 112.5°, respectively. Except for the circular polarization, the colors used indicate the same polarization states as in Fig. 9.

Discussion
In Sect. 4 we explained the origin and characteristics of the spatial and angular GH and IF shifts and investigated their size and direction as a function of the angle of incidence and incident polarization state. We also showed that all four beam shifts are fully reproduced by polarization ray tracing as described in Sect. 3 and that the exact beam intensity profile (i.e., whether it is Gaussian or uniform) has a negligible effect. Of the four beam shifts, only the spatial GH and IF shifts are relevant for high-contrast imagers and telescopes because these shifts are visible in the focal plane; the angular GH and IF shifts are not important because, besides a small point-symmetric deformation of the PSF for angles of incidence close to grazing incidence (which do not occur in high-contrast imagers), they have no effect in the focus. We thus find that the polarization structure in the PSF that limits the performance of coronagraphs and the speckle suppression of polarimetric imagers is created by the spatial GH and IF shifts. We note that the effect of these shifts on high-resolution spectroscopy and astrometry of planets should generally be small. The fiber-positioning system of a high-resolution spectrograph maximizes the amount of planet light that enters the fiber, thereby automatically correcting for the beam shifts. And because the beam shifts are similar for astronomical objects at different locations on the science detector, relative astrometry is hardly affected.
In Sect. 5.1 we investigate the polarization structure in the PSF created by the spatial GH and IF shifts. Subsequently, in Sect. 5.2, we examine the effect of the spatial GH and IF shifts on polarimetric measurements. In Sect. 5.3 we then briefly discuss the size of the spatial GH and IF shifts for various mirror materials and wavelengths. After that, in Sect. 5.4, we use our understanding of the spatial GH and IF shifts to discuss and refine the approaches to mitigate the shifts. Finally, we present a table summarizing the properties of the four beam shifts in Sect. 5.5.

Polarization structure in the PSF due to beam shifts
In this section, we investigate the polarization structure in the PSF created by the spatial GH and IF shifts. This polarization structure must be taken into account when designing the coronagraphs of high-contrast imagers that aim to detect planets in reflected light (Breckinridge et al. 2015). For our analysis, we consider the reflection off a single flat mirror at an angle of incidence of 45°, using the same configuration as examined in Sects. 3 and 4.
The observed light of the stars around which high-contrast imagers search for planets is unpolarized or has a degree of polarization of only several percent (see, e.g., Heiles 2000). For this case of (nearly) unpolarized incident light, the Stokes vector after reflection off a flat mirror is given by the elements in the left column of the PSM in Fig. 5, that is, the (I → I)-, (I → Q)-, (I → U)-, and (I → V)-elements. These elements are the same as those in the top row of the PSM, except for the (I → U)-element, which has opposite sign. Because the spatial GH and IF shifts follow from these top-row elements (see Sects. 4.1 and 4.3), the polarization-dependent structures visible in the Stokes vector for reflection of incident unpolarized light must be created by the spatial GH and IF shifts. In the following, we refer to the (I → I)-, (I → Q)-, (I → U)-, and (I → V)-elements as the intensity image and the Q-, U-, and V-images, respectively.
As outlined in Sect. 4.1, the spatial GH shift is described by two opposite shifts of different size for the incident light polarized in the x- and y-directions, that is, for the incident I x- and I y-components of the light. Because unpolarized light can be described as the sum of equal amounts of the I x- and I y-components (see Eqs. (4), (8), and (9)), the intensity image consists of two PSF components that are slightly shifted in opposite directions along the x-axis. As a result, the PSF in intensity is not only shifted (by 15 nm or 1.8% of the wavelength for the configuration considered; see Fig. 7, black curve), but also broadened in the x-direction. The Q-image is equal to the difference of the I x- and I y-components (see Eq. (5)). Due to the diattenuation (see Eq. (15)), the two components are not reflected by an equal amount. Therefore, an overall negative signal with a minimum of ∼0.9% remains in the image, which constitutes the instrumental polarization. But because the I x- and I y-components are also shifted in opposite directions, this instrumental-polarization signal itself also has a large shift (see also Breckinridge et al. 2015).
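The construction of the intensity image and the Q-image from two oppositely shifted PSF components can be mimicked with a one-dimensional toy model. The sketch below is not the polarization ray tracing used in this work: it assumes Gaussian PSF components, and the reflected intensity fractions and (strongly exaggerated) shifts are hypothetical numbers chosen for illustration.

```python
import math


def gaussian_psf(x, center, sigma=1.0):
    """Unnormalized 1D Gaussian PSF profile."""
    return math.exp(-0.5 * ((x - center) / sigma) ** 2)


def moments(xs, image):
    """Return (total flux, centroid, rms width) of a positive 1D image."""
    flux = sum(image)
    centroid = sum(x * v for x, v in zip(xs, image)) / flux
    var = sum((x - centroid) ** 2 * v for x, v in zip(xs, image)) / flux
    return flux, centroid, math.sqrt(var)


# Hypothetical reflected intensity fractions and shifts (in units of sigma).
R_P2, R_S2 = 0.966, 0.983
SHIFT_X, SHIFT_Y = 0.05, -0.02

xs = [i * 0.01 - 8.0 for i in range(1601)]
comp_x = [0.5 * R_P2 * gaussian_psf(x, SHIFT_X) for x in xs]  # I_x component
comp_y = [0.5 * R_S2 * gaussian_psf(x, SHIFT_Y) for x in xs]  # I_y component

intensity = [a + b for a, b in zip(comp_x, comp_y)]  # I = I_x + I_y
q_image = [a - b for a, b in zip(comp_x, comp_y)]    # Q = I_x - I_y

flux_i, centroid_i, width_i = moments(xs, intensity)
instrumental_polarization = sum(q_image) / flux_i
q_centroid = sum(x * v for x, v in zip(xs, q_image)) / sum(q_image)

print(centroid_i, width_i, instrumental_polarization, q_centroid)
```

In this toy model the intensity centroid equals the intensity-weighted average of the two shifts, the rms width exceeds that of a single component, the net Q signal is negative, and the centroid of the Q-image lies much farther from the origin than that of the intensity image, mirroring the four effects described in the text.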
As explained in Sect. 4.3, the spatial IF shift is opposite for incident diagonally (d) and antidiagonally (a) polarized light (i.e., for positive and negative 100% U-polarized light) as well as for incident right-handed (r) and left-handed (l) circularly polarized light (i.e., for positive and negative 100% V-polarized light). Unpolarized light can be described as the sum of equal amounts of these I d- and I a-components as well as the sum of equal amounts of the I r- and I l-components (see Eqs. (4), (6), and (7)). Therefore, the intensity image consists of PSF components that are slightly shifted by equal amounts in opposite directions parallel to the y-axis. So although the PSF in intensity is not shifted by the spatial IF shift when the incident light is unpolarized (see Fig. 9, black curve), it is broadened in the y-direction in addition to the broadening in the x-direction (due to the spatial GH shift). The opposite shifts of the I d- and I a-components and the I r- and I l-components can also be seen in the U- and V-images, respectively, where they create asymmetric structures with positive and negative signals on opposite sides of the x-axis. For the configuration considered, these structures have values below 0.1% of the intensity (with the U-image having larger values than the V-image, as can be expected from Fig. 9). The asymmetric structures are also visible in the R ′ xy-, R ′ yx-, ϕ ′ xy-, and ϕ ′ yx-elements of the ARM (see Fig. A.1). Breckinridge et al. (2015) refer to these structures in the ARM as ghost PSFs (see Sect. 1). Our results therefore show that these ghost PSFs are created by the spatial IF shifts and are elliptically polarized. Finally, we note that due to the splitting of the orthogonal circular polarization states in the V-image, the spatial IF shift is often also referred to as the spin Hall effect of light (e.g., Hermosa et al. 2011; Bliokh & Aiello 2013).

The PSM in Fig. 5 as calculated with polarization ray tracing includes all orders of polarization aberrations. Still, we find that the polarization structure in the PSF for the case of incident unpolarized light is adequately described by the diattenuation (i.e., the instrumental polarization) and the first-order polarization aberrations in the focus, that is, the spatial GH and IF shifts. We therefore conclude that only for curved mirrors the higher-order polarization aberrations, such as polarization-dependent astigmatism (Breckinridge et al. 2015), come into play. For a discussion on the combined effect of a series of flat mirrors and the polarization aberrations of curved mirrors with normal incidence, we refer to Breckinridge et al. (2015).

Effect of beam shifts on polarimetric measurements
In this section, we investigate the effect of the spatial GH and IF shifts on polarimetric measurements with high-contrast imagers. The physics literature does not describe beam shifts for the case where unpolarized light is incident on a mirror and where the reflected light is subsequently measured by a polarimeter. However, we can understand this case from our insights into the beam shifts and our results from the polarization ray tracing.
We shall consider a rotatable linear polarizer placed behind the mirror that we analyzed in Sect. 5.1. In that case, the Stokes vector incident on the polarizer is the same Stokes vector as examined in Sect. 5.1: It is equal to the left column of the PSM in Fig. 5. If we then align the transmission axis of the polarizer with the x-, y-, d-, and a-directions, we measure the I_x-, I_y-, I_d-, and I_a-components of the beam. Also, if we replace the polarizer with a right-handed or left-handed circular polarizer, we measure the I_r- and I_l-components of the beam. As a result, these six measurements are sensitive to exactly the same spatial GH and IF shifts of these components as described in Sect. 5.1. Therefore, when we compute the differences of the x- and y-, d- and a-, and r- and l-measurements, we obtain the Q-, U-, and V-images of the Stokes vector after reflection.
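The differencing of the six measurements can be sketched as follows. This is a minimal illustration that assumes ideal, lossless polarizers; the function names and the random test images are hypothetical and only serve to check that the differences recover the Stokes images.

```python
import numpy as np

def stokes_from_measurements(i_x, i_y, i_d, i_a, i_r, i_l):
    """Combine six intensity images into Stokes I, Q, U, and V images."""
    stokes_i = i_x + i_y
    stokes_q = i_x - i_y  # x minus y
    stokes_u = i_d - i_a  # diagonal minus antidiagonal
    stokes_v = i_r - i_l  # right-handed minus left-handed circular
    return stokes_i, stokes_q, stokes_u, stokes_v

def ideal_measurements(stokes_i, stokes_q, stokes_u, stokes_v):
    """Intensities behind ideal polarizers for a given Stokes vector (assumption)."""
    return (0.5 * (stokes_i + stokes_q), 0.5 * (stokes_i - stokes_q),
            0.5 * (stokes_i + stokes_u), 0.5 * (stokes_i - stokes_u),
            0.5 * (stokes_i + stokes_v), 0.5 * (stokes_i - stokes_v))

# Hypothetical weakly polarized test images.
rng = np.random.default_rng(0)
true_i = 1.0 + rng.random((4, 4))
true_q, true_u, true_v = (0.01 * rng.standard_normal((4, 4)) for _ in range(3))

measured = ideal_measurements(true_i, true_q, true_u, true_v)
recovered = stokes_from_measurements(*measured)
```

Because each pair of measurements sums to the same total intensity, any of the three pairs could be used for I; the x- and y-pair is chosen here only for concreteness.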
Because stars are generally unpolarized, polarimetric measurements strongly suppress the light from the star, thereby making the detection of planets in reflected light easier. However, the maximum gain in contrast from polarimetry is limited by the spatial GH and IF shifts and the polarization structure that they create. Although the instrumental polarization is a larger aberration, this effect is routinely subtracted in the data reduction and/or removed by using a half-wave plate in front of the optical path in current high-contrast imaging polarimeters (Witzel et al. 2011; Canovas et al. 2011; Wiktorowicz et al. 2014).
To quantify the maximum gain in contrast from polarimetry as limited by the spatial GH and IF shifts, we compute the mirror-induced fractional polarization in Q, U, and V over the PSF. To this end, we convolve the intensity image and the Q-, U-, and V-images using a top-hat kernel with a diameter equal to the full width at half maximum of the PSF in the intensity image. This diameter is equal to the diameter of the apertures one would use to extract the fluxes of detected planets and determine the noise level in the images (e.g., Mawet et al. 2014). After convolving the images, we compute the instrumental polarization in the Q-image by dividing the total flux in the Q-image by the total flux in the intensity image. We then subtract the instrumental polarization from the Q-image by multiplying the intensity image by the instrumental polarization and subtracting the resulting image from the Q-image. Subsequently, we compute the images of the normalized Stokes parameters q = Q/I, u = U/I, and v = V/I by dividing the (instrumental-polarization-subtracted) Q-, U-, and V-images by the intensity image. The resulting images as well as the images of the intensity and the degree and angle of linear polarization P and χ (see Eqs. (11) and (12)) are shown in Fig. 11.
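The reduction steps described above (top-hat convolution, estimation and subtraction of the instrumental polarization in Q, and normalization by the intensity) can be sketched as follows. This is a minimal illustration and not the code used for Fig. 11: the function names are hypothetical, the FFT-based convolution is one of several possible implementations, and the synthetic test images (two oppositely shifted Gaussians with slightly different fluxes) are assumptions chosen to mimic a spatial GH shift combined with 2% instrumental polarization.

```python
import numpy as np

def tophat_kernel(diameter_pix):
    """Circular top-hat kernel with the given diameter in pixels, normalized to unit sum."""
    radius = diameter_pix / 2.0
    half = int(np.ceil(radius))
    yy, xx = np.mgrid[-half:half + 1, -half:half + 1]
    kernel = (xx ** 2 + yy ** 2 <= radius ** 2).astype(float)
    return kernel / kernel.sum()

def convolve_same(image, kernel):
    """Linear convolution via zero-padded FFTs, cropped back to the image size."""
    ny, nx = image.shape
    ky, kx = kernel.shape
    shape = (ny + ky - 1, nx + kx - 1)
    product = np.fft.rfft2(image, s=shape) * np.fft.rfft2(kernel, s=shape)
    full = np.fft.irfft2(product, s=shape)
    y0, x0 = (ky - 1) // 2, (kx - 1) // 2
    return full[y0:y0 + ny, x0:x0 + nx]

def fractional_polarization(img_i, img_q, img_u, img_v, fwhm_pix):
    """Return q, u, v images and the instrumental polarization estimated in Q."""
    kernel = tophat_kernel(fwhm_pix)
    i_c, q_c, u_c, v_c = (convolve_same(im, kernel) for im in (img_i, img_q, img_u, img_v))
    ip = q_c.sum() / i_c.sum()  # total flux in Q over total flux in I
    q_c = q_c - ip * i_c        # subtract the instrumental polarization
    return q_c / i_c, u_c / i_c, v_c / i_c, ip

# Synthetic example with hypothetical values (pixels).
sigma_pix, shift_pix, half = 6.0, 0.05, 32
yy, xx = np.mgrid[-half:half + 1, -half:half + 1].astype(float)

def gauss2d(x0):
    """Gaussian stand-in for the PSF, centered at (x0, 0)."""
    return np.exp(-0.5 * ((xx - x0) ** 2 + yy ** 2) / sigma_pix ** 2)

img_x = 0.51 * gauss2d(+shift_pix)  # x-polarized component, shifted toward +x
img_y = 0.49 * gauss2d(-shift_pix)  # y-polarized component, shifted toward -x
img_i, img_q = img_x + img_y, img_x - img_y
zeros = np.zeros_like(img_i)
fwhm_pix = 2.0 * np.sqrt(2.0 * np.log(2.0)) * sigma_pix
q, u, v, ip = fractional_polarization(img_i, img_q, zeros, zeros, fwhm_pix)
```

In this toy case the estimated instrumental polarization equals the flux imbalance between the two components, and the residual q-image changes sign across the center of the PSF, similar in character to the asymmetric structure discussed for the q-image. Far from the PSF core the intensity approaches zero, so in practice the ratio images would need to be masked there.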
Figure 11 (top) shows that the spatial GH and IF shifts create a significant polarization structure in the PSF. In all images the PSF core and the Airy rings show an asymmetric structure with successive positively and negatively polarized regions. In the case of the u- and v-images, we found that these structures are created by the spatial IF shifts and identified them as the ghost PSFs described by Breckinridge et al. (2015) (see Sect. 5.1). However, by subtracting the instrumental polarization, we have revealed an even stronger asymmetric structure or ghost PSF in the q-image. In this case, the structure is produced by the spatial GH shifts and is oriented orthogonally to the structures in the u- and v-images. Figure 11 (top) also shows that the PSF has significant fractional-polarization levels, with the largest values in the q-image and the smallest values in the v-image. The relative strengths of the fractional polarization in the q-, u-, and v-images are directly related to the relative sizes of the spatial GH and IF shifts at an angle of incidence of 45° (see Figs. 7 and 9). Figure 11 (bottom) indicates that the degree of linear polarization in the PSF reaches a maximum of 0.56%. Finally, we see that the angle of linear polarization rotates 180° when moving in a circle around the center of the PSF and that it differs by 90° between the inner and outer regions of the Airy rings.
The polarization structure in the q-, u-, and v-images limits the local gain in contrast achievable with polarimetry. The degree of (linear) polarization is several tenths of a percent on average; hence the average contrast gain is a factor of ∼350, which is the gain compared to the contrast in intensity including the effects of seeing. This is because any speckles due to the seeing are also polarized at approximately this level. We stress that the exact numerical values presented in Fig. 11 are only valid for the specific configuration considered. For example, for a series of mirrors and/or beams with smaller f-numbers, the fractional-polarization levels are much higher and therefore the gain in contrast due to polarimetry is much lower.
Finally, as discussed in Sect. 1, the polarimetric speckle suppression of the high-contrast imaging polarimeter SPHERE-ZIMPOL is limited by polarization-dependent beam shifts (Schmid et al. 2018). Indeed, the structures visible in the on-sky polarimetric images of Fig. 26 of Schmid et al. (2018) agree very well with the asymmetric structures (ghost PSFs) in the q- and u-images of Fig. 11 (top). Therefore, the polarimetric contrast of SPHERE-ZIMPOL at small angular separations from the star is clearly limited by both the spatial GH and IF shifts.

Size of beam shifts for various mirror materials and wavelengths
So far we have only considered the beam shifts for reflection off gold at a wavelength of 820 nm. Here we briefly discuss the maximum size of the spatial GH and IF shifts as a function of wavelength from the ultraviolet to the near-infrared for the three most common (bulk) mirror materials used in astronomical telescopes and instruments. We note, however, that actual mirrors in astronomical telescopes and instruments are likely to consist of a stack of thin films and so the exact sizes of the shifts will be different. To compute the shifts, we use the complex refractive indices of gold, silver, and aluminum for the range of wavelengths from Rakić et al. (1998). Figure 12 shows the spatial GH shift for x-polarized light (from Eq. (26)) and the spatial IF shift for antidiagonally polarized light (from Eq. (32)), both normalized with the wavelength, for angles of incidence θ equal to 45° and 70°. Figure 12 shows that the spatial GH shift is larger than the spatial IF shift for all mirror materials, that the size of the shifts is always less than the wavelength, and that the shifts relative to the wavelength are larger for shorter wavelengths. Of the three materials, aluminum produces the smallest shifts, whereas gold and silver create larger shifts. For all materials and wavelengths, the spatial GH shift is smaller for θ = 45° than for θ = 70°. The same is true for the spatial IF shift, except for the shortest wavelengths where the shift for θ = 45° becomes larger than that for θ = 70°.
Fig. 12. Maximum wavelength-normalized spatial GH (top) and IF (bottom) shifts as a function of wavelength at an angle of incidence θ of 45° and 70° for reflection off gold, silver, and aluminum. The legend in the bottom panel is valid for both panels. The shifts for gold and silver are only shown for wavelengths longer than 600 nm and 400 nm, respectively, because the reflectivity drops below 90% at shorter wavelengths.

Mitigation of beam shifts
Breckinridge et al. (2015) provide possible approaches to mitigate polarization aberrations in optical systems, which include using beams of light with large f-numbers, keeping the angles of incidence small, and tuning the coatings of the mirrors. In this section, we discuss and refine these approaches based on our fundamental understanding of the beam shifts. Breckinridge et al. (2015) also discuss the use of possible optical devices that could compensate polarization aberrations (see also Clark & Breckinridge 2011; Sit et al. 2017; Dai et al. 2019), but a discussion of these devices is beyond the scope of this paper. We also note that Schmid et al. (2018) and Hunziker et al. (2020) are able to correct the beam shifts of SPHERE-ZIMPOL by measuring them from on-sky data. This correction significantly reduces the speckle noise at angular separations >0.6″ from the star, but residuals remain at separations <0.6″. These residuals are particularly strong for broadband data because the beam shifts are wavelength dependent and thus cannot be corrected with a simple shift for a broad wavelength range. Therefore, mitigating the beam shifts already during the optical design is the preferred approach.
The size of the spatial GH and IF shifts is independent of the f-number F of the beam of light incident on a mirror. However, as explained in Sect. 4.1, the size of these shifts relative to the size of the PSF is inversely proportional to the f-number. Therefore, to limit the effect of the beam shifts and the polarization structure they create, the absolute f-numbers of the beams falling onto the mirrors in the optical system should be large; that is, the beams should converge or diverge slowly. In the limit of a perfectly collimated beam (F = ∞), the spatial GH and IF shifts become negligibly small compared to the size of the PSF. Because any beam of finite extent corresponds to an angular spectrum of plane waves, the spatial GH and IF shifts still occur for a perfectly collimated beam, but the PSF is located at an infinite distance and is infinitely large. We note that magnifications in the optical system after the reflection off the mirror do not affect the size of the beam shifts relative to the PSF, because magnifications change the size of the shifts and the PSF by an equal amount.
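The inverse scaling with f-number can be made concrete with a short sketch. It assumes that the PSF full width at half maximum is approximately 1.03 λF, the standard value for an unobstructed circular aperture, and it uses a hypothetical shift of half a wavelength; neither number is taken from the configurations analyzed in this paper.

```python
def relative_shift(shift_in_wavelengths, f_number):
    """Beam shift as a fraction of the PSF FWHM, assuming FWHM ~ 1.03 * lambda * F."""
    return shift_in_wavelengths / (1.03 * f_number)

# Hypothetical shift of 0.5 wavelengths, evaluated for several f-numbers.
example_shift = 0.5
for f_number in (5, 15, 60):
    fraction = relative_shift(example_shift, f_number)
    print(f"F = {f_number:3d}: shift / FWHM = {fraction:.4f}")
```

The wavelength cancels because both the shift and the PSF width are expressed in units of λ, so for a given shift in wavelengths only the f-number sets the relative size. Doubling F halves the shift relative to the PSF.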
The spatial GH and IF shifts are created by respectively the phase gradient and the retardance of the mirror at the central angle of incidence of the beam; the amplitudes of the reflection coefficients have only a marginal effect and are therefore not important. Hence, to minimize the spatial GH and IF shifts, the phase gradient should be kept small and the retardance should have a value close to 180° (see Eqs. (26) and (33)). Fortunately, the values of the phase gradient and the retardance are closely related: A retardance close to 180° automatically implies small phase gradients in both the p- and s-directions. Figure 3 (right) shows that this situation occurs at small angles of incidence. Therefore, to minimize the spatial GH and IF shifts, the central angle of incidence of the beams should be kept small.
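The link between a small angle of incidence, a retardance close to 180°, and small phase gradients can be checked numerically from the Fresnel equations for a bulk metal. The sketch below is an illustration under stated assumptions: the complex refractive index is a hypothetical value only roughly representative of a noble metal in the near-infrared (not a value from Rakić et al. 1998), the imaginary part is taken positive, and the sign convention is the one in which r_p = -r_s at normal incidence. Other conventions change the signs of the phases but not the trends.

```python
import numpy as np

def fresnel_metal(theta_deg, n_complex):
    """Fresnel reflection coefficients (r_p, r_s) for light incident from vacuum."""
    theta = np.radians(np.asarray(theta_deg, dtype=float))
    cos_t = np.cos(theta)
    w = np.sqrt(n_complex ** 2 - np.sin(theta) ** 2)  # principal branch, Im(w) > 0 here
    r_s = (cos_t - w) / (cos_t + w)
    r_p = (n_complex ** 2 * cos_t - w) / (n_complex ** 2 * cos_t + w)
    return r_p, r_s

def retardance_deg(theta_deg, n_complex):
    """Phase difference between r_p and r_s in degrees, mapped to [0, 360)."""
    r_p, r_s = fresnel_metal(theta_deg, n_complex)
    return np.degrees(np.angle(r_p / r_s)) % 360.0

def phase_gradients(theta_deg, n_complex):
    """Finite-difference derivatives of the p- and s-phases with respect to theta (rad/rad)."""
    theta_deg = np.asarray(theta_deg, dtype=float)
    r_p, r_s = fresnel_metal(theta_deg, n_complex)
    theta = np.radians(theta_deg)
    dphi_p = np.gradient(np.unwrap(np.angle(r_p)), theta)
    dphi_s = np.gradient(np.unwrap(np.angle(r_s)), theta)
    return dphi_p, dphi_s

n_example = 0.2 + 5.0j                # hypothetical refractive index (assumption)
angles = np.linspace(0.0, 80.0, 161)  # angles of incidence in steps of 0.5 degrees
ret = retardance_deg(angles, n_example)
dphi_p, dphi_s = phase_gradients(angles, n_example)

for idx in (0, 20, 90, 140):          # 0, 10, 45, and 70 degrees
    print(f"theta = {angles[idx]:4.1f} deg: |retardance - 180| = {abs(ret[idx] - 180.0):6.2f} deg, "
          f"|dphi_p| = {abs(dphi_p[idx]):.3f}, |dphi_s| = {abs(dphi_s[idx]):.3f}")
```

For this example the deviation of the retardance from 180° and both phase gradients grow together as the angle of incidence increases, which is the behavior that motivates keeping the central angle of incidence small. A real coating made of a stack of thin films would require a multilayer calculation instead of the bulk Fresnel equations used here.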
Keeping the f-numbers large and the central angles of incidence small may not always be possible because optical systems need to fit in a limited volume. Therefore, the design of the mirror coatings should also be considered to minimize the spatial GH and IF shifts. In general, mirror coatings are optimized for large reflectivity to maximize the throughput of the optical system. However, highly reflective coatings almost always have retardances significantly different from 180° and therefore such coatings produce large spatial GH and IF shifts. But for high-contrast imaging, a high system throughput is of little use when one cannot attain the contrast to image exoplanets. Therefore, a paradigm shift in the design of the mirror coatings for high-contrast imagers is necessary: Rather than maximizing the reflectivity, the retardance should be optimized to have values close to 180° for the central angle of incidence of the mirror and the wavelength range of interest. For linear polarimeters such a design philosophy has the added advantage that it also prevents large losses of signal due to strong polarimetric crosstalk, such as those found for the image derotators of SPHERE and SCExAO-CHARIS (de Boer et al. 2020; van Holstein et al. 2020a,b; 't Hart et al. 2021). The larger instrumental polarization resulting from the suboptimal reflectivity is not an issue because it can be easily removed by adding a half-wave plate to the optical path or subtracting it in the data reduction.

Table summarizing properties of beam shifts
In Table 1 we present an overview of the properties of the four beam shifts discussed in this paper. For each shift, the table shows the type and nature of the effect, the plane or direction of occurrence, the origin of the shift, the parameters that the shift depends on, the typical size, the effect in the focal plane, and whether or not the shift is important for high-contrast imaging. Table 1 therefore provides a clear summary of the beam shifts and is a useful reference to compare the effects.

Conclusions
We used polarization ray tracing to numerically compute the beam shifts for reflection off a flat metallic mirror and compared the resulting shifts to the closed-form expressions of the spatial and angular GH and IF shifts from the physics literature. We find that all four beam shifts are fully reproduced by polarization ray tracing. In particular, we find that the phase gradients in the Jones pupil and the ghost PSFs as described by Breckinridge et al. (2015) are produced by the spatial GH and IF shifts. We also studied the origin and characteristics of the four shifts and the dependence of their size and direction on the beam intensity profile, incident polarization state, angle of incidence, mirror material, and wavelength. An overview of the properties of the four beam shifts is shown in Table 1. Whereas the spatial GH and IF shifts depend on the phase of the Fresnel reflection coefficients, the angular GH and IF shifts depend on the amplitude. Only the spatial GH and IF shifts are relevant for high-contrast imagers and telescopes because these shifts are visible in the focal plane. The angular GH and IF shifts on the other hand are not important because they only change the intensity distribution across the reflected beam. As such, the angular shifts have no significant effect in the focus and only create a small point-symmetric deformation of the PSF. We thus conclude that only phase aberrations are important; amplitude aberrations have an almost negligible effect.
The spatial GH and IF shifts create a polarization structure in the PSF that reduces the performance of coronagraphs. In fact, we find that the polarization structure for the case of unpolarized light incident on a flat metallic mirror is adequately described by the diattenuation (i.e., the instrumental polarization) and the spatial GH and IF shifts. The polarization structure created by the spatial GH and IF shifts can also significantly reduce the speckle suppression of polarimetric measurements, thereby limiting the maximum attainable gain in contrast from polarimetry. To mitigate the spatial GH and IF shifts in optical systems, the beams of light reflecting off the mirrors should have large f-numbers and small central angles of incidence. Most importantly, mirror coatings should not be optimized for maximum reflectivity, but should instead be designed to have a retardance close to 180°.
Our study provides a fundamental understanding of the polarization aberrations resulting from reflection off flat metallic mirrors in terms of beam shifts. In addition, we have created the analytical and numerical tools to describe these shifts. The next step is to study the combined effect and wavelength dependence of the beam shifts of complete optical paths of (polarimetric) high-contrast imaging instruments and telescopes with multiple inclined and rotating components, including half-wave plates. In particular, we plan to use our tools to create a detailed model of the beam shifts affecting the polarimetric mode of SPHERE-ZIMPOL and enable accurate corrections of on-sky observations. The insights from our work can be applied to understand and improve the performance of many future space- and ground-based high-contrast imagers and polarimeters, such as the Roman Space Telescope, the Habitable Worlds Observatory, GMagAO-X at the Giant Magellan Telescope, PSI at the Thirty Meter Telescope, and PCS (or EPICS) at the Extremely Large Telescope.
R. G. van Holstein et al.: Polarization-dependent beam shifts upon metallic reflection in high-contrast imagers and telescopes

Appendix A: Amplitude-response matrix