Generalized multi-plane gravitational lensing: time delays, recursive lens equation, and the mass-sheet transformation

Peter Schneider

doi:10.1051/0004-6361/201424881

A&A, 624 (2019) A54

Issue		A&A Volume 624, April 2019


Article Number		A54
Number of page(s)		10
Section		Cosmology (including clusters of galaxies)
DOI		https://doi.org/10.1051/0004-6361/201424881
Published online		09 April 2019

Argelander-Institut für Astronomie, Universität Bonn, Auf dem Hügel 71, 53121 Bonn, Germany
e-mail: peter@astro.uni-bonn.de

Received: 29 August 2014
Accepted: 25 February 2019

Abstract

We consider several aspects of the generalized multi-plane gravitational lens theory, in which light rays from a distant source are affected by several main deflectors, and in addition by the tidal gravitational field of the large-scale matter distribution in the Universe when propagating between the main deflectors. Specifically, we derive a simple expression for the time-delay function in this case, making use of the general formalism for treating light propagation in inhomogeneous spacetimes which leads to the characterization of distance matrices between main lens planes. Applying Fermat’s principle, an alternative form of the corresponding lens equation is derived, which connects the impact vectors in three consecutive main lens planes, and we show that this form of the lens equation is equivalent to the more standard one. For this, some general relations for cosmological distance matrices are derived. The generalized multi-plane lens situation admits a generalized mass-sheet transformation, which corresponds to uniform isotropic scaling in each lens plane, a corresponding scaling of the deflection angle, and the addition of a tidal matrix (mass sheet plus external shear) to each main lens. The scaling factor in the lens planes exhibits a curious alternating behavior for odd and even numbered planes. We show that the time delays for sources in all lens planes scale with the same factor under this generalized mass-sheet transformation, so that time-delay ratios cannot be used to break the mass-sheet transformation.

Key words: cosmological parameters / gravitational lensing: strong

© ESO 2019

1. Introduction

In strong gravitational lensing systems, in which a galaxy or a galaxy cluster causes multiple images or strong image distortions of background sources, one often neglects the inhomogeneities of the gravitational field between the observer and the lens, and between the lens and the source (see, e.g., Kochanek 2006; Treu 2010; Bartelmann 2010; Kneib & Natarajan 2011, for reviews on strong lensing systems). This usually provides a very good approximation, since the lensing strength of the main lens over the region where strong lensing effects occur is much larger than the typical distortion effects of matter along the line-of-sight. The latter is comparable to the typical strength of cosmic shear effects (see, e.g., Bartelmann & Schneider 2001; Schneider 2006; Munshi et al. 2008; Hoekstra 2013), and amounts to about 1 or 2% of the distortion in the strong-lens region.

Whereas the propagation effects are thus small, the interest of these weak distortions has been renewed, for at least two different reasons. The first is that strong lens systems may be biased toward showing relatively strong line-of-sight structures, thereby increasing their lensing probability. This effect is most likely affecting the lensing cross sections for the formation of giant arcs in clusters (see Bartelmann et al. 1998 for stating the “arc statistic problem”, and Meneghetti et al. 2013 for a recent update of the issue). Cosmological simulations (Puchwein & Hilbert 2009, and references therein) indicate that line-of-sight structure can indeed substantially modify the lensing efficiency of clusters. More recently, Bayliss et al. (2014) found a significant overdensity of galaxy groups along the line-of-sight toward strong-lensing clusters, observationally supporting the presence of this bias.

The second reason for the renewed interest in intervening distortions of strong-lens systems is their use for a precise determination of parameters, most notably the Hubble constant from measured time delays (Refsdal 1964) in multiply imaged quasi-stellar objects (see, e.g., Kochanek 2003, 2006; Treu 2010; Suyu et al. 2010, 2013). The quality of modern imaging data and the accuracy of time delay estimates allow one to derive estimates of the Hubble constant with a formal error of ∼6% (Suyu et al. 2013). At this level of precision, line-of-sight effects may become highly relevant in these strong-lensing systems (e.g., Wong et al. 2011; Collett et al. 2013; Greene et al. 2013, and references therein).

With additional deflectors along the line-of-sight, the lens equation, which relates the true source position to the observed positions of images, needs to be generalized to include deflection at more than one distance from the observer. Blandford & Narayan (1986) established the theory of multi-plane gravitational lensing (see also Chap. 9 of Schneider et al. 1992), where the mapping between images and their source is affected by the action of several deflectors at different redshifts. This multi-plane lensing is employed in ray-tracing simulations – see Refsdal (1970) for the earliest ray-tracing simulations, Jain et al. (2000), Hilbert et al. (2009), and references therein – where the three-dimensional mass distribution between observer and source is partitioned into separate slices, and the mass distribution in each slice is treated as a gravitational lens plane.

The full theory of multiple deflection lensing is required if there are two or more strong deflectors along the same line-of-sight toward a background source. For galaxy-scale lensing, such systems are rare, since such an alignment is not very probable. However, in the sample of several hundred galaxy-scale strong lens systems currently known, there are examples of multiple lenses at two different redshifts (Chae et al. 2001; Gavazzi et al. 2008). These systems may be of particular interest, since they may be used in principle to determine cosmological distance ratios (Collett & Auger 2014), and therefore to constrain the cosmic expansion history, although in practice the mass-sheet transformation renders this a difficult task (Schneider 2014).

Far more common are situations where there is a single strong lens in the line-of-sight to a distant source, and several other deflectors at different redshifts situated sufficiently far away from the strong-lensing region of the main deflector such that their deflection angle can be linearized across this strong-lensing region. Kovner (1987) considered this case of one main lens, combined with linearized deflections at different redshifts. The effects of the linear deflectors can be summarized into a set of matrices which describe the mapping between angles at the vertices of light cones to the separation vectors along the light cone at the redshifts of lens and source. Schneider (1997) reconsidered this situation, using the general formalism of light propagation in an inhomogeneous universe (Seitz et al. 1994; hereafter SSE), and related this generalized gravitational lens equation to the one where the deflection occurs in only a single plane, with a tidal deflection matrix added (see also Keeton et al. 1997). McCully et al. (2014) further generalized this theory to consider several main lenses, together with linearized deflections between the main lens planes. In particular, McCully et al. (2014) emphasized the advantages of this hybrid framework for the modeling of strong lens systems with multiple main deflectors along the line-of-sight (see also McCully et al. 2017).

One of the prime motivations for the work of McCully et al. (2014) was the derivation of the time-delay function for such generalized lensing systems, as this is required for relating the time delay to the scale-length of the Universe, i.e., the Hubble radius. Using the Millennium Simulation, Jaroszynski & Kostrzewa-Rutkowska (2014) investigated the impact of the line-of-sight matter distribution on strong lensing properties, including the time delay; they concluded that the intermediate mass distribution leads to a spread of ∼6% in the product of Hubble constant and time delay, for strong lensing systems with source redshifts z_s ∼ 2. Arguably, the largest obstacle for the time-delay method to obtain accurate estimates for the Hubble constant is the degeneracy of the mass model due to the mass-sheet transformation (MST; Falco et al. 1985; see Schneider & Sluse 2013 for a recent discussion) or the more general source position transformation (Schneider & Sluse 2014). Recently, Schneider (2014) has shown that an MST also exists for the case of lenses at two different distances from us. In particular, this MST leads to a scaling of all time delays (from sources on both source planes) by the same factor, so that the degeneracy due to the MST cannot be broken by time-delay ratios.

In this paper, these results are further generalized to the case of arbitrarily many main lens planes, together with linear deflections between the main lens (and source) planes. After a short summary of the propagation equations in an inhomogeneous universe, as they apply to the case under consideration here, we derive the time-delay function for the case of several main lens planes. Whereas this has been done before by McCully et al. (2014), our expression for the light travel time function is written in a form which allows us to apply Fermat’s principle in gravitational lensing. Thus, the lens equation can be obtained from requiring that the light travel time function is stationary with respect to all impact vectors in the main lens planes. With this procedure, we derive an alternative form of the lens equation, which involves the impact vectors of three consecutive lens planes and the deflection angle in the middle one of them. We show that this iterative form of the lens equation is equivalent to the more standard form; in order to do so, we first obtain a very general relation between distance matrices.

We then turn to the MST in this case, and find a curious property of its behavior: The uniform, isotropic scaling factor which characterizes the MST alternates between a free parameter λ and unity from one lens/source plane to the next. The corresponding modification of the deflection angle in the main lens planes corresponds to a scaling of the deflection, plus the addition of a tidal deflection matrix (which in the absence of linear deflections between lens planes reduces to the addition of a uniform mass sheet). Finally we show that all time delays, for sources located on any plane, scale by the same factor under an MST, precluding the possibility of breaking the degeneracy from the MST by measured time-delay ratios.

2. Generalized multi-plane lens equation

2.1. Optical tidal equation

The propagation of light rays follows from the geodesic equation, specialized to a perturbed Robertson–Walker metric (see Schneider et al. 1992; Seitz et al. 1994; Bartelmann 2010, and references therein). In particular, for infinitesimally small light bundles, the separation vector ξ of a ray from the reference (or “central”) ray of the bundle is given by the optical tidal equation

$\begin{aligned} \boldsymbol{\xi}''(\lambda) = \mathcal{T}(\lambda)\,\boldsymbol{\xi}(\lambda), \end{aligned}$ (1)

where 𝒯 is the optical tidal matrix, evaluated at the affine parameter λ along the central ray. Here and throughout this paper, a prime denotes differentiation with respect to the affine parameter λ, which is related to the redshift z, the cosmic scale factor a = 1/(1 + z), or cosmic time t through

$\begin{aligned} \mathrm{d}\lambda = -\frac{c\,\mathrm{d}t}{1+z} = -\frac{c\,\mathrm{d}a}{H} = \frac{c\,\mathrm{d}z}{H\,(1+z)^2}, \end{aligned}$ (2)

where we have assumed that λ increases with redshift, i.e., decreases with cosmic time, and where

$\begin{aligned} H = \dot{a}/a = H_0\sqrt{\Omega_{\rm m}(1+z)^3 + (1-\Omega_{\rm m}-\Omega_\Lambda)(1+z)^2 + \Omega_\Lambda} \end{aligned}$ (3)

is the Hubble function, H₀ is the Hubble constant, and Ω_m and Ω_Λ are the cosmic density parameters in matter and vacuum energy.
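Equations (2) and (3) can be evaluated numerically in a few lines. The following is a minimal sketch, assuming units in which c/H₀ = 1; the function names `hubble_ratio` and `affine_parameter` and the Simpson-rule integration are illustrative choices, not part of the formalism.

```python
import math

def hubble_ratio(z, omega_m, omega_l):
    """E(z) = H(z)/H0 from Eq. (3)."""
    x = 1.0 + z
    return math.sqrt(omega_m * x**3 + (1.0 - omega_m - omega_l) * x**2 + omega_l)

def affine_parameter(z, omega_m, omega_l, n=2000):
    """lambda(z) in units of c/H0, integrating d(lambda) = c dz / [H (1+z)^2] (Eq. 2).

    Uses the composite Simpson rule; n must be even.
    """
    def integrand(zp):
        return 1.0 / (hubble_ratio(zp, omega_m, omega_l) * (1.0 + zp) ** 2)

    h = z / n
    total = integrand(0.0) + integrand(z)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * integrand(k * h)
    return total * h / 3.0

# Einstein-de Sitter check (Omega_m = 1, Omega_Lambda = 0), where the integral
# has the closed form lambda(z) = (2/5) [1 - (1+z)^(-5/2)] in units of c/H0.
lam_numeric = affine_parameter(1.0, 1.0, 0.0)
lam_exact = 0.4 * (1.0 - 2.0 ** (-2.5))
print(lam_numeric, lam_exact)
```

The Einstein-de Sitter case is used only because its integral is elementary; any other parameter choice goes through the same numerical path.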

We consider 𝒯 to consist of three separate components, 𝒯 = 𝒯_bg + 𝒯_sm + 𝒯_cl. The first is the optical tidal matrix of the homogeneous background universe, and is given by (see SSE)

$\begin{aligned} \mathcal{T}_{\rm bg} = -\frac{3}{2}\left(\frac{H_0}{c}\right)^2 \Omega_{\rm m}(1+z)^5\,\mathcal{I}, \end{aligned}$ (4)

where ℐ is the two-dimensional unit matrix. If we consider a light bundle with vertex at redshift z = 0 and affine parameter λ = 0, which is only subject to 𝒯_bg, then the solution of Eq. (1) is ξ(λ)=D(λ)θ, where θ is the angle that the light ray under consideration encloses with the fiducial ray at the vertex, and D(λ) is the solution of the differential equation

$\begin{aligned} \frac{\mathrm{d}^2 D(\lambda)}{\mathrm{d}\lambda^2} = -\frac{3}{2}\left(\frac{H_0}{c}\right)^2 \Omega_{\rm m}(1+z)^5\,D(\lambda), \end{aligned}$ (5)

with initial conditions D(0) = 0 and D′(0) = 1. Here, D is the angular-diameter distance of the homogeneous universe, as a function of affine parameter (or redshift).
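As a sanity check of Eq. (5), the equation can be integrated numerically and compared with a known distance-redshift relation. The sketch below assumes units with c/H₀ = 1 and uses z as the integration variable via Eq. (2). The Runge-Kutta scheme and the function name are illustrative choices. The Einstein-de Sitter model serves as the reference because its angular-diameter distance has the elementary form D = 2 [1 − (1+z)^(−1/2)] / (1+z).

```python
import math

def angular_diameter_distance(z_max, omega_m, omega_l, n=2000):
    """Integrate Eq. (5) with a fixed-step RK4 scheme.

    State is (D, P) with P = dD/d(lambda); distances are in units of c/H0.
    """
    def rhs(z, d, p):
        x = 1.0 + z
        e = math.sqrt(omega_m * x**3 + (1.0 - omega_m - omega_l) * x**2 + omega_l)
        dlam_dz = 1.0 / (e * x**2)                      # Eq. (2)
        return p * dlam_dz, -1.5 * omega_m * x**5 * d * dlam_dz

    h = z_max / n
    z, d, p = 0.0, 0.0, 1.0                             # D(0) = 0, D'(0) = 1
    for _ in range(n):
        k1 = rhs(z, d, p)
        k2 = rhs(z + h / 2, d + h / 2 * k1[0], p + h / 2 * k1[1])
        k3 = rhs(z + h / 2, d + h / 2 * k2[0], p + h / 2 * k2[1])
        k4 = rhs(z + h, d + h * k3[0], p + h * k3[1])
        d += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        p += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        z += h
    return d

# Compare with the Einstein-de Sitter closed form at z = 1.
d_numeric = angular_diameter_distance(1.0, 1.0, 0.0)
d_exact = 2.0 * (1.0 - 1.0 / math.sqrt(2.0)) / 2.0
print(d_numeric, d_exact)
```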

We consider two different kinds of inhomogeneities here. The first of them is related to small-scale density inhomogeneities, such as galaxies and their dark matter halos. For those, the optical tidal matrix 𝒯_cl is a strong function of position, and there is no longer a linear relation between enclosed angle θ and separation ξ of a light ray, for finite θ. In strong-lensing applications, we are typically interested only in that region of these small-scale inhomogeneities where multiple images can be formed, which corresponds to a few tens of kiloparsecs for galaxy lenses. These small-scale inhomogeneities will be considered explicitly as “main lenses” in the following. The second kind of inhomogeneities is due to the large-scale mass distribution of the universe. We assume that the corresponding gravitational field is sufficiently smooth, so that the tidal effects can be considered approximately constant across the region where strong-lensing effects occur. Hence we assume that over such a region, 𝒯_sm can be considered to depend only on λ, not on the actual position of the light ray. According to SSE, this contribution of the optical tidal matrix is

$\begin{aligned} \left(\mathcal{T}_{\rm sm}\right)_{ij} = -\frac{(1+z)^2}{c^2}\left(2\frac{\partial^2\phi}{\partial\xi_i\,\partial\xi_j} + \delta_{ij}\frac{\partial^2\phi}{\partial\xi_3^2}\right), \end{aligned}$ (6)

where ϕ is the Newtonian potential sourced by the density inhomogeneity, i.e., satisfying the (three-dimensional) Poisson equation $\nabla_\xi^2 \phi=4\pi G (\rho-\bar\rho)$, where $\bar\rho(z)$ is the mean matter density in the Universe, and we assumed that the light ray propagates in the ξ₃-direction.
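One property of Eq. (6) is easy to verify numerically: its trace equals −2(1+z)²∇²ϕ/c², so wherever ρ = ρ̄ the matrix is traceless and acts as a pure shear. The sketch below is a toy check using the potential of a point mass, ϕ = −GM/r, in units G = M = c = 1. This potential is chosen only because its Hessian is known in closed form; it is not the smooth large-scale field assumed in the text, and the function names are hypothetical.

```python
def point_mass_hessian(x, gm=1.0):
    """3x3 Hessian of phi = -GM/r at position x = (x1, x2, x3)."""
    r2 = sum(c * c for c in x)
    r3 = r2 ** 1.5
    r5 = r2 ** 2.5
    return [[gm * ((1.0 if i == j else 0.0) / r3 - 3.0 * x[i] * x[j] / r5)
             for j in range(3)] for i in range(3)]

def tidal_matrix_sm(hess, z, c=1.0):
    """2x2 matrix of Eq. (6); the ray is assumed to propagate along the third axis."""
    pref = -(1.0 + z) ** 2 / c**2
    return [[pref * (2.0 * hess[i][j] + (hess[2][2] if i == j else 0.0))
             for j in range(2)] for i in range(2)]

# Evaluate at an arbitrary point away from the mass.
T = tidal_matrix_sm(point_mass_hessian((1.0, 0.5, 2.0)), z=0.5)
print("trace:", T[0][0] + T[1][1])        # vanishes up to rounding
print("shear components:", T[0][0] - T[1][1], T[0][1])
```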

2.2. Generalized multi-plane lens equation

We consider a direction in the sky that has a set of main lenses along the line-of-sight to a distant source, located at redshifts z_i or affine parameters λ_i. As shown in SSE (see also Schneider 1997), the separation vector ξ(λ) then becomes

$\begin{aligned} \boldsymbol{\xi}(\lambda) = \mathcal{D}(\lambda)\,\boldsymbol{\theta} - \sum_i \mathcal{D}_i(\lambda)\left[\hat{\boldsymbol{\alpha}}_i(\boldsymbol{\xi}_i) - \hat{\boldsymbol{\alpha}}_i^0\right]\mathrm{H}(\lambda-\lambda_i), \end{aligned}$ (7)

where θ is the angle the light encloses with the fiducial ray at the observer. The distance matrix 𝒟(λ) solves the optical tidal equation

$\begin{aligned} \frac{\mathrm{d}^2\mathcal{D}(\lambda)}{\mathrm{d}\lambda^2} = \left[\mathcal{T}_{\rm bg}(\lambda) + \mathcal{T}_{\rm sm}(\lambda)\right]\mathcal{D}(\lambda), \end{aligned}$ (8)

with initial conditions 𝒟(0) = 0 and 𝒟′(0) = ℐ, and the distance matrices 𝒟_i(λ) solve the same differential equation, but with initial conditions 𝒟_i(λ_i) = 0 and (d𝒟_i/dλ)(λ_i) = (1+z_i)ℐ. They are the distance matrices which apply for light rays having their vertex at λ_i. In Eq. (7), H(λ − λ_i) is the Heaviside step function. The deflection angle $\hat{\boldsymbol{\alpha}}_i(\boldsymbol{\xi}_i)$ is given in terms of the surface mass density Σ_i(ξ_i) as

$\begin{aligned} \hat{\boldsymbol{\alpha}}_i(\boldsymbol{\xi}_i) = \frac{4G}{c^2}\int\mathrm{d}^2\xi'\;\Sigma_i(\boldsymbol{\xi}')\,\frac{\boldsymbol{\xi}_i-\boldsymbol{\xi}'}{|\boldsymbol{\xi}_i-\boldsymbol{\xi}'|^2}, \end{aligned}$ (9)

and $\hat{\boldsymbol{\alpha}}_i^0$ denotes the deflection angle of the fiducial ray in the ith lens plane. If we now perform the translation $\boldsymbol{\xi}(\lambda)=\tilde{\boldsymbol{\xi}}(\lambda)+\boldsymbol{\eta}(\lambda)$, with

$\boldsymbol{\eta}(\lambda) = \sum_i \mathcal{D}_i(\lambda)\,\hat{\boldsymbol{\alpha}}_i^0\,\mathrm{H}(\lambda-\lambda_i),$

then $\tilde{\boldsymbol{\xi}}$ satisfies Eq. (7) with the $\hat{\boldsymbol{\alpha}}_i^0$ set to zero. In the following, we always assume this (unobservable) translation, and drop the tilde on $\tilde{\boldsymbol{\xi}}$ henceforth. For the impact vectors ξ_j in the jth plane, we then obtain

$\begin{aligned} \boldsymbol{\xi}_j = \mathcal{D}(\lambda_j)\,\boldsymbol{\theta} - \sum_{i=1}^{j-1}\mathcal{D}_i(\lambda_j)\,\hat{\boldsymbol{\alpha}}_i(\boldsymbol{\xi}_i) \equiv \mathcal{D}_j\,\boldsymbol{\theta} - \sum_{i=1}^{j-1}\mathcal{D}_{ij}\,\hat{\boldsymbol{\alpha}}_i(\boldsymbol{\xi}_i). \end{aligned}$ (10)
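Equation (10) is evaluated plane by plane, since the deflection in plane i needs the impact vector ξ_i, which in turn depends on all earlier planes. A minimal sketch of this ray-tracing loop follows. The function `trace_ray`, its argument layout (0-based plane indices, a dictionary of pair matrices), and the numerical values in the example are assumptions made for illustration only.

```python
def mat_vec(m, v):
    """Apply a 2x2 matrix to a 2-vector."""
    return (m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1])

def trace_ray(theta, d_obs, d_pair, deflections):
    """Impact vectors in all planes from Eq. (10).

    d_obs[j]       : matrix D_j from the observer to plane j (0-based, last = source)
    d_pair[(i, j)] : matrix D_ij between planes i < j
    deflections[i] : callable xi -> alpha_hat_i(xi) for each lens plane
    """
    xis = []
    for j in range(len(d_obs)):
        xi = mat_vec(d_obs[j], theta)
        for i in range(j):
            bend = mat_vec(d_pair[(i, j)], deflections[i](xis[i]))
            xi = (xi[0] - bend[0], xi[1] - bend[1])
        xis.append(xi)
    return xis

# One lens plane plus a source plane, with made-up distance matrices and a
# linear toy deflection law alpha_hat(xi) = 0.5 * xi.
identity = [[1.0, 0.0], [0.0, 1.0]]
xis = trace_ray(
    theta=(1.0, 2.0),
    d_obs=[identity, [[2.0, 0.1], [0.1, 1.9]]],
    d_pair={(0, 1): [[1.2, 0.0], [0.0, 1.2]]},
    deflections=[lambda xi: (0.5 * xi[0], 0.5 * xi[1])],
)
print(xis)
```

The nested loop makes the cost quadratic in the number of planes, which is the property the later, more local form of the lens equation avoids.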

2.3. Calculation of the distance matrices

The distance matrices 𝒟_i(λ) depend on the large-scale matter distribution around the line-of-sight, which is dominated by dark matter and thus difficult to determine observationally. One possibility to estimate 𝒯_sm from observations is to assume that galaxies provide a good tracer of the total matter distribution on large scales. From an observed distribution of galaxies (or a particular kind of galaxies, like luminous red galaxies) around the line-of-sight, an estimate of the tidal field can be obtained; this is the strategy proposed in Collett et al. (2013) and Smith et al. (2014).

The propagation of light through the large-scale structure is the subject of cosmological weak lensing, or cosmic shear (see, e.g., Schneider 2006). In cosmic shear, one usually describes the propagation matrices in comoving coordinates. In order to connect the formulation given here, where the 𝒟_i relate angles to proper transverse separation vectors (which is appropriate, as the matter distribution of the main lenses – galaxies or clusters – is most conveniently described in physical scales), to that used in weak lensing, we show in the appendix that

$\begin{aligned} \mathcal{D}_i(\chi) = D_i(\chi)\,\mathcal{I} - \frac{2}{c^2}\int_{\chi_i}^{\chi}\mathrm{d}\chi'\;\frac{a(\chi)}{a(\chi')}\,f_k(\chi-\chi')\,\mathsf{H}(\phi(\chi'))\,\mathcal{D}_i(\chi'), \end{aligned}$ (11)

where χ is the comoving distance, D_i(χ) = a(χ) f_k(χ − χ_i), and f_k(χ) is the comoving angular diameter distance, which satisfies the differential equation

$\begin{aligned} \frac{\mathrm{d}^2 f_k(\chi)}{\mathrm{d}\chi^2} = -K\,f_k(\chi) \end{aligned}$ (12)

with initial conditions f_k(0) = 0 and df_k(0)/dχ = 1. Furthermore, $K=(\Omega_{\rm m}+\Omega_\Lambda-1)H_0^2/c^2$ is the spatial curvature of the universe, and H(ϕ) is the two-dimensional Hessian of the gravitational potential ϕ, evaluated in comoving transverse coordinates. Provided the perturbations are small, so that 𝒟_i deviates only slightly from D_iℐ, we can replace 𝒟_i(χ′) by D_i(χ′)ℐ in the integrand,

$\begin{aligned} \mathcal{D}_i(\chi) = D_i(\chi)\,\mathcal{I} - \frac{2}{c^2}\int_{\chi_i}^{\chi}\mathrm{d}\chi'\;\frac{a(\chi)}{a(\chi')}\,f_k(\chi-\chi')\,\mathsf{H}(\phi(\chi'))\,D_i(\chi'). \end{aligned}$ (13)

In weak lensing, this approximation is called “the neglect of lens-lens coupling”, and is often also termed the “Born approximation”. This approximation is very accurate and certainly sufficient for the purposes discussed in the current context¹. Thus, the deviation of 𝒟_i from D_iℐ is given as a line-of-sight integral over the tidal force field, with a distance-dependent weighting. We also note that in the approximation (13), the distance matrices 𝒟_i are symmetric, which is generally not the case if the exact expression (11) is used.
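The weight function f_k entering Eqs. (11) and (13) is fixed by Eq. (12) and its initial conditions, which give the familiar sine, linear, and hyperbolic-sine forms for positive, zero, and negative curvature. The short sketch below implements this closed form and checks it against Eq. (12) by finite differences; the function name and the test values of K are illustrative.

```python
import math

def f_k(chi, curvature):
    """Solution of Eq. (12) with f_k(0) = 0 and df_k/dchi(0) = 1."""
    if curvature > 0.0:
        s = math.sqrt(curvature)
        return math.sin(s * chi) / s
    if curvature < 0.0:
        s = math.sqrt(-curvature)
        return math.sinh(s * chi) / s
    return chi

def second_derivative(chi, curvature, h=1e-3):
    """Central finite-difference estimate of d^2 f_k / d chi^2."""
    return (f_k(chi + h, curvature) - 2.0 * f_k(chi, curvature)
            + f_k(chi - h, curvature)) / h**2

# Residual of Eq. (12), f_k'' + K f_k, for closed, flat, and open geometries.
for K in (0.3, 0.0, -0.3):
    print(K, second_derivative(0.8, K) + K * f_k(0.8, K))
```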

For statistical studies, instead of trying to obtain the tidal field along the line-of-sight toward the sources from observations of the galaxy distribution in those directions, one can also derive the probability distribution for the distance matrices from cosmological simulations, as has been done in Suyu et al. (2013).

3. Time-delay function and the iterative lens equation

In this section, we first derive the light travel time function (LTTF) corresponding to the generalized multi-plane lens equation considered in the previous section. An alternative form of the lens equation is then derived from Fermat’s principle, which in the current context states that the lens equation is equivalent to setting the gradient of the LTTF with respect to all impact vectors equal to zero (i.e., real light rays correspond to stationary points of the LTTF). Then we show that this iterative form of the lens equation is equivalent to Eq. (10).

3.1. Time-delay function

We first consider a single lens plane at z₁ with a source at z₂, in which case the lens equation reads

$\begin{aligned} \boldsymbol{\xi}_2 = \mathcal{D}_2\,\boldsymbol{\theta} - \mathcal{D}_{12}\,\hat{\boldsymbol{\alpha}}(\boldsymbol{\xi}_1) = \mathcal{D}_2\,\mathcal{D}_1^{-1}\boldsymbol{\xi}_1 - \mathcal{D}_{12}\,\hat{\boldsymbol{\alpha}}(\boldsymbol{\xi}_1), \end{aligned}$ (14)

where we set $\hat{\boldsymbol{\alpha}}(\boldsymbol{\xi}_1)\equiv\hat{\boldsymbol{\alpha}}_1(\boldsymbol{\xi}_1)$. We define the LTTF τ(ξ₁, ξ₂) to be the excess light travel time from a source at ξ₂ to the observer caused by the deflection in the lens plane at ξ₁. This excess travel time has two components, a geometrical one (a bent ray is longer than an unbent one), and a potential one caused by the retardation of photons in the gravitational potential of the deflector.

In standard lens theory, with unperturbed angular-diameter distances D_ij, the potential part of the time delay function takes the form cτ_pot = −(1 + z₁)(D₁D₂/D₁₂)ψ(θ), where ψ(θ) is the deflection potential, which satisfies $\nabla_{\boldsymbol{\theta}}\psi=\boldsymbol{\alpha}(\boldsymbol{\theta})\equiv (D_{12}/D_2)\,\hat{\boldsymbol{\alpha}}(D_1\boldsymbol{\theta})$. Now we define the potential $\hat\psi(\boldsymbol{\xi}_1)$ such that it satisfies $\nabla_{\boldsymbol{\xi}_1}\hat\psi=\hat{\boldsymbol{\alpha}}$. This potential is a multiple of ψ, i.e., $\hat\psi(\boldsymbol{\xi}_1)=k\,\psi(\boldsymbol{\xi}_1/D_1)$, up to an irrelevant additive constant. To find k, we consider

$\hat{\boldsymbol{\alpha}} = \nabla_{\boldsymbol{\xi}_1}\hat{\psi} = \frac{k}{D_1}\nabla_{\boldsymbol{\theta}}\psi = \frac{k}{D_1}\boldsymbol{\alpha} = \frac{k\,D_{12}}{D_1 D_2}\hat{\boldsymbol{\alpha}},$

yielding

$\begin{aligned} \hat{\psi}(\boldsymbol{\xi}_1) = \frac{D_1 D_2}{D_{12}}\,\psi(\boldsymbol{\xi}_1/D_1). \end{aligned}$ (15)

Hence, the potential part of the LTTF takes the form $c\tau_{\mathrm{pot}}=-(1+z_1)\,\hat\psi(\boldsymbol{\xi}_1)$. As expected, it does not depend on the cosmological distances, since it is caused solely by the local effect of propagating through a gravitational field, and the corresponding time interval is then redshifted by a factor (1 + z₁) to the observer. Accordingly, also in the case of perturbed light propagation between the lens planes, the potential part of the time delay must have the same form, since it is unaffected by propagation effects.
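The rescaling in Eq. (15) can be checked numerically for a concrete lens. The sketch below uses a singular isothermal sphere, for which ψ(θ) = θ_E |θ| and α(θ) = θ_E θ/|θ|, and verifies by finite differences that the gradient of ψ̂ with respect to ξ₁ reproduces α̂. The distance values, the Einstein angle, and all function names are arbitrary illustrative assumptions.

```python
import math

D1, D2, D12 = 0.8, 1.5, 1.0     # illustrative unperturbed distances, arbitrary units
THETA_E = 0.01                  # illustrative Einstein angle

def psi(theta):
    """Deflection potential of a singular isothermal sphere."""
    return THETA_E * math.hypot(theta[0], theta[1])

def alpha(theta):
    """Scaled deflection angle, the gradient of psi with respect to theta."""
    r = math.hypot(theta[0], theta[1])
    return (THETA_E * theta[0] / r, THETA_E * theta[1] / r)

def alpha_hat(xi):
    """Physical deflection, from alpha(theta) = (D12/D2) alpha_hat(D1 theta)."""
    a = alpha((xi[0] / D1, xi[1] / D1))
    return (D2 / D12 * a[0], D2 / D12 * a[1])

def psi_hat(xi):
    """Eq. (15)."""
    return D1 * D2 / D12 * psi((xi[0] / D1, xi[1] / D1))

def gradient(f, xi, h=1e-6):
    """Central finite-difference gradient of a scalar function of xi."""
    return ((f((xi[0] + h, xi[1])) - f((xi[0] - h, xi[1]))) / (2 * h),
            (f((xi[0], xi[1] + h)) - f((xi[0], xi[1] - h))) / (2 * h))

xi_test = (0.004, -0.003)
print(gradient(psi_hat, xi_test), alpha_hat(xi_test))
```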

We find an explicit expression for τ(ξ₁, ξ₂) by requiring that ∇_ξ₁τ(ξ₁, ξ₂)=0 is equivalent to the lens Eq. (14). This fixes τ up to a multiplicative constant and terms which depend solely on ξ₂. The multiplicative constant is fixed by the explicit expression given above for the potential part of τ, and the additive constant (which is irrelevant for time delay measurements) is fixed by requiring that the geometrical part of τ should vanish if the light ray is undeflected by the main lens. This then yields

$\begin{aligned} c\tau(\boldsymbol{\xi}_1,\boldsymbol{\xi}_2) =&\ \frac{1+z_1}{2}\left(\mathsf{C}_{12}\boldsymbol{\xi}_1 - \mathcal{D}_{12}^{-1}\boldsymbol{\xi}_2\right)^{\mathrm{t}}\mathsf{C}_{12}^{-1}\left(\mathsf{C}_{12}\boldsymbol{\xi}_1 - \mathcal{D}_{12}^{-1}\boldsymbol{\xi}_2\right) \\ &- (1+z_1)\,\hat{\psi}(\boldsymbol{\xi}_1), \end{aligned}$ (16)

where we defined

$\begin{aligned} \mathsf{C}_{ij} = \mathcal{D}_{ij}^{-1}\mathcal{D}_j\mathcal{D}_i^{-1}. \end{aligned}$ (17)

As shown in Schneider (1997) – see also Kovner (1987) – the matrix C_ij is symmetric. Indeed, we see that

$\begin{aligned} \nabla_{\boldsymbol{\xi}_1}\tau(\boldsymbol{\xi}_1,\boldsymbol{\xi}_2) = \frac{1+z_1}{c}\left(\mathsf{C}_{12}\boldsymbol{\xi}_1 - \mathcal{D}_{12}^{-1}\boldsymbol{\xi}_2 - \hat{\boldsymbol{\alpha}}\right) = 0 \end{aligned}$ (18)

is equivalent to the lens Eq. (14), as is easily verified by multiplying the foregoing expression by 𝒟₁₂ from the left. We also note that τ = 0 if $\hat{\psi}=0$ and if the ray is unbent, i.e., if $\boldsymbol{\xi}_2=\mathcal{D}_2\mathcal{D}_1^{-1}\boldsymbol{\xi}_1$, as we required for τ. McCully et al. (2014) obtained a somewhat different form for τ, which they showed to be equivalent to the expression given here. However, it must be pointed out that this equivalence applies only to physical light rays, i.e., those which satisfy the gravitational lensing Eq. (14). For those rays, we could write the light travel time as

$c\tau = \frac{1+z_1}{2}\,\hat{\boldsymbol{\alpha}}_1^{\mathrm{t}}\,\mathsf{C}_{12}^{-1}\,\hat{\boldsymbol{\alpha}}_1 - (1+z_1)\,\hat{\psi}(\boldsymbol{\xi}_1),$

where the lens Eq. (14) was used to eliminate ξ₂. However, Eq. (16) is more general, as it yields the light travel time for all kinematically possible rays, not only for those for which the bend by the main lenses equals the actual deflection angle, calculated as the gradient of $\hat\psi$. This more general form of τ is needed if the lens equation is to be derived from Fermat’s principle.
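The two statements made about Eq. (16), that its gradient with respect to ξ₁ vanishes on physical rays and that it reduces there to the expression in terms of the deflection angle alone, can be checked with a small numerical experiment. The sketch below is a toy setup: the distance matrices are invented, 𝒟₁₂ is constructed from a chosen symmetric C₁₂ so that Eq. (17) holds by design, and the lens is a quadratic potential ψ̂ = ½ ξᵗ Γ ξ with symmetric Γ, so that α̂ = Γξ. All names and numbers are assumptions for illustration.

```python
def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def mat_inv(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det], [-a[1][0] / det, a[0][0] / det]]

def mat_vec(a, v):
    return [a[0][0] * v[0] + a[0][1] * v[1], a[1][0] * v[0] + a[1][1] * v[1]]

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

Z1 = 0.5
D1 = [[1.00, 0.02], [0.02, 0.98]]
D2 = [[2.00, 0.05], [0.05, 1.95]]
C12 = [[1.60, 0.03], [0.03, 1.70]]                      # symmetric by construction
D12 = mat_mul(mat_mul(D2, mat_inv(D1)), mat_inv(C12))   # so that Eq. (17) returns C12
GAMMA = [[0.30, 0.10], [0.10, 0.20]]                    # toy lens: psi_hat = xi^t GAMMA xi / 2

def psi_hat(xi):
    return 0.5 * dot(xi, mat_vec(GAMMA, xi))

def alpha_hat(xi):
    return mat_vec(GAMMA, xi)

def c_tau(xi1, xi2):
    """Eq. (16), valid for any kinematically possible ray."""
    a = mat_vec(C12, xi1)
    b = mat_vec(mat_inv(D12), xi2)
    v = [a[0] - b[0], a[1] - b[1]]
    return (1.0 + Z1) * (0.5 * dot(v, mat_vec(mat_inv(C12), v)) - psi_hat(xi1))

# Physical ray: xi2 follows from the lens equation (14).
xi1 = [0.7, -0.4]
unbent = mat_vec(mat_mul(D2, mat_inv(D1)), xi1)
bend = mat_vec(D12, alpha_hat(xi1))
xi2 = [unbent[0] - bend[0], unbent[1] - bend[1]]

# Finite-difference gradient with respect to xi1 at fixed xi2 (Eq. 18).
h = 1e-5
grad = [(c_tau([xi1[0] + h, xi1[1]], xi2) - c_tau([xi1[0] - h, xi1[1]], xi2)) / (2 * h),
        (c_tau([xi1[0], xi1[1] + h], xi2) - c_tau([xi1[0], xi1[1] - h], xi2)) / (2 * h)]

# On-shell form written in terms of the deflection angle only.
a_hat = alpha_hat(xi1)
on_shell = (1.0 + Z1) * (0.5 * dot(a_hat, mat_vec(mat_inv(C12), a_hat)) - psi_hat(xi1))
print(grad, c_tau(xi1, xi2), on_shell)
```

Moving ξ₂ away from the lens-equation value makes the gradient non-zero while Eq. (16) still returns a travel time, which is the sense in which it covers all kinematically possible rays.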

In case of several main lens planes, the LTTF is obtained by considering the replacement of the actual light ray by successively straighter rays, i.e., by removing the bends of the ray and the gravitational potentials they traverse (see Sect. 9.2 of Schneider et al. 1992). Removing the bend and deflection potential in the first plane leads to the contribution (16) of the LTTF. Subsequent removal of the bend and potential in the second lens plane yields a similar contribution, with the indices (1,2) replaced by (2,3). Iterating this consideration, we obtain for the general case

$\begin{aligned} c\tau(\boldsymbol{\xi}_1,\boldsymbol{\xi}_2,\dots,\boldsymbol{\xi}_{N+1}) =&\ \sum_{i=1}^{N}(1+z_i)\,\Biggl[\frac{1}{2}\left(\mathsf{C}_{i,i+1}\boldsymbol{\xi}_i - \mathcal{D}_{i,i+1}^{-1}\boldsymbol{\xi}_{i+1}\right)^{\mathrm{t}} \\ &\times \mathsf{C}_{i,i+1}^{-1}\left(\mathsf{C}_{i,i+1}\boldsymbol{\xi}_i - \mathcal{D}_{i,i+1}^{-1}\boldsymbol{\xi}_{i+1}\right) - \hat{\psi}_i(\boldsymbol{\xi}_i)\Biggr] \end{aligned}$ (19)

as the sum over terms of the form (16) for the individual planes.

Any ray connecting the source at ξ_{N+1} and the observer is fully characterized by the impact vectors ξ_i, 1 ≤ i ≤ N in the lens planes, since between the planes, it follows the propagation Eq. (1) whose solution is uniquely determined by the two impact vectors at consecutive planes. Therefore, the actual light rays are singled out as those for which the LTTF is stationary, with respect to variations of the impact vectors in the N lens planes. Thus, the lens equation is obtained by setting the derivative of τ with respect to the ξ_j equal to zero. For each j ≥ 2, two terms of the above sum contribute, namely the terms i = j and i = j − 1. We obtain

$\begin{matrix} \nabla_{ξ_{j}} (c τ) = & (1 + z_{j}) [C_{j, j + 1} ξ_{j} - D_{j, j + 1}^{- 1} ξ_{j + 1} - {\hat{α}}_{j} (ξ_{j})] \\ + (1 + z_{j - 1}) {(D_{j - 1, j}^{- 1})}^{t} (C_{j - 1, j}^{- 1} D_{j - 1, j}^{- 1} ξ_{j} - ξ_{j - 1}) \\ = & 0, \end{matrix}$ $\begin{aligned} \nabla _{{\boldsymbol{\xi }}_j}(c\tau ) =&(1+z_j)\left[{\mathsf{C }}_{j,j+1}{\boldsymbol{\xi }}_j - {\mathcal{D}}_{j,j+1}^{-1}{\boldsymbol{\xi }}_{j+1}-\hat{{\boldsymbol{\alpha }}}_j({\boldsymbol{\xi }}_j) \right] \nonumber \\&+(1+z_{j-1})\left( {\mathcal{D}}_{j-1,j}^{-1} \right)^\mathrm{t} \left( {\mathsf{C }}_{j-1,j}^{-1}{\mathcal{D}}_{j-1,j}^{-1}{\boldsymbol{\xi }}_j-{\boldsymbol{\xi }}_{j-1} \right)\nonumber \\ =&0, \end{aligned}$ (20)

or

$\begin{matrix} ξ_{j + 1} = & D_{j, j + 1} [C_{j, j + 1} + \frac{1 + z_{j - 1}}{1 + z_{j}} {(D_{j - 1, j}^{- 1})}^{t} C_{j - 1, j}^{- 1} D_{j - 1, j}^{- 1}] ξ_{j} \\ - \frac{1 + z_{j - 1}}{1 + z_{j}} D_{j, j + 1} {(D_{j - 1, j}^{- 1})}^{t} ξ_{j - 1} - D_{j, j + 1} {\hat{α}}_{j} (ξ_{j}) . \end{matrix}$ $\begin{aligned} {\boldsymbol{\xi }}_{j+1} =&{\mathcal{D}}_{j,j+1}\left[{\mathsf{C }}_{j,j+1} +{1+z_{j-1}\over 1+z_j}\left( {\mathcal{D}}_{j-1,j}^{-1} \right)^\mathrm{t}{\mathsf{C }}_{j-1,j}^{-1}{\mathcal{D}}_{j-1,j}^{-1} \right]{\boldsymbol{\xi }}_j \nonumber \\&- {1+z_{j-1}\over 1+z_j}{\mathcal{D}}_{j,j+1}\left( {\mathcal{D}}_{j-1,j}^{-1} \right)^\mathrm{t}{\boldsymbol{\xi }}_{j-1}-{\mathcal{D}}_{j,j+1}\hat{{\boldsymbol{\alpha }}}_j({\boldsymbol{\xi }}_j). \end{aligned}$ (21)

This equation relates the position vectors in three consecutive planes to the deflection angle in the middle plane, quite in contrast to the lens Eq. (10) which contains all impact vectors ξ_i for a given ξ_j, 1 ≤ i ≤ j − 1. Hence, this new lens equation is more “local” than the original one.
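The stationarity argument can be illustrated numerically in the scalar limit, where all distance matrices reduce to angular-diameter distances. The following is a minimal sketch, not part of the original derivation: it assumes a flat Einstein-de Sitter model for the distances, a toy cored deflection potential, and hypothetical redshifts, and it uses the scalar form C_{i,i+1} = D_{i+1}/(D_i D_{i,i+1}) implied by the relation between C and 𝒟 quoted in the text. A ray is traced with the standard lens equation (10), and the finite-difference gradient of the LTTF (19) with respect to the impact parameters in the lens planes is evaluated along it.

```python
import math

def chi(z):
    """Comoving distance for a flat Einstein-de Sitter model, units of c/H0 (assumption)."""
    return 2.0 * (1.0 - 1.0 / math.sqrt(1.0 + z))

def dist(zi, zj):
    """Scalar angular-diameter distance D_ij in a flat universe."""
    return (chi(zj) - chi(zi)) / (1.0 + zj)

# Hypothetical set-up: index 0 = observer, 1..N = lens planes, N+1 = source plane.
Z = [0.0, 0.3, 0.7, 1.5]
N = len(Z) - 2
AMP = [0.0, 0.3, 0.2]   # toy lens strengths (AMP[0] unused)
CORE = 0.05             # toy core scale

def D(i, j):
    return dist(Z[i], Z[j])

def C(i):
    """Scalar C_{i,i+1} = D_{i+1} / (D_i D_{i,i+1})."""
    return D(0, i + 1) / (D(0, i) * D(i, i + 1))

def psi(i, x):
    """Toy deflection potential of plane i."""
    return AMP[i] * math.sqrt(CORE**2 + x * x)

def alpha(i, x):
    """Deflection angle, the derivative of psi."""
    return AMP[i] * x / math.sqrt(CORE**2 + x * x)

def ctau(xi):
    """Scalar LTTF of Eq. (19) for impact parameters xi[1..N+1]."""
    total = 0.0
    for i in range(1, N + 1):
        u = C(i) * xi[i] - xi[i + 1] / D(i, i + 1)
        total += (1.0 + Z[i]) * (0.5 * u * u / C(i) - psi(i, xi[i]))
    return total

def trace(theta):
    """Ray from the standard multi-plane lens equation (scalar Eq. 10)."""
    xi = [0.0] * (N + 2)
    for j in range(1, N + 2):
        xi[j] = D(0, j) * theta - sum(D(i, j) * alpha(i, xi[i]) for i in range(1, j))
    return xi

def grad(xi, j, h=1e-6):
    """Central finite difference of ctau with respect to xi[j]."""
    up, dn = xi[:], xi[:]
    up[j] += h
    dn[j] -= h
    return (ctau(up) - ctau(dn)) / (2.0 * h)

ray = trace(0.5)
print([grad(ray, j) for j in range(1, N + 1)])  # entries should vanish up to rounding
```

Displacing one of the intermediate impact parameters away from the traced ray makes the gradient non-zero, as expected for a path that is not a physical light ray.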

In the following we explicitly show that these two forms of the lens equation are equivalent. For this, we first need to derive a general relation between distance matrices.

3.2. A relation between distance matrices

Consider the pair of light rays sketched in Fig. 1, where the first has a vertex at λ_q and encloses an angle θ with the fiducial ray. At the affine parameter λ_s, its separation vector from the fiducial ray is ξ_s. The second light ray has its vertex at λ_r and intersects the first ray at λ_s; this then specifies its direction ϑ relative to the fiducial ray. At the intersection point, the two rays enclose an angle φ. From the geometry of the figure, we find

$\begin{aligned} \boldsymbol{\theta} &= \mathcal{D}_{qr}^{-1}\boldsymbol{\xi}_r = \mathcal{D}_{qs}^{-1}\boldsymbol{\xi}_s = \mathcal{D}_{qt}^{-1}\boldsymbol{\xi}_t; \\ \boldsymbol{\vartheta} &= \mathcal{D}_{rt}^{-1}\left(\boldsymbol{\xi}_t + \Delta\boldsymbol{\xi}_t\right) = \mathcal{D}_{rs}^{-1}\boldsymbol{\xi}_s. \end{aligned}$ (22)

Fig. 1.

Sketch of two light rays through four consecutive planes, with 0 ≤ λ_q < λ_r < λ_s < λ_t. The rays are not deflected in the lens planes. The first ray has its vertex at λ_q and encloses an angle θ with the fiducial ray; the second ray with vertex at λ_r encloses an angle ϑ with the fiducial ray. Both rays intersect at λ_s. The geometry of this figure yields the relation (25) between distance matrices.

We use the latter equation to derive a relation between the 𝒟’s, by expressing all vectors in terms of ξ_s. Using the first of Eqs. (22), we get $\boldsymbol{\xi}_t=\mathcal{D}_{qt}\mathcal{D}_{qs}^{-1}\boldsymbol{\xi}_s$. Furthermore, the figure shows that Δξ_t = 𝒟_stφ. On the other hand, ξ_r = −𝒟_s(λ_r)φ.

We now have to relate 𝒟_s(λ_r), the backward extension of the solution 𝒟_s(λ) of Eq. (8), to 𝒟_rs. For that, we consider two solutions of Eq. (8), 𝒟_r(λ) and 𝒟_s(λ), with their appropriate initial conditions at λ_r and λ_s, respectively, and define the matrix

$\begin{aligned} \mathsf{W}(\lambda) = \frac{\mathrm{d}\mathcal{D}_r^{\mathrm{t}}}{\mathrm{d}\lambda}(\lambda)\,\mathcal{D}_s(\lambda) - \mathcal{D}_r^{\mathrm{t}}(\lambda)\,\frac{\mathrm{d}\mathcal{D}_s}{\mathrm{d}\lambda}(\lambda), \end{aligned}$ (23)

which is the Wronskian of the differential Eq. (8); here, the superscript “t” denotes the transpose of a matrix. The derivative of W vanishes, due to Eq. (8); hence, W is a constant. Evaluating Eq. (23) at λ = λ_r and making use of the initial conditions of 𝒟_r(λ) yields 𝖶 = (1 + z_r) 𝒟_s(λ_r). Similarly, at λ = λ_s we find $\mathsf{W}=-(1+z_s)\,\mathcal{D}_r^{\mathrm{t}}(\lambda_s)=-(1+z_s)\,\mathcal{D}_{rs}^{\mathrm{t}}$. Thus, we obtain

$\begin{aligned} \mathcal{D}_s(\lambda_r) = -\frac{1+z_s}{1+z_r}\,\mathcal{D}_{rs}^{\mathrm{t}}, \end{aligned}$ (24)

which is Etherington’s theorem in matrix form (Etherington 1933). With this relation, we then find that

$\Delta\boldsymbol{\xi}_t = \mathcal{D}_{st}\boldsymbol{\varphi} = -\mathcal{D}_{st}\mathcal{D}_s^{-1}(\lambda_r)\,\boldsymbol{\xi}_r = \frac{1+z_r}{1+z_s}\,\mathcal{D}_{st}\left(\mathcal{D}_{rs}^{\mathrm{t}}\right)^{-1}\boldsymbol{\xi}_r.$

Using Eq. (22) and collecting terms,

$\mathcal{D}_{rt}\mathcal{D}_{rs}^{-1}\boldsymbol{\xi}_s = \boldsymbol{\xi}_t + \Delta\boldsymbol{\xi}_t = \mathcal{D}_{qt}\mathcal{D}_{qs}^{-1}\boldsymbol{\xi}_s + \frac{1+z_r}{1+z_s}\,\mathcal{D}_{st}\left(\mathcal{D}_{rs}^{\mathrm{t}}\right)^{-1}\mathcal{D}_{qr}\mathcal{D}_{qs}^{-1}\boldsymbol{\xi}_s$

follows. Since this relation is valid for all ξ_s, a general relation between distance matrices is obtained:

$\begin{aligned} \frac{1+z_r}{1+z_s}\,\mathcal{D}_{st}\left(\mathcal{D}_{rs}^{\mathrm{t}}\right)^{-1}\mathcal{D}_{qr} = \mathcal{D}_{rt}\mathcal{D}_{rs}^{-1}\mathcal{D}_{qs} - \mathcal{D}_{qt}, \end{aligned}$ (25)

where we multiplied the resulting equation by 𝒟_qs from the right². This result generalizes the corresponding relation in Schneider (2016) derived for the case that the distance matrices 𝒟 reduce to scalars.
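In the scalar limit, relation (25) can be checked directly. The sketch below is an illustration under stated assumptions: the distances are those of a flat Einstein-de Sitter model (an assumed choice, any flat model would serve), and the function names and redshifts are hypothetical.

```python
import math

def chi(z):
    """Comoving distance for a flat Einstein-de Sitter model, units of c/H0 (assumption)."""
    return 2.0 * (1.0 - 1.0 / math.sqrt(1.0 + z))

def dist(zi, zj):
    """Scalar angular-diameter distance D_ij in a flat universe."""
    return (chi(zj) - chi(zi)) / (1.0 + zj)

def residual_eq25(zq, zr, zs, zt):
    """Left-hand side minus right-hand side of the scalar version of Eq. (25)."""
    lhs = (1.0 + zr) / (1.0 + zs) * dist(zs, zt) * dist(zq, zr) / dist(zr, zs)
    rhs = dist(zr, zt) * dist(zq, zs) / dist(zr, zs) - dist(zq, zt)
    return lhs - rhs

# Hypothetical redshifts with z_q < z_r < z_s < z_t.
print(residual_eq25(0.0, 0.4, 0.9, 2.0))  # should vanish up to rounding
```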

Indeed, a relation of this kind is expected to hold: Consider λ_t ≡ λ as a variable. The two matrix-valued functions 𝒟_q(λ) and 𝒟_r(λ) are linearly independent solutions of the transport Eq. (8), provided λ_r ≠ λ_q. Therefore, the solution 𝒟_s(λ) can be written as a linear combination of the other two. This combination should be of the form

$\begin{aligned} \mathcal{D}_s(\lambda) = \left[\mathcal{D}_r(\lambda)\mathcal{D}_r^{-1}(\lambda_s) - \mathcal{D}_q(\lambda)\mathcal{D}_q^{-1}(\lambda_s)\right]\mathsf{X}, \end{aligned}$ (26)

which satisfies one of the initial conditions, 𝒟_s(λ_s) = 0. The matrix X is determined from the second initial condition; our result (25) shows that

$\begin{aligned} \mathcal{D}_s(\lambda) = \left[\mathcal{D}_r(\lambda)\mathcal{D}_r^{-1}(\lambda_s) - \mathcal{D}_q(\lambda)\mathcal{D}_q^{-1}(\lambda_s)\right]\frac{1+z_s}{1+z_r}\,\mathcal{D}_{qs}\mathcal{D}_{qr}^{-1}\mathcal{D}_{rs}^{\mathrm{t}}. \end{aligned}$ (27)

3.3. Equivalence of Eqs. (10) and (21)

We shall now show that the two forms (10) and (21) of the lens equation are equivalent. As a first step, we rewrite Eq. (21) in a form that admits a simple geometrical interpretation. Specializing Eq. (25) to q = 0, r = j − 1, s = j, t = j + 1 yields

$\begin{aligned} \frac{1+z_{j-1}}{1+z_j}\,\mathcal{D}_{j,j+1}\left(\mathcal{D}_{j-1,j}^{\mathrm{t}}\right)^{-1}\mathcal{D}_{j-1} = \mathcal{D}_{j-1,j+1}\mathcal{D}_{j-1,j}^{-1}\mathcal{D}_j - \mathcal{D}_{j+1}. \end{aligned}$ (28)

We next consider the prefactor of ξ_j in Eq. (21). Using $\mathsf{C}_{j-1,j}^{-1}\,\mathcal{D}_{j-1,j}^{-1}=\mathcal{D}_{j-1}\,\mathcal{D}_j^{-1}$, which is obtained from the definition (17) of C, we find that this prefactor becomes

$\begin{aligned} \mathcal{D}_{j+1}\mathcal{D}_j^{-1} + \frac{1+z_{j-1}}{1+z_j}\mathcal{D}_{j,j+1}\left(\mathcal{D}_{j-1,j}^{-1}\right)^{\mathrm{t}}\mathcal{D}_{j-1}\mathcal{D}_j^{-1} = \mathcal{D}_{j-1,j+1}\mathcal{D}_{j-1,j}^{-1}, \end{aligned}$ (29)

where in the last step we made use of Eq. (28) and the fact that the inversion and transposition operations on a matrix commute. Thus, we can rewrite Eq. (21) in the form

$\begin{aligned} \boldsymbol{\xi}_{j+1} = {}& \mathcal{D}_{j-1,j+1}\mathcal{D}_{j-1,j}^{-1}\boldsymbol{\xi}_j \\ & - \frac{1+z_{j-1}}{1+z_j}\mathcal{D}_{j,j+1}\left(\mathcal{D}_{j-1,j}^{-1}\right)^{\mathrm{t}}\boldsymbol{\xi}_{j-1} - \mathcal{D}_{j,j+1}\hat{\boldsymbol{\alpha}}_j(\boldsymbol{\xi}_j). \end{aligned}$ (30)

Fig. 2.

Propagation of a light ray (thick bent line) between three consecutive planes. The vertical line is the optical axis, with respect to which the separation vectors ξ are measured. The geometry of this figure yields the lens Eq. (30) – see text.

We note that this generalizes equation (4.47) of SSE to the case of general distance matrices between main lens planes. This form of the lens equation can be immediately interpreted geometrically. For this, we consider Fig. 2, from which we read off

$\boldsymbol{\eta} + \boldsymbol{\xi}_{j+1} = \mathcal{D}_{j-1,j+1}\mathcal{D}_{j-1,j}^{-1}\boldsymbol{\xi}_j - \mathcal{D}_{j,j+1}\hat{\boldsymbol{\alpha}}_j.$

Furthermore, $\boldsymbol{\eta} = \mathcal{D}_{j,j+1}\boldsymbol{\varphi}$; on the other hand, $\boldsymbol{\xi}_{j-1} = -\mathcal{D}_{j,j-1}\boldsymbol{\varphi}$, where $\mathcal{D}_{j,j-1} \equiv \mathcal{D}_j(\lambda_{j-1})$ is the backward extension of 𝒟_j(λ). Eliminating φ from these two relations and making use of Eq. (24), we find

$\boldsymbol{\eta} = \frac{1+z_{j-1}}{1+z_j}\,\mathcal{D}_{j,j+1}\left(\mathcal{D}_{j-1,j}^{-1}\right)^{\mathrm{t}}\boldsymbol{\xi}_{j-1}.$

Together, these two equations reproduce Eq. (30). In this form, the equation not only is confined to three consecutive lens planes, but all distance matrices occurring here are those between these three planes.

In order to show the equivalence of Eqs. (10) and (21), it is useful to rewrite the prefactor of $\boldsymbol{\xi}_{j-1}$ in Eq. (30) in a different form. Making use again of Eq. (28), we obtain

$\begin{aligned} \boldsymbol{\xi}_{j+1} = {}& \mathcal{D}_{j-1,j+1}\mathcal{D}_{j-1,j}^{-1}\boldsymbol{\xi}_j \\ & + \left(\mathcal{D}_{j+1}\mathcal{D}_{j-1}^{-1} - \mathcal{D}_{j-1,j+1}\mathcal{D}_{j-1,j}^{-1}\mathcal{D}_j\mathcal{D}_{j-1}^{-1}\right)\boldsymbol{\xi}_{j-1} \\ & - \mathcal{D}_{j,j+1}\hat{\boldsymbol{\alpha}}_j(\boldsymbol{\xi}_j). \end{aligned}$ (31)

We prove the equivalence by induction; for j = 1, this equivalence is seen from Eq. (18). Hence we assume that it is true for all planes up to j. Then, taking the difference between Eq. (10) for $\boldsymbol{\xi}_{j+1}$ and Eq. (31),

$\begin{aligned} \Delta = {}& \mathcal{D}_{j-1,j+1}\mathcal{D}_{j-1,j}^{-1}\boldsymbol{\xi}_j \\ & + \left(\mathcal{D}_{j+1}\mathcal{D}_{j-1}^{-1} - \mathcal{D}_{j-1,j+1}\mathcal{D}_{j-1,j}^{-1}\mathcal{D}_j\mathcal{D}_{j-1}^{-1}\right)\boldsymbol{\xi}_{j-1} - \mathcal{D}_{j,j+1}\hat{\boldsymbol{\alpha}}_j(\boldsymbol{\xi}_j) \\ & - \mathcal{D}_{j+1}\,\boldsymbol{\theta} + \sum_{i=1}^{j}\mathcal{D}_{i,j+1}\,\hat{\boldsymbol{\alpha}}_i(\boldsymbol{\xi}_i), \end{aligned}$ (32)

we need to show that Δ = 0. We first replace $\boldsymbol{\xi}_{j-1}$ and ξ_j by their expressions from Eq. (10), which holds because of the induction assumption,

$\begin{aligned} \Delta = {}& \mathcal{D}_{j-1,j+1}\mathcal{D}_{j-1,j}^{-1}\left(\mathcal{D}_j\boldsymbol{\theta} - \sum_{i=1}^{j-1}\mathcal{D}_{i,j}\,\hat{\boldsymbol{\alpha}}_i(\boldsymbol{\xi}_i)\right) \\ & + \left(\mathcal{D}_{j+1} - \mathcal{D}_{j-1,j+1}\mathcal{D}_{j-1,j}^{-1}\mathcal{D}_j\right)\left(\boldsymbol{\theta} - \sum_{i=1}^{j-2}\mathcal{D}_{j-1}^{-1}\mathcal{D}_{i,j-1}\,\hat{\boldsymbol{\alpha}}_i(\boldsymbol{\xi}_i)\right) \\ & - \mathcal{D}_{j,j+1}\hat{\boldsymbol{\alpha}}_j(\boldsymbol{\xi}_j) - \mathcal{D}_{j+1}\,\boldsymbol{\theta} + \sum_{i=1}^{j}\mathcal{D}_{i,j+1}\,\hat{\boldsymbol{\alpha}}_i(\boldsymbol{\xi}_i). \end{aligned}$ (33)

From this equation, one sees immediately that the terms ∝θ cancel each other. Second, the two terms $\propto\hat{\boldsymbol{\alpha}}_j$ add up to zero. Third, the sum of the two terms $\propto\hat{\boldsymbol{\alpha}}_{j-1}$ is also zero. Thus, what remains to be shown is that the prefactor of the terms $\propto\hat{\boldsymbol{\alpha}}_i$,

$\begin{aligned} K_i = {}& \mathcal{D}_{i,j+1} - \mathcal{D}_{j-1,j+1}\mathcal{D}_{j-1,j}^{-1}\mathcal{D}_{i,j} \\ & - \left(\mathcal{D}_{j+1} - \mathcal{D}_{j-1,j+1}\mathcal{D}_{j-1,j}^{-1}\mathcal{D}_j\right)\mathcal{D}_{j-1}^{-1}\mathcal{D}_{i,j-1}, \end{aligned}$ (34)

vanishes for i ≤ j − 2. For this, we consider again Eq. (25), setting r = j − 1, s = j, t = j + 1, once with q = 0, and once with q = i. This then yields

$\begin{aligned} \frac{1+z_{j-1}}{1+z_j}\,\mathcal{D}_{j,j+1}\left(\mathcal{D}_{j-1,j}^{\mathrm{t}}\right)^{-1} &= \left(\mathcal{D}_{j-1,j+1}\mathcal{D}_{j-1,j}^{-1}\mathcal{D}_j - \mathcal{D}_{j+1}\right)\mathcal{D}_{j-1}^{-1} \\ &= \left(\mathcal{D}_{j-1,j+1}\mathcal{D}_{j-1,j}^{-1}\mathcal{D}_{i,j} - \mathcal{D}_{i,j+1}\right)\mathcal{D}_{i,j-1}^{-1}. \end{aligned}$ (35)

After multiplying by 𝒟_i,j−1, we see that the final equality shows that K_i = 0, which proves that Δ = 0, and thus the equivalence of the two forms of the lens equation. The equation K_i = 0 itself provides an interesting relation between distance matrices.
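The equivalence can also be illustrated numerically in the scalar limit. The following sketch rests on assumptions that are not part of the text: flat Einstein-de Sitter distances, a toy deflection law, and hypothetical redshifts. It traces one ray with the standard lens equation (10) and one with the three-plane recursion (30), seeded with ξ_1 = D_1 θ and the single-plane lens equation for ξ_2, and compares the impact parameters plane by plane.

```python
import math

def chi(z):
    """Comoving distance for a flat Einstein-de Sitter model, units of c/H0 (assumption)."""
    return 2.0 * (1.0 - 1.0 / math.sqrt(1.0 + z))

def dist(zi, zj):
    """Scalar angular-diameter distance D_ij in a flat universe."""
    return (chi(zj) - chi(zi)) / (1.0 + zj)

def alpha(i, x):
    """Toy deflection law for plane i (hypothetical)."""
    return 0.1 * (1.0 + 0.5 * i) * x / math.sqrt(0.05**2 + x * x)

def trace_standard(theta, Z):
    """Scalar Eq. (10): xi_j = D_j theta - sum_{i<j} D_ij alpha_i(xi_i)."""
    n = len(Z) - 1
    xi = [0.0] * (n + 1)
    for j in range(1, n + 1):
        xi[j] = dist(Z[0], Z[j]) * theta - sum(
            dist(Z[i], Z[j]) * alpha(i, xi[i]) for i in range(1, j))
    return xi

def trace_recursive(theta, Z):
    """Scalar Eq. (30): xi_{j+1} from xi_j, xi_{j-1} and alpha_j only."""
    n = len(Z) - 1
    xi = [0.0] * (n + 1)
    xi[1] = dist(Z[0], Z[1]) * theta
    if n >= 2:
        xi[2] = dist(Z[0], Z[2]) * theta - dist(Z[1], Z[2]) * alpha(1, xi[1])
    for j in range(2, n):
        d_prev = dist(Z[j - 1], Z[j])
        d_next = dist(Z[j], Z[j + 1])
        xi[j + 1] = (dist(Z[j - 1], Z[j + 1]) / d_prev * xi[j]
                     - (1.0 + Z[j - 1]) / (1.0 + Z[j]) * d_next / d_prev * xi[j - 1]
                     - d_next * alpha(j, xi[j]))
    return xi

# Hypothetical configuration: observer, four lens planes, one source plane.
Z = [0.0, 0.3, 0.6, 1.0, 1.5, 2.2]
a, b = trace_standard(0.4, Z), trace_recursive(0.4, Z)
print(max(abs(x - y) for x, y in zip(a, b)))  # should vanish up to rounding
```

The recursive form needs only the two previous impact parameters and the distances between three consecutive planes, which is the locality property noted for Eq. (21).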

4. Mass-sheet transformation

In standard gravitational lensing, with a single deflector between the source and observer, there is a transformation of the mass distribution of the lens that keeps most observables invariant: the mass-sheet transformation (MST; see Falco et al. 1985). Since this transformation is accompanied by a uniform isotropic scaling in the source plane, all magnifications are scaled by the same factor, so that magnification (and thus observable flux) ratios are unchanged. The MST changes the product of time delay and Hubble constant, though, and the corresponding degeneracy can thus be broken by measuring the time delay in lens systems, assuming the Hubble constant to be known from other cosmological observations (see Schneider & Sluse 2013, and references therein). S14 has recently shown that an MST also exists in the case of two lens planes and two source planes. In this section, we show that such an MST also exists for perturbed gravitational lens systems, as considered in this paper.

4.1. Single main lens plane

We start with the case of a single lens plane, using the lens Eq. (14), and modify the deflection angle $\hat{\boldsymbol{\alpha}}_1(\boldsymbol{\xi}_1)$ to the new form

$\begin{aligned} \hat{\boldsymbol{\alpha}}_1^\prime(\boldsymbol{\xi}_1) = \lambda\hat{\boldsymbol{\alpha}}_1(\boldsymbol{\xi}_1) + \mathsf{G}_1\,\boldsymbol{\xi}_1, \end{aligned}$ (36)

where λ is a real number, and G₁ is a matrix³. Throughout this section, a prime denotes a mass-sheet transformed quantity. Thus, the modified deflection angle is a scaled version of the original one, plus a term linear in the impact vector. If G₁ is symmetric, this linear term corresponds to a tidal matrix, i.e., adding a uniform mass sheet to the scaled lens mass distribution, plus an external shear. The modified lens equation then becomes

$\begin{aligned} \boldsymbol{\xi}_2^\prime = \mathcal{D}_2\,\boldsymbol{\theta} - \mathcal{D}_{12}\,\hat{\boldsymbol{\alpha}}_1^\prime(\boldsymbol{\xi}_1) = \mathcal{D}_2\,\boldsymbol{\theta} - \mathcal{D}_{12}\left[\lambda\hat{\boldsymbol{\alpha}}_1(\boldsymbol{\xi}_1) + \mathsf{G}_1\,\mathcal{D}_1\boldsymbol{\theta}\right]. \end{aligned}$ (37)

As for the original MST, we require that the modified impact vector $\boldsymbol{\xi}_2^\prime$ is related to the original one by a uniform, isotropic scaling, $\boldsymbol{\xi}_2^\prime=\nu_2\boldsymbol{\xi}_2$, where ν₂ is the scaling factor. Thus we require

$\begin{aligned} \mathcal{D}_2\,\boldsymbol{\theta} - \mathcal{D}_{12}\left[\lambda\hat{\boldsymbol{\alpha}}_1(\boldsymbol{\xi}_1) + \mathsf{G}_1\,\mathcal{D}_1\boldsymbol{\theta}\right] = \nu_2\left[\mathcal{D}_2\,\boldsymbol{\theta} - \mathcal{D}_{12}\,\hat{\boldsymbol{\alpha}}_1(\boldsymbol{\xi}_1)\right]. \end{aligned}$ (38)

In order to have the terms $\propto\hat{\boldsymbol{\alpha}}_1$ equal on both sides of Eq. (38), we need to set ν₂ = λ, as is also the case for the MST in standard lensing: the scaling of the source plane (here plane number 2) is the same as that of the deflection angle. The remaining terms are all ∝θ, and setting them equal on both sides leads to 𝒟₂ − 𝒟₁₂𝖦₁𝒟₁ = λ𝒟₂, or

$\begin{aligned} \mathsf{G}_1 = (1-\lambda)\mathcal{D}_{12}^{-1}\mathcal{D}_2\mathcal{D}_1^{-1} = (1-\lambda)\mathsf{C}_{12}. \end{aligned}$ (39)

Since C₁₂ is symmetric (see Schneider 1997), G₁ is indeed a tidal matrix. This single-main plane MST was also derived by McCully et al. (2014). Thus, in a generalized gravitational lens situation, the MST requires a shear in addition to a uniform mass sheet⁴.
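For scalar distances, the single-plane result can be verified in a few lines. The sketch below is an illustration only: the distances are taken from a flat Einstein-de Sitter model, and the deflection law, redshifts, and value of λ are hypothetical. It applies Eq. (36) with G₁ from Eq. (39) and compares the transformed source position with λ times the original one.

```python
import math

def chi(z):
    """Comoving distance for a flat Einstein-de Sitter model, units of c/H0 (assumption)."""
    return 2.0 * (1.0 - 1.0 / math.sqrt(1.0 + z))

def dist(zi, zj):
    """Scalar angular-diameter distance D_ij in a flat universe."""
    return (chi(zj) - chi(zi)) / (1.0 + zj)

# Hypothetical lens and source redshifts, and MST parameter.
Z1, Z2 = 0.5, 2.0
D1, D2, D12 = dist(0.0, Z1), dist(0.0, Z2), dist(Z1, Z2)
LAM = 0.8
C12 = D2 / (D1 * D12)        # scalar C_12 = D_2 / (D_1 D_12)
G1 = (1.0 - LAM) * C12       # scalar Eq. (39)

def alpha1(x):
    """Toy deflection law of the main lens (hypothetical)."""
    return 0.3 * x / math.sqrt(0.05**2 + x * x)

def alpha1_mst(x):
    """Mass-sheet transformed deflection, scalar Eq. (36)."""
    return LAM * alpha1(x) + G1 * x

def source_position(theta, deflection):
    """Single-plane lens equation with xi_1 = D_1 theta."""
    return D2 * theta - D12 * deflection(D1 * theta)

theta = 0.5
print(source_position(theta, alpha1_mst), LAM * source_position(theta, alpha1))
```

The two printed numbers should agree, reflecting ν₂ = λ; the check does not depend on the particular deflection law chosen.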

4.2. Two main lens planes

We now consider a second lens plane at z₂, with the source plane being located at z₃. The modified lens equation then reads

$\begin{aligned} \boldsymbol{\xi}_3^\prime = \mathcal{D}_3\boldsymbol{\theta} - \mathcal{D}_{13}\hat{\boldsymbol{\alpha}}_1^\prime(\boldsymbol{\xi}_1^\prime) - \mathcal{D}_{23}\hat{\boldsymbol{\alpha}}_2^\prime(\boldsymbol{\xi}_2^\prime), \end{aligned}$ (40)

and we require the modified deflection angle $\hat{\boldsymbol{\alpha}}_2^\prime$ to be chosen such that plane 3 is just uniformly scaled relative to the original one, i.e., $\boldsymbol{\xi}_3^\prime=\nu_3\boldsymbol{\xi}_3$. This condition then yields

$\begin{aligned} \mathcal{D}_3\boldsymbol{\theta} & - \mathcal{D}_{13}\left[\lambda\hat{\boldsymbol{\alpha}}_1(\boldsymbol{\xi}_1) + \mathsf{G}_1\boldsymbol{\xi}_1\right] - \mathcal{D}_{23}\hat{\boldsymbol{\alpha}}_2^\prime(\boldsymbol{\xi}_2^\prime) \\ & = \nu_3\left[\mathcal{D}_3\boldsymbol{\theta} - \mathcal{D}_{13}\hat{\boldsymbol{\alpha}}_1(\boldsymbol{\xi}_1) - \mathcal{D}_{23}\hat{\boldsymbol{\alpha}}_2(\boldsymbol{\xi}_2)\right]. \end{aligned}$ (41)

In order to account for the term $\propto\hat{\boldsymbol{\alpha}}_2$ on the r.h.s. of Eq. (41), the modified deflection has to be of the form⁵

$\begin{aligned} \hat{\boldsymbol{\alpha}}_2^\prime(\boldsymbol{\xi}_2^\prime) &= \nu_3\hat{\boldsymbol{\alpha}}_2(\boldsymbol{\xi}_2^\prime/\lambda) + \mathsf{G}_2\boldsymbol{\xi}_2^\prime \\ &= \nu_3\hat{\boldsymbol{\alpha}}_2(\boldsymbol{\xi}_2^\prime/\lambda) + \lambda\mathsf{G}_2\left[\mathcal{D}_2\boldsymbol{\theta} - \mathcal{D}_{12}\hat{\boldsymbol{\alpha}}_1(\boldsymbol{\xi}_1)\right]. \end{aligned}$ (42)

This choice then yields equal terms $\propto\hat{\boldsymbol{\alpha}}_2$ on both sides. Equating the terms $\propto\hat{\boldsymbol{\alpha}}_1$ leads to the condition −λ𝒟₁₃ + λ𝒟₂₃𝖦₂𝒟₁₂ = −ν₃𝒟₁₃, or

$\begin{aligned} \mathsf{G}_2 = (1-\nu_3/\lambda)\mathcal{D}_{23}^{-1}\mathcal{D}_{13}\mathcal{D}_{12}^{-1}. \end{aligned}$ (43)

Using the same arguments as in the Appendix of Schneider (1997), it is straightforward to show that any combination of distance matrices of the form

$\begin{aligned} \mathcal{D}_{st}^{-1}\mathcal{D}_{rt}\mathcal{D}_{rs}^{-1}\quad\text{is symmetric,} \end{aligned}$ (44)

for 0 ≤ z_r < z_s < z_t. Hence, G₂ is symmetric, and thus corresponds to a tidal matrix. Equating the terms ∝θ in Eq. (41) then leads to

$\begin{aligned} (1-\nu_3)\mathcal{D}_3 = (1-\lambda)\mathcal{D}_{13}\mathcal{D}_{12}^{-1}\mathcal{D}_2 + \lambda(1-\nu_3/\lambda)\mathcal{D}_{13}\mathcal{D}_{12}^{-1}\mathcal{D}_2, \end{aligned}$ (45)

which has the unique solution ν₃ = 1. Thus, as is the case for the standard multi-plane lens discussed in Schneider (2014), the MST leads to no scaling in the plane j = 3. Therefore,

$\begin{aligned} \mathsf{G}_2 = (1-1/\lambda)\mathcal{D}_{23}^{-1}\mathcal{D}_{13}\mathcal{D}_{12}^{-1}. \end{aligned}$ (46)

The implied scaling of the mass distribution in the plane i = 2 that follows from Eq. (42) is discussed in Schneider (2014); in short, the surface mass density distribution giving rise to $\hat{\boldsymbol{\alpha}}_2(\boldsymbol{\theta}_2)$ needs to be scaled in amplitude and scale-length to yield a deflection $\hat{\boldsymbol{\alpha}}_2(\lambda\boldsymbol{\theta}_2)$.

4.3. Arbitrary number of planes

Here we generalize the MST to an arbitrary number of source/lens planes. It turns out that the lens equation in the form (31) is better suited for that purpose. We write it in the form

$\begin{aligned} \boldsymbol{\xi}_{j+1} = \mathcal{D}_{j-1,j+1}\mathcal{D}_{j-1,j}^{-1}\boldsymbol{\xi}_j + \mathsf{B}_j\boldsymbol{\xi}_{j-1} - \mathcal{D}_{j,j+1}\hat{\boldsymbol{\alpha}}_j(\boldsymbol{\xi}_j), \end{aligned}$ (47)

where B_j is the term in parentheses in Eq. (31). We now assume a scaling $\boldsymbol{\xi}_j^\prime=\nu_j\boldsymbol{\xi}_j$ in every plane, and set the scaled deflection angles to be

$\begin{aligned} \hat{\boldsymbol{\alpha}}_j^\prime(\boldsymbol{\xi}_j^\prime) = \nu_{j+1}\hat{\boldsymbol{\alpha}}_j(\boldsymbol{\xi}_j^\prime/\nu_j) + \mathsf{G}_j\boldsymbol{\xi}_j^\prime = \nu_{j+1}\hat{\boldsymbol{\alpha}}_j(\boldsymbol{\xi}_j) + \nu_j\mathsf{G}_j\boldsymbol{\xi}_j. \end{aligned}$ (48)

Note that Eqs. (36) and (42) are special cases of the relation (48) for j = 1, 2, respectively, and ν₁ = 1, ν₂ = λ, ν₃ = 1. Then, inserting $\boldsymbol{\xi}_j^\prime=\nu_j\boldsymbol{\xi}_j$ into Eq. (47), we obtain

$\begin{aligned} \boldsymbol{\xi}_{j+1}^\prime = {}& \nu_j\mathcal{D}_{j-1,j+1}\mathcal{D}_{j-1,j}^{-1}\boldsymbol{\xi}_j + \nu_{j-1}\mathsf{B}_j\boldsymbol{\xi}_{j-1} \\ & - \nu_{j+1}\mathcal{D}_{j,j+1}\hat{\boldsymbol{\alpha}}_j(\boldsymbol{\xi}_j) - \nu_j\mathcal{D}_{j,j+1}\mathsf{G}_j\boldsymbol{\xi}_j \\ = {}& \nu_{j+1}\left[\mathcal{D}_{j-1,j+1}\mathcal{D}_{j-1,j}^{-1}\boldsymbol{\xi}_j + \mathsf{B}_j\boldsymbol{\xi}_{j-1} - \mathcal{D}_{j,j+1}\hat{\boldsymbol{\alpha}}_j(\boldsymbol{\xi}_j)\right]. \end{aligned}$ (49)

The terms $\propto\hat{\boldsymbol{\alpha}}_j$ cancel each other. The terms $\propto\boldsymbol{\xi}_{j-1}$ yield the condition $\nu_{j+1}=\nu_{j-1}$, and equating the terms ∝ξ_j yields

$\begin{aligned} \mathsf{G}_j = (1-\nu_{j+1}/\nu_j)\mathcal{D}_{j,j+1}^{-1}\mathcal{D}_{j-1,j+1}\mathcal{D}_{j-1,j}^{-1}, \end{aligned}$ (50)

which is symmetric according to Eq. (44) and thus represents a tidal matrix. Since ν₁ = 1 and ν₂ = λ, the condition $\nu_{j+1}=\nu_{j-1}$ implies ν_j = λ for j even, and ν_j = 1 for j odd. Correspondingly,

$\begin{aligned} \mathsf{G}_j &= (1-1/\lambda)\mathcal{D}_{j,j+1}^{-1}\mathcal{D}_{j-1,j+1}\mathcal{D}_{j-1,j}^{-1}\;\;\mathrm{for}\;j\;\mathrm{even}; \\ \mathsf{G}_j &= (1-\lambda)\mathcal{D}_{j,j+1}^{-1}\mathcal{D}_{j-1,j+1}\mathcal{D}_{j-1,j}^{-1}\;\;\mathrm{for}\;j\;\mathrm{odd}. \end{aligned}$ (51)

We note that Eqs. (39) and (43) are special cases of Eq. (51).

Hence we find that the MST in multiple (lens and source) plane gravitational lensing exhibits a curious behavior: the scaling factor in every second plane is just unity, whereas it is λ in the other half of the planes. In particular, this means that a “standard candle” or “standard rod” in one of the planes with j odd cannot be used to break the degeneracy related to the MST, as the images of these sources are unaffected by the MST. The prefactors of the tidal matrices G_j are positive on every other plane, and negative on the remaining ones. If one disregards the perturbations between lens planes, so that the distance matrices 𝒟_ij reduce to angular diameter distances D_ij, then the G_j become scalars proportional to the density of uniform mass sheets; in this case, positive and negative densities of these sheets alternate.
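The alternating scaling can be made concrete with a short numerical experiment in the scalar limit. This is a sketch under assumptions not taken from the text: flat Einstein-de Sitter distances, a toy deflection law, hypothetical redshifts, and an arbitrary value of λ. The deflections are transformed according to Eq. (48) with the scalar version of Eq. (50), where plane 0 denotes the observer so that j = 1 reproduces Eq. (39), and rays are traced with the standard lens equation (10).

```python
import math

def chi(z):
    """Comoving distance for a flat Einstein-de Sitter model, units of c/H0 (assumption)."""
    return 2.0 * (1.0 - 1.0 / math.sqrt(1.0 + z))

def dist(zi, zj):
    """Scalar angular-diameter distance D_ij in a flat universe."""
    return (chi(zj) - chi(zi)) / (1.0 + zj)

# Hypothetical configuration: observer, four lens planes, one source plane.
Z = [0.0, 0.3, 0.6, 1.0, 1.5, 2.2]
LAM = 0.7

def D(i, j):
    return dist(Z[i], Z[j])

def nu(j):
    """Scaling factor of plane j: lambda for even j, unity for odd j."""
    return LAM if j % 2 == 0 else 1.0

def tidal(j):
    """Scalar G_j of Eq. (50)."""
    return (1.0 - nu(j + 1) / nu(j)) * D(j - 1, j + 1) / (D(j, j + 1) * D(j - 1, j))

def alpha(j, x):
    """Toy deflection law for plane j (hypothetical)."""
    return 0.1 * (1.0 + 0.5 * j) * x / math.sqrt(0.05**2 + x * x)

def alpha_mst(j, x_prime):
    """Transformed deflection, scalar Eq. (48)."""
    return nu(j + 1) * alpha(j, x_prime / nu(j)) + tidal(j) * x_prime

def trace(theta, deflection):
    """Scalar Eq. (10) for all planes 1..N+1."""
    n = len(Z) - 1
    xi = [0.0] * (n + 1)
    for j in range(1, n + 1):
        xi[j] = D(0, j) * theta - sum(D(i, j) * deflection(i, xi[i]) for i in range(1, j))
    return xi

original = trace(0.4, alpha)
transformed = trace(0.4, alpha_mst)
for j in range(1, len(Z)):
    print(j, transformed[j] / original[j])  # ratio should alternate between 1 and LAM
```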

4.4. Transformation of the time delay

We next consider how the MST affects the time delays. For that, we assume to have a source on plane number N + 1, with its light being deflected in N main lens planes. The corresponding LTTF is given in Eq. (19), where ξ_N + 1 is the position of the source in its source plane.

To obtain the corresponding function cτ′ after the MST, we first need to consider the transformation of the potential time delay. That is, we need to find the transformed deflection potential ${\hat{ψ}}_{i}' (ξ_{i}')$ $\hat\psi_i\prime({\boldsymbol{\xi}}_i\prime)$ , which needs to satisfy $\nabla_{ξ_{i}'} {\hat{ψ}}_{i}' (ξ_{i}') = {\hat{α}}_{i}' (ξ_{i}')$ $\nabla_{{\boldsymbol{\xi}}_i\prime} \hat\psi_i\prime({\boldsymbol{\xi}}_i\prime)=\hat{{\boldsymbol{\alpha}}}_i\prime({\boldsymbol{\xi}}_i\prime)$ . For this, we make the ansatz ${\hat{ψ}}_{i}' (ξ_{i}') = a {\hat{ψ}}_{i} (ξ_{i}' / ν_{i}) + ξ_{i}^{' t} G_{i} ξ_{i}' / 2$ $\hat\psi_i\prime({\boldsymbol{\xi}}_i\prime)=a \hat\psi_i({\boldsymbol{\xi}}_i\prime/\nu_i) +{\boldsymbol{\xi}}_i^{\prime\mathrm{t}}{\mathsf{G}}_i{\boldsymbol{\xi}}_i\prime/2$ . Taking the gradient yields $\nabla_{ξ_{i}'} {\hat{ψ}}_{i}' (ξ_{i}') = (a / ν_{i}) {\hat{α}}_{i} (ξ_{i}' / ν_{i}) + G_{i} ξ_{i}'$ $\nabla_{{\boldsymbol{\xi}}_i\prime} \hat\psi_i\prime({\boldsymbol{\xi}}_i\prime) =(a/\nu_i)\hat{{\boldsymbol{\alpha}}}_i({\boldsymbol{\xi}}_i\prime/\nu_i)+{\mathsf{G}}_i{\boldsymbol{\xi}}_i\prime$ . This is seen to agree with ${\hat{α}}_{i}' (ξ_{i}')$ $\hat{{\boldsymbol{\alpha}}}_i\prime({\boldsymbol{\xi}}_i\prime)$ in Eq. (48), provided a = ν_iν_i + 1 = λ. Thus,

$\begin{matrix} {\hat{ψ}}_{i}^{'} (ξ_{i}^{'}) = λ {\hat{ψ}}_{i} (ξ_{i}^{'} / ν_{i}) + \frac{1}{2} ξ_{i}^{' t} G_{i} ξ_{i}^{'} . \end{matrix}$ $\begin{aligned} \hat{\psi }_i^\prime ({\boldsymbol{\xi }}_i^\prime )=\lambda \hat{\psi }_i({\boldsymbol{\xi }}_i^\prime /\nu _i) +{1\over 2}{\boldsymbol{\xi }}_i^{\prime \mathrm{t}}\mathsf{G }_i{\boldsymbol{\xi }}_i^\prime . \end{aligned}$ (52)

We then obtain for the transformed LTTF

$\begin{aligned} c\tau' =&\sum_{i=1}^N (1+z_i) \Biggl[ \frac{1}{2} \left( \nu_i{\mathsf{C}}_{i,i+1}{\boldsymbol{\xi}}_i - \nu_{i+1}{\mathcal{D}}_{i,i+1}^{-1}{\boldsymbol{\xi}}_{i+1} \right)^\mathrm{t} {\mathsf{C}}_{i,i+1}^{-1} \\ &\times \left( \nu_i{\mathsf{C}}_{i,i+1}{\boldsymbol{\xi}}_i - \nu_{i+1}{\mathcal{D}}_{i,i+1}^{-1}{\boldsymbol{\xi}}_{i+1} \right) \\ &- \lambda \hat{\psi}_i({\boldsymbol{\xi}}_i)- \frac{\nu_i^2}{2} {\boldsymbol{\xi}}_i^{\mathrm{t}}\mathsf{G}_i{\boldsymbol{\xi}}_i \Biggr]. \end{aligned}$ (53)

We now show that

$\begin{aligned} \tau' = \lambda \tau + F({\boldsymbol{\xi}}_{N+1}), \end{aligned}$ (54)

which means that the transformed LTTF just scales by a factor λ, independent of the plane on which the source is located, plus a function which only depends on the location of the source, and thus cancels when considering time delays, i.e., differences between τ for pairs of multiple images of the source.
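As an illustration of the structure of Eq. (54), the following minimal Python sketch checks the analogous statement in the familiar single-plane, isotropic case of standard lens theory. It is not the multi-plane calculation of this section. The scaled Fermat potential τ(θ; β) = |θ − β|²/2 − ψ(θ), the toy potential `psi`, and all function names are assumptions made for the example; the transformed potential ψ′(θ) = λψ(θ) + (1 − λ)|θ|²/2 is the standard single-plane MST, with the source position scaled as β → λβ.

```python
import math

def psi(theta):
    """Hypothetical toy deflection potential (softened, isothermal-like)."""
    x, y = theta
    return math.sqrt(0.1 + x * x + y * y)

def fermat(theta, beta, potential):
    """Scaled single-plane Fermat potential |theta - beta|^2 / 2 - psi(theta)."""
    dx, dy = theta[0] - beta[0], theta[1] - beta[1]
    return 0.5 * (dx * dx + dy * dy) - potential(theta)

def mst_potential(lam):
    """Return psi'(theta) = lam * psi(theta) + (1 - lam) * |theta|^2 / 2."""
    def psi_prime(theta):
        r2 = theta[0] ** 2 + theta[1] ** 2
        return lam * psi(theta) + 0.5 * (1.0 - lam) * r2
    return psi_prime

def delay_offset(theta, beta, lam):
    """tau'(theta; lam * beta) - lam * tau(theta; beta)."""
    beta_prime = (lam * beta[0], lam * beta[1])
    return fermat(theta, beta_prime, mst_potential(lam)) - lam * fermat(theta, beta, psi)

lam = 0.8
beta = (0.3, -0.2)
thetas = [(1.0, 0.5), (-0.7, 0.2), (0.1, -1.3)]

# The offset is the same for every theta, so it depends on the source only.
offsets = [delay_offset(t, beta, lam) for t in thetas]
expected = -0.5 * lam * (1.0 - lam) * (beta[0] ** 2 + beta[1] ** 2)
```

In this toy setting the offset τ′ − λτ equals −λ(1 − λ)|β|²/2 for every θ, so it plays the role of F and drops out of differences between images.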

Forming the difference τ′ − λτ, we first note that the terms $\propto \hat\psi_i$ drop out. Second, we note that the terms containing products of ξ_i and ξ_{i+1} also cancel, since ν_i ν_{i+1} = λ. Thus we find that

$\begin{aligned} c(\tau'-\lambda\tau) =&\sum_{i=1}^N \frac{(1+z_i)}{2}\Biggl\{ {\boldsymbol{\xi}}_i^\mathrm{t}\left[\left( \nu_i^2-\lambda \right){\mathsf{C}}_{i,i+1}-\nu_i^2\mathsf{G}_i \right]{\boldsymbol{\xi}}_i \\ &+ \left( \nu_{i+1}^2-\lambda \right){\boldsymbol{\xi}}_{i+1}^\mathrm{t} \left( {\mathcal{D}}_{i,i+1}^{-1} \right)^\mathrm{t}{\mathsf{C}}_{i,i+1}^{-1} {\mathcal{D}}_{i,i+1}^{-1}{\boldsymbol{\xi}}_{i+1} \Biggr\}. \end{aligned}$ (55)

The term i = 1 of the first sum vanishes, since ν₁ = 1 and Eq. (39) holds. The final term (i = N) of the second sum depends only on the source position, and thus corresponds to the function F(ξ_{N+1}) previously mentioned. Since this term is of no interest, we simply drop it from now on. Relabeling the index of the second sum as i → i − 1, we then get

$\begin{aligned} c(\tau'-\lambda\tau)= \sum_{i=2}^N \frac{(1+z_i)}{2}\left(\nu_i^2-\lambda\right) {\boldsymbol{\xi}}_i^\mathrm{t}\,\mathsf{K}_i\,{\boldsymbol{\xi}}_i, \end{aligned}$ (56)

where the matrices K_i are given as

$\begin{aligned} \mathsf{K}_i=\frac{1+z_{i-1}}{1+z_i} \left( {\mathcal{D}}_{i-1,i}^{-1} \right)^\mathrm{t}{\mathsf{C}}_{i-1,i}^{-1} {\mathcal{D}}_{i-1,i}^{-1}+{\mathsf{C}}_{i,i+1} -\left( \frac{\nu_i^2}{\nu_i^2-\lambda} \right)\mathsf{G}_i. \end{aligned}$ (57)

Since, according to the definition (17), ${\mathsf{C}}_{i-1,i}^{-1} {\mathcal{D}}_{i-1,i}^{-1}={\mathcal{D}}_{i-1}{\mathcal{D}}_i^{-1}$, we can rewrite K_i as

$\begin{aligned} \mathsf{K}_i=\frac{1+z_{i-1}}{1+z_i} \left( {\mathcal{D}}_{i-1,i}^{-1} \right)^\mathrm{t}{\mathcal{D}}_{i-1}{\mathcal{D}}_i^{-1}+{\mathsf{C}}_{i,i+1} -\left( \frac{\nu_i^2}{\nu_i^2-\lambda} \right)\mathsf{G}_i. \end{aligned}$ (58)

From Eq. (29),

$\begin{aligned} \frac{1+z_{i-1}}{1+z_i}\left( {\mathcal{D}}_{i-1,i}^{-1} \right)^\mathrm{t}{\mathcal{D}}_{i-1}{\mathcal{D}}_i^{-1} ={\mathcal{D}}_{i,i+1}^{-1}{\mathcal{D}}_{i-1,i+1}{\mathcal{D}}_{i-1,i}^{-1} -{\mathcal{D}}_{i,i+1}^{-1}{\mathcal{D}}_{i+1}{\mathcal{D}}_i^{-1} \end{aligned}$ (59)

is obtained. Noting that the final term equals −C_{i,i+1}, and thus cancels the term +C_{i,i+1} in Eq. (58), we obtain with Eq. (50) that

$\begin{aligned} \mathsf{K}_i=\left[1-\left( \frac{\nu_i^2}{\nu_i^2-\lambda} \right)\left( 1-\frac{\nu_{i+1}}{\nu_i} \right) \right] {\mathcal{D}}_{i,i+1}^{-1}{\mathcal{D}}_{i-1,i+1}{\mathcal{D}}_{i-1,i}^{-1}. \end{aligned}$ (60)

However, the prefactor vanishes: if i is odd, ν_i = 1, ν_{i+1} = λ, and thus

$1-\left( \frac{\nu_i^2}{\nu_i^2-\lambda} \right)\left( 1-\frac{\nu_{i+1}}{\nu_i} \right) =1-\left( \frac{1}{1-\lambda} \right)(1-\lambda)=0.$

If i is even, ν_i = λ, ν_{i+1} = 1, and

$1-\left( \frac{\nu_i^2}{\nu_i^2-\lambda} \right)\left( 1-\frac{\nu_{i+1}}{\nu_i} \right) =1-\left( \frac{\lambda^2}{\lambda^2-\lambda} \right)\left( 1-\frac{1}{\lambda} \right) =0.$

Therefore, K_i = 0 for all i, which completes our proof of the validity of Eq. (54). Since the result is independent of the plane on which the source is located – the time delay scales for all source planes with λ – we see that all time delays are scaled by the factor λ under the MST. In particular this implies that the degeneracy due to the MST cannot be broken from measuring time delay ratios.
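The vanishing of the prefactor in Eq. (60) can also be confirmed with exact rational arithmetic. The short Python sketch below uses the alternating scaling quoted above (ν_i = 1 on odd planes, ν_i = λ on even planes); the function names and the sample value of λ are illustrative choices only.

```python
from fractions import Fraction

def nu(i, lam):
    """Alternating plane scaling: 1 on odd planes, lam on even planes."""
    return Fraction(1) if i % 2 == 1 else lam

def prefactor(i, lam):
    """Bracket of Eq. (60): 1 - nu_i^2 / (nu_i^2 - lam) * (1 - nu_{i+1} / nu_i)."""
    n_i, n_next = nu(i, lam), nu(i + 1, lam)
    return 1 - (n_i ** 2 / (n_i ** 2 - lam)) * (1 - n_next / n_i)

lam = Fraction(7, 10)  # any value other than 0 and 1 works

# Planes i = 2 ... 8 cover both the even and the odd case.
values = [prefactor(i, lam) for i in range(2, 9)]

# The product of consecutive scalings is always lam, as used in the proof.
products = [nu(i, lam) * nu(i + 1, lam) for i in range(1, 9)]
```

Using `Fraction` avoids floating-point round-off, so every entry of `values` is exactly zero rather than merely small.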

5. Discussion

In this paper we have considered several aspects of the generalized multi-plane gravitational lensing equation. In contrast to the treatment in Schneider et al. (1992) and more recent papers (e.g., McCully et al. 2014), we treat the light propagation between main lens planes with a continuous formalism, offered by the optical tidal equation, instead of slicing up the matter into several “weak-lensing” lens planes⁶. For this, we made use of the formalism of light propagation in arbitrary spacetimes, as given in SSE. As a result, the distance matrices between lens planes are not written in terms of recursion relations, but as solutions of the optical tidal equation; the explicit solution in terms of an integral over the tidal field caused by large-scale density inhomogeneities along the line-of-sight is provided in Eq. (11).

The time-delay function for generalized multi-plane lensing was derived, using the same arguments as employed in Schneider et al. (1992) for the derivation of the time delay in ordinary multi-plane lensing. The explicit form deviates from that obtained in McCully et al. (2014), in that our result depends only on the impact vectors in the various main planes, but not on the deflection angles. In other words, our expression for τ yields the light travel time of a kinematically possible ray with specified impact vectors ξ_i in the main lens planes, up to an additive constant. Physical light rays are those for which the light-travel time is stationary; this allows the derivation of an iterative lens equation which relates the impact vectors of three consecutive main lens planes to the deflection angle in the middle one of those. We have shown that this form of the lens equation is equivalent to the more standard one which contains the impact vectors and deflection angles of all earlier lens planes. This consecutive lens equation is probably preferable for use in ray-tracing simulations (see, e.g., Petkova et al. 2014; Jaroszyński & Skowron 2016)⁷.

Finally, we showed that the generalized multi-plane lensing admits a mass-sheet transformation (MST) which leaves all observables but the time delay invariant. In contrast to ordinary lensing, the MST corresponds to adding a tidal matrix in each main lens plane. We obtained the curious behavior that the uniform isotropic scaling of the source/lens planes, which is the key aspect of the MST, alternates between planes; in every second plane, the scaling corresponds to the MST parameter λ, in the other half of the planes, the scaling is unity. In particular, this implies that the magnification of sources living on the odd planes with scaling factor unity, is unaffected by the MST. All time delays – i.e., for sources in all main planes – scale as λ under the MST.

This curious behavior of the MST in multi-plane lensing may indeed offer a way to break the corresponding degeneracy, at least in a statistical way. As we discussed before, in the case of vanishing perturbations between the main lens planes, the MST corresponds to mass sheets of alternating sign from plane to plane. Since such a mass sheet changes the slope of the total mass distribution, it means that this slope change also alternates. If one now makes the perhaps plausible assumption that the shape of the mean mass profiles of lenses is the same, this alternating slope change would violate the universality of the mean mass profile. Thus, in multi-plane lensing, the mass-sheet degeneracy may be more easily lifted than in the case of a single lens plane only.

We hope that the results obtained here will be useful for further theoretical studies of generalized multi-plane lensing, as well as for modeling lens systems in which more than one main deflector affects the imaging properties between observer’s sky and the source plane.

¹

It must be stressed here that this Born-approximation only applies to the distance matrices 𝒟_i which are governed by the smooth part of 𝒯 according to Eq. (8); only this smooth contribution is contained in Eq. (11). No such approximation is made with regards to the main deflectors.

²

We explicitly point out that the only geometrical relation used in this derivation is the one between angles and transverse separations, i.e., the definition of the distance matrices.

³

The use of the same symbol λ for the affine parameter and the MST parameter is due to the conventions in the literature, but should not lead to any confusion; in particular, in this section λ is exclusively used as MST parameter, and we use redshift z to label lens planes.

⁴

In case the distance matrices are proportional to the unit matrix, the transformation reduces to the known one in standard lensing. Note that the transformation (36) implies the transformation $\hat\psi_1'({\boldsymbol{\xi}}_1)=\lambda \hat{\psi}_1({\boldsymbol{\xi}}_1) +{\boldsymbol{\xi}}_1^{\mathrm{t}}{\mathsf{G}}_1{\boldsymbol{\xi}}_1/2$ for the deflection potential. For isotropic distance matrices, 𝖦₁ = (1−λ)D₂/(D₁D₁₂)ℐ, so that $\hat{\psi}_1'({\boldsymbol{\xi}}_1)=\lambda \hat\psi_1({\boldsymbol{\xi}}_1) +(1-\lambda)(D_1 D_2/D_{12})|{\boldsymbol{\theta}}|^2/2$. According to (15), this then implies for the scaled deflection potential ψ′(θ) = λψ(θ) + (1 − λ)|θ|²/2, as in standard lens theory.

⁵

This relation is unique for a general deflection law $\hat{{\boldsymbol{\alpha}}}_2$. However, if the “strong lens plane” at z₂ only yields a deflection linear in the impact vector (i.e., if this lens plane is in fact just a “weak” deflector), then the form of $\hat{{\boldsymbol{\alpha}}}'_2$ no longer is uniquely determined. This comment also applies to the more general case discussed below. Hence, we implicitly assume that all “strong lens planes” yield indeed a non-linear deflection law.

⁶

Whereas these two treatments are equivalent (indeed, as shown in SSE, the slicing into weak-lensing planes corresponds to a discretized version of the optical tidal equation), the continuous formalism is more convenient for analytical calculations – for example, obtaining a result such as (25) using the discretized version is probably extremely tedious.

⁷

The advantage of Eq. (21) over Eq. (10) is twofold. First, calculating the impact vectors in all N planes requires of order N²/2 multiplications for each light ray when Eq. (10) is used, compared to about 3N multiplications for Eq. (21). More significant, however, is the fact that Eq. (21) saves memory: whereas Eq. (10) requires the information of all lens planes for each ray, with Eq. (21) one can treat a large set of rays, tracing them from plane to plane, and in each step require only the information on a single lens plane.

⁸

We note that B ≡ B₀ corresponds to the matrix B in McCully et al. (2014), and the B_i corresponds, up to a prefactor, to their matrices C.

Acknowledgments

The author thanks Thomas Collett, Dominique Sluse, and Sherry Sutu for helpful comments and discussions, and the anonymous referee for a constructive report. This work was supported in part by the Deutsche Forschungsgemeinschaft under the TR33 “The Dark Universe”.

References

Bartelmann, M. 2010, Classical Quantum Gravity, 27, 233001
Bartelmann, M., & Schneider, P. 2001, Phys. Rep., 340, 291
Bartelmann, M., Huss, A., Colberg, J. M., Jenkins, A., & Pearce, F. R. 1998, A&A, 330, 1
Bayliss, M. B., Johnson, T., Gladders, M. D., Sharon, K., & Oguri, M. 2014, ApJ, 783, 41
Blandford, R., & Narayan, R. 1986, ApJ, 310, 568
Chae, K.-H., Mao, S., & Augusto, P. 2001, MNRAS, 326, 1015
Collett, T. E., & Auger, M. W. 2014, MNRAS, 443, 969
Collett, T. E., Marshall, P. J., Auger, M. W., et al. 2013, MNRAS, 432, 679
Etherington, I. M. H. 1933, Philos. Mag., 15, 761
Falco, E. E., Gorenstein, M. V., & Shapiro, I. I. 1985, ApJ, 289, L1
Gavazzi, R., Treu, T., Koopmans, L. V. E., et al. 2008, ApJ, 677, 1046
Greene, Z. S., Suyu, S. H., Treu, T., et al. 2013, ApJ, 768, 39
Hilbert, S., Hartlap, J., White, S. D. M., & Schneider, P. 2009, A&A, 499, 31
Hoekstra, H. 2013, ArXiv e-prints [arXiv:1312.5981]
Jain, B., Seljak, U., & White, S. 2000, ApJ, 530, 547
Jaroszynski, M., & Kostrzewa-Rutkowska, Z. 2014, MNRAS, 439, 2432
Jaroszyński, M., & Skowron, J. 2016, MNRAS, 462, 1405
Keeton, C. R., Kochanek, C. S., & Seljak, U. 1997, ApJ, 482, 604
Kneib, J.-P., & Natarajan, P. 2011, A&ARv, 19, 47
Kochanek, C. S. 2003, ApJ, 583, 49
Kochanek, C. S. 2006, in Saas-Fee Advanced Course 33: Gravitational Lensing: Strong, Weak and Micro, eds. G. Meylan, P. Jetzer, P. North, et al., 91
Kovner, I. 1987, ApJ, 316, 52
McCully, C., Keeton, C. R., Wong, K. C., & Zabludoff, A. I. 2014, MNRAS, 443, 3631
McCully, C., Keeton, C. R., Wong, K. C., & Zabludoff, A. I. 2017, ApJ, 836, 141
Meneghetti, M., Bartelmann, M., Dahle, H., & Limousin, M. 2013, Space Sci. Rev., 177, 31
Munshi, D., Valageas, P., van Waerbeke, L., & Heavens, A. 2008, Phys. Rep., 462, 67
Petkova, M., Metcalf, R. B., & Giocoli, C. 2014, MNRAS, 445, 1954
Puchwein, E., & Hilbert, S. 2009, MNRAS, 398, 1298
Refsdal, S. 1964, MNRAS, 128, 307
Refsdal, S. 1970, ApJ, 159, 357
Schneider, P. 1997, MNRAS, 292, 673
Schneider, P. 2006, in Saas-Fee Advanced Course 33: Gravitational Lensing: Strong, Weak and Micro, eds. G. Meylan, P. Jetzer, P. North, et al., 269
Schneider, P. 2014, A&A, 568, L2
Schneider, P. 2016, A&A, 592, L6
Schneider, P., & Sluse, D. 2013, A&A, 559, A37
Schneider, P., & Sluse, D. 2014, A&A, 564, A103
Schneider, P., Ehlers, J., & Falco, E. E. 1992, Gravitational Lenses (Berlin: Springer-Verlag)
Seitz, S., & Schneider, P. 1994, A&A, 287, 349
Seitz, S., Schneider, P., & Ehlers, J. 1994, Classical Quantum Gravity, 11, 2345
Smith, M., Bacon, D. J., Nichol, R. C., et al. 2014, ApJ, 780, 24
Suyu, S. H., Marshall, P. J., Auger, M. W., et al. 2010, ApJ, 711, 201
Suyu, S. H., Auger, M. W., Hilbert, S., et al. 2013, ApJ, 766, 70
Treu, T. 2010, ARA&A, 48, 87
Wong, K. C., Keeton, C. R., Williams, K. A., Momcheva, I. G., & Zabludoff, A. I. 2011, ApJ, 726, 84

Appendix A: Distance matrices in terms of peculiar gravitational potential

In this appendix we derive the expression (11) for the distance matrices in an inhomogeneous Universe. As usual in cosmological weak lensing, we work in comoving coordinates, and therefore replace the affine parameter λ by the comoving distance χ, where dχ/da = c/(a²H), and a is the cosmic scale factor normalized to unity today. From the Robertson–Walker metric and the condition for radial null geodesics, we have a dχ = −c dt, and with Eq. (2) it follows that dχ = dλ/a². These relations then imply that

$\begin{aligned} &\frac{\mathrm{d}}{\mathrm{d}\lambda}=\frac{\mathrm{d}\chi}{\mathrm{d}\lambda}\frac{\mathrm{d}}{\mathrm{d}\chi} =\frac{1}{a^2}\frac{\mathrm{d}}{\mathrm{d}\chi}, \\ &\frac{\mathrm{d}^2}{\mathrm{d}\lambda^2}= \frac{1}{a^4}\frac{\mathrm{d}^2}{\mathrm{d}\chi^2}-\frac{2}{a^3}\frac{H}{c}\frac{\mathrm{d}}{\mathrm{d}\chi}, \\ &a\frac{\mathrm{d}^2}{\mathrm{d}\lambda^2}+\frac{2H}{c}\frac{\mathrm{d}}{\mathrm{d}\lambda} =\frac{1}{a^3}\frac{\mathrm{d}^2}{\mathrm{d}\chi^2}. \end{aligned}$ (A.1)

The angular-diameter distance D_i is related to the comoving angular diameter distance by D_i = a f_k(χ − χ_i), where χ_i is the comoving distance corresponding to the affine parameter λ_i. Applying Eq. (5), we get

$\begin{aligned} \frac{\mathrm{d}^2 D_i}{\mathrm{d}\lambda^2}=a\frac{\mathrm{d}^2 f_k}{\mathrm{d}\lambda^2} +2\frac{\mathrm{d} a}{\mathrm{d}\lambda}\frac{\mathrm{d} f_k}{\mathrm{d}\lambda} +\frac{\mathrm{d}^2 a}{\mathrm{d}\lambda^2}f_k =-\frac{3}{2}\left( \frac{H_0}{c} \right)^2\frac{\Omega_{\rm m}}{a^4}f_k. \end{aligned}$ (A.2)

With da/dλ = (dχ/dλ)(da/dχ) = H/c, and

$\begin{aligned} \frac{\mathrm{d}^2 a}{\mathrm{d}\lambda^2} &= \frac{\mathrm{d} H}{c\,\mathrm{d}\lambda} =\frac{\mathrm{d} a}{\mathrm{d}\lambda}\frac{\mathrm{d} H}{c\,\mathrm{d} a} =\frac{1}{2c^2}\frac{\mathrm{d} H^2}{\mathrm{d} a} \\ &= \frac{H_0^2}{2c^2}\left( 2\,\frac{\Omega_{\rm m}+\Omega_\Lambda-1}{a^3} -\frac{3\Omega_{\rm m}}{a^4} \right), \end{aligned}$

and making use of the relations (A.1), Eq. (A.2) reduces to Eq. (12).
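The closed form for d²a/dλ² quoted above can be cross-checked numerically. The Python sketch below assumes the standard Friedmann expression H²(a) = H₀²[Ω_m a⁻³ + (1 − Ω_m − Ω_Λ) a⁻² + Ω_Λ] for matter, curvature and a cosmological constant, which is not restated in this appendix, and works in units H₀ = c = 1. The function names and parameter values are illustrative only.

```python
def hubble_sq(a, om, ol):
    """Assumed Friedmann expression for H^2(a) in units H0 = 1."""
    return om / a ** 3 + (1.0 - om - ol) / a ** 2 + ol

def d2a_closed(a, om, ol):
    """Closed form of d^2 a / d lambda^2 from the text, in units H0 = c = 1."""
    return 0.5 * (2.0 * (om + ol - 1.0) / a ** 3 - 3.0 * om / a ** 4)

def d2a_numeric(a, om, ol, h=1e-5):
    """(1/2) dH^2/da evaluated with a central finite difference."""
    return 0.5 * (hubble_sq(a + h, om, ol) - hubble_sq(a - h, om, ol)) / (2.0 * h)

# A non-flat parameter choice, so that the curvature term is exercised too.
om, ol = 0.3, 0.6
scale_factors = [0.3, 0.5, 1.0]
rel_errors = [
    abs(d2a_numeric(a, om, ol) - d2a_closed(a, om, ol)) / abs(d2a_closed(a, om, ol))
    for a in scale_factors
]
```

The relative differences stay at the level expected from the finite-difference step, which supports the sign and the powers of a in the closed form.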

Next we write the distance matrix as 𝒟_i = D_i 𝖡_i, so that B_i describes the deviation of 𝒟_i from the unperturbed distance matrix D_iℐ⁸. This factorization transforms Eq. (8) into

$\begin{aligned} \frac{\mathrm{d}^2\mathsf{B}_i}{\mathrm{d}\lambda^2}D_i +2\frac{\mathrm{d}\mathsf{B}_i}{\mathrm{d}\lambda}\frac{\mathrm{d} D_i}{\mathrm{d}\lambda} +\mathsf{B}_i\frac{\mathrm{d}^2 D_i}{\mathrm{d}\lambda^2} =\left( \mathcal{T}_{\rm bg}+\mathcal{T}_{\rm sm} \right)\mathsf{B}_i D_i. \end{aligned}$ (A.3)

Subtracting from this the transport Eq. (5) for D_i, we are left with

$\begin{aligned} \frac{\mathrm{d}^2\mathsf{B}_i}{\mathrm{d}\lambda^2}D_i +2\frac{\mathrm{d}\mathsf{B}_i}{\mathrm{d}\lambda}\frac{\mathrm{d} D_i}{\mathrm{d}\lambda} =\mathcal{T}_{\rm sm}\mathsf{B}_i D_i. \end{aligned}$ (A.4)

Inserting D_i(λ)=a(λ) f_k(χ − χ_i), and using the differentiation rules (A.1), this turns into

$\begin{aligned} \frac{\mathrm{d}^2\mathsf{B}_i}{\mathrm{d}\chi^2}f_k(\chi-\chi_i) +2\frac{\mathrm{d}\mathsf{B}_i}{\mathrm{d}\chi}\frac{\mathrm{d} f_k}{\mathrm{d}\chi} =\mathcal{T}_{\rm sm}\mathsf{B}_i a^4 f_k(\chi-\chi_i). \end{aligned}$ (A.5)

With this relation, we find for X_i(χ) := f_k(χ − χ_i) B_i(χ):

$\begin{aligned} \frac{\mathrm{d}^2 \mathsf{X}_i}{\mathrm{d}\chi^2} =\frac{\mathrm{d}^2\mathsf{B}_i}{\mathrm{d}\chi^2}f_k(\chi-\chi_i) +2\frac{\mathrm{d}\mathsf{B}_i}{\mathrm{d}\chi}\frac{\mathrm{d} f_k}{\mathrm{d}\chi} -K\mathsf{X}_i= -K \mathsf{X}_i +\mathcal{T}_{\rm sm} a^4 \mathsf{X}_i, \end{aligned}$ (A.6)

where we made use of Eq. (12). This differential equation can be transformed into an integral equation, using the method of Green’s functions, to read

$\begin{aligned} \mathsf{X}_i(\chi) =f_k(\chi-\chi_i)\mathcal{I} +\int_{\chi_i}^\chi \mathrm{d}\chi'\;f_k(\chi-\chi') \mathcal{T}_{\rm sm}(\chi') a^4(\chi') \mathsf{X}_i(\chi'). \end{aligned}$ (A.7)
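The equivalence between the differential equation (A.6) and the integral equation (A.7) can be illustrated in a toy case. The Python sketch below makes several simplifying assumptions that are not part of the derivation: a flat background (K = 0, so f_k(χ) = χ), χ_i = 0, and a scalar, constant tidal term 𝒯_sm a⁴ = q. Under these assumptions Eq. (A.6) reads X″ = qX with X(0) = 0, X′(0) = 1, whose solution is sinh(√q χ)/√q for q > 0. The function name and grid parameters are hypothetical.

```python
import math

def solve_volterra(q, chi_max, n):
    """Solve X(chi) = chi + q * int_0^chi (chi - chi') X(chi') dchi'
    on a uniform grid with the trapezoidal rule (toy version of Eq. A.7)."""
    h = chi_max / n
    grid = [k * h for k in range(n + 1)]
    x = [0.0] * (n + 1)
    for k in range(1, n + 1):
        # The kernel vanishes at chi' = chi_k and X(0) = 0, so both
        # trapezoid end points drop out and the scheme is explicit.
        s = 0.0
        for j in range(1, k):
            s += (grid[k] - grid[j]) * x[j]
        x[k] = grid[k] + q * h * s
    return grid, x

q = 1.5
grid, x_num = solve_volterra(q, chi_max=2.0, n=400)

# Analytic solution of X'' = q X with X(0) = 0, X'(0) = 1.
root = math.sqrt(q)
x_exact = [math.sinh(root * chi) / root for chi in grid]
max_error = max(abs(a - b) for a, b in zip(x_num, x_exact))
```

The numerical solution of the integral equation tracks the analytic solution of the differential equation to within the discretization error of the trapezoidal rule, and it reproduces the initial slope X′(0) = 1 through its first grid step.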

By differentiating twice, one can easily show that Eq. (A.7) indeed is a formal solution of Eq. (A.6), with the correct initial conditions 𝖷_i(χ_i) = 0, d𝖷_i(χ_i)/dχ = ℐ. If we now use the expression (6) for 𝒯_sm, neglecting the final term which is a derivative along the line-of-sight and thus cancels in the integration, and replacing the derivatives w.r.t. ξ by those w.r.t. comoving transverse coordinates, we get

$\begin{aligned} \mathsf{X}_i(\chi) =f_k(\chi-\chi_i)\mathcal{I} -\frac{2}{c^2}\int_{\chi_i}^\chi \mathrm{d}\chi'\;f_k(\chi-\chi') \mathsf{H}(\phi(\chi')) \mathsf{X}_i(\chi'). \end{aligned}$ (A.8)

Using 𝒟_i = a 𝖷_i, we arrive at Eq. (11). We also note that the large-scale structure component of the optical tidal matrix can alternatively be written in the form (see Seitz & Schneider 1994)

$\begin{aligned} \mathcal{T}_{\rm sm}=-\frac{2}{c^2\,a^4} \left[2\pi G a^2 (\rho-\bar{\rho})+\left( \begin{array}{cc} \Gamma_1&\Gamma_2 \\ \Gamma_2&-\Gamma_1 \end{array} \right) \right], \end{aligned}$ (A.9)

with Γ₁ = (ϕ_,11 − ϕ_,22)/2 and Γ₂ = ϕ_,12, where the partial derivatives are with respect to transverse comoving coordinates. Thus, one can replace H(ϕ) in Eq. (A.8) by the bracket in Eq. (A.9).

Fig. 1.

Sketch of two light rays through four consecutive planes, with 0 ≤ λ_q < λ_r < λ_s < λ_t. The rays are not deflected in the lens planes. The first ray has its vertex at λ_q and encloses an angle θ with the fiducial ray; the second ray with vertex at λ_r encloses an angle ϑ with the fiducial ray. Both rays intersect at λ_s. The geometry of this figure yields the relation (25) between distance matrices.

Fig. 2.

Propagation of a light ray (thick bent line) between three consecutive planes. The vertical line is the optical axis, with respect to which the separation vectors ξ are measured. The geometry of this figure yields the lens Eq. (30); see text.