Issue 
A&A
Volume 643, November 2020



Article Number  A165  
Number of page(s)  40  
Section  Cosmology (including clusters of galaxies)  
DOI  https://doi.org/10.1051/00046361/202038861  
Published online  20 November 2020 
TDCOSMO
IV. Hierarchical timedelay cosmography – joint inference of the Hubble constant and galaxy density profiles^{⋆}
^{1}
Kavli Institute for Particle Astrophysics and Cosmology and Department of Physics, Stanford University, Stanford, CA 94305, USA
email: sibirrer@stanford.edu
^{2}
Physics and Astronomy Department, University of California, Los Angeles, CA 90095, USA
^{3}
Institute of Physics, Laboratory of Astrophysics, Ecole Polytechnique Fédérale de Lausanne (EPFL), Observatoire de Sauverny, 1290 Versoix, Switzerland
^{4}
DARK, NielsBohr Institute, Lyngbyvej 2, 2100 Copenhagen, Denmark
^{5}
Institute of Astrononmy, University of Cambridge, Madingley Road, Cambridge CB30HA, UK
^{6}
Kavli Institute for Cosmology, University of Cambridge, Madingley Road, Cambridge CB30HA, UK
^{7}
Physics Dept., University of California, Davis, 1 Shields Ave., Davis, CA 95616, USA
^{8}
Institute of Cosmology and Gravitation, University of Portsmouth, Burnaby Rd, Portsmouth PO1 3FX, UK
^{9}
Carnegie Visiting Scientist, USA
^{10}
Kapteyn Astronomical Institute, University of Groningen, PO Box 800, 9700 AV Groningen, The Netherlands
^{11}
National Astronomical Observatory of Japan, 2211 Osawa, Mitaka, Tokyo 1810015, Japan
^{12}
STAR Institute, Quartier Agora, Allée du six Août, 19c, 4000 Liège, Belgium
^{13}
Department of Physics, University of Oxford, Denys Wilkinson Building, Keble Road, Oxford OX1 3RH, UK
^{14}
INAF, Osservatorio Astronomico di Capodimonte, Via Moiariello 16, 80131 Naples, Italy
^{15}
MaxPlanckInstitut für Astrophysik, KarlSchwarzschildStr. 1, 85748 Garching, Germany
^{16}
PhysikDepartment, Technische Universität München, JamesFranckStraße 1, 85748 Garching, Germany
^{17}
Academia Sinica Institute of Astronomy and Astrophysics (ASIAA), 11F of ASMAB, No.1, Section 4, Roosevelt Road, Taipei 10617, Taiwan
^{18}
Kavli IPMU (WPI), UTIAS, The University of Tokyo, Kashiwa, Chiba 2778583, Japan
^{19}
NSF’s National OpticalInfrared Astronomy Research Laboratory, 950 N. Cherry Ave., Tucson, AZ 85719, USA
^{20}
University of Vienna, Department of Astrophysics, Türkenschanzstr. 17, 1180 Wien, Austria
^{21}
Fermi National Accelerator Laboratory, PO Box 500, Batavia, IL 60510, USA
^{22}
Kavli Institute for Cosmological Physics, Department of Astronomy & Astrophysics, The University of Chicago, Chicago, IL 60637, USA
Received:
6
July
2020
Accepted:
11
October
2020
The H0LiCOW collaboration inferred via strong gravitational lensing time delays a Hubble constant value of H_{0} = 73.3_{−1.8}^{+1.7} km s^{−1} Mpc^{−1}, describing deflector mass density profiles by either a powerlaw or stars (constant masstolight ratio) plus standard dark matter halos. The masssheet transform (MST) that leaves the lensing observables unchanged is considered the dominant source of residual uncertainty in H_{0}. We quantify any potential effect of the MST with a flexible family of mass models, which directly encodes it, and they are hence maximally degenerate with H_{0}. Our calculation is based on a new hierarchical Bayesian approach in which the MST is only constrained by stellar kinematics. The approach is validated on mock lenses, which are generated from hydrodynamic simulations. We first applied the inference to the TDCOSMO sample of seven lenses, six of which are from H0LiCOW, and measured H_{0} = 74.5_{−6.1}^{+5.6} km s^{−1} Mpc^{−1}. Secondly, in order to further constrain the deflector mass density profiles, we added imaging and spectroscopy for a set of 33 strong gravitational lenses from the Sloan Lens ACS (SLACS) sample. For nine of the 33 SLAC lenses, we used resolved kinematics to constrain the stellar anisotropy. From the joint hierarchical analysis of the TDCOSMO+SLACS sample, we measured H_{0} = 67.4_{−3.2}^{+4.1} km s^{−1} Mpc^{−1}. This measurement assumes that the TDCOSMO and SLACS galaxies are drawn from the same parent population. The blind H0LiCOW, TDCOSMOonly and TDCOSMO+SLACS analyses are in mutual statistical agreement. The TDCOSMO+SLACS analysis prefers marginally shallower mass profiles than H0LiCOW or TDCOSMOonly. Without relying on the form of the mass density profile used by H0LiCOW, we achieve a ∼5% measurement of H_{0}. While our new hierarchical analysis does not statistically invalidate the mass profile assumptions by H0LiCOW – and thus the H_{0} measurement relying on them – it demonstrates the importance of understanding the mass density profile of elliptical galaxies. The uncertainties on H_{0} derived in this paper can be reduced by physical or observational priors on the form of the mass profile, or by additional data.
Key words: gravitational lensing: strong / galaxies: general / galaxies: kinematics and dynamics / distance scale / cosmological parameters / cosmology: observations
The full analysis is available at https://github.com/TDCOSMO/hierarchy_analysis_2020_public.
© ESO 2020
1. Introduction
There is a discrepancy in the reported measurements of the Hubble constant from early universe and late universe distance anchors. If confirmed, this discrepancy would have profound consequences and would require new or unaccounted physics to be added to the standard cosmological model. Early universe measurements in this context are primarily calibrated with sound horizon physics. This includes the cosmic microwave background (CMB) observations from Planck with H_{0} = 67.4 ± 0.5 km s^{−1} Mpc^{−1} (Planck Collaboration VI 2020), galaxy clustering and weak lensing measurements of the Dark Energy Survey (DES) data in combination with baryon acoustic oscillations (BAO) and Big Bang nucleosynthesis (BBN) measurements, giving H_{0} = 67.4 ± 1.2 km s^{−1} Mpc^{−1} (Abbott et al. 2018), and using the fullshape BAO analysis in the BOSS survey in combination with BBN, giving H_{0} = 68.4 ± 1.1 km s^{−1} Mpc^{−1} (Philcox et al. 2020). All of these measurements provide a selfconsistent picture of the growth and scales of structure in the Universe within the standard cosmological model with a cosmological constant, Λ, and cold dark matter (ΛCDM).
Late universe distance anchors consist of multiple different methods and underlying physical calibrators. The most well established one is the local distance ladder, effectively based on radar observations on the Solar system scale, the parallax method, and a luminous calibrator to reach the Hubble flow scale. The SH0ES team, using the distance ladder method with supernovae (SNe) of type Ia and Cepheids, reports a measurement of H_{0} = 74.0 ± 1.4 km s^{−1} Mpc^{−1} (Riess et al. 2019). The Carnegie–Chicago Hubble Project (CCHP) using the distance ladder method with SNe Ia and the tip of the red giant branch measures H_{0} = 69.6 ± 1.9 km s^{−1} Mpc^{−1} (Freedman et al. 2019, 2020). Huang et al. (2020) used the distance ladder method with SNe Ia and Mira variable stars and measured H_{0} = 73.3 ± 4.0 km s^{−1} Mpc^{−1}.
Among the measurements that are independent of the distance ladder are the Megamaser Cosmology Project (MCP), which uses water megamasers to measure H_{0} = 73.9 ± 3.0 km s^{−1} Mpc^{−1} (Pesce et al. 2020), gravitational wave standard sirens with km s^{−1} Mpc^{−1} (Abbott et al. 2017) and the TDCOSMO collaboration^{1} (formed by members of H0LiCOW, STRIDES, COSMOGRAIL and SHARP), using timedelay cosmography with lensed quasars (Wong et al. 2020; Shajib et al. 2020a; Millon et al. 2020). Timedelay cosmography (Refsdal 1964) provides a onestep inference of absolute distances on cosmological scales – and thus the Hubble constant. Over the past two decades, extensive and dedicated efforts have transformed timedelay cosmography from a theoretical idea to a contender for precision cosmology (Vanderriest et al. 1989; Keeton & Kochanek 1997; Schechter et al. 1997; Kochanek 2003; Koopmans et al. 2003; Saha et al. 2006; Read et al. 2007; Oguri 2007; Coles 2008; Vuissoz et al. 2008; Suyu et al. 2010, 2013, 2014; Fadely et al. 2010; Sereno & Paraficz 2014; Rathna et al. 2015; Birrer et al. 2016, 2019; Wong et al. 2017; Rusu et al. 2020; Chen et al. 2019; Shajib et al. 2020a).
The keys to precision timedelay cosmography are: Firstly, precise and accurate measurements of relative arrival time delays of multiple images; Secondly, understanding of the largescale distortion of the angular diameter distances along the line of sight; and thirdly, accurate model of the mass distribution within the main deflector galaxy. The first problem has been solved by high cadence and high precision photometric monitoring, often with dedicated telescopes (e.g., Fassnacht et al. 2002; Tewes et al. 2013; Courbin et al. 2018). The time delay measurement procedure has been validated via simulations by the Time Delay Challenge (TDC1; Dobler et al. 2015; Liao et al. 2015). The second issue has been addressed by statistically correcting the effect of the line of sights to strong gravitational lenses by comparison with cosmological numerical simulations (e.g., Fassnacht et al. 2011; Suyu et al. 2013; Greene et al. 2013; Collett et al. 2013). Millon et al. (2020) recently showed that residuals from the line of sight correction based on this methodology are smaller than the current overall errors. Progress on the third issue has been achieved by analyzing high quality images of the host galaxy of the lensed quasars with provided spatially resolved information that can be used to constrain lens models (e.g., Suyu et al. 2009). By modeling extended sources with complex and flexible source surface brightness instead of just the quasar images positions and fluxes, modelers have been able to move from extremely simplified models like singular isothermal ellipsoids (Kormann et al. 1994; Schechter et al. 1997) to more flexible ones like power laws or stars plus standard dark matter halos (Navarro et al. 1997, hereafter NFW). The choice of elliptical powerlaw and stars plus NFW profiles was motivated by their generally good description of stellar kinematics and Xray data in the local Universe. It was validated postfacto by the small residual corrections found via pixellated models (Suyu et al. 2009), and by the overall goodness of fit they provided to the data.
Building on the advances in the past two decades, the H0LiCOW and SHARP collaborations analyzed six individual lenses (Suyu et al. 2010, 2014; Wong et al. 2017; Birrer et al. 2019; Rusu et al. 2020; Chen et al. 2019) and measured H_{0} for each lens to a precision in the range 4.3−9.1%. The STRIDES collaboration measured H_{0} to 3.9% from one single quadruply lensed quasar (Shajib et al. 2020a). The seven measurements follow an approximately standard (although evolving over time) procedure (see e.g. Suyu et al. 2017) and incorporate singleaperture stellar kinematics measurements for each lens. The H0LiCOW collaboration combined their six quasar lenses, of which five had their analysis blinded, assuming uncorrelated individual distance posteriors and arrived at km s^{−1} Mpc^{−1}, a 2.4% measurement of H_{0} (Wong et al. 2020). Adding the blind measurement by Shajib et al. (2020a) further increases the precision to ∼2% (Millon et al. 2020).
Given the importance of the Hubble tension, it is crucial, however, to continue to investigate potential causes of systematic errors in timedelay cosmography. After all, extraordinary claims, like physics beyond ΛCDM, require extraordinary evidence.
The first and main source of residual modeling error in timedelay cosmography is due to the masssheet transform (MST; Falco et al. 1985). MST is a mathematical degeneracy that leaves the lensing observables unchanged, while rescaling the absolute time delay, and thus the inferred H_{0}. This degeneracy is well known and frequently discussed in the literature (e.g., Gorenstein et al. 1988; Kochanek 2002, 2006, 2020a; Saha & Williams 2006; Read et al. 2007; Schneider & Sluse 2013, 2014; Coles et al. 2014; Xu et al. 2016; Birrer et al. 2016; Unruh et al. 2017; Sonnenfeld 2018; Wertz et al. 2018; Blum et al. 2020). Lensingindependent tracers of the gravitational potential of the deflector galaxy, such as stellar kinematics, can break this inherent degeneracy (e.g., Grogin & Narayan 1996; Romanowsky & Kochanek 1999; Treu & Koopmans 2002). Another way to break the degeneracy is to make assumptions on the mass density profile, which is primarily the strategy adopted by the H0LiCOW/STRIDES collaboration (Millon et al. 2020). Millon et al. (2020) showed that the two classes of radial mass profiles considered by the collaboration, powerlaw and stars and a Navarro Frenk & White (NFW, Navarro et al. 1997) dark matter halo, yield consistent results^{2}. Sonnenfeld (2018), Kochanek (2020a, b) argued that the error budget of individual lenses obtained under the assumptions of powerlaw or stars + NFW are underestimated and that, given the MST, the typical uncertainty of the kinematic data does not allow one to constrain the mass profiles to a few percent precision^{3}.
A second potential source of uncertainty in the combined TDCOSMO analysis is the assumption of no correlation between the errors of each individual lens system. The TDCOSMO analysis shows that the scatter between systems is consistent with the estimated errors, and the random measurement errors of the observables are indeed uncorrelated (Wong et al. 2020; Millon et al. 2020). However, correlations could be introduced by the modeling procedure and assumptions made, such as the form and prior on the mass profile and the distribution of stellar anisotropies in elliptical galaxies.
In this paper we address these two dominant sources of potential residual uncertainties by introducing a Bayesian hierarchical framework to analyze and interpret the data. Addressing these uncertainties is a major step forward in the field, however it should be noted that the scope of this framework is broader than just these two issues. Its longer term goal is to take advantage of the expanding quality and quantity of data to trade theoretical assumptions for empirical constraints. Specifically, this framework is designed to meet the following requirements: (1) Theoretical assumptions should be explicit and, whenever possible, verified by data or replaced by empirical constraints; (2) Kinematic assumptions and priors must be justified by the data or the laws of physics; (3) The methodology must be validated with realistic simulations. By using this framework we present an updated measurement of the Hubble constant from timedelay cosmography and we lay out a roadmap for further improvements of the methodology to enable a measurement of the Hubble constant from strong lensing timedelay measurements with 1% precision and accuracy.
In practice, we adopt a parameterization that allows us to quantify the full extent MST in our analysis, addressing point (1) listed above. We discuss the assumptions on the kinematic modeling and the impact of the priors chosen. We deliberately choose an uninformative prior, addressing point (2). We make use of a blind submission to the timedelay lens modeling challenge (TDLMC; Ding et al. 2018, 2020) and validate our approach end to end, including imaging analysis, kinematics analysis and MST mitigation, addressing point (3)^{4}.
In our new analysis scheme, the MST is exclusively constrained by the kinematic information of the deflector galaxies, and thus fully accounted for in the error budget. Under these minimal assumptions, we expect that the data currently available for the individual lenses in our TDCOSMO sample will not constrain H_{0} to the 2% level. In addition, we take into account covariances between the sample galaxies, by formulating the priors on the stellar anisotropy distribution and the MST at the population level and globally sampling and marginalizing over their uncertainties.
To further improve the constraints on the mass profile and the MST on the population level, we incorporate a sample of 33 lenses from the Sloan Lens ACS (SLACS) survey (Bolton et al. 2006) into our analysis. We make use of the lens model inference results presented by Shajib et al. (2020b), which follow the standards of the TDCOSMO collaboration. We assess the assumptions in the kinematics modeling and incorporate integral field unit (IFU) spectroscopy from VIMOS 2D data of a subset of the SLACS lenses from Czoske et al. (2012) in our analysis. This dataset allows us to improve constraints on the stellar anisotropy distribution in massive elliptical galaxies at the population level and thus reduces uncertainties in the interpretation of the kinematic measurements, hence improving the constraints on the MST and H_{0}. Our joint hierarchical analysis is based on the assumption that the massive elliptical galaxies acting as lenses in the SLACS and the TDCOSMO sample represent the same underlying parent population in regard of their mass profiles and kinematic properties. The final H_{0} value derived in this work is inferred from the joint hierarchical analysis of the SLACS and TDCOSMO samples.
The paper is structured as follows: Sect. 2 revisits the analysis performed on individual lenses and assesses potential systematics due to MST and mass profile assumptions. Section 3 describes the hierarchical Bayesian analysis framework to mitigate assumptions and priors associated to the MST to a sample of lenses. We first validate this approach in Sect. 4 on the TDLMC data set (Ding et al. 2018) and then move to perform this very same analysis on the TDCOSMO data set in Sect. 5. Next, we perform our hierarchical analysis on the SLACS sample with imaging and kinematics data to further constrain uncertainties in the mass profiles and the kinematic behavior of the stellar anisotropy in Sect. 6. We present the joint analysis and final inference on the Hubble constant in Sect. 7. We discuss the limitations of the current work and lay out the path forward in Sect. 8 and finally conclude in Sect. 9.
All the software used in this analysis is open source and we share the analysis scripts and pipeline with the community^{5}. Numerical tests on the impact of the MST are performed with LENSTRONOMY^{6} (Birrer & Amara 2018; Birrer et al. 2015). The kinematics is modeled with the LENSTRONOMY.GALKIN module. The reanalysis of the SLACS lenses imaging data is performed with DOLPHIN^{7}, a wrapper around LENSTRONOMY for automated lens modeling (Shajib et al. 2020b) and we introduce HIERARC^{8} (this work) for the hierarchical sampling in conjunction with LENSTRONOMY. All components of the analysis – including analysis scripts and software – were reviewed internally by people not previously involved in the analysis of the sample before the joint inference was performed. All uncertainties stated are given in 16th, 50th and 84th percentiles. Error contours in plots represent 68th and 95th credible regions.
As in previous work by our team – in order to avoid experimenter bias – we keep our analysis blind by using previously blinded analysis products, and all additional choices made in this analysis, such as considering model parameterization and including or excluding of data, are assessed blindly in regard to H_{0} or parameters directly related to it. All sections, except Sect. 8.5, of this paper have been written and frozen before the unblinding of the results.
2. Cosmography from individual lenses and the masssheet degeneracy
In this section we review the principles of timedelay cosmography and the underlying observables (Sect. 2.1 for lensing and time delays and Sect. 2.2 for the kinematic observables). We emphasize how an MST affects the observables and thus the inference of cosmographic quantities (Sect. 2.3). We separate the physical origin of the MST into the lineofsight (external MST, Sect. 2.4) and massprofile contributions (internal MST, Sect. 2.5) and then provide the limits on the internal mass profile constraints from imaging data and plausibility arguments in Sect. 2.6. We provide concluding remarks on the constraining power of individual lenses for timedelay cosmography in Sect. 2.7.
2.1. Cosmography with strong lenses
In this section we state the relevant governing physical principles and observables in terms of imaging, time delays, and stellar kinematics. The phenomena of gravitational lensing can be described by the lens equation, which maps the source plane β to the image plane θ (2D vectors on the plane of the sky)
where α is the angular shift on the sky between the original unlensed and the lensed observed position of an object.
For a single lensing plane, the lens equation can be expressed in terms of the physical deflection angle as
with D_{s}, D_{ds} is the angular diameter distance from the observer to the source and from the deflector to the source, respectively. In the single lens plane regime we can introduce the lensing potential ψ such that
and the lensing convergence as
The relative arrival time between two images θ_{A} and θ_{B}, Δt_{AB}, originated from the same source is
where c is the speed of light,
is the Fermat potential (Schneider 1985; Blandford & Narayan 1986), and
is the timedelay distance (Refsdal 1964; Schneider et al. 1992; Suyu et al. 2010); D_{d}, D_{s}, and D_{ds} are the angular diameter distances from the observer to the deflector, the observer to the source, and from the deflector to the source, respectively.
Provided constraints on the lensing potential, a measured time delay allows us to constrain the timedelay distance D_{Δt} from Eq. (5):
The Hubble constant is inversely proportional to the absolute scales of the Universe and thus scales with D_{Δt} as
2.2. Deflector velocity dispersion
The lineofsight projected stellar velocity dispersion of the deflector galaxy, σ^{P}, can provide a dynamical mass estimate of the deflector independent of the lensing observables and joint lensing and dynamical mass estimates have been used to constrain galaxy mass profiles (Grogin & Narayan 1996; Romanowsky & Kochanek 1999; Treu & Koopmans 2002).
The modeling of the kinematic observables in lensing galaxies range in complexity from spherical Jeans modeling to Schwarzschild (Schwarzschild 1979) methods. For example, Barnabè & Koopmans (2007), Barnabè et al. (2009) use axisymmetric modeling of the phasespace distribution function with a twointegral Schwarzschild method by Cretton et al. (1999), Verolme & de Zeeuw (2002). In this work, the kinematics and their interpretation are a key component of the inference scheme and thus we provide the reader with a detailed background and the specific assumptions in the modeling we apply.
The dynamics of stars with the density distribution ρ_{*}(r) in a gravitational potential Φ(r) follows the Jeans equation. In this work, we assume spherical symmetry and no rotation in the Jeans modeling. In the limit of a relaxed (vanishing time derivatives) and spherically symmetric system, with the only distinction between radial, , and tangential, , dispersions, the Jeans equation results in (e.g., Binney & Tremaine 2008)
with the stellar anisotropy parameterized as
The solution of Eq. (10) can be formally expressed as (e.g., van der Marel 1994)
where M(r) is the mass enclosed in a threedimensional sphere with radius r and
is the integration factor of the Jeans Equation (Eq. (10)). The modeled luminosityweighted projected velocity dispersion σ_{s} is given by (Binney & Mamon 1982)
where R is the projected radius and Σ_{*}(R) is the projected stellar density
The observational conditions have to be taken into account when comparing a model prediction with a data set. In particular, the aperture 𝒜 and the PSF convolution of the seeing, 𝒫, need to be folded in the modeling. The luminosityweighted line of sight velocity dispersion within an aperture, 𝒜, is then (e.g., Treu & Koopmans 2004; Suyu et al. 2010)
where is taken from Eq. (14).
The prediction of the stellar kinematics requires a threedimensional stellar density ρ_{*}(r) and mass M(r) profile. In terms of imaging data, we can extract information about the parameters of the lens mass surface density with parameters ξ_{mass} and the surface brightness of the deflector with parameters ξ_{light}. When assuming a constant masstolight ratio across the galaxy, the integrals in the Jeans equation can be performed on the light distribution and Σ_{*}(R) can be taken to be the surface brightness I(R). To evaluate the threedimensional distributions, we rely on assumptions on the deprojection to the threedimensional mass and light components. In this work, we use spherically symmetric models with analytical projections/deprojections to solve the Jeans equation.
An additional ingredient in the calculation of the velocity dispersion is the anisotropy distribution of the stellar orbits, β_{ani}(r). It is impossible to disentangle the anisotropy in the velocity distribution and the gravitational potential from velocity dispersion and rotation measurements alone. This is known as the massanisotropy degeneracy (Binney & Mamon 1982).
Finally, the predicted velocity dispersion requires angular diameter distances from a background cosmology. Specifically, the prediction of any σ^{P} from any model can be decomposed into a cosmologicaldependent and cosmologyindependent part, as (Birrer et al. 2016, 2019)
where J(ξ_{mass}, ξ_{light}, β_{ani}) is the dimensionless and cosmologyindependent term of the Jeans equation only relying on the angular units in the light, mass and anisotropy model. The term ξ_{light} in Eq. (17) includes the deflector light contribution. The deflector light is required for the Jeans modeling (Σ_{*} and deconvolved ρ_{*} terms in the equations above). In practice, the inference of the deflector light profile is jointly fit with other light components, such as source light and quasar flux.
Inverting Eq. (17) illustrates that a measured velocity dispersion, σ^{P}, allows us to constrain the distance ratio D_{s}/D_{ds}, independent of the cosmological model and time delays but while relying on the same lens model, ξ_{lens},
We note that the distance ratio D_{s}/D_{ds} can be constrained without time delays being available. If one has kinematic and timedelay data, instead of expressing constraints on D_{s}/D_{ds}, one can also express the cosmologically independent constraints in terms of D_{d} (e.g., Paraficz & Hjorth 2009; Jee et al. 2015; Birrer et al. 2019) as
In this work, we do not transform the kinematics constraints into D_{s}/D_{ds} or D_{d} constraints but work directly on the likelihood level of the velocity dispersion when discriminating between different cosmological models.
In Appendix B we illustrate the radial dependence on the model predicted velocity dispersion, σ^{P}, for different stellar anisotropy models. Observations at different projected radii can partially break the massanisotropy degeneracy provided that we have independent mass profile estimates from lensing observables.
2.3. Masssheet transform
The MST is a multiplicative transform of the lens equation (Eq. (1)) (Falco et al. 1985)
which preserves image positions (and any higher order relative differentials of the lens equation) under a linear source displacement β → λβ. The term (1 − λ)θ in Eq. (20) above describes an infinite sheet of convergence (or mass), and hence the name masssheet transform. Only observables related to the absolute source size, intrinsic magnification or to the lensing potential are able to break this degeneracy.
The convergence field transforms according to
The same relative lensing observables can result if the mass profile is scaled by the factor λ with the addition of a sheet of convergence (or mass) of κ(θ) = (1 − λ).
The different observables described in Sects. 2.1 and 2.2 transform by an MST term λ as follow: The image positions remain invariant
The source position scales with λ
The time delay scales with λ
and the velocity dispersion scales with λ as
Until now we have only stated how the MST impacts observables directly. However, it is also useful to describe how cosmographic constraints derived from a set of observables and assumptions on the mass profile are transformed when transforming the lens model with an MST (Eq. (8), (18), (19)). The timedelay distance (Eq. (7)) is dependent on the time delay Δt (Eq. (5))
The distance ratio constrained by the kinematics and the lens model scales as
Given timedelay and kinematics data the inference on the angular diameter distance to the lens is invariant under the MST
The Hubble constant, when inferred from the timedelay distance, D_{Δt}, transforms as (from Eq. (9))
Mathematically, all the MSTs can be equivalently stated as a change in the angular diameter distance to the source
In other words, if one knows the dependence of any lensing variable upon D_{s} one can transform it under the MST and scale all other quantities in the same way.
2.4. Lineofsight contribution
Structure along the line of sight of lenses induce distortions and focusing (or defocusing) of the light rays. The firstorder shear distortions do have an observable imprint on the shape of Einstein rings and can thus be constrained as part of the modeling procedure of strong lensing imaging data. The first order convergence effect alters the angular diameter distances along the specific line of sight of the strong lens. We define D^{lens} as the specific angular diameter distance along the line of sight of the lens and D^{bkg} as the angular diameter distance from the homogeneous background metric without any perturbative contributions. D^{lens} and D^{bkg} are related through the convergence terms as
κ_{s} is the integrated convergence along the line of sight passing through the strong lens to the source plane and the term 1 − κ_{s} corresponds to an MST (Eq. (30))^{9}. To predict the velocity dispersion of the deflector (Eq. (17)), the terms κ_{s} and κ_{ds} are relevant when using background metric predictions from a cosmological model (D^{bkg}). To predict the time delays (Eq. (5)) from a cosmological model, all three terms are relevant. We can define a single effective convergence, κ_{ext}, that transforms the timedelay distance (Eq. (7))
with
2.5. External vs. internal mass sheet transform
An MST (Eq. (21)) is always linked to a specific choice of lens model and so is its physical interpretation. The MST can be either associated with lineofsight structure (κ_{s}) not affiliated with the main deflector or as a transform of the mass profile of the main deflector itself (e.g., Koopmans 2004; Saha & Williams 2006; Schneider & Sluse 2013; Birrer et al. 2016; Shajib et al. 2020a).
There are different observables and physical priors related to these two distinct physical causes and we use the notation κ_{s} to describe the external convergence aspect of the MST and λ_{int} to describe the internal profile aspect of the MST. The total transform which affects the time delays and kinematics (see Eqs. (24) and (25)) is the product of the two transforms
The lineofsight contribution can be estimated by tracers of the larger scale structure, either using galaxy number counts (e.g., Rusu et al. 2017) or weak lensing of distant galaxies by all the mass along the line of sight (e.g., Tihhonova et al. 2018), and can be estimated with a few per cent precision per lens. The internal MST requires either priors on the form of the deflector profile or exquisite kinematic tracers of the gravitational potential. The λ_{int} component is the focus of this work.
2.6. Approximate internal masssheet transform
Imposing the physical boundary condition, lim_{r → ∞}κ(r) = 0, violates the mathematical form of the MST^{10}. However, approximate MSTs that satisfy the boundary condition of a finite physically enclosed mass may still be possible and encompass the limitations and concerns of strong gravitational lensing in providing precise constraints on the Hubble constant. We specify an approximate MST as a profile without significantly impacting imaging observables around the Einstein radius and resulting in the transforms of the time delays (Eq. (24)) and kinematics (Eq. (25)).
Cored mass components, κ_{c}(r), can serve as physically motivated approximations to the MST (Blum et al. 2020). We can write a physically motivated approximate internal MST with a parameter λ_{c} as
where κ_{model} corresponds to the model used in the reconstruction of the imaging data and λ_{c} describes the scaling between the cored and the other model components, in resemblance to λ_{int}. Approximating a physical cored transform with the pure MST means that:
in deriving all the observable scalings in Sect. 2.3.
Blum et al. (2020) showed that several wellchosen cored 3D mass profiles, ρ(r), can lead to approximate MST’s in projection, κ_{c}(r), with physical interpretations, such as
resulting in the projected convergence profile
where Σ_{crit} is the critical surface density of the lens. The specific functional form of the profile listed above (37) resemble the outer slope of the NFW profile with ρ(r) ∝ r^{−3}.
Figure 1 illustrates a composite profile consisting of a stellar component (Hernquist profile) and a dark matter component (NFW + cored component, Eq. (37)) which transform according to an approximate MST. The stellar component gets rescaled by the MST while the cored component is transforming only the dark matter component.
Fig. 1. Illustration of a composite profile consisting of a stellar component (Hernquist profile, dotted lines) and a dark matter component (NFW + cored component (Eq. (38)), dashed lines) which transform according to an approximate MST (joint as solid lines). The stellar component gets rescaled by the MST while the cored component transforms the dark matter component. Left: profile components in three dimensions. Right: profile components in projection. The transforms presented here cannot be distinguished by imaging data alone and require i.e., stellar kinematics constraints (https://github.com/TDCOSMO/hierarchy_analysis_2020_public/blob/6c293af582c398a5c9de60a51cb0c44432a3c598/MST_impact/MST_composite_cored.ipynb). 

Open with DEXTER 
It is of greatest importance to quantify the physical plausibility of those transforms and their impact on other observables in detail. In this section we extend the study of Blum et al. (2020). We perform detailed numerical experiments on mock imaging data to quantify the constraints from imaging data, time delays and kinematics, and we quantify the range of such an approximate transform with physically motivated boundary conditions. Further illustrations and details on the examples given in this section can be found in Appendix A.
2.6.1. Imaging constraints on the internal MST
In this section we investigate the extent to which imaging data is able to distinguish between different lens models with different cored mass components and their impact on the inferred time delay distance in combination with time delay information. We first generate a mock image and time delays without a cored component and then perform the inference with an additional cored component model (Eq. (38)) parameterized with the core radius R_{c} and the core projected density Σ_{c} ≡ (1 − λ_{c}) (Eq. (35)). In our specific example, we simulate a quadruply lensed quasar image similar to Millon et al. (2020) (more details in Appendix A and Fig. A.2) with a powerlaw elliptical mass distribution (PEMD, Kormann et al. 1994; Barkana 1998)
where γ_{pl} is the logarithmic slope of the profile, q_{m} is the axis ratio of the minor and the major axes of the elliptical profile, and θ_{E} is the Einstein radius. The coordinate system is defined such that θ_{1} and θ_{2} are along the major and minor axis respectively. We also add an external shear model component with distortion amplitude γ_{ext} and direction ϕ_{ext}. The PEMD+shear model is one of two lens models considered in the analysis of the TDCOSMO sample. For the source and lens galaxies we use elliptical Sérsic surface brightness profiles. We add a Gaussian point spread function (PSF) with fullwidthathalfmaximum (FWHM) of 01, pixel scale of 005 and noise properties consistent with the current TDCOSMO sample of Hubble Space Telescope (HST) images. The time delays between the images between the first arriving image and the subsequent images are 11.7, 27.6, and 94.0 days, respectively. We chose timedelay uncertainties of ±2 days between the three relative delays. The timedelay precision does not impact our conclusions about the MST. The inference is performed on the pixel level of the mock image as with the real data on the TDCOSMO sample.
In the modeling and parameter inference, we add an additional cored mass component (Eq. (38)) and perform the inference on all the lens and source parameters simultaneously, including the core radius R_{c} and the projected core density Σ_{c}. In the limit of a perfect MST there is a mathematical degeneracy if we only use the imaging data as constraints. We thus expect a full covariance in the parameters involved in the MST (Einstein radius of the main deflector, source position, source size etc.) and the posterior inference of our problem to be inefficient in the regime where the cored profile mimics the full MST (κ_{c}(θ) acts as Σ_{crit} for R_{c} → ∞). To improve the sampling, instead of modeling the cored profile κ_{c}(θ), we model the difference between the cored component and a perfect MST, Δκ_{c} = κ_{c}(θ)−Σ_{crit}, with λ_{c} (Eq. (35)) instead. Δκ_{c} is effectively the component of the model that does not transform under the MST and leads to a physical threedimensional profile interpretation.
Figure 2 shows the inference on the relevant lens model parameters for the mock image described in Appendix A. The input parameters are marked as orange lines for the model without a cored component. We can clearly see that for small core radii, R_{c}, the approximate MST parameter λ_{c} can be constrained. This is the limit where the additional core profile cannot mimic a pure MST at a level where the data is able to distinguish between them. For core radii R_{c} = 3θ_{E}, the uncertainty on the approximate MST, λ_{c}, is 10%. For core radii R_{c} > 5θ_{E}, the approximate MST is very close to the pure MST and the imaging information in our example is not able to constrain λ_{c} to better than λ_{c} ± 0.4. We make use of the expected constraining power on λ_{c} as a function of R_{c} when we discuss the plausibility of certain transforms. When looking at the inferred timedelay distance λ_{c}D_{Δt}, we see that this quantity is constant as a function of R_{c} and thus the timedelay prediction is accurately being transformed by a pure MST (Eq. (24)). Overall, we find that λ_{c} ≈ λ_{int} is valid for larger core radii.
Fig. 2. Illustration of the constraining power of imaging data on a cored mass component (Eq. (35)). Shown are the parameter inference of the powerlaw profile mock quadruply lensed quasar of Fig. A.2 when including a marginalization of an additional cored power law profile (Eq. (38)). Orange lines indicate the input truth of the model without a cored component. λ_{c} is the scaled core model parameter (Eq. (35)) resembling the pure MST for large core radii (λ_{c} ≈ λ_{int}) (https://github.com/TDCOSMO/hierarchy_analysis_2020_public/blob/6c293af582c398a5c9de60a51cb0c44432a3c598/MST_impact/MST_pl_cored.ipynb). 

Open with DEXTER 
Identical tests with a composite profile instead of a PEMD profile result in the same conclusions and are available online^{11}.
2.6.2. Allowed cored mass components from physical boundary conditions
In the previous Sect. 2.6.1 we demonstrated that, for large core radii, there are physical profiles that approximate a pure MST (λ_{c} ≈ λ_{int}). In this section we take a closer look at the physical interpretation of such large positive and negative cored component transforms with respect to a chosen mass profile. It is possible that the core model itself does not require a physical interpretation as it is overall included in the total mass distribution. The galaxy surface brightness provides constraints on the stellar mass distribution (modulo a masstolight conversion factor) and the focus here is a consideration of the distribution of the invisible (dark) matter component of the deflector. Our starting model is a NFW profile and we assess departures from this model by using a cored component.
We apply the following conservative boundary conditions on the distribution of the dark matter component: Firstly, the total mass of the cored component within a threedimensional radius shall not exceed the total mass of the NFW profile within the same volume, M_{core}(<r) ≤ M_{NFW}(<r). This is not a strict bound, but violating this condition would imply changing the mass of the halo itself. Secondly, the density profile shall never drop to negative values, ρ_{NFW + core}(r) ≥ 0.
Those two imposed conditions define a physical interpretation of a threedimensional mass profile as being a redistribution of matter from the dark matter component and a rescaling of the masstolight ratio of the luminous component. An independent estimate of the masstolight ratio of few per cent is below our current limits of knowledge about the stellar initial mass function, stellar evolution models and dust extinction. Moreover, the masstolight ratio can vary with radius. Figure 3 provides the constraints from the two conditions, as well as from the imaging data constraints of Sect. 2.6.1, for an expected NFW mass and concentration profile at a typical lens and source redshift configuration. The remaining white region in Fig. 3 is effectively allowed by the imaging data and simple plausibility considerations. We conclude that the physically allowed parameter space does encompass a pure MST with , with more parameter volume for λ_{int} < 1, which corresponds to a positive cored component. We emphasize that the constraining power at small core radii may be due to the angular rather than the radial imprint of the cored profile (see e.g., Kochanek 2020b). However, such a behavior would not alter our conclusions and inference method chosen in the analysis presented in subsequent sections of this work. We also performed this inference for a composite (stellar light + NFW dark matter) model and arrive at the same conclusions.
Fig. 3. Constraints on an approximate internal MST transform with a cored component, λ_{c}, of an NFW profile as a function of core radius. In gray are the 1σ exclusion limits that imaging data can provide. In orange is the region where the total mass of the core within a threedimensional radius exceeds the mass of the NFW profile in the same sphere. In blue is the region where the transformed profile results in negative convergence at the core radius. The white region is effectively allowed by the imaging data and simple plausibility considerations and where we can use the mathematical MST as an approximation (λ_{c} ≈ λ_{int}). The halo mass, concentration and the redshift configuration is displayed in the lower left box (https://github.com/TDCOSMO/hierarchy_analysis_2020_public/blob/6c293af582c398a5c9de60a51cb0c44432a3c598/MST_impact/MST_pl_cored.ipynb). 

Open with DEXTER 
2.6.3. Stellar kinematics of an approximate MST
In this section we investigate the kinematics dependence on the approximate MST. To do so, we perform spherical Jeans modeling (Sect. 2.2) and compute the predicted velocity dispersion in an aperture under realistic seeing conditions (Eq. (16)) for models with a cored mass component as an approximation of the MST.
Figure 4 compares the actual predicted kinematics from the modeling of the physical threedimensional mass distribution κ_{λc} (Eq. (35)) and the analytic relation of a perfect MST (Eq. (25)) for the mock lens presented in Appendix A. For this figure, we chose an aperture size of 1″ × 1″ and seeing of FWHM = 07 and an isotropic stellar orbit distribution (β_{ani}(r) = 0). For λ_{c} in the range [0.8,1.2], the MST approximation in the predicted velocity dispersion is accurate to < 1%. We conclude that, for the λ_{int} range considered in this work, the analytic approximation of a perfect MST is valid to reliably compute the predicted velocity dispersion. The precise dependence of the velocity dispersion only marginally depends on the specific core radius R_{c} and the approximation remains valid for all reasonable and nonexcluded core radii and λ_{int}. We tested that our conclusions also hold for different anisotropy profiles and observational conditions.
Fig. 4. Comparison of the actual predicted kinematics from the modeling of the physical threedimensional mass distribution κ_{λint} (Eq. (35)) for varying core sizes (solid) and the analytic relation of a perfect MST (Eq. (25), dashed) for the mock lens presented in Fig. A.2. Lower panel: fractional differences between the exact prediction and a perfect MST calculation. The MST prediction matches to < 1% in the considered range. Minor numerical noise is present at the subpercent level (https://github.com/TDCOSMO/hierarchy_analysis_2020_public/blob/6c293af582c398a5c9de60a51cb0c44432a3c598/MST_impact/MST_pl_cored.ipynb). 

Open with DEXTER 
2.7. Constraining power using individual lenses
For each individual strong lens in the TDCOSMO sample, there are four data sets available: (1) imaging data of the strong lensing features and the deflector galaxy, 𝒟_{img}; (2) timedelay measurements between the multiple images, 𝒟_{td}; (3) stellar kinematics measurement of the main deflector galaxy, 𝒟_{spec}; (4) lineofsight galaxy count and weak lensing statistics, 𝒟_{los}.
These data sets are independent and so are their likelihoods in a joint cosmographic inference. Hence, we can write the likelihood of the joint set of the data 𝒟 = {𝒟_{img}, 𝒟_{td}, 𝒟_{spec}, 𝒟_{los}} given the cosmographic parameters {D_{d}, D_{s}, D_{ds}}≡D_{d, s, ds} as
In the expression above we only included the relevant model components in the expressions of the individual likelihoods. ξ_{light} formally includes the source and lens light surface brightness. For the timedelay likelihood, we only consider the timevariable source position from the set of ξ_{light} parameters. In Appendix C we provide details on the computation of the combined likelihood, in particular with application in the hierarchical context.
An approximate internal MST of a power law with λ_{int} of 10% still leads to physically interpretable mass profiles with the Hubble constant changed by 10% (see Eq. (29)). Imaging data is not sufficiently able to distinguish between models producing H_{0} value within this 10% range (Kochanek 2020a). The kinematics are changed with good approximation by Eq. (25) through this transform. The kinematic prediction is also cosmology dependent by Eq. (17). The scalings of an MST are analytical in the modelpredicted timedelay distance and kinematics and thus its marginalization can be performed in post processing given posteriors for a specific lens model family that breaks the MST, such as a powerlaw model.
The kinematics information is the decisive factor in discriminating different profile families. The relative uncertainty in the velocity dispersion measurement directly propagates into the relative uncertainty in the MST as
The current uncertainties on the velocity dispersion measurements, on the order of 5−10% (including the uncertainties due to stellar template mismatch and other systematic errors) limit the precise determination of the mass profile per individual lens. Uncertainties in the interpretation of the stellar anisotropy orbit distribution additionally complicates the problem. Birrer et al. (2016) performed such an analysis and demonstrated that an explicit treatment of the MST (in their approach parameterized as a source scale) leads to uncertainties consistent with the expectations of Kochanek (2020a). Because the kinematic measurement of each lens is not sufficiently precise to constrain the mass profile to the desired level, in this work we marginalize over the uncertainties properly accounting for the priors.
3. Hierarchical Bayesian cosmography
The overarching goal of timedelay cosmography is to provide a robust inference of cosmological parameters, π, and in particular the absolute distance scale, the Hubble constant H_{0}, and possibly other parameters describing the expansion history of the Universe (such as Ω_{Λ} or Ω_{m}), from a sample of gravitational lenses with measured time delays. Based on the conclusions we draw from Sect. 2, it is absolutely necessary to propagate assumptions and priors made on the analysis of an individual lens hierarchically when performing the inference on the cosmological parameters from a population of lenses. In particular, this is relevant for parameters that we cannot sufficiently constrain on a lensbylens basis and parameters whose uncertainties significantly propagate to the H_{0} inference on the population level. In this section, we introduce three specific hierarchical sampling procedures for properties of lensing galaxies and their selection that are relevant for the cosmographic analysis. In particular, these are: (1) an overall internal MST relative to a chosen mass profile, λ_{int}, and its distribution among the sample of lenses; (2) stellar anisotropy distribution in the sample of lenses; (3) the lineofsight structure selection and distribution of the lens sample.
In Sect. 3.1 we formalize the Bayesian problem and define an approximate scheme for the full hierarchical inference that allows us to keep track of key systematic uncertainties while still being able to reuse currently available inference products. In Sect. 3.2 we specify the hyperparameters we sample on the population level. Section 3.3 details the specific approximations in the likelihood calculation. All hierarchical computations and sampling presented in this work are implemented in the opensource software HIERARC.
3.1. Hierarchical inference problem
In Bayesian language, we want to calculate the probability of the cosmological parameters, π, given the strong lensing data set, p(π{𝒟_{i}}_{N}), where 𝒟_{i} is the data set of an individual lens (including imaging data, timedelay measurements, kinematic observations and lineofsight galaxy properties) and N the total number of lenses in the sample.
In addition to π, we introduce ξ that incorporates all the model parameters. Using Bayes rule and considering that the data of each individual lens 𝒟_{i} is independent, we can write:
In the following, we divide the nuisance parameter, ξ, into a subset of parameters that we constrain independently per lens, ξ_{i}, and a set of parameters that require to be sampled across the lens sample population globally, ξ_{pop}. The parameters of each individual lens, ξ_{i}, include the lens model, source and lens light surface brightness and any other relevant parameter of the model to predict the data. Hence, we can express the hierarchical inference (Eq. (45)) as
where {ξ_{i}}_{N} = {ξ_{1}, ξ_{2}, …, ξ_{N}} is the set of the parameters applied to the individual lenses and p(ξ_{i}) are the interim priors on the model parameters in the inference of an individual lens. The cosmological parameters π are fully encompassed in the set of angular diameter distances, {D_{d}, D_{s}, D_{ds}}≡D_{d, s, ds}, and thus, instead of stating π in Eq. (46), we now state D_{d, s, ds}(π). Up to this point, no approximation was applied to the full hierarchical expression (Eq. (45)).
From now on, we assume
which states that, for the parameters classified as ξ_{{i}}, the interim priors do not propagate into the cosmographic inference and the population prior on those parameters is formally known exactly. The population parameters, ξ_{pop}, describe a distribution function such that the values of individual lenses, ξ′_{pop, i}, follow the distribution likelihood p(ξ′_{pop, i}ξ_{pop}).
With this approximation and the notation of the sample distribution likelihood, we can simplify expression (46) to
where
are the individual likelihoods from an independent sampling of each lens with access to global population parameters, ξ_{pop}, and marginalized over the population distribution. The integral in Eq. (49) goes over all individual parameters where a population distribution is applied. Equation (40) is effectively expression (49) without the marginalization over parameters assigned as ξ_{pop}.
For parameters in the category ξ_{{i}}, our approximation implies that there is no population prior and that the interim priors do not impact the cosmographic inference. This approximation is valid in the regime where the posterior distribution in ξ_{{i}} is effectively independent of the prior. Although formally this is never true, for many parameters in the modeling of high signaltonoise imaging data the individual lens modeling parameters are very well constrained relative to the prior imposed.
In the following we highlight some key aspects of the cosmographic analysis and in particular the inference on the Hubble constant where the approximation stated in expression (47) is not valid and thus fall in the category of ξ_{pop}. We give explicit parameterizations of these effects and provide specific expressions to allow for an efficient and sufficiently accurate sampling and marginalization, according to Eq. (49), for individual lenses within an ensemble.
3.2. Lens population hyperparameters
In this section we discuss the choices of population level hyperparameters we include in our analysis.
3.2.1. Deflector lens model
The deflectors in the quasar lenses with measured time delays of the TDCOSMO sample are massive elliptical galaxies. These galaxies, observationally, follow a tight relation in a luminosity, size and velocity dispersion parameter space (e.g., Faber & Jackson 1976; Auger et al. 2010; Bernardi et al. 2020), exhibiting a high degree of selfsimilarity among the population.
In Sect. 2.6 we defined λ_{c} as the approximate MST relative to a chosen profile of an individual lens and established the close correspondence to a perfect MST (λ_{c} ≈ λ_{int}). For the inference from a sample of lenses, the sample distribution of deflector profiles is the relevant property to quantify. For the deflector mass profile, we do not want to artificially break the MST based on imaging data and require the kinematics to constrain the mass profile. To do so, we chose as a baseline model a PEMD (Eq. (39)) to be constrained on the lensbylens case and we add a global internal MST specified on the population level, λ_{int}.
The PEMD lens profile inherently breaks the MST and the parameters of the PEMD profile can be precisely constrained (within few per cent) by exquisite imaging data. In this work, we avoid describing the PEMD parameters at the population level, such as redshift, mass or galaxy environment, and make use of the individual lens inference posterior products derived on flat priors. We note that the powerlaw slope, γ_{pl}, of the PEMD profile inferred from imaging data is a local quantity at the Einstein radius of the deflector. The Einstein radius is a geometrical quantity that depends on the mass of the deflector and lens and source redshift. Thus, the physical location of the measured γ_{pl} from imaging data depends on the redshift configuration of the lens system. In a scenario where the mass profiles of massive elliptical galaxies deviate from an MST transformed PEMD resulting in a gradient in the measured slope γ_{pl} as a function of physical projected distance, a global joint MST correction on top of the individually inferred PEMD profiles may lead to inaccuracies.
To allow for a radial trend in the applied MST relative to the imaging inferred local quantities, we parameterize the global MST population with a linear relation in r_{eff}/θ_{E} as
where λ_{int, 0} is the global MST when the Einstein radius is at the halflight radius of the deflector, r_{eff}/θ_{E} = 1, and α_{λ} is the linear slope in the expected MST as a function of r_{eff}/θ_{E}. In this form, we assume selfsimilarity in the lenses in regard to their halflight radii. In addition to the global MST normalization and trend parameterization, we add a Gaussian distribution scatter with standard deviation σ(λ_{int}) at fixed r_{eff}/θ_{E}.
Wong et al. (2020) and Millon et al. (2020) showed that the TDCOSMO sample results in statistically consistent individual inferences when employing a PEMD lens model. This implies that the global properties of the mass profiles of massive elliptical galaxies in the TDCOSMO sample can be considered to be homogeneous to the level to which the data allows to distinguish differences.
3.2.2. External convergence
The lineofsight convergence, κ_{ext}, is a component of the MST (Eq. (34)) and impacts the cosmographic inference. When performing a joint analysis of a sample of lenses, the key quantity to constrain is the sample distribution of the external convergence. We require the global selection function of lenses to be accurately represented to provide a Hubble constant measurement. A bias in the distribution mean of κ_{ext} on the population level directly leads to a bias of H_{0}.
In this work, we do not explicitly constrain the global external convergence distribution hierarchically but instead constrain p(κ_{ext}) for each individual lens independently. However, due to the multiplicative nature of internal and external MST (Eq. (34)), the kinematics constrains foremost the total MST, which is the relevant parameter to infer H_{0}. The population distribution of p(κ_{ext}) only changes the interpretation of the divide into internal vs. external MST and the scatter in each of the two parts.
3.2.3. Stellar anisotropy
The anisotropy distribution of stellar orbits (Eq. (11)) can alter significantly the observed lineofsight projected stellar velocity dispersion (see Sect. 2.2 and Appendix B). The kinematics can constrain (together with a lens model) the angular diameter distance ratio D_{s}/D_{ds} (Eqs. (17) and (18)). Having a good quantitative handle on the anisotropy behavior of the lensing galaxies is therefor crucial in allowing for a robust inference of cosmographic quantities. As is the case for an internal MST, the anisotropy cannot be constrained on a lensbylens basis with a single aperture velocity dispersion measurement, which impacts the derived cosmographic constraints. It is thus crucial to impose a population prior on the deflectors’ anisotropic stellar orbit distribution and propagate the population uncertainty onto the cosmographic inference.
Observations suggest that typical massive elliptical galaxies are, in their central regions, isotropic or mildly radially anisotropic (e.g., Gerhard et al. 2001; Cappellari et al. 2007); similarly, different theoretical models of galaxy formation predict that elliptical galaxies should have anisotropy varying with radius, from almost isotropic in the center to radially biased in the outskirts (van Albada 1982; Hernquist 1993; Nipoti et al. 2006). A simplified description of the transition can be made with an anisotropy radius parameterization, r_{ani}, defining β_{ani} as a function of radius r (Osipkov 1979; Merritt 1985)
To describe the anisotropy distribution on the population level, we explicitly parameterize the profile relative to the measured halflight radius of the galaxy, r_{eff}, with the scaled anisotropy parameter
To account for lensbylens differences in the anisotropy configuration, we also introduce a Gaussian scatter in the distribution of a_{ani}, parameterized as σ(a_{ani}), such that σ(a_{ani})⟨a_{ani}⟩ is the standard deviation of a_{ani} at sample mean ⟨a_{ani}⟩.
3.2.4. Cosmological parameters
All relevant cosmological parameters, π, are part of the hierarchical Bayesian analysis. Wong et al. (2020) and Taubenberger et al. (2019) showed that when adding supernovae of type Ia from the Pantheon (Scolnic et al. 2018) or JLA (Betoule et al. 2014) sample as constraints of an inverse distance ladder, the cosmologicalmodel dependence of stronglensing H_{0} measurements is significantly mitigated.
In this work, we assume a flat ΛCDM cosmology with parameters H_{0} and Ω_{m}. We are using the inference from the Pantheononly sample of a flat ΛCDM cosmology with Ω_{m} = 0.298 ± 0.022 as our prior on the relative expansion history of the Universe in this work.
3.3. Likelihood calculation
In Sect. 3.1 we presented the generic form of the likelihood ℒ(𝒟_{i}D_{d, s, ds}, ξ_{pop}) (Eq. (49)) that we need to evaluate for each individual lens for a specific choice of hyperparameters, and in Sect. 3.2 we provided the specific choices and parameterization of the hyperparameters used in this work. In this section, we specify the specific likelihood of Eq. (49), ℒ(𝒟_{i}D_{d, s, ds}, ξ′_{pop, i}), that we use, since it is accessible and sufficiently fast to evaluate so that we can sample over a large number of lenses and their population priors.
Specifically, the parameters treated on the population level are ξ′_{pop, i} = {λ_{int, 0}, α_{λ}, σ(λ_{int}),⟨a_{ani}⟩,σ(a_{ani})}. Our choice of hyperparameters allows us to reutilize many of the posterior products derived from an independent analysis of single lenses (Eq. (40)). None of the lens model parameters, ξ_{mass}, except parameters describing λ_{int} and none of the light profile parameters, ξ_{light}, are treated on the population level and thus we can sample those independently for each lens directly from their imaging data
Furthermore, κ_{ext} and λ_{int} can be merged to a total MST parameter λ according to their definitions (Eq. (34)). All observables and thus the likelihood only respond to this overall MST parameter.
4. Validation on the timedelay lens modeling challenge
Before applying the hierarchical framework to real data, we use the timedelay lens modeling challenge (TDLMC; Ding et al. 2018, 2020) data set to validate the hierarchical analysis and to explore different anisotropy models and priors. The TDLMC was structured with three independent submission rungs. Each of the rungs contained 16 mock lenses with HSTlike imaging, time delays and kinematics information. The H_{0} value used to create the mocks was hidden from the modeling teams. The Rung1 and Rung2 mocks both used PEMD (Eq. (39)) with external shear lens models. The Rung3 lenses were generated by raytracing through zoomin hydrodynamic simulations and reflect a large complexity in their mass profiles and kinematic structure, as expected in the real Universe.
In the blind submissions for Rung1 and Rung2, different teams demonstrated that they could recover the unbiased Hubble constant within their uncertainties under realistic conditions of the data products, uncertainties in the Point Spread Function (PSF) and complex source morphology. In particular, two teams used LENSTRONOMY in their submissions in a completely independent way and achieved precise constraints on H_{0} while maintaining accuracy. For Rung1 and Rung2, the most precise submissions used the same model parameterization in their inference, thus omitting the problems reviewed in Sect. 2.
It is hard to draw precise conclusions from Rung3 as there are remaining issues in the simulations, such as numerical smoothing scale, subgrid physics, and a truncation at the virial radius. For more details of the challenge setup we refer to Ding et al. (2018) and on the results and the simulations used in Rung3 to Ding et al. (2020). For a recent study comparing spectroscopic observations with hydrodynamical simulations at z = 0 we refer for instance to van de Sande et al. (2019).
Despite the limitations of the available simulations for accurate cosmology, the application of the hierarchical analysis scheme on TDLMC Rung3 is a stress for the flexibility introduced by the internal MST and the kinematic modeling. Furthermore, the stellar kinematics from the stellar particle orbits provides a selfconsistent and highly complex dynamical system. The analysis of TDLMC Rung3 can further help in validating the kinematic modeling aspects in our analysis. However, the removal of substructure in postprocessing and truncation effects do not allow, in this regard, conclusions below the 1% level (see Ding et al. 2020). For the effect of substructure on the time delays we refer, for instance, to Mao & Schneider (1998), Keeton & Moustakas (2009) and for a study including the full lineofsight halo population to Gilman et al. (2020).
We describe the analysis as follow: In Sect. 4.1 we discuss the modeling of the individual lenses. In Sect. 4.2 we describe the hierarchical analysis and priors, and present the inference on H_{0}.
4.1. TDLMC individual lens modeling
For the validation, we make use of the blind submissions of the EPFL team by A. Galan, M. Millon, F. Courbin and V. Bonvin. The modeling of the EPFL team is performed with LENSTRONOMY, including an adaptive PSF reconstruction technique and taking into account astrometric uncertainties explicitly (e.g., Birrer & Treu 2019). Overall, the submissions of the EPFL team follow the standards of the TDCOSMO collaboration. The time that each investigator spent on each lens was substantially reduced due to the homogeneous mock data products, the absence of additional complexity of nearby perturbers and the line of sight, and improvements in the modeling procedure (Shajib et al. 2019). The EPFL team achieved the target precision and accuracy requirement on Rung2, with and without the kinematic constraints, and thus showed reliable inference of lens model parameters within a mass profile parameterization for which the MST does not apply. We refer to the TDLMC paper (Ding et al. 2020) for the details of the performance of all of the participating teams.
We use Rung2 as the reference result for which the MST does not apply, and Rung3 as a test case of the hierarchical analysis. In particular, we make use of the EPFL team’s blind Rung3 submission of the joint timedelay and imaging likelihood (Eq. (C.11)) of their PEMD + external shear models to allow for a direct comparison with the Rung2 results without the kinematics constraints. From the model posteriors of the EPFL team submission, we require the timedelay distance D_{Δt}, Einstein radius θ_{E}, powerlaw slope γ_{pl} and halflight radius r_{eff} of the deflector. The added external convergence is specified in the challenge setup to be drawn from a normal distribution with mean ⟨κ_{ext}⟩ = 0 and σ(κ_{ext}) = 0.025. The EPFL submission of Rung3, which is used in this work, consists of 13 lenses out of the total sample of 16. Three lenses were dropped in their analysis prior to submission due to unsatisfactory results and inconsistency with the submission sample. The uncertainty on the Einstein radius and halflight radius is at subpercent value for all the lenses and the powerlaw slope reached an absolute precision ranging from below 1% to about 2% for the least constraining lens in their sample from the imaging data alone.
In this work, we perform the kinematic modeling and the likelihood calculation within the hierarchical framework. We use the anisotropy model of Osipkov (1979) and Merritt (1985) (Eq. (51)) with a parameterization of the transition radius relative to the halflight radius (Eq. (52)). We assume a Hernquist light profile with r_{eff} in conjunction with the powerlaw lens model posteriors θ_{E} and γ_{pl} to model the dimensionless kinematic quantity J (Eqs. (16) and (17)), incorporating the slit mask and seeing conditions (slit 1″ × 1″, seeing FWHM = 06), as specified in the challenge setup.
4.2. TDLMC hierarchical analysis
For the setting of the TDLMC we only sample H_{0} as a free cosmologyrelevant parameter. The matter density Ω_{m} = 0.27 is provided in the challenge setup. We extend the EPFL submission by adding an internal MST distribution with a linear scaling of r_{eff}/θ_{E} described by λ_{int, 0} and α_{λ} (Eq. (50)) and Gaussian standard deviation σ(λ_{int}) of the population at fixed r_{eff}/θ_{E}. The anisotropy parameter a_{ani} is also treated on the population level with mean ⟨a_{ani}⟩ and Gaussian standard deviation σ(a_{ani}) for the population. In the hierarchical sampling we ignore the covariances between D_{Δt} and the model prediction of the kinematics J. This is justified because of the precise γ_{pl} constraints from the imaging data and the inference from the EPFL team.
The summary of the parameters and prior being used in this inference on the TDLMC is presented in Table 1. We chose two different forms of the prior on the anisotropy parameter ⟨a_{ani}⟩, one uniform in ⟨a_{ani}⟩ and a second one uniform in log(⟨a_{ani}⟩), covering the same range in the parameter space, to investigate prior dependences in our inference. To account for the external convergence, we marginalize for each individual lens from the probability distribution p(κ_{ext}) as specified in the challenge setup^{12}.
Figure 5 shows the posteriors of the hierarchical analysis with the priors specified in Table 1.
Fig. 5. Mock data from the TDLMC Rung3 inference with the parameters and prior specified in Table 1. Orange contours indicate the inference with a uniform prior in a_{ani} while the purple contours indicate the inference with a uniform priors in log(a_{ani}). The thin vertical line indicates the ground truth H_{0} value in the challenge (https://github.com/TDCOSMO/hierarchy_analysis_2020_public/blob/6c293af582c398a5c9de60a51cb0c44432a3c598/TDLMC/TDLMC_rung3_inference.ipynb). 

Open with DEXTER 
We recover the assumed value for the Hubble constant (H_{0} = 65.413 km s^{−1} Mpc^{−1}) within the uncertainties of our inference. We find km s^{−1} Mpc^{−1} for the 𝒰(log(a_{ani})) prior and km s^{−1} Mpc^{−1} for the 𝒰(a_{ani}) prior. We note that a uniform prior in log(a_{ani}) is a slightly less informative prior than a uniform prior in a_{ani} in the same range, as already pointed out by Birrer et al. (2016). In the remaining of this work 𝒰(log(a_{ani})) is the prior of choice in the absence of additional data that constrain the stellar anisotropy of massive elliptical galaxies to provide H_{0} constraints. The hierarchical analysis and the additional degree of freedom in the mass profile allows us to accurately correct for the insufficient assumptions in the mass profiles on the simulated galaxies. The kinematics modeling indicates that there is more mass in the central part of the galaxies than is modeled with a single powerlaw profile and infers λ_{int} > 1.
We notice a nonzero inferred scatter in the internal MST distribution. One contributing source to this scatter is the fact that the external convergence component was added in postprocessing in the TDLMC time delays (Eq. (24)). The rescaling was not applied to the velocity dispersion (Eq. (25)), leading to an artificial scatter in this relation equivalent to the distribution scatter of κ_{ext}, σ(κ_{ext}) = 0.025. As the mean in the convergence distribution in the TDLMC is ⟨κ_{ext}⟩ = 0, we do not expect biases beyond a scatter to occur.
The velocity dispersion measurements allow us to constrain λ_{int} and effectively probe a more flexible mass model family. Generally, the velocity dispersion estimates have a 5% relative uncertainty on each individual mock lens. As an ensemble, the 13 lenses of the EPFL submission in the TDLMC Rung3 provide information to infer λ_{int} to 2.8% precision (see Eq. (44)) in the limit of a perfect anisotropy model.
The final achieved precision on H_{0} from the sample of lenses, however, is 8%, dominated by the uncertainty in λ_{int}. The fact that, within our chosen priors, the kinematics cannot constrain λ_{int} to better than 8% comes from the uncertainty in the anisotropy model. More constraining priors on the anisotropy distribution of the stellar orbits in the lensing galaxies are the key to reducing the uncertainty in the H_{0} inference (see e.g., Birrer et al. 2016; Shajib et al. 2018; Yıldırım et al. 2020).
5. TDCOSMO mass profile and H_{0} inference
Having verified the hierarchical approach introduced in Sect. 3 in simultaneously constraining mass profiles and H_{0} with imaging, kinematics and timedelay observations in the TDLMC (Sect. 4) we employ the inference on the TDCOSMO sample set to measure H_{0}. The inference on the TDCOSMO data is identical to the validation on the TDLMC, apart from some necessary modifications due to the additional complexity in the lineofsight structure of the real data. In Sect. 5.1 we summarize the data and individual analyses for each single lens of the TDCOSMO sample. In Sect. 5.2 we describe the hierarchical analysis and present the results.
5.1. TDCOSMO sample overview
The analysis presented in this work heavily relies on data and analysis products collected and presented in the literature. We give here a detailed list of the references relevant for our work for the seven lenses of the TDCOSMO sample.
1. B1608+656: The discovery in the Cosmic Lens AllSky Survey (CLASS) is presented by Myers et al. (1995) with the source redshift by Fassnacht et al. (1996). The imaging modeling is presented by Suyu et al. (2009, 2010). The timedelay measurement is presented by Fassnacht et al. (1999, 2002). The velocity dispersion measurement of 260 km s^{−1} presented by Suyu et al. (2010) is based on KeckLRIS spectroscopy. The statistical uncertainty is ±7.7 km s^{−1} with a systematic spread of ±13 km s^{−1} depending on wavelength and stellar template solution. The combined uncertainty is 260 ± 15 km s^{−1}. A previous measurement by Koopmans et al. (2003) with 247 ± 35 km s^{−1} with Echellette Spectrograph and Imager (ESI) on KeckII is consistent with the more recent one by Suyu et al. (2010). The lineofsight analysis is presented by Suyu et al. (2010), based on galaxy number counts by Fassnacht et al. (2011).
2. RXJ1131−1231: The discovery is presented by Suyu et al. (2013) and Sluse et al. (2003). The imaging modeling is presented by Suyu et al. (2014) (for HST) and Chen et al. (2019) (for Keck Adaptive Optics data). An independent analysis of the HST data was performed by Birrer et al. (2016). The timedelay measurement is presented by Tewes et al. (2012). The velocity dispersion measurement of 323 ± 20 km s^{−1} presented by Suyu et al. (2013) is based on KeckLRIS spectroscopy and includes systematics. The lineofsight analysis is presented by Suyu et al. (2013).
3. HE0435−1223: The discovery is presented by Wisotzki et al. (2002). The image modeling is presented by Wong et al. (2017) (for HST) and Chen et al. (2019) (for Keck Adaptive Optics data). The timedelay measurement is presented by Bonvin et al. (2016). The velocity dispersion measurement of 222 ± 15 km s^{−1} presented by Wong et al. (2017) is based on KeckLRIS spectroscopy and includes systematic uncertainties. An independent measurement of 222 ± 34 km s^{−1} by Courbin et al. (2011) using VLT is in excellent agreement. The lineofsight analysis is presented by Rusu et al. (2017).
4. SDSS1206+4332: The discovery is presented by Oguri et al. (2005). The image modeling is presented by Birrer et al. (2019). The timedelay measurement is presented by Eulaers et al. (2013) with an update by Birrer et al. (2019). The velocity dispersion measurement of 290 ± 30 km s^{−1} presented by Agnello et al. (2016) is based on KeckDEIMOS spectroscopy and includes systematic uncertainties. The lineofsight analysis is presented by Birrer et al. (2019).
5. WFI2033−4723: The discovery is presented by Morgan et al. (2004), the image modeling by Rusu et al. (2020) and the timedelay measurement by Bonvin et al. (2019). The velocity dispersion measurement from VLT MUSE is presented by Sluse et al. (2019) with 250 ± 10 km s^{−1} only accounting for statistical error and 250 ± 19 km s^{−1} including systematic uncertainties. The lineofsight analysis is presented by Rusu et al. (2020).
6. DES0408−5354: The discovery is presented by Lin et al. (2017), Diehl et al. (2017). The imaging modeling is presented by Shajib et al. (2020a). A second team within STRIDES and TDCOSMO is performing an independent and blind analysis using a different modeling code (Yildirim et al., in prep.). The timedelay measurement is presented by Courbin et al. (2018). The velocity dispersion measurements are presented by BuckleyGeer et al. (2020). We used the values from Table 3 in Shajib et al. (2020a). The measurements are from Magellan with 230 ± 37 km s^{−1} (mask A) and 236 ± 42 km s^{−1} (mask B), from Gemini with 220 ± 21 km s^{−1} and from VLT MUSE with 227 ± 9 km s^{−1}. The reported values do not include systematic uncertainties and covariances among the different measurements. Following Shajib et al. (2020a) we add a covariant systematic uncertainty of ±17 km s^{−1} to the reported values. The lineofsight analysis is presented by BuckleyGeer et al. (2020).
7. PG1115+080: The discovery is presented by Weymann et al. (1980). The image modeling is presented by Chen et al. (2019) using Keck Adaptive Optics. The timedelay measurement is presented by Bonvin et al. (2018), while the lineofsight analysis by Chen et al. (2019). The velocity dispersion measurement of 281 ± 25 km s^{−1}, presented by Tonry (1998), is based on KeckLRIS spectroscopy. In this work we add new acquired integralfield spectroscopy obtained with the MultiObject Survey Explorer (MUSE) on the VLT in March 2019 (0102.A0600(C), PI Agnello), and we thus go in some detail about the observations. The details and the data will be presented in a forthcoming paper by Agnello et al. (in prep.). At the location of the lens, 3 h of total exposure time were obtained, in clear or photometric conditions and nominal seeing of 0.8″ FWHM. Due to the proximity of the four quasar images to the main galaxy, a dedicated extraction routine was used in order to optimally deblend all components. We followed the same procedure as by Sluse et al. (2019) and Braibant et al. (2014), fitting each spectral channel as a superposition of a Sersic profile (for the main lens) and four point sources as identical Moffat profiles. The separation between the individual components is held fixed to the HSTNICMOS measurements (Sluse et al. 2012).
A nearby star in the MUSE fieldofview was used as a reference PSF. From this direct modeling, the FWHM of the PSF was found to be , with some variation with wavelength that was accounted for in the modelbased deblending. This procedure produced an optimal subtraction of the quasar spectra, at least within 1″ from the center of the lens. The lens galaxy 1D spectra were then extracted in two square apertures (, ), and processed with the Penalized PiXelFitting (PPXF) code presented in Cappellari & Emsellem (2004) and further upgraded in Cappellari (2017) to obtain velocity dispersions.
The velocity dispersion measurement results from a linear combination of stellar template spectra to which a sum of orthogonal polynomials is added to adjust the continuum shape of the templates to the observed galaxy specttrum. The spectral library used for the fit is the IndoUS spectral library, 1273 stars covering the region from 3460 to 9464 Åat a spectral resolution of 1.35 Å FWHM (Valdes et al. 2004).
We measure for the inner aperture (R < 0.6″) a stellar velocity dispersion value of 277 ± 6.5 km s^{−1} and for the outer () a value of 241 ± 8.8 km s^{−1}. The uncertainties only include the statistical errors. In order to estimate the systematics, we performed a number of PPXF fits on the smaller aperture, changing each time the wavelength range, the degree of the additive polynomial and the number of stellar templates used to fit the galaxy spectra. We obtained a systematic uncertainty of ±23.6 km s^{−1} that, as for the case of DES0408, we treat as fully covariant among the two aperture measurements. With the spectral resolution of MUSE, systematic uncertainties are within ≈10% and about three times larger than the nominal, statistical uncertainties thanks to the high signaltonoise of the spectra.
All the TDCOSMO analyses of lenses used uniform priors on all relevant parameters when performing the inference with a PEMD model^{13}. Six out of the seven lenses were modeled blindly^{14}, that is H_{0} values were never seen by the modeler at any step of the process.
Detailed lineofsight analyses for each lens have been performed based on weighted relative number counts of galaxies along the line of sight on deep photometry and spectroscopic campaigns (e.g., Rusu et al. 2017). Furthermore, for a fraction of the lenses, we have used also an external shear constraint inferred by the strong lens modeling to inform the lineofsight convergence estimate. The weighted galaxy number count and external shear summary statistics have been applied on the Millenium Simulation (Springel et al. 2005) with raytracing (Hilbert et al. 2009) to extract a posterior in p(κ_{ext}) with the prior from the Millenium Simulation and semianalytic galaxy evolution model with painted synthetic photometry on top (De Lucia & Blaizot 2007)^{15}. The external convergence and shear values from the Millenium simulation are computed from the observer to the source plane, κ_{ext} ≈ κ_{s}. The coupling of the strong lens deflector (e.g., BarKana 1996; McCully et al. 2014; Birrer et al. 2017) is not included in the calculation of κ_{s}. Figure 6 shows the κ_{ext} posteriors for the individual lenses. For the overall sample mean, we get with a scatter of σ(κ_{ext}) = 0.046 around the mean. Nearby massive galaxies along the line of sight were included explicitly in the modeling where required, and the external convergence term was adapted accordingly in order to not double count mass structure in the analysis. Table 2 presents the redshifts and the relevant lens model posteriors that are used in our analysis.
Fig. 6. External convergence posteriors for the individual TDCOSMO lenses (https://github.com/TDCOSMO/hierarchy_analysis_2020_public/blob/6c293af582c398a5c9de60a51cb0c44432a3c598/TDCOSMO_sample/tdcosmo_sample.ipynb). 

Open with DEXTER 
Overview of the TDCOSMO sample posterior products used in this work.
5.2. TDCOSMO hierarchical inference
We use for each lens the individual timedelay distance likelihood according to Eq. (C.11) that was derived in previous works of this collaboration from a lens model inference on imaging data and the timedelay measurements from the PEMD inference, not including external convergence or internal MST, . We add the same MST transform as a distribution mean λ_{int, 0} and scaling α_{λ} with r_{eff}/θ_{E}, and with Gaussian scatter across the data set, identical to the TDLMC validation in Sect. 4. The individual p(κ_{ext}) distributions are added for each lens and in the inference combined with the internal MST parameters.
For the kinematic modeling, we make the same assumptions as for the TDLMC sample (Sect. 4.1) with the anisotropy model of Osipkov (1979), Merritt (1985) (Eq. (51)) with a parameterization of the transition radius relative to the halflight radius (Eq. (52)). The approach is consistent with the previous kinematic analysis and sufficiently verified on the TDLMC to the level of accuracy we can expect from this analysis. We also assume a Hernquist light profile with r_{eff}, in conjunction with the powerlaw lens model posteriors θ_{E} and γ_{pl} to model the dimensionless kinematic quantity J (Eqs. (16) and (17)), also incorporating the slit mask and seeing conditions of the individual observations.
For each of the lenses in the TDCOSMO sample, we use the distribution p(κ_{ext}) as derived on the individual blinded analyses and do not invoke an additional population parameter. We leave the hierarchical analysis of the lineofsight selection to future work. We want to stress that the overall selection bias in this hierarchical approach does not impact the H_{0} constraints as the kinematics constrains the overall MST (Eq. (34)). An overall shift in the distribution of κ_{s} will be compensated by λ_{int} in the inference, thus leaving the H_{0} constraints invariant.
We assume a flat ΛCDM cosmology with a uniform prior on H_{0} in [0,150] km s^{−1} Mpc^{−1}. For Ω_{m} we chose the prior based on the Pantheon sample (Scolnic et al. 2018), 𝒩(μ = 0.298, σ = 0.022). We also perform the inference with a flat prior on Ω_{m} in [0.05,0.5] to allow for comparison with the previous work by Wong et al. (2020) and Millon et al. (2020) and to illustrate cosmology dependences in the timedelay cosmography inference. Table 3 summarizes all the hierarchical hyperparameters sampled in the analysis of this section. The posteriors of the TDCOSMO sample inference are presented in Fig. 7.
Fig. 7. Hierarchical analysis of the TDCOSMOonly sample when constraining the MST with kinematic information. Parameter and priors are specified in Table 3. Orange contours correspond to the inference with uniform prior on Ω_{m}, 𝒰([0.05, 0.5]), while the purple contours correspond to the prior based on the Pantheon sample with 𝒩(μ = 0.298, σ = 0.022) (https://github.com/TDCOSMO/hierarchy_analysis_2020_public/blob/6c293af582c398a5c9de60a51cb0c44432a3c598/TDCOSMO_sample/tdcosmo_sample.ipynb). 

Open with DEXTER 
For the tight prior on Ω_{m}, we measure km s^{−1} Mpc^{−1}. For an unconstrained relative expansion history with a prior on Ω_{m} uniform in [0.05,0.5], we measure km s^{−1} Mpc^{−1}. The 9% precision on H_{0} is significantly inflated relative to previous studies with the same data set (Wong et al. 2020; Millon et al. 2020). The increase in uncertainty with respect to the H0LiCOW analysis is attributed to two main factors: (1) we relaxed the assumption of NFW+stars or powerlaw mass density profiles; (2) we considered the impact of covariance between lenses when accounting for uncertainties potentially arising from assumptions about mass profile and stellar anisotropy models. As we show in the next sections, however, this uncertainty can be reduced by adding external information to further constrain the mass profile and anisotropy of the deflectors. The inferred scatter in λ_{int}, σ(λ_{int}), is consistent with zero. This is a statement on the internally consistent error bars on H_{0} among the TDCOSMO sample (Wong et al. 2020; Millon et al. 2020).
6. SLACS analysis of galaxy density profiles
Gravitational lenses with imaging and kinematics data can add valuable information about the mass profiles of the lenses. Even though the kinematics data in the current TDCOSMO sample is limited, an additional sufficiently large data set with precise measurements can significantly improve the precision on the mass profiles of the population and thus on the Hubble constant. Resolved kinematics observations may in addition provide constraints on the anisotropy distribution of stellar orbits.
When incorporating external data sets as part of the hierarchical framework, it is important that those external lenses are drawn from the same population as the timedelay lenses – unless explicitly marginalized over population differences. Provided that (i) the lensing sample has a known selection function, (ii) the lens modeling is performed to the same level of precision and with the same model assumptions as the timedelay lenses, (iii) the kinematic modeling assumptions are identical and (iv) the anisotropy uncertainties are mitigated on the population level, we can fold in the extracted likelihood (Eq. (C.12)) into the hierarchical analysis, applying the same population dependence on λ_{int} and a_{ani}.
Selection biases can arise from different aspects. Ellipticity and shear naturally increase the abundance of quadruple lenses relative to double lenses. Holder & Schechter (2003) use Nbody simulations to estimate the level of external shear due to structure near the lens and conclude that the local environment is the dominant contribution that drives the external shear bias in quadruple lenses. Huterer et al. (2005) investigate the external shear bias and conclude that this effect is not sufficient to explain the observed quadrupletodouble ratio. Collett & Cunnington (2016) conclude, based on idealized simulations, that selection based on image brightness and separation leads to significant selection bias in the slope of the mass profiles. In addition, Collett & Cunnington (2016) also find a lineofsight selection bias in quadruply lensed quasars relative to the overall population on the level of 0.9%. The bias is less prominent for doubly imaged quasars. The specific discovery channel can also lead to selection effects. Dobler et al. (2008) note that a spectroscopically selected search, as performed for the Sloan Lens ACS (SLACS) survey (Bolton et al. 2006), can lead to significant biases on the selected velocity dispersion in the resulting sample. However, Treu et al. (2006) show that, at fixed velocity dispersion, the SLACS sample is indistinguishable from other elliptical galaxies.
In this section we present a hierarchical analysis of the SLACS sample (Bolton et al. 2006, 2008) following the same hierarchical approach as the TDCOSMO sample, based on the imaging modeling by Shajib et al. (2020b). The SLACS sample of strong gravitational lenses is a sample of massive elliptical galaxies selected from the Sloan Digital Sky Survey (SDSS) by the presence in their spectra of emission lines consistent with a higher redshift. Followup highresolution observations with HST revealed the presence of strongly lensed sources. The SLACS data set allows us to further constrain the population distribution in the mass profile parameter λ_{int} and the anisotropy distribution a_{ani} and, thus, can add significant information to the TDCOSMO sample to be used jointly in Sect. 7 to constrain H_{0}.
In Sect. 6.1 we describe the imaging data and lens model inference. In Sect. 6.2 we describe the spectroscopic data set used and how we model it, including VLT VIMOS IFU data for a subset of the lenses. We analyze the selection effect of the SLACS sample in Sects. 6.3 and 6.4 we constrain the lineofsight convergence for the individual lenses. In Sect. 6.5 we present the results of the hierarchical analysis of the SLACS sample in regard to mass profile and anisotropy constraints.
6.1. SLACS imaging
To include additional lenses in the hierarchical analysis, we must ensure that the quality and the choices made in the analysis are on equal footing with the TDCOSMO sample. Shajib et al. (2020b) presents a homogeneous lens model analysis of 23 SLACS lenses from HST imaging data. The lens model assumptions are a PEMD model with external shear, identical to the derived products we are using from the TDCOSMO sample. The scaling of the analysis was made possible by advances in the automation of the modeling procedure (e.g., Shajib et al. 2019) with the DOLPHIN pipeline package. The underlying modeling software is LENSTRONOMY (Birrer & Amara 2018; Birrer et al. 2015) for which we also performed the TDLMC validation (Sect. 4).
Shajib et al. (2020b) first select 50 SLACS lenses for uniform modeling from the sample of 85 lenses presented by Auger et al. (2009). The selection criteria for these lenses are: (i) no nearby satellite or large perturber galaxy within approximately twice the Einstein radius, (ii) absence of multiple source galaxies or complex structures in the lensed arcs that require large computational cost for source reconstruction, and (iii) the main deflector galaxy is not disklike. These criteria are chosen so that the modeling procedure can be carried out automatically and uniformly without tuning the model settings on a lensbylens basis. Using the DOLPHIN package on top of LENSTRONOMY, a uniform and automated modeling procedure is performed on the 50 selected lenses with Vband data (Advance Camera for Surveys F555W filter, or Wide Field and Planetary Camera 2 F606W filter).
After the modeling, 23 lenses are selected to have good quality models. The criteria for this final selection are: (i) good fitting to data by visually inspecting the residual between the image and the modelbased reconstruction, and (ii) the median of the powerlaw slope does not diverge to unusual values (i.e., ≲1.5 or ≳2.5)^{16}. For the TDCOSMO sample, iterative PSF corrections have been performed, based on the presence of the bright quasar images, to guarantee a well matched and reliable PSF in the modeling. For the SLACS lenses, such an iterative correction on the image itself cannot be performed due to the absence of quasars in these systems. Nevertheless, extensive tests with variations of the PSF have been performed by Shajib et al. (2020b) and the impact on the resulting powerlaw slope inference was below ∼0.005 on the population mean of γ_{pl}. The half light radius for the deflector galaxies are taken from Auger et al. (2009) in Vband (measured along the intermediate axis).
6.2. SLACS spectroscopy
The constraints on the MST rely on the kinematics observations. In this section we provide details on the data set and reduced products we are using in this work, on top of the already described ones for the TDCOSMO lenses. These include SDSS’s Baryon Oscillation Spectroscopic Survey (BOSS) fiber spectroscopy (Dawson et al. 2013) and VLT VIMOS IFU observations.
6.2.1. SDSS fiber spectroscopy
All the SLACS lenses have BOSS spectra available as part of SDSSIII. The fiber diameter is 3″ and the nominal seeing of the observations are 14 FWHM. The measurements of the velocity dispersion from the SDSS reduction pipeline were originally presented by Bolton et al. (2008). However, in this work, we use improved measurements of the velocity dispersion, determined using an improved set of templates as described in Shu et al. (2015). The SDSS measurements are in excellent agreement with the subsample measured with VLT Xshooter presented by Spiniello et al. (2015).
6.2.2. VLT VIMOS IFU data
The VLT VIMOS IFU data set is described in Czoske et al. (2008) and subsequently used in Barnabè et al. (2009, 2011), Czoske et al. (2012). The VIMOS fibers were in a configuration with spatial sampling of 0.67″, and the seeing was 08 FWHM.
The first moment (velocity) and second moment (velocity dispersion) of the individual VIMOS fibers are fit with a single stellar template for each fiber individually and the uncertainties in the measurements are quantified within Bayesian statistics. Templates were chosen by fitting a random sample of IndoUS spectra to the apertureintegrated VIMOS IFU spectra and selecting one of the bestfitting (in the leastsquares sense) template candidates (we refer to details to Czoske et al. 2008). Marginalization over template mismatch adds another 5−10% measurement uncertainties. Within this additional error budget, the integrated velocity dispersion measurements of Czoske et al. (2008) are consistent with the SDSS measured values of Bolton et al. (2008). We bin the fibers in radial bins in steps of 1″ from the center of the deflector. The binning is performed using luminosity weighting and propagation of the independent errors to the uncertainty estimate per bin. Where necessary, we exclude fibers that point on satellite galaxies or lineofsight contaminants. In this work, we make use of the relative velocity dispersion measurements in radial bins when inferring H_{0}. We do so by introducing a separate internal MST distribution λ_{ifu}, effectively replacing λ_{int} when evaluating the likelihood of the IFU data. λ_{ifu} is entirely constrained by the IFU data. The MST information that propagates in the joint constraints of TDCOSMO+SLACS analysis, λ_{int}, (Sect. 7) is derived from the SDSS velocity dispersion measurements only. In this form, the IFU data informs the anisotropy parameter but not the mass profile directly. We leave the amplitude calibration and usage of this data set to constrain the MST for future work.
From the original sample of 17 SLACS lenses with VIMOS observations, we drop five objects that are fast rotators (when the first moments dominate the averaged dispersion in the outer radius bin) and one slow rotator with velocity dispersion >380 km s^{−1}. This is necessary to match this sample with the TDCOSMO one in velocity dispersion space; the fast rotators are, in fact, all in a lower velocity dispersion range (σ^{P} in [185,233] km s^{−1}). Finally, we excluded one more galaxy for which there is no estimate of the Einstein radius, and thus we cannot combine lensing and dynamics. In this way, we end up with a sample of ten lenses, prior to further local environment selection.
6.3. SLACS selection function
The SLACS lenses were preselected from the spectroscopic database of the SDSS based on the presence of absorptiondominated galaxy continuum at one redshift and nebular emission lines (Balmer series, [OII] 3727 A, or [OIII] 5007 A) at another, higher redshift. Details on the method and selection can be found in Bolton et al. (2004, 2006) and Dobler et al. (2008). The lens and source redshifts of the SLACS sample are significantly lower than for the TDCOSMO sample.
Treu et al. (2009) studied the relation between the internal structure of earlytype galaxies and their environment with two statistics: the projected number density of galaxies inside the tenth nearest neighbor (Σ_{10}) and within a cone of radius one h^{−1} Mpc (D_{1}) based on photometric redshifts. It was observed that the local physical environment of the SLACS lenses is enhanced compared to random volumes, as expected for massive earlytype galaxies, with 12 out of 70 lenses in their sample known to be in group/cluster environments.
In this study, we are specifically only looking for lenses whose lensing effect can be described as the mass profile of the massive elliptical galaxy and an uncorrelated lineofsight contribution. Assuming SLACS and TDCOSMO lenses are galaxies within the same homogeneous galaxy population and with the local environment selection of SLACS lenses, the remaining physical mass components in the deflector model are the same physical components of the lensing effect we model in the TDCOSMO sample. The uncorrelated lineofsight contribution can be characterized based on large scale structure simulations.
6.3.1. Deflector morphology and lensing information selection
Our first selection cut on the SLACS sample is based on Shajib et al. (2020b), which excludes a subset of lenses based on their unusual lens morphology (prominent disks, two main deflectors, or complex source morphology) to derive reliable lensing properties using an automated and uniform modeling procedure.
This first cut reduced the total SLACS sample of 85 lenses, presented by Auger et al. (2009), to 51 lenses^{17}. Out of these 51 lenses, 23 lenses had good quality models from an automated and uniform modeling procedure as described in Sect. 6.1. Producing good quality models for the rest of the SLACS lenses would require careful treatment on a lensbylens basis, which was out of the scope of Shajib et al. (2020b).
6.3.2. Mass proxy selection
We want to make sure that the deflector properties are as close as possible to the TDCOSMO sample. To do so without introducing biases regarding uncertainties in the velocity dispersion measurements, we chose a cut based on Singular Isothermal Sphere (SIS) equivalent dispersions, σ_{SIS}, derived from the Einstein radius and the lensing efficiency only. The deflectors of the TDCOSMO sample span a range of σ_{SIS} in [200,350] km s^{−1} and we select the same range for the SLACS sample.
6.3.3. Local environment selection
We use the DESI Legacy Imaging Surveys (DLS; Dey et al. 2019) to characterize the environment of the SLACS lenses. We query the DR7 Tractor source photometry catalog (Lang et al. 2016) removing any object that is morphologically consistent with being a point source convolved with the DLS point spread function. We use the R band data to count objects with 18 < R < 23 within 120″ of the lens galaxy but more than 3″ from the lens.
We quantify the environment with two numbers: N_{2′}, the total number of galaxies within 2 arcmin and an inverse projected distance weighted count N_{1/r} within the same 2 arcmin aperture, defined as (Greene et al. 2013)
N_{2′} and N1/r are physically meaningful numbers for our analysis as N_{2′} should approximately trace the total mass close enough to significantly perturb the lensing (see Collett et al. 2013), and N_{1/r} should be skewed larger by masses close along the line of sight of the lens which are likely to have the most significant perturbative effect. We assess the uncertainty on N_{2′} and N_{1/r} by taking every object within 120″ of the lens and bootstrap resampling from their R band magnitude errors, before reapplying the 18 < R < 23 cut. Where the SLACS lens is not in the DLS DR7 footprint we queried the DLS DR8 catalog instead. To put N_{2′} and N_{1/r} into context, we perform the same cuts centered on 10^{5} random points within the DLS DR7 footprint. Dividing the SLACS N_{2′} and N_{1/r} by the median ⟨N_{2′⟩rand} and ⟨N_{1/r}⟩_{rand} of the calibration lines of sight allows us to assess the relative overdensity of the SLACS lenses as
and
We compare this metric on our sample with the overlapping sample of Treu et al. (2009) where local 3dimensional quantities in the form of D_{1} are available, and we find good agreement between these two statistics in terms of a rank correlation.
We remove lenses that have ζ_{1/r} > 2.10 within the 2 arcminutes aperture from our sample. This selection cut corresponds to D_{1} = 1.4 Mpc^{−3} for the subset by Treu et al. (2009). Independently of the ζ_{1/r} cut, we check and flag all lenses within the Shajib et al. (2020b) sample that have prominent nearby perturbers present in the HST data within 5″. We do not find any additional lenses with prominent nearby perturbers not already removed by the selection cut of ζ_{1/r} > 2.10.
6.3.4. Combined sample selection
With the combined selection on the SLACS sample based on the morphology, mass proxy, local environment, and for the IFU lenses also rotation, we end up with 33 SLACS lenses of which nine lenses have IFU data. 14 lenses out of the sample have quality lens models by Shajib et al. (2020b), including five lenses with IFU data. Figure 8 shows how the individual lenses among the different samples, TDCOSMO, SLACS and the subset with IFU data are distributed in key parameters of the deflector.
Fig. 8. Sample selection of the SLACS lenses being added to the analysis and comparison with the TDCOSMO data set. The comparisons are in lens redshift, z_{lens}, source redshift, z_{source}, measured velocity dispersion, σ^{P}, half light radius of the deflector, r_{eff}, Einstein radius of the deflector, θ_{E}, the ratio of half light radius to Einstein radius, r_{eff}/θ_{E}, and the SIS equivalent velocity dispersion estimated from the Einstein radius and a fiducial cosmology, σ_{SIS}. Open dots correspond to lenses included in our selection without quality lens models. Red points correspond to SLACS lenses which have VIMOS IFU data (https://github.com/TDCOSMO/hierarchy_analysis_2020_public/blob/6c293af582c398a5c9de60a51cb0c44432a3c598/JointAnalysis/sample_selection.ipynb). 

Open with DEXTER 
We discuss possible differences between the SLACS and TDCOSMO samples and the possibility of trends within the samples impacting our analysis in a systematic way in Sect. 8.3.2 after presenting the results of the hierarchical analysis of the joint sample. We list all the relevant measured values and uncertainties of the 33 SLACS lenses in Appendix E.
6.4. Line of sight convergence estimate
We compute the probability for the external convergence given the relative number counts, P(κ_{ext}ζ_{1}, ζ_{1/r}), following Greene et al. (2013) (see e.g., Rusu et al. 2017, 2020; Chen et al. 2019; BuckleyGeer et al. 2020). In brief, we select from the Millennium Simulation (MS; Springel et al. 2005) line of sights which satisfy the relative weighted number density constraints measured above, in terms of both number counts and 1/r weighting (Eq. (54)). While the MS consists only of dark matter halos, we use the catalog of galaxies painted on top of these halos following the semianalytical models of De Lucia & Blaizot (2007). We implement the same magnitude cut, aperture radius etc. which were employed in measuring the relative weighted number densities for the SLACS lenses, in order to compute ζ_{1}, ζ_{1/r} corresponding to each line of sight in the MS. We then use the κ maps computed by Hilbert et al. (2009) and read off the values corresponding to the location of the selected line of sight, thus constructing the p(κ_{ext}ζ_{1}, ζ_{1/r}) probability density function (PDF). The Hilbert et al. (2009) maps were computed for a range of source redshift planes. Over the range spanned by the source redshifts of the SLACS lenses, there are 17 MS redshift planes, with spacing Δz ∼ 0.035−0.095. We used the maps best matching the source redshift of each SLACS lens. For 23 of the SLACS lenses there are available external shear measurements by Shajib et al. (2020b), which we used, optionally, as a third constraint. Compared to previous inferences of p(κ) for the TDCOSMO lenses, we made two computational simplifications to our analysis, in order to be able to scale our technique to the significantly larger number of lenses: (1) We did not resample from the photometry of the MS galaxies, taking into account photometric uncertainties similar to those in the observational data. A toy simulation showed that this step results in negligible differences. (2) We use only 1/8 of the lines of sight in the MS. We then checked that this results in Δκ ≲ 0.001 offsets, negligible for the purpose of our analysis.
Figure 9 shows the p(κ_{ext}ζ_{1}, ζ_{1/r}) distributions for the subselected sample based on morphology and local environment. As expected from the significantly lower source redshifts of the SLACS sample compared to the TDCOSMO lenses, most of p(κ) PDFs for the individual lenses are very narrow and peak at ∼zero, with dispersion ∼0.01. This is because the volume is smaller and thus there are relatively few structures in the MS at these low redshifts to contribute. In fact, the relative weighted number density constraints have relatively little impact on most of the p(κ) distributions, which resemble the PDFs for all lines of sight. Finally, we note that, while our approach to infer p(κ) for the SLACS lenses is homogeneous, this is not the case for the TDCOSMO lenses. This is by necessity, as the environmental data we used for the TDCOSMO lenses has varied in terms of depth, number of filters and available targeted spectroscopy. Nonetheless, as we have shown through simulations by Rusu et al. (2017, 2020), such differences do not bias the p(κ) inference.
Fig. 9. External convergence posteriors of the 33 SLACS lenses that pass our morphology and local environment selection cut based on the weighted number counts (gray dashed lines) and the population distribution (black solid line) (https://github.com/TDCOSMO/hierarchy_analysis_2020_public/blob/6c293af582c398a5c9de60a51cb0c44432a3c598/JointAnalysis/sample_selection.ipynb). 

Open with DEXTER 
6.5. SLACS inference
Here we present the hierarchical inference on the mass profile and anisotropy parameters from the selected sample of the SLACS lenses. We remind the reader that we use 33 SLACS lenses, of which 14 have imaging modeling constraints on the powerlaw slope γ_{pl}. Nine of the lenses in our final sample have also VLT VIMOS IFU constraints in addition to SDSS spectroscopy (five of which have imaging modeling constraints on the powerlaw slope). The separate inference presented in this section is meant to provide consistency checks and to gain insights into how the likelihood of the SLACS data set is going to impact the constraints on the mass profiles, and thus H_{0}, when combining with the TDCOSMO data set.
We are making use of the marginalized posteriors in the lens model parameters of Shajib et al. (2020b) in the same way as for the TDLMC and TDCOSMO sample. For SLACS lenses that do not have a model and parameter inference by Shajib et al. (2020b), we use the Einstein radii measured by Auger et al. (2009) derived from a singular isothermal ellipsoid (SIE) lens model. For the powerlaw slopes of those lenses we apply the inferred Gaussian population distribution prior on γ_{pl} from the selected sample which has measured values, with γ_{pl, pop} = 2.10 ± 0.16. Figure 10 presents the imaging data inferred γ_{pl} for the 14 quality lenses selected in our sample by Shajib et al. (2020b).
Fig. 10. Powerlaw slope γ_{pl} inferences obtained from HST imaging data for the 14 SLACS lenses within our selection cut from Shajib et al. (2020a,b). We derive a population distribution from these lenses to be applied on the subset of lenses without measured γ_{pl} from imaging data. The black dashed line indicates the population mean and the blue band the 1sigma population width based on the 14 individual measurements (https://github.com/TDCOSMO/hierarchy_analysis_2020_public/blob/6c293af582c398a5c9de60a51cb0c44432a3c598/JointAnalysis/sample_selection.ipynb). 

Open with DEXTER 
Table 4 presents the parameters and priors used in the hierarchical inference of this section. In particular, we fix the cosmology to assess constraining power and consistency with the TDCOSMO data set. We separate the inference on λ_{int, 0} of the VIMOS IFU data set from the SDSS measurements to assess systematic differences between the two data products. Further more, we use a uniform prior in a_{ani}, 𝒰(a_{ani}), rather than a logarithmic prior 𝒰(log(a_{ani})), to assess and illustrate the information on the anisotropy parameter from the IFU data set.
For the analysis of the SLACSonly sample in this section, we fix the cosmological model. The cosmological dependence folds in the prediction of the velocity dispersion through the distance ratio D_{s}/D_{ds} (Eq. (17)). This ratio is not sensitive to H_{0} and the SLACSonly data set is not constraining H_{0}. When combining the SLACS and TDCOSMO sample in the next section, the cosmology dependence is fully taken into account.
We perform two posterior inferences: one with the SDSS velocity dispersion data only, and one combining SDSS and VIMOS IFU binned dispersions. Figure 11 shows the two different posteriors. The constraints on λ_{int} (parameters λ_{int, 0}, α_{λ}, σ(λ_{int})) come for all three cases entirely from the kinematics of the SDSS measurements.
Fig. 11. Posterior distribution for the SLACS sample with priors according to Table 4. Orange: inference with the SDSS spectra. Purple: inference with SDSS spectra and VIMOS IFU data set. The posterior of λ_{int, 0} was blinded during the analysis (https://github.com/TDCOSMO/hierarchy_analysis_2020_public/blob/6c293af582c398a5c9de60a51cb0c44432a3c598/SLACS_sample/SLACS_constraints.ipynb). 

Open with DEXTER 
All the parameters are statistically consistent with each other and the TDCOSMO analysis of Sect. 5 except the posterior in the scatter of the internal MST, σ(λ_{int}). The TDCOSMO constraints of σ(λ_{int}) are consistent with zero scatter in the mass profile parameter and 2sigma bound at 0.1, while the inference of the SLACS sample results in a larger scatter. An underestimation of uncertainties in the velocity dispersion measurements, if not accounted for in the analysis, will directly translate to an increase in σ(λ_{int}). We point out the excellent agreement of the anisotropy distribution with the TDLMC Rung3 hydrodynamical simulations (Sect. 4).
7. Hierarchical analysis of TDCOSMO+SLACS
We describe now the final and most stringent analysis of this work, obtained by combining the analysis of the TDCOSMO lenses, presented in Sect. 5, and that of the SLACS sample, presented in Sect. 6. The parameterization and priors have been validated on the TDLMC mock data set in Sect. 4. We remind the reader that the choices of the analyses are identical and thus we can combine the TDCOSMO and SLACS sample on the likelihood level. We define the parameterization and priors of our hierarchical model in Sect. 7.1 and present the result and the H_{0} measurement in Sect. 7.2.
7.1. Parameterization and priors
For our final H_{0} measurement, we assume a flat ΛCDM cosmology with uniform prior in H_{0} in [0, 150] km s^{−1} Mpc^{−1} and a narrow prior on Ω_{m} with 𝒩(μ = 0.298, σ = 0.022) from the Pantheon SNIa sample (Scolnic et al. 2018, see Sect. 3.2.4). For λ_{int}, we assume an identical distribution for the selected population of the SLACS lenses and the TDCOSMO sample for the scaling in r_{eff}/θ_{E} (Eq. (50)). We also assume the same stellar anisotropy population distributions for the SLACS and TDCOSMO lenses. To account for potential systematics in the VIMOS IFU measurement (see Sect. 6.2.2), we introduce a separate a separate internal MST distribution λ_{ifu}, effectively replacing λ_{int} when fitting the IFU data. This approach allows us to use the anisotropy constraints from the IFU data while not requiring a perfect absolute calibration of the measurements. For the external convergence we use the individual p(κ_{ext}) distributions from the two samples.
As discussed in Sect. 6, there is an inconsistency in the inferred spread in the λ_{int} distribution between the SLACS and TDCOSMO sample. We attribute this inconsistency to uncertainties that were not accounted for in the velocity dispersion measurements of the SDSS data products. In our joint analysis, we add a parameter that describes an additional relative uncertainty in the velocity dispersion measurements, σ_{σP, sys}, such that the total uncertainty in the velocity dispersion measurements is the square of the quoted measurement uncertainty plus this unaccounted term,
σ_{σP, sys} is the same for all the SDSS measured velocity dispersions. Table 5 presents all the parameters being fit for, including their priors, in our joint analysis of the SLACS and TDCOSMO sample^{18}.
Summary of the model parameters sampled in the hierarchical inference on the TDCOSMO+SLACS sample.
7.2. Results
Here we present the posteriors of the joint hierarchical analysis of 33 SLACS lenses (nine of which have IFU data) and the seven quasar timedelay TDCOSMO lenses for the parameterization and priors described in Table 5. To trace back information to specific data sets, we sample different combinations of the TDCOSMO and SLACS data sets under the same priors. The TDCOSMOonly inference was already presented in Sect. 5 and results in km s^{−1} Mpc^{−1}. Besides the TDCOSMOonly result, we perform the inference for the TDCOSMO+SLACS_{IFU} data set, effectively allowing anisotropy constraints being used on top of the TDCOSMO data set, resulting in km s^{−1} Mpc^{−1}; the TDCOSMO+SLACS_{SDSS} data set, using the SLACS lenses with their SDSS spectroscopy to inform the analysis, results in km s^{−1} Mpc^{−1}. For our final inference of this work of the joint data sets of TDCOSMO+SLACS_{SDSS + IFU}, we measure km s^{−1} Mpc^{−1}.
Figure 12 presents the key parameter posteriors of the TDCOSMOonly, TDCOSMO+SLACS_{IFU}, TDCOSMO+SLACS_{SDSS}, and the TDCOSMO+SLACS_{SDSS + IFU} analyses. Not shown on the plot are the Ω_{m} posteriors (effectively identical to the prior), the σ_{σP, sys} posteriors for the SDSS kinematics measurements, the distribution scatter parameters σ(λ_{int} and σ(a_{ani}), and the IFU calibration nuisance parameter λ_{ifu}. All the onedimensional marginalized posteriors, except for the nuisance parameter λ_{ifu}, of the different combinations of the data sets are provided in Table 6.
Fig. 12. Posterior distributions of the key parameters for the hierarchical inference. Blue: constraints from the TDCOSMOonly sample. Violet: constraints with the addition of IFU data of nine SLACS lenses to inform the anisotropy prior on the TDCOSMO sample, TDCOSMO+SLACS_{IFU}. Orange: constraints with a sample of 33 additional lenses with imaging and kinematics data (HST imaging + SDSS spectra) from the SLACS sample, TDCOSMO+SLACS_{SDSS}. Purple: joint analysis of TDCOSMO and 33 SLACS lenses with SDSS spectra of which nine have VIMOS IFU data, TDCOSMO+SLACS_{SDSS + IFU}. Priors are according to Table 5. The 68th percentiles of the 1D marginalized posteriors are presented in Table 6. The posteriors in H_{0} and λ_{int, 0} were held blinded during the analysis (https://github.com/TDCOSMO/hierarchy_analysis_2020_public/blob/6c293af582c398a5c9de60a51cb0c44432a3c598/JointAnalysis/joint_inference.ipynb). 

Open with DEXTER 
We compare the best fit model prediction of the joint TDCOSMO+SLACS_{SDSS + IFU} inference to the timedelay distance and kinematics of the TDCOSMO data set in Fig. 13, to the SDSS velocity dispersion measurements in Fig. 14 and to the IFU data set in Fig. 15. The model prediction uncertainties include the population distributions in λ_{int} and a_{ani} and the measurement uncertainty in the SDSS and VIMOS velocity dispersion uncertainties include the inferred σ_{σP, sys} uncertainty.
Fig. 13. Illustration of the goodness of the fit of the maximum likelihood model of the joint analysis in describing the TDCOSMO data set. Blue points are the measurements with the diagonal elements of the measurement covariance matrix. Orange points are the model predictions with the diagonal elements of the model covariance uncertainties. Left: comparison of measured timedelay distance from imaging data and time delays compared with the predicted value from the cosmological model, the internal and external MST (and their distributions). Right: comparison of the velocity dispersion measurements and the predicted values. In addition to the MST terms, the uncertainty in the model also includes the uncertainty in the anisotropy distribution a_{ani}. For lenses with multiple velocity dispersion measurements, the diagonal terms in the error covariance are illustrated (https://github.com/TDCOSMO/hierarchy_analysis_2020_public/blob/6c293af582c398a5c9de60a51cb0c44432a3c598/JointAnalysis/joint_inference.ipynb). (a) Fit to the timedelay distance. (b) Fit of velocity dispersion. 

Open with DEXTER 
Fig. 14. Illustration of the goodness of the fit of the maximum likelihood model of the joint analysis in describing the SDSS velocity dispersion measurements of the 34 SLACS lenses in our sample. Blue points are the measurements with the diagonal elements of the measurement covariance matrix. Orange points are the model predictions with the diagonal elements of the model covariance uncertainties. The measurement uncertainties include the uncertainties in the quoted measurements and the additional uncertainty of σ_{σP, sys}. The model uncertainties include the lens model uncertainties and the marginalization over the λ_{int} and a_{ani} distribution (https://github.com/TDCOSMO/hierarchy_analysis_2020_public/blob/6c293af582c398a5c9de60a51cb0c44432a3c598/JointAnalysis/joint_inference.ipynb). 

Open with DEXTER 
Fig. 15. Illustration of the goodness of the fit of the maximum likelihood model of the joint analysis in describing the VIMOS radially binned IFU velocity dispersion measurements of the nine SLACS lenses with VIMOS data in our sample. Blue points are the measurements with the diagonal elements of the measurement covariance matrix. Orange points are the model predictions with the diagonal elements of the model covariance uncertainties. The measurement uncertainties include the uncertainties in the quoted measurements and the additional uncertainty of σ_{σP, sys}. The model uncertainties include the lens model uncertainties and the marginalization over the λ_{int} and a_{ani} distribution (https://github.com/TDCOSMO/hierarchy_analysis_2020_public/blob/6c293af582c398a5c9de60a51cb0c44432a3c598/JointAnalysis/joint_inference.ipynb). 

Open with DEXTER 
In Fig. 16 we assess trends in the fit of the kinematic data in regards to lensing deflector properties. We see that with the r_{eff}/θ_{E} scaling by α_{λ} (Eq. (50)) we can remove systematic trends in model predictions. We do not find statistically significant remaining trends in our data set beyond the ones explicitly parameterized and marginalized over.
Fig. 16. Relative difference of the measured vs the predicted velocity dispersion for the SLACS and TDCOSMO sample as a function of different parameters associated with the lineofsight and the lensing galaxy. In particular, these are the relative inverse distance weighted over density ζ_{1/r}, the ratio of halflight radius to Einstein radius r_{eff}/θ_{E}, lens redshift z_{lens}, and SIS equivalent velocity dispersion σ_{SIS} (https://github.com/TDCOSMO/hierarchy_analysis_2020_public/blob/6c293af582c398a5c9de60a51cb0c44432a3c598/JointAnalysis/joint_inference.ipynb). 

Open with DEXTER 
8. Discussion
In this section^{19}, we discuss the interpretation of our measurement of H_{0}, the robustness of the uncertainties, and present an avenue for further improvements in the precision while maintaining accuracy. We first summarize briefly the key assumptions of this work, and give a physical interpretation of the results (Sect. 8.1). Second, we estimate the contribution of each individual assumption and dataset to the total error budget of the current analysis on H_{0} (Sect. 8.2). Third, we discuss specific aspects of the analysis that need further investigations to maintain accuracy with increased precision in Sect. 8.3. Fourth, in Sect. 8.4 we present the near future prospects for collecting data sets and revising the analysis to increase further the precision on H_{0} with strong lensing timedelay cosmography. Finally, in Sect. 8.5, we compare and discuss the H_{0} measurement of this work with previous work by the TDCOSMO collaboration.
8.1. Physical interpretation of the result
While consistent with the results of Wong et al. (2020), Millon et al. (2020), our inference of H_{0} has significantly lower precision for the TDCOSMO sample, even with the addition of external datasets from SLACS. The larger uncertainty was expected and is a direct result of relaxing the assumptions on the mass profile. By introducing a masssheet degeneracy parameter, we add the maximal degree of freedom in H_{0} while having minimal constraining power by lensing data on their own. This is the most conservative approach when adding a single degree of freedom in our analysis. While mathematically this result is clearly understood, it is worth discussing the physical interpretation of this choice.
If we had perfect cosmological numerical simulations or perfect knowledge of the internal mass distribution within elliptical galaxies, we would not have to worry about the internal MST. The approach chosen by our collaboration (Wong et al. 2020; Shajib et al. 2019; Millon et al. 2020) was to assume physically motivated mass profiles with degrees of freedom in their parameters. In particular, the collaboration used two different mass profiles, a powerlaw elliptical mass profile, and a composite mass profile separating the luminous component (with fixed masstolight ratio) and a dark component described as a NFW profile. The good fit to the data, the small pixellated corrections on the profiles from the first lens system (Suyu et al. 2010), and the good agreement of H_{0} inferred with the two mass profiles was a positive sanity check on the result (Millon et al. 2020).
In this paper we have taken a different viewpoint, and asked how much can the mass profiles depart from a powerlaw and still be consistent with the data. By phrasing the question in terms of the MST we can conveniently carry out the calculations, because the MST leaves the lensing observables unchanged and therefore it corresponds to minimal constraints and assumptions, and thus maximal uncertainties with one additional degree of freedom. However, after the inference, one has to examine the inferred MST transformed profile and evaluate it in comparison with existing and future data to make sure it is realistic. We know that the exact MST cannot be the actual answer because profiles have to go zero density at large radii, but the approximate MST discussed in Sect. 2 provides a convenient interpretation with the addition of a cored mass component.
Figure 17 illustrates a cored mass component approximating the MST inferred from this work, λ_{int} = 0.91 ± 0.04, in combination with a powerlaw model inferred from the population mean of the SLACS analysis by Shajib et al. (2020b). The analysis presented here guarantees that the inferred mass profile is consistent with the properties of TDCOSMO and SLACS lenses. We discuss below how additional data may allow us to constrain the models even further and thus reduce the overall uncertainty while keeping the assumptions at a minimum.
Fig. 17. Illustration of the inferred mass profile of the joint TDCOSMO+SLACS_{SDSS + IFU} analysis. A pure powerlaw with γ_{pl} = 2.10 ± 0.05 is shown in orange. In blue is the result of this work of λ_{int} = 0.91 ± 0.045 when interpreted as a cored mass component with R_{c} uniform in [3″, 10″]. Three dimensional density are illustrated on the left and the lensing convergence on the right. The dashed vertical line on the right panels indicates the Einstein radius. Relative difference in respect to the powerlaw model are presented in the bottom panels (https://github.com/TDCOSMO/hierarchy_analysis_2020_public/blob/6c293af582c398a5c9de60a51cb0c44432a3c598/MST_impact/MST_pl_cored.ipynb). 

Open with DEXTER 
8.2. Statistical error budget and known systematics
The total error budget of 5% on H_{0} in our combined TDCOSMO+SLACS analysis can be traced back to specific aspects of the data and the uncertainties in the model components/assumptions. Fixing λ_{int} to a singlevalued number (i.e., λ_{int} = 1) is equivalent to assuming a powerlaw profile and leads to an uncertainty in H_{0} of 2% (Millon et al. 2020). By subtracting in quadrature 2% from our total uncertainty, we estimate that the total error contribution of the MST (λ_{int}) to the error budget is 4.5%. Once the MST is introduced, the uncertainty in the mass profile is dominated by uncertainties in the measurement and modeling assumptions of the velocity dispersion. The statistical constraints on the combined velocity dispersion measurements of 33 SLACS lenses with SDSS spectroscopy, accounting for the σ_{σP, sys} contribution, and the TDCOSMO spectroscopic data set contribute 3% to the total error budget. The remaining 3.5% error contribution (in quadrature) to the total H_{0} error budget arises in equal parts from the uncertainty in the anisotropy prior distribution (⟨a_{ani}⟩, σ(a_{ani})) and the MST dependence with r_{eff}/θ_{E} (α_{λ}). The uncertainty in the lineofsight selection effect of the SLACS sample contributes a statistical uncertainty smaller than 0.5%. We note that an overall unaccountedfor shared κ_{ext} term of the ensemble of lenses in our sample would be mitigated through our MST parameterization and thus not affect our H_{0} inference.
8.3. Unaccountedfor systematics
Our framework is conservative in the sense that it imposes minimal assumptions of the mass profile in regards to H_{0}. Furthermore, the methods presented here have been internally reviewed and validated on the hydrodynamical simulations used in the TDLMC (Ding et al. 2018, 2020) (Sect. 4). Despite the known limitations of current numerical simulations at the subkpc scale, the blind validation on external data corroborates our methodology. In this section, we discuss aspects of our analysis that are not part of our validation scheme. In particular, we discuss uncertainties and potential systematics in the kinematics measurements and selection effects of the different lens samples used in this work. At the current level of precision, these are all subdominant effects, but they may be relevant as we further increase the precision.
8.3.1. Uncertainties in the kinematics measurement and modeling
Under the assumptions of this analysis, aperture stellar kinematic measurements drive the overall precision by providing the information needed to mitigate the MST. Given its crucial role, we highlight here the limitations of our kinematic treatment, in order to point the way to further improvements. First, we used a heterogeneous set of stellar velocity dispersions. The TDCOSMO measurements are based on large telescope highquality data and were the subject of extensive tests to assess systematic measurements, sometimes through repeated measurements. The nominal uncertainties are thus accurate, resulting in the internal consistency of all the TDCOSMO systems with a scatter on λ_{int} consistent with zero^{20}.
The SLACSonly analysis with the reported uncertainties of the stellar velocity dispersions leads to an inferred scatter in λ_{int} of about 10%. Assuming the same scatter in λ_{int} among the TDCOSMO and SLACS lenses, the discrepancy in the inferred σ(λ_{int}) between the two samples indicates that the reported uncertainties of the stellar velocity dispersions of the SLACS lenses do not reflect the total uncertainty. For the present analysis, we have addressed this issue by adding additional terms of uncorrelated errors. However, future work should aim to improve the determination of systematics going back to the original data (or acquiring better data), and contemplate the possibility of correlated calibration errors, as due for example to the choice of stellar library or instrumental setup. Second, our analysis is based on spherical Jeans models, assuming anisotropy of the Osipkov–Merritt form. These approximations are sufficient given the current uncertainties and constraints, but future work should consider at least axissymmetric Jeans modeling (e.g., Cappellari 2008; Barnabè et al. 2012; Posacki et al. 2015; Yıldırım et al. 2020), and consider alternate parameterizations of anisotropy. Another possibility is the use of axisymmetric modeling of the phasespace distribution function with a twointegral Schwarzschild method by Cretton et al. (1999), Verolme & de Zeeuw (2002) as performed by Barnabè & Koopmans (2007), Barnabè et al. (2009).
The addition of more freedom to the kinematic models will require the addition of more empirical information that can be obtained by spatially resolved data on distant lens galaxies, or from highquality data (including absorption line shapes) of appropriately selected local elliptical galaxies.
8.3.2. Selection effects of different lens samples
One key pillar in this analysis to improve the precision on the H_{0} measurement from the TDCOSMO sample is the information on the mass profiles of the SLACS sample. The SLACS sample differs in terms of the redshift distribution and r_{eff}/θ_{E} relative to the TDCOSMO sample. Beyond our chosen explicit parameterized dependence of the MST parameter λ_{int} as a function of r_{eff}/θ_{E} we do not find trends in the predicted vs measured velocity dispersion within the SLACS sample. However, we do find differences in the external shear contributions between the SLACS and TDCOSMO sample (Shajib et al. 2020b). This is expected because of selection effects. The TDCOSMO sample is composed of quads at higher redshift than SLACS. So it is not surprising that the TDCOSMO lenses tend to be more elongated (to increase the size of the quad cross section) and be more impacted by mass structure along the line of sight than SLACS. Nonetheless, based on previous studies, we have no reason to suspect that the deflectors themselves are intrinsically different between SLACS and TDCOSMO. Complex angular structure of the lenses might also affect the inference in the powerlaw slope γ_{pl}, as the angular degree of freedoms in our model assumptions are, to some degree, limited (Kochanek 2020b). A study with more lenses and particularly sampling the redshift range of the TDCOSMO sample (see Fig. 16) would allow us to better test our current underlying assumption and in case of a significant redshift evolution to correct for it.
8.3.3. Lineofsight structure
The investigation of the lineofsight structure of strong gravitational lenses of the TDCOSMO and the SLACS sample follows a specific protocol to provide an individual PDF of the external convergence, p(κ_{ext}). In our current analysis, the statistical uncertainty of the SLACS lineofsight structure is subdominant.
In the future – as the other terms of the error budget shrink and this one becomes more relevant – the following steps will be necessary. First, the specific choice of Nbody simulation and semianalytic galaxy evolution model will need to be revisited. Second, it will be necessary to investigate how to improve the comparison with simulation products in order to further mitigate uncertainties. For instance, beyond galaxy number count statistics, weak gravitational lensing observations can also add information on the lineofsight structure (Tihhonova et al. 2018, 2020).
Ideally, we aim for a validation based on simulations in the full cosmological context. These future simulations should include the presence of the strong lensing deflector, to further quantify nonlinear effects from the lineofsight structure on the main deflector modeling as well as the main deflector impact on the lineofsight light path differences (see e.g., Li et al. 2020). Meeting the lineofsight goal will require large box simulations, and for the main deflector this demands a very high fidelity and resolution at the 10−100 pc scales dominated by baryons in the form of stars and gas.
8.3.4. More flexible lens models and extended hierarchical analysis
Getting the uncertainties right requires careful judgment in the use of theoretical assumptions, validated as much as possible by empirical data. Previous work by TDCOSMO assumed that galaxies were described by power laws or stars plus an NFW profile, leading to a given precision. In this work, we relax this assumption, with the goal to study the impact of the MST. As part of this investigation, we introduce the MST parameter λ_{int} in our hierarchical framework and use a PEMD + shear model as baseline. We demonstrate, based on simulations, that these choices are sufficient to the level of precision currently achieved. It is not, however, the end of the story. Additional information will enable better constraints on the mass density profiles. As the precision improves on H_{0}, it will be necessary to keep revisiting our assumptions and validating on a sufficiently large and realistic mock data set.
In the future, additional model flexibility may demand a treatment of more lens model parameters in the full hierarchical context of the inference. Currently, our baseline model is constrained sufficiently by the imaging data of the lensing sample.
However, the development of a hierarchical treatment of additional lensing parameters may also allow us to incorporate lenses with fewer constraints on the lensing nature, such as doubly lensed quasars, or lenses with missing high resolution imaging, or other partially incomplete data products. By pursuing further this development in hierarchical lens modeling, the total number of usable systems can improve, thus, in turn improving the constraints on the Hubble constant.
Substructure adds 0.6%−2% of uncorrelated and unbiased uncertainties on the D_{Δt} inference (Gilman et al. 2020) for individual lenses. Thus, substructure adds a 0.5% uncertainty in quadrature on the combined H_{0} constraints from the seven TDCOSMO lenses. This effect is highly subdominant to other sources of uncertainties related to the MST in our work and we note that this effect might partially be encapsulated in the scatter in λ_{int}, σ(λ_{int}), as inferred to be few percent.
8.4. A pathway forward for timedelay cosmography
After having discussed current limitations on the precision and accuracy of our new proposed hierarchical framework applied to timedelay cosmography, we summarize here the key steps to take in the near future, in terms of improvements on the analysis and addition of data, to improve both precision and accuracy in the H_{0} measurements. Given the new hierarchical context, our largest statistical uncertainty on H_{0} arises from the stellar anisotropy modeling assumptions and the precision on the velocity dispersion measurements. Multiple and spatially resolved high signaltonoise velocity dispersion measurements of gravitational lenses are able to further constrain the stellar anisotropy distribution. This can be provided by a large VLTMUSE and KeckKCWI campaign of multiple lenses and we expect significant constraining power from JWST (Yıldırım et al. 2020). A complementary approach of studying the mass profile and kinematic structure of the deflector galaxies, is to study the local analogs of those galaxies with high signaltonoise ratio resolved spectroscopy. Assumptions about potential redshift evolution need to be mitigated and assessed within a lensing sample covering a wide redshift range.
A more straightforward approach in extending our analysis is by incorporating more galaxygalaxy lenses, in particular lenses that populate a similar distribution to the lensed quasar sample. Such a targeted large sample can reduce potential systematics of our selfsimilarity assumptions, as well as increase the statistical precision on the mass profiles. Recent searches for strong gravitational lenses in current and ongoing large area imaging surveys, such as the Dark Energy Survey (DES) and the HyperSupremeCam survey (HSC) have resulted in hundreds of promising galaxygalaxy scale candidate lenses (see e.g., Jacobs et al. 2019; Sonnenfeld et al. 2020) and dozens of lensed quasars (see e.g., Agnello et al. 2018; Delchambre et al. 2019; Lemon et al. 2020).
With the next generation large ground and space based surveys (Vera Rubin Observatory LSST, Euclid, Nancy Grace Roman Space Telescope), of order 10^{5} galaxygalaxy lenses and of order 10^{3} quasargalaxy lenses will be discovered (Oguri & Marshall 2010; Collett 2015). Limited followup capabilities with high resolution imaging and spectroscopy will be a key limitation and needs to be mitigated with strategic prioritization of targets to maximize resulting precision and accuracy. We refer to Birrer & Treu (2020) for a forcast based on the precision on H_{0} we can expect for a current and future lensing sample with spatially resolved kinematics measurement based on the analysis framework presented in this work.
Beyond the addition of external data sets, we emphasize the further demand on the validation of the modeling approach, both in the imaging analysis as well as the stellar anisotropy modeling. Detailed investigation and data challenges based on realistic data with the same complexity level as the real analysis are a useful tool to make progress. To ensure that the requirements are met in the modeling of the deflector galaxy and the local and lineofsight environment, validation on realistic simulations in the full cosmological context, including selection effects and raytracing through the lineofsight cone of a cosmological box are required. Moreover, we also stress that assessing and tracking systematics at the percent level and the mitigation thereof on the joint inference on H_{0} would be much facilitated by an automatized and homogenized analysis framework encapsulating all relevant aspects of the analysis of individual lenses.
Finally, a decisive conclusion on the current Hubble tension demands for a rigorous assessment of results by different science collaborations. We stress the importance of conducting the analysis blindly in regard to H_{0} and related quantities to prevent experimenter bias, a procedure our collaboration has incorporated and followed rigorously. In addition, all measurements of H_{0} contributing to a decisive conclusion of the tension must guarantee reproducibility. In this work, we provide all software as opensource and release the valueadded data products and analysis scripts to the community to facilitate the needed reproducibility.
8.5. Postblind discussion of the results and comparison with previous timedelay cosmography work
In this section^{21} we discuss how the measurement presented in this paper related to previous work by members of this collaboration as part of the H0LiCOW, STRIDES, and SHARP projects. We then discuss the relationship between the multiple measurements obtained within the hierarchical framework introduced in this paper. All the relevant measurements are summarized in Fig. 18 for quick visualization.
Fig. 18. Comparison of different blind H_{0} measurements by the TDCOSMO collaboration, based on different mass profile assumptions and data sets incorporated. All measurements presented on this plot were performed blindly with regard to the inference of H_{0}. The measurement on top is the combined H0LiCOW six lenses constraints presented by Wong et al. (2020), when averaging powerlaw and composite NFW plus stars (with constant masstolight ratio) on a lensbylens basis without correlated errors among the lenses. The next two measurements are from Millon et al. (2020) of six TDCOSMO timedelay lenses (five H0LiCOW lenses^{22} and one STRIDES lens by Shajib et al. 2020a), when performing the inference assuming either a composite NFW plus stars (with constant masstolight ratio) or the powerlaw mass density profile for the galaxy acting as a lens. Lower panel: results from this work. The main difference with respect to previous work is that we have made virtually no assumption on the radial mass density profile of the lens galaxy, and taken into account the covariance between the lenses. The analysis in this work is constrained only by the stellar kinematics and fully accounts for the uncertainty related to the mass sheet transformation (MST). In this framework, we obtain four measurements according to the datasets considered. The TDCOSMOonly inference is based on the same set of seven lenses as those jointly included by Millon et al. (2020) and Wong et al. (2020). The inferred median value is the same, indicating no bias, and the uncertainties, as expected, are larger. The next three measurements rely on external datasets from the SLACS survey, by making the assumption that the lens galaxies in the two surveys are drawn from the same population. The TDCOSMO+SLACS_{IFU} measurements uses, in addition to the TDCOSMO sample, nine lenses from the SLACS sample with IFU observations to inform the anisotropy prior applied on the TDCOSMO lenses. The TDCOSMO+SLACS_{SDSS} measurement comes from the joint analysis of the TDCOSMO sample and 33 SLACS lenses with SDSS spectroscopy. The TDCOSMO+SLACS_{SDSS + IFU} presents the joint analysis of all three data sets, again assuming selfsimilar distributions of the mass profiles and stellar anisotropy. The TDCOSMOonly and TDCOSMO+SLACS_{IFU} analyses do not rely on selfsimilar mass profiles of the SLACS and TDCOSMO sample while the TDCOSMO+SLACS_{SDSS} and TDCOSMO+SLACS_{SDSS + IFU} measurements (orange and purple) do. All the measurements shown in this plot are in statistical agreement with each other. See Sect. 8.5 for a discussion and physical interpretation of the results (https://github.com/TDCOSMO/hierarchy_analysis_2020_public/blob/6c293af582c398a5c9de60a51cb0c44432a3c598/JointAnalysis/tdcosmo_comparison_plot.ipynb). 

Open with DEXTER 
The result of our hierarchical TDCOSMOonly analysis is fully consistent with the assumptions on the mass profiles made in previous H0LiCOW/STRIDES/SHARP work (see e.g., Wong et al. 2020; Shajib et al. 2020a; Millon et al. 2020). The consistency is reinforced by Yang et al. (2020) who concluded that the combination of kinematics and timedelay constraints are consistent with General Relativity, an underlying assumptions of timedelay cosmography. The only difference with respect to the H0LiCOW/STRIDES/SHARP analysis is that the uncertainty has significantly increased. This was expected, because we have virtually eliminated the assumptions on the radial mass profile of elliptical galaxies and, due to the MST, the only source of information left to enable a H_{0} measurement is the stellar kinematics. Without lensing information, due to the well known massanisotropy degeneracy, unresolved kinematics has limited power to constrain the mass profiles. Since our parametrization is maximally degenerate with H_{0} and our assumptions are minimal, this 9% error budget accounts for potential effects of the MST.
Another set of results is obtained within the hierarchical framework with the addition of external information. Under the additional assumption that the galaxies in the external datasets are drawn from the same population as the TDCOSMO deflectors, these results achieve higher precision than TDCOSMO alone. Adding the SLACS dataset shrinks the uncertainty to 5% and shifts the mean inferred H_{0} to a value about 6 km s^{−1} Mpc^{−1} lower than the TDCOSMOonly analysis. This shift is consistent within the uncertainties achieved by the TDCOSMOonly analysis and can be traced back to two factors: (i) the anisotropy constraints prefer a lower a_{ani} value and this moves H_{0} down relative to the chosen prior on a_{ani}. The VIMOS+IFU inference is about 2 km s^{−1} Mpc^{−1} lower than the equivalent TDCOSMOonly inference. (ii) The SLACS lenses prefer an overall lower – but statistically consistent – λ_{int, 0} value for a given anisotropy model by about 8%. The negative trend of λ_{int} with r_{eff}/θ_{E} (α_{λ}) partially mitigates an even lower λ_{int} value preferred by the SLACS sample relative to the TDCOSMO sample.
The shift between the TDCOSMO and TDCOSMO+SLACS results can have two possible explanations (if it is not purely a statistical fluctuation). One option is that elliptical galaxies are more radially anisotropic (and therefore have a flatter mass density profile to reproduce the same velocity dispersion profile) than the prior used to model the TDCOSMO galaxies. The alternative option is that the TDCOSMO and SLACS galaxies are somehow different. Within the observables at disposal, one that may be indicative of a different line of sight anisotropy is the higher ellipticity of the surface brightness and of the projected total mass distribution (Shajib et al. 2020b) of the TDCOSMO deflectors in comparison to the SLACS deflectors. As mentioned in Sect. 6.3, this is understood to be a selection effect because ellipticity increases the cross section for quadruple images and TDCOSMO is a sample of mostly quads (six out of seven), while SLACS is mostly doubles (Treu et al. 2009). Departure from spherical symmetry in elliptical galaxies can arise from rotation or anisotropy. If flattening arises from rotation (which we have neglected in our study) more flattened systems are more likely to be seen edgeon. If it arises from anisotropy, the observed flattening could be due to tangential anisotropy that is not included in our models, or to a smaller degree of radial anisotropy than for other orientations. These two options result in different predictions that can be tested with spatially resolved kinematics of the TDCOSMO lens galaxies. If the shift is just due to an inconsistency between the TDCOSMO prior and the SLACS likelihood, spatially resolved kinematics will bring them in closer alignment. If it is due to intrinsic differences, spatially resolved kinematics will reveal rotation or tangential (less radial) anisotropy. In addition, spatially resolved kinematics of the TDCOSMO sample will reduce the uncertainties of both measurement, and thus resolve whether the shift is a fluctuation or significant.
The other potential way to elucidate the marginal differences between the TDCOSMO and SLACS sample is to obtain precise measurements of mass at scales well beyond the Einstein radius. As seen in Fig. 17, a pure power law and the transformed profile differ by up to 50% in that region (depending on the choice of R_{c}). Satellite kinematics or weak lensing would help reduce the freedom of the MST, provided they reach sufficient precision.
9. Conclusion
The precision of timedelay cosmography has improved significantly in the past few years, driven by improvement in the quality of the data and methodology. As the precision improves it is critical to revisit assumptions and explore potential systematics, while charting the way forward.
In this work, we relaxed previous assumptions on the massprofile parameterization and introduced an efficient way to explore potential systematics associated to the masssheet degeneracy in a hierarchical Bayesian analysis. In this new approach, the mass density profile of the lens galaxies is only constrained by basic information on stellar kinematics. It thus provides a conservative estimate of how much the mass profile can depart from a power law, and how much the error budget can grow as a result. Based on the consistent results of the power law and stars plus NFW profiles in the inference on H_{0} (Millon et al. 2020), we expect very similar conclusions had we performed this analysis with a stars plus NFW profile.
We validated our approach on the TimeDelay Lens Modeling Challenge sample of hydrodynamical simulations. We then applied the formalism and assumptions to the TDCOSMO data set in a blind fashion. Based on the TDCOSMO data set alone we infer km s^{−1} Mpc^{−1}. The uncertainties on H_{0} are dominated by the precision of the spectroscopic data and the modeling uncertainties therein. To further increase our precision, we added selfconsistently to our analysis a set of SLACS lenses with imaging modeling and independent kinematic constraints. We characterized the candidate lenses to be added and explicitly selected only lenses that do not have significantly enhanced local environments. In total, we were able to add 33 additional lenses with no time delay information of which nine have additional 2D kinematics with VIMOS IFU data that allowed us to further constrain uncertainties in the anisotropy profile of the stellar orbits. Our most constrained measurement of the Hubble constant is km s^{−1} Mpc^{−1} from the joint TDCOSMO+SLACS analysis, assuming that the two samples are drawn from the same population.
The 5% error budget reported in this work addresses conclusively concerns about the MST (Schneider & Sluse 2013; Sonnenfeld 2018; Kochanek 2020a, 2020b). If the mass density profiles of lens galaxies are not well described by powerlaws or stars plus NFW halos, this is the appropriate uncertainty to associate with current timedelay cosmography. Additional effects are very much subdominant for now as compared with the effect of the MST. For example, the small level of pixelated corrections to the elliptical powerlaw model obtained in our previous work suggests that the departure from ellipticity is not required by the data.
Based on the methodology presented and the results achieved, we lay out a roadmap for further improvements to ultimately enable a 1% precision measurement of the Hubble constant, which is a clear target both for resolving the Hubble tension and to serve as a prior on dark energy studies (Weinberg et al. 2013). The key ingredients required to reduce the statistical uncertainties are (i) spatially resolved high signaltonoise kinematic measurements; (ii) an increase in the sample size of both lenses with measured timedelays and lenses with highresolution imaging and precise kinematic measurements. Potential sources of systematic that should be investigated further to maintain accuracy at the target precision are those arising from: (i) measurements of the stellar velocity dispersion; (ii) characterization of the selection function and local environment of all the lenses included in the inference; (iii) mass profile modeling assumptions beyond the MST and stellar anisotropy modeling assumptions.
Upcoming deep, widefield surveys (such as those enabled by Vera Rubin Observatory, Euclid and the Nancy Grace Roman Observatory) will discover many thousands of lenses of which several hundred will have accurate time delay measurements (see e.g., Oguri & Marshall 2010; Collett 2015; Huber et al. 2019). The analysis framework presented in this work will serve as a baseline for the analysis of these giant samples of lenses; simultaneously enabling precise and accurate constraints on the Hubble constant and the astrophysics of strong lensing galaxies.
See Birrer et al. (2016) for an analysis explicitly constraining the MST with kinematic data that satisfies the error budget of Kochanek (2020a).
Noting however the caveats on the realism of the TDLMC simulations discussed by Ding et al. (2020).
The integral between the deflector and the source deviates from the Born approximation as the light paths are significantly perturbed (see e.g., BarKana 1996; Birrer et al. 2017).
We note that the mean cosmological background density is already fully encompassed in the background metric and we effectively only require to model the enhancement matter density (see e.g., Wucknitz 2008; Birrer et al. 2017).
To use the IFU data set more optimally, we add the lens SDSSJ02160813, which is the remaining lens within the IFU quality sample that was not selected by Shajib et al. (2020b) from the original SLACS sample.
This section, with the exception of Sect. 8.5, was written before the results of the combined TDCOSMO+SLACS analysis were known to the authors and, thus, reflect the assessment of uncertainties present in our analysis agnostic to its outcome.
Excluding B1608+656 as this lens was only analyzed with a powerlaw model and not with a composite model and thus not part of the model comparison analysis. Additional lensing potential perturbations on top of the powerlaw profile lead to only small amounts of corrections Suyu et al. (2010).
Acknowledgments
SB thanks Kfir Blum, Veronica Motta, Timo Anguita, Sampath Mukherjee, Elizabeth BuckleyGeer for useful discussions, Hyungsuk Tak for participating in the TDLMC, the TDLMC team for setting up the challenge and Yiping Shu for providing access to the SDSS velocity dispersion measurements. AJS was supported by the Dissertation Year Fellowship from the UCLA Graduate Division. TT and AJS acknowledge support from NSF through NSF grant NSFAST1906976, from NASA through grant HSTGO15320 and from the Packard Foundation through a Packard Research Fellowship to TT. AA was supported by a grant from VILLUM FONDEN (project number 16599). This project is partially funded by the Danish council for independent research under the project “Fundamentals of Dark Matter Structures”, DFF–610800470. SB and MWA acknowledges support from the Kavli Foundation. TC is funded by a Royal Astronomical Society Research Fellowship. C.D.F. and G.C.F.C. acknowledge support for this work from the National Science Foundation under grant no. AST1907396 and from NASA through grant HSTGO15320. This project has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation program (COSMICLENS:grant agreement No 787886). CS is supported by a Hintze Fellow at the Oxford Centre for Astrophysical Surveys, which is funded through generous support from the Hintze Family Charitable Foundation. SHS thanks the Max Planck Society for support through the Max Planck Research Group. This work was supported by World Premier International Research Center Initiative (WPI Initiative), MEXT, Japan. This work was supported by JSPS KAKENHI Grant Number JP20K14511. SB acknowledges the hospitality of the Munich Institute for Astro and Particle Physics (MIAPP) of the Excellence Cluster “Universe”. Based on observations collected at the European Southern Observatory under ESO program 0102.A0600 (PI Agnello), 075.B0226 (PI Koopmans), 177.B0682 (PI Koopmans). This work made use of the following public software packages: HIERARC (this work), LENSTRONOMY (Birrer et al. 2015; Birrer & Amara 2018), DOLPHIN (Shajib et al. 2020b), EMCEE (ForemanMackey et al. 2013), CORNER (ForemanMackey 2016), PPXF (Cappellari 2012), ASTROPY (Astropy Collaboration 2013, 2018), FASTELL (Barkana 1999) and standard Python libraries.
References
 Abbott, B. P., Abbott, R., Abbott, T. D., et al. 2017, Nature, 551, 85 [NASA ADS] [CrossRef] [PubMed] [Google Scholar]
 Abbott, T. M. C., Abdalla, F. B., Annis, J., et al. 2018, MNRAS, 480, 3879 [NASA ADS] [CrossRef] [Google Scholar]
 Agnello, A., Evans, N. W., & Romanowsky, A. J. 2014a, MNRAS, 442, 3284 [NASA ADS] [CrossRef] [Google Scholar]
 Agnello, A., Evans, N. W., Romanowsky, A. J., & Brodie, J. P. 2014b, MNRAS, 442, 3299 [NASA ADS] [CrossRef] [Google Scholar]
 Agnello, A., Sonnenfeld, A., Suyu, S. H., et al. 2016, MNRAS, 458, 3830 [NASA ADS] [CrossRef] [Google Scholar]
 Agnello, A., Lin, H., Kuropatkin, N., et al. 2018, MNRAS, 479, 4345 [NASA ADS] [CrossRef] [Google Scholar]
 Astropy Collaboration (Robitaille, T. P., et al.) 2013, A&A, 558, A33 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Astropy Collaboration (PriceWhelan, A. M., et al.) 2018, AJ, 156, 123 [Google Scholar]
 Auger, M. W., Treu, T., Bolton, A. S., et al. 2009, ApJ, 705, 1099 [NASA ADS] [CrossRef] [Google Scholar]
 Auger, M. W., Treu, T., Bolton, A. S., et al. 2010, ApJ, 724, 511 [NASA ADS] [CrossRef] [Google Scholar]
 BarKana, R. 1996, ApJ, 468, 17 [NASA ADS] [CrossRef] [Google Scholar]
 Barkana, R. 1998, ApJ, 502, 531 [NASA ADS] [CrossRef] [Google Scholar]
 Barkana, R. 1999, Astrophysics Source Code Library [record ascl:9910.003] [Google Scholar]
 Barnabè, M., & Koopmans, L. V. E. 2007, ApJ, 666, 726 [NASA ADS] [CrossRef] [Google Scholar]
 Barnabè, M., Czoske, O., Koopmans, L. V. E., et al. 2009, MNRAS, 399, 21 [NASA ADS] [CrossRef] [Google Scholar]
 Barnabè, M., Czoske, O., Koopmans, L. V. E., et al. 2011, MNRAS, 415, 2215 [NASA ADS] [CrossRef] [Google Scholar]
 Barnabè, M., Dutton, A. A., Marshall, P. J., et al. 2012, MNRAS, 423, 1073 [NASA ADS] [CrossRef] [Google Scholar]
 Bernardi, M., DomínguezSánchez, H., MargalefBentabol, B., Nikakhtar, F., & Sheth, R. K. 2020, MNRAS, 494, 5148 [CrossRef] [Google Scholar]
 Betoule, M., Kessler, R., Guy, J., et al. 2014, A&A, 568, A22 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Binney, J., & Mamon, G. A. 1982, MNRAS, 200, 361 [NASA ADS] [CrossRef] [Google Scholar]
 Binney, J., & Tremaine, S. 2008, Galactic Dynamics: Second Edition (Princeton: Princeton University Press) [Google Scholar]
 Birrer, S., & Amara, A. 2018, Phys. Dark Univ., 22, 189 [NASA ADS] [CrossRef] [Google Scholar]
 Birrer, S., & Treu, T. 2019, MNRAS, 489, 2097 [CrossRef] [Google Scholar]
 Birrer, S., & Treu, T. 2020, A&A, submitted [arXiv:2008.06157] [Google Scholar]
 Birrer, S., Amara, A., & Refregier, A. 2015, ApJ, 813, 102 [NASA ADS] [CrossRef] [Google Scholar]
 Birrer, S., Amara, A., & Refregier, A. 2016, JCAP, 2016, 020 [NASA ADS] [CrossRef] [Google Scholar]
 Birrer, S., Welschen, C., Amara, A., & Refregier, A. 2017, JCAP, 2017, 049 [CrossRef] [Google Scholar]
 Birrer, S., Treu, T., Rusu, C. E., et al. 2019, MNRAS, 484, 4726 [NASA ADS] [CrossRef] [Google Scholar]
 Blandford, R., & Narayan, R. 1986, ApJ, 310, 568 [NASA ADS] [CrossRef] [Google Scholar]
 Blum, K., Castorina, E., & Simonović, M. 2020, ApJ, 892, L27 [CrossRef] [Google Scholar]
 Bolton, A. S., Burles, S., Schlegel, D. J., Eisenstein, D. J., & Brinkmann, J. 2004, AJ, 127, 1860 [NASA ADS] [CrossRef] [Google Scholar]
 Bolton, A. S., Burles, S., Koopmans, L. V. E., Treu, T., & Moustakas, L. A. 2006, ApJ, 638, 703 [NASA ADS] [CrossRef] [Google Scholar]
 Bolton, A. S., Burles, S., Koopmans, L. V. E., et al. 2008, ApJ, 682, 964 [NASA ADS] [CrossRef] [Google Scholar]
 Bonvin, V., Tewes, M., Courbin, F., et al. 2016, A&A, 585, A88 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Bonvin, V., Tihhonova, O., Millon, M., et al. 2018, A&A, 616, A183 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Bonvin, V., Millon, M., Chan, J. H.H., et al. 2019, A&A, 629, A97 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Braibant, L., Hutsemékers, D., Sluse, D., et al. 2014, A&A, 565, L11 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 BuckleyGeer, E. J., Lin, H., Rusu, C. E., et al. 2020, MNRAS, 498, 3241 [CrossRef] [Google Scholar]
 Cappellari, M. 2008, MNRAS, 390, 71 [NASA ADS] [CrossRef] [Google Scholar]
 Cappellari, M. 2012, Astrophysics Source Code Library [record ascl:1210.002] [Google Scholar]
 Cappellari, M. 2017, MNRAS, 466, 798 [NASA ADS] [CrossRef] [Google Scholar]
 Cappellari, M., & Emsellem, E. 2004, PASP, 116, 138 [NASA ADS] [CrossRef] [Google Scholar]
 Cappellari, M., Emsellem, E., Bacon, R., et al. 2007, MNRAS, 379, 418 [NASA ADS] [CrossRef] [Google Scholar]
 Chen, G. C. F., Fassnacht, C. D., Suyu, S. H., et al. 2019, MNRAS, 490, 1743 [NASA ADS] [CrossRef] [Google Scholar]
 Coles, J. 2008, ApJ, 679, 17 [NASA ADS] [CrossRef] [Google Scholar]
 Coles, J. P., Read, J. I., & Saha, P. 2014, MNRAS, 445, 2181 [NASA ADS] [CrossRef] [Google Scholar]
 Collett, T. E. 2015, ApJ, 811, 20 [NASA ADS] [CrossRef] [Google Scholar]
 Collett, T. E., & Cunnington, S. D. 2016, MNRAS, 462, 3255 [NASA ADS] [CrossRef] [Google Scholar]
 Collett, T. E., Marshall, P. J., Auger, M. W., et al. 2013, MNRAS, 432, 679 [NASA ADS] [CrossRef] [Google Scholar]
 Courbin, F., Chantry, V., Revaz, Y., et al. 2011, A&A, 536, A53 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Courbin, F., Bonvin, V., BuckleyGeer, E., et al. 2018, A&A, 609, A71 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Cretton, N., de Zeeuw, P. T., van der Marel, R. P., & Rix, H.W. 1999, ApJS, 124, 383 [NASA ADS] [CrossRef] [Google Scholar]
 Czoske, O., Barnabè, M., Koopmans, L. V. E., Treu, T., & Bolton, A. S. 2008, MNRAS, 384, 987 [NASA ADS] [CrossRef] [Google Scholar]
 Czoske, O., Barnabè, M., Koopmans, L. V. E., Treu, T., & Bolton, A. S. 2012, MNRAS, 419, 656 [CrossRef] [Google Scholar]
 Dawson, K. S., Schlegel, D. J., Ahn, C. P., et al. 2013, AJ, 145, A10 [NASA ADS] [CrossRef] [Google Scholar]
 De Lucia, G., & Blaizot, J. 2007, MNRAS, 375, 2 [NASA ADS] [CrossRef] [Google Scholar]
 Delchambre, L., KroneMartins, A., Wertz, O., et al. 2019, A&A, 622, A165 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Dey, A., Schlegel, D. J., Lang, D., et al. 2019, AJ, 157, 168 [NASA ADS] [CrossRef] [Google Scholar]
 Diehl, H. T., BuckleyGeer, E. J., Lindgren, K. A., et al. 2017, ApJS, 232, 15 [CrossRef] [Google Scholar]
 Ding, X., Treu, T., Shajib, A. J., et al. 2018, MNRAS, submitted [arXiv:1801.01506] [Google Scholar]
 Ding, X., Treu, T., Birrer, S., et al. 2020, MNRAS, submitted [arXiv:2006.08619] [Google Scholar]
 Dobler, G., Keeton, C. R., Bolton, A. S., & Burles, S. 2008, ApJ, 685, 57 [NASA ADS] [CrossRef] [Google Scholar]
 Dobler, G., Fassnacht, C. D., Treu, T., et al. 2015, ApJ, 799, 168 [NASA ADS] [CrossRef] [Google Scholar]
 Eulaers, E., Tewes, M., Magain, P., et al. 2013, A&A, 553, A121 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Faber, S. M., & Jackson, R. E. 1976, ApJ, 204, 668 [NASA ADS] [CrossRef] [Google Scholar]
 Fadely, R., Keeton, C. R., Nakajima, R., & Bernstein, G. M. 2010, ApJ, 711, 246 [NASA ADS] [CrossRef] [Google Scholar]
 Falco, E. E., Gorenstein, M. V., & Shapiro, I. I. 1985, ApJ, 289, L1 [NASA ADS] [CrossRef] [Google Scholar]
 Fassnacht, C. D., Womble, D. S., Neugebauer, G., et al. 1996, ApJ, 460, L103 [NASA ADS] [CrossRef] [Google Scholar]
 Fassnacht, C. D., Pearson, T. J., Readhead, A. C. S., et al. 1999, ApJ, 527, 498 [NASA ADS] [CrossRef] [Google Scholar]
 Fassnacht, C. D., Xanthopoulos, E., Koopmans, L. V. E., & Rusin, D. 2002, ApJ, 581, 823 [NASA ADS] [CrossRef] [Google Scholar]
 Fassnacht, C. D., Koopmans, L. V. E., & Wong, K. C. 2011, MNRAS, 410, 2167 [NASA ADS] [CrossRef] [Google Scholar]
 ForemanMackey, D. 2016, J. Open Sour. Softw., 1, 24 [NASA ADS] [CrossRef] [Google Scholar]
 ForemanMackey, D., Hogg, D. W., Lang, D., & Goodman, J. 2013, PASP, 125, 306 [NASA ADS] [CrossRef] [Google Scholar]
 Freedman, W. L., Madore, B. F., Hatt, D., et al. 2019, ApJ, 882, 34 [NASA ADS] [CrossRef] [Google Scholar]
 Freedman, W. L., Madore, B. F., Hoyt, D., et al. 2020, ApJ, 891, 57 [CrossRef] [Google Scholar]
 Gerhard, O., Kronawitter, A., Saglia, R. P., & Bender, R. 2001, AJ, 121, 1936 [NASA ADS] [CrossRef] [Google Scholar]
 Gilman, D., Birrer, S., & Treu, T. 2020, A&A, 642, A194 [CrossRef] [EDP Sciences] [Google Scholar]
 Gorenstein, M. V., Falco, E. E., & Shapiro, I. I. 1988, ApJ, 327, 693 [NASA ADS] [CrossRef] [Google Scholar]
 Greene, J. E., Murphy, J. D., Graves, G. J., et al. 2013, ApJ, 776, 64 [NASA ADS] [CrossRef] [Google Scholar]
 Grogin, N. A., & Narayan, R. 1996, ApJ, 464, 92 [NASA ADS] [CrossRef] [Google Scholar]
 Hernquist, L. 1993, ApJ, 409, 548 [NASA ADS] [CrossRef] [Google Scholar]
 Hilbert, S., Hartlap, J., White, S. D. M., & Schneider, P. 2009, A&A, 499, 31 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Holder, G. P., & Schechter, P. L. 2003, ApJ, 589, 688 [NASA ADS] [CrossRef] [Google Scholar]
 Huang, C. D., Riess, A. G., Yuan, W., et al. 2020, ApJ, 889, 5 [NASA ADS] [CrossRef] [Google Scholar]
 Huber, S., Suyu, S. H., Noebauer, U. M., et al. 2019, A&A, 631, A161 [CrossRef] [EDP Sciences] [Google Scholar]
 Huterer, D., Keeton, C. R., & Ma, C.P. 2005, ApJ, 624, 34 [NASA ADS] [CrossRef] [Google Scholar]
 Jacobs, C., Collett, T., Glazebrook, K., et al. 2019, ApJS, 243, 17 [CrossRef] [Google Scholar]
 Jee, I., Komatsu, E., & Suyu, S. H. 2015, JCAP, 2015, 033 [NASA ADS] [CrossRef] [Google Scholar]
 Keeton, C. R., & Kochanek, C. S. 1997, ApJ, 487, 42 [NASA ADS] [CrossRef] [Google Scholar]
 Keeton, C. R., & Moustakas, L. A. 2009, ApJ, 699, 1720 [NASA ADS] [CrossRef] [Google Scholar]
 Kochanek, C. S. 2002, ApJ, 578, 25 [NASA ADS] [CrossRef] [Google Scholar]
 Kochanek, C. S. 2003, ApJ, 583, 49 [NASA ADS] [CrossRef] [Google Scholar]
 Kochanek, C. S. 2006, in SaasFee Advanced Course 33: Gravitational Lensing: Strong, Weak and Micro, eds. G. Meylan, P. Jetzer, P. North, et al. (Berlin: Springer), 91 [NASA ADS] [CrossRef] [Google Scholar]
 Kochanek, C. S. 2020a, MNRAS, 493, 1725 [CrossRef] [Google Scholar]
 Kochanek, C. S. 2020b, MNRAS, submitted [arXiv:2003.08395] [Google Scholar]
 Koopmans, L. V. E. 2004, ArXiv eprints [arXiv:astroph/0412596] [Google Scholar]
 Koopmans, L. V. E., Treu, T., Fassnacht, C. D., Blandford, R. D., & Surpi, G. 2003, ApJ, 599, 70 [NASA ADS] [CrossRef] [Google Scholar]
 Kormann, R., Schneider, P., & Bartelmann, M. 1994, A&A, 284, 285 [NASA ADS] [Google Scholar]
 Lang, D., Hogg, D. W., & Mykytyn, D. 2016, Astrophysics Source Code Library [record ascl:1604.008] [Google Scholar]
 Lemon, C., Auger, M. W., McMahon, R., et al. 2020, MNRAS, 494, 3491 [CrossRef] [Google Scholar]
 Li, N., Becker, C., & Dye, S. 2020, MNRAS, submitted [arXiv:2006.08540] [Google Scholar]
 Liao, K., Treu, T., Marshall, P., et al. 2015, ApJ, 800, 11 [NASA ADS] [CrossRef] [Google Scholar]
 Lin, H., BuckleyGeer, E., Agnello, A., et al. 2017, ApJ, 838, L15 [NASA ADS] [CrossRef] [Google Scholar]
 Mao, S., & Schneider, P. 1998, MNRAS, 295, 587 [NASA ADS] [CrossRef] [Google Scholar]
 McCully, C., Keeton, C. R., Wong, K. C., & Zabludoff, A. I. 2014, MNRAS, 443, 3631 [NASA ADS] [CrossRef] [Google Scholar]
 Merritt, D. 1985, AJ, 90, 1027 [NASA ADS] [CrossRef] [Google Scholar]
 Millon, M., Galan, A., Courbin, F., et al. 2020, A&A, 639, A101 [CrossRef] [EDP Sciences] [Google Scholar]
 Morgan, N. D., Caldwell, J. A. R., Schechter, P. L., et al. 2004, AJ, 127, 2617 [NASA ADS] [CrossRef] [Google Scholar]
 Myers, S. T., Fassnacht, C. D., Djorgovski, S. G., et al. 1995, ApJ, 447, L5 [NASA ADS] [CrossRef] [Google Scholar]
 Navarro, J. F., Frenk, C. S., & White, S. D. M. 1997, ApJ, 490, 493 [NASA ADS] [CrossRef] [Google Scholar]
 Nipoti, C., Londrillo, P., & Ciotti, L. 2006, MNRAS, 370, 681 [NASA ADS] [CrossRef] [Google Scholar]
 Oguri, M. 2007, ApJ, 660, 1 [NASA ADS] [CrossRef] [Google Scholar]
 Oguri, M., & Marshall, P. J. 2010, MNRAS, 405, 2579 [NASA ADS] [Google Scholar]
 Oguri, M., Inada, N., Hennawi, J. F., et al. 2005, ApJ, 622, 106 [NASA ADS] [CrossRef] [Google Scholar]
 Osipkov, L. P. 1979, Pisma v Astronomicheskii Zhurnal, 5, 77 [Google Scholar]
 Paraficz, D., & Hjorth, J. 2009, A&A, 507, L49 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Pesce, D. W., Braatz, J. A., Reid, M. J., et al. 2020, ApJ, 891, L1 [CrossRef] [Google Scholar]
 Philcox, O. H. E., Ivanov, M. M., Simonović, M., & Zaldarriaga, M. 2020, JCAP, 2020, 032 [CrossRef] [Google Scholar]
 Planck Collaboration VI. 2020, A&A, 641, A6 [CrossRef] [EDP Sciences] [Google Scholar]
 Posacki, S., Cappellari, M., Treu, T., Pellegrini, S., & Ciotti, L. 2015, MNRAS, 446, 493 [NASA ADS] [CrossRef] [Google Scholar]
 Rathna Kumar, S., Stalin, C. S., & Prabhu, T. P. 2015, A&A, 580, A38 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Read, J. I., Saha, P., & Macciò, A. V. 2007, ApJ, 667, 645 [NASA ADS] [CrossRef] [Google Scholar]
 Refsdal, S. 1964, MNRAS, 128, 307 [NASA ADS] [CrossRef] [Google Scholar]
 Riess, A. G., Casertano, S., Yuan, W., Macri, L. M., & Scolnic, D. 2019, ApJ, 876, 85 [NASA ADS] [CrossRef] [Google Scholar]
 Romanowsky, A. J., & Kochanek, C. S. 1999, ApJ, 516, 18 [NASA ADS] [CrossRef] [Google Scholar]
 Rusu, C. E., Fassnacht, C. D., Sluse, D., et al. 2017, MNRAS, 467, 4220 [NASA ADS] [CrossRef] [Google Scholar]
 Rusu, C. E., Wong, K. C., Bonvin, V., et al. 2020, MNRAS, 498, 1440 [CrossRef] [Google Scholar]
 Saha, P., & Williams, L. L. R. 2006, ApJ, 653, 936 [NASA ADS] [CrossRef] [Google Scholar]
 Saha, P., Coles, J., Macciò, A. V., & Williams, L. L. R. 2006, ApJ, 650, L17 [NASA ADS] [CrossRef] [Google Scholar]
 Schechter, P. L., Bailyn, C. D., Barr, R., et al. 1997, ApJ, 475, L85 [NASA ADS] [CrossRef] [Google Scholar]
 Schneider, P. 1985, A&A, 143, 413 [Google Scholar]
 Schneider, P., & Sluse, D. 2013, A&A, 559, A37 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Schneider, P., & Sluse, D. 2014, A&A, 564, A103 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Schneider, P., Ehlers, J., & Falco, E. E. 1992, Gravitational Lenses (Berlin, Heidelberg: SpringerVerlag), 560 [Google Scholar]
 Schwarzschild, M. 1979, ApJ, 232, 236 [NASA ADS] [CrossRef] [Google Scholar]
 Scolnic, D. M., Jones, D. O., Rest, A., et al. 2018, ApJ, 859, 101 [NASA ADS] [CrossRef] [Google Scholar]
 Sereno, M., & Paraficz, D. 2014, MNRAS, 437, 600 [NASA ADS] [CrossRef] [Google Scholar]
 Shajib, A. J., Treu, T., & Agnello, A. 2018, MNRAS, 473, 210 [NASA ADS] [CrossRef] [Google Scholar]
 Shajib, A. J., Birrer, S., Treu, T., et al. 2019, MNRAS, 483, 5649 [NASA ADS] [CrossRef] [Google Scholar]
 Shajib, A. J., Birrer, S., Treu, T., et al. 2020a, MNRAS, 494, 6072 [CrossRef] [Google Scholar]
 Shajib, A. J., Treu, T., Birrer, S., & Sonnenfeld, A. 2020b, MNRAS, submitted [arXiv:2008.11724] [Google Scholar]
 Shu, Y., Bolton, A. S., Brownstein, J. R., et al. 2015, ApJ, 803, 71 [NASA ADS] [CrossRef] [Google Scholar]
 Sluse, D., Surdej, J., Claeskens, J.F., et al. 2003, A&A, 406, L43 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Sluse, D., Chantry, V., Magain, P., Courbin, F., & Meylan, G. 2012, A&A, 538, A99 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Sluse, D., Rusu, C. E., Fassnacht, C. D., et al. 2019, MNRAS, 490, 613 [CrossRef] [Google Scholar]
 Sonnenfeld, A. 2018, MNRAS, 474, 4648 [NASA ADS] [CrossRef] [Google Scholar]
 Sonnenfeld, A., Verma, A., More, A., et al. 2020, A&A, 642, A148 [CrossRef] [EDP Sciences] [Google Scholar]
 Spiniello, C., Koopmans, L. V. E., Trager, S. C., et al. 2015, MNRAS, 452, 2434 [NASA ADS] [CrossRef] [Google Scholar]
 Springel, V., White, S. D. M., Jenkins, A., et al. 2005, Nature, 435, 629 [NASA ADS] [CrossRef] [PubMed] [Google Scholar]
 Suyu, S. H., Marshall, P. J., Hobson, M. P., & Blandford, R. D. 2006, MNRAS, 371, 983 [NASA ADS] [CrossRef] [Google Scholar]
 Suyu, S. H., Marshall, P. J., Blandford, R. D., et al. 2009, APJ, 691, 277 [NASA ADS] [CrossRef] [Google Scholar]
 Suyu, S. H., Marshall, P. J., Auger, M. W., et al. 2010, ApJ, 711, 201 [NASA ADS] [CrossRef] [Google Scholar]
 Suyu, S. H., Auger, M. W., Hilbert, S., et al. 2013, ApJ, 766, 70 [NASA ADS] [CrossRef] [Google Scholar]
 Suyu, S. H., Treu, T., Hilbert, S., et al. 2014, ApJ, 788, L35 [NASA ADS] [CrossRef] [Google Scholar]
 Suyu, S. H., Bonvin, V., Courbin, F., et al. 2017, MNRAS, 468, 2590 [NASA ADS] [CrossRef] [Google Scholar]
 Taubenberger, S., Suyu, S. H., Komatsu, E., et al. 2019, A&A, 628, L7 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Tewes, M., Courbin, F., Meylan, G., et al. 2012, The Messenger, 150, 49 [NASA ADS] [Google Scholar]
 Tewes, M., Courbin, F., & Meylan, G. 2013, A&A, 553, A120 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Tihhonova, O., Courbin, F., Harvey, D., et al. 2018, MNRAS, 477, 5657 [NASA ADS] [CrossRef] [Google Scholar]
 Tihhonova, O., Courbin, F., Harvey, D., et al. 2020, MNRAS, 498, 1406 [CrossRef] [Google Scholar]
 Tonry, J. L. 1998, AJ, 115, 1 [NASA ADS] [CrossRef] [Google Scholar]
 Treu, T., & Koopmans, L. V. E. 2002, MNRAS, 337, L6 [NASA ADS] [CrossRef] [Google Scholar]
 Treu, T., & Koopmans, L. V. E. 2004, ApJ, 611, 739 [NASA ADS] [CrossRef] [Google Scholar]
 Treu, T., Koopmans, L. V., Bolton, A. S., Burles, S., & Moustakas, L. A. 2006, ApJ, 640, 662 [NASA ADS] [CrossRef] [Google Scholar]
 Treu, T., Gavazzi, R., Gorecki, A., et al. 2009, ApJ, 690, 670 [NASA ADS] [CrossRef] [Google Scholar]
 Unruh, S., Schneider, P., & Sluse, D. 2017, A&A, 601, A77 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Valdes, F., Gupta, R., Rose, J. A., Singh, H. P., & Bell, D. J. 2004, ApJS, 152, 251 [NASA ADS] [CrossRef] [Google Scholar]
 Vanderriest, C., Schneider, J., Herpe, G., et al. 1989, A&A, 215, 1 [NASA ADS] [Google Scholar]
 Verolme, E. K., & de Zeeuw, P. T. 2002, MNRAS, 331, 959 [NASA ADS] [CrossRef] [Google Scholar]
 Vuissoz, C., Courbin, F., Sluse, D., et al. 2008, A&A, 488, 481 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Weinberg, D. H., Mortonson, M. J., Eisenstein, D. J., et al. 2013, Phys. Rep., 530, 87 [NASA ADS] [CrossRef] [MathSciNet] [Google Scholar]
 Wertz, O., Orthen, B., & Schneider, P. 2018, A&A, 617, A140 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Weymann, R. J., Latham, D., Angel, J. R. P., et al. 1980, Nature, 285, 641 [NASA ADS] [CrossRef] [Google Scholar]
 Wisotzki, L., Schechter, P. L., Bradt, H. V., Heinmüller, J., & Reimers, D. 2002, A&A, 395, 17 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Wong, K. C., Suyu, S. H., Auger, M. W., et al. 2017, MNRAS, 465, 4895 [NASA ADS] [CrossRef] [Google Scholar]
 Wong, K. C., Suyu, S. H., Chen, G. C.F., et al. 2020, MNRAS, 498, 1420 [CrossRef] [Google Scholar]
 Wucknitz, O. 2008, MNRAS, 386, 230 [CrossRef] [Google Scholar]
 Xu, D., Sluse, D., Schneider, P., et al. 2016, MNRAS, 456, 739 [NASA ADS] [CrossRef] [Google Scholar]
 Yang, T., Birrer, S., & Hu, B. 2020, MNRAS, 497, L56 [CrossRef] [Google Scholar]
 Yıldırım, A., Suyu, S. H., & Halkola, A. 2020, MNRAS, 493, 4783 [CrossRef] [Google Scholar]
 van Albada, T. S. 1982, MNRAS, 201, 939 [NASA ADS] [CrossRef] [Google Scholar]
 van de Sande, J., Lagos, C. D. P., Welker, C., et al. 2019, MNRAS, 484, 869 [NASA ADS] [CrossRef] [Google Scholar]
 van der Marel, R. P. 1994, MNRAS, 270, 271 [NASA ADS] [CrossRef] [Google Scholar]
Appendix A: Internal MST + PEMD
Figure A.1 shows different approximate MST’s with a core radius of 10 arcsec on top of a powerlaw profile (see also Blum et al. 2020). Figure A.2 shows the mock lens used in Sect. 2.6.1 to perform the imaging modeling inference on the lens model parameters, including the cored component resembling the MST.
Fig. A.1. Illustration of the powerlaw profile (Eq. (39)) in three dimensions (left panel) and in projection (right panel) under an approximate MST with a cored mass component (Eq. (38)). The transforms presented here were indistinguishable by the mock imaging data of Fig. A.2 (https://github.com/TDCOSMO/hierarchy_analysis_2020_public/blob/6c293af582c398a5c9de60a51cb0c44432a3c598/MST_impact/MST_pl_cored.ipynb). 

Open with DEXTER 
Fig. A.2. Mock HST image with a powerlaw mass profile for which we perform the inference on the detectability of an approximate MST (https://github.com/TDCOSMO/hierarchy_analysis_2020_public/blob/6c293af582c398a5c9de60a51cb0c44432a3c598/MST_impact/MST_pl_cored.ipynb). 

Open with DEXTER 
Appendix B: Massanisotropy degeneracy
Figure B.1 shows the predicted projected velocity dispersions (Eq. (16)) in radial bins form the center for PEMD profiles with different logarithmic massprofile slopes and halflight radii. We chose a fiducial seeing of FWHM = 10. Alternatively, we display the results assuming a constant anisotropy β_{ani}(r) = const in Fig. B.2. In Fig. B.3 we plot, without seeing and under fixed anisotropy model, the predicted radial change in the velocity dispersion for different core masses, λ_{c}, and core radii, R_{c}.
Fig. B.1. Radial dependence on the projected velocity dispersion measurement for an Osipkov–Merritt anisotropy profile (Eq. (51)). Top to bottom: increase in the half light radius of the deflector. Left to right: change in the mass profile slope (https://github.com/TDCOSMO/hierarchy_analysis_2020_public/blob/6c293af582c398a5c9de60a51cb0c44432a3c598/MST_impact/anisotropy_ifu.ipynb). 

Open with DEXTER 
Fig. B.2. Radial dependence on the projected velocity dispersion measurement for a constant anisotropy β_{ani}. Top to bottom: increase in the half light radius of the deflector. Left to right: change in the mass profile slope (https://github.com/TDCOSMO/hierarchy_analysis_2020_public/blob/6c293af582c398a5c9de60a51cb0c44432a3c598/MST_impact/anisotropy_ifu.ipynb). 

Open with DEXTER 
Fig. B.3. Radial dependence on the projected velocity dispersion measurement for different cored components (38) on top of a PEMD profile approximating a pure MST, with normalization λ_{c} and core radii, R_{c}. The projected radius from the center of the galaxy is extended to 5 arcsec to visibly see the impact on the kinematic of larger cored components (https://github.com/TDCOSMO/hierarchy_analysis_2020_public/blob/6c293af582c398a5c9de60a51cb0c44432a3c598/MST_impact/anisotropy_ifu.ipynb). 

Open with DEXTER 
Appendix C: Likelihood calculation
In this section, we provide the specifics of the likelihood calculation for individual lenses and how we efficiently evaluate the likelihood in the hierarchical context. This includes the imaging likelihood (Appendix C.1), timedelay likelihood (Appendix C.2) and velocity dispersion likelihood (Appendix C.3). Appendix C.4 describes our formalism to track covariances and the marginalization as implemented in HIERARC.
C.1. Imaging likelihood
The likelihood and the lens model inference is not prominently featured in this work, as we are making use of products being derived by our collaboration presented in other work. Nevertheless, the high resolution imaging data and lens model inferences on the likelihood level are essential parts of the analysis.
Given a lens model with parameters ξ_{mass} and surface brightness model with parameters ξ_{ligth}, a model of the imaging data can be constructed, d_{model}. The likelihood is computed at the individual pixel level accounting for the noise properties from background and other noise properties, such as readout, as well as the Poisson contribution from the sources. The imaging likelihood is given by
where k is the number of pixels used in the likelihood and Σ_{pixel} is the error covariance matrix. Current analyses assume uncorrelated noise properties in the individual pixels and the covariance matrix becomes diagonal. The model of the surface brightness of the lensed galaxy requires high model flexibility. The surface brightness components can be captured with linear components and solved for and marginalized over analytically. TDCOSMO uses pixelized grids as well as smooth basis sets (see e.g., Suyu et al. 2006; Birrer et al. 2015, for the current methods in use).
C.2. Timedelay likelihood
The likelihood of the time delay data 𝒟_{td} given a model prediction is
with Δt_{data} is the data vector of relative time delays, is the measurement covariance between the relative delays and
is the model predicted timedelay vector (Eq. (5)) with Δϕ_{Fermat} is the relative Fermat potential vector (Eq. (6)). Effectively, the timedelay distance posterior transform according to Eq. (26) under an MST.
C.3. Velocity dispersion likelihood
The model prediction of the velocity dispersion transforms under MST according to Eq. (25) and cosmological distance ratio relevant for the kinematics is D_{s}/D_{ds} and scales according to Eq. (17). We can write the likelihood of the spectroscopic data, 𝒟_{spec}, given a model as
where is a vector of velocity dispersion measurements, is the measurement error covariance between the measurements (including, for example, stellar template fitting, calibration systematics etc.) and
is the model prediction. The impact of the anisotropy distribution depends on the specific lens and light configuration. We can compute numerically the change in the model predicted dimensionless velocity dispersion component for each individual aperture 𝒜_{j}, J_{𝒜j}(ξ_{mass}, ξ_{light}, β_{ani})
C.4. Marginalization and covariances
The marginalization over ξ_{mass} and ξ_{light} (Eq. (53)) affects the relative Fermat potential Δϕ_{Fermat} in the timedelay likelihood (Eq. (C.3)) and the dimensionless factors (Eqs. (C.5) and (C.6)). We can compute the marginalized likelihood over ξ_{mass} and ξ_{light} under the assumption that the posteriors in ξ_{mass} and ξ_{light} transform to covariant Gaussian distributions in Δϕ_{Fermat} and as a model addition to the error covariances, such that
The model covariance matrix for the time delays can be expressed as
the covariance matrix on the kinematics as
and the crosscovariance between the kinematics and the time delays as
In this form, the model covariances are explicitly dependent on the anisotropy model, the MST and the cosmology.
The covariance between the kinematics and the time delays, , above in Eq. (C.10) is primarily impacted by the average density slope parameter γ of the mass model. γ affects both the kinematics and the Fermat potential and uncertainty in γ can lead to covariances. However, if the density slope parameter is well constrained by imaging data (modulo explicit MST), the covariance in Eq. (C.10) becomes subdominant relative to the uncertainty in the measurement of the kinematics.
When setting , we can separate the inference of D_{Δt}/λ from the kinematics likelihood and can work directly on the D_{Δt}/λ posteriors from the inference from the image data, D_{image}, and the timedelay measurement, D_{td},
This allows us to use individually sampled angular diameter distance posteriors (expression (40)) without sampling an additional MST and then transform them in postprocessing. This is applicable for both, external convergence and internal MST and we effectively evaluate the likelihood on the onedimensional posterior density in D_{Δt}/λ.
In the same way as for the timedelay likelihood, we can perform the marginalization of the kinematics likelihood over the imaging data constraints
Appendix D: TDLMC inference with more general anisotropy models
Fig. D.1. TDLMC Rung3 inference with fixed Ω_{m} to the correct value and a generalized Osipkov–Merritt anisotropy profile (Eq. (D.1)). Blue contours indicate the inference with a uniform prior in a_{ani} while the red contours indicate the inference with uniform priors in log(a_{ani}). The thin vertical line indicates the ground truth H_{0} value in the challenge (https://github.com/TDCOSMO/hierarchy_analysis_2020_public/blob/6c293af582c398a5c9de60a51cb0c44432a3c598/TDLMC/TDLMC_rung3_inference.ipynb). 

Open with DEXTER 
Fig. D.2. TDLMC Rung3 inference on the profile and anisotropy parameter when assuming the correct cosmology (https://github.com/TDCOSMO/hierarchy_analysis_2020_public/blob/6c293af582c398a5c9de60a51cb0c44432a3c598/TDLMC/TDLMC_rung3_inference.ipynb). 

Open with DEXTER 
Fig. D.3. TDLMC Rung3 inference on the profile and anisotropy parameter when assuming the correct cosmology for a generalized Osipkov–Merritt anisotropy profile (Eq. (D.1)) (https://github.com/TDCOSMO/hierarchy_analysis_2020_public/blob/6c293af582c398a5c9de60a51cb0c44432a3c598/TDLMC/TDLMC_rung3_inference.ipynb). 

Open with DEXTER 
In this work, we presented inferences based on the anisotropy parameterization by Osipkov (1979), Merritt (1985) (Eq. (51)). In this appendix we perform the inference on the TDLMC with a more general anisotropy parameterization. Agnello et al. (2014a) introduced a generalization of the Osipkov–Merritt profile with an asymptotic anisotropy value, β_{∞}, different than radial
We perform the identical analysis as presented in Sect. 4 except for the addition of one free parameter, β_{∞}. Table D.1 presents the parameters and priors used in the hierarchical analysis on the TDLMC data set. Figure D.1 shows the results of this inference for the two different priors in a_{ani}. The additional degree of freedom in the anisotropy is not constrained by the mock data and leads to a priorvolume effect. The constraining power on the mass profile relies on the mean anisotropy in the orbits within the aperture of the measurement, and not particularly on the parameterization of the radial dependence (see also e.g., Agnello et al. 2014b). It is more challenging to find uninformative priors in higher dimension. As we found an uninformative prior in a simpler parameterization that leads to a consistent result on the TDLMC data set, we do not explore more degrees of freedom in the anisotropy parameterization in this work.
Summary of the model parameters sampled in the hierarchical inference on TDLMC Rung3 with the anisotropy model of Eq. (D.1).
On the mock data with known input cosmology, we can also reverse the problem and ask which anisotropy parameter configurations result in statistically consistent cosmologies. To do so, we fix the cosmology to the input values and only perform the inference on the anisotropy parameters. Figure D.2 presents the results for the Osipkov–Merritt model of Sect. 4 and Fig. D.3 presents the results for the generalized Osipkov–Merritt profile of this appendix. The posterior on the anisotropy parameter can be interpreted as an informative prior on the anisotropy model parameters from the hydrodynamical simulations of the TDLMC. We do not make use of such a prior in this work but note the consistent inference of the anisotropy parameters for the TDCOSMO+SLACS analysis with this exercise performed on the TDLMC.
Appendix E: SLACS sample details
In this appendix we provide the detailed numerical numbers used in this analysis for the SLACS lenses. Table E.1 lists the data derived from external works that are used in our analysis for the 33 lenses of the SLACS sample. Redshifts are from SDSS presented by Auger et al. (2009), Einstein radii from Auger et al. (2009) and Shajib et al. (2020b) (where available), halflight radii, r_{eff}, from Auger et al. (2009), powerlaw slopes from Shajib et al. (2020b) (where available) and velocity dispersions are based on Bolton et al. (2008) and Shu et al. (2015). Local environment statistics ζ_{1/r} and external shear κ_{ext} are derived in this work (see Sects. 6.3 and 6.4).
Summary of the parameters being used of the individual 33 SLACS lenses selected in Sect. 6 to infer mass profile constraints in combination of imaging and kinematics.
All Tables
Summary of the model parameters sampled in the hierarchical inference on the TDCOSMO+SLACS sample.
Summary of the model parameters sampled in the hierarchical inference on TDLMC Rung3 with the anisotropy model of Eq. (D.1).
Summary of the parameters being used of the individual 33 SLACS lenses selected in Sect. 6 to infer mass profile constraints in combination of imaging and kinematics.
All Figures
Fig. 1. Illustration of a composite profile consisting of a stellar component (Hernquist profile, dotted lines) and a dark matter component (NFW + cored component (Eq. (38)), dashed lines) which transform according to an approximate MST (joint as solid lines). The stellar component gets rescaled by the MST while the cored component transforms the dark matter component. Left: profile components in three dimensions. Right: profile components in projection. The transforms presented here cannot be distinguished by imaging data alone and require i.e., stellar kinematics constraints (https://github.com/TDCOSMO/hierarchy_analysis_2020_public/blob/6c293af582c398a5c9de60a51cb0c44432a3c598/MST_impact/MST_composite_cored.ipynb). 

Open with DEXTER  
In the text 
Fig. 2. Illustration of the constraining power of imaging data on a cored mass component (Eq. (35)). Shown are the parameter inference of the powerlaw profile mock quadruply lensed quasar of Fig. A.2 when including a marginalization of an additional cored power law profile (Eq. (38)). Orange lines indicate the input truth of the model without a cored component. λ_{c} is the scaled core model parameter (Eq. (35)) resembling the pure MST for large core radii (λ_{c} ≈ λ_{int}) (https://github.com/TDCOSMO/hierarchy_analysis_2020_public/blob/6c293af582c398a5c9de60a51cb0c44432a3c598/MST_impact/MST_pl_cored.ipynb). 

Open with DEXTER  
In the text 
Fig. 3. Constraints on an approximate internal MST transform with a cored component, λ_{c}, of an NFW profile as a function of core radius. In gray are the 1σ exclusion limits that imaging data can provide. In orange is the region where the total mass of the core within a threedimensional radius exceeds the mass of the NFW profile in the same sphere. In blue is the region where the transformed profile results in negative convergence at the core radius. The white region is effectively allowed by the imaging data and simple plausibility considerations and where we can use the mathematical MST as an approximation (λ_{c} ≈ λ_{int}). The halo mass, concentration and the redshift configuration is displayed in the lower left box (https://github.com/TDCOSMO/hierarchy_analysis_2020_public/blob/6c293af582c398a5c9de60a51cb0c44432a3c598/MST_impact/MST_pl_cored.ipynb). 

Open with DEXTER  
In the text 
Fig. 4. Comparison of the actual predicted kinematics from the modeling of the physical threedimensional mass distribution κ_{λint} (Eq. (35)) for varying core sizes (solid) and the analytic relation of a perfect MST (Eq. (25), dashed) for the mock lens presented in Fig. A.2. Lower panel: fractional differences between the exact prediction and a perfect MST calculation. The MST prediction matches to < 1% in the considered range. Minor numerical noise is present at the subpercent level (https://github.com/TDCOSMO/hierarchy_analysis_2020_public/blob/6c293af582c398a5c9de60a51cb0c44432a3c598/MST_impact/MST_pl_cored.ipynb). 

Open with DEXTER  
In the text 
Fig. 5. Mock data from the TDLMC Rung3 inference with the parameters and prior specified in Table 1. Orange contours indicate the inference with a uniform prior in a_{ani} while the purple contours indicate the inference with a uniform priors in log(a_{ani}). The thin vertical line indicates the ground truth H_{0} value in the challenge (https://github.com/TDCOSMO/hierarchy_analysis_2020_public/blob/6c293af582c398a5c9de60a51cb0c44432a3c598/TDLMC/TDLMC_rung3_inference.ipynb). 

Open with DEXTER  
In the text 
Fig. 6. External convergence posteriors for the individual TDCOSMO lenses (https://github.com/TDCOSMO/hierarchy_analysis_2020_public/blob/6c293af582c398a5c9de60a51cb0c44432a3c598/TDCOSMO_sample/tdcosmo_sample.ipynb). 

Open with DEXTER  
In the text 
Fig. 7. Hierarchical analysis of the TDCOSMOonly sample when constraining the MST with kinematic information. Parameter and priors are specified in Table 3. Orange contours correspond to the inference with uniform prior on Ω_{m}, 𝒰([0.05, 0.5]), while the purple contours correspond to the prior based on the Pantheon sample with 𝒩(μ = 0.298, σ = 0.022) (https://github.com/TDCOSMO/hierarchy_analysis_2020_public/blob/6c293af582c398a5c9de60a51cb0c44432a3c598/TDCOSMO_sample/tdcosmo_sample.ipynb). 

Open with DEXTER  
In the text 
Fig. 8. Sample selection of the SLACS lenses being added to the analysis and comparison with the TDCOSMO data set. The comparisons are in lens redshift, z_{lens}, source redshift, z_{source}, measured velocity dispersion, σ^{P}, half light radius of the deflector, r_{eff}, Einstein radius of the deflector, θ_{E}, the ratio of half light radius to Einstein radius, r_{eff}/θ_{E}, and the SIS equivalent velocity dispersion estimated from the Einstein radius and a fiducial cosmology, σ_{SIS}. Open dots correspond to lenses included in our selection without quality lens models. Red points correspond to SLACS lenses which have VIMOS IFU data (https://github.com/TDCOSMO/hierarchy_analysis_2020_public/blob/6c293af582c398a5c9de60a51cb0c44432a3c598/JointAnalysis/sample_selection.ipynb). 

Open with DEXTER  
In the text 
Fig. 9. External convergence posteriors of the 33 SLACS lenses that pass our morphology and local environment selection cut based on the weighted number counts (gray dashed lines) and the population distribution (black solid line) (https://github.com/TDCOSMO/hierarchy_analysis_2020_public/blob/6c293af582c398a5c9de60a51cb0c44432a3c598/JointAnalysis/sample_selection.ipynb). 

Open with DEXTER  
In the text 
Fig. 10. Powerlaw slope γ_{pl} inferences obtained from HST imaging data for the 14 SLACS lenses within our selection cut from Shajib et al. (2020a,b). We derive a population distribution from these lenses to be applied on the subset of lenses without measured γ_{pl} from imaging data. The black dashed line indicates the population mean and the blue band the 1sigma population width based on the 14 individual measurements (https://github.com/TDCOSMO/hierarchy_analysis_2020_public/blob/6c293af582c398a5c9de60a51cb0c44432a3c598/JointAnalysis/sample_selection.ipynb). 

Open with DEXTER  
In the text 
Fig. 11. Posterior distribution for the SLACS sample with priors according to Table 4. Orange: inference with the SDSS spectra. Purple: inference with SDSS spectra and VIMOS IFU data set. The posterior of λ_{int, 0} was blinded during the analysis (https://github.com/TDCOSMO/hierarchy_analysis_2020_public/blob/6c293af582c398a5c9de60a51cb0c44432a3c598/SLACS_sample/SLACS_constraints.ipynb). 

Open with DEXTER  
In the text 
Fig. 12. Posterior distributions of the key parameters for the hierarchical inference. Blue: constraints from the TDCOSMOonly sample. Violet: constraints with the addition of IFU data of nine SLACS lenses to inform the anisotropy prior on the TDCOSMO sample, TDCOSMO+SLACS_{IFU}. Orange: constraints with a sample of 33 additional lenses with imaging and kinematics data (HST imaging + SDSS spectra) from the SLACS sample, TDCOSMO+SLACS_{SDSS}. Purple: joint analysis of TDCOSMO and 33 SLACS lenses with SDSS spectra of which nine have VIMOS IFU data, TDCOSMO+SLACS_{SDSS + IFU}. Priors are according to Table 5. The 68th percentiles of the 1D marginalized posteriors are presented in Table 6. The posteriors in H_{0} and λ_{int, 0} were held blinded during the analysis (https://github.com/TDCOSMO/hierarchy_analysis_2020_public/blob/6c293af582c398a5c9de60a51cb0c44432a3c598/JointAnalysis/joint_inference.ipynb). 

Open with DEXTER  
In the text 
Fig. 13. Illustration of the goodness of the fit of the maximum likelihood model of the joint analysis in describing the TDCOSMO data set. Blue points are the measurements with the diagonal elements of the measurement covariance matrix. Orange points are the model predictions with the diagonal elements of the model covariance uncertainties. Left: comparison of measured timedelay distance from imaging data and time delays compared with the predicted value from the cosmological model, the internal and external MST (and their distributions). Right: comparison of the velocity dispersion measurements and the predicted values. In addition to the MST terms, the uncertainty in the model also includes the uncertainty in the anisotropy distribution a_{ani}. For lenses with multiple velocity dispersion measurements, the diagonal terms in the error covariance are illustrated (https://github.com/TDCOSMO/hierarchy_analysis_2020_public/blob/6c293af582c398a5c9de60a51cb0c44432a3c598/JointAnalysis/joint_inference.ipynb). (a) Fit to the timedelay distance. (b) Fit of velocity dispersion. 

Open with DEXTER  
In the text 
Fig. 14. Illustration of the goodness of the fit of the maximum likelihood model of the joint analysis in describing the SDSS velocity dispersion measurements of the 34 SLACS lenses in our sample. Blue points are the measurements with the diagonal elements of the measurement covariance matrix. Orange points are the model predictions with the diagonal elements of the model covariance uncertainties. The measurement uncertainties include the uncertainties in the quoted measurements and the additional uncertainty of σ_{σP, sys}. The model uncertainties include the lens model uncertainties and the marginalization over the λ_{int} and a_{ani} distribution (https://github.com/TDCOSMO/hierarchy_analysis_2020_public/blob/6c293af582c398a5c9de60a51cb0c44432a3c598/JointAnalysis/joint_inference.ipynb). 

Open with DEXTER  
In the text 
Fig. 15. Illustration of the goodness of the fit of the maximum likelihood model of the joint analysis in describing the VIMOS radially binned IFU velocity dispersion measurements of the nine SLACS lenses with VIMOS data in our sample. Blue points are the measurements with the diagonal elements of the measurement covariance matrix. Orange points are the model predictions with the diagonal elements of the model covariance uncertainties. The measurement uncertainties include the uncertainties in the quoted measurements and the additional uncertainty of σ_{σP, sys}. The model uncertainties include the lens model uncertainties and the marginalization over the λ_{int} and a_{ani} distribution (https://github.com/TDCOSMO/hierarchy_analysis_2020_public/blob/6c293af582c398a5c9de60a51cb0c44432a3c598/JointAnalysis/joint_inference.ipynb). 

Open with DEXTER  
In the text 
Fig. 16. Relative difference of the measured vs the predicted velocity dispersion for the SLACS and TDCOSMO sample as a function of different parameters associated with the lineofsight and the lensing galaxy. In particular, these are the relative inverse distance weighted over density ζ_{1/r}, the ratio of halflight radius to Einstein radius r_{eff}/θ_{E}, lens redshift z_{lens}, and SIS equivalent velocity dispersion σ_{SIS} (https://github.com/TDCOSMO/hierarchy_analysis_2020_public/blob/6c293af582c398a5c9de60a51cb0c44432a3c598/JointAnalysis/joint_inference.ipynb). 

Open with DEXTER  
In the text 
Fig. 17. Illustration of the inferred mass profile of the joint TDCOSMO+SLACS_{SDSS + IFU} analysis. A pure powerlaw with γ_{pl} = 2.10 ± 0.05 is shown in orange. In blue is the result of this work of λ_{int} = 0.91 ± 0.045 when interpreted as a cored mass component with R_{c} uniform in [3″, 10″]. Three dimensional density are illustrated on the left and the lensing convergence on the right. The dashed vertical line on the right panels indicates the Einstein radius. Relative difference in respect to the powerlaw model are presented in the bottom panels (https://github.com/TDCOSMO/hierarchy_analysis_2020_public/blob/6c293af582c398a5c9de60a51cb0c44432a3c598/MST_impact/MST_pl_cored.ipynb). 

Open with DEXTER  
In the text 
Fig. 18. Comparison of different blind H_{0} measurements by the TDCOSMO collaboration, based on different mass profile assumptions and data sets incorporated. All measurements presented on this plot were performed blindly with regard to the inference of H_{0}. The measurement on top is the combined H0LiCOW six lenses constraints presented by Wong et al. (2020), when averaging powerlaw and composite NFW plus stars (with constant masstolight ratio) on a lensbylens basis without correlated errors among the lenses. The next two measurements are from Millon et al. (2020) of six TDCOSMO timedelay lenses (five H0LiCOW lenses^{22} and one STRIDES lens by Shajib et al. 2020a), when performing the inference assuming either a composite NFW plus stars (with constant masstolight ratio) or the powerlaw mass density profile for the galaxy acting as a lens. Lower panel: results from this work. The main difference with respect to previous work is that we have made virtually no assumption on the radial mass density profile of the lens galaxy, and taken into account the covariance between the lenses. The analysis in this work is constrained only by the stellar kinematics and fully accounts for the uncertainty related to the mass sheet transformation (MST). In this framework, we obtain four measurements according to the datasets considered. The TDCOSMOonly inference is based on the same set of seven lenses as those jointly included by Millon et al. (2020) and Wong et al. (2020). The inferred median value is the same, indicating no bias, and the uncertainties, as expected, are larger. The next three measurements rely on external datasets from the SLACS survey, by making the assumption that the lens galaxies in the two surveys are drawn from the same population. The TDCOSMO+SLACS_{IFU} measurements uses, in addition to the TDCOSMO sample, nine lenses from the SLACS sample with IFU observations to inform the anisotropy prior applied on the TDCOSMO lenses. The TDCOSMO+SLACS_{SDSS} measurement comes from the joint analysis of the TDCOSMO sample and 33 SLACS lenses with SDSS spectroscopy. The TDCOSMO+SLACS_{SDSS + IFU} presents the joint analysis of all three data sets, again assuming selfsimilar distributions of the mass profiles and stellar anisotropy. The TDCOSMOonly and TDCOSMO+SLACS_{IFU} analyses do not rely on selfsimilar mass profiles of the SLACS and TDCOSMO sample while the TDCOSMO+SLACS_{SDSS} and TDCOSMO+SLACS_{SDSS + IFU} measurements (orange and purple) do. All the measurements shown in this plot are in statistical agreement with each other. See Sect. 8.5 for a discussion and physical interpretation of the results (https://github.com/TDCOSMO/hierarchy_analysis_2020_public/blob/6c293af582c398a5c9de60a51cb0c44432a3c598/JointAnalysis/tdcosmo_comparison_plot.ipynb). 

Open with DEXTER  
In the text 
Fig. A.1. Illustration of the powerlaw profile (Eq. (39)) in three dimensions (left panel) and in projection (right panel) under an approximate MST with a cored mass component (Eq. (38)). The transforms presented here were indistinguishable by the mock imaging data of Fig. A.2 (https://github.com/TDCOSMO/hierarchy_analysis_2020_public/blob/6c293af582c398a5c9de60a51cb0c44432a3c598/MST_impact/MST_pl_cored.ipynb). 

Open with DEXTER  
In the text 
Fig. A.2. Mock HST image with a powerlaw mass profile for which we perform the inference on the detectability of an approximate MST (https://github.com/TDCOSMO/hierarchy_analysis_2020_public/blob/6c293af582c398a5c9de60a51cb0c44432a3c598/MST_impact/MST_pl_cored.ipynb). 

Open with DEXTER  
In the text 
Fig. B.1. Radial dependence on the projected velocity dispersion measurement for an Osipkov–Merritt anisotropy profile (Eq. (51)). Top to bottom: increase in the half light radius of the deflector. Left to right: change in the mass profile slope (https://github.com/TDCOSMO/hierarchy_analysis_2020_public/blob/6c293af582c398a5c9de60a51cb0c44432a3c598/MST_impact/anisotropy_ifu.ipynb). 

Open with DEXTER  
In the text 
Fig. B.2. Radial dependence on the projected velocity dispersion measurement for a constant anisotropy β_{ani}. Top to bottom: increase in the half light radius of the deflector. Left to right: change in the mass profile slope (https://github.com/TDCOSMO/hierarchy_analysis_2020_public/blob/6c293af582c398a5c9de60a51cb0c44432a3c598/MST_impact/anisotropy_ifu.ipynb). 

Open with DEXTER  
In the text 
Fig. B.3. Radial dependence on the projected velocity dispersion measurement for different cored components (38) on top of a PEMD profile approximating a pure MST, with normalization λ_{c} and core radii, R_{c}. The projected radius from the center of the galaxy is extended to 5 arcsec to visibly see the impact on the kinematic of larger cored components (https://github.com/TDCOSMO/hierarchy_analysis_2020_public/blob/6c293af582c398a5c9de60a51cb0c44432a3c598/MST_impact/anisotropy_ifu.ipynb). 

Open with DEXTER  
In the text 
Fig. D.1. TDLMC Rung3 inference with fixed Ω_{m} to the correct value and a generalized Osipkov–Merritt anisotropy profile (Eq. (D.1)). Blue contours indicate the inference with a uniform prior in a_{ani} while the red contours indicate the inference with uniform priors in log(a_{ani}). The thin vertical line indicates the ground truth H_{0} value in the challenge (https://github.com/TDCOSMO/hierarchy_analysis_2020_public/blob/6c293af582c398a5c9de60a51cb0c44432a3c598/TDLMC/TDLMC_rung3_inference.ipynb). 

Open with DEXTER  
In the text 
Fig. D.2. TDLMC Rung3 inference on the profile and anisotropy parameter when assuming the correct cosmology (https://github.com/TDCOSMO/hierarchy_analysis_2020_public/blob/6c293af582c398a5c9de60a51cb0c44432a3c598/TDLMC/TDLMC_rung3_inference.ipynb). 

Open with DEXTER  
In the text 
Fig. D.3. TDLMC Rung3 inference on the profile and anisotropy parameter when assuming the correct cosmology for a generalized Osipkov–Merritt anisotropy profile (Eq. (D.1)) (https://github.com/TDCOSMO/hierarchy_analysis_2020_public/blob/6c293af582c398a5c9de60a51cb0c44432a3c598/TDLMC/TDLMC_rung3_inference.ipynb). 

Open with DEXTER  
In the text 
Current usage metrics show cumulative count of Article Views (fulltext article views including HTML views, PDF and ePub downloads, according to the available data) and Abstracts Views on Vision4Press platform.
Data correspond to usage on the plateform after 2015. The current usage metrics is available 4896 hours after online publication and is updated daily on week days.
Initial download of the metrics may take a while.